Parallelizable approximate solvers for recursions arising in preconditioning
Shapira, Y.
1996-12-31
For the recursions used in the Modified Incomplete LU (MILU) preconditioner, namely, the incomplete decomposition, forward elimination and back substitution processes, a parallelizable approximate solver is presented. The present analysis shows that the solutions of the recursions depend only weakly on their initial conditions and may be interpreted to indicate that the inexact solution is close, in some sense, to the exact one. The method is based on a domain decomposition approach, suitable for parallel implementations with message passing architectures. It requires a fixed number of communication steps per preconditioned iteration, independent of the number of subdomains or the size of the problem. The overlapping subdomains are either cubes (suitable for mesh-connected arrays of processors) or constructed by the data-flow rule of the recursions (suitable for line-connected arrays with possibly SIMD or vector processors). Numerical examples show that, in both cases, the overhead in the number of iterations required for convergence of the preconditioned iteration is small relative to the speed-up gained.
Approximate Riemann solvers for the Godunov SPH (GSPH)
NASA Astrophysics Data System (ADS)
Puri, Kunal; Ramachandran, Prabhu
2014-08-01
The Godunov Smoothed Particle Hydrodynamics (GSPH) method is coupled with non-iterative, approximate Riemann solvers for solutions to the compressible Euler equations. The use of approximate solvers avoids the expensive solution of the non-linear Riemann problem for every interacting particle pair, as required by GSPH. In addition, we establish an equivalence between the dissipative terms of GSPH and the signal based SPH artificial viscosity, under the restriction of a class of approximate Riemann solvers. This equivalence is used to explain the anomalous “wall heating” experienced by GSPH and we provide some suggestions to overcome it. Numerical tests in one and two dimensions are used to validate the proposed Riemann solvers. A general SPH pairing instability is observed for two-dimensional problems when using unequal mass particles. In general, Ducowicz Roe's and HLLC approximate Riemann solvers are found to be suitable replacements for the iterative Riemann solver in the original GSPH scheme.
Parallel iterative solvers and preconditioners using approximate hierarchical methods
Grama, A.; Kumar, V.; Sameh, A.
1996-12-31
In this paper, we report results of the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significant speedups. We demonstrate the excellent parallel efficiency and scalability of our solver. The combined speedups from approximation and parallelism represent an improvement of several orders in solution time. We also develop fast and parallelizable preconditioners for this problem. We report on the performance of an inner-outer scheme and a preconditioner based on a truncated Green's function. Experimental results on a 256 processor Cray T3D are presented.
Approximate Riemann Solvers for the Cosmic Ray Magnetohydrodynamical Equations
NASA Astrophysics Data System (ADS)
Kudoh, Yuki; Hanawa, Tomoyuki
2016-08-01
We analyze the cosmic-ray magnetohydrodynamic (CR MHD) equations to improve the numerical simulations. We propose to solve them in the fully conservation form, which is equivalent to the conventional CR MHD equations. In the fully conservation form, the CR energy equation is replaced with the CR "number" conservation, where the CR number density is defined as the three-fourths power of the CR energy density. The former contains an extra source term, while the latter does not. An approximate Riemann solver is derived from the CR MHD equations in the fully conservation form. Based on the analysis, we propose a numerical scheme whose solutions satisfy the Rankine-Hugoniot relation at any shock. We demonstrate that it reproduces the Riemann solution derived by Pfrommer et al. (2006) for a 1D CR hydrodynamic shock tube problem. We compare the solution with those obtained by solving the CR energy equation. The latter solutions deviate seriously from the Riemann solution when the CR pressure dominates over the gas pressure in the post-shocked gas. The former solutions converge to the Riemann solution and are second-order accurate in space and time. Our numerical examples include the expansion of a high-pressure sphere in a magnetized medium. Fast and slow shocks are sharply resolved in the example. We also discuss a possible extension of the CR MHD equations to evaluate the average CR energy.
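The claimed equivalence between the CR energy equation and CR number conservation can be checked in a few lines. The sketch below is not taken from the paper: it assumes a CR adiabatic index of 4/3 (so that P_cr = E_cr/3), neglects CR diffusion, and uses the symbols n_cr, E_cr, P_cr and v as notation chosen here for illustration.

```latex
% CR energy equation with its source term (assumed form, no diffusion):
\frac{\partial E_{\rm cr}}{\partial t} + \nabla\cdot\left(E_{\rm cr}\,\mathbf{v}\right)
  = -P_{\rm cr}\,\nabla\cdot\mathbf{v},
\qquad P_{\rm cr} = \tfrac{1}{3}E_{\rm cr}
\;\Longrightarrow\;
\frac{D E_{\rm cr}}{D t} = -\tfrac{4}{3}\,E_{\rm cr}\,\nabla\cdot\mathbf{v}.

% Define the CR "number" density as the three-fourths power of the energy density:
n_{\rm cr} \equiv E_{\rm cr}^{3/4}
\;\Longrightarrow\;
\frac{D n_{\rm cr}}{D t}
  = \tfrac{3}{4}\,E_{\rm cr}^{-1/4}\,\frac{D E_{\rm cr}}{D t}
  = -\,n_{\rm cr}\,\nabla\cdot\mathbf{v}
\;\Longrightarrow\;
\frac{\partial n_{\rm cr}}{\partial t} + \nabla\cdot\left(n_{\rm cr}\,\mathbf{v}\right) = 0.
```

Under these assumptions the source term on the right-hand side of the energy equation disappears once the equation is written for n_cr, which is what allows a fully conservative scheme.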
Approximate Riemann solvers for the cosmic ray magnetohydrodynamical equations
NASA Astrophysics Data System (ADS)
Kudoh, Yuki; Hanawa, Tomoyuki
2016-11-01
We analyse the cosmic ray magnetohydrodynamic (CR MHD) equations to improve the numerical simulations. We propose to solve them in the fully conservation form, which is equivalent to the conventional CR MHD equations. In the fully conservation form, the CR energy equation is replaced with the CR 'number' conservation, where the CR number density is defined as the three-fourths power of the CR energy density. The former contains an extra source term, while the latter does not. An approximate Riemann solver is derived from the CR MHD equations in the fully conservation form. Based on the analysis, we propose a numerical scheme whose solutions satisfy the Rankine-Hugoniot relation at any shock. We demonstrate that it reproduces the Riemann solution derived by Pfrommer et al. for a 1D CR hydrodynamic shock tube problem. We compare the solution with those obtained by solving the CR energy equation. The latter solutions deviate seriously from the Riemann solution when the CR pressure dominates over the gas pressure in the post-shocked gas. The former solutions converge to the Riemann solution and are second-order accurate in space and time. Our numerical examples include the expansion of a high-pressure sphere in a magnetized medium. Fast and slow shocks are sharply resolved in the example. We also discuss a possible extension of the CR MHD equations to evaluate the average CR energy.
NASA Astrophysics Data System (ADS)
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.
2014-11-01
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.
2014-01-01
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature. PMID:25427517
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; et al
2014-11-27
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
Parallelizable adiabatic gate teleportation
NASA Astrophysics Data System (ADS)
Nakago, Kosuke; Hajdušek, Michal; Nakayama, Shojun; Murao, Mio
2015-12-01
To investigate how a temporally ordered gate sequence can be parallelized in adiabatic implementations of quantum computation, we modify adiabatic gate teleportation, a model of quantum computation proposed by Bacon and Flammia [Phys. Rev. Lett. 103, 120504 (2009), 10.1103/PhysRevLett.103.120504], to a form deterministically simulating parallelized gate teleportation, which is achievable only by postselection. We introduce a twisted Heisenberg-type interaction Hamiltonian, a Heisenberg-type spin interaction where the coordinates of the second qubit are twisted according to a unitary gate. We develop parallelizable adiabatic gate teleportation (PAGT) where a sequence of unitary gates is performed in a single step of the adiabatic process. In PAGT, numeric calculations suggest the necessary time for the adiabatic evolution implementing a sequence of L unitary gates increases at most as O(L^5). However, we show that it has the interesting property that it can map the temporal order of gates to the spatial order of interactions specified by the final Hamiltonian. Using this property, we present a controlled-PAGT scheme to manipulate the order of gates by a control qubit. In the controlled-PAGT scheme, two differently ordered sequential unitary gates FG and GF are coherently performed depending on the state of a control qubit by simultaneously applying the twisted Heisenberg-type interaction Hamiltonians implementing unitary gates F and G. We investigate why the twisted Heisenberg-type interaction Hamiltonian allows PAGT. We show that the twisted Heisenberg-type interaction Hamiltonian has an ability to perform a transposed unitary gate by just modifying the space ordering of the final Hamiltonian implementing a unitary gate in adiabatic gate teleportation. The dynamics generated by the time-reversed Hamiltonian represented by the transposed unitary gate enables deterministic simulation of a postselected event of parallelized gate teleportation in adiabatic
Improved implementation of the HLL approximate Riemann solver for one-dimensional open channel flows
Technology Transfer Automated Retrieval System (TEKTRAN)
Several new techniques are proposed to overcome the deficiencies in the conventional formulation of the approximate Riemann solvers for one-dimensional open channel flows, which include numerical imbalance and inaccuracy in the solution of discharge. The former arises in the case of irregular geomet...
An approximate Riemann solver for magnetohydrodynamics (that works in more than one dimension)
NASA Technical Reports Server (NTRS)
Powell, Kenneth G.
1994-01-01
An approximate Riemann solver is developed for the governing equations of ideal magnetohydrodynamics (MHD). The Riemann solver has an eight-wave structure, where seven of the waves are those used in previous work on upwind schemes for MHD, and the eighth wave is related to the divergence of the magnetic field. The structure of the eighth wave is not immediately obvious from the governing equations as they are usually written, but arises from a modification of the equations that is presented in this paper. The addition of the eighth wave allows multidimensional MHD problems to be solved without the use of staggered grids or a projection scheme, one or the other of which was necessary in previous work on upwind schemes for MHD. A test problem made up of a shock tube with rotated initial conditions is solved to show that the two-dimensional code yields answers consistent with the one-dimensional methods developed previously.
NASA Technical Reports Server (NTRS)
Ghil, M.; Balgovind, R.
1979-01-01
The inhomogeneous Cauchy-Riemann equations in a rectangle are discretized by a finite difference approximation. Several different boundary conditions are treated explicitly, leading to algorithms which have overall second-order accuracy. All boundary conditions with either u or v prescribed along a side of the rectangle can be treated by similar methods. The algorithms presented here have nearly minimal time and storage requirements and seem suitable for development into a general-purpose direct Cauchy-Riemann solver for arbitrary boundary conditions.
Approximate Harten-Lax-van Leer Riemann solvers for relativistic magnetohydrodynamics
NASA Astrophysics Data System (ADS)
Mignone, Andrea; Bodo, G.; Ugliano, M.
2012-11-01
We review a particular class of approximate Riemann solvers in the context of the equations of ideal relativistic magnetohydrodynamics. Commonly prefixed as Harten-Lax-van Leer (HLL), this family of solvers approaches the solution of the Riemann problem by providing suitable guesses to the outermost characteristic speeds, without any prior knowledge of the solution. By requiring consistency with the integral form of the conservation law, a simplified set of jump conditions with a reduced number of characteristic waves may be obtained. The degree of approximation crucially depends on the wave pattern used in representing the Riemann fan arising from the initial discontinuity breakup. In the original HLL scheme, the solution is approximated by collapsing the full characteristic structure into a single average state enclosed by two outermost fast magnetosonic speeds. On the other hand, HLLC and HLLD improve the accuracy of the solution by restoring the tangential and Alfvén modes, therefore leading to a representation of the Riemann fan in terms of 3 and 5 waves, respectively.
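The single-state HLL construction described above can be sketched in a much simpler setting than the one reviewed. The example below is an illustration only: it assumes the non-relativistic 1D Euler equations for an ideal gas with adiabatic index 1.4 and uses a simple min/max estimate of the outermost wave speeds; the function names (`primitive`, `physical_flux`, `hll_flux`) are hypothetical and none of this is the relativistic MHD solver of the paper.

```python
import math

GAMMA = 1.4  # assumed ideal-gas adiabatic index (not from the source)


def primitive(U):
    """Convert conserved variables [rho, rho*u, E] to (rho, u, p)."""
    rho, mom, E = U
    u = mom / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * u * u)
    return rho, u, p


def physical_flux(U):
    """Exact 1D Euler flux for a conserved state U."""
    rho, u, p = primitive(U)
    return [rho * u, rho * u * u + p, (U[2] + p) * u]


def hll_flux(UL, UR):
    """HLL flux: one averaged state between two outermost wave speeds."""
    rhoL, uL, pL = primitive(UL)
    rhoR, uR, pR = primitive(UR)
    cL = math.sqrt(GAMMA * pL / rhoL)
    cR = math.sqrt(GAMMA * pR / rhoR)
    # Guesses for the outermost characteristic speeds (simple min/max estimate).
    SL = min(uL - cL, uR - cR)
    SR = max(uL + cL, uR + cR)
    FL, FR = physical_flux(UL), physical_flux(UR)
    if SL >= 0.0:   # whole Riemann fan moves to the right
        return FL
    if SR <= 0.0:   # whole Riemann fan moves to the left
        return FR
    # Flux through the averaged state, from the integral conservation law.
    return [
        (SR * fl - SL * fr + SL * SR * (ur - ul)) / (SR - SL)
        for fl, fr, ul, ur in zip(FL, FR, UL, UR)
    ]
```

Because only the two outermost speeds are used, contact and tangential discontinuities are smeared; restoring those intermediate waves is what distinguishes HLLC and HLLD from this scheme.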
Jouvet, Guillaume
2015-04-15
In this paper, a multilayer generalisation of the Shallow Shelf Approximation (SSA) is considered. In this recent hybrid ice flow model, the ice thickness is divided into thin layers, which can spread out, contract and slide over each other in such a way that the velocity profile is layer-wise constant. Like the SSA (1-layer model), the multilayer model can be reformulated as a minimisation problem. However, unlike the SSA, the functional to be minimised involves a new penalisation term for the interlayer jumps of the velocity, which represents the vertical shear stresses induced by interlayer sliding. Taking advantage of this reformulation, numerical solvers developed for the SSA can be naturally extended layer-wise or column-wise. Numerical results show that the column-wise extension of a Newton multigrid solver proves to be robust in the sense that its convergence is barely influenced by the number of layers and the type of ice flow. In addition, the multilayer formulation appears to be naturally better conditioned than the one of the first-order approximation to face the anisotropic conditions of the sliding-dominant ice flow of ISMIP-HOM experiments.
Low-diffusion approximate Riemann solvers for Reynolds-stress transport
NASA Astrophysics Data System (ADS)
Ben Nasr, N.; Gerolymos, G. A.; Vallet, I.
2014-07-01
The paper investigates the use of low-diffusion (contact-discontinuity-resolving) approximate Riemann solvers for the convective part of the Reynolds-averaged Navier-Stokes (RANS) equations with Reynolds-stress model (RSM) for turbulence. Different equivalent forms of the RSM-RANS system are discussed and classification of the complex terms introduced by advanced turbulence closures is attempted. Computational examples are presented, which indicate that the use of contact-discontinuity-resolving convective numerical fluxes, along with a passive-scalar approach for the Reynolds-stresses, may lead to unphysical oscillations of the solution. To determine the source of these instabilities, theoretical analysis of the Riemann problem for a simplified Reynolds-stress transport model-system, which incorporates the divergence of the Reynolds-stress tensor in the convective part of the mean-flow equations, and includes only those nonconservative products which are computable (do not require modelling), was undertaken, highlighting the differences in wave-structure compared to the passive-scalar case. A hybrid solution, allowing the combination of any low-diffusion approximate Riemann solver with the complex tensorial representations used in advanced models, is proposed, combining low-diffusion fluxes for the mean-flow equations with a more dissipative massflux for Reynolds-stress-transport. Several computational examples are presented to assess the performance of this approach, demonstrating enhanced accuracy and satisfactory convergence.
Regnier, D.; Verriere, M.; Dubray, N.; Schunck, N.
2015-11-30
In this study, we describe the software package FELIX that solves the equations of the time-dependent generator coordinate method (TDGCM) in N dimensions (N ≥ 1) under the Gaussian overlap approximation. The numerical resolution is based on the Galerkin finite element discretization of the collective space and the Crank–Nicolson scheme for time integration. The TDGCM solver is implemented entirely in C++. Several additional tools written in C++, Python or bash scripting language are also included for convenience. In this paper, the solver is tested with a series of benchmark calculations. We also demonstrate the ability of our code to handle a realistic calculation of fission dynamics.
NASA Astrophysics Data System (ADS)
Regnier, D.; Verrière, M.; Dubray, N.; Schunck, N.
2016-03-01
We describe the software package FELIX that solves the equations of the time-dependent generator coordinate method (TDGCM) in N dimensions (N ≥ 1) under the Gaussian overlap approximation. The numerical resolution is based on the Galerkin finite element discretization of the collective space and the Crank-Nicolson scheme for time integration. The TDGCM solver is implemented entirely in C++. Several additional tools written in C++, Python or bash scripting language are also included for convenience. In this paper, the solver is tested with a series of benchmark calculations. We also demonstrate the ability of our code to handle a realistic calculation of fission dynamics.
NASA Astrophysics Data System (ADS)
Lin, Xue-lei; Lu, Xin; Ng, Michael K.; Sun, Hai-Wei
2016-10-01
A fast and accurate approximation method with a multigrid solver is proposed to solve a two-dimensional fractional sub-diffusion equation. Using the finite difference discretization of the fractional time derivative, a block lower triangular Toeplitz matrix is obtained where each main diagonal block contains a two-dimensional matrix for the Laplacian operator. Our idea is to make use of the block ɛ-circulant approximation via fast Fourier transforms, so that the resulting task is to solve a block diagonal system, where each diagonal block matrix is the sum of a complex scalar times the identity matrix and a Laplacian matrix. We show that the accuracy of the approximation scheme is of O(ɛ). Because of the special diagonal block structure, we employ the multigrid method to solve the resulting linear systems. The convergence of the multigrid method is studied. Numerical examples are presented to illustrate the accuracy of the proposed approximation scheme and the efficiency of the proposed solver.
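The ɛ-circulant idea can be illustrated on a scalar lower triangular Toeplitz system, which is a simplification of the block case described above (scalar entries stand in for the Laplacian blocks). In this sketch the wrapped upper-triangular entries are scaled by ɛ, a diagonal scaling turns the matrix into an ordinary circulant, and the circulant is diagonalized by a discrete Fourier transform. All names are hypothetical, and a naive O(n²) DFT is used only to keep the example dependency-free; a real implementation would use an FFT.

```python
import cmath


def dft(v, inverse=False):
    """Naive discrete Fourier transform (stand-in for an FFT)."""
    n = len(v)
    sign = 1 if inverse else -1
    out = [
        sum(v[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]
    return [z / n for z in out] if inverse else out


def solve_eps_circulant(t, b, eps):
    """Solve C_eps x = b, where C_eps is the eps-circulant matrix whose
    lower triangle is the Toeplitz matrix with first column t."""
    n = len(t)
    delta = eps ** (1.0 / n)
    scale = [delta ** k for k in range(n)]        # D = diag(delta^k)
    c = [s * tk for s, tk in zip(scale, t)]       # first column of D C_eps D^-1
    rhs = [s * bk for s, bk in zip(scale, b)]     # D b
    eig = dft(c)                                  # eigenvalues of the circulant
    y = dft([r / e for r, e in zip(dft(rhs), eig)], inverse=True)
    return [(yk / s).real for yk, s in zip(y, scale)]  # x = D^-1 y


def solve_lower_toeplitz(t, b):
    """Reference solution of T x = b by forward substitution."""
    x = []
    for i in range(len(t)):
        acc = b[i] - sum(t[i - j] * x[j] for j in range(i))
        x.append(acc / t[0])
    return x
```

Comparing `solve_eps_circulant(t, b, eps)` with `solve_lower_toeplitz(t, b)` for decreasing `eps` gives a quick check that the approximation error shrinks with ɛ, at the cost of a more badly scaled diagonal matrix for very small ɛ.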
Divergence-free approximate Riemann solver for the quasi-neutral two-fluid plasma model
NASA Astrophysics Data System (ADS)
Amano, Takanobu
2015-10-01
A numerical method for the quasi-neutral two-fluid (QNTF) plasma model is described. The basic equations are ion and electron fluid equations and the Maxwell equations without displacement current. The neglect of displacement current is consistent with the assumption of charge neutrality. Therefore, Langmuir waves and electromagnetic waves are eliminated from the system, which is in clear contrast to the fully electromagnetic two-fluid model. It thus reduces to the ideal magnetohydrodynamic (MHD) equations in the long wavelength limit, but the two-fluid effect appearing at ion and electron inertial scales is fully taken into account. It is shown that the basic equations may be rewritten in a form that has formally the same structure as the MHD equations. The total mass, momentum, and energy are all written in the conservative form. A new three-dimensional numerical simulation code has been developed for the QNTF equations. The HLL (Harten-Lax-van Leer) approximate Riemann solver combined with the upwind constrained transport (UCT) scheme is applied. The method was originally developed for MHD [25], but works quite well for the present model as well. The simulation code is able to capture sharp multidimensional discontinuities as well as dispersive waves arising from the two-fluid effect at small scales without producing ∇ · B errors. It is well known that conventional Hall-MHD codes often suffer a numerical stability issue associated with short wavelength whistler waves. On the other hand, since finite electron inertia introduces an upper bound to the phase speed of whistler waves in the present model, our code is free from the issue even without explicit dissipation terms or implicit time integration. Numerical experiments have confirmed that there is no need to resolve characteristic time scales such as plasma frequency or cyclotron frequency for numerical stability. Consequently, the QNTF model offers a better alternative to the Hall-MHD or fully
Fast solvers for finite difference approximations for the Stokes and Navier-Stokes equations
Shin, D.
1992-01-01
The authors consider several methods for solving the linear equations arising from finite difference discretizations of the Stokes equations. The pressure equation method, apparently presented here for the first time, and the method presented by Bramble and Pasciak are shown to have computational effort that grows slowly with the number of grid points. The methods work with second-order accurate discretizations. Computational results are shown for both the Stokes and incompressible Navier-Stokes equations at low Reynolds number. The inf-sup conditions resulting from three finite difference approximations of the Stokes equations are proven. These conditions are used to prove that the Schur complement Q_h of the linear system generated by each of these approximations is bounded uniformly away from zero. For the pressure equation method, this guarantees that the conjugate gradient method applied to Q_h converges in a finite number of iterations which is independent of mesh size. The fact that Q_h is bounded below is used to prove convergence estimates for the solutions generated by these finite difference approximations. One of the estimates is for a staggered grid, and it shows that both the pressure and the velocity parts of the solution are second-order accurate. Iterative methods are compared by the use of the regularized central differencing introduced by Strikwerda. Several finite difference approximations of the Stokes equations are compared using the SOR method, and the advantage of regularized central differencing over the other finite difference approximations is noted. This differencing gives rise to a linear equation with a matrix that is slightly non-symmetric. The convergence of the typical steepest descent method and of a conjugate gradient method that is almost the same as the typical conjugate gradient method, applied to slightly non-symmetric positive definite matrices, is proven.
IDA: An implicit, parallelizable method for calculating drainage area
NASA Astrophysics Data System (ADS)
Richardson, Alan; Hill, Christopher N.; Perron, J. Taylor
2014-05-01
Models of landscape evolution or hydrological processes typically depend on the accurate determination of upslope drainage area from digital elevation data, but such calculations can be very computationally demanding when applied to high-resolution topographic data. To overcome this limitation, we propose calculating drainage area in an implicit, iterative manner using linear solvers. The basis of this method is a recasting of the flow routing problem as a sparse system of linear equations, which can be solved using established computational techniques. This approach is highly parallelizable, enabling data to be spread over multiple computer processors. Good scalability is exhibited, rendering it suitable for contemporary high-performance computing architectures with many processors, such as graphics processing units (GPUs). In addition, the iterative nature of the computational algorithms we use to solve the linear system creates the possibility of accelerating the solution by providing an initial guess, making the method well suited to iterative calculations such as numerical landscape evolution models. We compare this method with a previously proposed parallel drainage area algorithm and present several examples illustrating its advantages, including a continent-scale flow routing calculation at 3 arc sec resolution, improvements to models of fluvial sediment yield, and acceleration of drainage area calculations in a landscape evolution model. We additionally describe a modification that allows the method to be used for parallel basin delineation.
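The recasting of flow routing as a linear system can be sketched for the simplest case, in which every cell drains to a single downstream neighbor. The drainage area a then satisfies a = c + Wᵀa, where c holds the cell areas and W routes each cell to its receiver. The example below is a hypothetical illustration rather than the IDA implementation: it uses a plain Jacobi-style fixed-point iteration in place of the sparse linear solvers mentioned in the abstract, and the `receiver` array encoding and function name are assumptions made here.

```python
def drainage_area(receiver, cell_area=1.0, initial=None, max_iter=1000, tol=1e-12):
    """Iteratively solve a = c + W^T a for single-direction flow routing.

    receiver[i] is the index of the cell that cell i drains into,
    or -1 if cell i is an outlet (hypothetical encoding).
    initial is an optional starting guess, e.g. the areas from a
    previous time step of a landscape evolution model.
    """
    n = len(receiver)
    area = list(initial) if initial is not None else [cell_area] * n
    for _ in range(max_iter):
        # One sweep: every cell starts with its own area, then receives
        # the current drainage-area estimate of each cell draining into it.
        new = [cell_area] * n
        for i, r in enumerate(receiver):
            if r >= 0:
                new[r] += area[i]
        if max(abs(a - b) for a, b in zip(new, area)) <= tol:
            return new
        area = new
    raise RuntimeError("drainage area iteration did not converge")
```

For a chain of four cells, `drainage_area([1, 2, 3, -1])` accumulates one extra cell per step downstream. Each update of a cell depends only on the previous iterate, so the sweep can be split across processors, and a good `initial` guess reduces the number of sweeps needed.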
NASA Astrophysics Data System (ADS)
Yeckel, Andrew; Lun, Lisa; Derby, Jeffrey J.
2009-12-01
A new, approximate block Newton (ABN) method is derived and tested for the coupled solution of nonlinear models, each of which is treated as a modular, black box. Such an approach is motivated by a desire to maintain software flexibility without sacrificing solution efficiency or robustness. Though block Newton methods of similar type have been proposed and studied, we present a unique derivation and use it to sort out some of the more confusing points in the literature. In particular, we show that our ABN method behaves like a Newton iteration preconditioned by an inexact Newton solver derived from subproblem Jacobians. The method is demonstrated on several conjugate heat transfer problems modeled after melt crystal growth processes. These problems are represented by partitioned spatial regions, each modeled by independent heat transfer codes and linked by temperature and flux matching conditions at the boundaries common to the partitions. Whereas a typical block Gauss-Seidel iteration fails about half the time for the model problem, quadratic convergence is achieved by the ABN method under all conditions studied here. Additional performance advantages over existing methods are demonstrated and discussed.
NASA Astrophysics Data System (ADS)
Bauer, Petr; Klement, Vladimír; Oberhuber, Tomáš; Žabka, Vítězslav
2016-03-01
We present a complete GPU implementation of a geometric multigrid solver for the numerical solution of the Navier-Stokes equations for incompressible flow. The approximate solution is constructed on a two-dimensional unstructured triangular mesh. The problem is discretized by means of the mixed finite element method with semi-implicit timestepping. The linear saddle-point problem arising from the scheme is solved by the geometric multigrid method with a Vanka-type smoother. The parallel solver is based on the red-black coloring of the mesh triangles. We achieved a speed-up of 11 compared to a parallel (4 threads) code based on OpenMP and 19 compared to a sequential code.
Tezaur, I. K.; Perego, M.; Salinger, A. G.; Tuminaro, R. S.; Price, S. F.
2015-04-27
This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
NASA Astrophysics Data System (ADS)
Kalashnikova, I.; Perego, M.; Salinger, A. G.; Tuminaro, R. S.; Price, S. F.
2014-11-01
This paper describes a new parallel, scalable and robust finite-element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and Template-Based Generic Programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using: (1) new test cases derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution is then studied on problems involving a realistic Greenland ice sheet geometry discretized using structured and unstructured meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
NASA Astrophysics Data System (ADS)
Tezaur, I. K.; Perego, M.; Salinger, A. G.; Tuminaro, R. S.; Price, S. F.
2015-04-01
This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
Homman, Ahmed-Amine; Maillet, Jean-Bernard; Roussel, Julien; Stoltz, Gabriel
2016-01-14
This work presents new parallelizable numerical schemes for the integration of dissipative particle dynamics with energy conservation. So far, no numerical scheme introduced in the literature is able to correctly preserve the energy over long times and give rise to small errors on average properties for moderately small time steps, while being straightforwardly parallelizable. We present in this article two new methods, both straightforwardly parallelizable, that correctly preserve the total energy of the system. We illustrate the accuracy and performance of these new schemes on both equilibrium and nonequilibrium parallel simulations. PMID:26772559
NASA Astrophysics Data System (ADS)
Homman, Ahmed-Amine; Maillet, Jean-Bernard; Roussel, Julien; Stoltz, Gabriel
2016-01-01
This work presents new parallelizable numerical schemes for the integration of dissipative particle dynamics with energy conservation. So far, no numerical scheme introduced in the literature is able to correctly preserve the energy over long times and give rise to small errors on average properties for moderately small time steps, while being straightforwardly parallelizable. We present in this article two new methods, both straightforwardly parallelizable, that correctly preserve the total energy of the system. We illustrate the accuracy and performance of these new schemes on both equilibrium and nonequilibrium parallel simulations.
Hierarchically parallelized constrained nonlinear solvers with automated substructuring
NASA Technical Reports Server (NTRS)
Padovan, J.; Kwang, A.
1991-01-01
This paper develops a parallelizable multilevel constrained nonlinear equation solver. The substructuring process is automated to yield appropriately balanced partitioning of each succeeding level. Due to the generality of the procedure, sequential, partially parallel, and fully parallel environments can all be handled. This includes both single and multiprocessor assignment per individual partition. Several benchmark examples are presented. These illustrate the robustness of the procedure as well as its capacity to yield significant reductions in memory utilization and computational effort due both to updating and inversion.
Hierarchically Parallelized Constrained Nonlinear Solvers with Automated Substructuring
NASA Technical Reports Server (NTRS)
Padovan, Joe; Kwang, Abel
1994-01-01
This paper develops a parallelizable multilevel multiple constrained nonlinear equation solver. The substructuring process is automated to yield appropriately balanced partitioning of each succeeding level. Due to the generality of the procedure, sequential, as well as partially and fully parallel environments can be handled. This includes both single and multiprocessor assignment per individual partition. Several benchmark examples are presented. These illustrate the robustness of the procedure as well as its capability to yield significant reductions in memory utilization and calculational effort due both to updating and inversion.
Stanley, Vendall S.; Heroux, Michael A.; Hoekstra, Robert J.; Sala, Marzio
2004-03-01
Amesos is the Direct Sparse Solver Package in Trilinos. The goal of Amesos is to make AX=B as easy as it sounds, at least for direct methods. Amesos provides interfaces to a number of third party sparse direct solvers, including SuperLU, SuperLU MPI, DSCPACK, UMFPACK and KLU. Amesos provides a common object oriented interface to the best sparse direct solvers in the world. A sparse direct solver solves for x in Ax = b, where A is a matrix and x and b are vectors (or multi-vectors). A sparse direct solver first factors A into triangular matrices L and U such that A = LU via Gaussian elimination and then solves LUx = b. Switching amongst solvers in Amesos requires a change to a single parameter. Yet, no solver needs to be linked in, unless it is used. All conversions between the matrices provided by the user and the format required by the underlying solver are performed by Amesos. As new sparse direct solvers are created, they will be incorporated into Amesos, allowing the user to simply link with the new solver, change a single parameter in the calling sequence, and use the new solver. Amesos allows users to specify whether the matrix has changed. Amesos can be used anywhere that any sparse direct solver is needed.
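Illustrative note (not part of the abstract): the factor-then-solve pattern described above, A = LU followed by a forward and a back substitution, can be sketched on a small dense matrix. This is not the Amesos interface and it ignores sparsity entirely; the function names lu_factor and lu_solve are hypothetical, and partial pivoting is added here as an assumption to keep the toy example numerically safe.

```python
def lu_factor(A):
    """Gaussian elimination with partial pivoting on a dense list-of-lists.

    Returns (LU, perm): L (unit lower triangle) and U (upper triangle) share
    one matrix, and row k of the permuted matrix is row perm[k] of A.
    """
    n = len(A)
    LU = [row[:] for row in A]
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))  # pivot row
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            LU[k], LU[p] = LU[p], LU[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]  # multiplier, stored in the L part
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU, perm


def lu_solve(LU, perm, b):
    """Solve A x = b using factors from lu_factor (reusable for many b)."""
    n = len(LU)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):  # forward substitution with unit-diagonal L
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            x[i] -= LU[i][j] * x[j]
        x[i] /= LU[i][i]
    return x


A = [[4.0, 3.0], [6.0, 3.0]]
LU, perm = lu_factor(A)            # factor once
x1 = lu_solve(LU, perm, [10.0, 12.0])
x2 = lu_solve(LU, perm, [7.0, 9.0])  # second right-hand side, same factors
print(x1, x2)
```

Separating the factorization from the solve is what makes it cheap to handle several right-hand sides, or to skip refactoring when the matrix has not changed.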
NASA Technical Reports Server (NTRS)
Ilin, Andrew V.
2006-01-01
The Magnetic Field Solver computer program calculates the magnetic field generated by a group of collinear, cylindrical axisymmetric electromagnet coils. Given the current flowing in, and the number of turns, axial position, and axial and radial dimensions of each coil, the program calculates matrix coefficients for a finite-difference system of equations that approximates a two-dimensional partial differential equation for the magnetic potential contributed by the coil. The program iteratively solves these finite-difference equations by use of the modified incomplete Cholesky preconditioned-conjugate-gradient method. The total magnetic potential as a function of axial (z) and radial (r) position is then calculated as a sum of the magnetic potentials of the individual coils, using a high-accuracy interpolation scheme. Then the r and z components of the magnetic field as functions of r and z are calculated from the total magnetic potential by use of a high-accuracy finite-difference scheme. Notably, for the finite-difference calculations, the program generates nonuniform two-dimensional computational meshes from nonuniform one-dimensional meshes. Each mesh is generated in such a way as to minimize the numerical error for a benchmark one-dimensional magnetostatic problem.
Solving block linear systems with low-rank off-diagonal blocks is easily parallelizable
Menkov, V.
1996-12-31
An easily and efficiently parallelizable direct method is given for solving a block linear system Bx = y, where B = D + Q is the sum of a non-singular block diagonal matrix D and a matrix Q with low-rank blocks. This implicitly defines a new preconditioning method with an operation count close to the cost of calculating a matrix-vector product Qw for some w, plus at most twice the cost of calculating Qw for some w. When implemented on a parallel machine the processor utilization can be as good as that of those operations. Order estimates are given for the general case, and an implementation is compared to block SSOR preconditioning.
Matrix decomposition graphics processing unit solver for Poisson image editing
NASA Astrophysics Data System (ADS)
Lei, Zhao; Wei, Li
2012-10-01
In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computationally and memory-intensive task, which makes it unsuitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to address the problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS takes full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (combining direct and iterative techniques) and has a two-level architecture. These features enable MDGS to generate solutions identical to those of the common Poisson methods and to achieve a high convergence rate in most cases. This approach is advantageous in terms of parallelizability, real-time image processing capability, low memory consumption, and breadth of application.
Kinetic simulation of fiber amplifier based on parallelizable and bidirectional algorithm
NASA Astrophysics Data System (ADS)
Chen, Haihuan; Yang, Huanbi; Wu, Wenhan
2015-10-01
Simulating counter-propagating light waves in fibers with sequential, unidirectional methods, in which the simulation runs in a coordinate system that moves along with the light waves, requires handling an extremely large volume of data. Therefore, an alternative simulation algorithm should be used when calculating counter-propagating light waves. The parallelizable and bidirectional (PB) algorithm simulates the light waves matching in the time domain instead of the space domain, does not need iteration, and permits efficient parallelization on multiple processors. The PB method was proposed to calculate the propagation of a dispersing Gaussian pulse and a bit stream in fibers. However, the PB method also has apparent advantages when simulating pulses in fiber laser amplifiers, which have not yet been investigated in detail. In this paper, we perform the simulation of pulses in a rare-earth-ion-doped fiber amplifier. The influence of pump power, signal power, repetition rate, pulse width and fiber length on the amplifier's output average power, peak power, pulse energy and pulse shape is investigated. The results indicate that the PB method is effective when simulating high-power amplification of pulses in a fiber amplifier. Furthermore, nonlinear effects can be added to the simulation conveniently. The work in this paper provides a more economical and efficient method to simulate power amplification in fiber lasers.
Parallel Multigrid Equation Solver
2001-09-07
Prometheus is a fully parallel multigrid equation solver for matrices that arise in unstructured grid finite element applications. It includes a geometric and an algebraic multigrid method and has solved problems of up to 76 million degrees of freedom, problems in linear elasticity on the ASCI Blue Pacific and ASCI Red machines.
Two-dimensional time dependent Riemann solvers for neutron transport
Brunner, Thomas A. . E-mail: tabrunn@sandia.gov; Holloway, James Paul
2005-11-20
A two-dimensional Riemann solver is developed for the spherical harmonics approximation to the time dependent neutron transport equation. The eigenstructure of the resulting equations is explored, giving insight into both the spherical harmonics approximation and the Riemann solver. The classic Roe-type Riemann solver used here was developed for one-dimensional problems, but can be used in multidimensional problems by treating each face of a two-dimensional computation cell in a locally one-dimensional way. Several test problems are used to explore the capabilities of both the Riemann solver and the spherical harmonics approximation. The numerical solution for a simple line source problem is compared to the analytic solution to both the P1 equation and the full transport solution. A lattice problem is used to test the method on a more challenging problem.
2004-03-01
PLIRIS is an object-oriented solver built on top of a previous matrix solver used in a number of application codes. PLIRIS solves a linear system directly via LU factorization with partial pivoting. The user provides the linear system in terms of Epetra Objects including a matrix and right-hand-sides. The user can then factor the matrix and perform the forward and back solve at a later time or solve for multiple right-hand-sides at once. This package is used when dense matrices are obtained in the problem formulation. These dense matrices occur whenever boundary element techniques are chosen for the solution procedure. This has been used in electromagnetics for both static and frequency domain problems.
A non-conforming 3D spherical harmonic transport solver
Van Criekingen, S.
2006-07-01
A new 3D transport solver for the time-independent Boltzmann transport equation has been developed. This solver is based on the second-order even-parity form of the transport equation. The angular discretization is performed through the expansion of the angular neutron flux in spherical harmonics (PN method). The novelty of this solver is the use of non-conforming finite elements for the spatial discretization. Such elements lead to a discontinuous flux approximation. This interface continuity requirement relaxation property is shared with mixed-dual formulations such as the ones based on Raviart-Thomas finite elements. Encouraging numerical results are presented. (authors)
Heroux, Michael A.
2007-03-01
HPCCG is a simple PDE application and preconditioned conjugate gradient solver that solves a linear system on a beam-shaped domain. Although it does not address many performance issues present in real engineering applications, such as load imbalance and preconditioner scalability, it can serve as a first "sanity test" of new processor design choices, inter-connect network design choices and the scalability of a new computer system. Because it is self-contained, easy to compile and easily scaled to 100s or 1000s of processors, it can be an attractive study code for computer system designers.
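Illustrative note (not part of the abstract): a preconditioned conjugate gradient iteration of the general kind named above can be sketched in a few lines. The abstract does not say which preconditioner HPCCG uses, so the diagonal (Jacobi) preconditioner here is an assumption, as are the function names pcg and laplacian_1d_matvec and the 1D Laplacian test problem, which stands in for the beam-shaped domain.

```python
from math import sqrt


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def laplacian_1d_matvec(p):
    """Apply the tridiagonal matrix (-1, 2, -1) without storing it."""
    n = len(p)
    return [2.0 * p[i]
            - (p[i - 1] if i > 0 else 0.0)
            - (p[i + 1] if i + 1 < n else 0.0)
            for i in range(n)]


def pcg(matvec, b, diag, tol=1e-10, max_iter=500):
    """Jacobi-preconditioned CG for a symmetric positive definite system.

    Returns (x, iterations); stops when ||r|| <= tol * ||b||.
    """
    x = [0.0] * len(b)
    r = list(b)                                   # residual for x = 0
    b_norm = sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    z = [ri / di for ri, di in zip(r, diag)]      # apply preconditioner
    p = list(z)
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    raise RuntimeError("PCG did not converge")


n = 10
b = laplacian_1d_matvec([1.0] * n)   # right-hand side whose solution is all ones
x, iterations = pcg(laplacian_1d_matvec, b, diag=[2.0] * n)
print(x, iterations)
```

The dot products and the matrix-vector product are the only steps that need communication in a distributed setting, which is why a code of this shape is a convenient probe of interconnect behavior.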
Scalable solvers and applications
Ribbens, C J
2000-10-27
The purpose of this report is to summarize research activities carried out under Lawrence Livermore National Laboratory (LLNL) research subcontract B501073. This contract supported the principal investigator (PI), Dr. Calvin Ribbens, during his sabbatical visit to LLNL from August 1999 through June 2000. Results and conclusions from the work are summarized below in two major sections. The first section covers contributions to the Scalable Linear Solvers and hypre projects in the Center for Applied Scientific Computing (CASC). The second section describes results from collaboration with Patrice Turchi of LLNL's Chemistry and Materials Science Directorate (CMS). A list of publications supported by this subcontract appears at the end of the report.
Euler solvers for transonic applications
NASA Technical Reports Server (NTRS)
Vanleer, Bram
1989-01-01
The 1980s may well be called the Euler era of applied aerodynamics. Computer codes based on discrete approximations of the Euler equations are now routinely used to obtain solutions of transonic flow problems in which the effects of entropy and vorticity production are significant. Such codes can even predict separation from a sharp edge, owing to the inclusion of artificial dissipation, intended to lend numerical stability to the calculation but at the same time enforcing the Kutta condition. One effect not correctly predictable by Euler codes is the separation from a smooth surface, and neither is viscous drag; for these some form of the Navier-Stokes equation is needed. It, therefore, comes as no surprise to observe that the Navier-Stokes era had already begun before Euler solutions were fully exploited. Moreover, most numerical developments for the Euler equations are now constrained by the requirement that the techniques introduced, notably artificial dissipation, must not interfere with the new physics added when going from an Euler to a full Navier-Stokes approximation. In order to appreciate the contributions of Euler solvers to the understanding of transonic aerodynamics, it is useful to review the components of these computational tools. The essential constituents, namely space discretization, time- or pseudo-time marching, and boundary procedures, are discussed. The subjects of grid generation and grid adaptation to the solution are touched upon only where relevant. A list of unanswered questions and an outlook for the future are covered.
Parallel tridiagonal equation solvers
NASA Technical Reports Server (NTRS)
Stone, H. S.
1974-01-01
Three parallel algorithms were compared for the direct solution of tridiagonal linear systems of equations. The algorithms are suitable for computers such as ILLIAC 4 and CDC STAR. For array computers similar to ILLIAC 4, cyclic odd-even reduction has the least operation count for highly structured sets of equations, and recursive doubling has the least count for relatively unstructured sets of equations. Since the difference in operation counts for these two algorithms is not substantial, their relative running times may be more related to overhead operations, which are not measured in this paper. The third algorithm, based on Buneman's Poisson solver, has more arithmetic operations than the others, and appears to be the least favorable. For pipeline computers similar to CDC STAR, cyclic odd-even reduction appears to be the most preferable algorithm for all cases.
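Illustrative note (not part of the abstract): cyclic odd-even reduction, one of the algorithms compared above, can be sketched as follows. Each pass eliminates the even-indexed unknowns (0-based) using their neighboring equations, leaving a tridiagonal system of half the size in the odd-indexed unknowns; the eliminations within a pass are independent of each other, which is the source of parallelism. The recursive serial formulation, the function name cyclic_reduction, and the absence of pivoting (which assumes a well-conditioned, for example diagonally dominant, system) are choices made for this sketch.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i], i = 0..n-1.

    a is the sub-diagonal (a[0] must be 0), b the diagonal, c the
    super-diagonal (c[n-1] must be 0), d the right-hand side.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: build the half-size system for the odd-indexed unknowns.
    # Each iteration of this loop is independent of the others.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]          # removes x[i-1] via equation i-1
        new_b = b[i] - alpha * c[i - 1]
        new_c = 0.0
        new_d = d[i] - alpha * d[i - 1]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]      # removes x[i+1] via equation i+1
            new_b -= gamma * a[i + 1]
            new_c = -gamma * c[i + 1]
            new_d -= gamma * d[i + 1]
        ra.append(-alpha * a[i - 1])
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back substitution: recover the even-indexed unknowns, again independently.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# Hypothetical test system: tridiagonal (-1, 2, -1) of size 7 whose exact
# solution is x = [1, 2, ..., 7].
n = 7
a = [0.0] + [-1.0] * (n - 1)
b = [2.0] * n
c = [-1.0] * (n - 1) + [0.0]
d = [0.0] * (n - 1) + [8.0]
x = cyclic_reduction(a, b, c, d)
print(x)
```

The recursion depth grows with the logarithm of the system size, which is why operation counts and overheads, rather than the number of passes, decide between this method and recursive doubling.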
Amesos2 Templated Direct Sparse Solver Package
2011-05-24
Amesos2 is a templated direct sparse solver package. Amesos2 provides interfaces to direct sparse solvers, rather than providing native solver capabilities. Amesos2 is a derivative work of the Trilinos package Amesos.
Modiri, A; Gu, X; Sawant, A
2014-06-15
Purpose: We present a particle swarm optimization (PSO)-based 4D IMRT planning technique designed for dynamic MLC tracking delivery to lung tumors. The key idea is to utilize the temporal dimension as an additional degree of freedom rather than a constraint in order to achieve improved sparing of organs at risk (OARs). Methods: The target and normal structures were manually contoured on each of the ten phases of a 4DCT scan acquired from a lung SBRT patient who exhibited 1.5 cm tumor motion despite the use of abdominal compression. Corresponding ten IMRT plans were generated using the Eclipse treatment planning system. These plans served as initial guess solutions for the PSO algorithm. Fluence weights were optimized over the entire solution space, i.e., 10 phases × 12 beams × 166 control points. The size of the solution space motivated our choice of PSO, which is a highly parallelizable stochastic global optimization technique that is well-suited for such large problems. A summed fluence map was created using an in-house B-spline deformable image registration. Each plan was compared with a corresponding, internal target volume (ITV)-based IMRT plan. Results: The PSO 4D IMRT plan yielded comparable PTV coverage and significantly higher dose sparing for parallel and serial OARs compared to the ITV-based plan. The dose sparing achieved via PSO-4DIMRT was: lung Dmean = 28%; lung V20 = 90%; spinal cord Dmax = 23%; esophagus Dmax = 31%; heart Dmax = 51%; heart Dmean = 64%. Conclusion: Truly 4D IMRT that uses the temporal dimension as an additional degree of freedom can achieve significant dose sparing of serial and parallel OARs. Given the large solution space, PSO represents an attractive, parallelizable tool to achieve globally optimal solutions for such problems. This work was supported through funding from the National Institutes of Health and Varian Medical Systems. Amit Sawant has research funding from Varian Medical Systems, VisionRT Ltd. and Elekta.
A multigrid solver for the semiconductor equations
NASA Technical Reports Server (NTRS)
Bachmann, Bernhard
1993-01-01
We present a multigrid solver for the exponential fitting method. The solver is applied to the current continuity equations of semiconductor device simulation in two dimensions. The exponential fitting method is based on a mixed finite element discretization using the lowest-order Raviart-Thomas triangular element. This discretization method yields a good approximation of front layers and guarantees current conservation. The corresponding stiffness matrix is an M-matrix. 'Standard' multigrid solvers, however, cannot be applied to the resulting system, as this is dominated by an unsymmetric part, which is due to the presence of strong convection in part of the domain. To overcome this difficulty, we explore the connection between Raviart-Thomas mixed methods and the nonconforming Crouzeix-Raviart finite element discretization. In this way we can construct nonstandard prolongation and restriction operators using easily computable weighted L^2-projections based on suitable quadrature rules and the upwind effects of the discretization. The resulting multigrid algorithm shows very good results, even for real-world problems and for locally refined grids.
Sherlock Holmes, Master Problem Solver.
ERIC Educational Resources Information Center
Ballew, Hunter
1994-01-01
Shows the connections between Sherlock Holmes's investigative methods and mathematical problem solving, including observations, characteristics of the problem solver, importance of data, questioning the obvious, learning from experience, learning from errors, and indirect proof. (MKR)
Fast wavelet based sparse approximate inverse preconditioner
Wan, W.L.
1996-12-31
Incomplete LU factorization is a robust preconditioner for both general and PDE problems, but unfortunately it is not easy to parallelize. Recent studies by Huckle and Grote and by Chow and Saad showed that sparse approximate inverses could be a potential alternative while being readily parallelizable. However, for the special class of matrices A that come from elliptic PDE problems, their preconditioners are not optimal in the sense of giving convergence that is independent of the mesh size. A reason may be that no good sparse approximate inverse exists for the dense inverse matrix. Our observation is that for this kind of matrix, the entries of the inverse typically change in a piecewise smooth manner. We can take advantage of this fact and use wavelet compression techniques to construct a better sparse approximate inverse preconditioner. We shall show numerically that our approach is effective for this kind of matrix.
Structured Multifrontal Sparse Solver
2014-05-01
StruMF is an algebraic structured preconditioner for the iterative solution of large sparse linear systems. The preconditioner corresponds to a multifrontal variant of sparse LU factorization in which some dense blocks of the factors are approximated with low-rank matrices. It is algebraic in that it only requires the linear system itself, and the approximation threshold that determines the accuracy of individual low-rank approximations. Favourable rank properties are obtained using a block partitioning which is a refinement of the partitioning induced by nested dissection ordering.
MILAMIN 2 - Fast MATLAB FEM solver
NASA Astrophysics Data System (ADS)
Dabrowski, Marcin; Krotkiewski, Marcin; Schmid, Daniel W.
2013-04-01
MILAMIN is a free and efficient MATLAB-based two-dimensional FEM solver utilizing unstructured meshes [Dabrowski et al., G-cubed (2008)]. The code consists of steady-state thermal diffusion and incompressible Stokes flow solvers implemented in approximately 200 lines of native MATLAB code. The brevity makes the code easily customizable. An important quality of MILAMIN is speed - it can handle millions of nodes within minutes on one CPU core of a standard desktop computer, and is faster than many commercial solutions. The new MILAMIN 2 allows three-dimensional modeling. It is designed as a set of functional modules that can be used as building blocks for efficient FEM simulations using MATLAB. The utilities are largely implemented as native MATLAB functions. For performance critical parts we use MUTILS - a suite of compiled MEX functions optimized for shared memory multi-core computers. The most important features of MILAMIN 2 are: 1. Modular approach to defining, tracking, and discretizing the geometry of the model 2. Interfaces to external mesh generators (e.g., Triangle, Fade2d, T3D) and mesh utilities (e.g., element type conversion, fast point location, boundary extraction) 3. Efficient computation of the stiffness matrix for a wide range of element types, anisotropic materials and three-dimensional problems 4. Fast global matrix assembly using a dedicated MEX function 5. Automatic integration rules 6. Flexible prescription (spatial, temporal, and field functions) and efficient application of Dirichlet, Neumann, and periodic boundary conditions 7. Treatment of transient and non-linear problems 8. Various iterative and multi-level solution strategies 9. Post-processing tools (e.g., numerical integration) 10. Visualization primitives using MATLAB, and VTK export functions We provide a large number of examples that show how to implement a custom FEM solver using the MILAMIN 2 framework. The examples are MATLAB scripts of increasing complexity that address a given
Scalable Parallel Algebraic Multigrid Solvers
Bank, R; Lu, S; Tong, C; Vassilevski, P
2005-03-23
The authors propose a parallel algebraic multilevel algorithm (AMG), which has the novel feature that the subproblem residing in each processor is defined over the entire partition domain, although the vast majority of unknowns for each subproblem are associated with the partition owned by the corresponding processor. This feature ensures that a global coarse description of the problem is contained within each of the subproblems. The advantages of this approach are that interprocessor communication is minimized in the solution process while an optimal order of convergence rate is preserved; and the speed of local subproblem solvers can be maximized using the best existing sequential algebraic solvers.
NASA Technical Reports Server (NTRS)
Mineck, Raymond E.; Thomas, James L.; Biedron, Robert T.; Diskin, Boris
2005-01-01
FMG3D (full multigrid 3 dimensions) is a pilot computer program that solves equations of fluid flow using a finite difference representation on a structured grid. Infrastructure exists for three dimensions but the current implementation treats only two dimensions. Written in Fortran 90, FMG3D takes advantage of the recursive subroutine feature, dynamic memory allocation, and structured-programming constructs of that language. FMG3D supports multi-block grids with three types of block-to-block interfaces: periodic, C-zero, and C-infinity. For all three types, grid points must match at interfaces. For periodic and C-infinity types, derivatives of grid metrics must be continuous at interfaces. The available equation sets are as follows: scalar elliptic equations, scalar convection equations, and the pressure-Poisson formulation of the Navier-Stokes equations for an incompressible fluid. All the equation sets are implemented with nonzero forcing functions to enable the use of user-specified solutions to assist in verification and validation. The equations are solved with a full multigrid scheme using a full approximation scheme to converge the solution on each succeeding grid level. Restriction to the next coarser mesh uses direct injection for variables and full weighting for residual quantities; prolongation of the coarse grid correction from the coarse mesh to the fine mesh uses bilinear interpolation; and prolongation of the coarse grid solution uses bicubic interpolation.
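The grid-transfer operators named in the abstract above can be illustrated with a one-dimensional analogue. The following is only a sketch under stated assumptions, not FMG3D's Fortran 90 code: the function names are hypothetical, the fine grid is assumed to have an odd number of points, boundary values are simply copied in the full-weighting step, and linear interpolation stands in for the bilinear prolongation used in two dimensions.

```python
def inject(fine):
    """Direct injection: keep every second fine-grid value (used for variables)."""
    return fine[::2]


def full_weighting(fine):
    """Full weighting in 1-D: interior coarse values are (1/4, 1/2, 1/4) averages.

    Boundary values are copied unchanged; this boundary treatment is an
    assumption made for the sketch.
    """
    n_coarse = (len(fine) + 1) // 2
    coarse = [0.0] * n_coarse
    coarse[0] = fine[0]
    coarse[-1] = fine[-1]
    for i in range(1, n_coarse - 1):
        coarse[i] = 0.25 * fine[2 * i - 1] + 0.5 * fine[2 * i] + 0.25 * fine[2 * i + 1]
    return coarse


def prolong_linear(coarse):
    """Linear interpolation to the fine grid (1-D counterpart of bilinear)."""
    fine = [0.0] * (2 * len(coarse) - 1)
    for i, value in enumerate(coarse):
        fine[2 * i] = value  # coincident points are copied
    for i in range(len(coarse) - 1):
        fine[2 * i + 1] = 0.5 * (coarse[i] + coarse[i + 1])  # midpoints are averaged
    return fine


fine_values = [0.0, 1.0, 2.0, 3.0, 4.0]
coarse_injected = inject(fine_values)
coarse_weighted = full_weighting(fine_values)
fine_again = prolong_linear(coarse_injected)
```

For a linear profile such as the one above, injection, full weighting and linear prolongation all reproduce the profile exactly, which makes it a convenient sanity check.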
Parallelizable 3D statistical reconstruction for C-arm tomosynthesis system
NASA Astrophysics Data System (ADS)
Wang, Beilei; Barner, Kenneth; Lee, Denny
2005-04-01
Clinical diagnosis and security detection tasks increasingly require 3D information which is difficult or impossible to obtain from 2D (two dimensional) radiographs. As a 3D (three dimensional) radiographic and non-destructive imaging technique, digital tomosynthesis is especially suited to cases where 3D information is required while complete projection data are not available. Nowadays, FBP (filtered back projection) is extensively used in industry for its speed and simplicity. However, it has difficulty with situations where only a limited number of projections from constrained directions are available, or where the SNR (signal-to-noise ratio) of the projections is low. In order to deal with noise and take into account a priori information about the object, a statistical image reconstruction method is described based on the acquisition model of X-ray projections. We formulate a ML (maximum likelihood) function for this model and develop an ordered-subsets iterative algorithm to estimate the unknown attenuation of the object. Simulations show that satisfactory results can be obtained after 1 to 2 iterations, and after that there is no significant improvement of the image quality. An adaptive Wiener filter is also applied to the reconstructed image to remove its noise. Some approximations to speed up the reconstruction computation are also considered. Applying this method to computer generated projections of a revised Shepp phantom and true projections from diagnostic radiographs of a patient's hand and mammography images yields reconstructions with impressive quality. Parallel programming is also implemented and tested. The quality of the reconstructed object is conserved, while the computation time is reduced by a factor almost equal to the number of threads used.
Time-domain Raman analytical forward solvers.
Martelli, Fabrizio; Binzoni, Tiziano; Sekar, Sanathana Konugolu Venkata; Farina, Andrea; Cavalieri, Stefano; Pifferi, Antonio
2016-09-01
A set of time-domain analytical forward solvers for Raman signals detected from homogeneous diffusive media is presented. The time-domain solvers have been developed for two geometries: the parallelepiped and the finite cylinder. The potential presence of a background fluorescence emission, contaminating the Raman signal, has also been taken into account. All the solvers have been obtained as solutions of the time dependent diffusion equation. The validation of the solvers has been performed by means of comparisons with the results of "gold standard" Monte Carlo simulations. These forward solvers provide an accurate tool to explore the information content encoded in the time-resolved Raman measurements. PMID:27607645
On unstructured grids and solvers
NASA Technical Reports Server (NTRS)
Barth, T. J.
1990-01-01
The fundamentals and the state-of-the-art technology for unstructured grids and solvers are highlighted. Algorithms and techniques pertinent to mesh generation are discussed. It is shown that grid generation and grid manipulation schemes rely on fast multidimensional searching. Flow solution techniques for the Euler equations, which can be derived from the integral form of the equations, are discussed. Sample calculations are also provided.
Parallelized solvers for heat conduction formulations
NASA Technical Reports Server (NTRS)
Padovan, Joe; Kwang, Abel
1991-01-01
Based on multilevel partitioning, this paper develops a structural parallelizable solution methodology that enables a significant reduction in computational effort and memory requirements for very large scale linear and nonlinear steady and transient thermal (heat conduction) models. Due to the generality of the formulation of the scheme, both finite element and finite difference simulations can be treated. Diverse model topologies can thus be handled, including both simply and multiply connected (branched/perforated) geometries. To verify the methodology, analytical and numerical benchmark trends are verified in both sequential and parallel computer environments.
Benchmarking ICRF Full-wave Solvers for ITER
R. V. Budny, L. Berry, R. Bilato, P. Bonoli, M. Brambilla, R. J. Dumont, A. Fukuyama, R. Harvey, E. F. Jaeger, K. Indireshkumar, E. Lerche, D. McCune, C. K. Phillips, V. Vdovin, J. Wright, and members of the ITPA-IOS
2011-01-06
Benchmarking of full-wave solvers for ICRF simulations is performed using plasma profiles and equilibria obtained from integrated self-consistent modeling predictions of four ITER plasmas. One is for a high performance baseline (5.3 T, 15 MA) DT H-mode. The others are for half-field, half-current plasmas of interest for the pre-activation phase with bulk plasma ion species being either hydrogen or He4. The predicted profiles are used by six full-wave solver groups to simulate the ICRF electromagnetic fields and heating, and by three of these groups to simulate the current-drive. Approximate agreement is achieved for the predicted heating power for the DT and He4 cases. Factor-of-two disagreements are found for the cases with second harmonic He3 heating in bulk H cases. Approximate agreement is achieved simulating the ICRF current drive.
Finite Element Interface to Linear Solvers
Williams, Alan
2005-03-18
Sparse systems of linear equations arise in many engineering applications, including finite elements, finite volumes, and others. The solution of linear systems is often the most computationally intensive portion of the application. Depending on the complexity of problems addressed by the application, there may be no single solver capable of solving all of the linear systems that arise. This motivates the desire to switch an application from one solver library to another, depending on the problem being solved. The interfaces provided by solver libraries differ greatly, making it difficult to switch an application code from one library to another. The amount of library-specific code in an application can be greatly reduced by having an abstraction layer between solver libraries and the application, putting a common "face" on various solver libraries. One such abstraction layer is the Finite Element Interface to Linear Solvers (FEI), which has seen significant use by finite element applications at Sandia National Laboratories and Lawrence Livermore National Laboratory.
Analysis Tools for CFD Multigrid Solvers
NASA Technical Reports Server (NTRS)
Mineck, Raymond E.; Thomas, James L.; Diskin, Boris
2004-01-01
Analysis tools are needed to guide the development and evaluate the performance of multigrid solvers for the fluid flow equations. Classical analysis tools, such as local mode analysis, often fail to accurately predict performance. Two-grid analysis tools, herein referred to as Idealized Coarse Grid and Idealized Relaxation iterations, have been developed and evaluated within a pilot multigrid solver. These new tools are applicable to general systems of equations and/or discretizations and point to problem areas within an existing multigrid solver. Idealized Relaxation and Idealized Coarse Grid are applied in developing textbook-efficient multigrid solvers for incompressible stagnation flow problems.
The impact of improved sparse linear solvers on industrial engineering applications
Heroux, M.; Baddourah, M.; Poole, E.L.; Yang, Chao Wu
1996-12-31
There are usually many factors that ultimately determine the quality of computer simulation for engineering applications. Some of the most important are the quality of the analytical model and approximation scheme, the accuracy of the input data and the capability of the computing resources. However, in many engineering applications the characteristics of the sparse linear solver are the key factors in determining how complex a problem a given application code can solve. Therefore, the advent of a dramatically improved solver often brings with it dramatic improvements in our ability to do accurate and cost effective computer simulations. In this presentation we discuss the current status of sparse iterative and direct solvers in several key industrial CFD and structures codes, and show the impact that recent advances in linear solvers have made on both our ability to perform challenging simulations and the cost of those simulations. We also present some of the current challenges we have and the constraints we face in trying to improve these solvers. Finally, we discuss future requirements for sparse linear solvers on high performance architectures and try to indicate the opportunities that exist if we can develop even more improvements in linear solver capabilities.
Fast linear solvers for variable density turbulent flows
NASA Astrophysics Data System (ADS)
Pouransari, Hadi; Mani, Ali; Darve, Eric
2015-11-01
Variable density flows are ubiquitous in a variety of natural and industrial systems. Two-phase and multi-phase flows in natural and industrial processes, astrophysical flows, and flows involved in combustion processes are such examples. For an ideal gas subject to low-Mach approximation, variations in temperature can lead to a non-uniform density field. In this work, we consider radiatively heated particle-laden turbulent flows as an example application in which density variability results from inhomogeneities in the heat absorption by an inhomogeneous particle field. Under such conditions, the divergence constraint of the fluid is enforced through a variable coefficient Poisson equation. Inversion of the discretized variable coefficient Poisson operator is difficult using the conventional linear solvers as the size of the problem grows. We apply a novel hierarchical linear solver algorithm based on low-rank approximations. The proposed linear solver could be applied to a variety of linear systems arising from discretized partial differential equations. It can be used as a standalone direct solver with tunable accuracy and linear complexity, or as a high-accuracy preconditioner in conjunction with other iterative methods.
NITSOL: A Newton iterative solver for nonlinear systems
Pernice, M.; Walker, H.F.
1996-12-31
Newton iterative methods, also known as truncated Newton methods, are implementations of Newton's method in which the linear systems that characterize Newton steps are solved approximately using iterative linear algebra methods. Here, we outline a well-developed Newton iterative algorithm together with a Fortran implementation called NITSOL. The basic algorithm is an inexact Newton method globalized by backtracking, in which each initial trial step is determined by applying an iterative linear solver until an inexact Newton criterion is satisfied. In the implementation, the user can specify inexact Newton criteria in several ways and select an iterative linear solver from among several popular "transpose-free" Krylov subspace methods. Jacobian-vector products used by the Krylov solver can be either evaluated analytically with a user-supplied routine or approximated using finite differences of function values. A flexible interface permits a wide variety of preconditioning strategies and allows the user to define a preconditioner and optionally update it periodically. We give details of these and other features and demonstrate the performance of the implementation on a representative set of test problems.
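The finite-difference option for Jacobian-vector products mentioned above can be sketched in a few lines. This is a generic forward-difference formula, J(x) v ≈ (F(x + h v) − F(x)) / h, and not NITSOL's Fortran interface; the function name, the step-size scaling and the small test system are assumptions chosen for illustration.

```python
import math


def fd_jacobian_vector(F, x, v, eps=1e-7):
    """Approximate the product J(x) v using one extra evaluation of F.

    The step h = eps / ||v|| is a simple, hypothetical scaling choice;
    production codes typically also account for the size of x.
    """
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(x)
    h = eps / norm_v
    shifted = [xi + h * vi for xi, vi in zip(x, v)]
    f_base = F(x)
    f_shifted = F(shifted)
    return [(fs - fb) / h for fs, fb in zip(f_shifted, f_base)]


def example_system(x):
    """Small nonlinear test system (hypothetical) with a known Jacobian."""
    return [x[0] ** 2 + x[1] - 3.0, x[0] + x[1] ** 2 - 5.0]


# At x = (1, 2) the exact Jacobian is [[2, 1], [1, 4]], so J v = (3, 5) for v = (1, 1).
jv_approx = fd_jacobian_vector(example_system, [1.0, 2.0], [1.0, 1.0])
```

A Krylov solver only ever needs such products, which is why the Jacobian matrix itself never has to be formed when this option is used.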
Elliptic Solvers with Adaptive Mesh Refinement on Complex Geometries
Phillip, B.
2000-07-24
Adaptive Mesh Refinement (AMR) is a numerical technique for locally tailoring the resolution of computational grids. Multilevel algorithms for solving elliptic problems on adaptive grids include the Fast Adaptive Composite grid method (FAC) and its parallel variants (AFAC and AFACx). Theory that confirms the independence of the convergence rates of FAC and AFAC on the number of refinement levels exists under certain ellipticity and approximation property conditions. Similar theory needs to be developed for AFACx. The effectiveness of multigrid-based elliptic solvers such as FAC, AFAC, and AFACx on adaptively refined overlapping grids is not clearly understood. Finally, a non-trivial eye model problem will be solved by combining the power of using overlapping grids for complex moving geometries, AMR, and multilevel elliptic solvers.
A spectral Poisson solver for kinetic plasma simulation
NASA Astrophysics Data System (ADS)
Szeremley, Daniel; Obberath, Jens; Brinkmann, Ralf
2011-10-01
Plasma resonance spectroscopy is a well established plasma diagnostic method, realized in several designs. One of these designs is the multipole resonance probe (MRP). In its idealized - geometrically simplified - version it consists of two dielectrically shielded, hemispherical electrodes to which an RF signal is applied. A numerical tool is under development which is capable of simulating the dynamics of the plasma surrounding the MRP in electrostatic approximation. In this contribution we concentrate on the specialized Poisson solver for that tool. The plasma is represented by an ensemble of point charges. By expanding both the charge density and the potential into spherical harmonics, a largely analytical solution of the Poisson problem can be employed. For a practical implementation, the expansion must be appropriately truncated. With this spectral solver we are able to efficiently solve the Poisson equation in a kinetic plasma simulation without the need of introducing a spatial discretization.
KLU2 Direct Linear Solver Package
2012-01-04
KLU2 is a direct sparse solver for solving unsymmetric linear systems. It is related to the existing KLU solver (in the Amesos package and also as a stand-alone package from the University of Florida) but provides template support for scalar and ordinal types. It uses a left-looking LU factorization method.
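To show what "left-looking" means, here is a minimal dense sketch of the column ordering. It is not KLU2's algorithm as implemented: it assumes a small dense matrix, performs no pivoting and no sparsity handling, and the function name is hypothetical. Column j of the factors is computed by first applying all previously finished columns of L to column j of A.

```python
def left_looking_lu(A):
    """Return (L, U) with A = L U, computed one column at a time.

    Assumes every pivot is nonzero (no pivoting is done in this sketch).
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Start from column j of A and "look left" at columns 0..j-1 of L.
        x = [A[i][j] for i in range(n)]
        for k in range(j):
            for i in range(k + 1, n):
                x[i] -= L[i][k] * x[k]
        # The top part of x is column j of U.
        for i in range(j + 1):
            U[i][j] = x[i]
        # The remainder, scaled by the pivot, is column j of L.
        for i in range(j + 1, n):
            L[i][j] = x[i] / x[j]
    return L, U


matrix = [[4.0, 3.0, 2.0], [6.0, 3.0, 1.0], [2.0, 5.0, 7.0]]
lower, upper = left_looking_lu(matrix)
product = [[sum(lower[i][k] * upper[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
```

The design point is that each column is written once and only earlier columns are read, in contrast to right-looking variants that update the whole trailing submatrix after every pivot.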
Improving Resource-Unaware SAT Solvers
NASA Astrophysics Data System (ADS)
Hölldobler, Steffen; Manthey, Norbert; Saptawijaya, Ari
The paper discusses cache utilization in state-of-the-art SAT solvers. The aim of the study is to show how a resource-unaware SAT solver can be improved by utilizing the cache sensibly. The analysis is performed on a CDCL-based SAT solver using a subset of the industrial SAT Competition 2009 benchmark. For the analysis, the total cycles, the resource stall cycles, the L2 cache hits and the L2 cache misses are traced using sample based profiling. Based on the analysis, several techniques - some of which have not been used in SAT solvers so far - are proposed resulting in a combined speedup up to 83% without affecting the search path of the solver. The average speedup on the benchmark is 60%. The new techniques are also applied to MiniSAT2.0 improving its runtime by 20% on average.
Belos Block Linear Solvers Package
2004-03-01
Belos is an extensible and interoperable framework for large-scale, iterative methods for solving systems of linear equations with multiple right-hand sides. The motivation for this framework is to provide a generic interface to a collection of algorithms for solving large-scale linear systems. Belos is interoperable because both the matrix and vectors are considered to be opaque objects--only knowledge of the matrix and vectors via elementary operations is necessary. An implementation of Belos is accomplished via the use of interfaces. One of the goals of Belos is to allow the user flexibility in specifying the data representation for the matrix and vectors and so leverage any existing software investment. The algorithms that will be included in the package are Krylov-based linear solvers, like Block GMRES (Generalized Minimal RESidual) and Block CG (Conjugate-Gradient).
A robust multilevel simultaneous eigenvalue solver
NASA Technical Reports Server (NTRS)
Costiner, Sorin; Taasan, Shlomo
1993-01-01
Multilevel (ML) algorithms for eigenvalue problems are often faced with several types of difficulties such as: the mixing of approximated eigenvectors by the solution process, the approximation of incomplete clusters of eigenvectors, the poor representation of solution on coarse levels, and the existence of close or equal eigenvalues. Algorithms that do not treat these difficulties appropriately usually fail, or their performance degrades when facing them. These issues motivated the development of a robust adaptive ML algorithm which treats these difficulties, for the calculation of a few eigenvectors and their corresponding eigenvalues. The main techniques used in the new algorithm include: the adaptive completion and separation of the relevant clusters on different levels, the simultaneous treatment of solutions within each cluster, and the robustness tests which monitor the algorithm's efficiency and convergence. The eigenvectors' separation efficiency is based on a new ML projection technique generalizing the Rayleigh-Ritz projection, combined with a technique called backrotations. These separation techniques, when combined with an FMG formulation, in many cases lead to algorithms of O(qN) complexity, for q eigenvectors of size N on the finest level. Previously developed ML algorithms are less focused on the mentioned difficulties. Moreover, algorithms which employ fine level separation techniques are of O(q^2 N) complexity and usually do not overcome all these difficulties. Computational examples are presented where Schrödinger type eigenvalue problems in 2-D and 3-D, having equal and closely clustered eigenvalues, are solved with the efficiency of the Poisson multigrid solver. A second order approximation is obtained in O(qN) work, where the total computational work is equivalent to only a few fine level relaxations per eigenvector.
Approximating the Generalized Voronoi Diagram of Closely Spaced Objects
Edwards, John; Daniel, Eric; Pascucci, Valerio; Bajaj, Chandrajit
2016-01-01
We present an algorithm to compute an approximation of the generalized Voronoi diagram (GVD) on arbitrary collections of 2D or 3D geometric objects. In particular, we focus on datasets with closely spaced objects; GVD approximation is expensive and sometimes intractable on these datasets using previous algorithms. With our approach, the GVD can be computed using commodity hardware even on datasets with many, extremely tightly packed objects. Our approach is to subdivide the space with an octree that is represented with an adjacency structure. We then use a novel adaptive distance transform to compute the distance function on octree vertices. The computed distance field is sampled more densely in areas of close object spacing, enabling robust and parallelizable GVD surface generation. We demonstrate our method on a variety of data and show example applications of the GVD in 2D and 3D. PMID:27540272
Approximating the Generalized Voronoi Diagram of Closely Spaced Objects
Edwards, John; Daniel, Eric; Pascucci, Valerio; Bajaj, Chandrajit
2015-06-22
We present an algorithm to compute an approximation of the generalized Voronoi diagram (GVD) on arbitrary collections of 2D or 3D geometric objects. In particular, we focus on datasets with closely spaced objects; GVD approximation is expensive and sometimes intractable on these datasets using previous algorithms. With our approach, the GVD can be computed using commodity hardware even on datasets with many, extremely tightly packed objects. Our approach is to subdivide the space with an octree that is represented with an adjacency structure. We then use a novel adaptive distance transform to compute the distance function on octree vertices. The computed distance field is sampled more densely in areas of close object spacing, enabling robust and parallelizable GVD surface generation. We demonstrate our method on a variety of data and show example applications of the GVD in 2D and 3D.
ALPS - A LINEAR PROGRAM SOLVER
NASA Technical Reports Server (NTRS)
Viterna, L. A.
1994-01-01
Linear programming is a widely-used engineering and management tool. Scheduling, resource allocation, and production planning are all well-known applications of linear programs (LP's). Most LP's are too large to be solved by hand, so over the decades many computer codes for solving LP's have been developed. ALPS, A Linear Program Solver, is a full-featured LP analysis program. ALPS can solve plain linear programs as well as more complicated mixed integer and pure integer programs. ALPS also contains an efficient solution technique for pure binary (0-1 integer) programs. One of the many weaknesses of LP solvers is the lack of interaction with the user. ALPS is a menu-driven program with no special commands or keywords to learn. In addition, ALPS contains a full-screen editor to enter and maintain the LP formulation. These formulations can be written to and read from plain ASCII files for portability. For those less experienced in LP formulation, ALPS contains a problem "parser" which checks the formulation for errors. ALPS creates fully formatted, readable reports that can be sent to a printer or output file. ALPS is written entirely in IBM's APL2/PC product, Version 1.01. The APL2 workspace containing all the ALPS code can be run on any APL2/PC system (AT or 386). On a 32-bit system, this configuration can take advantage of all extended memory. The user can also examine and modify the ALPS code. The APL2 workspace has also been "packed" to be run on any DOS system (without APL2) as a stand-alone "EXE" file, but has limited memory capacity on a 640K system. A numeric coprocessor (80X87) is optional but recommended. The standard distribution medium for ALPS is a 5.25 inch 360K MS-DOS format diskette. IBM, IBM PC and IBM APL2 are registered trademarks of International Business Machines Corporation. MS-DOS is a registered trademark of Microsoft Corporation.
GARDNER, P.R.
2006-04-01
Sudoku, also known as Number Place, is a logic-based placement puzzle. The aim of the puzzle is to enter a numerical digit from 1 through 9 in each cell of a 9 x 9 grid made up of 3 x 3 subgrids (called "regions"), starting with various digits given in some cells (the "givens"). Each row, column, and region must contain only one instance of each numeral. Completing the puzzle requires patience and logical ability. Although first published in a U.S. puzzle magazine in 1979, Sudoku initially caught on in Japan in 1986 and attained international popularity in 2005. Last fall, after noticing Sudoku puzzles in some newspapers and magazines, I attempted a few just to see how hard they were. Of course, the difficulties varied considerably. "Obviously" one could use Trial and Error but all the advice was to "Use Logic". Thinking to flex, and strengthen, those powers, I began to tackle the puzzles systematically. That is, when I discovered a new tactical rule, I would write it down, eventually generating a list of ten or so, with some having overlap. They served pretty well except for the more difficult puzzles, but even then I managed to develop an additional three rules that covered all of them until I hit the Oregonian puzzle shown. With all of my rules, I could not seem to solve that puzzle. Initially putting my failure down to rapid mental fatigue (being unable to hold a sufficient quantity of information in my mind at one time), I decided to write a program to implement my rules and see what I had failed to notice earlier. The solver, too, failed. That is, my rules were insufficient to solve that particular puzzle. I happened across a book written by a fellow who constructs such puzzles and who claimed that, sometimes, the only tactic left was trial and error. With a trial and error routine implemented, my solver successfully completed the Oregonian puzzle, and has successfully solved every puzzle submitted to it since.
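The trial-and-error tactic described above is commonly implemented as recursive backtracking. The sketch below shows only that part and does not reproduce the author's logical rules or program; the function names and the sample puzzle (a widely circulated example, not the Oregonian puzzle) are assumptions. Empty cells are encoded as 0.

```python
def candidates(grid, r, c):
    """Digits not yet used in the row, column, or 3 x 3 region of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid):
    """Fill the grid in place; return True if a complete solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d          # trial
                    if solve(grid):
                        return True
                grid[r][c] = 0              # error: undo and backtrack
                return False
    return True                             # no empty cell left


puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
givens = [row[:] for row in puzzle]
solved = solve(puzzle)
```

In a combined solver, logical rules would typically be applied first to shrink the candidate lists, with backtracking used only when no rule makes further progress.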
SIERRA framework version 4 : solver services.
Williams, Alan B.
2005-02-01
Several SIERRA applications make use of third-party libraries to solve systems of linear and nonlinear equations, and to solve eigenproblems. The classes and interfaces in the SIERRA framework that provide linear system assembly services and access to solver libraries are collectively referred to as solver services. This paper provides an overview of SIERRA's solver services including the design goals that drove the development, and relationships and interactions among the various classes. The process of assembling and manipulating linear systems will be described, as well as access to solution methods and other operations.
NASA Technical Reports Server (NTRS)
Ferencz, Donald C.; Viterna, Larry A.
1991-01-01
ALPS is a computer program which can be used to solve general linear program (optimization) problems. ALPS was designed for those who have minimal linear programming (LP) knowledge and features a menu-driven scheme to guide the user through the process of creating and solving LP formulations. Once created, the problems can be edited and stored in standard DOS ASCII files to provide portability to various word processors or even other linear programming packages. Unlike many math-oriented LP solvers, ALPS contains an LP parser that reads through the LP formulation and reports several types of errors to the user. ALPS provides a large amount of solution data which is often useful in problem solving. In addition to pure linear programs, ALPS can solve for integer, mixed integer, and binary type problems. Pure linear programs are solved with the revised simplex method. Integer or mixed integer programs are solved initially with the revised simplex method, and then completed using the branch-and-bound technique. Binary programs are solved with the method of implicit enumeration. This manual describes how to use ALPS to create, edit, and solve linear programming problems. Instructions for installing ALPS on a PC compatible computer are included in the appendices along with a general introduction to linear programming. A programmer's guide is also included for assistance in modifying and maintaining the program.
Parallelizing alternating direction implicit solver on GPUs
Technology Transfer Automated Retrieval System (TEKTRAN)
We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...
NASA Astrophysics Data System (ADS)
Willemsen, Bram; Malcolm, Alison; Lewis, Winston
2016-03-01
In a set of problems ranging from 4-D seismic to salt boundary estimation, updates to the velocity model often have a highly localized nature. Numerical techniques for these applications such as full-waveform inversion (FWI) require an estimate of the wavefield to compute the model updates. When dealing with localized problems, it is wasteful to compute these updates in the global domain, when we only need them in our region of interest. This paper introduces a local solver that generates forward and adjoint wavefields which are, to machine precision, identical to those generated by a full-domain solver evaluated within the region of interest. This means that the local solver computes all interactions between model updates within the region of interest and the inhomogeneities in the background model outside. Because no approximations are made in the calculation of the forward and adjoint wavefields, the local solver can compute the identical gradient in the region of interest as would be computed by the more expensive full-domain solver. In this paper, the local solver is used to efficiently generate the FWI gradient at the boundary of a salt body. This gradient is then used in a level set method to automatically update the salt boundary.
A parallel PCG solver for MODFLOW.
Dong, Yanhui; Li, Guomin
2009-01-01
In order to simulate large-scale ground water flow problems more efficiently with MODFLOW, the OpenMP programming paradigm was used in this study to parallelize the preconditioned conjugate-gradient (PCG) solver. Incremental parallelization, a significant advantage of OpenMP on shared-memory computers, allowed the solver to be converted to a parallel program smoothly, one block of code at a time. The parallel PCG solver, suitable for both MODFLOW-2000 and MODFLOW-2005, is verified using an 8-processor computer. Both the impact of compilers and different model domain sizes were considered in the numerical experiments. Based on the timing results, the parallel PCG solver typically runs about 1.40 to 5.31 times faster than the serial one. In addition, the simulation results are exactly the same as those of the original PCG solver, because the majority of the serial code was not changed. It is worth noting that this parallelizing approach reduces cost in terms of software maintenance because only a single source PCG solver code needs to be maintained in the MODFLOW source tree. PMID:19563427
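For orientation, the structure of a preconditioned conjugate-gradient iteration can be sketched as follows. This is a serial, dense, pure-Python illustration and not the MODFLOW PCG source: the Jacobi (diagonal) preconditioner is an assumption made to keep the sketch short, and all names are hypothetical. The matrix-vector product, the dot products and the vector updates are the loop-level kernels that a shared-memory approach such as OpenMP would typically distribute across threads.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A with Jacobi-preconditioned CG."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # residual for the zero initial guess
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner (assumption)
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


system_matrix = [[4.0, 1.0], [1.0, 3.0]]
rhs = [1.0, 2.0]
solution = pcg(system_matrix, rhs)
```

Because the parallel kernels compute the same quantities as the serial ones, a loop-level parallelization of this kind leaves the iteration itself unchanged, which is in line with the abstract's remark that results match the original solver.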
Ordinary Differential Equation System Solver
1992-03-05
LSODE is a package of subroutines for the numerical solution of the initial value problem for systems of first order ordinary differential equations. The package is suitable for either stiff or nonstiff systems. For stiff systems the Jacobian matrix may be treated in either full or banded form. LSODE can also be used when the Jacobian can be approximated by a band matrix.
Using SPARK as a Solver for Modelica
Wetter, Michael; Haves, Philip; Moshier, Michael A.; Sowell, Edward F.
2008-06-30
Modelica is an object-oriented acausal modeling language that is well positioned to become a de-facto standard for expressing models of complex physical systems. To simulate a model expressed in Modelica, it needs to be translated into executable code. For generating run-time efficient code, such a translation needs to employ algebraic formula manipulations. As the SPARK solver has been shown to be competitive for generating such code but currently cannot be used with the Modelica language, we report in this paper how SPARK's symbolic and numerical algorithms can be implemented in OpenModelica, an open-source implementation of a Modelica modeling and simulation environment. We also report benchmark results that show that for our air flow network simulation benchmark, the SPARK solver is competitive with Dymola, which is believed to provide the best solver for Modelica.
New iterative solvers for the NAG Libraries
Salvini, S.; Shaw, G.
1996-12-31
The purpose of this paper is to introduce the work which has been carried out at NAG Ltd to update the iterative solvers for sparse systems of linear equations, both symmetric and unsymmetric, in the NAG Fortran 77 Library. Our current plans to extend this work and include it in our other numerical libraries in our range are also briefly mentioned. We have added to the Library the new Chapter F11, entirely dedicated to sparse linear algebra. At Mark 17, the F11 Chapter includes sparse iterative solvers, preconditioners, utilities and black-box routines for sparse symmetric (both positive-definite and indefinite) linear systems. Mark 18 will add solvers, preconditioners, utilities and black-boxes for sparse unsymmetric systems: the development of these has already been completed.
ODE System Solver W. Krylov Iteration & Rootfinding
1991-09-09
LSODKR is a new initial value ODE solver for stiff and nonstiff systems. It is a variant of the LSODPK and LSODE solvers, intended mainly for large stiff systems. The main differences between LSODKR and LSODE are the following: (a) For stiff systems, LSODKR uses a corrector iteration composed of Newton iteration and one of four preconditioned Krylov subspace iteration methods. The user must supply routines for the preconditioning operations. (b) Within the corrector iteration, LSODKR does automatic switching between functional (fixpoint) iteration and modified Newton iteration. (c) LSODKR includes the ability to find roots of given functions of the solution during the integration.
Code Verification of the HIGRAD Computational Fluid Dynamics Solver
Van Buren, Kendra L.; Canfield, Jesse M.; Hemez, Francois M.; Sauer, Jeremy A.
2012-05-04
The purpose of this report is to outline code and solution verification activities applied to HIGRAD, a Computational Fluid Dynamics (CFD) solver of the compressible Navier-Stokes equations developed at the Los Alamos National Laboratory, and used to simulate various phenomena such as the propagation of wildfires and atmospheric hydrodynamics. Code verification efforts, as described in this report, are an important first step to establish the credibility of numerical simulations. They provide evidence that the mathematical formulation is properly implemented without significant mistakes that would adversely impact the application of interest. Highly accurate analytical solutions are derived for four code verification test problems that exercise different aspects of the code. These test problems are referred to as: (i) the quiet start, (ii) the passive advection, (iii) the passive diffusion, and (iv) the piston-like problem. These problems are simulated using HIGRAD with different levels of mesh discretization and the numerical solutions are compared to their analytical counterparts. In addition, the rates of convergence are estimated to verify the numerical performance of the solver. The first three test problems produce numerical approximations as expected. The fourth test problem (piston-like) indicates the extent to which the code is able to simulate a 'mild' discontinuity, which is a condition that would typically be better handled by a Lagrangian formulation. The current investigation concludes that the numerical implementation of the solver performs as expected. The quality of solutions is sufficient to provide credible simulations of fluid flows around wind turbines. The main caveat associated to these findings is the low coverage provided by these four problems, and somewhat limited verification activities. A more comprehensive evaluation of HIGRAD may be beneficial for future studies.
Newton-Raphson preconditioner for Krylov type solvers on GPU devices.
Kushida, Noriyuki
2016-01-01
A new Newton-Raphson method based preconditioner for Krylov type linear equation solvers for GPGPU is developed, and the performance is investigated. Conventional preconditioners improve the convergence of Krylov type solvers, and perform well on CPUs. However, they do not perform well on GPGPUs, because of the complexity of implementing powerful preconditioners. The developed preconditioner is based on the BFGS Hessian matrix approximation technique, which is well known as a robust and fast nonlinear equation solver. Because the Hessian matrix in the BFGS represents the coefficient matrix of a system of linear equations in some sense, the approximated Hessian matrix can be a preconditioner. On the other hand, BFGS is required to store dense matrices and to invert them, which should be avoided on modern computers and supercomputers. To overcome these disadvantages, we therefore introduce a limited memory BFGS, which requires less memory space and less computational effort than the BFGS. In addition, a limited memory BFGS can be implemented with BLAS libraries, which are well optimized for target architectures. There are advantages and disadvantages to the Hessian matrix approximation becoming better as the Krylov solver iteration continues. The preconditioning matrix varies through Krylov solver iterations, and only flexible Krylov solvers can work well with the developed preconditioner. The GCR method, which is a flexible Krylov solver, is employed because of the prevalence of GCR as a Krylov solver with a variable preconditioner. As a result of the performance investigation, the new preconditioner indicates the following benefits: (1) The new preconditioner is robust; i.e., it converges while conventional preconditioners (the diagonal scaling, and the SSOR preconditioners) fail. (2) In the best case scenarios, it is over 10 times faster than conventional preconditioners on a CPU. (3) Because it requires only simple operations, it performs well on a GPGPU. In
Equation solvers for distributed-memory computers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1994-01-01
A large number of scientific and engineering problems require the rapid solution of large systems of simultaneous equations. The performance of parallel computers in this area now dwarfs traditional vector computers by nearly an order of magnitude. This talk describes the major issues involved in parallel equation solvers with particular emphasis on the Intel Paragon, IBM SP-1 and SP-2 processors.
Parallel solvers for reservoir simulation on MIMD computers
Piault, E.; Willien, F.; Roux, F.X.
1995-12-01
We have investigated parallel solvers for reservoir simulation. We compare different solvers and preconditioners using T3D and SP1 parallel computers. We use a block diagonal domain decomposition preconditioner with non-overlapping sub-domains.
Frequency Domain Modelling by a Direct-Iterative Solver: A Space and Wavelet Approach
NASA Astrophysics Data System (ADS)
Hustedt, B.; Operto, S.; Virieux, J.
2002-12-01
Seismic forward modelling of wave propagation phenomena in complex rheologic media using a frequency domain finite-difference (FDFD) technique is of special interest for multisource experiments and waveform inversion schemes, because the complete wavefield solution can be computed in a fast and efficient way. FDFD modelling requires the inversion of an extremely large matrix equation A x = b, by either a direct or an iterative solver. The direct solver computes an effective inverse of A, called LU factorization. The main handicap is the additional computer memory required for storing the matrix fill-in coefficients that are created during the factorization process. Iterative solvers are not limited by memory constraints (additional coefficients), but their convergence depends on a good initial solution, which is difficult to guess beforehand. For both solvers, available computer resources have limited wide-spread FDFD modelling applications to mainly two-dimensional (2D) and rarely three-dimensional (3D) problems. In order to overcome these limits, we propose the combination of a direct solver and an iterative solver, called Direct-Iterative Solver (DIS). The direct solver is used to compute an exact wavefield solution on a coarse discretized grid. We use a multifrontal decomposition technique. The coarse-grid size is determined primarily by the limits of the available computer resources, rather than by the wave simulation problem. We project the exact coarse-grid solution on a fine grid, and use it as an initial solution for an iterative solver, which converges to an acceptable approximation of the desired fine-grid solution. Two different DIS schemes have been implemented and tested for numerical accuracy and computational performance. The first approach, called the Direct-Iterative-Space Solver (DISS), projects the coarse-grid solution on the fine grid by a bilinear interpolation. Though the interpolated solution nicely approximates the desired fine-grid solution, still for
Implicit solvers for unstructured meshes
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Mavriplis, Dimitri J.
1991-01-01
Implicit methods were developed and tested for unstructured mesh computations. The approximate system which arises from the Newton linearization of the nonlinear evolution operator is solved by using the preconditioned GMRES (Generalized Minimum Residual) technique. Three different preconditioners were studied, namely, the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over relaxation (SSOR). The preconditioners were optimized to have good vectorization properties. SSOR and ILU were also studied as iterative schemes. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also studied. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
Implicit solvers for unstructured meshes
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Mavriplis, Dimitri J.
1991-01-01
Implicit methods for unstructured mesh computations are developed and tested. The approximate system which arises from the Newton-linearization of the nonlinear evolution operator is solved by using the preconditioned generalized minimum residual technique. These different preconditioners are investigated: the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over-relaxation (SSOR). The preconditioners have been optimized to have good vectorization properties. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also investigated. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
CASTRO: A NEW COMPRESSIBLE ASTROPHYSICAL SOLVER. III. MULTIGROUP RADIATION HYDRODYNAMICS
Zhang, W.; Almgren, A.; Bell, J.; Howell, L.; Burrows, A.; Dolence, J.
2013-01-15
We present a formulation for multigroup radiation hydrodynamics that is correct to order O(v/c) using the comoving-frame approach and the flux-limited diffusion approximation. We describe a numerical algorithm for solving the system, implemented in the compressible astrophysics code, CASTRO. CASTRO uses a Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. In our multigroup radiation solver, the system is split into three parts: one part that couples the radiation and fluid in a hyperbolic subsystem, another part that advects the radiation in frequency space, and a parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem and the frequency space advection are solved explicitly with high-order Godunov schemes, whereas the parabolic part is solved implicitly with a first-order backward Euler method. Our multigroup radiation solver works for both neutrino and photon radiation.
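To illustrate why a parabolic (diffusion) part is commonly treated implicitly, as in the abstract above, here is a minimal sketch of a first-order backward Euler step for the 1D model equation u_t = D u_xx. It is not CASTRO's radiation solver: the function names (`thomas`, `backward_euler_step`), the constant coefficient, the zero boundary values, and the grid size are all assumptions for illustration. Each step solves the tridiagonal system (1 + 2λ)u_i − λ(u_{i−1} + u_{i+1}) = u_i^old with λ = D Δt/Δx², and it remains stable for values of λ far above the limit of 1/2 that applies to the explicit forward-time, centred-space scheme.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                      # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def backward_euler_step(u, lam):
    """One implicit step with lam = D*dt/dx**2 and u = 0 just outside both ends."""
    n = len(u)
    return thomas([-lam] * n, [1.0 + 2.0 * lam] * n, [-lam] * n, u)


u = [0.0] * 21
u[10] = 1.0                                    # initial spike in the middle of the grid
lam = 50.0                                     # far beyond the explicit limit of 0.5
for _ in range(5):
    u = backward_euler_step(u, lam)
```

The spike spreads out smoothly and symmetrically with no oscillation or growth, which is the behaviour that motivates paying for a linear solve in every time step.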
CASTRO: A New Compressible Astrophysical Solver. III. Multigroup Radiation Hydrodynamics
NASA Astrophysics Data System (ADS)
Zhang, W.; Howell, L.; Almgren, A.; Burrows, A.; Dolence, J.; Bell, J.
2013-01-01
We present a formulation for multigroup radiation hydrodynamics that is correct to order O(v/c) using the comoving-frame approach and the flux-limited diffusion approximation. We describe a numerical algorithm for solving the system, implemented in the compressible astrophysics code, CASTRO. CASTRO uses a Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. In our multigroup radiation solver, the system is split into three parts: one part that couples the radiation and fluid in a hyperbolic subsystem, another part that advects the radiation in frequency space, and a parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem and the frequency space advection are solved explicitly with high-order Godunov schemes, whereas the parabolic part is solved implicitly with a first-order backward Euler method. Our multigroup radiation solver works for both neutrino and photon radiation.
NASA Astrophysics Data System (ADS)
Jia, Jingfei; Kim, Hyun K.; Hielscher, Andreas H.
2015-12-01
It is well known that the radiative transfer equation (RTE) provides more accurate tomographic results than its diffusion approximation (DA). However, RTE-based tomographic reconstruction codes have limited applicability in practice due to their high computational cost. In this article, we propose a new efficient method for solving the RTE forward problem with multiple light sources in an all-at-once manner instead of solving it for each source separately. To this end, we introduce here a novel linear solver called block biconjugate gradient stabilized method (block BiCGStab) that makes full use of the shared information between different right hand sides to accelerate solution convergence. Two parallelized block BiCGStab methods are proposed for additional acceleration when the number of threads is limited. We evaluate the performance of this algorithm with numerical simulation studies involving the Delta-Eddington approximation to the scattering phase function. The results show that the single-threaded block RTE solver proposed here reduces computation time by a factor of 1.5-3 as compared to the traditional sequential solution method, and the parallel block solver by a factor of 1.5 as compared to the traditional parallel sequential method. This block linear solver is, moreover, independent of the discretization schemes and preconditioners used; thus further acceleration and higher accuracy can be expected when it is combined with other existing discretization schemes or preconditioners.
Perturbative forward solver software for small localized fluorophores in tissue
Martelli, F.; Bianco, S. Del; Di Ninni, P.
2011-01-01
In this paper a forward solver software for the time domain and the CW domain based on the Born approximation for simulating the effect of small localized fluorophores embedded in a non-fluorescent biological tissue is proposed. The fluorescence emission is treated with a mathematical model that describes the migration of photons from the source to the fluorophore and of emitted fluorescent photons from the fluorophore to the detector for all those geometries for which Green’s functions are available. Subroutines written in FORTRAN that can be used for calculating the fluorescent signal for the infinite medium and for the slab are provided with a linked file. With these subroutines, quantities such as reflectance, transmittance, and fluence rate can be calculated. PMID:22254165
Perturbative forward solver software for small localized fluorophores in tissue.
Martelli, F; Del Bianco, S; Di Ninni, P
2012-01-01
In this paper a forward solver software for the time domain and the CW domain based on the Born approximation for simulating the effect of small localized fluorophores embedded in a non-fluorescent biological tissue is proposed. The fluorescence emission is treated with a mathematical model that describes the migration of photons from the source to the fluorophore and of emitted fluorescent photons from the fluorophore to the detector for all those geometries for which Green's functions are available. Subroutines written in FORTRAN that can be used for calculating the fluorescent signal for the infinite medium and for the slab are provided with a linked file. With these subroutines, quantities such as reflectance, transmittance, and fluence rate can be calculated. PMID:22254165
Aleph Field Solver Challenge Problem Results Summary.
Hooper, Russell; Moore, Stan Gerald
2015-01-01
Aleph models continuum electrostatic and steady and transient thermal fields using a finite-element method. Much work has gone into expanding the core solver capability to support enriched modeling consisting of multiple interacting fields, special boundary conditions and two-way interfacial coupling with particles modeled using Aleph's complementary particle-in-cell capability. This report provides quantitative evidence for correct implementation of Aleph's field solver via order-of-convergence assessments on a collection of problems of increasing complexity. It is intended to provide Aleph with a pedigree and to establish a basis for confidence in results for more challenging problems important to Sandia's mission that Aleph was specifically designed to address.
Domain decomposition for the SPN solver MINOS
Jamelot, Erell; Baudron, Anne-Marie; Lautard, Jean-Jacques
2012-07-01
In this article we present a domain decomposition method for the mixed SPN equations, discretized with Raviart-Thomas-Nedelec finite elements. This domain decomposition is based on the iterative Schwarz algorithm with Robin interface conditions to handle communications. After having described this method, we give details on how to optimize the convergence. Finally, we give some numerical results computed in a realistic 3D domain. The computations are done with the MINOS solver of the APOLLO3 (R) code. (authors)
A perspective on unstructured grid flow solvers
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.
1995-01-01
This survey paper assesses the status of compressible Euler and Navier-Stokes solvers on unstructured grids. Different spatial and temporal discretization options for steady and unsteady flows are discussed. The integration of these components into an overall framework to solve practical problems is addressed. Issues such as grid adaptation, higher order methods, hybrid discretizations and parallel computing are briefly discussed. Finally, some outstanding issues and future research directions are presented.
User documentation for PVODE, an ODE solver for parallel computers
Hindmarsh, A.C., LLNL
1998-05-01
PVODE is a general purpose ordinary differential equation (ODE) solver for stiff and nonstiff ODEs. It is based on CVODE [5] [6], which is written in ANSI-standard C. PVODE uses MPI (Message-Passing Interface) [8] and a revised version of the vector module in CVODE to achieve parallelism and portability. PVODE is intended for the SPMD (Single Program Multiple Data) environment with distributed memory, in which all vectors are identically distributed across processors. In particular, the vector module is designed to help the user assign a contiguous segment of a given vector to each of the processors for parallel computation. The idea is for each processor to solve a certain fixed subset of the ODEs. To better understand PVODE, we first need to understand CVODE and its historical background. The ODE solver CVODE, which was written by Cohen and Hindmarsh, combines features of two earlier Fortran codes, VODE [1] and VODPK [3]. Those two codes were written by Brown, Byrne, and Hindmarsh. Both use variable-coefficient multi-step integration methods, and address both stiff and nonstiff systems. (Stiffness is defined as the presence of one or more very small damping time constants.) VODE uses direct linear algebraic techniques to solve the underlying banded or dense linear systems of equations in conjunction with a modified Newton method in the stiff ODE case. On the other hand, VODPK uses a preconditioned Krylov iterative method [2] to solve the underlying linear system. User-supplied preconditioners directly address the dominant source of stiffness. Consequently, CVODE implements both the direct and iterative methods. Currently, with regard to the nonlinear and linear system solution, PVODE has three method options available: functional iteration, Newton iteration with a diagonal approximate Jacobian, and Newton iteration with the iterative method SPGMR (Scaled Preconditioned Generalized Minimal Residual method). Both CVODE and PVODE are written in such a way that other linear
Galerkin CFD solvers for use in a multi-disciplinary suite for modeling advanced flight vehicles
NASA Astrophysics Data System (ADS)
Moffitt, Nicholas J.
This work extends existing Galerkin CFD solvers for use in a multi-disciplinary suite. The suite is proposed as a means of modeling advanced flight vehicles, which exhibit strong coupling between aerodynamics, structural dynamics, controls, rigid body motion, propulsion, and heat transfer. Such applications include aeroelastics, aeroacoustics, stability and control, and other highly coupled applications. The suite uses NASA STARS for modeling structural dynamics and heat transfer. Aerodynamics, propulsion, and rigid body dynamics are modeled in one of the five CFD solvers below. Euler2D and Euler3D are Galerkin CFD solvers created at OSU by Cowan (2003). These solvers are capable of modeling compressible inviscid aerodynamics with modal elastics and rigid body motion. This work reorganized these solvers to improve efficiency during editing and at run time. Simple and efficient propulsion models were added, including rocket, turbojet, and scramjet engines. Viscous terms were added to the previous solvers to create NS2D and NS3D. The viscous contributions were demonstrated in the inertial and non-inertial frames. Variable viscosity (Sutherland's equation) and heat transfer boundary conditions were added to both solvers but not verified in this work. Two turbulence models were implemented in NS2D and NS3D: the Spalart-Allmaras (SA) model of Deck, et al. (2002) and Menter's SST model (1994). A rotation correction term (Shur, et al., 2000) was added to the production of turbulence. Local time stepping and artificial dissipation were adapted to each model. CFDsol is a Taylor-Galerkin solver with an SA turbulence model. This work improved the time accuracy, far field stability, viscous terms, Sutherland's equation, and SA model with NS3D as a guideline and added the propulsion models from Euler3D to CFDsol. Simple geometries were demonstrated to utilize current meshing and processing capabilities. Air-breathing hypersonic flight vehicles (AHFVs) represent the ultimate
Guerin, P.; Baudron, A. M.; Lautard, J. J.
2006-07-01
This paper describes a new technique for determining the pin power in heterogeneous core calculations. It is based on a domain decomposition with overlapping sub-domains and a component mode synthesis technique for the global flux determination. Local basis functions are used to span a discrete space that allows fundamental global mode approximation through a Galerkin technique. Two approaches are given to obtain these local basis functions: in the first one (Component Mode Synthesis method), the first few spatial eigenfunctions are computed on each sub-domain, using periodic boundary conditions. In the second one (Factorized Component Mode Synthesis method), only the fundamental mode is computed, and we use a factorization principle for the flux in order to replace the higher order Eigenmodes. These different local spatial functions are extended to the global domain by defining them as zero outside the sub-domain. These methods are well-fitted for heterogeneous core calculations because the spatial interface modes are taken into account in the domain decomposition. Although these methods could be applied to higher order angular approximations - particularly easily to a SPN approximation - the numerical results we provide are obtained using a diffusion model. We show the methods' accuracy for reactor cores loaded with UOX and MOX assemblies, for which standard reconstruction techniques are known to perform poorly. Furthermore, we show that our methods are highly and easily parallelizable. (authors)
Domain decomposed preconditioners with Krylov subspace methods as subdomain solvers
Pernice, M.
1994-12-31
Domain decomposed preconditioners for nonsymmetric partial differential equations typically require the solution of problems on the subdomains. Most implementations employ exact solvers to obtain these solutions. Consequently work and storage requirements for the subdomain problems grow rapidly with the size of the subdomain problems. Subdomain solves constitute the single largest computational cost of a domain decomposed preconditioner, and improving the efficiency of this phase of the computation will have a significant impact on the performance of the overall method. The small local memory available on the nodes of most message-passing multicomputers motivates consideration of the use of an iterative method for solving subdomain problems. For large-scale systems of equations that are derived from three-dimensional problems, memory considerations alone may dictate the need for using iterative methods for the subdomain problems. In addition to reduced storage requirements, use of an iterative solver on the subdomains allows flexibility in specifying the accuracy of the subdomain solutions. Substantial savings in solution time is possible if the quality of the domain decomposed preconditioner is not degraded too much by relaxing the accuracy of the subdomain solutions. While some work in this direction has been conducted for symmetric problems, similar studies for nonsymmetric problems appear not to have been pursued. This work represents a first step in this direction, and explores the effectiveness of performing subdomain solves using several transpose-free Krylov subspace methods, GMRES, transpose-free QMR, CGS, and a smoothed version of CGS. Depending on the difficulty of the subdomain problem and the convergence tolerance used, a reduction in solution time is possible in addition to the reduced memory requirements. The domain decomposed preconditioner is a Schur complement method in which the interface operators are approximated using interface probing.
High Energy Boundary Conditions for a Cartesian Mesh Euler Solver
NASA Technical Reports Server (NTRS)
Pandya, Shishir; Murman, Scott; Aftosmis, Michael
2003-01-01
Inlets and exhaust nozzles are commonplace in the world of flight. Yet, many aerodynamic simulation packages do not provide a method of modelling such high energy boundaries in the flow field. For the purposes of aerodynamic simulation, inlets and exhausts are often faired over and it is assumed that the flow differences resulting from this assumption are minimal. While this is an adequate assumption for the prediction of lift, the lack of a plume behind the aircraft creates an evacuated base region, thus affecting both drag and pitching moment values. In addition, the flow in the base region is often mis-predicted, resulting in incorrect base drag. In order to accurately predict these quantities, a method for specifying inlet and exhaust conditions needs to be available in aerodynamic simulation packages. A method for a first approximation of a plume without accounting for chemical reactions is added to the Cartesian mesh based aerodynamic simulation package CART3D. The method consists of 3 steps. In the first step, a components approach where each triangle is assigned a component number is used. Here, a method for marking the inlet or exhaust plane triangles as separate components is discussed. In step two, the flow solver is modified to accept a reference state for the components marked inlet or exhaust. In the third step, the flow solver uses these separated components and the reference state to compute the correct flow condition at that triangle. The present method is implemented in the CART3D package, which consists of a set of tools for generating a Cartesian volume mesh from a set of component triangulations. The Euler equations are solved on the resulting unstructured Cartesian mesh. The usefulness of the method is demonstrated with two validation cases. A generic missile body is also presented to show the usefulness of the method on a real world geometry.
Updates to the NEQAIR Radiation Solver
NASA Technical Reports Server (NTRS)
Cruden, Brett A.; Brandis, Aaron M.
2014-01-01
The NEQAIR code is one of the original heritage solvers for radiative heating prediction in aerothermal environments, and is still used today for mission design purposes. This paper discusses the implementation of the first major revision to the NEQAIR code in the last five years, NEQAIR v14.0. The most notable features of NEQAIR v14.0 are the parallelization of the radiation computation, reducing runtimes by about 30×, and the inclusion of mid-wave CO2 infrared radiation.
DPS--a computerised diagnostic problem solver.
Bartos, P; Gyárfas, F; Popper, M
1982-01-01
The paper contains a short description of the DPS system, a computerized diagnostic problem solver. The system is under development at the Research Institute of Medical Bionics in Bratislava, Czechoslovakia. Its underlying philosophy stems from viewing the diagnostic process as a process of cognitive problem solving. The implementation of the system is based on the methods of Artificial Intelligence; the utilisation of production systems and frame theory should be noted in this context. Finally, a list of program modules and their characterisation is presented.
Input-output-controlled nonlinear equation solvers
NASA Technical Reports Server (NTRS)
Padovan, Joseph
1988-01-01
To upgrade the efficiency and stability of the successive substitution (SS) and Newton-Raphson (NR) schemes, the concept of input-output-controlled solvers (IOCS) is introduced. By employing the formal properties of the constrained version of the SS and NR schemes, the IOCS algorithm can handle indefiniteness of the system Jacobian, can maintain iterate monotonicity, and provide for separate control of load incrementation and iterate excursions, as well as having other features. To illustrate the algorithmic properties, the results for several benchmark examples are presented. These define the associated numerical efficiency and stability of the IOCS.
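For orientation, the two baseline schemes named in this abstract can be sketched on a scalar equation. This is a minimal illustration of plain (unconstrained) successive substitution and Newton-Raphson only; it does not implement the IOCS constraints, and the test equation x = cos(x), the function names, and the tolerances are assumptions chosen for the example.

```python
import math


def successive_substitution(g, x0, tol=1e-12, max_iter=200):
    """Iterate x_{k+1} = g(x_k) until successive iterates differ by less than tol."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("successive substitution did not converge")


def newton_raphson(f, dfdx, x0, tol=1e-12, max_iter=50):
    """Iterate x_{k+1} = x_k - f(x_k) / f'(x_k) until the step is smaller than tol."""
    x = x0
    for k in range(1, max_iter + 1):
        step = f(x) / dfdx(x)
        x -= step
        if abs(step) < tol:
            return x, k
    raise RuntimeError("Newton-Raphson did not converge")


# Hypothetical test problem: solve x = cos(x), i.e. f(x) = x - cos(x) = 0.
x_ss, n_ss = successive_substitution(math.cos, 1.0)
x_nr, n_nr = newton_raphson(lambda x: x - math.cos(x),
                            lambda x: 1.0 + math.sin(x), 1.0)
print(f"SS: x = {x_ss:.12f} after {n_ss} iterations")
print(f"NR: x = {x_nr:.12f} after {n_nr} iterations")
```

On this problem both schemes reach the same root, with Newton-Raphson needing far fewer iterations; the abstract's concern is the cases where such unconstrained iterations lose stability or monotonicity.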
Using the scalable nonlinear equations solvers package
Gropp, W.D.; McInnes, L.C.; Smith, B.F.
1995-02-01
SNES (Scalable Nonlinear Equations Solvers) is a software package for the numerical solution of large-scale systems of nonlinear equations on both uniprocessors and parallel architectures. SNES also contains a component for the solution of unconstrained minimization problems, called SUMS (Scalable Unconstrained Minimization Solvers). Newton-like methods, which are known for their efficiency and robustness, constitute the core of the package. As part of the multilevel PETSc library, SNES incorporates many features and options from other parts of PETSc. In keeping with the spirit of the PETSc library, the nonlinear solution routines are data-structure-neutral, making them flexible and easily extensible. This user's guide contains a detailed description of uniprocessor usage of SNES, with some added comments regarding multiprocessor usage. At this time the parallel version is undergoing refinement and extension, as we work toward a common interface for the uniprocessor and parallel cases. Thus, forthcoming versions of the software will contain additional features, and changes to the parallel interface may result at any time. The new parallel version will employ the MPI (Message Passing Interface) standard for interprocessor communication. Since most of these details will be hidden, users will need to perform only minimal message-passing programming.
On code verification of RANS solvers
NASA Astrophysics Data System (ADS)
Eça, L.; Klaij, C. M.; Vaz, G.; Hoekstra, M.; Pereira, F. S.
2016-04-01
This article discusses Code Verification of Reynolds-Averaged Navier Stokes (RANS) solvers that rely on face based finite volume discretizations for volumes of arbitrary shape. The study includes test cases with known analytical solutions (generated with the method of manufactured solutions) corresponding to laminar and turbulent flow, with the latter using eddy-viscosity turbulence models. The procedure to perform Code Verification based on grid refinement studies is discussed and the requirements for its correct application are illustrated in a simple one-dimensional problem. It is shown that geometrically similar grids are recommended for proper Code Verification and so the data should not have scatter making the use of least square fits unnecessary. Results show that it may be advantageous to determine the extrapolated error to cell size/time step zero instead of assuming that it is zero, especially when it is hard to determine the asymptotic order of grid convergence. In the RANS examples, several of the features of the ReFRESCO solver are checked including the effects of the available turbulence models in the convergence properties of the code. It is shown that it is required to account for non-orthogonality effects in the discretization of the diffusion terms and that the turbulence quantities transport equations can deteriorate the order of grid convergence of mean flow quantities.
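The grid refinement idea behind Code Verification can be illustrated with a small sketch: given errors against a known analytical solution on geometrically similar grids, the observed order of convergence follows from the ratio of successive errors. The example below is an assumption-laden toy (a central difference of sin(x), a refinement ratio of 2, hypothetical function names), not the procedure or test cases used for ReFRESCO.

```python
import math


def central_difference(f, x, h):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def observed_order(e_coarse, e_fine, ratio):
    """Observed order p from errors on two grids whose spacing differs by `ratio`."""
    return math.log(e_coarse / e_fine) / math.log(ratio)


# Known analytical solution: d/dx sin(x) = cos(x), evaluated at x0.
x0 = 1.0
exact = math.cos(x0)

# Three geometrically similar "grids": h, h/2, h/4.
ratio = 2.0
steps = [0.1 / ratio**k for k in range(3)]
errors = [abs(central_difference(math.sin, x0, h) - exact) for h in steps]

# One observed order per pair of consecutive grids.
orders = [observed_order(errors[k], errors[k + 1], ratio) for k in range(2)]
for h, e in zip(steps, errors):
    print(f"h = {h:.4f}  error = {e:.3e}")
print("observed orders:", [round(p, 4) for p in orders])
```

If the observed order approaches the theoretical order of the scheme (2 here) as the grid is refined, the implementation is consistent with its design; a persistent mismatch points to a coding or discretization error.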
Two-Dimensional Ffowcs Williams/Hawkings Equation Solver
NASA Technical Reports Server (NTRS)
Lockard, David P.
2005-01-01
FWH2D is a Fortran 90 computer program that solves a two-dimensional (2D) version of the equation, derived by J. E. Ffowcs Williams and D. L. Hawkings, for sound generated by turbulent flow. FWH2D was developed especially for estimating noise generated by airflows around such approximately 2D airframe components as slats. The user provides input data on fluctuations of pressure, density, and velocity on some surface. These data are combined with information about the geometry of the surface to calculate histories of thickness and loading terms. These histories are fast-Fourier-transformed into the frequency domain. For each frequency of interest and each observer position specified by the user, kernel functions are integrated over the surface by use of the trapezoidal rule to calculate a pressure signal. The resulting frequency-domain signals are inverse-fast-Fourier-transformed back into the time domain. The output of the code consists of the time- and frequency-domain representations of the pressure signals at the observer positions. Because of its approximate nature, FWH2D overpredicts the noise from a finite-length (3D) component. The advantage of FWH2D is that it requires a fraction of the computation time of a 3D Ffowcs Williams/Hawkings solver.
NASA Astrophysics Data System (ADS)
Vincenti, H.; Vay, J.-L.
2016-03-01
Very high order or pseudo-spectral Maxwell solvers are the method of choice to reduce discretization effects (e.g. numerical dispersion) that are inherent to low order Finite-Difference Time-Domain (FDTD) schemes. However, due to their large stencils, these solvers are often subject to truncation errors in many electromagnetic simulations. These truncation errors come from non-physical modifications of Maxwell's equations in space that may generate spurious signals affecting the overall accuracy of the simulation results. Such modifications occur, for instance, when Perfectly Matched Layers (PMLs) are used at simulation domain boundaries to simulate open media. Another example is the use of an arbitrary order Maxwell solver with a domain decomposition technique that may, under some conditions, involve stencil truncations at subdomain boundaries, resulting in small spurious errors that do eventually build up. In each case, a careful evaluation of the characteristics and magnitude of the errors resulting from these approximations, and their impact at any frequency and angle, requires detailed analytical and numerical studies. To this end, we present a general analytical approach that enables the evaluation of numerical errors of fully three-dimensional arbitrary order finite-difference Maxwell solvers, with arbitrary modification of the local stencil in the simulation domain. The analytical model is validated against simulations of the domain decomposition technique and PMLs, when these are used with very high-order Maxwell solvers, as well as in the infinite order limit of pseudo-spectral solvers. Results confirm that the new analytical approach enables exact predictions in each case. They also confirm that the domain decomposition technique can be used with very high-order Maxwell solvers and a reasonably low number of guard cells with negligible effects on the overall accuracy of the simulation.
A quadtree-adaptive multigrid solver for the Serre-Green-Naghdi equations
NASA Astrophysics Data System (ADS)
Popinet, Stéphane
2015-12-01
The Serre-Green-Naghdi (SGN) equations, also known as the fully-nonlinear Boussinesq wave equations, accurately describe the behaviour of dispersive shoaling water waves. This article presents and validates a novel combination of methods for the numerical approximation of solutions to the SGN equations. The approach preserves the robustness of the original finite-volume Saint-Venant solver, in particular for the treatment of wetting/drying and equilibrium states. The linear system of coupled vector equations governing the dispersive SGN momentum sources is solved simply and efficiently using a generic multigrid solver. This approach generalises automatically to adaptive quadtree meshes. Adaptive mesh refinement is shown to provide orders-of-magnitude gains in speed and memory when applied to the dispersive propagation of waves during the Tohoku tsunami. The source code, test cases and examples are freely available.
ShyLU Scalable Hybrid Preconditioner and Solver
2012-09-11
ShyLU is numerical software to solve sparse linear systems of equations. ShyLU uses a hybrid direct-iterative Schur complement method, and may be used either as a preconditioner or as a solver. ShyLU is parallel and optimized for a single compute node. ShyLU will be a package in the Trilinos software framework.
Experiences with linear solvers for oil reservoir simulation problems
Joubert, W.; Janardhan, R.; Biswas, D.; Carey, G.
1996-12-31
This talk will focus on practical experiences with iterative linear solver algorithms used in conjunction with Amoco Production Company's Falcon oil reservoir simulation code. The goal of this study is to determine the best linear solver algorithms for these types of problems. The results of numerical experiments will be presented.
Shape reanalysis and sensitivities utilizing preconditioned iterative boundary solvers
NASA Technical Reports Server (NTRS)
Guru Prasad, K.; Kane, J. H.
1992-01-01
The computational advantages associated with the utilization of preconditioned iterative equation solvers are quantified for the reanalysis of perturbed shapes using continuum structural boundary element analysis (BEA). Both single- and multi-zone three-dimensional problems are examined. Significant reductions in computer time are obtained by making use of previously computed solution vectors and preconditioners in subsequent analyses. The effectiveness of this technique is demonstrated for the computation of shape response sensitivities required in shape optimization. Computer times and accuracies achieved using the preconditioned iterative solvers are compared with those obtained via direct solvers and implicit differentiation of the boundary integral equations. It is concluded that this approach employing preconditioned iterative equation solvers in reanalysis and sensitivity analysis can be competitive with, if not superior to, those involving direct solvers.
A real-time impurity solver for DMFT
NASA Astrophysics Data System (ADS)
Kim, Hyungwon; Aron, Camille; Han, Jong E.; Kotliar, Gabriel
Dynamical mean-field theory (DMFT) offers a non-perturbative approach to problems with strongly correlated electrons. The method heavily relies on the ability to numerically solve an auxiliary Anderson-type impurity problem. While powerful Matsubara-frequency solvers have been developed over the past two decades to tackle equilibrium situations, the status of real-time impurity solvers that could compete with Matsubara-frequency solvers and be readily generalizable to non-equilibrium situations is still premature. We present a real-time solver which is based on a quantum Master equation description of the dissipative dynamics of the impurity and its exact diagonalization. As a benchmark, we illustrate the strengths of our solver in the context of the equilibrium Mott-insulator transition of the one-band Hubbard model and compare it with iterative perturbation theory (IPT) method. Finally, we discuss its direct application to a nonequilibrium situation.
Parallel solver for trajectory optimization search directions
NASA Technical Reports Server (NTRS)
Psiaki, M. L.; Park, K. H.
1992-01-01
A key algorithmic element of a real-time trajectory optimization hardware/software implementation, the search step solver, is presented. This is one piece of an algorithm whose overall goal is to make nonlinear trajectory optimization fast enough to provide real-time commands during guidance of a vehicle such as an aeromaneuvering orbiter or the National Aerospace Plane. Many methods of nonlinear programming require the solution of a quadratic program (QP) at each iteration to determine the search step. In the trajectory optimization case, the QP has a special dynamic programming structure. The algorithm exploits this special structure with a divide-and-conquer type of parallel implementation. The algorithm solves a (p·N)-stage problem on N processors in O(p + log₂ N) operations. The algorithm yields a factor of 8 speed-up over the fastest known serial algorithm when solving a 1024-stage test problem on 32 processors.
Scalable Adaptive Multilevel Solvers for Multiphysics Problems
Xu, Jinchao
2014-12-01
In this project, we investigated adaptive, parallel, and multilevel methods for numerical modeling of various real-world applications, including Magnetohydrodynamics (MHD), complex fluids, Electromagnetism, Navier-Stokes equations, and reservoir simulation. First, we have designed improved mathematical models and numerical discretizations for viscoelastic fluids and MHD. Second, we have derived new a posteriori error estimators and extended the applicability of adaptivity to various problems. Third, we have developed multilevel solvers for solving scalar partial differential equations (PDEs) as well as coupled systems of PDEs, especially on unstructured grids. Moreover, we have integrated the study of adaptive methods and multilevel methods, and made significant efforts and advances in adaptive multilevel methods for multi-physics problems.
Optimising a parallel conjugate gradient solver
Field, M.R.
1996-12-31
This work arises from the introduction of a parallel iterative solver to a large structural analysis finite element code. The code is called FEX and it was developed at Hitachi's Mechanical Engineering Laboratory. The FEX package can deal with a large range of structural analysis problems using a large number of finite element techniques. FEX can solve either stress or thermal analysis problems of a range of different types from plane stress to a full three-dimensional model. These problems can consist of a number of different materials which can be modelled by a range of material models. The structure being modelled can have the load applied at either a point or a surface, or by a pressure, a centrifugal force or just gravity. Alternatively a thermal load can be applied with a given initial temperature. The displacement of the structure can be constrained by having a fixed boundary or by prescribing the displacement at a boundary.
General purpose nonlinear system solver based on Newton-Krylov method.
2013-12-01
KINSOL is part of a software family called SUNDIALS: SUite of Nonlinear and Differential/Algebraic equation Solvers [1]. KINSOL is a general-purpose nonlinear system solver based on Newton-Krylov and fixed-point solver technologies [2].
NASA Astrophysics Data System (ADS)
Liu, Yan; Shen, Weidong; Tian, Baolin; Mao, De-kang
2015-03-01
We develop a new and more general formula for the construction of two dimensional nodal Riemann solver for a cell-centered Lagrangian scheme developed by Maire and his co-workers which allows us to use general one dimensional Riemann solvers that have intermediate velocity and pressure in the construction. The old formula for the scheme used in the papers of Maire et al. is only a special case of our new formula. We present an entropy discussion, which indicates that the schemes with nodal solvers constructed following the old formula, which can only use the 1D Riemann solvers satisfying our strong entropy condition, are usually numerically very dissipative. To develop numerically less dissipative schemes we introduce a so-called weak entropy condition, and present a one dimensional Riemann solver that satisfies the weak entropy condition but not the strong entropy condition. Analysis shows that the scheme using this 1D solver is numerically less dissipative than the schemes using solvers satisfying the strong condition. Finally, several numerical examples are presented to show that our new formula works well and the scheme using the one dimensional solver satisfying the weak entropy condition improves the accuracy in smooth region, resolution around rarefaction waves and two dimensional symmetry; however it sometimes produces small velocity oscillations and mesh distortions.
Comparison of open-source linear programming solvers.
Gearhart, Jared Lee; Adair, Kristin Lynn; Durfee, Justin David; Jones, Katherine A.; Martin, Nathaniel; Detry, Richard Joseph
2013-10-01
When developing linear programming models, issues such as budget limitations, customer requirements, or licensing may preclude the use of commercial linear programming solvers. In such cases, one option is to use an open-source linear programming solver. A survey of linear programming tools was conducted to identify potential open-source solvers. From this survey, four open-source solvers were tested using a collection of linear programming test problems and the results were compared to IBM ILOG CPLEX Optimizer (CPLEX) [1], an industry standard. The solvers considered were: COIN-OR Linear Programming (CLP) [2], [3], GNU Linear Programming Kit (GLPK) [4], lp_solve [5] and Modular In-core Nonlinear Optimization System (MINOS) [6]. As no open-source solver outperforms CPLEX, this study demonstrates the power of commercial linear programming software. CLP was found to be the top performing open-source solver considered in terms of capability and speed. GLPK also performed well but cannot match the speed of CLP or CPLEX. lp_solve and MINOS were considerably slower and encountered issues when solving several test problems.
Robust large-scale parallel nonlinear solvers for simulations.
Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson
2005-11-01
This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write
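The idea that Broyden's method replaces the Jacobian with an approximation can be sketched in a few lines. The example below is a dense, small-scale illustration of the classical "good" Broyden update applied to the inverse approximation via the Sherman-Morrison formula; it is not the limited-memory variant developed in the report, and the helper names, the identity initial guess, and the 2×2 test system are all assumptions made for illustration.

```python
import math


def matvec(A, v):
    """Matrix-vector product A v for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def vecmat(v, A):
    """Row-vector times matrix product v^T A."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def broyden(F, x0, tol=1e-10, max_iter=100):
    """Solve F(x) = 0 without ever evaluating the Jacobian.

    H approximates the inverse Jacobian, starts as the identity (an
    assumption suited to this mildly nonlinear example), and receives a
    rank-one secant update after every step.
    """
    n = len(x0)
    x = list(x0)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    f = F(x)
    for k in range(1, max_iter + 1):
        dx = [-s for s in matvec(H, f)]              # quasi-Newton step
        x_new = [xi + di for xi, di in zip(x, dx)]
        f_new = F(x_new)
        if max(abs(v) for v in f_new) < tol:
            return x_new, k
        df = [a - b for a, b in zip(f_new, f)]
        H_df = matvec(H, df)
        denom = dot(dx, H_df)
        if denom == 0.0:
            raise RuntimeError("Broyden update breakdown")
        u = [a - b for a, b in zip(dx, H_df)]        # dx - H df
        w = vecmat(dx, H)                            # dx^T H
        H = [[H[i][j] + u[i] * w[j] / denom for j in range(n)] for i in range(n)]
        x, f = x_new, f_new
    raise RuntimeError("Broyden's method did not converge")


def residual(x):
    """Hypothetical mildly nonlinear 2x2 test system."""
    return [x[0] + 0.1 * math.sin(x[1]) - 1.0,
            x[1] + 0.1 * math.cos(x[0]) - 2.0]


solution, iterations = broyden(residual, [0.0, 0.0])
print("solution:", solution, "iterations:", iterations)
```

Only residual evaluations are required, which is what makes the approach attractive for codes that cannot supply an accurate Jacobian; a large-scale version would store the rank-one updates instead of a dense matrix.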
Multi-GPU kinetic solvers using MPI and CUDA
NASA Astrophysics Data System (ADS)
Zabelok, Sergey; Arslanbekov, Robert; Kolobov, Vladimir
2014-12-01
This paper describes recent progress towards porting a Unified Flow Solver (UFS) to heterogeneous parallel computing. The main challenge of porting UFS to graphics processing units (GPUs) comes from the dynamically adapted mesh, which causes irregular data access. We describe the implementation of CUDA kernels for three modules in UFS: the direct Boltzmann solver using the discrete velocity method (DVM), the DSMC module, and the Lattice Boltzmann Method (LBM) solver, all using an octree Cartesian mesh with Adaptive Mesh Refinement (AMR). Double-digit speedup on a single GPU and good scaling for multi-GPU have been demonstrated.
A mimetic spectral element solver for the Grad-Shafranov equation
NASA Astrophysics Data System (ADS)
Palha, A.; Koren, B.; Felici, F.
2016-07-01
In this work we present a robust and accurate arbitrary order solver for the fixed-boundary plasma equilibria in toroidally axisymmetric geometries. To achieve this we apply the mimetic spectral element formulation presented in [56] to the solution of the Grad-Shafranov equation. This approach combines a finite volume discretization with the mixed finite element method. In this way the discrete differential operators (∇, ∇×, ∇·) can be represented exactly and metric and all approximation errors are present in the constitutive relations. The result of this formulation is an arbitrary order method even on highly curved meshes. Additionally, the integral of the toroidal current J_ϕ is exactly equal to the boundary integral of the poloidal field over the plasma boundary. This property can play an important role in the coupling between equilibrium and transport solvers. The proposed solver is tested on a varied set of plasma cross sections (smooth and with an X-point) and also for a wide range of pressure and toroidal magnetic flux profiles. Equilibria accurate up to machine precision are obtained. Optimal algebraic convergence rates of order p + 1 and geometric convergence rates are shown for Soloviev solutions (including high Shafranov shifts), field-reversed configuration (FRC) solutions and spheromak analytical solutions. The robustness of the method is demonstrated for non-linear test cases, in particular on an equilibrium solution with a pressure pedestal.
Flood simulation using an open source quadtree grid shallow water flow solver
NASA Astrophysics Data System (ADS)
An, H.; Yu, S.
2012-12-01
We carry out performance testing of Gerris for flood simulation. Gerris Flow Solver is open source software and has the capability of adaptive quadtree grid generation. In particular, the shallow water flow solver within Gerris Flow Solver implements second-order accurate Godunov-type numerical schemes, while preserving the balance of source and flux terms on quadtree cut cell grids. The combination of quadtree grids with the cut cell method improves the flexibility of quadtree grids for grid generation. In addition, the model has the capacity of adaptive meshing in an easy and effective way, which can improve computational efficiency in 2D modeling. Pre- and post-processors are already well equipped for users. Finally, an extension such as bed erosion or sediment transport can be added if needed. Two flood events, the Malpasset dam break in France and the Baeksan levee failure in Korea, are simulated using Gerris, with adaptively refined meshes near water fronts and the river boundary. Simulation results are compared with survey data, experimental data as well as simulation results by other researchers. The simulation results demonstrate that the adaptive quadtree model can save approximately 95% of the computational cost while preserving the accuracy. Gerris is a very attractive alternative for flood managers given the favorable features demonstrated in this paper.
NASA Astrophysics Data System (ADS)
Balsara, Dinshaw S.; Vides, Jeaniffer; Gurski, Katharine; Nkonga, Boniface; Dumbser, Michael; Garain, Sudip; Audit, Edouard
2016-01-01
Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The self-similar formulation of Balsara [16] proves especially useful for this purpose. While that work is based on a Galerkin projection, in this paper we present an analogous self-similar formulation that is based on a different interpretation. In the present formulation, we interpret the shock jumps at the boundary of the strongly-interacting state quite literally. The enforcement of the shock jump conditions is done with a least squares projection (Vides, Nkonga and Audit [67]). With that interpretation, we again show that the multidimensional Riemann solver can be endowed with sub-structure. However, we find that the most efficient implementation arises when we use a flux vector splitting and a least squares projection. An alternative formulation that is based on the full characteristic matrices is also presented. The multidimensional Riemann solvers that are demonstrated here use one-dimensional HLLC Riemann solvers as building blocks. Several stringent test problems drawn from hydrodynamics and MHD are presented to show that the method works. Results from structured and unstructured meshes demonstrate the versatility of our method. The reader is also invited to watch a video introduction to multidimensional Riemann solvers on http://www.nd.edu/~dbalsara/Numerical-PDE-Course.
Generic task problem solvers in Soar
NASA Technical Reports Server (NTRS)
Johnson, Todd R.; Smith, Jack W., Jr.; Chandrasekaran, B.
1989-01-01
Two trends can be discerned in research in problem solving architectures in the last few years. On one hand, interest in task-specific architectures has grown, wherein types of problems of general utility are identified, and special architectures that support the development of problem solving systems for those types of problems are proposed. These architectures help in the acquisition and specification of knowledge by providing inference methods that are appropriate for the type of problem. However, knowledge based systems which use only one type of problem solving method are very brittle, and adding more types of methods requires a principled approach to integrating them in a flexible way. Contrasting with this trend is the proposal for a flexible, general architecture contained in the work on Soar. Soar has features which make it attractive for flexible use of all potentially relevant knowledge or methods. But as a theory, Soar does not make commitments to specific types of problem solvers or provide guidance for their construction. It was investigated how task-specific architectures can be constructed in Soar to retain as many of the advantages as possible of both approaches. Examples were used from the Generic Task approach for building knowledge based systems. Though this approach was developed and applied for a number of problems, the ideas are applicable to other task-specific approaches as well.
Elliptic Solvers for Adaptive Mesh Refinement Grids
Quinlan, D.J.; Dendy, J.E., Jr.; Shapira, Y.
1999-06-03
We are developing multigrid methods that will efficiently solve elliptic problems with anisotropic and discontinuous coefficients on adaptive grids. The final product will be a library that provides for the simplified solution of such problems. This library will directly benefit the efforts of other Laboratory groups. The focus of this work is research on serial and parallel elliptic algorithms and the inclusion of our black-box multigrid techniques into this new setting. The approach applies the Los Alamos object-oriented class libraries that greatly simplify the development of serial and parallel adaptive mesh refinement applications. In the final year of this LDRD, we focused on putting the software together; in particular we completed the final AMR++ library, we wrote tutorials and manuals, and we built example applications. We implemented the Fast Adaptive Composite Grid method as the principal elliptic solver. We presented results at the Overset Grid Conference and other more AMR specific conferences. We worked on optimization of serial and parallel performance and published several papers on the details of this work. Performance remains an important issue and is the subject of continuing research work.
NASA Technical Reports Server (NTRS)
Raju, Manthena S.
1998-01-01
Sprays occur in a wide variety of industrial and power applications and in the processing of materials. A liquid spray is a two-phase flow with a gas as the continuous phase and a liquid as the dispersed phase (in the form of droplets or ligaments). Interactions between the two phases, which are coupled through exchanges of mass, momentum, and energy, can occur in different ways at different times and locations involving various thermal, mass, and fluid dynamic factors. An understanding of the flow, combustion, and thermal properties of a rapidly vaporizing spray requires careful modeling of the rate-controlling processes associated with the spray's turbulent transport, mixing, chemical kinetics, evaporation, and spreading rates, as well as other phenomena. In an attempt to advance the state-of-the-art in multidimensional numerical methods, we at the NASA Lewis Research Center extended our previous work on sprays to unstructured grids and parallel computing. LSPRAY, which was developed by M.S. Raju of Nyma, Inc., is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and/or Monte Carlo probability density function (PDF) solver. The LSPRAY solver accommodates the use of an unstructured mesh with mixed triangular, quadrilateral, and/or tetrahedral elements in the gas-phase solvers. It is used specifically for fuel sprays within gas turbine combustors, but it has many other uses. The spray model used in LSPRAY provided favorable results when applied to stratified-charge rotary combustion (Wankel) engines and several other confined and unconfined spray flames. The source code will be available with the National Combustion Code (NCC) as a complete package.
Performance of NASA Equation Solvers on Computational Mechanics Applications
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1996-01-01
This paper describes the performance of a new family of NASA-developed equation solvers used for large-scale (i.e. 551,705 equations) structural analysis. To minimize computer time and memory, the solvers are divided by application and matrix characteristics (sparse/dense, real/complex, symmetric/nonsymmetric, size: in-core/out of core) and exploit the hardware features of current and future computers. In this paper, the equation solvers, which are written in FORTRAN and are therefore easily transportable, are shown to be faster than specialized computer library routines utilizing assembly code. Twenty NASA structural benchmark models with NASA solver timings reside on the World Wide Web with a challenge to beat them.
NASA Astrophysics Data System (ADS)
Alemi Ardakani, Hamid; Bridges, Thomas J.; Turner, Matthew R.
2016-06-01
A class of augmented approximate Riemann solvers due to George (2008) [12] is extended to solve the shallow-water equations in a moving vessel with variable bottom topography and variable cross-section with wetting and drying. A class of Roe-type upwind solvers for the system of balance laws is derived which respects the steady-state solutions. The numerical solutions of the new adapted augmented f-wave solvers are validated against the Roe-type solvers. The theory is extended to solve the shallow-water flows in moving vessels with arbitrary cross-section with influx-efflux boundary conditions motivated by the shallow-water sloshing in the ocean wave energy converter (WEC) proposed by Offshore Wave Energy Ltd. (OWEL) [1]. A fractional step approach is used to handle the time-dependent forcing functions. The numerical solutions are compared to an extended new Roe-type solver for the system of balance laws with a time-dependent source function. The shallow-water sloshing finite volume solver can be coupled to a Runge-Kutta integrator for the vessel motion.
Multilevel solvers of first-order system least-squares for Stokes equations
Lai, Chen-Yao G.
1996-12-31
Recently, the use of the first-order system least-squares principle for the approximate solution of Stokes problems has been extensively studied by Cai, Manteuffel, and McCormick. In this paper, we study multilevel solvers of the first-order system least-squares method for the generalized Stokes equations based on the velocity-vorticity-pressure formulation in three dimensions. The least-squares functional is defined to be the sum of the L²-norms of the residuals, which is weighted appropriately by the Reynolds number. We develop convergence analysis for additive and multiplicative multilevel methods applied to the resulting discrete equations.
A survey of deterministic solvers for rarefied flows (Invited)
NASA Astrophysics Data System (ADS)
Mieussens, Luc
2014-12-01
Numerical simulations of rarefied gas flows are generally made with DSMC methods. Up to a recent period, deterministic numerical methods based on a discretization of the Boltzmann equation were restricted to simple problems (1D, linearized flows, or simple geometries, for instance). In the last decade, several deterministic solvers have been developed in different teams to tackle more complex problems like 2D and 3D flows. Some of them are based on the full Boltzmann equation. Solving this equation numerically is still very challenging, and 3D solvers are still restricted to monoatomic gases, even if recent works have proved it was possible to simulate simple flows for polyatomic gases. Other solvers are based on simpler BGK like models: they allow for much more intensive simulations on 3D flows for realistic geometries, but treating complex gases requires extended BGK models that are still under development. In this paper, we discuss the main features of these existing solvers, and we focus on their strengths and inefficiencies. We will also review some recent results that show how these solvers can be improved: (1) higher accuracy (higher order finite volume methods, discontinuous Galerkin approaches); (2) lower memory and CPU costs with special velocity discretization (adaptive grids, spectral methods); (3) multi-scale simulations by using hybrid and asymptotic preserving schemes; (4) efficient implementation on high performance computers (parallel computing, hybrid parallelization). Finally, we propose some perspectives to make these solvers more efficient and more popular.
A Comparative Study of Randomized Constraint Solvers for Random-Symbolic Testing
NASA Technical Reports Server (NTRS)
Takaki, Mitsuo; Cavalcanti, Diego; Gheyi, Rohit; Iyoda, Juliano; d'Amorim, Marcelo; Prudencio, Ricardo
2009-01-01
The complexity of constraints is a major obstacle for constraint-based software verification. Automatic constraint solvers are fundamentally incomplete: input constraints often build on some undecidable theory or some theory the solver does not support. This paper proposes and evaluates several randomized solvers to address this issue. We compare the effectiveness of a symbolic solver (CVC3), a random solver, three hybrid solvers (i.e., mix of random and symbolic), and two heuristic search solvers. We evaluate the solvers on two benchmarks: one consisting of manually generated constraints and another generated with a concolic execution of 8 subjects. In addition to fully decidable constraints, the benchmarks include constraints with non-linear integer arithmetic, integer modulo and division, bitwise arithmetic, and floating-point arithmetic. As expected, symbolic solving (in particular, CVC3) subsumes the other solvers for the concolic execution of subjects that only generate decidable constraints. For the remaining subjects, the solvers are complementary.
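To make the idea of a random solver concrete, here is a minimal sketch in Python. It is not the algorithm evaluated in the paper: the function name random_solve, the sampling range, the try budget, and the example constraint are all assumptions chosen for illustration. It simply draws integer assignments uniformly at random and returns the first one that satisfies a constraint mixing non-linear integer arithmetic and integer modulo, two of the constraint kinds the abstract lists.

```python
import random


def random_solve(constraint, variables, lo, hi, tries=100_000, seed=0):
    """Sample integer assignments uniformly until one satisfies `constraint`.

    Returns a dict mapping variable names to values, or None if the
    try budget is exhausted (the solver is incomplete by design).
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    for _ in range(tries):
        candidate = {name: rng.randint(lo, hi) for name in variables}
        if constraint(**candidate):
            return candidate
    return None


def example_constraint(x, y):
    # Hypothetical constraint: non-linear product plus a modulo condition.
    return x * y == 12 and (x + y) % 7 == 0


solution = random_solve(example_constraint, ["x", "y"], lo=-50, hi=50)
print(solution)
```

A hybrid solver in the sense of the abstract would hand the decidable part of a constraint to a symbolic solver and use sampling like this only for the remainder; that combination is not shown here.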
Quantitative analysis of numerical solvers for oscillatory biomolecular system models
Quo, Chang F; Wang, May D
2008-01-01
Background: This article provides guidelines for selecting optimal numerical solvers for biomolecular system models. Because various parameters of the same system could have drastically different ranges from 10^-15 to 10^10, the ODEs can be stiff and ill-conditioned, resulting in non-unique, non-existing, or non-reproducible modeling solutions. Previous studies have not examined in depth how to best select numerical solvers for biomolecular system models, which makes it difficult to experimentally validate the modeling results. To address this problem, we have chosen one of the well-known stiff initial value problems with limit cycle behavior as a test-bed system model. Solving this model, we have illustrated that different answers may result from different numerical solvers. We use MATLAB numerical solvers because they are optimized and widely used by the modeling community. We have also conducted a systematic study of numerical solver performances by using qualitative and quantitative measures such as convergence, accuracy, and computational cost (i.e. in terms of function evaluation, partial derivative, LU decomposition, and "take-off" points). The results show that the modeling solutions can be drastically different using different numerical solvers. Thus, it is important to intelligently select numerical solvers when solving biomolecular system models. Results: The classic Belousov-Zhabotinskii (BZ) reaction is described by the Oregonator model and is used as a case study. We report two guidelines in selecting optimal numerical solver(s) for stiff, complex oscillatory systems: (i) for problems with unknown parameters, ode45 is the optimal choice regardless of the relative error tolerance; (ii) for known stiff problems, both ode113 and ode15s are good choices under strict relative tolerance conditions. Conclusions: For any given biomolecular model, by building a library of numerical solvers with quantitative performance assessment metric, we show that it is possible
Solving Upwind-Biased Discretizations. 2; Multigrid Solver Using Semicoarsening
NASA Technical Reports Server (NTRS)
Diskin, Boris
1999-01-01
This paper studies a novel multigrid approach to the solution for a second order upwind biased discretization of the convection equation in two dimensions. This approach is based on semi-coarsening and well balanced explicit correction terms added to coarse-grid operators to maintain on the coarse grid the same cross-characteristic interaction as on the target (fine) grid. Colored relaxation schemes are used on all the levels, allowing a very efficient parallel implementation. The results of the numerical tests can be summarized as follows: 1) The residual asymptotic convergence rate of the proposed V(0, 2) multigrid cycle is about 3 per cycle. This convergence rate far surpasses the theoretical limit (4/3) predicted for standard multigrid algorithms using full coarsening. The reported efficiency does not deteriorate with increasing the cycle depth (number of levels) and/or refining the target-grid mesh spacing. 2) The full multi-grid algorithm (FMG) with two V(0, 2) cycles on the target grid and just one V(0, 2) cycle on all the coarse grids always provides an approximate solution with the algebraic error less than the discretization error. Estimates of the total work in the FMG algorithm range between 18 and 30 minimal work units (depending on the target discretization). Thus, the overall efficiency of the FMG solver closely approaches (if it does not achieve) the goal of the textbook multigrid efficiency. 3) A novel approach to deriving a discrete solution approximating the true continuous solution with a relative accuracy given in advance is developed. An adaptive multigrid algorithm (AMA) using comparison of the solutions on two successive target grids to estimate the accuracy of the current target-grid solution is defined. A desired relative accuracy is accepted as an input parameter. The final target grid on which this accuracy can be achieved is chosen automatically in the solution process. The actual relative accuracy of the discrete solution approximation
Performance Models for the Spike Banded Linear System Solver
Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; Grama, Ananth
2011-01-01
With the availability of large-scale parallel platforms comprised of tens-of-thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners, compared to the state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, (iii) we show excellent prediction capabilities of our model, based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated
The novel high-performance 3-D MT inverse solver
NASA Astrophysics Data System (ADS)
Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey
2016-04-01
We present a novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in a multi-language paradigm to make it as efficient, readable and maintainable as possible. Separation of concerns and single responsibility concepts run through the implementation of the solver. As a forward modelling engine, a modern scalable solver extrEMe, based on the contracting integral equation approach, is used. An iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for the (regularized) inverse problem solution, and an adjoint source approach is used to calculate efficiently the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize an inverse domain the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms ranging from modern laptops to HPC Piz Daint (6th supercomputer in the world) demonstrate practically linear scalability of the code up to thousands of nodes.
Adaptive kinetic-fluid solvers for heterogeneous computing architectures
NASA Astrophysics Data System (ADS)
Zabelok, Sergey; Arslanbekov, Robert; Kolobov, Vladimir
2015-12-01
We show the feasibility and benefits of porting an adaptive multi-scale kinetic-fluid code to CPU-GPU systems. Challenges are due to the irregular data access for the adaptive Cartesian mesh, the vast difference of computational cost between kinetic and fluid cells, and the desire to evenly load all CPUs and GPUs during grid adaptation and algorithm refinement. Our Unified Flow Solver (UFS) combines Adaptive Mesh Refinement (AMR) with automatic cell-by-cell selection of kinetic or fluid solvers based on continuum breakdown criteria. Using GPUs enables hybrid simulations of mixed rarefied-continuum flows with a million Boltzmann cells, each having a 24 × 24 × 24 velocity mesh. We describe the implementation of CUDA kernels for three modules in UFS: the direct Boltzmann solver using the discrete velocity method (DVM), the Direct Simulation Monte Carlo (DSMC) solver, and a mesoscopic solver based on the Lattice Boltzmann Method (LBM), all using an adaptive Cartesian mesh. Double-digit speedups on a single GPU and good scaling for multiple GPUs have been demonstrated.
Continuous-time quantum Monte Carlo impurity solvers
NASA Astrophysics Data System (ADS)
Gull, Emanuel; Werner, Philipp; Fuchs, Sebastian; Surer, Brigitte; Pruschke, Thomas; Troyer, Matthias
2011-04-01
Continuous-time quantum Monte Carlo impurity solvers are algorithms that sample the partition function of an impurity model using diagrammatic Monte Carlo techniques. The present paper describes codes that implement the interaction expansion algorithm originally developed by Rubtsov, Savkin, and Lichtenstein, as well as the hybridization expansion method developed by Werner, Millis, Troyer, et al. These impurity solvers are part of the ALPS-DMFT application package and are accompanied by an implementation of dynamical mean-field self-consistency equations for (single orbital single site) dynamical mean-field problems with arbitrary densities of states. Program summary. Program title: dmft. Catalogue identifier: AEIL_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIL_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: ALPS LIBRARY LICENSE version 1.1. No. of lines in distributed program, including test data, etc.: 899 806. No. of bytes in distributed program, including test data, etc.: 32 153 916. Distribution format: tar.gz. Programming language: C++. Operating system: The ALPS libraries have been tested on the following platforms and compilers: Linux with GNU Compiler Collection (g++ version 3.1 and higher) and Intel C++ Compiler (icc version 7.0 and higher); MacOS X with GNU Compiler (g++ Apple-version 3.1, 3.3 and 4.0); IBM AIX with Visual Age C++ (xlC version 6.0) and GNU (g++ version 3.1 and higher) compilers; Compaq Tru64 UNIX with Compaq C++ Compiler (cxx); SGI IRIX with MIPSpro C++ Compiler (CC); HP-UX with HP C++ Compiler (aCC); Windows with Cygwin or coLinux platforms and GNU Compiler Collection (g++ version 3.1 and higher). RAM: 10 MB-1 GB. Classification: 7.3. External routines: ALPS [1], BLAS/LAPACK, HDF5. Nature of problem: (See [2].) Quantum impurity models describe an atom or molecule embedded in a host material with which it can exchange electrons. They are basic to nanoscience as
A multiple right hand side iterative solver for history matching
Killough, J.E.; Sharma, Y.; Dupuy, A.; Bissell, R.; Wallis, J.
1995-12-31
History matching of oil and gas reservoirs can be accelerated by directly calculating the gradients of observed quantities (e.g., well pressure) with respect to the adjustable reservoir parameters (e.g., permeability). This leads to a set of linear equations which add a significant overhead to the full simulation run without gradients. Direct Gauss elimination solvers can be used to address this problem by performing the factorization of the matrix only once and then reusing the factor matrix for the solution of the multiple right hand sides. This is a limited technique, however. Experience has shown that problems with more than a few thousand cells may not be practical for direct solvers because of computation time and memory limitations. This paper discusses the implementation of a multiple right hand side iterative linear equation solver (MRHS) for a system of adjoint equations to significantly enhance the performance of a gradient simulator.
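The direct approach that the abstract contrasts with its iterative MRHS solver (factor the matrix once, then reuse the factors for every right hand side) can be sketched in Python as follows. This is a small dense illustration, not the paper's code: the function names lu_factor and lu_solve, the 3 x 3 matrix, and the right hand sides are assumptions made for the example.

```python
def lu_factor(a):
    """LU factorization with partial pivoting of a dense square matrix.

    Returns (lu, perm): L (unit diagonal, below the diagonal) and U
    (on and above the diagonal) packed in one matrix, plus the row order.
    """
    n = len(a)
    lu = [row[:] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Choose the largest pivot in column k for numerical stability.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b using factors from lu_factor (two triangular sweeps)."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):                      # forward elimination, L y = P b
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):            # back substitution, U x = y
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
rhs_list = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [5.0, -1.0, 2.0]]

lu, perm = lu_factor(A)                     # O(n^3) work, done once
solutions = [lu_solve(lu, perm, b) for b in rhs_list]  # O(n^2) per right hand side
```

The cubic cost and the storage of the factors are what make this impractical for large grids, which is the limitation the abstract gives as motivation for an iterative solver.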
GPU Implementation of a Viscous Flow Solver on Unstructured Grids
NASA Astrophysics Data System (ADS)
Xu, Tianhao; Chen, Long
2016-06-01
Graphics processing units have gained popularity in scientific computing over the past several years due to their outstanding parallel computing capability. Computational fluid dynamics applications involve large amounts of calculation, so a recent GPU card, whose peak computing performance and memory bandwidth are much better than those of a contemporary high-end CPU, is preferable. We herein focus on the detailed implementation of our GPU-targeted Reynolds-averaged Navier-Stokes solver based on the finite-volume method. The solver employs a vertex-centered scheme on unstructured grids so that it can handle complex topologies. Multiple optimizations are carried out to improve the memory access performance and kernel utilization. Both steady and unsteady flow simulation cases are carried out using an explicit Runge-Kutta scheme. The GPU-accelerated solver in this paper is demonstrated to have competitive advantages over its CPU-targeted counterpart.
Two Solvers for Tractable Temporal Constraints with Preferences
NASA Technical Reports Server (NTRS)
Rossi, F.; Khatib, L.; Morris, P.; Morris, R.; Clancy, Daniel (Technical Monitor)
2002-01-01
A number of reasoning problems involving the manipulation of temporal information can naturally be viewed as implicitly inducing an ordering of potential local decisions involving time on the basis of preferences. Soft temporal constraint problems make it possible to describe in a natural way scenarios where events happen over time and preferences are associated with event distances and durations. In general, solving soft temporal problems requires exponential time in the worst case, but there are interesting subclasses of problems which are polynomially solvable. We describe two solvers based on two different approaches for solving the same tractable subclass. For each solver we present the theoretical results it stands on, a description of the algorithm and some experimental results. The random generator used to build the problems on which tests are performed is also described. Finally, we compare the two solvers, highlighting the tradeoff between performance and representational power.
Turbomachinery blade design using a Navier-Stokes solver and artificial neural network
Pierret, S.; Van den Braembussche, R.A.
1999-04-01
This paper describes a knowledge-based method for the automatic design of more efficient turbine blades. An Artificial Neural Network (ANN) is used to construct an approximate model (response surface) using a database containing Navier-Stokes solutions for all previous designs. This approximate model is used for the optimization, by means of Simulated Annealing (SA), of the blade geometry, which is then analyzed by a Navier-Stokes solver. This procedure results in a considerable speed-up of the design process by reducing both the interventions of the operator and the computational effort. It is also shown how such a method allows the design of more efficient blades while satisfying both the aerodynamic and mechanical constraints. The method has been applied to different types of two-dimensional turbine blades, of which three examples are presented in this paper.
An evaluation of parallel multigrid as a solver and a preconditioner for singular perturbed problems
Oosterlee, C.W.; Washio, T.
1996-12-31
In this paper we try to achieve h-independent convergence with preconditioned GMRES and BiCGSTAB for 2D singular perturbed equations. Three recently developed multigrid methods are adopted as a preconditioner. They are also used as solution methods in order to compare the performance of the methods as solvers and as preconditioners. Two of the multigrid methods differ only in the transfer operators. One uses standard matrix-dependent prolongation operators from. The second uses "upwind" prolongation operators, developed. Both employ the Galerkin coarse grid approximation and an alternating zebra line Gauss-Seidel smoother. The third method is based on the block LU decomposition of a matrix and on an approximate Schur complement. This multigrid variant is presented in. All three multigrid algorithms are algebraic methods.
LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators
NASA Astrophysics Data System (ADS)
Gonzalez, Juan; Núñez, Rafael C.
2009-07-01
We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.
Numerical System Solver Developed for the National Cycle Program
NASA Technical Reports Server (NTRS)
Binder, Michael P.
1999-01-01
As part of the National Cycle Program (NCP), a powerful new numerical solver has been developed to support the simulation of aeropropulsion systems. This software uses a hierarchical object-oriented design. It can provide steady-state and time-dependent solutions to nonlinear and even discontinuous problems typically encountered when aircraft and spacecraft propulsion systems are simulated. It also can handle constrained solutions, in which one or more factors may limit the behavior of the engine system. Time-dependent simulation capabilities include adaptive time-stepping and synchronization with digital control elements. The NCP solver is playing an important role in making the NCP a flexible, powerful, and reliable simulation package.
Profile solver in C for finite element equations
NASA Astrophysics Data System (ADS)
Hededal, O.; Krenk, S.
1994-08-01
This paper presents an efficient, pointer based profile solver with standard matrix indexing. Constrained equations Ax = b where x contains known and unknown values are solved and the full vectors x and b are obtained. Pseudo-code algorithms are formulated for a row oriented form of the LDL^T factorization and implemented directly as a C code. The solver is implemented in C because of the close relation between two-dimensional arrays and pointers which makes it possible to write a clear and efficient code.
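A row oriented LDL^T factorization of the kind the abstract mentions can be sketched as below. This is an assumption-laden illustration in Python rather than the paper's C code: it uses dense storage instead of a profile (skyline) layout, it omits the handling of known values in x, and the names ldlt and ldlt_solve are hypothetical. A profile solver would run the same loops but start each row at its first nonzero column.

```python
def ldlt(a):
    """Row oriented LDL^T factorization of a symmetric matrix.

    Returns (L, d) with L unit lower triangular and d the diagonal of D,
    so that A = L * diag(d) * L^T. No pivoting is performed.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for i in range(n):                       # one row of L at a time
        for j in range(i):
            s = a[i][j] - sum(L[i][k] * d[k] * L[j][k] for k in range(j))
            L[i][j] = s / d[j]
        d[i] = a[i][i] - sum(L[i][k] ** 2 * d[k] for k in range(i))
        L[i][i] = 1.0
    return L, d


def ldlt_solve(L, d, b):
    """Solve A x = b from the factors: L z = b, D y = z, L^T x = y."""
    n = len(b)
    z = b[:]
    for i in range(n):
        z[i] -= sum(L[i][k] * z[k] for k in range(i))
    x = [z[i] / d[i] for i in range(n)]
    for i in reversed(range(n)):
        x[i] -= sum(L[k][i] * x[k] for k in range(i + 1, n))
    return x


A = [[4.0, 2.0, 0.0], [2.0, 5.0, 2.0], [0.0, 2.0, 5.0]]
b = [2.0, 1.0, 8.0]
L, d = ldlt(A)
x = ldlt_solve(L, d, b)
print(d, x)
```

Because only L and d are stored, the factorization needs roughly half the memory of a general LU decomposition, which is one reason it is a common choice for symmetric finite element systems.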
Rasin, A.
1994-04-01
We discuss the idea of approximate flavor symmetries. Relations between approximate flavor symmetries and natural flavor conservation and democracy models are explored. Implications for neutrino physics are also discussed.
Median Approximations for Genomes Modeled as Matrices.
Zanetti, Joao Paulo Pereira; Biller, Priscila; Meidanis, Joao
2016-04-01
The genome median problem is an important problem in phylogenetic reconstruction under rearrangement models. It can be stated as follows: Given three genomes, find a fourth that minimizes the sum of the pairwise rearrangement distances between it and the three input genomes. In this paper, we model genomes as matrices and study the matrix median problem using the rank distance. It is known that, for any metric distance, at least one of the corners is a [Formula: see text]-approximation of the median. Our results allow us to compute up to three additional matrix median candidates, all of them with approximation ratios at least as good as the best corner, when the input matrices come from genomes. We also show a class of instances where our candidates are optimal. From the application point of view, it is usually more interesting to locate medians farther from the corners, and therefore, these new candidates are potentially more useful. In addition to the approximation algorithm, we suggest a heuristic to get a genome from an arbitrary square matrix. This is useful to translate the results of our median approximation algorithm back to genomes, and it has good results in our tests. To assess the relevance of our approach in the biological context, we ran simulated evolution tests and compared our solutions to those of an exact DCJ median solver. The results show that our method is capable of producing very good candidates. PMID:27072561
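The corner baseline mentioned in the abstract can be illustrated with a short Python sketch. Everything specific here is an assumption made for the example: the three 3 x 3 permutation matrices standing in for genomes, the helper names rank, rank_distance and median_score, and the use of exact rational arithmetic for the rank. The sketch only scores the three input matrices (the corners) as median candidates; it does not implement the paper's additional candidates or its heuristic.

```python
from fractions import Fraction


def rank(m):
    """Rank of a matrix by Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            if f != 0:
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def rank_distance(a, b):
    """Rank distance d(A, B) = rank(A - B)."""
    return rank([[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)])


def median_score(candidate, inputs):
    """Sum of rank distances from a candidate to the input matrices."""
    return sum(rank_distance(candidate, m) for m in inputs)


A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # hypothetical input matrices
B = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
C = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
inputs = [A, B, C]

scores = [median_score(m, inputs) for m in inputs]
best_index = min(range(3), key=lambda i: scores[i])
print(scores, best_index)   # prints [3, 2, 3] 1
```

In this small instance the pairwise distances are 1, 2 and 1, so half their sum (2) is a lower bound on any median score by the triangle inequality, and the corner B attains it.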
NASA Astrophysics Data System (ADS)
Niiniluoto, Ilkka
2014-03-01
Approximation of laws is an important theme in the philosophy of science. If we can make sense of the idea that two scientific laws are "close" to each other, then we can also analyze such methodological notions as approximate explanation of laws, approximate reduction of theories, approximate empirical success of theories, and approximate truth of laws. Proposals for measuring the distance between quantitative scientific laws were given in Niiniluoto (1982, 1987). In this paper, these definitions are reconsidered as a response to the interesting critical remarks by Liu (1999).
Navier-Stokes Solvers and Generalizations for Reacting Flow Problems
Elman, Howard C
2013-01-27
This is an overview of our accomplishments during the final term of this grant (1 September 2008 -- 30 June 2012). These fall mainly into three categories: fast algorithms for linear eigenvalue problems; solution algorithms and modeling methods for partial differential equations with uncertain coefficients; and preconditioning methods and solvers for models of computational fluid dynamics (CFD).
Intellectual Abilities That Discriminate Good and Poor Problem Solvers.
ERIC Educational Resources Information Center
Meyer, Ruth Ann
1981-01-01
This study compared good and poor fourth-grade problem solvers on a battery of 19 "reference" tests for verbal, induction, numerical, word fluency, memory, perceptual speed, and simple visualization abilities. Results suggest verbal, numerical, and especially induction abilities are important to successful mathematical problem solving. (MP)
Coordinate Projection-based Solver for ODE with Invariants
2008-04-08
CPODES is a general purpose (serial and parallel) solver for systems of ordinary differential equations (ODEs) with invariants. It implements a coordinate projection approach using different types of projection (orthogonal or oblique) and one of several methods for the decomposition of the Jacobian of the invariant equations.
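The coordinate projection idea can be illustrated with a toy Python sketch. It does not use or imitate the CPODES interface: the integrator is plain explicit Euler, the test problem is a harmonic oscillator with the single invariant q^2 + p^2 = 1, and the names project, g, grad_g and run are assumptions made for the example. After each step the state is moved back onto the invariant manifold along the gradient of the invariant, which is an orthogonal projection for one scalar invariant.

```python
def project(y, g, grad_g, tol=1e-12, max_iter=20):
    """Project y onto {g(y) = 0} by Newton-type steps along grad g."""
    for _ in range(max_iter):
        r = g(y)
        if abs(r) < tol:
            break
        G = grad_g(y)
        gg = sum(v * v for v in G)
        y = [yi - r * Gi / gg for yi, Gi in zip(y, G)]
    return y


def g(y):
    # Invariant of the oscillator q' = p, p' = -q started on the unit circle.
    return y[0] ** 2 + y[1] ** 2 - 1.0


def grad_g(y):
    return [2.0 * y[0], 2.0 * y[1]]


def run(steps, h, use_projection):
    y = [1.0, 0.0]
    for _ in range(steps):
        q, p = y
        y = [q + h * p, p - h * q]          # explicit Euler step
        if use_projection:
            y = project(y, g, grad_g)       # restore the invariant
    return y


plain = run(1000, 0.1, use_projection=False)
projected = run(1000, 0.1, use_projection=True)
print(g(plain), g(projected))
```

Without projection the Euler iterates spiral outward and the invariant is violated badly; with projection the invariant holds to rounding error, although the phase error of the underlying integrator remains.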
Two level scheme solvers for nuclear spectroscopy
NASA Astrophysics Data System (ADS)
Jansson, Kaj; DiJulio, Douglas; Cederkäll, Joakim
2011-10-01
A program for building level schemes from γ-spectroscopy coincidence data has been developed. The scheme builder was equipped with two different algorithms: a statistical one based on the Metropolis method and a more logical one, called REMP (REcurse, Merge and Permute), developed from scratch. These two methods are compared both on ideal cases and on experimental γ-ray data sets. The REMP algorithm is based on coincidences and transition energies. Using correct and complete coincidence data, it has solved approximately half a million schemes without failures. Also, for incomplete data and data with minor errors, the algorithm produces consistent sub-schemes when it is not possible to obtain a complete scheme from the provided data.
Sparse pseudospectral approximation method
NASA Astrophysics Data System (ADS)
Constantine, Paul G.; Eldred, Michael S.; Phipps, Eric T.
2012-07-01
Multivariate global polynomial approximations - such as polynomial chaos or stochastic collocation methods - are now in widespread use for sensitivity analysis and uncertainty quantification. The pseudospectral variety of these methods uses a numerical integration rule to approximate the Fourier-type coefficients of a truncated expansion in orthogonal polynomials. For problems in more than two or three dimensions, a sparse grid numerical integration rule offers accuracy with a smaller node set compared to tensor product approximation. However, when using a sparse rule to approximately integrate these coefficients, one often finds unacceptable errors in the coefficients associated with higher degree polynomials. By reexamining Smolyak's algorithm and exploiting the connections between interpolation and projection in tensor product spaces, we construct a sparse pseudospectral approximation method that accurately reproduces the coefficients of basis functions that naturally correspond to the sparse grid integration rule. The compelling numerical results show that this is the proper way to use sparse grid integration rules for pseudospectral approximation.
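The one-dimensional building block of a pseudospectral approximation, using a quadrature rule to approximate the coefficients of a truncated orthogonal polynomial expansion, can be sketched in Python as follows. This is not the sparse method of the paper: it uses a single 3-point Gauss-Legendre rule on [-1, 1], Legendre polynomials as the basis, and the hypothetical names legendre and pseudospectral_coeffs. The coefficient of P_k is c_k = (2k + 1)/2 times the integral of f(x) P_k(x), with the integral replaced by the quadrature sum.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1]; exact for polynomials up to degree 5.
NODES = [-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0)]
WEIGHTS = [5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0]


def legendre(k, x):
    """Evaluate the Legendre polynomial P_k(x) by the three-term recurrence."""
    if k == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, k):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def pseudospectral_coeffs(f, degree):
    """Approximate Legendre coefficients c_0..c_degree of f by quadrature."""
    return [
        (2 * k + 1) / 2.0
        * sum(w * f(x) * legendre(k, x) for x, w in zip(NODES, WEIGHTS))
        for k in range(degree + 1)
    ]


coeffs = pseudospectral_coeffs(lambda x: x * x, degree=2)
print(coeffs)   # approximately [1/3, 0, 2/3], since x^2 = P_0/3 + 2*P_2/3
```

Asking this 3-point rule for coefficients of degree 3 or higher makes the integrand's degree exceed what the rule integrates exactly once f itself has degree 3 or more, which is a one-dimensional analogue of the coefficient errors the abstract describes for sparse rules.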
Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers
NASA Astrophysics Data System (ADS)
Tang, Yu-Hang; Kudo, Shuhei; Bian, Xin; Li, Zhen; Karniadakis, George Em
2015-09-01
Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant difficulties in practice. In this paper we present a C++ library, i.e. the Multiscale Universal Interface (MUI), which is capable of facilitating the coupling effort for a wide range of multiscale simulations. The library adopts a header-only form with minimal external dependency and hence can be easily dropped into existing codes. A data sampler concept is introduced, combined with a hybrid dynamic/static typing mechanism, to create an easily customizable framework for solver-independent data interpretation. The library integrates MPI MPMD support and an asynchronous communication protocol to handle inter-solver information exchange irrespective of the solvers' own MPI awareness. Template metaprogramming is heavily employed to simultaneously improve runtime performance and code flexibility. We validated the library by solving three different multiscale problems, which also serve to demonstrate the flexibility of the framework in handling heterogeneous models and solvers. In the first example, a Couette flow was simulated using two concurrently coupled Smoothed Particle Hydrodynamics (SPH) simulations of different spatial resolutions. In the second example, we coupled the deterministic SPH method with the stochastic Dissipative Particle Dynamics (DPD) method to study the effect of surface grafting on the hydrodynamics properties on the surface. In the third example, we consider conjugate heat transfer between a solid domain and a fluid domain by coupling the particle-based energy-conserving DPD (eDPD) method with the Finite Element Method (FEM).
Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers
Tang, Yu-Hang; Kudo, Shuhei; Bian, Xin; Li, Zhen; Karniadakis, George Em
2015-09-15
Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant difficulties in practice. In this paper we present a C++ library, i.e. the Multiscale Universal Interface (MUI), which is capable of facilitating the coupling effort for a wide range of multiscale simulations. The library adopts a header-only form with minimal external dependency and hence can be easily dropped into existing codes. A data sampler concept is introduced, combined with a hybrid dynamic/static typing mechanism, to create an easily customizable framework for solver-independent data interpretation. The library integrates MPI MPMD support and an asynchronous communication protocol to handle inter-solver information exchange irrespective of the solvers' own MPI awareness. Template metaprogramming is heavily employed to simultaneously improve runtime performance and code flexibility. We validated the library by solving three different multiscale problems, which also serve to demonstrate the flexibility of the framework in handling heterogeneous models and solvers. In the first example, a Couette flow was simulated using two concurrently coupled Smoothed Particle Hydrodynamics (SPH) simulations of different spatial resolutions. In the second example, we coupled the deterministic SPH method with the stochastic Dissipative Particle Dynamics (DPD) method to study the effect of surface grafting on the hydrodynamics properties on the surface. In the third example, we consider conjugate heat transfer between a solid domain and a fluid domain by coupling the particle-based energy-conserving DPD (eDPD) method with the Finite Element Method (FEM).
Migration of vectorized iterative solvers to distributed memory architectures
Pommerell, C.; Ruehl, R.
1994-12-31
Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved; smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this "natural" parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation (structured or unstructured sparse matrix-vector multiplication), or to increase locality and parallelism by reformulating the algorithm (reducing global synchronization in inner products or local data exchange in preconditioners). Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.
Decision Engines for Software Analysis Using Satisfiability Modulo Theories Solvers
NASA Technical Reports Server (NTRS)
Bjorner, Nikolaj
2010-01-01
The area of software analysis, testing and verification is now undergoing a revolution thanks to the use of automated and scalable support for logical methods. A well-recognized premise is that at the core of software analysis engines is invariably a component using logical formulas for describing states and transformations between system states. The process of using this information for discovering and checking program properties (including such important properties as safety and security) amounts to automatic theorem proving. In particular, theorem provers that directly support common software constructs offer a compelling basis. Such provers are commonly called satisfiability modulo theories (SMT) solvers. Z3 is a state-of-the-art SMT solver. It is developed at Microsoft Research. It can be used to check the satisfiability of logical formulas over one or more theories such as arithmetic, bit-vectors, lists, records and arrays. The talk describes some of the technology behind modern SMT solvers, including the solver Z3. Z3 is currently mainly targeted at solving problems that arise in software analysis and verification. It has been applied to various contexts, such as systems for dynamic symbolic simulation (Pex, SAGE, Vigilante), for program verification and extended static checking (Spec#/Boogie, VCC, HAVOC), for software model checking (Yogi, SLAM), model-based design (FORMULA), security protocol code (F7), program run-time analysis and invariant generation (VS3). We will describe how it integrates support for a variety of theories that arise naturally in the context of the applications. There are several new promising avenues and the talk will touch on some of these and the challenges related to SMT solvers.
GPU accelerated flow solver for direct numerical simulation of turbulent flows
NASA Astrophysics Data System (ADS)
Salvadore, Francesco; Bernardini, Matteo; Botti, Michela
2013-02-01
Graphical processing units (GPUs), characterized by significant computing performance, are nowadays very appealing for the solution of computationally demanding tasks in a wide variety of scientific applications. However, to run on GPUs, existing codes need to be ported and optimized, a procedure which is not yet standardized and may require non trivial efforts, even to high-performance computing specialists. In the present paper we accurately describe the porting to CUDA (Compute Unified Device Architecture) of a finite-difference compressible Navier-Stokes solver, suitable for direct numerical simulation (DNS) of turbulent flows. Porting and validation processes are illustrated in detail, with emphasis on computational strategies and techniques that can be applied to overcome typical bottlenecks arising from the porting of common computational fluid dynamics solvers. We demonstrate that a careful optimization work is crucial to get the highest performance from GPU accelerators. The results show that the overall speedup of one NVIDIA Tesla S2070 GPU is approximately 22 compared with one AMD Opteron 2352 Barcelona chip and 11 compared with one Intel Xeon X5650 Westmere core. The potential of GPU devices in the simulation of unsteady three-dimensional turbulent flows is proved by performing a DNS of a spatially evolving compressible mixing layer.
GPU accelerated flow solver for direct numerical simulation of turbulent flows
Salvadore, Francesco; Botti, Michela
2013-02-15
Graphical processing units (GPUs), characterized by significant computing performance, are nowadays very appealing for the solution of computationally demanding tasks in a wide variety of scientific applications. However, to run on GPUs, existing codes need to be ported and optimized, a procedure which is not yet standardized and may require non trivial efforts, even to high-performance computing specialists. In the present paper we accurately describe the porting to CUDA (Compute Unified Device Architecture) of a finite-difference compressible Navier–Stokes solver, suitable for direct numerical simulation (DNS) of turbulent flows. Porting and validation processes are illustrated in detail, with emphasis on computational strategies and techniques that can be applied to overcome typical bottlenecks arising from the porting of common computational fluid dynamics solvers. We demonstrate that a careful optimization work is crucial to get the highest performance from GPU accelerators. The results show that the overall speedup of one NVIDIA Tesla S2070 GPU is approximately 22 compared with one AMD Opteron 2352 Barcelona chip and 11 compared with one Intel Xeon X5650 Westmere core. The potential of GPU devices in the simulation of unsteady three-dimensional turbulent flows is proved by performing a DNS of a spatially evolving compressible mixing layer.
Solvers for $\mathcal{O}(N)$ Electronic Structure in the Strong Scaling Limit
Bock, Nicolas; Challacombe, William M.; Kale, Laxmikant
2016-01-26
Here we present a hybrid OpenMP/Charm++ framework for solving the $\mathcal{O}(N)$ self-consistent-field eigenvalue problem with parallelism in the strong scaling regime, $P \gg N$, where $P$ is the number of cores, and $N$ is a measure of system size, i.e., the number of matrix rows/columns, basis functions, atoms, molecules, etc. This result is achieved with a nested approach to spectral projection and the sparse approximate matrix multiply [Bock and Challacombe, SIAM J. Sci. Comput., 35 (2013), pp. C72--C98], and involves a recursive, task-parallel algorithm, often employed by generalized $N$-Body solvers, for occlusion and culling of negligible products in the case of matrices with decay. Lastly, employing classic technologies associated with generalized $N$-Body solvers, including overdecomposition, recursive task parallelism, orderings that preserve locality, and persistence-based load balancing, we obtain scaling beyond hundreds of cores per molecule for small water clusters ([H$_2$O]$_N$, $N \in \{ 30, 90, 150 \}$, $P/N \approx \{ 819, 273, 164 \}$) and find support for an increasingly strong scalability with increasing system size $N$.
Response of buoyant plumes to transient discharges investigated using an adaptive solver
NASA Astrophysics Data System (ADS)
O'Callaghan, J.; Rickard, G.; Popinet, S.; Stevens, C.
2010-11-01
The behavior of buoyant plumes driven by variable momentum inputs was examined using an adaptive Navier-Stokes solver (Gerris). Boundary conditions were representative of an idealized stratified, coastal environment. Salinity ranged from 5 to 30 in the top 5 m of the water column to replicate the strong vertical gradients experienced in fjord environments. Two-dimensional simulations examined the response of the buoyant plume driven by zero, steady, and variable momentum fluxes. The behavior was quantified in terms of the characteristic features of a buoyant plume, the thickness of the nose (or head of gravity current), and the trailing tail. Both the nose and tail of the plume were substantially thicker for the variable momentum run, whereas elongation and thinning of the plume were evident for the steady and zero momentum inputs. Furthermore, an order of magnitude difference in available potential energy was found for the variable momentum run. Validation of the Boussinesq approximation initially utilized the classic lock-exchange experiment with excellent agreement to previous numerical and theoretical experiments. Frontal speeds of the gravity current converged toward the theoretical value of Benjamin (1968). The adaptive mesh permitted lock-exchange simulations at a Reynolds number (Re) of ~10,500, which are some of the highest Re runs to date. Moreover, improved computational efficiency was achieved using the adaptive solver with simulations completed in 20% of the time they took on a static, high-resolution grid.
Approximations for photoelectron scattering
NASA Astrophysics Data System (ADS)
Fritzsche, V.
1989-04-01
The errors of several approximations in the theoretical approach of photoelectron scattering are systematically studied, in tungsten, for electron energies ranging from 10 to 1000 eV. The large inaccuracies of the plane-wave approximation (PWA) are substantially reduced by means of effective scattering amplitudes in the modified small-scattering-centre approximation (MSSCA). The reduced angular momentum expansion (RAME) is so accurate that it allows reliable calculations of multiple-scattering contributions for all the energies considered.
Efficient three-dimensional Poisson solvers in open rectangular conducting pipe
NASA Astrophysics Data System (ADS)
Qiang, Ji
2016-06-01
Three-dimensional (3D) Poisson solvers play an important role in the study of space-charge effects on charged particle beam dynamics in particle accelerators. In this paper, we propose three new 3D Poisson solvers for a charged particle beam in an open rectangular conducting pipe. These three solvers include a spectral integrated Green function (IGF) solver, a 3D spectral solver, and a 3D integrated Green function solver. These solvers effectively handle the longitudinal open boundary condition using a finite computational domain that contains the beam itself. This saves the computational cost of using an extra larger longitudinal domain in order to set up an appropriate finite boundary condition. Using an integrated Green function also avoids the need to resolve rapid variation of the Green function inside the beam. The numerical operational cost of the spectral IGF solver and the 3D IGF solver scales as $O(N \log N)$, where $N$ is the number of grid points. The cost of the 3D spectral solver scales as $O(N_n N)$, where $N_n$ is the maximum longitudinal mode number. We compare these three solvers using several numerical examples and discuss the advantageous regime of each solver in the physical application.
NASA Technical Reports Server (NTRS)
Dutta, Soumitra
1988-01-01
A model for approximate spatial reasoning using fuzzy logic to represent the uncertainty in the environment is presented. Algorithms are developed which can be used to reason about spatial information expressed in the form of approximate linguistic descriptions similar to the kind of spatial information processed by humans. Particular attention is given to static spatial reasoning.
2d PDE Linear Asymmetric Matrix Solver
1983-10-01
ILUCG2 (Incomplete LU factorized Conjugate Gradient algorithm for 2d problems) was developed to solve a linear asymmetric matrix system arising from a 9-point discretization of two-dimensional elliptic and parabolic partial differential equations found in plasma physics applications, such as plasma diffusion, equilibria, and phase space transport (Fokker-Planck equation) problems. These equations share the common feature of being stiff and requiring implicit solution techniques. When these parabolic or elliptic PDEs are discretized with finite-difference or finite-element methods, the resulting matrix system is frequently of block-tridiagonal form. To use ILUCG2, the discretization of the two-dimensional partial differential equation and its boundary conditions must result in a block-tridiagonal supermatrix composed of elementary tridiagonal matrices. A generalization of the incomplete Cholesky conjugate gradient algorithm is used to solve the matrix equation. Loops are arranged to vectorize on the Cray-1 with the CFT compiler, wherever possible. Recursive loops, which cannot be vectorized, are written for optimum scalar speed. For problems having a symmetric matrix, ICCG2 should be used since it runs up to four times faster and uses approximately 30% less storage. Similar methods in three dimensions are available in ICCG3 and ILUCG3. A general source, containing extensions and macros, which must be processed by a pre-compiler to obtain the standard FORTRAN source, is provided along with the standard FORTRAN source because it is believed to be more readable. The pre-compiler is not included, but pre-compilation may be performed by a text editor as described in the UCRL-88746 Preprint.
A parallel-vector equation solver for unsymmetric matrices on supercomputers
NASA Technical Reports Server (NTRS)
Qin, J.; Mei, C.; Nguyen, D. T.; Gray, C. E., Jr.
1991-01-01
A parallel-vector unsymmetric equation solver is presented. The solver exploits both vector and parallel capabilities provided by modern, high-performance supercomputers. A special storage scheme and loop-unrolling technique are used to optimize the vector performance. A parallel FORTRAN language is used to develop the solver on the CRAY 2 and CRAY Y-MP multiple processing computer environment. Three numerical examples are presented which demonstrate the efficiency and accuracy of this equation solver. The first two examples demonstrate the improved performance, and the third example utilizes the proposed solver to solve a highly nonlinear, unsymmetric finite element formulation for panel flutter.
Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
1998-01-01
A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.
Rethinking Electrostatic Solvers in Particle Simulations for the Exascale Era
NASA Astrophysics Data System (ADS)
Deca, Jan; Markidis, Stefano; Lapenta, Giovanni; Járleberg, Erik; Apostolov, Rossen; Laure, Erwin
2012-10-01
In preparation for the exascale era, an alternative approach to calculating the electrostatic forces in Particle Mesh (PM) methods is proposed. While the traditional techniques are based on the calculation of the electrostatic potential by solving the Poisson equation, in the new approach the electric field is calculated by solving Ampère's law. When Ampère's law is discretized explicitly in time, the electric field values on the mesh are simply updated from the previous values. In this way, the electrostatic solver becomes an embarrassingly parallel problem, making the algorithm extremely scalable and suitable for exascale computing platforms. A PM code implementing the new electrostatic solver is presented to show that the proposed method produces correct results. It is a very promising algorithm for exascale PM simulations.
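The point-wise update described above can be sketched as follows. This is a minimal illustration, assuming the electrostatic limit in which the magnetic curl term of Ampère's law is dropped, so that dE/dt = -J/eps0 at each mesh point; the function name and the forward-Euler time discretization are assumptions for this example, not details taken from the paper.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def update_electric_field(e_field, current_density, dt):
    """Advance E by one explicit step of dE/dt = -J / eps0 at every mesh point.

    Each point depends only on its own previous value and local current
    density, so the loop has no coupling between mesh points.
    """
    return [e - dt * j / EPS0 for e, j in zip(e_field, current_density)]


# Hypothetical two-point mesh: no current at the first point, J = eps0 at the second.
new_field = update_electric_field([1.0, 2.0], [0.0, EPS0], dt=0.5)
```

Because no global linear system is solved, each mesh point (or block of points) could be handled by a separate worker without communication inside the field update, which is the property the abstract refers to as embarrassingly parallel.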
LDRD report : parallel repartitioning for optimal solver performance.
Heaphy, Robert; Devine, Karen Dragon; Preis, Robert; Hendrickson, Bruce Alan; Heroux, Michael Allen; Boman, Erik Gunnar
2004-02-01
We have developed infrastructure, utilities and partitioning methods to improve data partitioning in linear solvers and preconditioners. Our efforts included incorporation of data repartitioning capabilities from the Zoltan toolkit into the Trilinos solver framework, (allowing dynamic repartitioning of Trilinos matrices); implementation of efficient distributed data directories and unstructured communication utilities in Zoltan and Trilinos; development of a new multi-constraint geometric partitioning algorithm (which can generate one decomposition that is good with respect to multiple criteria); and research into hypergraph partitioning algorithms (which provide up to 56% reduction of communication volume compared to graph partitioning for a number of emerging applications). This report includes descriptions of the infrastructure and algorithms developed, along with results demonstrating the effectiveness of our approaches.
An exact solver for the DCJ median problem.
Zhang, Meng; Arndt, William; Tang, Jijun
2009-01-01
The "double-cut-and-join" (DCJ) model of genome rearrangement proposed by Yancopoulos et al. uses the single DCJ operation to account for all genome rearrangement events. Given three signed permutations, the DCJ median problem is to find a fourth permutation that minimizes the sum of the pairwise DCJ distances between it and the three others. In this paper, we present a branch-and-bound method that provides accurate solutions to multichromosomal DCJ median problems. We conduct extensive simulations and the results show that the DCJ median solver performs better than other median solvers for most of the test cases. These experiments also suggest that the DCJ model is more suitable for real datasets where both reversals and transpositions occur.
Scalable Out-of-Core Solvers on Xeon Phi Cluster
D'Azevedo, Ed F; Chan, Ki Shing; Su, Shiquan; Wong, Kwai
2015-01-01
This paper documents the implementation of a distributive out-of-core (OOC) solver for performing LU and Cholesky factorizations of a large dense matrix on clusters of many-core programmable co-processors. The out-of-core algorithm combines both the left-looking and right-looking schemes, aiming to minimize the movement of data between the CPU host and the co-processor while optimizing data locality as well as computing throughput. The OOC solver is built to align with the format of the ScaLAPACK software library, making it readily portable to any existing codes using ScaLAPACK. A runtime analysis conducted on Beacon (an Intel Xeon plus Intel Xeon Phi cluster composed of 48 nodes of multi-core CPU and MIC) at the National Institute for Computational Sciences is presented. A comparison of the performance on the Intel Xeon Phi and GPU clusters is also provided.
A functional implementation of the Jacobi eigen-solver
Boehm, A.P.W. (Dept. of Computer Science); Hiromoto, R.E.
1993-01-01
In this paper, we describe the systematic development of two implementations of the Jacobi eigen-solver and give performance results for the MIT/Motorola Monsoon dataflow machine. Our study is carried out using MINT, the MIT Monsoon simulator. The design of these implementations follows from the mathematics of the Jacobi method, and not from a translation of an existing sequential code. The functional semantics with respect to array updates, which cause excessive array copying, has led us to a new implementation of a parallel "group-rotations" algorithm first described by Sameh. Our version of this algorithm requires O(n^3) operations, whereas Sameh's original version requires O(n^4) operations. The implementations are programmed in the language Id, and although Id has non-functional features, we have restricted the development of our eigen-solvers to the functional sub-set of the language.
A functional implementation of the Jacobi eigen-solver
Boehm, A.P.W.; Hiromoto, R.E.
1993-02-01
In this paper, we describe the systematic development of two implementations of the Jacobi eigen-solver and give performance results for the MIT/Motorola Monsoon dataflow machine. Our study is carried out using MINT, the MIT Monsoon simulator. The design of these implementations follows from the mathematics of the Jacobi method, and not from a translation of an existing sequential code. The functional semantics with respect to array updates, which cause excessive array copying, has led us to a new implementation of a parallel "group-rotations" algorithm first described by Sameh. Our version of this algorithm requires O(n^3) operations, whereas Sameh's original version requires O(n^4) operations. The implementations are programmed in the language Id, and although Id has non-functional features, we have restricted the development of our eigen-solvers to the functional sub-set of the language.
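For readers unfamiliar with the underlying mathematics, the following is a minimal sequential sketch of the textbook cyclic Jacobi method for a real symmetric matrix. It is not the parallel "group-rotations" algorithm of the paper and is written in Python rather than Id; the function name, tolerance, and sweep limit are assumptions chosen for illustration.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Return sorted eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy, leave the input untouched
    for _ in range(max_sweeps):
        # Frobenius norm of the off-diagonal part measures distance from diagonal form.
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Choose the rotation angle (|theta| <= pi/4) that zeroes a[p][q].
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.pi / 4.0
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                # Apply the similarity transform A <- J^T A J in the (p, q) plane.
                for k in range(n):  # column update: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # row update: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


# Small symmetric test matrix whose eigenvalues are 3 - sqrt(3), 3, and 3 + sqrt(3).
eigs = jacobi_eigenvalues([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
```

Each rotation touches only rows and columns p and q, so rotations on disjoint index pairs are independent of one another. That independence is what parallel variants of the method exploit.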
A Nonlinear Modal Aeroelastic Solver for FUN3D
NASA Technical Reports Server (NTRS)
Goldman, Benjamin D.; Bartels, Robert E.; Biedron, Robert T.; Scott, Robert C.
2016-01-01
A nonlinear structural solver has been implemented internally within the NASA FUN3D computational fluid dynamics code, allowing for some new aeroelastic capabilities. Using a modal representation of the structure, a set of differential or differential-algebraic equations is derived for general thin structures with geometric nonlinearities. ODEPACK and LAPACK routines are linked with FUN3D, and the nonlinear equations are solved at each CFD time step. The existing predictor-corrector method is retained, whereby the structural solution is updated after mesh deformation. The nonlinear solver is validated using a test case for a flexible aeroshell at transonic, supersonic, and hypersonic flow conditions. Agreement with linear theory is seen for the static aeroelastic solutions at relatively low dynamic pressures, but structural nonlinearities limit deformation amplitudes at high dynamic pressures. No flutter was found at any of the tested trajectory points, though limit-cycle oscillation (LCO) may be possible in the transonic regime.
On improving linear solver performance: a block variant of GMRES
Baker, A H; Dennis, J M; Jessup, E R
2004-05-10
The increasing gap between processor performance and memory access time warrants the re-examination of data movement in iterative linear solver algorithms. For this reason, we explore and establish the feasibility of modifying a standard iterative linear solver algorithm in a manner that reduces the movement of data through memory. In particular, we present an alternative to the restarted GMRES algorithm for solving a single right-hand side linear system Ax = b based on solving the block linear system AX = B. Algorithm performance, i.e. time to solution, is improved by using the matrix A in operations on groups of vectors. Experimental results demonstrate the importance of implementation choices on data movement as well as the effectiveness of the new method on a variety of problems from different application areas.
Verification and Validation Studies for the LAVA CFD Solver
NASA Technical Reports Server (NTRS)
Moini-Yekta, Shayan; Barad, Michael F; Sozer, Emre; Brehm, Christoph; Housman, Jeffrey A.; Kiris, Cetin C.
2013-01-01
The verification and validation of the Launch Ascent and Vehicle Aerodynamics (LAVA) computational fluid dynamics (CFD) solver is presented. A modern strategy for verification and validation is described incorporating verification tests, validation benchmarks, continuous integration and version control methods for automated testing in a collaborative development environment. The purpose of the approach is to integrate the verification and validation process into the development of the solver and improve productivity. This paper uses the Method of Manufactured Solutions (MMS) for the verification of 2D Euler equations, 3D Navier-Stokes equations as well as turbulence models. A method for systematic refinement of unstructured grids is also presented. Verification using inviscid vortex propagation and flow over a flat plate is highlighted. Simulation results using laminar and turbulent flow past a NACA 0012 airfoil and ONERA M6 wing are validated against experimental and numerical data.
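To illustrate what an MMS verification test checks, here is a toy sketch on a 1D Poisson problem rather than the Euler or Navier-Stokes equations used in the paper. The manufactured solution u(x) = sin(pi x), the finite-difference scheme, and all function names are assumptions made for this example and are unrelated to LAVA itself.

```python
import math


def solve_poisson_1d(n, source):
    """Solve -u'' = source on (0, 1) with u(0) = u(1) = 0 on n interior points."""
    h = 1.0 / (n + 1)
    # Tridiagonal system from central differences:
    # -u[i-1] + 2 u[i] - u[i+1] = h^2 * source(x_i)
    diag = [2.0] * n
    rhs = [h * h * source((i + 1) * h) for i in range(n)]
    # Thomas algorithm, forward elimination (both off-diagonals equal -1).
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] += m
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u


def max_error(n):
    """Max-norm error against the manufactured solution u(x) = sin(pi x)."""
    h = 1.0 / (n + 1)
    # The source term is obtained by inserting the manufactured solution into -u''.
    u = solve_poisson_1d(n, lambda x: math.pi ** 2 * math.sin(math.pi * x))
    return max(abs(u[i] - math.sin(math.pi * (i + 1) * h)) for i in range(n))


# Halving h (n = 15 -> 31 interior points) should cut the error by about 4
# for a second-order scheme, so the observed order should be close to 2.
observed_order = math.log(max_error(15) / max_error(31), 2)
```

If the observed order fell well below the formal order of the scheme, that would point to a coding or discretization error, which is the kind of regression an automated verification suite is meant to catch.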
An Upwind Solver for the National Combustion Code
NASA Technical Reports Server (NTRS)
Sockol, Peter M.
2011-01-01
An upwind solver is presented for the unstructured grid National Combustion Code (NCC). The compressible Navier-Stokes equations with time-derivative preconditioning and preconditioned flux-difference splitting of the inviscid terms are used. First order derivatives are computed on cell faces and used to evaluate the shear stresses and heat fluxes. A new flux limiter uses these same first order derivatives in the evaluation of left and right states used in the flux-difference splitting. The k-epsilon turbulence equations are solved with the same second-order method. The new solver has been installed in a recent version of NCC and the resulting code has been tested successfully in 2D on two laminar cases with known solutions and one turbulent case with experimental data.
Parallel Auxiliary Space AMG Solver for $H(div)$ Problems
Kolev, Tzanio V.; Vassilevski, Panayot S.
2012-12-18
We present a family of scalable preconditioners for matrices arising in the discretization of $H(div)$ problems using the lowest order Raviart--Thomas finite elements. Our approach belongs to the class of "auxiliary space"-based methods and requires only the finite element stiffness matrix plus some minimal additional discretization information about the topology and orientation of mesh entities. Also, we provide a detailed algebraic description of the theory, parallel implementation, and different variants of this parallel auxiliary space divergence solver (ADS) and discuss its relations to the Hiptmair--Xu (HX) auxiliary space decomposition of $H(div)$ [SIAM J. Numer. Anal., 45 (2007), pp. 2483--2509] and to the auxiliary space Maxwell solver AMS [J. Comput. Math., 27 (2009), pp. 604--623]. Finally, an extensive set of numerical experiments demonstrates the robustness and scalability of our implementation on large-scale $H(div)$ problems with large jumps in the material coefficients.
CASTRO: A NEW COMPRESSIBLE ASTROPHYSICAL SOLVER. II. GRAY RADIATION HYDRODYNAMICS
Zhang, W.; Almgren, A.; Bell, J.; Howell, L.; Burrows, A.
2011-10-01
We describe the development of a flux-limited gray radiation solver for the compressible astrophysics code, CASTRO. CASTRO uses an Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. The gray radiation solver is based on a mixed-frame formulation of radiation hydrodynamics. In our approach, the system is split into two parts, one part that couples the radiation and fluid in a hyperbolic subsystem, and another parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem is solved explicitly with a high-order Godunov scheme, whereas the parabolic part is solved implicitly with a first-order backward Euler method.
Brittle Solvers: Lessons and insights into effective solvers for visco-plasticity in geodynamics
NASA Astrophysics Data System (ADS)
Spiegelman, M. W.; May, D.; Wilson, C. R.
2014-12-01
Plasticity/Fracture and rock failure are essential ingredients in geodynamic models as terrestrial rocks do not possess an infinite yield strength. Numerous physical mechanisms have been proposed to limit the strength of rocks, including low temperature plasticity and brittle fracture. While ductile and creep behavior of rocks at depth is largely accepted, the constitutive relations associated with brittle failure, or shear localisation, are more controversial. Nevertheless, there are really only a few macroscopic constitutive laws for visco-plasticity that are regularly used in geodynamics models. Independent of derivation, all of these can be cast as simple effective viscosities which act as stress limiters with different choices for yield surfaces, the most common being a von Mises (constant yield stress) or Drucker-Prager (pressure dependent yield stress) criterion. The choice of plasticity model, however, can have significant consequences for the degree of non-linearity in a problem and the choice and efficiency of non-linear solvers. Here we describe a series of simplified 2-D and 3-D model problems to elucidate several issues associated with obtaining accurate description and solution of visco-plastic problems. We demonstrate that (1) Picard/Successive substitution schemes for solution of the non-linear problems can often stall at large values of the non-linear residual, thus producing spurious solutions. (2) Combined Picard/Newton schemes can be effective for a range of plasticity models; however, they can produce serious convergence problems for strongly pressure dependent plasticity models such as Drucker-Prager. (3) Nevertheless, full Drucker-Prager may not be the plasticity model of choice for strong materials as the dynamic pressures produced in these layers can develop pathological behavior with Drucker-Prager, leading to stress strengthening rather than stress weakening behavior. (4) In general, for any incompressible Stokes problem, it is highly advisable to
Parallel CFD Algorithms for Aerodynamical Flow Solvers on Unstructured Meshes. Parts 1 and 2
NASA Technical Reports Server (NTRS)
Barth, Timothy J.; Kwak, Dochan (Technical Monitor)
1995-01-01
The Advisory Group for Aerospace Research and Development (AGARD) has requested my participation in the lecture series entitled Parallel Computing in Computational Fluid Dynamics to be held at the von Karman Institute in Brussels, Belgium on May 15-19, 1995. In addition, a request has been made from the US Coordinator for AGARD at the Pentagon for NASA Ames to hold a repetition of the lecture series on October 16-20, 1995. I have been asked to be a local coordinator for the Ames event. All AGARD lecture series events have attendance limited to NATO allied countries. A brief of the lecture series is provided in the attached enclosure. Specifically, I have been asked to give two lectures of approximately 75 minutes each on the subject of parallel solution techniques for the fluid flow equations on unstructured meshes. The title of my lectures is "Parallel CFD Algorithms for Aerodynamical Flow Solvers on Unstructured Meshes" (Parts I-II). The contents of these lectures will be largely review in nature and will draw upon previously published work in this area. Topics of my lectures will include: (1) Mesh partitioning algorithms. Recursive techniques based on coordinate bisection, Cuthill-McKee level structures, and spectral bisection. (2) Newton's method for large scale CFD problems. Size and complexity estimates for Newton's method, modifications for ensuring global convergence. (3) Techniques for constructing the Jacobian matrix. Analytic and numerical techniques for Jacobian matrix-vector products, constructing the transposed matrix, extensions to optimization and homotopy theories. (4) Iterative solution algorithms. Practical experience with GMRES and BiCGSTAB matrix solvers. (5) Parallel matrix preconditioning. Incomplete Lower-Upper (ILU) factorization, domain-decomposed ILU, approximate Schur complement strategies.
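Topic (1) above mentions recursive coordinate bisection for mesh partitioning. As a rough illustration only, the following Python sketch partitions a set of 2-D points by repeatedly splitting along the coordinate axis of largest extent. The function name `coordinate_bisection`, the point-list representation, and the test grid are assumptions made for this example and are not taken from the lectures.

```python
def coordinate_bisection(points, depth):
    """Split a list of (x, y) points into up to 2**depth parts.

    Hypothetical helper: at each level the points are sorted along the
    axis with the largest extent and cut at the median index.
    """
    if depth == 0 or len(points) <= 1:
        return [points]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Pick the axis along which the point cloud is longest.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    left = coordinate_bisection(ordered[:mid], depth - 1)
    right = coordinate_bisection(ordered[mid:], depth - 1)
    return left + right


# Example: an 8 x 4 grid of vertices split into four equal parts.
grid = [(float(i), float(j)) for i in range(8) for j in range(4)]
parts = coordinate_bisection(grid, 2)
```

Each part could then be assigned to one processor. Unlike the spectral bisection also mentioned above, this scheme uses only coordinates and ignores mesh connectivity.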
Approximate kernel competitive learning.
Wu, Jian-Sheng; Zheng, Wei-Shi; Lai, Jian-Huang
2015-03-01
Kernel competitive learning has been successfully used to achieve robust clustering. However, kernel competitive learning (KCL) is not scalable for large scale data processing, because (1) it has to calculate and store the full kernel matrix that is too large to be calculated and kept in the memory and (2) it cannot be computed in parallel. In this paper we develop a framework of approximate kernel competitive learning for processing large scale dataset. The proposed framework consists of two parts. First, it derives an approximate kernel competitive learning (AKCL), which learns kernel competitive learning in a subspace via sampling. We provide solid theoretical analysis on why the proposed approximation modelling would work for kernel competitive learning, and furthermore, we show that the computational complexity of AKCL is largely reduced. Second, we propose a pseudo-parallelled approximate kernel competitive learning (PAKCL) based on a set-based kernel competitive learning strategy, which overcomes the obstacle of using parallel programming in kernel competitive learning and significantly accelerates the approximate kernel competitive learning for large scale clustering. The empirical evaluation on publicly available datasets shows that the proposed AKCL and PAKCL can perform comparably as KCL, with a large reduction on computational cost. Also, the proposed methods achieve more effective clustering performance in terms of clustering precision against related approximate clustering approaches. PMID:25528318
A Discontinuous Galerkin Chimera Overset Solver
NASA Astrophysics Data System (ADS)
Galbraith, Marshall Christopher
geometries. The large stencil associated with these high-order schemes can significantly complicate the inter-grid communication and hole cutting processes. Unlike these high-order schemes, the DG method always retains a small stencil regardless of the order of approximation. The small stencil of the DG method simplifies the inter-grid communication scheme as well as hole cutting procedures. The DG-Chimera scheme does not require a separate interpolation method because the DG scheme represents the solution as cell local polynomials. Hence, the DG-Chimera method does not require fringe points to maintain the interior stencil across inter-grid boundaries. Thus, inter-grid communication can be established as long as the receiving boundary is enclosed by or abuts the donor mesh. This makes the inter-grid communication procedure applicable to both Chimera and zonal meshes. The small stencil implies hole cutting can be performed without regard to maintaining a minimum stencil and thereby greatly simplifies hole cutting. Hence, the DG-Chimera scheme has the potential to greatly simplify the overset grid generation process. Furthermore, the DG-Chimera scheme is capable of using curved cells to represent geometric features. The curved cells resolve issues associated with linear Chimera viscous meshes used for finite volume and finite difference schemes. Finally, the convergence rate of the Chimera schemes is dramatically increased by linearization of the inter-grid communication.
Scaling Algebraic Multigrid Solvers: On the Road to Exascale
Baker, A H; Falgout, R D; Gamblin, T; Kolev, T; Schulz, M; Yang, U M
2010-12-12
Algebraic Multigrid (AMG) solvers are an essential component of many large-scale scientific simulation codes. Their continued numerical scalability and efficient implementation are critical for preparing these codes for exascale. Our experiences on modern multi-core machines show that significant challenges must be addressed for AMG to perform well on such machines. We discuss our experiences and describe the techniques we have used to overcome scalability challenges for AMG on hybrid architectures in preparation for exascale.
A chemical reaction network solver for the astrophysics code NIRVANA
NASA Astrophysics Data System (ADS)
Ziegler, U.
2016-02-01
Context: Chemistry often plays an important role in astrophysical gases. It regulates thermal properties by changing species abundances and via ionization processes. This way, time-dependent cooling mechanisms and other chemistry-related energy sources can have a profound influence on the dynamical evolution of an astrophysical system. Modeling those effects with the underlying chemical kinetics in realistic magneto-gasdynamical simulations provides the basis for a better link to observations. Aims: The present work describes the implementation of a chemical reaction network solver into the magneto-gasdynamical code NIRVANA. For this purpose a multispecies structure is installed, and a new module for evolving the rate equations of chemical kinetics is developed and coupled to the dynamical part of the code. A small chemical network for a hydrogen-helium plasma, including associated thermal processes, was constructed and is used in test problems. Methods: Evolving a chemical network within time-dependent simulations requires the additional solution of a set of coupled advection-reaction equations for species and gas temperature. Second-order Strang splitting is used to separate the advection part from the reaction part. The ordinary differential equation (ODE) system representing the reaction part is solved with a fourth-order generalized Runge-Kutta method applicable to the stiff systems inherent to astrochemistry. Results: A series of tests was performed in order to check the correctness of numerical and technical implementation. Tests include well-known stiff ODE problems from the mathematical literature in order to confirm accuracy properties of the solver used as well as problems combining gasdynamics and chemistry. Overall, very satisfactory results are achieved. Conclusions: The NIRVANA code is now ready to handle astrochemical processes in time-dependent simulations. An easy-to-use interface allows implementation of complex networks including thermal processes
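The second-order Strang splitting mentioned in the Methods part can be illustrated on a toy problem. The sketch below is not the NIRVANA implementation. It assumes a scalar ODE dy/dt = 1 - k*y, where the constant source term stands in for the advection part and the decay term for the reaction part, and it solves both sub-problems exactly instead of using the generalized Runge-Kutta method of the paper. All names (`strang_step`, `integrate`, `exact`) are hypothetical.

```python
import math


def strang_step(y, dt, k):
    """One Strang step: half step of part A, full step of part R, half step of A."""
    y = y + 0.5 * dt           # part A, dy/dt = 1, exact over dt/2
    y = y * math.exp(-k * dt)  # part R, dy/dt = -k*y, exact over dt
    y = y + 0.5 * dt           # part A again over dt/2
    return y


def integrate(y0, t_end, n_steps, k):
    """Advance y from 0 to t_end with n_steps Strang steps."""
    dt = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = strang_step(y, dt, k)
    return y


def exact(y0, t, k):
    """Closed-form solution of dy/dt = 1 - k*y."""
    return 1.0 / k + (y0 - 1.0 / k) * math.exp(-k * t)


k, y0, t_end = 1.0, 0.0, 1.0
err_coarse = abs(integrate(y0, t_end, 10, k) - exact(y0, t_end, k))
err_fine = abs(integrate(y0, t_end, 20, k) - exact(y0, t_end, k))
ratio = err_coarse / err_fine  # close to 4 for a second-order scheme
```

Halving the time step reduces the error by roughly a factor of four, which is the behaviour expected of a second-order splitting.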
An automatic ordering method for incomplete factorization iterative solvers
Forsyth, P.A.; Tang, W.P. (Dept. of Computer Science); D'Azevedo, E.F.D.
1991-01-01
The minimum discarded fill (MDF) ordering strategy for incomplete factorization iterative solvers is developed. MDF ordering is demonstrated for several model non-symmetric problems, as well as a waterflooding simulation which uses an unstructured grid. The model problems show a three- to five-fold decrease in the number of iterations compared to natural orderings. Greater than twofold improvement was observed for the waterflooding simulation. 26 refs., 7 figs., 3 tabs.
A contribution to the great Riemann solver debate
NASA Technical Reports Server (NTRS)
Quirk, James J.
1992-01-01
The aims of this paper are threefold: to increase the level of awareness within the shock capturing community to the fact that many Godunov-type methods contain subtle flaws that can cause spurious solutions to be computed; to identify one mechanism that might thwart attempts to produce very high resolution simulations; and to proffer a simple strategy for overcoming the specific failings of individual Riemann solvers.
Boltzmann Solver with Adaptive Mesh in Velocity Space
Kolobov, Vladimir I.; Arslanbekov, Robert R.; Frolova, Anna A.
2011-05-20
We describe the implementation of a direct Boltzmann solver with Adaptive Mesh in Velocity Space (AMVS) using a quad/octree data structure. The benefits of the AMVS technique are demonstrated for charged particle transport in weakly ionized plasmas where the collision integral is linear. We also describe the implementation of AMVS for the nonlinear Boltzmann collision integral. Test computations demonstrate both advantages and deficiencies of the current method for calculations of narrow-kernel distributions.
Direct linear programming solver in C for structural applications
NASA Astrophysics Data System (ADS)
Damkilde, L.; Hoyer, O.; Krenk, S.
1994-08-01
An optimization problem can be characterized by an object-function, which is maximized, and restrictions, which limit the variation of the variables. A subclass of optimization is Linear Programming (LP), where both the object-function and the restrictions are linear functions of the variables. The traditional solution methods for LP problems are based on the simplex method, and it is customary to allow only non-negative variables. Compared to other optimization routines the LP solvers are more robust, and the optimum is reached in a finite number of steps and is not sensitive to the starting point. For structural applications many optimization problems can be linearized and solved by LP routines. However, the structural variables are not always non-negative, and this requires a reformulation, where a variable $x$ is substituted by the difference of two non-negative variables, $x^{+}$ and $x^{-}$. The transformation causes a doubling of the number of variables, and in a computer implementation the memory allocation doubles and for a typical problem the execution time at least doubles. This paper describes an LP solver written in C, which can handle a combination of non-negative variables and unlimited variables. The LP solver also allows restart, and this may reduce the computational costs if the solution to a similar LP problem is known a priori. The algorithm is based on the simplex method, and differs only in the logical choices. Application of the new LP solver will at the same time give both a more direct problem formulation and a more efficient program.
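The substitution that the paper's solver avoids can be written out explicitly. The notation below ($c$, $A$, $b$ for a generic LP in inequality form) is assumed for illustration and does not come from the paper.

```latex
\begin{aligned}
&\text{maximize } c^{T}x \quad \text{subject to } Ax \le b, \quad x \in \mathbb{R}^{n} \text{ unrestricted in sign} \\
&\text{substitute } x = x^{+} - x^{-}, \qquad x^{+} \ge 0, \; x^{-} \ge 0 \\
&\text{maximize } c^{T}x^{+} - c^{T}x^{-} \quad \text{subject to } Ax^{+} - Ax^{-} \le b, \quad x^{+}, x^{-} \ge 0
\end{aligned}
```

The reformulated problem has $2n$ non-negative variables in place of the original $n$ free variables, which is the doubling of variables and memory described above.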
Transonic Drag Prediction Using an Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Levy, David W.
2001-01-01
This paper summarizes the results obtained with the NSU-3D unstructured multigrid solver for the AIAA Drag Prediction Workshop held in Anaheim, CA, June 2001. The test case for the workshop consists of a wing-body configuration at transonic flow conditions. Flow analyses for a complete test matrix of lift coefficient values and Mach numbers at a constant Reynolds number are performed, thus producing a set of drag polars and drag rise curves which are compared with experimental data. Results were obtained independently by both authors using an identical baseline grid and different refined grids. Most cases were run in parallel on commodity cluster-type machines while the largest cases were run on an SGI Origin machine using 128 processors. The objective of this paper is to study the accuracy of the subject unstructured grid solver for predicting drag in the transonic cruise regime, to assess the efficiency of the method in terms of convergence, cpu time, and memory, and to determine the effects of grid resolution on this predictive ability and its computational efficiency. A good predictive ability is demonstrated over a wide range of conditions, although accuracy was found to degrade for cases at higher Mach numbers and lift values where increasing amounts of flow separation occur. The ability to rapidly compute large numbers of cases at varying flow conditions using an unstructured solver on inexpensive clusters of commodity computers is also demonstrated.
A Survey of Solver-Related Geometry and Meshing Issues
NASA Technical Reports Server (NTRS)
Masters, James; Daniel, Derick; Gudenkauf, Jared; Hine, David; Sideroff, Chris
2016-01-01
There is a concern in the computational fluid dynamics community that mesh generation is a significant bottleneck in the CFD workflow. This is one of several papers that will help set the stage for a moderated panel discussion addressing this issue. Although certain general "rules of thumb" and a priori mesh metrics can be used to ensure that some base level of mesh quality is achieved, inadequate consideration is often given to the type of solver or particular flow regime on which the mesh will be utilized. This paper explores how an analyst may want to think differently about a mesh based on considerations such as if a flow is compressible vs. incompressible or hypersonic vs. subsonic or if the solver is node-centered vs. cell-centered. This paper is a high-level investigation intended to provide general insight into how considering the nature of the solver or flow when performing mesh generation has the potential to increase the accuracy and/or robustness of the solution and drive the mesh generation process to a state where it is no longer a hindrance to the analysis process.
QED multi-dimensional vacuum polarization finite-difference solver
NASA Astrophysics Data System (ADS)
Carneiro, Pedro; Grismayer, Thomas; Silva, Luís; Fonseca, Ricardo
2015-11-01
The Extreme Light Infrastructure (ELI) is expected to deliver peak intensities of $10^{23}$ - $10^{24}$ W/cm$^{2}$, allowing nonlinear Quantum Electrodynamics (QED) phenomena to be probed in an unprecedented regime. Within the framework of QED, the second order process of photon-photon scattering leads to a set of extended Maxwell's equations [W. Heisenberg and H. Euler, Z. Physik 98, 714] effectively creating nonlinear polarization and magnetization terms that account for the nonlinear response of the vacuum. To model this in a self-consistent way, we present a multi-dimensional generalized Maxwell equation finite difference solver with significantly enhanced dispersive properties, which was implemented in the OSIRIS particle-in-cell code [R.A. Fonseca et al. LNCS 2331, pp. 342-351, 2002]. We present a detailed numerical analysis of this electromagnetic solver. As an illustration of the properties of the solver, we explore several examples in extreme conditions. We confirm the theoretical prediction of vacuum birefringence of a pulse propagating in the presence of an intense static background field [arXiv:1301.4918 [quant-ph
Fisher, A. C.; Bailey, D. S.; Kaiser, T. B.; Eder, D. C.; Gunney, B. T. N.; Masters, N. D.; Koniges, A. E.; Anderson, R. W.
2015-02-01
Here, we present a novel method for the solution of the diffusion equation on a composite AMR mesh. This approach is suitable for including diffusion based physics modules to hydrocodes that support ALE and AMR capabilities. To illustrate, we proffer our implementations of diffusion based radiation transport and heat conduction in a hydrocode called ALE-AMR. Numerical experiments conducted with the diffusion solver and associated physics packages yield 2nd order convergence in the L_{2} norm.
A High-Order Accurate Parallel Solver for Maxwell's Equations on Overlapping Grids
Henshaw, W D
2005-09-23
A scheme for the solution of the time dependent Maxwell's equations on composite overlapping grids is described. The method uses high-order accurate approximations in space and time for Maxwell's equations written as a second-order vector wave equation. High-order accurate symmetric difference approximations to the generalized Laplace operator are constructed for curvilinear component grids. The modified equation approach is used to develop high-order accurate approximations that only use three time levels and have the same time-stepping restriction as the second-order scheme. Discrete boundary conditions for perfect electrical conductors and for material interfaces are developed and analyzed. The implementation is optimized for component grids that are Cartesian, resulting in a fast and efficient method. The solver runs on parallel machines with each component grid distributed across one or more processors. Numerical results in two- and three-dimensions are presented for the fourth-order accurate version of the method. These results demonstrate the accuracy and efficiency of the approach.
ERIC Educational Resources Information Center
Wolff, Hans
This paper deals with a stochastic process for the approximation of the root of a regression equation. This process was first suggested by Robbins and Monro. The main result here is a necessary and sufficient condition on the iteration coefficients for convergence of the process (convergence with probability one and convergence in the quadratic…
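For readers unfamiliar with the Robbins-Monro process, the following Python sketch shows the basic iteration x_{n+1} = x_n - a_n * Y_n, where Y_n is a noisy observation of the regression function at x_n. It assumes the classical step sizes a_n = c/n, a common textbook choice. The necessary and sufficient condition derived in the paper is not given in the abstract and is not reproduced here. The regression function M(x) = x - 2 and all names are made up for this example.

```python
import random


def robbins_monro(noisy_m, x0, n_steps=20_000, c=1.0, seed=0):
    """Approximate the root of M(x) = E[noisy_m(x)] by stochastic approximation."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        a_n = c / n                 # assumed step-size sequence a_n = c/n
        x -= a_n * noisy_m(x, rng)  # move against the noisy observation
    return x


def noisy_m(x, rng):
    """Hypothetical regression function M(x) = x - 2 observed with unit Gaussian noise."""
    return (x - 2.0) + rng.gauss(0.0, 1.0)


root = robbins_monro(noisy_m, x0=0.0)  # should end up close to 2
```

With c = 1 and this linear M, the iterate reduces to a running average of the noisy observations, so its deviation from the root shrinks like 1/sqrt(n).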
NASA Astrophysics Data System (ADS)
Huang, Siendong
2009-11-01
The nonlocality of quantum states on a bipartite system $\mathcal{A+B}$ is tested by comparing probabilistic outcomes of two local observables of different subsystems. For a fixed observable $A$ of the subsystem $\mathcal{A}$, its optimal approximate double $A'$ of the other system $\mathcal{B}$ is defined such that the probabilistic outcomes of $A'$ are almost similar to those of the fixed observable $A$. The case of σ-finite standard von Neumann algebras is considered and the optimal approximate double $A'$ of an observable $A$ is explicitly determined. The connection between optimal approximate doubles and quantum correlations is explained. Inspired by quantum states with perfect correlation, like Einstein-Podolsky-Rosen states and Bohm states, the nonlocality power of an observable $A$ for general quantum states is defined as the similarity that the outcomes of $A$ look like the properties of the subsystem $\mathcal{B}$ corresponding to $A'$. As an application of optimal approximate doubles, maximal Bell correlation of a pure entangled state on $\mathcal{B}(\mathbb{C}^{2})\otimes\mathcal{B}(\mathbb{C}^{2})$ is found explicitly.
Approximating Integrals Using Probability
ERIC Educational Resources Information Center
Maruszewski, Richard F., Jr.; Caudle, Kyle A.
2005-01-01
As part of a discussion on Monte Carlo methods, which outlines how to use probability expectations to approximate the value of a definite integral. The purpose of this paper is to elaborate on this technique and then to show several examples using Visual Basic as a programming tool. It is an interesting method because it combines two branches of…
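The technique described above can be sketched in a few lines. The article's own examples use Visual Basic. The version below is a Python illustration written for this listing, based on the identity that the integral of f over [a, b] equals (b - a) times the expected value of f(U) for U uniform on [a, b]. The function name `mc_integrate` and the test integrand are assumptions.

```python
import random


def mc_integrate(f, a, b, n=100_000, seed=0):
    """Estimate the integral of f over [a, b] as (b - a) * mean of f at uniform samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += f(rng.uniform(a, b))
    return (b - a) * total / n


# Integral of x^2 over [0, 1]; the exact value is 1/3.
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
```

The statistical error of the estimate decreases like 1/sqrt(n), so each extra digit of accuracy costs about a hundred times more samples.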
NASA Astrophysics Data System (ADS)
Müller, Lucas O.; Blanco, Pablo J.
2015-11-01
We present a methodology for the high order approximation of hyperbolic conservation laws in networks by using the Dumbser-Enaux-Toro solver and exact solvers for the classical Riemann problem at junctions. The proposed strategy can be applied to any hyperbolic system, conservative or non-conservative, and possibly with flux functions containing discontinuous parameters, as long as an exact or approximate Riemann problem solver is available. The methodology is implemented for a one-dimensional blood flow model that considers discontinuous variations of mechanical and geometrical properties of vessels. The achievement of formal order of accuracy, as well as the robustness of the resulting numerical scheme, is verified through the simulation of both academic tests and physiological flows.
A Fast and Robust Poisson-Boltzmann Solver Based on Adaptive Cartesian Grids.
Boschitsch, Alexander H; Fenley, Marcia O
2011-05-10
An adaptive Cartesian grid (ACG) concept is presented for the fast and robust numerical solution of the 3D Poisson-Boltzmann Equation (PBE) governing the electrostatic interactions of large-scale biomolecules and highly charged multi-biomolecular assemblies such as ribosomes and viruses. The ACG offers numerous advantages over competing grid topologies such as regular 3D lattices and unstructured grids. For very large biological molecules and multi-biomolecule assemblies, the total number of grid-points is several orders of magnitude less than that required in a conventional lattice grid used in the current PBE solvers thus allowing the end user to obtain accurate and stable nonlinear PBE solutions on a desktop computer. Compared to tetrahedral-based unstructured grids, ACG offers a simpler hierarchical grid structure, which is naturally suited to multigrid, relieves indirect addressing requirements and uses fewer neighboring nodes in the finite difference stencils. Construction of the ACG and determination of the dielectric/ionic maps are straightforward, fast and require minimal user intervention. Charge singularities are eliminated by reformulating the problem to produce the reaction field potential in the molecular interior and the total electrostatic potential in the exterior ionic solvent region. This approach minimizes grid-dependency and alleviates the need for fine grid spacing near atomic charge sites. The technical portion of this paper contains three parts. First, the ACG and its construction for general biomolecular geometries are described. Next, a discrete approximation to the PBE upon this mesh is derived. Finally, the overall solution procedure and multigrid implementation are summarized. Results obtained with the ACG-based PBE solver are presented for: (i) a low dielectric spherical cavity, containing interior point charges, embedded in a high dielectric ionic solvent - analytical solutions are available for this case, thus allowing rigorous
User documentation for KINSOL, a nonlinear solver for sequential and parallel computers
Taylor, A. G., LLNL
1998-07-01
KINSOL is a general purpose nonlinear system solver callable from either C or Fortran programs. It is based on NKSOL [3], but is written in ANSI-standard C rather than Fortran77. Its most notable feature is that it uses Krylov Inexact Newton techniques in the system's approximate solution, thus sharing significant modules previously written within CASC at LLNL to support CVODE [6, 7]/PVODE [9, 5]. It also requires almost no matrix storage for solving the Newton equations as compared to direct methods. The name KINSOL is derived from those techniques: Krylov Inexact Newton SOLver. The package was arranged so that selecting one of two forms of a single module in the compilation process will allow the entire package to be created in either sequential (serial) or parallel form. The parallel version of KINSOL uses MPI (Message-Passing Interface) [8] and an appropriately revised version of the vector module NVECTOR, as mentioned above, to achieve parallelism and portability. KINSOL in parallel form is intended for the SPMD (Single Program Multiple Data) model with distributed memory, in which all vectors are identically distributed across processors. In particular, the vector module NVECTOR is designed to help the user assign a contiguous segment of a given vector to each of the processors for parallel computation. Several primitives were added to NVECTOR as originally written for PVODE to implement KINSOL. KINSOL has been run on a Cray-T3D, an eight-processor DEC ALPHA and a cluster of workstations. It is currently being used in a simulation of tokamak edge plasmas and in groundwater two-phase flow studies at LLNL. The remainder of this paper is organized as follows. Section 2 sets the mathematical notation and summarizes the basic methods. Section 3 summarizes the organization of the KINSOL solver, while Section 4 summarizes its usage. Section 5 describes a preconditioner module, Section 6 describes a set of Fortran/C interfaces, Section 7 describes an example problem, and Section 8
Robust parallel iterative solvers for linear and least-squares problems, Final Technical Report
Saad, Yousef
2014-01-16
The primary goal of this project is to study and develop robust iterative methods for solving linear systems of equations and least squares systems. The focus of the Minnesota team is on algorithm development, robustness issues, and on tests and validation of the methods on realistic problems. 1. The project began with an investigation of how to practically update a preconditioner obtained from an ILU-type factorization when the coefficient matrix changes. 2. We investigated strategies to improve robustness in parallel preconditioners in a specific case of a PDE with discontinuous coefficients. 3. We explored ways to adapt standard preconditioners for solving linear systems arising from the Helmholtz equation. These are often difficult linear systems to solve by iterative methods. 4. We have also worked on purely theoretical issues related to the analysis of Krylov subspace methods for linear systems. 5. We developed an effective strategy for performing ILU factorizations for the case when the matrix is highly indefinite. The strategy uses shifting in some optimal way. The method was extended to the solution of Helmholtz equations by using complex shifts, yielding very good results in many cases. 6. We addressed the difficult problem of preconditioning sparse systems of equations on GPUs. 7. A by-product of the above work is a software package consisting of an iterative solver library for GPUs based on CUDA. This was made publicly available. It was the first such library that offers complete iterative solvers for GPUs. 8. We considered another form of ILU which blends coarsening techniques from Multigrid with algebraic multilevel methods. 9. We have released a new version of our parallel solver, called pARMS [new version is version 3]. As part of this we have tested the code in complex settings, including the solution of Maxwell and Helmholtz equations and a problem of crystal growth. 10. As an application of polynomial preconditioning we considered the
Optimizing the Zeldovich approximation
NASA Technical Reports Server (NTRS)
Melott, Adrian L.; Pellman, Todd F.; Shandarin, Sergei F.
1994-01-01
We have recently learned that the Zeldovich approximation can be successfully used for a far wider range of gravitational instability scenarios than formerly proposed; we study here how to extend this range. In previous work (Coles, Melott and Shandarin 1993, hereafter CMS) we studied the accuracy of several analytic approximations to gravitational clustering in the mildly nonlinear regime. We found that what we called the 'truncated Zeldovich approximation' (TZA) was better than any other (except in one case the ordinary Zeldovich approximation) over a wide range from linear to mildly nonlinear ($\sigma \approx 3$) regimes. TZA was specified by setting Fourier amplitudes equal to zero for all wavenumbers greater than $k_{nl}$, where $k_{nl}$ marks the transition to the nonlinear regime. Here, we study the cross correlation of generalized TZA with a group of n-body simulations for three shapes of window function: sharp k-truncation (as in CMS), a tophat in coordinate space, or a Gaussian. We also study the variation in the cross correlation as a function of initial truncation scale within each type. We find that k-truncation, which was so much better than other things tried in CMS, is the worst of these three window shapes. We find that a Gaussian window $e^{-k^{2}/2k_{G}^{2}}$ applied to the initial Fourier amplitudes is the best choice. It produces a greatly improved cross correlation in those cases which most needed improvement, e.g. those with more small-scale power in the initial conditions. The optimum choice of $k_{G}$ for the Gaussian window is (a somewhat spectrum-dependent) 1 to 1.5 times $k_{nl}$. Although all three windows produce similar power spectra and density distribution functions after application of the Zeldovich approximation, the agreement of the phases of the Fourier components with the n-body simulation is better for the Gaussian window. We therefore ascribe the success of the best-choice Gaussian window to its superior treatment
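To make the two k-space windows in this abstract concrete, the Python sketch below applies a sharp k-truncation and a Gaussian window exp(-k^2 / (2 k_G^2)) to a short list of initial Fourier amplitudes. The sample wavenumbers, the unit amplitudes, and the choice k_G = 1.25 k_nl (inside the 1 to 1.5 range quoted above) are assumptions for illustration. The tophat in coordinate space is omitted, and none of this is code from the paper.

```python
import math


def gaussian_window(k, k_g):
    """Gaussian window exp(-k^2 / (2 k_G^2)) on a Fourier amplitude."""
    return math.exp(-k * k / (2.0 * k_g * k_g))


def sharp_k_window(k, k_nl):
    """Sharp truncation: keep modes with k <= k_nl, zero all others."""
    return 1.0 if k <= k_nl else 0.0


def smooth_amplitudes(amplitudes, wavenumbers, window):
    """Multiply each initial Fourier amplitude by the window at its wavenumber."""
    return [a * window(k) for a, k in zip(amplitudes, wavenumbers)]


k_nl = 4.0          # hypothetical nonlinear wavenumber
k_g = 1.25 * k_nl   # assumed value within the quoted 1 to 1.5 range
wavenumbers = [0.0, 2.0, 4.0, 5.0, 8.0]
amplitudes = [1.0] * len(wavenumbers)

gauss = smooth_amplitudes(amplitudes, wavenumbers, lambda k: gaussian_window(k, k_g))
sharp = smooth_amplitudes(amplitudes, wavenumbers, lambda k: sharp_k_window(k, k_nl))
```

The sharp window removes every mode above k_nl outright, while the Gaussian window damps modes smoothly and keeps some power beyond k_nl.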
FIESTA 2: Parallelizeable multiloop numerical calculations
NASA Astrophysics Data System (ADS)
Smirnov, A. V.; Smirnov, V. A.; Tentyukov, M.
2011-03-01
The program FIESTA has been completely rewritten. Now it can be used not only as a tool to evaluate Feynman integrals numerically, but also to expand Feynman integrals automatically in limits of momenta and masses with the use of sector decompositions and Mellin-Barnes representations. Other important improvements to the code are complete parallelization (even to multiple computers), high-precision arithmetic (allowing the calculation of integrals that were previously intractable), new integrators, Speer sectors as a strategy, and the possibility to evaluate more general parametric integrals. Program summary. Program title: FIESTA 2. Catalogue identifier: AECP_v2_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AECP_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GNU GPL version 2. No. of lines in distributed program, including test data, etc.: 39 783. No. of bytes in distributed program, including test data, etc.: 6 154 515. Distribution format: tar.gz. Programming language: Wolfram Mathematica 6.0 (or higher) and C. Computer: From a desktop PC to a supercomputer. Operating system: Unix, Linux, Windows, Mac OS X. Has the code been vectorised or parallelized?: Yes, the code has been parallelized for use on multi-kernel computers as well as clusters via Mathlink over the TCP/IP protocol. The program can work successfully with a single processor; however, it is ready to work in a parallel environment, and the use of multi-kernel processors and multi-processor computers significantly speeds up the calculation; on clusters the calculation speed can be improved even further. RAM: Depends on the complexity of the problem. Classification: 4.4, 4.12, 5, 6.5. Catalogue identifier of previous version: AECP_v1_0. Journal reference of previous version: Comput. Phys. Comm. 180 (2009) 735. External routines: QLink [1], Cuba library [2], MPFR [3]. Does the new version supersede the previous version?: Yes. Nature of problem: The sector decomposition approach to evaluating Feynman integrals splits into the sector decomposition itself, where one has to minimize the number of sectors; the pole resolution and epsilon expansion; and the numerical integration of the resulting expression. Solution method: The sector decomposition is based on a new strategy as well as on classical strategies such as Speer sectors. The sector decomposition, pole resolution and epsilon-expansion are performed in Wolfram Mathematica 6.0 or, preferably, 7.0 (enabling parallelization) [4]. The data is stored on hard disk via a special program, QLink [1]. The expression for integration is passed to the C-part of the code, which parses the string and performs the integration by one of the algorithms in the Cuba library package [2]. This part of the evaluation is perfectly parallelized on multi-kernel computers.
NASA Astrophysics Data System (ADS)
Guo, Xiaocheng
2015-06-01
By revisiting the derivation of the previously developed HLLC Riemann solver for magneto-hydrodynamics (MHD), the paper presents an extended HLLC Riemann solver specifically designed for the MHD system in which the magnetic field can be decomposed into a strong internal magnetic field and an external component. The derived HLLC Riemann solver satisfies the conservation laws. The numerical tests show that the extended solver deals with the global MHD simulation of the Earth's magnetosphere well, and maintains high numerical resolution. It recovers the previously developed HLLC Riemann solver for the MHD as long as the internal field is set to zero. Thus, it is backward compatible with the previous HLLC solver, and suitable for the MHD simulations no matter whether a strong internal magnetic field is included or not.
Application of Aeroelastic Solvers Based on Navier Stokes Equations
NASA Technical Reports Server (NTRS)
Keith, Theo G., Jr.; Srivastava, Rakesh
2001-01-01
The propulsion element of the NASA Advanced Subsonic Technology (AST) initiative is directed towards increasing the overall efficiency of current aircraft engines. This effort requires an increase in the efficiency of various components, such as fans, compressors, turbines, etc. Improvement in engine efficiency can be accomplished through the use of lighter materials, larger diameter fans and/or higher-pressure ratio compressors. However, each of these has the potential to result in aeroelastic problems such as flutter or forced response. To address the aeroelastic problems, the Structural Dynamics Branch of NASA Glenn has been involved in the development of numerical capabilities for analyzing the aeroelastic stability characteristics and forced response of wide chord fans, multi-stage compressors and turbines. In order to design an engine to safely perform a set of desired tasks, accurate information on the stresses on the blade during the entire cycle of blade motion is required. This requirement in turn demands that accurate knowledge of steady and unsteady blade loading is available. To obtain the steady and unsteady aerodynamic forces for the complex flows around the engine components, for the flow regimes encountered by the rotor, an advanced compressible Navier-Stokes solver is required. A finite volume based Navier-Stokes solver has been developed at Mississippi State University (MSU) for solving the flow field around multistage rotors. The focus of the current research effort, under NASA Cooperative Agreement NCC3-596, was on developing an aeroelastic analysis code (entitled TURBO-AE) based on the Navier-Stokes solver developed by MSU. The TURBO-AE code has been developed for flutter analysis of turbomachine components and delivered to NASA and its industry partners. The code has been verified, validated, and is being applied by NASA Glenn and by aircraft engine manufacturers to analyze the aeroelastic stability characteristics of modern fans, compressors
A New Robust Solver for Saturated-Unsaturated Richards' Equation
NASA Astrophysics Data System (ADS)
Barajas-Solano, D. A.; Tartakovsky, D. M.
2012-12-01
We present a novel approach for the numerical integration of the saturated-unsaturated Richards' equation, a degenerate parabolic partial differential equation that models flow in porous media. The method is based on the mixed (pore pressure-water content) form of RE, written as a set of differential algebraic equations (DAEs) of index-1 for the fully saturated case and index-2 for the partially saturated case. A DAE-based approach allows us to overcome the numerical challenges posed by the degenerate nature of the Richards' equation. The resulting set of DAEs is solved using the stiffly-accurate, single-step, 3-stage implicit Runge-Kutta method Radau IIA, chosen for its favorable accuracy and stability properties, and its ease of implementation. For each time step a nonlinear system of equations on the intermediate Runge-Kutta states of the pore pressure is solved, written so to ensure that the next step pore pressure and water content correspond to one another correctly. The implementation of our approach compares favorably to state-of-the-art DAE-based solvers in both one- and two-dimensional simulations. These solvers use multi-step backward difference formulas together with a pressure-based form of Richards' equation. To the best of our knowledge, our method is the first instance of a successful DAE-based solver that uses the mixed form of Richards' equation. We consider this a promising line of research, with future work to be done on the use of globally convergent methods for the solution of the occurring nonlinear systems of equations.
A computationally efficient Multicomponent Equilibrium Solver for Aerosols (MESA)
NASA Astrophysics Data System (ADS)
Zaveri, Rahul A.; Easter, Richard C.; Peters, Leonard K.
2005-12-01
Development and application of a new Multicomponent Equilibrium Solver for Aerosols (MESA) is described for systems containing H+, NH4+, Na+, Ca2+, SO42-, HSO4-, NO3-, and Cl- ions. The equilibrium solution is obtained by integrating a set of pseudo-transient ordinary differential equations describing the precipitation and dissolution reactions for all the possible salts to steady state. A comprehensive temperature dependent mutual deliquescence relative humidity (MDRH) parameterization is developed for all the possible salt mixtures, thereby eliminating the need for a rigorous numerical solution when ambient RH is less than MDRH(T). The solver is unconditionally stable, mass conserving, and shows robust convergence. Performance of MESA was evaluated against the Web-based AIM Model III, which served as a benchmark for accuracy, and the EQUISOLV II solver for speed. Important differences in the convergence and thermodynamic errors in MESA and EQUISOLV II are discussed. The average ratios of speeds of MESA over EQUISOLV II ranged between 1.4 and 5.8, with minimum and maximum ratios of 0.6 and 17, respectively. Because MESA directly diagnoses MDRH, it is significantly more efficient when RH < MDRH. MESA's superior performance is partially due to its "hard-wired" code for the present system as opposed to EQUISOLV II, which has a more generalized structure for solving any number and type of reactions at temperatures down to 190 K. These considerations suggest that MESA is highly attractive for use in 3-D aerosol/air-quality models for lower tropospheric applications (T > 240 K) in which both accuracy and computational efficiency are critical.
Reformulation of the Fourier-Bessel steady state mode solver
NASA Astrophysics Data System (ADS)
Gauthier, Robert C.
2016-09-01
The Fourier-Bessel resonator state mode solver is reformulated using Maxwell's field coupled curl equations. The matrix generating expressions are greatly simplified as well as a reduction in the number of pre-computed tables making the technique simpler to implement on a desktop computer. The reformulation maintains the theoretical equivalence of the permittivity and permeability and as such structures containing both electric and magnetic properties can be examined. Computation examples are presented for a surface nanoscale axial photonic resonator and hybrid { ε , μ } quasi-crystal resonator.
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Y.
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
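The reduction idea behind BCR can be sketched on the scalar analogue of the problem. The Python sketch below solves a scalar tridiagonal system by recursive cyclic reduction; it is not the block algorithm or the parallel mapping discussed in the abstract. It assumes the system size is of the form 2^m - 1, that a[0] and c[-1] are zero, and that no pivot vanishes (for example, a diagonally dominant matrix). The function name is chosen for this example.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for i = 0..n-1.

    Assumes n = 2**m - 1, a[0] == 0 and c[n-1] == 0, and nonzero pivots.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate the even-indexed unknowns from each odd-indexed
    # equation. Every iteration is independent of the others, which is what
    # makes the method attractive on parallel and vector machines.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1]
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - gamma * a[i + 1])
        rc.append(-gamma * c[i + 1])
        rd.append(d[i] - alpha * d[i - 1] - gamma * d[i + 1])

    # The reduced system is again tridiagonal, with (n - 1) / 2 unknowns.
    reduced = cyclic_reduction(ra, rb, rc, rd)

    # Back substitution: recover the even-indexed unknowns.
    x = [0.0] * n
    for j, i in enumerate(range(1, n, 2)):
        x[i] = reduced[j]
    for i in range(0, n, 2):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        x[i] = (d[i] - a[i] * left - c[i] * right) / b[i]
    return x
```

Each level halves the number of unknowns, so a system of size 2^m - 1 needs m - 1 reduction levels followed by the same number of back-substitution levels.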
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Youcef
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Algorithms for parallel flow solvers on message passing architectures
NASA Astrophysics Data System (ADS)
Vanderwijngaart, Rob F.
1995-01-01
The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique that has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for the concentration on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these it can be determined what is the optimal first-processor retardation that leads to the shortest total completion time for the pipeline process. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of uncareful grid partitioning in flow solvers that employ pipeline algorithms. If grid blocks at boundaries are not at least as large in the wall-normal direction as those
Advances in the hydrodynamics solver of CO5BOLD
NASA Astrophysics Data System (ADS)
Freytag, Bernd
Many features of the Roe solver used in the hydrodynamics module of CO5BOLD have recently been added or overhauled, including the reconstruction methods (by adding the new second-order "Frankenstein's method"), the treatment of transversal velocities, energy-flux averaging and entropy-wave treatment at small Mach numbers, the CTU scheme to combine the one-dimensional fluxes, and additional safety measures. All this results in a significantly better behavior at low Mach number flows, and an improved stability at larger Mach numbers requiring less (or no) additional tensor viscosity, which then leads to a noticeable increase in effective resolution.
FDIPS: Finite Difference Iterative Potential-field Solver
NASA Astrophysics Data System (ADS)
Toth, Gabor; van der Holst, Bartholomeus; Huang, Zhenguang
2016-06-01
FDIPS is a finite difference iterative potential-field solver that can generate the 3D potential magnetic field solution based on a magnetogram. It is offered as an alternative to the spherical harmonics approach, as when the number of spherical harmonics is increased, using the raw magnetogram data given on a grid that is uniform in the sine of the latitude coordinate can result in inaccurate and unreliable results, especially in the polar regions close to the Sun. FDIPS is written in Fortran 90 and uses the MPI library for parallel execution.
Object-Oriented Design for Sparse Direct Solvers
NASA Technical Reports Server (NTRS)
Dobrian, Florin; Kumfert, Gary; Pothen, Alex
1999-01-01
We discuss the object-oriented design of a software package for solving sparse, symmetric systems of equations (positive definite and indefinite) by direct methods. At the highest layers, we decouple data structure classes from algorithmic classes for flexibility. We describe the important structural and algorithmic classes in our design, and discuss the trade-offs we made for high performance. The kernels at the lower layers were optimized by hand. Our results show no performance loss from our object-oriented design, while providing flexibility, ease of use, and extensibility over solvers using procedural design.
Performance issues for iterative solvers in device simulation
NASA Technical Reports Server (NTRS)
Fan, Qing; Forsyth, P. A.; Mcmacken, J. R. F.; Tang, Wei-Pai
1994-01-01
Due to memory limitations, iterative methods have become the method of choice for large scale semiconductor device simulation. However, it is well known that these methods still suffer from reliability problems. The linear systems which appear in numerical simulation of semiconductor devices are notoriously ill-conditioned. In order to produce robust algorithms for practical problems, careful attention must be given to many implementation issues. This paper concentrates on strategies for developing robust preconditioners. In addition, effective data structures and convergence check issues are also discussed. These algorithms are compared with a standard direct sparse matrix solver on a variety of problems.
Preconditioned CG-solvers and finite element grids
Bauer, R.; Selberherr, S.
1994-12-31
To extract parasitic capacitances in wiring structures of integrated circuits the authors developed the two- and three-dimensional finite element program SCAP (Smart Capacitance Analysis Program). The program computes the task of the electrostatic field from a solution of Poisson's equation via finite elements and calculates the energies from which the capacitance matrix is extracted. The unknown potential vector, which has 5000-50000 unknowns for three-dimensional applications, is computed by an ICCG solver. Currently three- and six-node triangular, and four- and ten-node tetrahedral elements are supported.
Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method
NASA Astrophysics Data System (ADS)
Kruglyakov, M.; Geraskin, A.; Kuvshinov, A.
2016-11-01
We present a novel, open source 3-D MT forward solver based on a method of integral equations (IE) with contracting kernel. Special attention in the solver is paid to accurate calculations of Green's functions and their integrals which are cornerstones of any IE solution. The solver supports massive parallelization and is able to deal with highly detailed and contrasting models. We report results of a 3-D numerical experiment aimed at analyzing the accuracy and scalability of the code.
NASA Astrophysics Data System (ADS)
Vides, Jeaniffer; Nkonga, Boniface; Audit, Edouard
2015-01-01
We derive a simple method to numerically approximate the solution of the two-dimensional Riemann problem for gas dynamics, using the literal extension of the well-known HLL formalism as its basis. Essentially, any strategy attempting to extend the three-state HLL Riemann solver to multiple space dimensions will by some means involve a piecewise constant approximation of the complex two-dimensional interaction of waves, and our numerical scheme is not the exception. In order to determine closed form expressions for the involved fluxes, we rely on the equivalence between the consistency condition and the use of Rankine-Hugoniot conditions that hold across the outermost waves. The proposed scheme is carefully designed to simplify its eventual numerical implementation and its advantages are analytically attested. In addition, we show that the proposed solver can be applied to obtain the edge-centered electric fields needed in the constrained transport technique for the ideal magnetohydrodynamic (MHD) equations. We present several numerical results for hydrodynamics and magnetohydrodynamics that display the scheme's accuracy and its ability to be applied to various systems of conservation laws.
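For orientation, the one-dimensional HLL flux that the abstract takes as its starting point can be written in a few lines. The Python sketch below is the standard 1D formula only, not the two-dimensional extension derived in the paper; the function name and the linear advection example are assumptions made for illustration, and the wave speed estimates are supplied by the caller.

```python
def hll_flux(u_left, u_right, f_left, f_right, s_left, s_right):
    """One-dimensional HLL numerical flux for a system of conservation laws.

    u_left, u_right: conserved states on either side of the interface.
    f_left, f_right: physical fluxes evaluated at those states.
    s_left, s_right: estimates of the slowest and fastest wave speeds.
    """
    if s_left >= 0.0:
        # All waves move to the right: the flux is purely upwind from the left.
        return list(f_left)
    if s_right <= 0.0:
        # All waves move to the left.
        return list(f_right)
    # Subsonic case: flux of the single constant intermediate state, obtained
    # from the Rankine-Hugoniot conditions across the two outermost waves.
    inv = 1.0 / (s_right - s_left)
    return [
        (s_right * fl - s_left * fr + s_left * s_right * (ur - ul)) * inv
        for ul, ur, fl, fr in zip(u_left, u_right, f_left, f_right)
    ]

# Hypothetical check on linear advection u_t + 2 u_x = 0 with symmetric
# wave speed bounds -2 and +2: the result equals the upwind flux 2 * u_left.
flux = hll_flux([1.0], [3.0], [2.0], [6.0], -2.0, 2.0)
```

The left state, the right state and the intermediate state are the three states referred to in the abstract.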
Multiply scaled constrained nonlinear equation solvers. [for nonlinear heat conduction problems]
NASA Technical Reports Server (NTRS)
Padovan, Joe; Krishna, Lala
1986-01-01
To improve the numerical stability of nonlinear equation solvers, a partitioned multiply scaled constraint scheme is developed. This scheme enables hierarchical levels of control for nonlinear equation solvers. To complement the procedure, partitioned convergence checks are established along with self-adaptive partitioning schemes. Overall, such procedures greatly enhance the numerical stability of the original solvers. To demonstrate and motivate the development of the scheme, the problem of nonlinear heat conduction is considered. In this context the main emphasis is given to successive substitution-type schemes. To verify the improved numerical characteristics associated with partitioned multiply scaled solvers, results are presented for several benchmark examples.
A GPU-accelerated flow solver for incompressible two-phase fluid flows
NASA Astrophysics Data System (ADS)
Codyer, Stephen; Raessi, Mehdi; Khanna, Gaurav
2011-11-01
We present a numerical solver for incompressible, immiscible, two-phase fluid flows that is accelerated by using Graphics Processing Units (GPUs). The Navier-Stokes equations are solved by the projection method, which involves solving a pressure Poisson problem at each time step. A second-order discretization of the Poisson problem leads to a sparse matrix with five and seven diagonals for two- and three-dimensional simulations, respectively. Running a serial linear algebra solver on a single CPU can take 50-99.9% of the total simulation time to solve the above system for pressure. To remove this bottleneck, we utilized the large parallelization capabilities of GPUs; we developed a linear algebra solver based on the conjugate gradient iterative method (CGIM) by using CUDA 4.0 libraries and compared its performance with CUSP, an open-source GPU library for linear algebra. Compared to running the CGIM solver on a single CPU core, for a 2D case, our GPU solver yields speedups of up to 88x in solver time and 81x in overall time on a single GPU card. In 3D cases, the speedups are up to 81x (solver) and 15x (overall). Speedups are larger at higher grid resolutions, and our GPU solver outperforms CUSP. Current work examines the acceleration versus a parallel CGIM CPU solver.
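To make the pressure solve concrete, here is a serial, matrix-free conjugate gradient sketch in Python for the five-diagonal matrix of a 2D Poisson problem. It assumes a square n by n interior grid with homogeneous Dirichlet boundaries and unit grid spacing; it is a CPU illustration of the method named in the abstract, not the authors' CUDA solver, and the function names are chosen for this example.

```python
def apply_laplacian(p, n):
    """Return A*p for the 2D five-point stencil (4 on the diagonal, -1 for
    each of the four neighbours) on an n-by-n grid stored row by row."""
    y = [0.0] * (n * n)
    for j in range(n):
        for i in range(n):
            idx = j * n + i
            v = 4.0 * p[idx]
            if i > 0:
                v -= p[idx - 1]
            if i < n - 1:
                v -= p[idx + 1]
            if j > 0:
                v -= p[idx - n]
            if j < n - 1:
                v -= p[idx + n]
            y[idx] = v
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(b, n, tol=1e-10, max_iter=1000):
    """Solve A x = b with unpreconditioned CG, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = apply_laplacian(p, n)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

The stencil application and the dot products are the operations that map naturally onto a GPU, since each grid point or vector entry can be processed independently.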
Chalasani, P.; Saias, I.; Jha, S.
1996-04-08
As increasingly large volumes of sophisticated options (called derivative securities) are traded in world financial markets, determining a fair price for these options has become an important and difficult computational problem. Many valuation codes use the binomial pricing model, in which the stock price is driven by a random walk. In this model, the value of an n-period option on a stock is the expected time-discounted value of the future cash flow on an n-period stock price path. Path-dependent options are particularly difficult to value since the future cash flow depends on the entire stock price path rather than on just the final stock price. Currently such options are approximately priced by Monte Carlo methods with error bounds that hold only with high probability and which are reduced by increasing the number of simulation runs. In this paper the authors show that pricing an arbitrary path-dependent option is #P-hard. They show that certain types of path-dependent options can be valued exactly in polynomial time. Asian options are path-dependent options that are particularly hard to price, and for these they design deterministic polynomial-time approximate algorithms. They show that the value of a perpetual American put option (which can be computed in constant time) is in many cases a good approximation to the value of an otherwise identical n-period American put option. In contrast to Monte Carlo methods, the algorithms have guaranteed error bounds that are polynomially small (and in some cases exponentially small) in the maturity n. For the error analysis they derive large-deviation results for random walks that may be of independent interest.
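The definition of option value in the binomial model can be turned directly into code by enumerating every price path. The Python sketch below does exactly that, which costs 2^n path evaluations and so shows why path-dependent options are expensive to value exactly; it is not one of the polynomial-time algorithms of the paper. The risk-neutral up probability, the function names and the numerical parameters are assumptions of this example.

```python
from itertools import product

def path_value(payoff, s0, up, down, rate, n):
    """Expected discounted payoff over all 2**n paths of an n-period
    binomial model. `payoff` receives the full price path [S_0, ..., S_n]."""
    q = (1.0 + rate - down) / (up - down)   # risk-neutral up probability
    total = 0.0
    for moves in product((True, False), repeat=n):
        price, path, prob = s0, [s0], 1.0
        for is_up in moves:
            price *= up if is_up else down
            prob *= q if is_up else 1.0 - q
            path.append(price)
        total += prob * payoff(path)
    return total / (1.0 + rate) ** n

def european_call(strike):
    """Payoff depending only on the final price."""
    return lambda path: max(path[-1] - strike, 0.0)

def asian_call(strike):
    """Path-dependent payoff on the arithmetic average of the path."""
    return lambda path: max(sum(path) / len(path) - strike, 0.0)

# Hypothetical parameters: 5 periods, 10% up or down moves, 1% rate per period.
euro = path_value(european_call(100.0), 100.0, 1.1, 0.9, 0.01, 5)
asian = path_value(asian_call(100.0), 100.0, 1.1, 0.9, 0.01, 5)
```

For the European payoff the same value could be obtained with a backward recursion over n + 1 final prices; for the Asian payoff the average differs from path to path, so that shortcut is not available.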
Riemann solvers and Alfven waves in black hole magnetospheres
NASA Astrophysics Data System (ADS)
Punsly, Brian; Balsara, Dinshaw; Kim, Jinho; Garain, Sudip
2016-09-01
In the magnetosphere of a rotating black hole, an inner Alfven critical surface (IACS) must be crossed by inflowing plasma. Inside the IACS, Alfven waves are inward directed toward the black hole. The majority of the proper volume of the active region of spacetime (the ergosphere) is inside of the IACS. The charge and the totally transverse momentum flux (the momentum flux transverse to both the wave normal and the unperturbed magnetic field) are both determined exclusively by the Alfven polarization. Thus, it is important for numerical simulations of black hole magnetospheres to minimize the dissipation of Alfven waves. Elements of the dissipated wave emerge in adjacent cells regardless of the IACS; there is no mechanism to prevent Alfvenic information from crossing outward. Thus, numerical dissipation can affect how simulated magnetospheres attain the substantial Goldreich-Julian charge density associated with the rotating magnetic field. In order to help minimize dissipation of Alfven waves in relativistic numerical simulations we have formulated a one-dimensional Riemann solver, called HLLI, which incorporates the Alfven discontinuity and the contact discontinuity. We have also formulated a multidimensional Riemann solver, called MuSIC, that enables low dissipation propagation of Alfven waves in multiple dimensions. The importance of higher order schemes in lowering the numerical dissipation of Alfven waves is also catalogued.
A massively parallel fractional step solver for incompressible flows
Houzeaux, G.; Vazquez, M.; Aubry, R.; Cela, J.M.
2009-09-20
This paper presents a parallel implementation of fractional step solvers for the incompressible Navier-Stokes equations using an algebraic approach. Under this framework, predictor-corrector and incremental projection schemes are seen as sub-classes of the same class, making apparent their differences and similarities. An additional advantage of this approach is to set a common basis for a parallelization strategy, which can be extended to other split techniques or to compressible flows. The predictor-corrector scheme consists in solving the momentum equation and a modified 'continuity' equation (namely a simple iteration for the pressure Schur complement) consecutively in order to converge to the monolithic solution, thus avoiding fractional errors. On the other hand, the incremental projection scheme solves only one iteration of the predictor-corrector per time step and adds a correction equation to fulfill the mass conservation. As shown in the paper, these two schemes are very well suited for massively parallel implementation. In fact, when compared with monolithic schemes, simpler solvers and preconditioners can be used to solve the non-symmetric momentum equations (GMRES, Bi-CGSTAB) and to solve the symmetric continuity equation (CG, Deflated CG). This gives the algorithm good speedup properties. The implementation of the mesh partitioning technique is presented, as well as the parallel performance and speedups for thousands of processors.
Using computer algebra and SMT solvers in algebraic biology
NASA Astrophysics Data System (ADS)
Pineda Osorio, Mateo
2014-05-01
Biological processes are represented as Boolean networks in discrete time. The dynamics within these networks are approached with the help of SMT solvers and the use of computer algebra. Software such as Maple and Z3 was used in this case. The number of stationary states for each network was calculated. The network studied here corresponds to the immune system under the effects of drastic mood changes. Mood is considered as a Boolean variable that affects the entire dynamics of the immune system, changing the Boolean satisfiability and the number of stationary states of the immune network. Results obtained show Z3's great potential as an SMT solver. Some of these results were verified in Maple, even though it proved less suitable for the problem approach. The solving code was constructed using Z3-Python and Z3-SMT-LiB. Results obtained are important in biology systems and are expected to help in the design of immune therapies. As a future line of research, more complex Boolean network representations of the immune system as well as the whole psychological apparatus are suggested.
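The notion of a stationary state of a Boolean network can be illustrated without a solver. The Python sketch below finds the fixed points of a synchronous update by brute-force enumeration, which stands in here for the Z3 and Maple computations of the abstract. The three-variable network and its update rules are invented for this example and are not the immune network studied by the author.

```python
from itertools import product

def stationary_states(update, n_vars):
    """Return every state s with update(s) == s, by checking all 2**n_vars
    Boolean assignments."""
    return [s for s in product((False, True), repeat=n_vars) if update(s) == s]

def toy_update(state):
    """Hypothetical network: `mood` is an external input that stays fixed,
    x is activated by y unless mood is on, and y copies x."""
    mood, x, y = state
    return (mood, y and not mood, x)

fixed_points = stationary_states(toy_update, 3)
without_mood = [s for s in fixed_points if not s[0]]
with_mood = [s for s in fixed_points if s[0]]
```

In this toy network, switching the mood variable on reduces the number of stationary states from two to one, which is the kind of effect the abstract describes. Enumeration scales as 2^n, which is why an SMT solver becomes attractive for larger networks.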
Agglomeration Multigrid for an Unstructured-Grid Flow Solver
NASA Technical Reports Server (NTRS)
Frink, Neal; Pandya, Mohagna J.
2004-01-01
An agglomeration multigrid scheme has been implemented into the sequential version of the NASA code USM3Dns, a tetrahedral cell-centered finite volume Euler/Navier-Stokes flow solver. Efficiency and robustness of the multigrid-enhanced flow solver have been assessed for three configurations assuming an inviscid flow and one configuration assuming a viscous fully turbulent flow. The inviscid studies include a transonic flow over the ONERA M6 wing and a generic business jet with flow-through nacelles and a low subsonic flow over a high-lift trapezoidal wing. The viscous case includes a fully turbulent flow over the RAE 2822 rectangular wing. The multigrid solutions converged with 12%-33% of the Central Processing Unit (CPU) time required by the solutions obtained without multigrid. For all of the inviscid cases, multigrid in conjunction with an explicit time-stepping scheme performed the best with regard to the run time memory and CPU time requirements. However, for the viscous case multigrid had to be used with an implicit backward Euler time-stepping scheme that increased the run time memory requirement by 22% as compared to the run made without multigrid.
An efficient chemical kinetics solver using high dimensional model representation
Shorter, J.A.; Ip, P.C.; Rabitz, H.A.
1999-09-09
A high dimensional model representation (HDMR) technique is introduced to capture the input-output behavior of chemical kinetic models. The HDMR expresses the output chemical species concentrations as a rapidly convergent hierarchical correlated function expansion in the input variables. In this paper, the input variables are taken as the species concentrations at time $t_i$ and the output is the concentrations at time $t_i + \delta$, where $\delta$ can be much larger than conventional integration time steps. A specially designed set of model runs is performed to determine the correlated functions making up the HDMR. The resultant HDMR can be used to (1) identify the key input variables acting independently or cooperatively on the output, and (2) create a high speed fully equivalent operational model (FEOM) serving to replace the original kinetic model and its differential equation solver. A demonstration of the HDMR technique is presented for stratospheric chemical kinetics. The FEOM proved to give accurate and stable chemical concentrations out to long times of many years. In addition, the FEOM was found to be orders of magnitude faster than a conventional stiff equation solver. This computational acceleration should have significance in many chemical kinetic applications.
Parareal in time 3D numerical solver for the LWR Benchmark neutron diffusion transient model
Baudron, Anne-Marie; Riahi, Mohamed Kamel; Salomon, Julien
2014-12-15
In this paper we present a time-parallel algorithm for the 3D neutrons calculation of a transient model in a nuclear reactor core. The neutrons calculation consists in numerically solving the time dependent diffusion approximation equation, which is a simplified transport equation. The numerical resolution is done with finite elements method based on a tetrahedral meshing of the computational domain, representing the reactor core, and time discretization is achieved using a θ-scheme. The transient model presents moving control rods during the time of the reaction. Therefore, cross-sections (piecewise constants) are taken into account by interpolations with respect to the velocity of the control rods. The parallelism across the time is achieved by an adequate use of the parareal in time algorithm to the handled problem. This parallel method is a predictor corrector scheme that iteratively combines the use of two kinds of numerical propagators, one coarse and one fine. Our method is made efficient by means of a coarse solver defined with large time step and fixed position control rods model, while the fine propagator is assumed to be a high order numerical approximation of the full model. The parallel implementation of our method provides a good scalability of the algorithm. Numerical results show the efficiency of the parareal method on large light water reactor transient model corresponding to the Langenbuch–Maurer–Werner benchmark.
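The predictor-corrector structure of parareal is easiest to see on a scalar model problem. The Python sketch below applies it to y' = lam * y, with one backward Euler step per time slice as the coarse propagator and many Crank-Nicolson steps as the fine one, both written as a θ-scheme. The test equation, the propagator choices and all names are assumptions of this example; the paper's coarse solver (large time step, fixed control rods) and its finite element discretization are not reproduced.

```python
def theta_step(y, lam, dt, theta):
    """One theta-scheme step for y' = lam * y
    (theta = 1: backward Euler, theta = 0.5: Crank-Nicolson)."""
    return y * (1.0 + (1.0 - theta) * lam * dt) / (1.0 - theta * lam * dt)

def propagate(y, lam, dt, steps, theta):
    """Advance y over an interval of length dt using `steps` equal substeps."""
    h = dt / steps
    for _ in range(steps):
        y = theta_step(y, lam, h, theta)
    return y

def parareal(y0, lam, t_end, n_slices, n_iter, fine_steps=100):
    """Return the parareal values at the n_slices + 1 slice boundaries."""
    dt = t_end / n_slices

    def coarse(y):
        return propagate(y, lam, dt, 1, 1.0)

    def fine(y):
        return propagate(y, lam, dt, fine_steps, 0.5)

    # Initial guess: one serial sweep of the cheap coarse propagator.
    u = [y0]
    for _ in range(n_slices):
        u.append(coarse(u[-1]))

    for _ in range(n_iter):
        # The fine solves on different slices are independent of each other;
        # this is the part that would run in parallel.
        f_old = [fine(v) for v in u[:-1]]
        g_old = [coarse(v) for v in u[:-1]]
        # Serial correction sweep: U_{n+1} = G(U_n_new) + F(U_n_old) - G(U_n_old).
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n]) + f_old[n] - g_old[n])
        u = new
    return u
```

After k iterations the first k slice values coincide with the serial fine solution, so the method is only worthwhile when it converges in far fewer iterations than there are slices.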
Miller, Gregory H.
2003-08-06
In this paper we present a general iterative method for the solution of the Riemann problem for hyperbolic systems of PDEs. The method is based on the multiple shooting method for free boundary value problems. We demonstrate the method by solving one-dimensional Riemann problems for hyperelastic solid mechanics. Even for conditions representative of routine laboratory conditions and military ballistics, dramatic differences are seen between the exact and approximate Riemann solution. The greatest discrepancy arises from misallocation of energy between compressional and thermal modes by the approximate solver, resulting in nonphysical entropy and temperature estimates. Several pathological conditions arise in common practice, and modifications to the method to handle these are discussed. These include points where genuine nonlinearity is lost, degeneracies, and eigenvector deficiencies that occur upon melting.
Roy, Swapnoneel; Thakur, Ashok Kumar
2008-01-01
Genome rearrangements have been modelled by a variety of primitives such as reversals, transpositions, block moves and block interchanges. We consider one such genome rearrangement primitive, strip exchanges. Given a permutation, the challenge is to sort it using the minimum number of strip exchanges. A strip exchanging move interchanges the positions of two chosen strips so that they merge with other strips. The strip exchange problem is to sort a permutation using the minimum number of strip exchanges. We present here the first non-trivial 2-approximation algorithm for this problem. We also observe that sorting by strip exchanges is fixed-parameter tractable. Lastly, we discuss the application of strip exchanges in a different area, Optical Character Recognition (OCR), with an example.
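A small sketch may help make the move concrete. The abstract does not define a strip formally; the code below assumes the usual convention that a strip is a maximal run of entries in which each element equals its predecessor plus one. The function names are hypothetical, and this only illustrates a single move, not the 2-approximation algorithm.

```python
def strips(perm):
    """Split a permutation into maximal strips (runs where each entry is the previous plus one)."""
    out = [[perm[0]]]
    for x in perm[1:]:
        if x == out[-1][-1] + 1:
            out[-1].append(x)      # x extends the current strip
        else:
            out.append([x])        # x starts a new strip
    return out


def exchange(perm, i, j):
    """Swap the i-th and j-th strips (0-based) and return the resulting permutation."""
    s = strips(perm)
    s[i], s[j] = s[j], s[i]
    return [x for strip in s for x in strip]
```

For instance, `[3, 4, 1, 2, 5]` consists of the strips `[3, 4]`, `[1, 2]` and `[5]`; exchanging the first two merges everything into a single sorted strip.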
Hierarchical Approximate Bayesian Computation
Turner, Brandon M.; Van Zandt, Trisha
2013-01-01
Approximate Bayesian computation (ABC) is a powerful technique for estimating the posterior distribution of a model’s parameters. It is especially important when the model to be fit has no explicit likelihood function, which happens for computational (or simulation-based) models such as those that are popular in cognitive neuroscience and other areas in psychology. However, ABC is usually applied only to models with few parameters. Extending ABC to hierarchical models has been difficult because high-dimensional hierarchical models add computational complexity that conventional ABC cannot accommodate. In this paper we summarize some current approaches for performing hierarchical ABC and introduce a new algorithm called Gibbs ABC. This new algorithm incorporates well-known Bayesian techniques to improve the accuracy and efficiency of the ABC approach for estimation of hierarchical models. We then use the Gibbs ABC algorithm to estimate the parameters of two models of signal detection, one with and one without a tractable likelihood function. PMID:24297436
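For readers unfamiliar with ABC, the basic rejection variant that such methods build on can be sketched in a few lines. This is not the Gibbs ABC algorithm of the paper; it is a minimal illustration assuming a normal model with unknown mean and unit variance, a flat prior on [-5, 5], and the sample mean as summary statistic. All names and parameter choices are hypothetical.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws, tol, rng):
    """Keep prior draws whose simulated summary statistic lies within tol of the observed one."""
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)                        # draw a candidate from the (assumed) flat prior
        sim = [rng.gauss(mu, 1.0) for _ in range(n_obs)]   # simulate a data set; no likelihood is evaluated
        if abs(sum(sim) / n_obs - observed_mean) < tol:    # compare summary statistics
            accepted.append(mu)
    return accepted


rng = random.Random(0)
posterior_sample = abc_rejection(observed_mean=1.0, n_obs=20, n_draws=5000, tol=0.2, rng=rng)
```

The accepted draws approximate the posterior of the mean; the acceptance rate falls quickly as the number of parameters grows, which is the difficulty with hierarchical models that the abstract points to.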
Relaxation approximations to second-order traffic flow models by high-resolution schemes
Nikolos, I.K.; Delis, A.I.; Papageorgiou, M.
2015-03-10
A relaxation-type approximation of second-order non-equilibrium traffic models, written in conservation or balance law form, is considered. Using the relaxation approximation, the nonlinear equations are transformed to a semi-linear diagonalizable problem with linear characteristic variables and stiff source terms, with the attractive feature that neither Riemann solvers nor characteristic decompositions are needed. In particular, it is only necessary to provide the flux and source term functions and an estimate of the characteristic speeds. To discretize the resulting relaxation system, high-resolution reconstructions in space are considered. Emphasis is placed on a fifth-order WENO scheme and its performance. The computations reported demonstrate the simplicity and versatility of relaxation schemes as numerical solvers.
NASA Astrophysics Data System (ADS)
Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María
2014-06-01
We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with the biconjugate gradient stabilized method. The results have shown that the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, in cases in which other preconditioners succeed in converging to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code, by up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures a grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the
NASA Astrophysics Data System (ADS)
Go, Ara; Millis, Andrew J.
2014-03-01
The three-band copper oxide model is studied using the single-site and four-site dynamical mean-field theory with configuration interaction based impurity solver. Comparison of the single and four site approximations shows that short ranged antiferromagnetic correlations are crucial to the physics. In the undoped case, they increase the gap size, shift the metal-insulator phase boundary and enhance the conductivity at the gap edge. The relation of antiferromagnetism and the pseudogap is discussed for the doped case. The new solver permits the inclusion of more bath orbitals which are crucial for accurate studies of spectral properties near the gap edge. This work was supported by the US Department of Energy under Grants No. DOE FG02-04ER46169 and DE-SC0006613.
The SX Solver: A Computer Program for Analyzing Solvent-Extraction Equilibria: Version 3.0
Lumetta, Gregg J.
2002-01-17
A new computer program, the SX Solver, has been developed to analyze solvent-extraction equilibria. The program operates out of Microsoft Excel and uses the built-in Solver function to minimize the sum of the squares of the residuals between measured and calculated distribution coefficients. The extraction of nitric acid by tributyl phosphate has been modeled to illustrate the program's use.
A block iterative LU solver for weakly coupled linear systems. [in fluid dynamics equations]
NASA Technical Reports Server (NTRS)
Cooke, C. H.
1977-01-01
A hybrid technique, called the block iterative LU solver, is proposed for solving the linear equations resulting from a finite element numerical analysis of certain fluid dynamics problems where the equations are weakly coupled between distinct sets of variables. Either the block Jacobi iterative method or the block Gauss-Seidel iterative solver is combined with LU decomposition.
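The combination of a block iteration with LU decomposition can be sketched as follows. This is an illustrative reading of the abstract, not the report's code: it assumes a 2x2 block system with dense diagonal blocks that can be factored without pivoting, and every function name is hypothetical.

```python
def lu_factor(A):
    """Doolittle LU without pivoting; L (unit diagonal) and U are packed into one matrix."""
    n = len(A)
    LU = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                LU[i][j] -= LU[i][k] * LU[k][j]
    return LU


def lu_solve(LU, b):
    """Solve A x = b given the packed LU factors of A."""
    n = len(b)
    y = b[:]
    for i in range(n):                      # forward substitution with unit-diagonal L
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    for i in reversed(range(n)):            # back substitution with U
        for j in range(i + 1, n):
            y[i] -= LU[i][j] * y[j]
        y[i] /= LU[i][i]
    return y


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def block_gauss_seidel(A11, A12, A21, A22, b1, b2, sweeps=50):
    """Solve [[A11, A12], [A21, A22]] [x1; x2] = [b1; b2] by block Gauss-Seidel sweeps."""
    LU1, LU2 = lu_factor(A11), lu_factor(A22)   # each diagonal block is factored only once
    x1, x2 = [0.0] * len(b1), [0.0] * len(b2)
    for _ in range(sweeps):
        x1 = lu_solve(LU1, [b - c for b, c in zip(b1, matvec(A12, x2))])
        x2 = lu_solve(LU2, [b - c for b, c in zip(b2, matvec(A21, x1))])   # uses the updated x1
    return x1, x2
```

Replacing the updated `x1` in the second solve with the previous sweep's value would give the block Jacobi variant. When the off-diagonal blocks are small relative to the diagonal ones (weak coupling), a few sweeps are typically enough.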
T2CG1, a package of preconditioned conjugate gradient solvers for TOUGH2
Moridis, G.; Pruess, K.; Antunez, E.
1994-03-01
Most of the computational work in the numerical simulation of fluid and heat flows in permeable media arises in the solution of large systems of linear equations. The simplest technique for solving such equations is by direct methods. However, because of large storage requirements and accumulation of roundoff errors, the application of direct solution techniques is limited, depending on matrix bandwidth, to systems of a few hundred to at most a few thousand simultaneous equations. T2CG1, a package of preconditioned conjugate gradient solvers, has been added to TOUGH2 to complement its direct solver and significantly increase the size of problems tractable on PCs. T2CG1 includes three different solvers: a Bi-Conjugate Gradient (BCG) solver, a Bi-Conjugate Gradient Squared (BCGS) solver, and a Generalized Minimum Residual (GMRES) solver. Results from six test problems with up to 30,000 equations show that (1) T2CG1 is significantly (and invariably) faster and requires far less memory than the MA28 direct solver, (2) it makes possible the solution of very large three-dimensional problems on PCs, and (3) the BCGS solver is the fastest of the three in the tested problems. Sample problems are presented related to heat and fluid flow at Yucca Mountain and WIPP, environmental remediation by the Thermal Enhanced Vapor Extraction System, and geothermal resources.
Approximate Bayesian multibody tracking.
Lanz, Oswald
2006-09-01
Visual tracking of multiple targets is a challenging problem, especially when efficiency is an issue. Occlusions, if not properly handled, are a major source of failure. Solutions supporting principled occlusion reasoning have been proposed but are not yet practical for online applications. This paper presents a new solution which effectively manages the trade-off between reliable modeling and computational efficiency. The Hybrid Joint-Separable (HJS) filter is derived from a joint Bayesian formulation of the problem, and shown to be efficient while optimal in terms of compact belief representation. Computational efficiency is achieved by employing a Markov random field approximation to joint dynamics and an incremental algorithm for posterior update with an appearance likelihood that implements a physically-based model of the occlusion process. A particle filter implementation is proposed which achieves accurate tracking during partial occlusions, while in cases of complete occlusion, tracking hypotheses are bound to estimated occlusion volumes. Experiments show that the proposed algorithm is efficient, robust, and able to resolve long-term occlusions between targets with identical appearance. PMID:16929730
Experimental validation of GADRAS's coupled neutron-photon inverse radiation transport solver.
Mattingly, John K.; Mitchell, Dean James; Harding, Lee T.
2010-08-01
Sandia National Laboratories has developed an inverse radiation transport solver that applies nonlinear regression to coupled neutron-photon deterministic transport models. The inverse solver uses nonlinear regression to fit a radiation transport model to gamma spectrometry and neutron multiplicity counting measurements. The subject of this paper is the experimental validation of that solver. This paper describes a series of experiments conducted with a 4.5 kg sphere of α-phase, weapons-grade plutonium. The source was measured bare and reflected by high-density polyethylene (HDPE) spherical shells with total thicknesses between 1.27 and 15.24 cm. Neutron and photon emissions from the source were measured using three instruments: a gross neutron counter, a portable neutron multiplicity counter, and a high-resolution gamma spectrometer. These measurements were used as input to the inverse radiation transport solver to evaluate the solver's ability to correctly infer the configuration of the source from its measured radiation signatures.
A High-Order Direct Solver for Helmholtz Equations with Neumann Boundary Conditions
NASA Technical Reports Server (NTRS)
Sun, Xian-He; Zhuang, Yu
1997-01-01
In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments are then introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solver. Analytical and experimental results show this newly proposed solver is comparable to the conventional second-order elliptic solver when accuracy is not a primary concern, and is significantly faster than that of the conventional solver if a highly accurate solution is required. In addition, this newly proposed fourth order Helmholtz solver is parallel in nature. It is readily available for parallel and distributed computers. The compact scheme introduced in this study is likely extendible for sixth-order accurate algorithms and for more general elliptic equations.
High-performance equation solvers and their impact on finite element analysis
NASA Technical Reports Server (NTRS)
Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. Dale, Jr.
1990-01-01
The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
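As an illustration of the iterative family mentioned here, the following is a minimal sketch of a preconditioned conjugate gradient method with a diagonal (Jacobi) preconditioner. It assumes a small dense symmetric positive definite matrix stored as nested lists; the storage schemes and preconditioners of the report are not reproduced, and the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradients for SPD A with M = diag(A) as preconditioner.

    Returns the approximate solution and the number of iterations performed.
    """
    n = len(b)
    m_inv = [1.0 / A[i][i] for i in range(n)]     # applying M^{-1} is an elementwise scaling
    x = [0.0] * n
    r = b[:]                                      # residual for the zero initial guess
    z = [m * ri for m, ri in zip(m_inv, r)]       # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [m * ri for m, ri in zip(m_inv, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

A stronger preconditioner changes only the two lines that compute `z`; the trade-off noted in the abstract is that it costs more per iteration but needs fewer iterations.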
Blade design and analysis using a modified Euler solver
NASA Technical Reports Server (NTRS)
Leonard, O.; Vandenbraembussche, R. A.
1991-01-01
An iterative method for blade design, based on an Euler solver and described in an earlier paper, is used to design compressor and turbine blades providing shock-free transonic flows. The method converges rapidly and indicates how sensitive the flow is to small modifications of the blade geometry that the classical iterative use of analysis methods might not be able to define. The relationship between the required Mach number distribution and the resulting geometry is discussed. Examples show how geometrical constraints imposed upon the blade shape can be respected by using free geometrical parameters or by relaxing the required Mach number distribution. The same code is used both for the design of the required geometry and for the off-design calculations. Examples illustrate the difficulty of designing blade shapes with optimal performance also outside of the design point.
A high-accuracy Eulerian gyrokinetic solver for collisional plasmas
NASA Astrophysics Data System (ADS)
Candy, J.; Belli, E. A.; Bravenec, R. V.
2016-11-01
We describe a new approach to solve the electromagnetic gyrokinetic equations which is optimized for accurate treatment of multispecies Fokker-Planck collisions including both pitch-angle and energy diffusion. The new algorithm is spectral/pseudospectral in four of the five phase space dimensions, and in the fieldline direction a novel 5th-order conservative upwind scheme is used to permit high-accuracy electromagnetic simulation even in the limit of very high plasma β and vanishingly small perpendicular wavenumber, k⊥ → 0. To our knowledge, this is the first pseudospectral implementation of the collision operator in a gyrokinetic code. We show that the new solver agrees closely with GYRO in the limit of weak Lorentz collisions, but gives a significantly more realistic description of collisions at high collision frequency. The numerical methods are also designed to be efficient and scalable for multiscale simulations that treat ion-scale and electron-scale turbulence simultaneously.
AN ADAPTIVE PARTICLE-MESH GRAVITY SOLVER FOR ENZO
Passy, Jean-Claude; Bryan, Greg L.
2014-11-01
We describe and implement an adaptive particle-mesh algorithm to solve the Poisson equation for grid-based hydrodynamics codes with nested grids. The algorithm is implemented and extensively tested within the astrophysical code Enzo against the multigrid solver available by default. We find that while both algorithms show similar accuracy for smooth mass distributions, the adaptive particle-mesh algorithm is more accurate for the case of point masses, and is generally less noisy. We also demonstrate that the two-body problem can be solved accurately in a configuration with nested grids. In addition, we discuss the effect of subcycling, and demonstrate that evolving all the levels with the same timestep yields even greater precision.
Workload Characterization of CFD Applications Using Partial Differential Equation Solvers
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
Workload characterization is used for modeling and evaluating computing systems at different levels of detail. We present workload characterization for a class of Computational Fluid Dynamics (CFD) applications that solve Partial Differential Equations (PDEs). This workload characterization focuses on three high performance computing platforms: SGI Origin2000, IBM SP-2, and a cluster of Intel Pentium Pro based PCs. We execute extensive measurement-based experiments on these platforms to gather statistics of system resource usage, which results in workload characterization. Our workload characterization approach yields a coarse-grain resource utilization behavior that is being applied for performance modeling and evaluation of distributed high performance metacomputing systems. In addition, this study enhances our understanding of interactions between PDE solver workloads and high performance computing platforms and is useful for tuning these applications.
GPU accelerated FDTD solver and its application in MRI.
Chi, J; Liu, F; Jin, J; Mason, D G; Crozier, S
2010-01-01
The finite difference time domain (FDTD) method is a popular technique for computational electromagnetics (CEM). The large computational power often required, however, has been a limiting factor for its applications. In this paper, we will present a graphics processing unit (GPU)-based parallel FDTD solver and its successful application to the investigation of a novel B1 shimming scheme for high-field magnetic resonance imaging (MRI). The optimized shimming scheme exhibits considerably improved transmit B1 profiles. The GPU implementation dramatically shortened the runtime of FDTD simulation of electromagnetic field compared with its CPU counterpart. The acceleration in runtime has made such investigation possible, and will pave the way for other studies of large-scale computational electromagnetic problems in modern MRI which were previously impractical.
Using parallel banded linear system solvers in generalized eigenvalue problems
NASA Technical Reports Server (NTRS)
Zhang, Hong; Moss, William F.
1994-01-01
Subspace iteration is a reliable and cost effective method for solving positive definite banded symmetric generalized eigenproblems, especially in the case of large scale problems. This paper discusses an algorithm that makes use of two parallel banded solvers in subspace iteration. A shift is introduced to decompose the banded linear systems into relatively independent subsystems and to accelerate the iterations. With this shift, an eigenproblem is mapped efficiently into the memories of a multiprocessor and a high speedup is obtained for parallel implementations. An optimal shift is a shift that balances total computation and communication costs. Under certain conditions, we show how to estimate an optimal shift analytically using the decay rate for the inverse of a banded matrix, and how to improve this estimate. Computational results on iPSC/2 and iPSC/860 multiprocessors are presented.
Using parallel banded linear system solvers in generalized eigenvalue problems
NASA Technical Reports Server (NTRS)
Zhang, Hong; Moss, William F.
1993-01-01
Subspace iteration is a reliable and cost effective method for solving positive definite banded symmetric generalized eigenproblems, especially in the case of large scale problems. This paper discusses an algorithm that makes use of two parallel banded solvers in subspace iteration. A shift is introduced to decompose the banded linear systems into relatively independent subsystems and to accelerate the iterations. With this shift, an eigenproblem is mapped efficiently into the memories of a multiprocessor and a high speed-up is obtained for parallel implementations. An optimal shift is a shift that balances total computation and communication costs. Under certain conditions, we show how to estimate an optimal shift analytically using the decay rate for the inverse of a banded matrix, and how to improve this estimate. Computational results on iPSC/2 and iPSC/860 multiprocessors are presented.
Aeroelastic analysis of advanced propellers using an efficient Euler solver
NASA Technical Reports Server (NTRS)
Srivastava, R.; Reddy, T. S. R.; Mehmed, O.
1992-01-01
A 3D Euler solver is coupled with a 3D structural dynamics model to investigate flutter of propfans. A hybrid scheme is used to reduce computational time for the Euler equations and a normal mode analysis is used for flutter calculations. Experimental and calculated flutter results are compared for an advanced propeller propfan which experienced flutter at transonic tip relative velocities. The predicted flutter calculations are in close agreement with the experimental data. A structural damping value of 0.5 percent was required to predict the behavior observed in the experiment. Computations show that the flutter behavior is dominated by the second mode, but coupling with the first mode is required. The addition of other modes to the calculations did not affect the flutter behavior.
Progress in developing Poisson-Boltzmann equation solvers
Li, Chuan; Li, Lin; Petukh, Marharyta; Alexov, Emil
2013-01-01
This review outlines the recent progress made in developing more accurate and efficient solutions to model electrostatics in systems comprised of bio-macromolecules and nano-objects, the last one referring to objects that do not have biological function themselves but nowadays are frequently used in biophysical and medical approaches in conjunction with bio-macromolecules. The problem of modeling macromolecular electrostatics is reviewed from two different angles: as a mathematical task provided the specific definition of the system to be modeled and as a physical problem aiming to better capture the phenomena occurring in the real experiments. In addition, specific attention is paid to methods to extend the capabilities of the existing solvers to model large systems toward applications of calculations of the electrostatic potential and energies in molecular motors, mitochondria complex, photosynthetic machinery and systems involving large nano-objects. PMID:24199185
Application of sparse matrix solvers as effective preconditioners
Young, D.P.; Melvin, R.G.; Johnson, F.T.; Bussoletti, J.E.; Wigton, L.B.; Samant, S.S.
1989-11-01
In this paper the use of a new out-of-core sparse matrix package for the numerical solution of partial differential equations involving complex geometries arising from aerospace applications is discussed. The sparse matrix solver accepts contributions to the matrix elements in random order and assembles the matrix using fast sort/merge routines. Fill-in is reduced through the use of a physically based nested dissection ordering. For very large problems a drop tolerance is used during the matrix decomposition phase. The resulting incomplete factorization is an effective preconditioner for Krylov subspace methods, such as GMRES. Problems involving 200,000 unknowns routinely are solved on the Cray X-MP using 64MW of solid-state storage device (SSD).
Extending the QUDA Library with the eigCG Solver
Strelchenko, Alexei; Stathopoulos, Andreas
2014-12-12
While the incremental eigCG algorithm [1] is included in many LQCD software packages, its realization on GPU micro-architectures was still missing. In this session we report our experience of the eigCG implementation in the QUDA library. In particular, we will focus on how to employ the mixed precision technique to accelerate solutions of large sparse linear systems with multiple right-hand sides on GPUs. Although application of mixed precision techniques is a well-known optimization approach for linear solvers, its utilization for the eigenvector computing within eigCG requires special consideration. We will discuss implementation aspects of the mixed precision deflation and illustrate its numerical behavior on the example of the Wilson twisted mass fermion matrix inversions.
A three-dimensional fast solver for arbitrary vorton distributions
Strickland, J.H.; Baty, R.S.
1994-05-01
A method which is capable of an efficient calculation of the three-dimensional flow field produced by a large system of vortons (discretized regions of vorticity) is presented in this report. The system of vortons can, in turn, be used to model body surfaces, container boundaries, free-surfaces, plumes, jets, and wakes in unsteady three-dimensional flow fields. This method takes advantage of multipole and local series expansions which enable one to make calculations for interactions between groups of vortons which are in well-separated spatial domains rather than having to consider interactions between every pair of vortons. In this work, series expansions for the vector potential of the vorton system are obtained. From such expansions, the three components of velocity can be obtained explicitly. A Fortran computer code FAST3D has been written to calculate the vector potential and the velocity components at selected points in the flow field. In this code, the evaluation points do not have to coincide with the location of the vortons themselves. Test cases have been run to benchmark the truncation errors and CPU time savings associated with the method. Non-dimensional truncation errors for the magnitudes of the vector potential and velocity fields are on the order of 10⁻⁴ and 10⁻³ respectively. Single precision accuracy produces errors in these quantities of up to 10⁻⁵. For fewer than 1,000 to 2,000 vortons in the field, there is virtually no CPU time savings with the fast solver. For 100,000 vortons in the flow, the fast solver obtains solutions in 1% to 10% of the time required for the direct solution technique depending upon the configuration.
NASA Astrophysics Data System (ADS)
Lubkin, Elihu
2002-04-01
In 1993 (E. & T. Lubkin, Int. J. Theor. Phys. 32, 993 (1993)) we gave exact mean trace
Approximation by hinge functions
Faber, V.
1997-05-01
Breiman has defined "hinge functions" for use as basis functions in least squares approximations to data. A hinge function is the max (or min) function of two linear functions. In this paper, the author assumes the existence of a smooth function f(x) and a set of samples of the form (x, f(x)) drawn from a probability distribution ρ(x). The author hopes to find the best fitting hinge function h(x) in the least squares sense. There are two problems with this plan. First, Breiman has suggested an algorithm to perform this fit. The author shows that this algorithm is not robust and also shows how to create examples on which the algorithm diverges. Second, if the author tries to use the data to minimize the fit in the usual discrete least squares sense, the functional that must be minimized is continuous in the variables, but has a derivative which jumps at the data. This paper takes a different approach. This approach is an example of a method that the author has developed called "Monte Carlo Regression". (A paper on the general theory is in preparation.) The author shall show that since the function f is continuous, the analytic form of the least squares equation is continuously differentiable. A local minimum is solved for by using Newton's method, where the entries of the Hessian are estimated directly from the data by Monte Carlo. The algorithm has the desirable properties that it is quadratically convergent from any starting guess sufficiently close to a solution and that each iteration requires only a linear system solve.
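The objects discussed in this abstract are easy to write down in one dimension. The sketch below is only an illustration of the definitions, assuming scalar x and a hinge parameterized as the max of the lines a1 + b1*x and a2 + b2*x; it implements neither Breiman's fitting algorithm nor the Monte Carlo Regression method, and the function names are hypothetical.

```python
def hinge(x, a1, b1, a2, b2):
    """Hinge function: the max of two linear functions of x."""
    return max(a1 + b1 * x, a2 + b2 * x)


def kink(a1, b1, a2, b2):
    """Location where the two lines cross (requires b1 != b2)."""
    return (a2 - a1) / (b1 - b2)


def discrete_sse(samples, a1, b1, a2, b2):
    """Discrete least squares functional over samples (x, y)."""
    return sum((y - hinge(x, a1, b1, a2, b2)) ** 2 for x, y in samples)
```

Each sample contributes through whichever line is active at its x, so as the parameters move the kink across a data point, that sample switches lines. This is the source of the derivative jumps in the discrete functional that the abstract mentions.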
NASA Astrophysics Data System (ADS)
Simmons, Alex; Yang, Qianqian; Moroney, Timothy
2015-04-01
The numerical solution of fractional partial differential equations poses significant computational challenges in regard to efficiency as a result of the spatial nonlocality of the fractional differential operators. The dense coefficient matrices that arise from spatial discretisation of these operators mean that even one-dimensional problems can be difficult to solve using standard methods on grids comprising thousands of nodes or more. In this work we address this issue of efficiency for one-dimensional, nonlinear space-fractional reaction-diffusion equations with fractional Laplacian operators. We apply variable-order, variable-stepsize backward differentiation formulas in a Jacobian-free Newton-Krylov framework to advance the solution in time. A key advantage of this approach is the elimination of any requirement to form the dense matrix representation of the fractional Laplacian operator. We show how a banded approximation to this matrix, which can be formed and factorised efficiently, can be used as part of an effective preconditioner that accelerates convergence of the Krylov subspace iterative solver. Our approach also captures the full contribution from the nonlinear reaction term in the preconditioner, which is crucial for problems that exhibit stiff reactions. Numerical examples are presented to illustrate the overall effectiveness of the solver.
NASA Astrophysics Data System (ADS)
Fosas de Pando, Miguel; Schmid, Peter J.; Sipp, Denis
2016-11-01
Nonlinear model reduction for large-scale flows is an essential component in many fluid applications such as flow control, optimization, parameter space exploration and statistical analysis. In this article, we generalize the POD-DEIM method, introduced by Chaturantabut & Sorensen [1], to address nonlocal nonlinearities in the equations without loss of performance or efficiency. The nonlinear terms are represented by nested DEIM-approximations using multiple expansion bases based on the Proper Orthogonal Decomposition. These extensions are imperative, for example, for applications of the POD-DEIM method to large-scale compressible flows. The efficient implementation of the presented model-reduction technique follows our earlier work [2] on linearized and adjoint analyses and takes advantage of the modular structure of our compressible flow solver. The efficacy of the nonlinear model-reduction technique is demonstrated on the flow around an airfoil and its acoustic footprint. We obtain an accurate and robust low-dimensional model that captures the main features of the full flow.
A New Equation Solver for Modeling Turbulent Flow in Coupled Matrix-Conduit Flow Models.
Hubinger, Bernhard; Birk, Steffen; Hergarten, Stefan
2016-07-01
Karst aquifers represent dual flow systems consisting of a highly conductive conduit system embedded in a less permeable rock matrix. Hybrid models iteratively coupling both flow systems generally consume much time, especially because of the nonlinearity of turbulent conduit flow. To reduce calculation times compared to those of existing approaches, a new iterative equation solver for the conduit system is developed based on an approximated Newton-Raphson expression and a Gauß-Seidel or successive over-relaxation scheme with a single iteration step at the innermost level. It is implemented and tested in the research code CAVE but should be easily adaptable to similar models such as the Conduit Flow Process for MODFLOW-2005. It substantially reduces the computational effort as demonstrated by steady-state benchmark scenarios as well as by transient karst genesis simulations. Water balance errors are found to be acceptable in most of the test cases. However, the performance and accuracy may deteriorate under unfavorable conditions such as sudden, strong changes of the flow field at some stages of the karst genesis simulations. PMID:26821785
A hierarchical Krylov-Bayes iterative inverse solver for MEG with physiological preconditioning
NASA Astrophysics Data System (ADS)
Calvetti, D.; Pascarella, A.; Pitolli, F.; Somersalo, E.; Vantaggi, B.
2015-12-01
The inverse problem of MEG aims at estimating electromagnetic cerebral activity from measurements of the magnetic fields outside the head. After formulating the problem within the Bayesian framework, a hierarchical conditionally Gaussian prior model is introduced, including a physiologically inspired prior model that takes into account the preferred directions of the source currents. The hyperparameter vector consists of prior variances of the dipole moments, assumed to follow a non-conjugate gamma distribution with variable scaling and shape parameters. A point estimate of both dipole moments and their variances can be computed using an iterative alternating sequential updating algorithm, which is shown to be globally convergent. The numerical solution is based on computing an approximation of the dipole moments using a Krylov subspace iterative linear solver equipped with statistically inspired preconditioning and a suitable termination rule. The shape parameters of the model are shown to control the focality, and furthermore, using an empirical Bayes argument, it is shown that the scaling parameters can be naturally adjusted to provide a statistically well justified depth sensitivity scaling. The validity of this interpretation is verified through computed numerical examples. Also, a computed example showing the applicability of the algorithm to analyze realistic time series data is presented.
Simulation of an Isolated Tiltrotor in Hover with an Unstructured Overset-Grid RANS Solver
NASA Technical Reports Server (NTRS)
Lee-Rausch, Elizabeth M.; Biedron, Robert T.
2009-01-01
An unstructured overset-grid Reynolds Averaged Navier-Stokes (RANS) solver, FUN3D, is used to simulate an isolated tiltrotor in hover. An overview of the computational method is presented as well as the details of the overset-grid systems. Steady-state computations within a noninertial reference frame define the performance trends of the rotor across a range of the experimental collective settings. Results are presented to show the effects of off-body grid refinement and blade grid refinement. The computed performance and blade loading trends show good agreement with experimental results and previously published structured overset-grid computations. Off-body flow features indicate a significant improvement in the resolution of the first perpendicular blade vortex interaction with background grid refinement across the collective range. Considering experimental data uncertainty and effects of transition, the prediction of figure of merit on the baseline and refined grid is reasonable at the higher collective range, within 3 percent of the measured values. At the lower collective settings, the computed figure of merit is approximately 6 percent lower than the experimental data. A comparison of steady and unsteady results shows that, with temporal refinement, the dynamic results closely match the steady-state noninertial results, which gives confidence in the accuracy of the dynamic overset-grid approach.
The value of continuity: Refined isogeometric analysis and fast direct solvers
Garcia, Daniel; Pardo, David; Dalcin, Lisandro; Paszynski, Maciej; Collier, Nathan; Calo, Victor M.
2016-08-24
Here, we propose the use of highly continuous finite element spaces interconnected with low continuity hyperplanes to maximize the performance of direct solvers. Starting from a highly continuous Isogeometric Analysis (IGA) discretization, we introduce C0-separators to reduce the interconnection between degrees of freedom in the mesh. By doing so, both the solution time and best approximation errors are simultaneously improved. We call the resulting method “refined Isogeometric Analysis (rIGA)”. To illustrate the impact of the continuity reduction, we analyze the number of Floating Point Operations (FLOPs), computational times, and memory required to solve the linear system obtained by discretizing the Laplace problem with structured meshes and uniform polynomial orders. Theoretical estimates demonstrate that an optimal continuity reduction may decrease the total computational time by a factor between p^2 and p^3, with p being the polynomial order of the discretization. Numerical results indicate that our proposed refined isogeometric analysis delivers a speed-up factor proportional to p^2. In a 2D mesh with four million elements and p=5, the linear system resulting from rIGA is solved 22 times faster than the one from highly continuous IGA. In a 3D mesh with one million elements and p=3, the linear system is solved 15 times faster for the refined than the maximum continuity isogeometric analysis.
Improving DWF Simulations: Force Gradient Integrator and the Mobius Accelerated DWF Solver
NASA Astrophysics Data System (ADS)
Yin, H.; Mawhinney, R.
We have implemented a variant of the force gradient integrator proposed by Kennedy et al. and are using it in our production 2+1 flavor DWF simulations with pion masses of 180 MeV in (4.5 fm)^3 volumes. We find modest speed-ups (~20%) from using the force gradient integrator, compared to our previously used Omelyan integrator. On other ensembles, primarily finite temperature 2+1 flavor DWF QCD, we have extensively tuned the Hasenbusch preconditioning masses and achieved speed-ups of 2-3x. Here we have also switched to the force gradient integrator, but this change has not had any impact on the speed. We also report on an improved solver for DWF, which uses Möbius fermions, with a smaller fifth dimension than the original DWF fermions, as an intermediate step in the generation of solutions of the Dirac equation. This approach cuts the number of effective Dirac applications by approximately a factor of 2 when the conjugate gradient iteration count is large.
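For orientation, the baseline the abstract compares against can be sketched on a toy problem. The block below is a minimal sketch of a second-order Omelyan-type step for a one-dimensional Hamiltonian H = p^2/2 + V(q); it is not the authors' lattice code and does not include the force gradient term. The function names and the harmonic test force are assumptions made for illustration, and LAMBDA is the commonly quoted tuning value for this scheme.

```python
LAMBDA = 0.1931833275037836  # commonly quoted Omelyan tuning parameter (assumed here)


def omelyan_step(q, p, dt, force):
    """One second-order Omelyan-type step for H = p^2/2 + V(q), force = -dV/dq."""
    q += LAMBDA * dt * p
    p += 0.5 * dt * force(q)
    q += (1.0 - 2.0 * LAMBDA) * dt * p
    p += 0.5 * dt * force(q)
    q += LAMBDA * dt * p
    return q, p


def integrate(q, p, dt, steps, force):
    """Apply `steps` Omelyan steps of size dt (dt may be negative)."""
    for _ in range(steps):
        q, p = omelyan_step(q, p, dt, force)
    return q, p


def harmonic_force(q):
    """Toy force for V(q) = q^2/2 (hypothetical test problem)."""
    return -q


def energy(q, p):
    """Energy of the toy harmonic oscillator."""
    return 0.5 * (p * p + q * q)
```

Because the sequence of sub-steps is symmetric, integrating forward and then with the negated step size returns to the starting point up to rounding, which is the reversibility property molecular dynamics updates of this kind rely on.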
GORRAM: Introducing accurate operational-speed radiative transfer Monte Carlo solvers
NASA Astrophysics Data System (ADS)
Buras-Schnell, Robert; Schnell, Franziska; Buras, Allan
2016-06-01
We present a new approach for solving the radiative transfer equation in horizontally homogeneous atmospheres. The motivation was to develop a fast yet accurate radiative transfer solver to be used in operational retrieval algorithms for next generation meteorological satellites. The core component is the program GORRAM (Generator Of Really Rapid Accurate Monte-Carlo) which generates solvers individually optimized for the intended task. These solvers consist of a Monte Carlo model capable of path recycling and a representative set of photon paths. The latter is generated using the simulated annealing technique. GORRAM automatically takes advantage of limitations on the variability of the atmosphere. Due to this optimization the number of photon paths necessary for accurate results can be reduced by several orders of magnitude. For the shown example of a forward model intended for an aerosol satellite retrieval, comparison with an exact yet slow solver shows that a precision of better than 1% can be achieved with only 36 photons. The computational time is at least an order of magnitude faster than any other type of radiative transfer solver. Only the lookup table approach often used in satellite retrieval is faster, but it suffers from limited accuracy. This makes GORRAM-generated solvers an eligible candidate as forward model in operational-speed retrieval algorithms and data assimilation applications. GORRAM also has the potential to create fast solvers of other integrable equations.
Oasis: A high-level/high-performance open source Navier-Stokes solver
NASA Astrophysics Data System (ADS)
Mortensen, Mikael; Valen-Sendstad, Kristian
2015-03-01
Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a high-level, programmable user interface through the creation of highly flexible Python modules for new problems. Through the high-level Python interface the user is placed in complete control of every aspect of the solver. A version of the solver that uses piecewise linear elements for both velocity and pressure is shown to reproduce very well the classical, spectral, turbulent channel simulations of Moser et al. (1999). The computational speed is strongly dominated by the iterative solvers provided by the linear algebra backend, which is arguably the best performance any similar implicit solver using PETSc may hope for. Higher order accuracy is also demonstrated and new solvers may be easily added within the same framework.
Acceleration of FDTD mode solver by high-performance computing techniques.
Han, Lin; Xi, Yanping; Huang, Wei-Ping
2010-06-21
A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigen mode solver, and yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.
The effects of advection solvers on the performance of air quality models
Tanrikulu, S.; Odman, M.T.
1996-12-31
The available numerical solvers for the advection term in the chemical species conservation equation have different properties, and consequently introduce different types of errors. These errors can affect the performance of air quality models and lead to biases in model results. In this study, a large number of advection solvers have been studied and six of them were identified as having potential for use in photochemical models. The identified solvers were evaluated extensively using various numerical tests that are relevant to air quality simulations. Among the solvers evaluated, three showed better performance in terms of accuracy and some other characteristics such as conservation of mass and positivity. They are the solvers by Bott, Yamartino, and Dabdub and Seinfeld. These three solvers were incorporated into the SARMAP Air Quality Model (SAQM) and the August 3-6, 1990 ozone episode in the San Joaquin Valley of California was simulated with each. A model performance analysis was conducted for each simulation using the rich air quality database of the 1990 San Joaquin Valley Air Quality Study. The results of the simulations were compared with each other and the effects of advection solvers on the performance of the model are discussed.
NASA Astrophysics Data System (ADS)
Go, Ara; Millis, Andrew J.
2015-01-01
A recently proposed configuration-interaction-based impurity solver is used in combination with the single-site and four-site cluster dynamical mean field approximations to investigate the three-band copper oxide model believed to describe the electronic structure of high transition temperature copper-oxide superconductors. Use of the configuration interaction solver enables verification of the convergence of results with respect to the number of bath orbitals. The spatial correlations included in the cluster approximation substantially shift the metal-insulator phase boundary relative to the prediction of the single-site approximation and increase the predicted energy gap of the insulating phase by about 1 eV above the single-site result. Vertex corrections occurring in the four-site approximation act to dramatically increase the value of the optical conductivity near the gap edge, resulting in better agreement with the data. The calculations reveal two distinct correlated insulating states: the "magnetically correlated insulator," in which nontrivial intersite correlations play an essential role in stabilizing the insulating state, and the strongly correlated insulator, in which local physics suffices. Comparison of the calculations to the data places the cuprates in the magnetically correlated Mott insulator regime.
A new set of direct and iterative solvers for the TOUGH2 family of codes
Moridis, G.J.
1995-04-01
Two new solvers are discussed. LUBAND, the first routine, is a direct solver for banded systems and is based on an LU decomposition with partial pivoting and row interchange. BCGSTB, the second routine, is a Preconditioned Conjugate Gradient (PCG) solver with improved speed and convergence characteristics. Bandwidth minimization and gridblock ordering schemes are also introduced into TOUGH2 to improve speed and accuracy. TOUGH2 simulates fluid and heat flows in permeable media and is used for the evaluation of WIPP and TEVES (Thermal Enhanced Vapor Extraction System) that will be used to extract solvents from the Chemical Waste Landfill at Sandia National Laboratories.
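The idea behind a banded direct solver of the kind described above can be sketched as follows. This is not the LUBAND routine itself: the function name `banded_lu_solve`, the dense list-of-lists storage, and the bandwidth arguments `kl` and `ku` are assumptions made for illustration. The sketch performs LU elimination with partial pivoting and row interchange, restricting the loops to the band; row interchanges can widen the upper band to at most `kl + ku`.

```python
def banded_lu_solve(a, b, kl, ku):
    """Solve a x = b for a banded matrix a (kl sub-, ku super-diagonals).

    Illustrative sketch: the matrix is held as a dense list of rows, but all
    loops are limited to the band. Inputs are not modified.
    """
    n = len(a)
    lu = [row[:] for row in a]
    rhs = b[:]
    for k in range(n):
        last_row = min(n, k + kl + 1)        # rows below this are zero in column k
        last_col = min(n, k + kl + ku + 1)   # fill-in from row interchanges
        # Partial pivoting: largest entry in column k within the band.
        p = max(range(k, last_row), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            rhs[k], rhs[p] = rhs[p], rhs[k]
        for i in range(k + 1, last_row):
            m = lu[i][k] / lu[k][k]
            lu[i][k] = m                     # store the multiplier (L factor)
            for j in range(k + 1, last_col):
                lu[i][j] -= m * lu[k][j]
            rhs[i] -= m * rhs[k]             # forward elimination of the right-hand side
    # Back substitution with the upper factor.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        last_col = min(n, i + kl + ku + 1)
        s = rhs[i] - sum(lu[i][j] * x[j] for j in range(i + 1, last_col))
        x[i] = s / lu[i][i]
    return x
```

For a tridiagonal system one would call `banded_lu_solve(a, b, 1, 1)`. A production routine would use compact band storage rather than dense rows; the dense layout is kept here only for readability.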
A multigrid solver for semi-implicit global shallow-water models
NASA Technical Reports Server (NTRS)
Barros, Saulo R. M.; Dee, Dick P.; Dickstein, Flavio
1990-01-01
A multigrid solver is developed for the discretized two-dimensional elliptic equation on the sphere that arises from a semiimplicit time discretization of the global shallow-water equations. Different formulations of the semiimplicit scheme result in variable-coefficient Helmholtz-type equations for which no fast direct solvers are available. The efficiency of the multigrid solver is optimal, in the sense that the total operation count is proportional to the number of unknowns. Numerical experiments using initial data derived from actual 300-mb height and wind velocity fields indicate that the present model has very good accuracy and stability properties.
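The optimal-cost claim above can be illustrated with a much simpler model than the one in the paper. The block below is a minimal sketch of a multigrid V-cycle for the one-dimensional constant-coefficient Helmholtz-type problem -u'' + c u = f with zero boundary values, not the spherical variable-coefficient solver described in the abstract. The function names, the weighted Jacobi smoother, and the choice of two pre- and post-smoothing sweeps are assumptions.

```python
def residual(u, f, h, c):
    """Return f - A u for A = -d^2/dx^2 + c with zero boundary values."""
    n = len(u)
    r = []
    for j in range(n):
        left = u[j - 1] if j > 0 else 0.0
        right = u[j + 1] if j < n - 1 else 0.0
        r.append(f[j] - ((2.0 * u[j] - left - right) / h**2 + c * u[j]))
    return r


def smooth(u, f, h, c, sweeps, weight=2.0 / 3.0):
    """Weighted Jacobi relaxation."""
    diag = 2.0 / h**2 + c
    for _ in range(sweeps):
        r = residual(u, f, h, c)
        u = [uj + weight * rj / diag for uj, rj in zip(u, r)]
    return u


def v_cycle(u, f, h, c):
    """One V-cycle; len(u) must be 2**k - 1 interior points."""
    n = len(u)
    if n == 1:
        return [f[0] / (2.0 / h**2 + c)]  # exact solve on the coarsest grid
    u = smooth(u, f, h, c, 2)
    r = residual(u, f, h, c)
    nc = (n - 1) // 2
    # Full-weighting restriction: coarse point i sits at fine index 2i + 1.
    rc = [0.25 * r[2 * i] + 0.5 * r[2 * i + 1] + 0.25 * r[2 * i + 2]
          for i in range(nc)]
    ec = v_cycle([0.0] * nc, rc, 2.0 * h, c)
    # Linear interpolation of the coarse-grid correction.
    e = [0.0] * n
    for i in range(nc):
        e[2 * i + 1] += ec[i]
        e[2 * i] += 0.5 * ec[i]
        e[2 * i + 2] += 0.5 * ec[i]
    u = [uj + ej for uj, ej in zip(u, e)]
    return smooth(u, f, h, c, 2)
```

Each level does work proportional to its number of points and the grids halve in size, so the cost of one cycle is proportional to the number of unknowns on the finest grid.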
Application of an unstructured grid flow solver to planes, trains and automobiles
NASA Technical Reports Server (NTRS)
Spragle, Gregory S.; Smith, Wayne A.; Yadlin, Yoram
1993-01-01
Rampant, an unstructured flow solver developed at Fluent Inc., is used to compute three-dimensional, viscous, turbulent, compressible flow fields within complex solution domains. Rampant is an explicit, finite-volume flow solver capable of computing flow fields using either triangular (2d) or tetrahedral (3d) unstructured grids. Local time stepping, implicit residual smoothing, and multigrid techniques are used to accelerate the convergence of the explicit scheme. The paper describes the Rampant flow solver and presents flow field solutions about a plane, train, and automobile.
Cwik, T.; Jamnejad, V.; Zuffada, C.
1994-12-31
The usefulness of finite element modeling follows from the ability to accurately simulate the geometry and three-dimensional fields on the scale of a fraction of a wavelength. To make this modeling practical for engineering design, it is necessary to integrate the stages of geometry modeling and mesh generation, numerical solution of the fields (a stage heavily dependent on the efficient use of a sparse matrix equation solver), and display of field information. The stages of geometry modeling, mesh generation, and field display are commonly completed using commercially available software packages. Algorithms for the numerical solution of the fields need to be written for the specific class of problems considered. Interior problems, i.e. simulating fields in waveguides and cavities, have been successfully solved using finite element methods. Exterior problems, i.e. simulating fields scattered or radiated from structures, are more difficult to model because of the need to numerically truncate the finite element mesh. To practically compute a solution to exterior problems, the domain must be truncated at some finite surface where the Sommerfeld radiation condition is enforced, either approximately or exactly. Approximate methods attempt to truncate the mesh using only local field information at each grid point, whereas exact methods are global, needing information from the entire mesh boundary. In this work, a method is developed that couples three-dimensional finite element (FE) solutions interior to the bounding surface with an efficient integral equation (IE) solution that exactly enforces the Sommerfeld radiation condition. The bounding surface is taken to be a surface of revolution (SOR) to greatly reduce computational expense in the IE portion of the modeling.
Basis Function Approximation of Transonic Aerodynamic Influence Coefficient Matrix
NASA Technical Reports Server (NTRS)
Li, Wesley W.; Pak, Chan-gi
2011-01-01
A technique for approximating the modal aerodynamic influence coefficients matrices by using basis functions has been developed and validated. An application of the resulting approximated modal aerodynamic influence coefficients matrix for a flutter analysis in transonic speed regime has been demonstrated. This methodology can be applied to the unsteady subsonic, transonic, and supersonic aerodynamics. The method requires the unsteady aerodynamics in frequency-domain. The flutter solution can be found by the classic methods, such as rational function approximation, k, p-k, p, root-locus et cetera. The unsteady aeroelastic analysis for design optimization using unsteady transonic aerodynamic approximation is being demonstrated using the ZAERO flutter solver (ZONA Technology Incorporated, Scottsdale, Arizona). The technique presented has been shown to offer consistent flutter speed prediction on an aerostructures test wing 2 configuration with negligible loss in precision in transonic speed regime. These results may have practical significance in the analysis of aircraft aeroelastic calculation and could lead to a more efficient design optimization cycle.
Basis Function Approximation of Transonic Aerodynamic Influence Coefficient Matrix
NASA Technical Reports Server (NTRS)
Li, Wesley Waisang; Pak, Chan-Gi
2010-01-01
A technique for approximating the modal aerodynamic influence coefficients [AIC] matrices by using basis functions has been developed and validated. An application of the resulting approximated modal AIC matrix for a flutter analysis in transonic speed regime has been demonstrated. This methodology can be applied to the unsteady subsonic, transonic and supersonic aerodynamics. The method requires the unsteady aerodynamics in frequency-domain. The flutter solution can be found by the classic methods, such as rational function approximation, k, p-k, p, root-locus et cetera. The unsteady aeroelastic analysis for design optimization using unsteady transonic aerodynamic approximation is being demonstrated using the ZAERO(TradeMark) flutter solver (ZONA Technology Incorporated, Scottsdale, Arizona). The technique presented has been shown to offer consistent flutter speed prediction on an aerostructures test wing [ATW] 2 configuration with negligible loss in precision in transonic speed regime. These results may have practical significance in the analysis of aircraft aeroelastic calculation and could lead to a more efficient design optimization cycle.
NASA Technical Reports Server (NTRS)
Chang, S. C.; Wang, X. Y.; Chow, C. Y.; Himansu, A.
1995-01-01
The method of space-time conservation element and solution element is a nontraditional numerical method designed from a physicist's perspective, i.e., its development is based more on physics than numerics. It uses only the simplest approximation techniques and yet is capable of generating nearly perfect solutions for a 2-D shock reflection problem used by Helen Yee and others. In addition to providing an overall view of the new method, we introduce a new concept in the design of implicit schemes, and use it to construct a highly accurate solver for a convection-diffusion equation. It is shown that, in the inviscid case, this new scheme becomes explicit and its amplification factors are identical to those of the Leapfrog scheme. On the other hand, in the pure diffusion case, its principal amplification factor becomes the amplification factor of the Crank-Nicolson scheme.
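The two reference schemes named above have well-known amplification factors, which can be evaluated directly. The block below is a sketch using the textbook forms for the Leapfrog scheme applied to u_t + a u_x = 0 and the Crank-Nicolson scheme applied to u_t = mu u_xx on a uniform mesh; the paper's own mesh and parameter definitions may differ, and the function and parameter names (`nu` for the Courant number, `r` for mu dt/dx^2, `theta` for the phase angle) are assumptions.

```python
import cmath
import math


def leapfrog_factors(nu, theta):
    """Both roots of G^2 + 2i nu sin(theta) G - 1 = 0 (Leapfrog, pure advection)."""
    s = nu * math.sin(theta)
    root = cmath.sqrt(1.0 - s * s)
    return (-1j * s + root, -1j * s - root)


def crank_nicolson_factor(r, theta):
    """Amplification factor of Crank-Nicolson for pure diffusion."""
    q = 2.0 * r * math.sin(0.5 * theta) ** 2
    return (1.0 - q) / (1.0 + q)
```

For Courant numbers with |nu| <= 1 both Leapfrog roots have modulus one (no numerical dissipation), while the Crank-Nicolson factor stays between -1 and 1 for every r >= 0.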
An optimal iterative solver for the Stokes problem
Wathen, A.; Silvester, D.
1994-12-31
Discretisations of the classical Stokes Problem for slow viscous incompressible flow give rise to systems of equations in matrix form for the velocity u and the pressure p, where the coefficient matrix is symmetric but necessarily indefinite. The square submatrix A is symmetric and positive definite and represents a discrete (vector) Laplacian and the submatrix C may be the zero matrix or more generally will be symmetric positive semi-definite. For 'stabilised' discretisations (C ≠ 0) and discretisations which are inherently 'stable' (C = 0) and so do not admit spurious pressure components even as the mesh size h approaches zero, the Schur complement of the matrix has spectral condition number independent of h (given also that B is bounded). Here the authors will show how this property together with a multigrid preconditioner only for the Laplacian block A yields an optimal solver for the Stokes problem through use of the Minimum Residual iteration. That is, combining Minimum Residual iteration for the matrix equation with a block preconditioner which comprises a small number of multigrid V-cycles for the Laplacian block A together with a simple diagonal scaling block provides an iterative solution procedure for which the computational work grows only linearly with the problem size.
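The matrix form referred to above is not displayed in this record. A common way to write it, given here as an assumption consistent with the blocks A, B and C named in the abstract (the right-hand sides f and g, the sign placed on C, and the symbols S, P, \hat{A} and D are notational choices, not taken from the source), is:

```latex
\begin{pmatrix} A & B^{T} \\ B & -C \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f \\ g \end{pmatrix},
\qquad
S = B A^{-1} B^{T} + C,
\qquad
P = \begin{pmatrix} \hat{A} & 0 \\ 0 & D \end{pmatrix}.
```

Here S is the Schur complement whose condition number is independent of h, \hat{A} stands for the action of a few multigrid V-cycles approximating A, and D is the diagonal scaling block, so P is the block preconditioner applied within the Minimum Residual iteration.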
Towards Batched Linear Solvers on Accelerated Hardware Platforms
Haidar, Azzam; Dong, Tingzing Tim; Tomov, Stanimire; Dongarra, Jack J
2015-01-01
As hardware evolves, an increasingly effective approach to developing energy-efficient, high-performance solvers is to design them to work on many small and independent problems. Indeed, many applications already need this functionality, especially for GPUs, which are known to be currently about four to five times more energy efficient than multicore CPUs for every floating-point operation. In this paper, we describe the development of the main one-sided factorizations (LU, QR, and Cholesky) that are needed to process a set of small dense matrices in parallel. We refer to such algorithms as batched factorizations. Our approach is based on representing the algorithms as a sequence of batched BLAS routines for GPU-contained execution. Note that this is similar in functionality to the LAPACK and the hybrid MAGMA algorithms for large-matrix factorizations. But it is different from a straightforward approach, whereby each of the GPU's symmetric multiprocessors factorizes a single problem at a time. We illustrate how our performance analysis together with the profiling and tracing tools guided the development of batched factorizations to achieve up to 2-fold speedup and 3-fold better energy efficiency compared to our highly optimized batched CPU implementations based on the MKL library on a two-socket Intel Sandy Bridge server. Compared to a batched LU factorization featured in NVIDIA's CUBLAS library for GPUs, we achieve up to 2.5-fold speedup on the K40 GPU.
Algorithmic Enhancements to the VULCAN Navier-Stokes Solver
NASA Technical Reports Server (NTRS)
Litton, D. K.; Edwards, J. R.; White, J. A.
2003-01-01
VULCAN (Viscous Upwind aLgorithm for Complex flow ANalysis) is a cell centered, finite volume code used to solve high speed flows related to hypersonic vehicles. Two algorithms are presented for expanding the range of applications of the current Navier-Stokes solver implemented in VULCAN. The first addition is a highly implicit approach that uses subiterations to enhance block to block connectivity between adjacent subdomains. The addition of this scheme allows more efficient solution of viscous flows on highly-stretched meshes. The second algorithm addresses the shortcomings associated with density-based schemes by the addition of a time-derivative preconditioning strategy. High speed, compressible flows are typically solved with density based schemes, which show a high level of degradation in accuracy and convergence at low Mach numbers (M less than or equal to 0.1). With the addition of preconditioning and associated modifications to the numerical discretization scheme, the eigenvalues will scale with the local velocity, and the above problems will be eliminated. With these additions, VULCAN now has improved convergence behavior for multi-block, highly-stretched meshes and also can solve the Navier-Stokes equations for very low Mach numbers.
Generation of Minimum-Consistent DFA Using SAT Solver
NASA Astrophysics Data System (ADS)
Inui, Nobuo; Aizawa, Akiko
The purpose of this study is to develop efficient methods for the minimum-consistent DFA (deterministic finite state automaton) problem. The graph-coloring based SAT (satisfiability) approach proposed by Heule is a state-of-the-art method for this problem. It achieves especially high performance on dense problems, such as a popular benchmark problem in which rich information about labels is included. In contrast, solving sparse problems remains a challenge for the minimum-consistent DFA problem. To solve sparse problems, we propose three approaches to the SAT formulation: a) the binary color representation, b) the dynamic symmetry breaking and c) the hyper-graph coloring constraint. We organized an experiment using the existing benchmark problems and sparse problems made from them. We observed that our symmetry breaking constraints reduced the running time of the SAT solver. In addition, our other proposed methods showed the potential to improve performance. We then simulated the performance of our methods under the condition that several program set-ups are executed in parallel. Compared with previous research results, we reduced the average relative time by 66.5% and the total relative time by 7.6% for sparse problems, and by 79.7% and 38.5% for dense problems, respectively. These results show that our proposed methods are effective for difficult problems.
A generalized Poisson solver for first-principles device simulations
NASA Astrophysics Data System (ADS)
Bani-Hashemian, Mohammad Hossein; Brück, Sascha; Luisier, Mathieu; VandeVondele, Joost
2016-01-01
Electronic structure calculations of atomistic systems based on density functional theory involve solving the Poisson equation. In this paper, we present a plane-wave based algorithm for solving the generalized Poisson equation subject to periodic or homogeneous Neumann conditions on the boundaries of the simulation cell and Dirichlet type conditions imposed at arbitrary subdomains. In this way, source, drain, and gate voltages can be imposed across atomistic models of electronic devices. Dirichlet conditions are enforced as constraints in a variational framework giving rise to a saddle point problem. The resulting system of equations is then solved using a stationary iterative method in which the generalized Poisson operator is preconditioned with the standard Laplace operator. The solver can make use of any sufficiently smooth function modelling the dielectric constant, including density dependent dielectric continuum models. For all the boundary conditions, consistent derivatives are available and molecular dynamics simulations can be performed. The convergence behaviour of the scheme is investigated and its capabilities are demonstrated.
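The preconditioned stationary iteration mentioned above can be illustrated on a toy problem. The block below is a minimal sketch, not the plane-wave algorithm of the paper: it solves the one-dimensional equation -(eps(x) u')' = f with zero boundary values by finite differences and iterates u <- u + omega L^{-1} (f - A u), where A is the generalized operator and L is the standard Laplace operator used as preconditioner. The function names, the Dirichlet boundary treatment, and the damping choice omega = 2 / (eps_min + eps_max) are assumptions; the constrained saddle point part of the method is not shown.

```python
def apply_generalized(u, eps_half, h):
    """Apply A u = -(eps u')' with u = 0 at both ends.

    eps_half[j] is eps at the midpoint to the left of interior node j,
    so len(eps_half) == len(u) + 1.
    """
    n = len(u)
    out = []
    for j in range(n):
        left = u[j - 1] if j > 0 else 0.0
        right = u[j + 1] if j < n - 1 else 0.0
        flux_left = eps_half[j] * (u[j] - left)
        flux_right = eps_half[j + 1] * (right - u[j])
        out.append((flux_left - flux_right) / h**2)
    return out


def solve_laplace(r, h):
    """Solve L z = r for L = -d^2/dx^2 (tridiagonal Thomas algorithm)."""
    n = len(r)
    diag = [2.0] * n
    rhs = [ri * h**2 for ri in r]
    for j in range(1, n):
        m = -1.0 / diag[j - 1]
        diag[j] += m                 # 2 - 1/diag[j-1]
        rhs[j] -= m * rhs[j - 1]
    z = [0.0] * n
    z[-1] = rhs[-1] / diag[-1]
    for j in range(n - 2, -1, -1):
        z[j] = (rhs[j] + z[j + 1]) / diag[j]
    return z


def solve_generalized_poisson(f, eps_half, h, iterations=100):
    """Laplace-preconditioned Richardson iteration for A u = f."""
    omega = 2.0 / (min(eps_half) + max(eps_half))
    u = [0.0] * len(f)
    for _ in range(iterations):
        au = apply_generalized(u, eps_half, h)
        r = [fi - ai for fi, ai in zip(f, au)]
        z = solve_laplace(r, h)
        u = [ui + omega * zi for ui, zi in zip(u, z)]
    return u
```

In this discrete setting the eigenvalues of L^{-1} A lie between the smallest and largest value of eps, so the error contracts by roughly (eps_max - eps_min) / (eps_max + eps_min) per iteration regardless of the mesh size.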
Cooperative solutions coupling a geometry engine and adaptive solver codes
NASA Technical Reports Server (NTRS)
Dickens, Thomas P.
1995-01-01
Follow-on work has progressed in using Aero Grid and Paneling System (AGPS), a geometry and visualization system, as a dynamic real time geometry monitor, manipulator, and interrogator for other codes. In particular, AGPS has been successfully coupled with adaptive flow solvers which iterate, refining the grid in areas of interest, and continuing on to a solution. With the coupling to the geometry engine, the new grids represent the actual geometry much more accurately since they are derived directly from the geometry and do not use refits to the first-cut grids. Additional work has been done with design runs where the geometric shape is modified to achieve a desired result. Various constraints are used to point the solution in a reasonable direction which also more closely satisfies the desired results. Concepts and techniques are presented, as well as examples of sample case studies. Issues such as distributed operation of the cooperative codes versus running all codes locally and pre-calculation for performance are discussed. Future directions are considered which will build on these techniques in light of changing computer environments.
Development of parallel incompressible NS solver on stretched grids
NASA Astrophysics Data System (ADS)
Jothiprasad, G.; Caughey, D.; Pope, S. B.
2003-11-01
Development of a parallel NS solver for studying DNS and LES of temporal mixing layers is discussed. The equations are cast in strong conservation form on a uniform computational mesh, transformed from a stretched mesh in the physical domain. Variables are defined on a collocated grid, and the transformed equations are solved using a fractional step method. Convective and dissipative terms are treated using explicit Adams-Bashforth and implicit Crank-Nicolson, respectively. Fourth order spatial accuracy is maintained except for hyperviscous subgrid model terms, which are only 2nd order accurate. The block LU analysis of J. B. Perot, extended to fractional step methods on collocated grids, shows that an O(Δ t^2) term involving the pressure gradient must be added to the momentum equations to maintain 2nd order accuracy in time. Using a smaller stencil for the pressure gradients largely simplifies the pressure Poisson equation while still ensuring that discrete continuity is satisfied to appropriate order. Implementation on distributed-memory multiprocessors is achieved using MPI, with care taken to minimize communication overhead.
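The time discretization named above (explicit Adams-Bashforth for one group of terms, implicit Crank-Nicolson for the other) can be sketched on a scalar model equation du/dt = N(u) + L u. This is an illustrative sketch only, not the solver described in the abstract: the function name `ab2_cn_integrate`, the scalar setting, and the forward Euler start-up step are assumptions.

```python
import math


def ab2_cn_integrate(u0, explicit_term, implicit_coeff, dt, steps):
    """Integrate du/dt = explicit_term(u) + implicit_coeff * u.

    explicit_term is advanced with second-order Adams-Bashforth, the linear
    term with Crank-Nicolson. The first step uses forward Euler for the
    explicit term because no earlier value is available.
    """
    u = u0
    n_prev = explicit_term(u)
    for _ in range(steps):
        n_now = explicit_term(u)
        rhs = u * (1.0 + 0.5 * dt * implicit_coeff) \
            + dt * (1.5 * n_now - 0.5 * n_prev)
        u = rhs / (1.0 - 0.5 * dt * implicit_coeff)  # scalar "implicit solve"
        n_prev = n_now
    return u


if __name__ == "__main__":
    # Model problem du/dt = -u (explicit) - u (implicit), exact solution exp(-2 t).
    approx = ab2_cn_integrate(1.0, lambda u: -u, -1.0, 0.01, 200)
    print(approx, math.exp(-4.0))
```

In a flow solver the division by `1 - dt*L/2` becomes a linear solve with the discretized dissipative operator, which is where the implicit treatment removes the diffusive time step restriction.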
Verification of continuum drift kinetic equation solvers in NIMROD
Held, E. D.; Ji, J.-Y.; Kruger, S. E.; Belli, E. A.; Lyons, B. C.
2015-03-15
Verification of continuum solutions to the electron and ion drift kinetic equations (DKEs) in NIMROD [C. R. Sovinec et al., J. Comp. Phys. 195, 355 (2004)] is demonstrated through comparison with several neoclassical transport codes, most notably NEO [E. A. Belli and J. Candy, Plasma Phys. Controlled Fusion 54, 015015 (2012)]. The DKE solutions use NIMROD's spatial representation, 2D finite-elements in the poloidal plane and a 1D Fourier expansion in toroidal angle. For 2D velocity space, a novel 1D expansion in finite elements is applied for the pitch angle dependence and a collocation grid is used for the normalized speed coordinate. The full, linearized Coulomb collision operator is kept and shown to be important for obtaining quantitative results. Bootstrap currents, parallel ion flows, and radial particle and heat fluxes show quantitative agreement between NIMROD and NEO for a variety of tokamak equilibria. In addition, velocity space distribution function contours for ions and electrons show nearly identical detailed structure and agree quantitatively. A Θ-centered, implicit time discretization and a block-preconditioned, iterative linear algebra solver provide efficient electron and ion DKE solutions that ultimately will be used to obtain closures for NIMROD's evolving fluid model.
Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver
NASA Astrophysics Data System (ADS)
Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre
2014-06-01
This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
A generalized Poisson solver for first-principles device simulations.
Bani-Hashemian, Mohammad Hossein; Brück, Sascha; Luisier, Mathieu; VandeVondele, Joost
2016-01-28
Electronic structure calculations of atomistic systems based on density functional theory involve solving the Poisson equation. In this paper, we present a plane-wave based algorithm for solving the generalized Poisson equation subject to periodic or homogeneous Neumann conditions on the boundaries of the simulation cell and Dirichlet type conditions imposed at arbitrary subdomains. In this way, source, drain, and gate voltages can be imposed across atomistic models of electronic devices. Dirichlet conditions are enforced as constraints in a variational framework giving rise to a saddle point problem. The resulting system of equations is then solved using a stationary iterative method in which the generalized Poisson operator is preconditioned with the standard Laplace operator. The solver can make use of any sufficiently smooth function modelling the dielectric constant, including density dependent dielectric continuum models. For all the boundary conditions, consistent derivatives are available and molecular dynamics simulations can be performed. The convergence behaviour of the scheme is investigated and its capabilities are demonstrated. PMID:26827208
Verification of continuum drift kinetic equation solvers in NIMROD
NASA Astrophysics Data System (ADS)
Held, E. D.; Kruger, S. E.; Ji, J.-Y.; Belli, E. A.; Lyons, B. C.
2015-03-01
Verification of continuum solutions to the electron and ion drift kinetic equations (DKEs) in NIMROD [C. R. Sovinec et al., J. Comp. Phys. 195, 355 (2004)] is demonstrated through comparison with several neoclassical transport codes, most notably NEO [E. A. Belli and J. Candy, Plasma Phys. Controlled Fusion 54, 015015 (2012)]. The DKE solutions use NIMROD's spatial representation, 2D finite-elements in the poloidal plane and a 1D Fourier expansion in toroidal angle. For 2D velocity space, a novel 1D expansion in finite elements is applied for the pitch angle dependence and a collocation grid is used for the normalized speed coordinate. The full, linearized Coulomb collision operator is kept and shown to be important for obtaining quantitative results. Bootstrap currents, parallel ion flows, and radial particle and heat fluxes show quantitative agreement between NIMROD and NEO for a variety of tokamak equilibria. In addition, velocity space distribution function contours for ions and electrons show nearly identical detailed structure and agree quantitatively. A Θ-centered, implicit time discretization and a block-preconditioned, iterative linear algebra solver provide efficient electron and ion DKE solutions that ultimately will be used to obtain closures for NIMROD's evolving fluid model.
Incremental planning to control a blackboard-based problem solver
NASA Technical Reports Server (NTRS)
Durfee, E. H.; Lesser, V. R.
1987-01-01
To control problem solving activity, a planner must resolve uncertainty about which specific long-term goals (solutions) to pursue and about which sequences of actions will best achieve those goals. A planner is described that abstracts the problem solving state to recognize possible competing and compatible solutions and to roughly predict the importance and expense of developing these solutions. With this information, the planner plans sequences of problem solving activities that most efficiently resolve its uncertainty about which of the possible solutions to work toward. The planner only details actions for the near future because the results of these actions will influence how (and whether) a plan should be pursued. As problem solving proceeds, the planner adds new details to the plan incrementally, and monitors and repairs the plan to insure it achieves its goals whenever possible. Through experiments, researchers illustrate how these new mechanisms significantly improve problem solving decisions and reduce overall computation. They briefly discuss current research directions, including how these mechanisms can improve a problem solver's real-time response and can enhance cooperation in a distributed problem solving network.
NASA Astrophysics Data System (ADS)
Guda, A. A.; Guda, S. A.; Soldatov, M. A.; Lomachenko, K. A.; Bugaev, A. L.; Lamberti, C.; Gawelda, W.; Bressler, C.; Smolentsev, G.; Soldatov, A. V.; Joly, Y.
2016-05-01
The finite difference method (FDM) implemented in the FDMNES software [Phys. Rev. B, 2001, 63, 125120] was revised. Thorough analysis shows that the calculated diagonal in the FDM matrix consists of about 96% zero elements. A sparse solver is therefore more suitable for the problem than traditional Gaussian elimination for the diagonal neighbourhood. We tried several iterative sparse solvers and the direct solver MUMPS; MUMPS with METIS ordering turned out to be the best. Compared to the Gaussian solver, the present method is up to 40 times faster and allows XANES simulations for complex systems on personal computers. We show the applicability of the software to the metal-organic [Fe(bpy)₃]²⁺ complex, both for the low-spin and the high-spin states populated after laser excitation.
A novel high-order, entropy stable, 3D AMR MHD solver with guaranteed positive pressure
NASA Astrophysics Data System (ADS)
Derigs, Dominik; Winters, Andrew R.; Gassner, Gregor J.; Walch, Stefanie
2016-07-01
We describe a high-order numerical magnetohydrodynamics (MHD) solver built upon a novel non-linear entropy stable numerical flux function that supports eight travelling wave solutions. By construction the solver conserves mass, momentum, and energy and is entropy stable. The method is designed to treat the divergence-free constraint on the magnetic field in a similar fashion to a hyperbolic divergence cleaning technique. The solver described herein is especially well-suited for flows involving strong discontinuities. Furthermore, we present a new formulation to guarantee positivity of the pressure. We present the underlying theory and implementation of the new solver into the multi-physics, multi-scale adaptive mesh refinement (AMR) simulation code FLASH (http://flash.uchicago.edu)
Efficient Implementation of Multigrid Solvers on Message-Passing Parallel Systems
NASA Technical Reports Server (NTRS)
Lou, John
1994-01-01
We discuss our implementation strategies for finite difference multigrid partial differential equation (PDE) solvers on message-passing systems. Our target parallel architectures are Intel parallel computers: the Delta and Paragon systems.
Fault tolerance in an inner-outer solver: A GVR-enabled case study
Zhang, Ziming; Chien, Andrew A.; Teranishi, Keita
2015-04-18
Resilience is a major challenge for large-scale systems. It is particularly important for iterative linear solvers, since they take much of the time of many scientific applications. We show that single bit flip errors in the Flexible GMRES iterative linear solver can lead to high computational overhead or even failure to converge to the right answer. Informed by these results, we design and evaluate several strategies for fault tolerance in both inner and outer solvers appropriate across a range of error rates. We implement them, extending Trilinos’ solver library with the Global View Resilience (GVR) programming model, which provides multi-stream snapshots, multi-version data structures with portable and rich error checking/recovery. Lastly, experimental results validate correct execution with low performance overhead under varied error conditions.
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.
NASA Technical Reports Server (NTRS)
Reddy, C. J.
2000-01-01
PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to acquaint the user with the installation and operation of the code. Driver routines are given to help users integrate PCSMS routines into their own codes.
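The complex-to-real conversion described above can be sketched as follows. This is a minimal illustration, not PCSMS itself: the abstract does not say which real-equivalent form the code uses, so the block layout [[Re A, -Im A], [Im A, Re A]] is an assumption, the function name `solve_complex_via_real` is hypothetical, and a dense NumPy solve stands in for the real sparse direct solver.

```python
import numpy as np

def solve_complex_via_real(A, b):
    """Solve the complex system A x = b using only a real linear solve.

    Writing A = Ar + i*Ai, x = xr + i*xi, b = br + i*bi gives the real system
        [[Ar, -Ai],   [xr]   [br]
         [Ai,  Ar]] @ [xi] = [bi]
    """
    n = A.shape[0]
    Ar, Ai = A.real, A.imag
    K = np.block([[Ar, -Ai], [Ai, Ar]])       # real 2n x 2n equivalent matrix
    rhs = np.concatenate([b.real, b.imag])    # stacked real and imaginary parts
    z = np.linalg.solve(K, rhs)               # stand-in for a real sparse direct solver
    return z[:n] + 1j * z[n:]                 # reconvert to a complex vector

# Small test problem (random, shifted to keep it well conditioned).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)) + 4 * np.eye(4)
b = rng.standard_normal(4) + 1j * rng.standard_normal(4)
x = solve_complex_via_real(A, b)
residual = np.linalg.norm(A @ x - b)
```

The real system has twice the dimension of the complex one, which is the price paid for reusing an existing real solver unchanged.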
A new 3D Eikonal solver for accurate traveltimes, take-off angles and amplitudes
NASA Astrophysics Data System (ADS)
Noble, Mark; Gesret, Alexandrine
2013-04-01
The finite-difference approximation to the eikonal equation was first introduced by J. Vidale in 1988 to propagate first-arrival times throughout a 2D or 3D gridded velocity model. Even today this method is still very attractive from a computational point of view when dealing with large datasets. Among many domains of application, the eikonal solver may be used for 2-D or 3-D depth migration, tomography or microseismicity data analysis. The original 3D method proposed by Vidale in 1990 exhibited some degree of travel time error that may lead to poor image focusing in migration or inaccurate velocities estimated via tomographic inversion. The method even failed when large and sharp velocity contrasts were encountered. To try to overcome these limitations, many authors proposed alternative algorithms, incorporating new finite-difference operators and/or new schemes of implementing the operators to propagate the travel times through the velocity model. Although many recently published algorithms for solving the 3D eikonal equation yield fairly accurate travel times for most applications, the spatial derivatives of travel times remain very approximate and prevent reliable computation of auxiliary quantities such as take-off angle and amplitude. This limitation is due to the fact that the finite-difference operators locally assume that the wavefront is flat (plane wave). This assumption is particularly wrong close to the source, where a spherical approximation would be more suitable. To overcome this singularity at the source, some authors proposed an adaptive method that reduces inaccuracies, but at the cost of more algorithmic complexity. The objective of this study is to develop an efficient, simple 3D eikonal solver that is able to: overcome the problem of the source singularity, handle velocity models that exhibit strong vertical and horizontal velocity variations, and use different grid spacings along the x, y and z axes of the model. The final goal is of course to
The SX Solver: A New Computer Program for Analyzing Solvent-Extraction Equilibria.
Lumetta, Gregg J; McNamara, Bruce K; Rapko, Brian M
1999-01-08
A new computer program, the SX Solver, has been developed to analyze solvent-extraction equilibria. The program operates out of Microsoft Excel® and uses the built-in "Solver" function to minimize the sum of the square of the residuals between measured and calculated distribution coefficients. The extraction of nitric acid by tributyl phosphate has been modeled to illustrate the program's use.
The SX Solver: A New Computer Program for Analyzing Solvent-Extraction Equilibria
McNamara, B.K.; Rapko, B.M.; Lumetta, G.J.
1999-02-09
A new computer program, the SX Solver, has been developed to analyze solvent-extraction equilibria. The program operates out of Microsoft Excel® and uses the built-in "Solver" function to minimize the sum of the square of the residuals between measured and calculated distribution coefficients. The extraction of nitric acid by tributylphosphate has been modeled to illustrate the program's use.
The development of an intelligent interface to a computational fluid dynamics flow-solver code
NASA Technical Reports Server (NTRS)
Williams, Anthony D.
1988-01-01
Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, 3-D, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.
Implementation of a parallel unstructured Euler solver on the CM-5
NASA Technical Reports Server (NTRS)
Morano, Eric; Mavriplis, D. J.
1995-01-01
An efficient unstructured 3D Euler solver is parallelized on a Thinking Machines Corporation Connection Machine 5 (CM-5), a distributed-memory computer with vector capability. In this paper, the single instruction multiple data (SIMD) strategy is employed through the use of the CM Fortran language and the CMSSL scientific library. The performance of the CMSSL mesh partitioner is evaluated and the overall efficiency of the parallel flow solver is discussed.
Maxwell solvers for the simulations of the laser-matter interaction
NASA Astrophysics Data System (ADS)
Nuter, Rachel; Grech, Mickael; Gonzalez de Alaiza Martinez, Pedro; Bonnaud, Guy; d'Humières, Emmanuel
2014-06-01
With the advent of high intensity laser beams, solving the Maxwell equations with a dispersion-free algorithm is becoming essential. Several Maxwell solvers, implemented in Particle-In-Cell codes, have been proposed. We present here some of them by describing their computational stencil in two-dimensional geometry and defining their stability area as well as their numerical dispersion relation. Numerical simulations of Backward Raman amplification and laser wake-field are presented to compare these different solvers.
Fast Poisson, Fast Helmholtz and fast linear elastostatic solvers on rectangular parallelepipeds
Wiegmann, A.
1999-06-01
FFT-based fast Poisson and fast Helmholtz solvers on rectangular parallelepipeds for periodic boundary conditions in one-, two and three space dimensions can also be used to solve Dirichlet and Neumann boundary value problems. For non-zero boundary conditions, this is the special, grid-aligned case of jump corrections used in the Explicit Jump Immersed Interface method. Fast elastostatic solvers for periodic boundary conditions in two and three dimensions can also be based on the FFT. From the periodic solvers we derive fast solvers for the new 'normal' boundary conditions and essential boundary conditions on rectangular parallelepipeds. The periodic case allows a simple proof of existence and uniqueness of the solutions to the discretization of normal boundary conditions. Numerical examples demonstrate the efficiency of the fast elastostatic solvers for non-periodic boundary conditions. More importantly, the fast solvers on rectangular parallelepipeds can be used together with the Immersed Interface Method to solve problems on non-rectangular domains with general boundary conditions. Details of this are reported in the preprint The Explicit Jump Immersed Interface Method for 2D Linear Elastostatics by the author.
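As a rough illustration of the periodic building block mentioned above, the following sketch solves -∇²u = f on a periodic square with the FFT. It is written under stated assumptions: it uses the continuous (spectral) symbol |k|² rather than whatever finite-difference stencil the report discretizes, it fixes the free constant by returning the zero-mean solution, and the function name `poisson_periodic_2d` is hypothetical.

```python
import numpy as np

def poisson_periodic_2d(f, L=2 * np.pi):
    """Solve -lap(u) = f on the periodic square [0, L)^2 (f is n x n).

    f should have zero mean; the zero-mean solution u is returned.
    """
    n = f.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)   # angular wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                               # avoid 0/0 for the mean mode
    u_hat = np.fft.fft2(f) / k2                  # divide by the symbol of -lap
    u_hat[0, 0] = 0.0                            # pin the mean of u to zero
    return np.fft.ifft2(u_hat).real

# Manufactured solution: u = sin(x) cos(2y), so -lap(u) = (1 + 4) u.
n = 32
x = np.arange(n) * 2 * np.pi / n
X, Y = np.meshgrid(x, x, indexing="ij")
u_exact = np.sin(X) * np.cos(2 * Y)
f = 5.0 * u_exact
u = poisson_periodic_2d(f)
max_err = np.abs(u - u_exact).max()
```

The cost is dominated by two FFTs, i.e. O(n² log n) for an n × n grid. Dirichlet and Neumann problems of the kind the abstract mentions are typically reduced to this periodic case by odd or even extension, which is not shown here.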
Woodward, Carol S.; Gardner, David J.; Evans, Katherine J.
2015-01-01
Efficient solutions of global climate models require effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a step size governed by accuracy of the processes of interest rather than by stability of the fastest time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, a Newton's method is applied to solve these systems. Each iteration of the Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but this Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference approximation which may introduce inaccuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite difference approximations of these matrix-vector products for climate dynamics within the spectral element shallow water dynamical core of the Community Atmosphere Model.
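The finite difference approximation of a Jacobian-vector product referred to above is usually the one-sided quotient J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below illustrates it on a toy problem. Everything specific here is an assumption for illustration: the residual `residual` is a made-up nonlinear function (not the shallow water equations), and the step-size rule for ε is one common heuristic, not necessarily the one used in the paper.

```python
import numpy as np

def residual(u):
    # Hypothetical nonlinear residual with nearest-neighbour (periodic) coupling.
    return u**3 + 2.0 * u - np.roll(u, 1)

def jv_exact(u, v):
    # Analytic Jacobian-vector product of the toy residual, for comparison only.
    return (3.0 * u**2 + 2.0) * v - np.roll(v, 1)

def jv_finite_difference(F, u, v, eps=None):
    """Approximate J(u) @ v without forming J: (F(u + eps*v) - F(u)) / eps."""
    if eps is None:
        # Heuristic step size (an assumption): sqrt(machine eps), scaled by |u| and |v|.
        eps = (np.sqrt(np.finfo(float).eps) * (1.0 + np.linalg.norm(u))
               / max(np.linalg.norm(v), 1e-30))
    return (F(u + eps * v) - F(u)) / eps

u = np.linspace(0.1, 1.0, 8)
v = np.cos(np.arange(8.0))
approx = jv_finite_difference(residual, u, v)
exact = jv_exact(u, v)
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
```

Each product costs one extra residual evaluation. The error combines a truncation term that grows with ε and a round-off term that grows as ε shrinks, which is the source of the inaccuracy the abstract mentions.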
Patients as partners, patients as problem-solvers.
Young, Amanda; Flower, Linda
2002-01-01
This article reports our ongoing work in developing a model of health care communication called collaborative interpretation, which we define as a rhetorical practice that generates building blocks for a more complete and coherent diagnostic story and for a collaborative treatment plan. It does this by situating patients as problem-solvers. Our study begins with an analysis of provider-patient interactions in a specific setting, the emergency department (ED) of an urban trauma-level hospital, where we observed patients and providers miscommunicating in at least 3 distinct areas: over the meaning of key terms, in the framing of the immediate problem, and over the perceived role of the ED in serving the individual and the community. From our observations, we argue that all of these miscommunications and missed opportunities are rooted in mismatched expectations on the part of both provider and patient and the lack of explicit comparison and negotiation of expectations; in other words, a failure to see the patient-provider interaction as a rhetorical, knowledge-building event. In the process of observing interactions, conversing with patients and providers, and working with a team of providers and patients, we have developed an operational model of communication that could narrow the gap between the lay public and the medical profession, a gap that is especially critical in intercultural settings like the one we have studied. This model of collaborative interpretation (CI) provides strategies to help patients to represent their medical problems in the context of their life experiences and to share the logic behind their health care decisions. In addition, CI helps both patient and provider identify their goals and expectations in treatment, the obstacles that each party perceives, and the available options. It is adaptable to various settings, including short, structured conversations in the emergency room, extended dialogue between a health educator and a patient in a
A two-dimensional fast solver for arbitrary vortex distributions
Strickland, J.H.; Baty, R.S.
1997-04-01
A method which is capable of an efficient calculation of the two-dimensional stream function and velocity field produced by a large system of vortices is presented in this report. This work is based on the adaptive scheme of Carrier, Greengard, and Rokhlin with the added feature that the evaluation or target points do not have to coincide with the location of the source or vortex positions. A simple algorithm based on numerical experiments has been developed to optimize the method for cases where the number of vortices $N_V$ differs significantly from the number of target points $N_T$. The ability to specify separate source and target fields provides an efficient means for calculating boundary conditions, trajectories of passive scalar quantities, and stream-function plots, etc. Test cases have been run to benchmark the truncation errors and CPU time savings associated with the method. For six terms in the series expansions, non-dimensional truncation errors for the magnitudes of the complex potential and velocity fields are on the order of $10^{-5}$ and $10^{-3}$ respectively. The authors found that the CPU time scales as $\sqrt{N_V N_T}$ for $N_V/N_T$ in the range of 0.1 to 10. For $\sqrt{N_V N_T}$ less than 200, there is virtually no CPU time savings, while for $\sqrt{N_V N_T}$ roughly equal to 20,000, the fast solver obtains solutions in about 1% of the time required for the direct solution technique, depending somewhat upon the configuration of the vortex field and the target field.
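For context, the direct solution technique that the fast solver is compared against can be sketched as a plain double sum over vortices and target points. This is only the O(N_V N_T) baseline, not the fast multipole scheme of the report; the function name `induced_velocity` and the point-vortex convention (positive circulation is counterclockwise, u − iv = −iΓ / (2π(z − z_j))) are assumptions made for this illustration.

```python
import numpy as np

def induced_velocity(z_targets, z_vortices, gamma):
    """Direct O(N_V * N_T) evaluation of the velocity induced by point vortices.

    Positions are complex numbers x + i*y. Targets must not coincide with
    vortex positions. Returns the velocity components (u, v) at each target.
    """
    dz = z_targets[:, None] - z_vortices[None, :]      # N_T x N_V separations
    w = (-1j / (2.0 * np.pi)) * np.sum(gamma[None, :] / dz, axis=1)  # w = u - i v
    return w.real, -w.imag

# One vortex of circulation 2*pi at the origin: unit-speed counterclockwise flow
# on the unit circle.
z_vortices = np.array([0.0 + 0.0j])
gamma = np.array([2.0 * np.pi])
z_targets = np.array([1.0 + 0.0j, 0.0 + 1.0j])
u, v = induced_velocity(z_targets, z_vortices, gamma)
```

Because sources and targets are separate arrays, the same routine serves for boundary points, passive tracers, or plotting grids, which mirrors the separate source and target fields described in the abstract.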
A Radiation Transfer Solver for Athena Using Short Characteristics
NASA Astrophysics Data System (ADS)
Davis, Shane W.; Stone, James M.; Jiang, Yan-Fei
2012-03-01
We describe the implementation of a module for the Athena magnetohydrodynamics (MHD) code that solves the time-independent, multi-frequency radiative transfer (RT) equation on multidimensional Cartesian simulation domains, including scattering and non-local thermodynamic equilibrium (LTE) effects. The module is based on well known and well tested algorithms developed for modeling stellar atmospheres, including the method of short characteristics to solve the RT equation, accelerated Lambda iteration to handle scattering and non-LTE effects, and parallelization via domain decomposition. The module serves several purposes: it can be used to generate spectra and images, to compute a variable Eddington tensor (VET) for full radiation MHD simulations, and to calculate the heating and cooling source terms in the MHD equations in flows where radiation pressure is small compared with gas pressure. For the latter case, the module is combined with the standard MHD integrators using operator splitting: we describe this approach in detail, including a new constraint on the time step for stability due to radiation diffusion modes. Implementation of the VET method for radiation pressure dominated flows is described in a companion paper. We present results from a suite of test problems for both the RT solver itself and for dynamical problems that include radiative heating and cooling. These tests demonstrate that the radiative transfer solution is accurate and confirm that the operator split method is stable, convergent, and efficient for problems of interest. We demonstrate that there is no need to adopt ad hoc assumptions of questionable accuracy to solve RT problems in concert with MHD: the computational cost for our general-purpose module for simple (e.g., LTE gray) problems can be comparable to or less than a single time step of Athena's MHD integrators, and only a few times more expensive than that for more general (non-LTE) problems.
Phenomenological applications of rational approximants
NASA Astrophysics Data System (ADS)
Gonzàlez-Solís, Sergi; Masjuan, Pere
2016-08-01
We illustrate the power of Padé approximants (PAs) as a summation method and explore one of their extensions, the so-called quadratic approximants (QAs), to access both space- and (low-energy) time-like (TL) regions. As an introductory and pedagogical exercise, the function $\frac{1}{z}\ln(1+z)$ is approximated by both kinds of approximants. Then, PAs are applied to predict pseudoscalar meson Dalitz decays and to extract $V_{ub}$ from the semileptonic $B \to \pi \ell \nu_\ell$ decays. Finally, the π vector form factor in the TL region is explored using QAs.
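The pedagogical exercise above can be reproduced in a few lines. The sketch below builds the [1/1] Padé approximant of ln(1 + z)/z from its Taylor coefficients and compares it with the truncated series at z = 1. It assumes the garbled function in the abstract is ln(1 + z)/z; the helper `pade_coefficients` is a hypothetical name, and the code covers ordinary Padé approximants only, not the quadratic approximants.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def pade_coefficients(c, N, M):
    """[N/M] Pade approximant from Taylor coefficients c[0..N+M] (requires M >= 1).

    Returns ascending coefficients a (numerator) and b (denominator, b[0] = 1)
    such that a(z)/b(z) matches the Taylor series through order N + M.
    """
    c = np.asarray(c, dtype=float)
    # Denominator: sum_{j=0..M} b_j c_{k-j} = 0 for k = N+1 .. N+M.
    A = np.array([[c[k - j] if k - j >= 0 else 0.0 for j in range(1, M + 1)]
                  for k in range(N + 1, N + M + 1)])
    b = np.concatenate(([1.0], np.linalg.solve(A, -c[N + 1:N + M + 1])))
    # Numerator: a_k = sum_{j=0..min(k,M)} b_j c_{k-j} for k = 0 .. N.
    a = np.array([sum(b[j] * c[k - j] for j in range(min(k, M) + 1))
                  for k in range(N + 1)])
    return a, b

# Taylor series of ln(1+z)/z: 1 - z/2 + z^2/3 - ...
taylor = np.array([(-1.0) ** n / (n + 1) for n in range(3)])
a, b = pade_coefficients(taylor, 1, 1)

z = 1.0
pade_value = P.polyval(z, a) / P.polyval(z, b)
taylor_value = P.polyval(z, taylor)
exact_value = np.log(1.0 + z) / z
```

With these three coefficients the approximant is (1 + z/6)/(1 + 2z/3). At z = 1, on the edge of the series' radius of convergence, it lies closer to ln 2 than the truncated Taylor polynomial built from the same input.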
Approximating Functions with Exponential Functions
ERIC Educational Resources Information Center
Gordon, Sheldon P.
2005-01-01
The possibility of approximating a function with a linear combination of exponential functions of the form $e^x$, $e^{2x}$, ... is considered as a parallel development to the notion of Taylor polynomials which approximate a function with a linear combination of power function terms. The sinusoidal functions sin x and cos x…
Efficient IMRT inverse planning with a new L1-solver: template for first-order conic solver
NASA Astrophysics Data System (ADS)
Kim, Hojin; Suh, Tae-Suk; Lee, Rena; Xing, Lei; Li, Ruijiang
2012-07-01
Intensity modulated radiation therapy (IMRT) inverse planning using total-variation (TV) regularization has been proposed to reduce the complexity of fluence maps and facilitate dose delivery. Conventionally, the optimization problem with L-1 norm is solved with quadratic programming (QP), which is time consuming and memory expensive due to the second-order Newton update. This study proposes to use a new algorithm, template for first-order conic solver (TFOCS), for fast and memory-efficient optimization in IMRT inverse planning. The TFOCS utilizes dual-variable updates and first-order approaches for TV minimization without the need to compute and store the enlarged Hessian matrix required for Newton update in the QP technique. To evaluate the effectiveness and efficiency of the proposed method, two clinical cases were used for IMRT inverse planning: a head and neck case and a prostate case. For comparison, the conventional QP-based method for the TV form was adopted to solve the fluence map optimization problem in the above two cases. The convergence criteria and algorithm parameters were selected to achieve similar dose conformity for a fair comparison between the two methods. Compared with conventional QP-based approach, the proposed TFOCS-based method shows a remarkable improvement in computational efficiency for fluence map optimization, while maintaining the conformal dose distribution. Compared with QP-based algorithms, the computational speed using TFOCS for fluence optimization is increased by a factor of 4 to 6, and at the same time the memory requirement is reduced by a factor of 3 to 4. Therefore, TFOCS provides an effective, fast and memory-efficient method for IMRT inverse planning. The unique features of the approach should be particularly important in inverse planning involving a large number of beams, such as in VMAT and dense angularly sampled and sparse intensity modulated radiation therapy (DASSIM-RT).
Modeling of photon migration in the human lung using a finite volume solver
NASA Astrophysics Data System (ADS)
Sikorski, Zbigniew; Furmanczyk, Michal; Przekwas, Andrzej J.
2006-02-01
The application of the frequency domain and steady-state diffusive optical spectroscopy (DOS) and steady-state near infrared spectroscopy (NIRS) to diagnosis of the human lung injury challenges many elements of these techniques. These include the DOS/NIRS instrument performance and accurate models of light transport in heterogeneous thorax tissue. The thorax tissue not only consists of different media (e.g. chest wall with ribs, lungs) but its optical properties also vary with time due to respiration and changes in thorax geometry with contusion (e.g. pneumothorax or hemothorax). This paper presents a finite volume solver developed to model photon migration in the diffusion approximation in heterogeneous complex 3D tissues. The code applies boundary conditions that account for Fresnel reflections. We propose an effective diffusion coefficient for the void volumes (pneumothorax) based on the assumption of the Lambertian diffusion of photons entering the pleural cavity and accounting for the local pleural cavity thickness. The code has been validated using the MCML Monte Carlo code as a benchmark. The code environment enables a semi-automatic preparation of 3D computational geometry from medical images and its rapid automatic meshing. We present the application of the code to analysis/optimization of the hybrid DOS/NIRS/ultrasound technique in which ultrasound provides data on the localization of thorax tissue boundaries. The code effectiveness (3D complex case computation takes 1 second) enables its use to quantitatively relate detected light signal to absorption and reduced scattering coefficients that are indicators of the pulmonary physiologic state (hemoglobin concentration and oxygenation).
Approximate circuits for increased reliability
Hamlet, Jason R.; Mayo, Jackson R.
2015-12-22
Embodiments of the invention describe a Boolean circuit having a voter circuit and a plurality of approximate circuits each based, at least in part, on a reference circuit. The approximate circuits are each to generate one or more output signals based on values of received input signals. The voter circuit is to receive the one or more output signals generated by each of the approximate circuits, and is to output one or more signals corresponding to a majority value of the received signals. At least some of the approximate circuits are to generate an output value different than the reference circuit for one or more input signal values; however, for each possible input signal value, the majority values of the one or more output signals generated by the approximate circuits and received by the voter circuit correspond to output signal result values of the reference circuit.
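The voting idea described above can be illustrated with a minimal sketch. The reference circuit (a 3-input AND), the three approximate circuits, and all function names below are hypothetical choices made for illustration and are not taken from the patent. Each approximate circuit deliberately disagrees with the reference on one input pattern, and the patterns are chosen so that no two circuits are wrong on the same input.

```python
from itertools import product

def reference(a, b, c):
    """Hypothetical reference circuit: 3-input AND."""
    return a & b & c

def make_approx(wrong_input):
    """Build an approximate circuit that flips the reference output
    on exactly one input pattern (illustrative assumption)."""
    def circuit(a, b, c):
        out = reference(a, b, c)
        return out ^ 1 if (a, b, c) == wrong_input else out
    return circuit

# Three approximate circuits, each wrong on a different input pattern.
APPROX = [make_approx((0, 0, 1)), make_approx((0, 1, 0)), make_approx((1, 0, 0))]

def majority(bits):
    """Return 1 if more than half of the bits are 1, else 0."""
    return int(sum(bits) > len(bits) // 2)

def voted(a, b, c):
    """Voter output: majority value over all approximate circuits."""
    return majority([f(a, b, c) for f in APPROX])

# The voted output matches the reference on every possible input.
all_match = all(voted(*x) == reference(*x) for x in product((0, 1), repeat=3))
```

Because at most one of the three circuits is wrong for any given input, the majority always recovers the reference value in this toy setting.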
Approximate circuits for increased reliability
Hamlet, Jason R.; Mayo, Jackson R.
2015-08-18
Embodiments of the invention describe a Boolean circuit having a voter circuit and a plurality of approximate circuits each based, at least in part, on a reference circuit. The approximate circuits are each to generate one or more output signals based on values of received input signals. The voter circuit is to receive the one or more output signals generated by each of the approximate circuits, and is to output one or more signals corresponding to a majority value of the received signals. At least some of the approximate circuits are to generate an output value different than the reference circuit for one or more input signal values; however, for each possible input signal value, the majority values of the one or more output signals generated by the approximate circuits and received by the voter circuit correspond to output signal result values of the reference circuit.
Bounded fractional diffusion in geological media: Definition and Lagrangian approximation
Zhang, Yong; Green, Christopher T.; LaBolle, Eric M.; Neupauer, Roseanna M.; Sun, HongGuang
2016-01-01
Spatiotemporal Fractional-Derivative Models (FDMs) have been increasingly used to simulate non-Fickian diffusion, but methods have not been available to define boundary conditions for FDMs in bounded domains. This study defines boundary conditions and then develops a Lagrangian solver to approximate bounded, one-dimensional fractional diffusion. Both the zero-value and non-zero-value Dirichlet, Neumann, and mixed Robin boundary conditions are defined, where the sign of Riemann-Liouville fractional derivative (capturing non-zero-value spatial-nonlocal boundary conditions with directional super-diffusion) remains consistent with the sign of the fractional-diffusive flux term in the FDMs. New Lagrangian schemes are then proposed to track solute particles moving in bounded domains, where the solutions are checked against analytical or Eulerian solutions available for simplified FDMs. Numerical experiments show that the particle-tracking algorithm for non-Fickian diffusion differs from Fickian diffusion in relocating the particle position around the reflective boundary, likely due to the non-local and non-symmetric fractional diffusion. For a non-zero-value Neumann or Robin boundary, a source cell with a reflective face can be applied to define the release rate of random-walking particles at the specified flux boundary. Mathematical definitions of physically meaningful nonlocal boundaries combined with bounded Lagrangian solvers in this study may provide the only viable techniques at present to quantify the impact of boundaries on anomalous diffusion, expanding the applicability of FDMs from infinite domains to those with any size and boundary conditions.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
NASA Technical Reports Server (NTRS)
Sun, Xian-He
1997-01-01
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
A Computationally Efficient Multicomponent Equilibrium Solver for Aerosols (MESA)
Zaveri, Rahul A.; Easter, Richard C.; Peters, Len K.
2005-12-23
This paper describes the development and application of a new multicomponent equilibrium solver for aerosol-phase (MESA) to predict the complex solid-liquid partitioning in atmospheric particles containing H+, NH4+, Na+, Ca2+, SO4=, HSO4-, NO3-, and Cl- ions. The algorithm of MESA involves integrating the set of ordinary differential equations describing the transient precipitation and dissolution reactions for each salt until the system satisfies the equilibrium or mass convergence criteria. Arbitrary values are chosen for the dissolution and precipitation rate constants such that their ratio is equal to the equilibrium constant. Numerically, this approach is equivalent to iterating all the equilibrium reactions simultaneously with a single iteration loop. Because CaSO4 is sparingly soluble, it is assumed to exist as a solid over the entire RH range to simplify the algorithm for calcium containing particles. Temperature-dependent mutual deliquescence relative humidity polynomials (valid from 240 to 310 K) for all the possible salt mixtures were constructed using the comprehensive Pitzer-Simonson-Clegg (PSC) activity coefficient model at 298.15 K and temperature-dependent equilibrium constants in MESA. Performance of MESA is evaluated for 16 representative mixed-electrolyte systems commonly found in tropospheric aerosols using PSC and two other multicomponent activity coefficient methods – Multicomponent Taylor Expansion Method (MTEM) of Zaveri et al. [2004], and the widely-used Kusik and Meissner method (KM), and the results are compared against the predictions of the Web-based AIM Model III or available experimental data. Excellent agreement was found between AIM, MESA-PSC, and MESA-MTEM predictions of the multistage deliquescence growth as a function of RH. On the other hand, MESA-KM displayed up to 20% deviations in the mass growth factors for common salt mixtures in the sulfate-poor cases while significant discrepancies were found in the predicted multistage
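The pseudo-transient approach described above (integrating dissolution and precipitation rates until equilibrium, with arbitrary rate constants whose ratio equals the equilibrium constant) can be sketched for a single hypothetical salt MX(s) dissolving into M+ and X-. This is a toy illustration under stated assumptions, not the MESA algorithm: it uses ideal activities, explicit Euler steps, and made-up values for the solubility product, the total amount, and the step size.

```python
def equilibrate(ksp, total, k_prec=1.0, dt=0.1, tol=1e-12, max_steps=100_000):
    """Integrate dm/dt = k_diss - k_prec * m**2 until the net rate vanishes.

    m     : dissolved amount of M+ (equal to X-), ideal solution assumed
    ksp   : equilibrium (solubility product) constant, hypothetical value
    total : total salt available (solid + dissolved)
    The rate constants are arbitrary except that k_diss / k_prec == ksp.
    """
    k_diss = ksp * k_prec
    m = 0.0  # start with all salt in the solid phase
    for _ in range(max_steps):
        solid = total - m
        rate = k_diss - k_prec * m * m
        if solid <= 0.0 and rate > 0.0:
            rate = 0.0  # nothing left to dissolve
        if abs(rate) < tol:
            break
        m = min(total, m + dt * rate)  # explicit Euler step, capped by supply
    return m, total - m

# Same equilibrium for different (arbitrary) rate constants.
dissolved_slow, solid_slow = equilibrate(ksp=0.04, total=1.0, k_prec=1.0)
dissolved_fast, solid_fast = equilibrate(ksp=0.04, total=1.0, k_prec=10.0)
```

In this sketch both runs converge to the same dissolved amount, which reflects the point made in the abstract: only the ratio of the rate constants, not their individual values, determines the equilibrium state.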
In view of accelerating CFD simulations through coupling with vortex particle approximations
NASA Astrophysics Data System (ADS)
Papadakis, Giorgos; Voutsinas, Spyros G.
2014-06-01
In order to exploit the capabilities of Computational Fluid Dynamics in aerodynamic design, the cost should be reduced without compromising accuracy and consistency. In this direction a hybrid methodology is formulated within the context of domain decomposition. The strategy is to choose in each sub-domain the best performing method. Close to solid boundaries a grid-based Eulerian flow solver is used while in the far field the flow is described in Lagrangian coordinates using particle approximations. Aiming at consistently including compressible effects, particles carry mass, dilatation, vorticity and energy and the complete set of conservation laws is solved in Lagrangian coordinates. At software level, the URANS solver MaPFlow is coupled to the vortex code GENUVP. In the present paper the two-dimensional formulation is given along with validation tests around airfoils in steady and inherently unsteady conditions. It is verified that: purely Eulerian and hybrid simulations are equivalent; the Eulerian domain in the hybrid solver can be effectively restricted to a layer 1.5 chord lengths wide; a significant cost reduction, reaching a ratio of up to 1:3, is achieved.
Application of NASA General-Purpose Solver to Large-Scale Computations in Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, Willie R.; Storaasli, Olaf O.
2004-01-01
Of several iterative and direct equation solvers evaluated previously for computations in aeroacoustics, the most promising was the NASA-developed General-Purpose Solver (winner of NASA's 1999 software of the year award). This paper presents detailed, single-processor statistics of the performance of this solver, which has been tailored and optimized for large-scale aeroacoustic computations. The statistics, compiled using an SGI ORIGIN 2000 computer with 12 Gb available memory (RAM) and eight available processors, are the central processing unit time, RAM requirements, and solution error. The equation solver is capable of solving 10 thousand complex unknowns in as little as 0.01 sec using 0.02 Gb RAM, and 8.4 million complex unknowns in slightly less than 3 hours using all 12 Gb. This latter solution is the largest aeroacoustics problem solved to date with this technique. The study was unable to detect any noticeable error in the solution, since noise levels predicted from these solution vectors are in excellent agreement with the noise levels computed from the exact solution. The equation solver provides a means for obtaining numerical solutions to aeroacoustics problems in three dimensions.
Finite Element Interface to Linear Solvers (FEI) version 2.9 : users guide and reference manual.
Williams, Alan B.
2005-02-01
The Finite Element Interface to Linear Solvers (FEI) is a linear system assembly library. Sparse systems of linear equations arise in many computational engineering applications, and the solution of linear systems is often the most computationally intensive portion of the application. Depending on the complexity of problems addressed by the application, there may be no single solver package capable of solving all of the linear systems that arise. This motivates the need to switch an application from one solver library to another, depending on the problem being solved. The interfaces provided by various solver libraries for data assembly and problem solution differ greatly, making it difficult to switch an application code from one library to another. The amount of library-specific code in an application can be greatly reduced by having an abstraction layer that puts a 'common face' on various solver libraries. The FEI has seen significant use by finite element applications at Sandia National Laboratories and Lawrence Livermore National Laboratory. The original FEI offered several advantages over using linear algebra libraries directly, but also imposed significant limitations and disadvantages. A new set of interfaces has been added with the goal of removing the limitations of the original FEI while maintaining and extending its strengths.
Naff, Richard L.; Banta, Edward R.
2008-01-01
The preconditioned conjugate gradient with improved nonlinear control (PCGN) package provides additional means by which the solution of nonlinear ground-water flow problems can be controlled as compared to existing solver packages for MODFLOW. Picard iteration is used to solve nonlinear ground-water flow equations by iteratively solving a linear approximation of the nonlinear equations. The linear solution is provided by means of the preconditioned conjugate gradient algorithm where preconditioning is provided by the modified incomplete Cholesky algorithm. The incomplete Cholesky scheme incorporates two levels of fill, 0 and 1, in which the pivots can be modified so that the row sums of the preconditioning matrix and the original matrix are approximately equal. A relaxation factor is used to implement the modified pivots, which determines the degree of modification allowed. The effects of fill level and degree of pivot modification are briefly explored by means of a synthetic, heterogeneous finite-difference matrix; results are reported in the final section of this report. The preconditioned conjugate gradient method is coupled with Picard iteration so as to efficiently solve the nonlinear equations associated with many ground-water flow problems. The description of this coupling of the linear solver with Picard iteration is a primary concern of this document.
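The Picard iteration mentioned above (freeze the nonlinear coefficient at the current iterate, solve the resulting linear problem, repeat) can be sketched on a scalar toy problem. The equation T(h) h = q with T(h) = 1 + h^2 and all names below are hypothetical and are not taken from the PCGN package; in a real ground-water model the linear solve would be a preconditioned conjugate gradient solve of a sparse system rather than a scalar division.

```python
def picard_solve(q, h0=0.0, tol=1e-12, max_iter=200):
    """Solve T(h) * h = q with T(h) = 1 + h**2 by Picard iteration.

    Each step evaluates the coefficient at the previous iterate and
    solves the resulting *linear* equation T(h_old) * h_new = q.
    Returns the solution estimate and the number of iterations used.
    """
    h = h0
    for iteration in range(1, max_iter + 1):
        coeff = 1.0 + h * h          # coefficient lagged at the old iterate
        h_new = q / coeff            # linear solve (scalar stand-in)
        if abs(h_new - h) < tol:     # convergence check on the update size
            return h_new, iteration
        h = h_new
    raise RuntimeError("Picard iteration did not converge")

h_star, n_iter = picard_solve(q=1.0)
residual = (1.0 + h_star ** 2) * h_star - 1.0
```

For this particular toy problem the iteration is a contraction near the solution, so it converges from the starting guess used here; that is a property of the example, not a general guarantee for Picard iteration.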
Gardner, David; Woodward, Carol S.; Evans, Katherine J
2015-01-01
Efficient solution of global climate models requires effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a time step dictated by accuracy of the processes of interest rather than by stability governed by the fastest of the time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, Newton's method is applied for these systems. Each iteration of Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but this Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference, which may show a loss of accuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite-difference approximations of these matrix-vector products for climate dynamics within the spectral-element based shallow-water dynamical-core of the Community Atmosphere Model (CAM).
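The finite-difference approximation of a Jacobian-vector product discussed above is commonly written as J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below compares it with an analytic product on a small hypothetical residual function; the residual, the perturbation-size heuristic, and all names are assumptions chosen for illustration and do not come from CAM.

```python
import math
import sys

def residual(u):
    """Hypothetical 2-equation nonlinear residual F(u)."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 3 - 5.0]

def jv_exact(u, v):
    """Analytic Jacobian-vector product for the residual above."""
    return [2.0 * u[0] * v[0] + v[1], v[0] + 3.0 * u[1] ** 2 * v[1]]

def jv_finite_difference(F, u, v):
    """Approximate J(u) v by a one-sided difference of the residual."""
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = math.sqrt(sum(x * x for x in u))
    # One common heuristic for the perturbation size (an assumption here).
    eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    f0 = F(u)
    f1 = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(f1, f0)]

u, v = [1.0, 2.0], [0.5, -1.0]
approx = jv_finite_difference(residual, u, v)
exact = jv_exact(u, v)
max_error = max(abs(a - e) for a, e in zip(approx, exact))
```

The difference between `approx` and `exact` is small but nonzero, which is the accuracy trade-off the abstract refers to: the finite-difference form avoids building the Jacobian at the cost of truncation and round-off error that depends on the choice of ε.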
NASA Astrophysics Data System (ADS)
Daude, F.; Galon, P.
2016-01-01
Computation of compressible two-phase flows with the unsteady compressible Baer-Nunziato model in conjunction with the moving grid approach is discussed in this paper. Both HLL- and HLLC-type Finite-Volume methods are presented and implemented in the context of Arbitrary Lagrangian-Eulerian formulation in a multidimensional framework. The construction of suitable numerical methods is linked to proper approximations of the non-conservative terms on moving grids. The HLL discretization follows global conservation properties such as free-stream preservation and uniform pressure and velocity profiles preservation on moving grids. The HLLC solver initially proposed by Tokareva and Toro [1] for the Baer-Nunziato model is based on an approximate solution of local Riemann problems containing all the characteristic fields present in the exact solution. Both "subsonic" and "supersonic" configurations are considered in the construction of the present HLLC solver. In addition, an adaptive 6-wave HLLC scheme is also proposed for computational efficiency. The methods are first assessed on a variety of 1-D Riemann problems including both fixed and moving grids applications. The methods are finally tested on 2-D and 3-D applications: 2-D Riemann problems, a 2-D shock-bubble interaction and finally a 3-D fluid-structure interaction problem with a good agreement with the experiments.
Mathematical algorithms for approximate reasoning
NASA Technical Reports Server (NTRS)
Murphy, John H.; Chay, Seung C.; Downs, Mary M.
1988-01-01
Most state-of-the-art expert system environments contain a single and often ad hoc strategy for approximate reasoning. Some environments provide facilities to program the approximate reasoning algorithms. However, the next generation of expert systems should have an environment which contains a choice of several mathematical algorithms for approximate reasoning. To meet the need for validatable and verifiable coding, the expert system environment must no longer depend upon ad hoc reasoning techniques but instead must include mathematically rigorous techniques for approximate reasoning. Popular approximate reasoning techniques are reviewed, including: certainty factors, belief measures, Bayesian probabilities, fuzzy logic, and Shafer-Dempster techniques for reasoning. The focus is on a group of mathematically rigorous algorithms for approximate reasoning that could form the basis of a next generation expert system environment. These algorithms are based upon the axioms of set theory and probability theory. To separate these algorithms for approximate reasoning, various conditions of mutual exclusivity and independence are imposed upon the assertions. Approximate reasoning algorithms presented include: reasoning with statistically independent assertions, reasoning with mutually exclusive assertions, reasoning with assertions that exhibit minimum overlay within the state space, reasoning with assertions that exhibit maximum overlay within the state space (i.e. fuzzy logic), pessimistic reasoning (i.e. worst case analysis), optimistic reasoning (i.e. best case analysis), and reasoning with assertions with absolutely no knowledge of the possible dependency among the assertions. A robust environment for expert system construction should include the two modes of inference: modus ponens and modus tollens. Modus ponens inference is based upon reasoning towards the conclusion in a statement of logical implication, whereas modus tollens inference is based upon reasoning away
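Several of the dependency assumptions listed above lead to different values for the probability that two assertions hold together. The sketch below shows the standard probability-theory formulas for the conjunction P(A and B) under each assumption; the function names and the example probabilities are hypothetical, and the code is not taken from the report.

```python
def and_independent(pa, pb):
    """Statistically independent assertions: P(A and B) = P(A) * P(B)."""
    return pa * pb

def and_mutually_exclusive(pa, pb):
    """Mutually exclusive assertions can never hold together."""
    if pa + pb > 1.0:
        raise ValueError("mutually exclusive assertions need P(A) + P(B) <= 1")
    return 0.0

def and_max_overlap(pa, pb):
    """Maximum overlap in the state space (the fuzzy-logic 'min' rule)."""
    return min(pa, pb)

def and_min_overlap(pa, pb):
    """Minimum overlap in the state space."""
    return max(0.0, pa + pb - 1.0)

def and_unknown_dependency(pa, pb):
    """No knowledge of dependency: only an interval can be stated."""
    return and_min_overlap(pa, pb), and_max_overlap(pa, pb)

pa, pb = 0.7, 0.6
low, high = and_unknown_dependency(pa, pb)
independent = and_independent(pa, pb)
```

With these example values the independent result lies inside the interval returned for the unknown-dependency case, which is always true because the minimum-overlap and maximum-overlap values bound every possible joint probability.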
Approximating random quantum optimization problems
NASA Astrophysics Data System (ADS)
Hsu, B.; Laumann, C. R.; Läuchli, A. M.; Moessner, R.; Sondhi, S. L.
2013-06-01
We report a cluster of results regarding the difficulty of finding approximate ground states to typical instances of the quantum satisfiability problem k-body quantum satisfiability (k-QSAT) on large random graphs. As an approximation strategy, we optimize the solution space over “classical” product states, which in turn introduces a novel autonomous classical optimization problem, PSAT, over a space of continuous degrees of freedom rather than discrete bits. Our central results are (i) the derivation of a set of bounds and approximations in various limits of the problem, several of which we believe may be amenable to a rigorous treatment; (ii) a demonstration that an approximation based on a greedy algorithm borrowed from the study of frustrated magnetism performs well over a wide range in parameter space, and its performance reflects the structure of the solution space of random k-QSAT. Simulated annealing exhibits metastability in similar “hard” regions of parameter space; and (iii) a generalization of belief propagation algorithms introduced for classical problems to the case of continuous spins. This yields both approximate solutions, as well as insights into the free energy “landscape” of the approximation problem, including a so-called dynamical transition near the satisfiability threshold. Taken together, these results allow us to elucidate the phase diagram of random k-QSAT in a two-dimensional energy-density-clause-density space.
Shu, Yu-Chen; Chern, I-Liang; Chang, Chien C.
2014-10-15
Most elliptic interface solvers become complicated for complex interface problems at those “exceptional points” where there are not enough neighboring interior points for high order interpolation. Such complication increases especially in three dimensions. Usually, the solvers are thus reduced to low order accuracy. In this paper, we classify these exceptional points and propose two recipes to maintain order of accuracy there, aiming at improving the previous coupling interface method [26]. Yet the idea is also applicable to other interface solvers. The main idea is to have at least first order approximations for second order derivatives at those exceptional points. Recipe 1 is to use the finite difference approximation for the second order derivatives at a nearby interior grid point, whenever this is possible. Recipe 2 is to flip domain signatures and introduce a ghost state so that a second-order method can be applied. This ghost state is a smooth extension of the solution at the exceptional point from the other side of the interface. The original state is recovered by post-processing using nearby states and jump conditions. The choice of recipes is determined by a classification scheme of the exceptional points. The method renders the solution and its gradient uniformly second-order accurate in the entire computed domain. Numerical examples are provided to illustrate the second order accuracy of the presently proposed method in approximating the gradients of the original states for some complex interfaces which we had tested previously in two and three dimensions, and for a real molecule (1D63) which has a double-helix shape and is composed of hundreds of atoms.
A multiscale two-point flux-approximation method
Møyner, Olav; Lie, Knut-Andreas
2014-10-15
A large number of multiscale finite-volume methods have been developed over the past decade to compute conservative approximations to multiphase flow problems in heterogeneous porous media. In particular, several iterative and algebraic multiscale frameworks that seek to reduce the fine-scale residual towards machine precision have been presented. Common for all such methods is that they rely on a compatible primal–dual coarse partition, which makes it challenging to extend them to stratigraphic and unstructured grids. Herein, we propose a general idea for how one can formulate multiscale finite-volume methods using only a primal coarse partition. To this end, we use two key ingredients that are computed numerically: (i) elementary functions that correspond to flow solutions used in transmissibility upscaling, and (ii) partition-of-unity functions used to combine elementary functions into basis functions. We exemplify the idea by deriving a multiscale two-point flux-approximation (MsTPFA) method, which is robust with regards to strong heterogeneities in the permeability field and can easily handle general grids with unstructured fine- and coarse-scale connections. The method can easily be adapted to arbitrary levels of coarsening, and can be used both as a standalone solver and as a preconditioner. Several numerical experiments are presented to demonstrate that the MsTPFA method can be used to solve elliptic pressure problems on a wide variety of geological models in a robust and efficient manner.
Nonadiabatic charged spherical evolution in the postquasistatic approximation
Rosales, L.; Barreto, W.; Peralta, C.; Rodriguez-Mueller, B.
2010-10-15
We apply the postquasistatic approximation, an iterative method for the evolution of self-gravitating spheres of matter, to study the evolution of dissipative and electrically charged distributions in general relativity. The numerical implementation of our approach leads to a solver which is globally second-order convergent. We evolve nonadiabatic distributions assuming an equation of state that accounts for the anisotropy induced by the electric charge. Dissipation is described by streaming-out or diffusion approximations. We match the interior solution, in noncomoving coordinates, with the Vaidya-Reissner-Nordstroem exterior solution. Two models are considered: (i) a Schwarzschild-like shell in the diffusion limit; and (ii) a Schwarzschild-like interior in the free-streaming limit. These toy models tell us something about the nature of the dissipative and electrically charged collapse. Diffusion stabilizes the gravitational collapse producing a spherical shell whose contraction is halted in a short characteristic hydrodynamic time. The streaming-out radiation provides a more efficient mechanism for emission of energy, redistributing the electric charge on the whole sphere, while the distribution collapses indefinitely with a longer hydrodynamic time scale.
NASA Technical Reports Server (NTRS)
Hartung, Lin C.
1991-01-01
A method for predicting radiation absorption and emission coefficients in thermochemical nonequilibrium flows is developed. The method is called the Langley optimized radiative nonequilibrium code (LORAN). It applies the smeared band approximation for molecular radiation to produce moderately detailed results and is intended to fill the gap between detailed but costly prediction methods and very fast but highly approximate methods. The optimization of the method to provide efficient solutions allowing coupling to flowfield solvers is discussed. Representative results are obtained and compared to previous nonequilibrium radiation methods, as well as to ground- and flight-measured data. Reasonable agreement is found in all cases. A multidimensional radiative transport method is also developed for axisymmetric flows. Its predictions for wall radiative flux are 20 to 25 percent lower than those of the tangent slab transport method, as expected, though additional investigation of the symmetry and outflow boundary conditions is indicated. The method was applied to the peak heating condition of the aeroassist flight experiment (AFE) trajectory, with results comparable to predictions from other methods. The LORAN method was also applied in conjunction with the computational fluid dynamics (CFD) code LAURA to study the sensitivity of the radiative heating prediction to various models used in nonequilibrium CFD. This study suggests that radiation measurements can provide diagnostic information about the detailed processes occurring in a nonequilibrium flowfield because radiation phenomena are very sensitive to these processes.
Using a multifrontal sparse solver in a high performance, finite element code
NASA Technical Reports Server (NTRS)
King, Scott D.; Lucas, Robert; Raefsky, Arthur
1990-01-01
We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
Wavelet-based Poisson Solver for use in Particle-In-Cell Simulations
Terzic, B.; Mihalcea, D.; Bohn, C.L.; Pogorelov, I.V.
2005-05-13
We report on a successful implementation of a wavelet-based Poisson solver for use in 3D particle-in-cell (PIC) simulations. One new aspect of our algorithm is its ability to treat the general (inhomogeneous) Dirichlet boundary conditions (BCs). The solver harnesses advantages afforded by the wavelet formulation, such as sparsity of operators and data sets, existence of effective preconditioners, and the ability to simultaneously remove numerical noise and further compress relevant data sets. Having tested our method as a stand-alone solver on two model problems, we merged it into IMPACT-T to obtain a fully functional serial PIC code. We present and discuss preliminary results of application of the new code to the modeling of the Fermilab/NICADD and AES/JLab photoinjectors.
Implementation of An Implicit Unsplit Staggered Mesh MHD Solver in FLASH
NASA Astrophysics Data System (ADS)
Xia, G.; Lee, D.
2010-11-01
FLASH is a publicly available community code designed to solve highly compressible multi-physics reactive flows. We have been adding capabilities to FLASH to make it an open science code for the academic HEDP community. A key need is to provide a computationally efficient time-stepping integration method that overcomes the stiffness that arises in the equations describing a physical problem when there are disparate time scales. To address this problem, we are developing a fully implicit solver based on a Jacobian-Free Newton-Krylov implicit formulation. The method has been integrated into a robust, efficient, and high-order accurate Unsplit Staggered Mesh MHD (USM) solver. We are also integrating this solver into an anisotropic Spitzer-Braginskii conductivity model to treat thermal heat conduction along magnetic field lines, and into a treatment of the Biermann Battery effect that accounts for spontaneous generation of magnetic fields in the presence of non-parallel temperature and density gradients.
Prediction of ship resistance in head waves using RANS based solver
NASA Astrophysics Data System (ADS)
Islam, Hafizul; Akimoto, Hiromichi
2016-07-01
Maneuverability prediction of ships using CFD has gained high popularity over the years because of its improving accuracy and economics. This paper discusses the estimation of calm water and added resistance properties of a KVLCC2 model using a light and economical RANS-based solver, called SHIP_Motion. The solver solves an overset structured mesh using the finite volume method. In the calm water test, total drag coefficient, sinkage and trim values were predicted together with mesh dependency analysis and compared with experimental data. For added resistance in head sea, short wave cases were simulated and compared with experimental and other simulation data. Overall the results were well predicted and showed good agreement with comparative data. The paper concludes that it is possible to predict ship maneuverability characteristics using the present solver with reasonable accuracy, using minimal computational resources and within acceptable time.
EUPDF: An Eulerian-Based Monte Carlo Probability Density Function (PDF) Solver. User's Manual
NASA Technical Reports Server (NTRS)
Raju, M. S.
1998-01-01
EUPDF is an Eulerian-based Monte Carlo PDF solver developed for application with sprays, combustion, parallel computing and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and spray solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type. The manual provides the user with the coding required to couple the PDF code to any given flow code and a basic understanding of the EUPDF code structure as well as the models involved in the PDF formulation. The source code of EUPDF will be available with the release of the National Combustion Code (NCC) as a complete package.
Numerical Investigation of Vertical Plunging Jet Using a Hybrid Multifluid–VOF Multiphase CFD Solver
Shonibare, Olabanji Y.; Wardle, Kent E.
2015-01-01
A novel hybrid multiphase flow solver has been used to conduct simulations of a vertical plunging liquid jet. This solver combines a multifluid methodology with selective interface sharpening to enable simulation of both the initial jet impingement and the long-time entrained bubble plume phenomena. Models are implemented for variable bubble size capturing and dynamic switching of interface sharpened regions to capture transitions between the initially fully segregated flow types into the dispersed bubbly flow regime. It was found that the solver was able to capture the salient features of the flow phenomena under study and areas for quantitative improvement have been explored and identified. In particular, a population balance approach is employed and detailed calibration of the underlying models with experimental data is required to enable quantitative prediction of bubble size and distribution to capture the transition between segregated and dispersed flow types with greater fidelity.
A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids
NASA Technical Reports Server (NTRS)
Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)
2001-01-01
This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.
Flutter and Forced Response Analyses of Cascades using a Two-Dimensional Linearized Euler Solver
NASA Technical Reports Server (NTRS)
Reddy, T. S. R.; Srivastava, R.; Mehmed, O.
1999-01-01
Flutter and forced response analyses for a cascade of blades in subsonic and transonic flow are presented. The structural model for each blade is a typical section with bending and torsion degrees of freedom. The unsteady aerodynamic forces due to bending and torsion motions and due to a vortical gust disturbance are obtained by solving unsteady linearized Euler equations. The unsteady linearized equations are obtained by linearizing the unsteady nonlinear equations about the steady flow. The predicted unsteady aerodynamic forces include the effect of steady aerodynamic loading due to airfoil shape, thickness and angle of attack. The aeroelastic equations are solved in the frequency domain by coupling the unsteady aerodynamic forces to the aeroelastic solver MISER. The present unsteady aerodynamic solver showed good correlation with published results for both flutter and forced response predictions. Further improvements are required to use the unsteady aerodynamic solver in a design cycle.
3-D seakeeping analysis with water on deck and slamming. Part 1: Numerical solver
NASA Astrophysics Data System (ADS)
Greco, M.; Lugni, C.
2012-08-01
A three-dimensional seakeeping numerical solver is developed to handle occurrence and effects of water-on-deck and bottom slamming. It couples (A) the rigid-ship motions with (B) the water flowing along the deck and (C) bottom slamming events. Problem A is studied with a 3-D weakly nonlinear potential flow solver based on the weak-scatterer hypothesis. Problem B, and so local and global induced green-water loads, are investigated by assuming shallow-water conditions onto the deck. Problem C is examined through a Wagner-type wedge-impact analysis. Within the coupling between A and B: the external seakeeping problem furnishes the initial and boundary conditions to the in-deck solver in terms of water level and velocity along the deck profile; in return the shallow-water problem makes available to the seakeeping solver the green-water loads to be introduced as additional loads into the rigid-motion equations. Within the coupling between A and C: the instantaneous ship configuration and its kinematic and dynamic conditions with respect to the incident waves will fix the parameters for the local impact problem; in return the slamming and water-entry pressures are integrated in the vessel region of interest and introduced as additional loads into the rigid-motion equations. The resulting numerical solver can study efficiently the ship interaction with regular and irregular sea states and the forward motion with limited speed of the vessel. This is crucial to perform reliable and feasible statistical investigations of vessel behavior. Main elements of the solver are described and validated against reference numerical solutions and model tests.
Wavelet Sparse Approximate Inverse Preconditioners
NASA Technical Reports Server (NTRS)
Chan, Tony F.; Tang, W.-P.; Wan, W. L.
1996-01-01
There is an increasing interest in using sparse approximate inverses as preconditioners for Krylov subspace iterative methods. Recent studies of Grote and Huckle and of Chow and Saad also show that sparse approximate inverse preconditioners can be effective for a variety of matrices, e.g. the Harwell-Boeing collections. Nonetheless, a drawback is that the approach requires rapid decay of the inverse entries so that a sparse approximate inverse is possible. However, for the class of matrices that come from elliptic PDE problems, this assumption may not necessarily hold. Our main idea is to look for a basis, other than the standard one, such that a sparse representation of the inverse is feasible. A crucial observation is that the kind of matrices we are interested in typically have a piecewise smooth inverse. We exploit this fact by applying wavelet techniques to construct a better sparse approximate inverse in the wavelet basis. We shall justify theoretically and numerically that our approach is effective for matrices with smooth inverse. We emphasize that in this paper we have only presented the idea of wavelet approximate inverses and demonstrated its potential but have not yet developed a highly refined and efficient algorithm.
Wavelet-based Poisson solver for use in particle-in-cell simulations.
Terzić, Balsa; Pogorelov, Ilya V
2005-06-01
We report on a successful implementation of a wavelet-based Poisson solver for use in three-dimensional particle-in-cell simulations. Our method harnesses advantages afforded by the wavelet formulation, such as sparsity of operators and data sets, existence of effective preconditioners, and the ability to simultaneously remove numerical noise and further compress relevant data sets. We present and discuss preliminary results relating to the application of the new solver to test problems in accelerator physics and astrophysics. PMID:15980304
Nearly Interactive Parabolized Navier-Stokes Solver for High Speed Forebody and Inlet Flows
NASA Technical Reports Server (NTRS)
Benson, Thomas J.; Liou, May-Fun; Jones, William H.; Trefny, Charles J.
2009-01-01
A system of computer programs is being developed for the preliminary design of high speed inlets and forebodies. The system comprises four functions: geometry definition, flow grid generation, flow solver, and graphics post-processor. The system runs on a dedicated personal computer using the Windows operating system and is controlled by graphical user interfaces written in MATLAB (The Mathworks, Inc.). The flow solver uses the Parabolized Navier-Stokes equations to compute millions of mesh points in several minutes. Sample two-dimensional and three-dimensional calculations are demonstrated in the paper.
Trust-region based solver for nonlinear transport in heterogeneous porous media
NASA Astrophysics Data System (ADS)
Wang, Xiaochen; Tchelepi, Hamdi A.
2013-11-01
We describe a new nonlinear solver for immiscible two-phase transport in porous media, where viscous, buoyancy, and capillary forces are significant. The flux (fractional flow) function, F, is a nonlinear function of saturation and typically has inflection points and can be non-monotonic. The non-convexity and non-monotonicity of F are major sources of difficulty for nonlinear solvers of coupled multiphase flow and transport in natural porous media. We describe a modified Newton algorithm that employs trust regions of the flux function to guide the Newton iterations. The flux function is divided into saturation trust regions delineated by the inflection, unit-flux, and end points. The updates are performed such that two successive iterations cannot cross any trust-region boundary. If a crossing is detected, the saturation value is chopped back to the appropriate trust-region boundary. The proposed trust-region Newton solver, which is demonstrated across the parameter space of viscous, buoyancy and capillary effects, is a significant extension of the inflection-point strategy of Jenny et al. (JCP, 2009) [5] for viscous dominated flows. We analyze the discrete nonlinear transport equation obtained using finite-volume discretization with phase-based upstream weighting. Then, we prove convergence of the trust-region Newton method irrespective of the timestep size for single-cell problems. Numerical results across the full range of the parameter space of viscous, gravity and capillary forces indicate that our trust-region scheme is unconditionally convergent for 1D transport. That is, for a given choice of timestep size, the unique discrete solution is found independently of the initial guess. For problems dominated by buoyancy and capillarity, the trust-region Newton solver overcomes the often severe limits on timestep size associated with existing methods. To validate the effectiveness of the new nonlinear solver for large reservoir models with strong heterogeneity
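The chop-back rule described in this abstract can be illustrated with a minimal sketch. It assumes the trust-region boundaries (inflection, unit-flux, and end points) are already known and passed in as a sorted list; the function name and the sample boundary values are hypothetical and are not taken from the paper.

```python
def chop_to_trust_region(s_old, s_new, boundaries):
    """Limit a Newton saturation update so it does not cross a boundary.

    boundaries: sorted saturation values delimiting the trust regions
    (hypothetical input; the abstract names inflection, unit-flux and
    end points of the flux function as the delimiters).
    """
    # Keep the proposed iterate inside the overall saturation range.
    s_new = min(max(s_new, boundaries[0]), boundaries[-1])
    if s_new > s_old:
        # Boundaries strictly between the old and new iterate were crossed.
        crossed = [b for b in boundaries if s_old < b < s_new]
        # Chop back to the first boundary met when moving upward.
        return min(crossed) if crossed else s_new
    crossed = [b for b in boundaries if s_new < b < s_old]
    # Chop back to the first boundary met when moving downward.
    return max(crossed) if crossed else s_new


# Illustrative boundaries only: end points 0 and 1, two interior points.
regions = [0.0, 0.3, 0.7, 1.0]
print(chop_to_trust_region(0.1, 0.5, regions))  # update crossed 0.3
print(chop_to_trust_region(0.3, 0.5, regions))  # stays inside one region
```

An iterate that starts exactly on a boundary is allowed to move into the adjacent region, which is why the comparisons are strict.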
Parallel H1-based auxiliary space AMG solver for H(curl) problems
Kolev, T V; Vassilevski, P S
2006-06-30
This report describes a parallel implementation of the auxiliary space methods for definite Maxwell problems proposed in [4]. The solver, named AMS, extends our previous study [7]. AMS uses ParCSR sparse matrix storage and the parallel AMG (algebraic multigrid) solver BoomerAMG [1] from the hypre library. It is designed for general unstructured finite element discretizations of (semi)definite H(curl) problems discretized by Nedelec elements. We document the usage of AMS and illustrate its parallel scalability and overall performance.
Adaptive approximation models in optimization
Voronin, A.N.
1995-05-01
The paper proposes a method for optimization of functions of several variables that substantially reduces the number of objective function evaluations compared to traditional methods. The method is based on the property of iterative refinement of approximation models of the optimand function in approximation domains that contract to the extremum point. It does not require subjective specification of the starting point, step length, or other parameters of the search procedure. The method is designed for efficient optimization of unimodal functions of several (not more than 10-15) variables and can be applied to find the global extremum of polymodal functions and also for optimization of scalarized forms of vector objective functions.
Pythagorean Approximations and Continued Fractions
ERIC Educational Resources Information Center
Peralta, Javier
2008-01-01
In this article, we will show that the Pythagorean approximations of [the square root of] 2 coincide with those achieved in the 16th century by means of continued fractions. Assuming this fact and the known relation that connects the Fibonacci sequence with the golden section, we shall establish a procedure to obtain sequences of rational numbers…
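The coincidence mentioned in this abstract can be checked numerically with a short sketch. It assumes that "Pythagorean approximations" refers to the classical side-and-diagonal number recurrence, which is an interpretation and not confirmed by the abstract; the function names are illustrative.

```python
from fractions import Fraction


def side_diagonal_ratios(n):
    """First n ratios d/s of side and diagonal numbers (assumed meaning)."""
    s, d = 1, 1
    ratios = []
    for _ in range(n):
        ratios.append(Fraction(d, s))
        # Classical recurrence: new side = s + d, new diagonal = 2s + d.
        s, d = s + d, 2 * s + d
    return ratios


def sqrt2_convergents(n):
    """First n convergents of the continued fraction [1; 2, 2, 2, ...]."""
    h_prev, h = 1, 1  # numerators h_{-1}, h_0
    k_prev, k = 0, 1  # denominators k_{-1}, k_0
    convergents = []
    for _ in range(n):
        convergents.append(Fraction(h, k))
        # Every partial quotient after the first equals 2.
        h_prev, h = h, 2 * h + h_prev
        k_prev, k = k, 2 * k + k_prev
    return convergents


print(side_diagonal_ratios(5))
print(sqrt2_convergents(5))
```

Both functions produce the sequence 1, 3/2, 7/5, 17/12, 41/29 for n = 5.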
Error Bounds for Interpolative Approximations.
ERIC Educational Resources Information Center
Gal-Ezer, J.; Zwas, G.
1990-01-01
Elementary error estimation in the approximation of functions by polynomials as a computational assignment, error-bounding functions and error bounds, and the choice of interpolation points are discussed. Precalculus and computer instruction are used on some of the calculations. (KR)
Chemical Laws, Idealization and Approximation
NASA Astrophysics Data System (ADS)
Tobin, Emma
2013-07-01
This paper examines the notion of laws in chemistry. Vihalemm (Found Chem 5(1):7-22, 2003) argues that the laws of chemistry are fundamentally the same as the laws of physics: they are all ceteris paribus laws which are true "in ideal conditions". In contrast, Scerri (2000) contends that the laws of chemistry are fundamentally different to the laws of physics, because they involve approximations. Christie (Stud Hist Philos Sci 25:613-629, 1994) and Christie and Christie (Of minds and molecules. Oxford University Press, New York, pp. 34-50, 2000) agree that the laws of chemistry are operationally different to the laws of physics, but claim that the distinction between exact and approximate laws is too simplistic to taxonomise them. Approximations in chemistry involve diverse kinds of activity and often what counts as a scientific law in chemistry is dictated by the context of its use in scientific practice. This paper addresses the question of what makes chemical laws distinctive independently of the separate question as to how they are related to the laws of physics. From an analysis of some candidate ceteris paribus laws in chemistry, this paper argues that there are two distinct kinds of ceteris paribus laws in chemistry: idealized and approximate chemical laws. Thus, while Christie (Stud Hist Philos Sci 25:613-629, 1994) and Christie and Christie (Of minds and molecules. Oxford University Press, New York, pp. 34-50, 2000) are correct to point out that the candidate generalisations in chemistry are diverse and heterogeneous, a distinction between idealizations and approximations can nevertheless be used to successfully taxonomise them.
3-Dimensional Marine CSEM Modeling by Employing TDFEM with Parallel Solvers
NASA Astrophysics Data System (ADS)
Wu, X.; Yang, T.
2013-12-01
In this paper, a parallel implementation is developed for forward modeling of three-dimensional controlled source electromagnetic (CSEM) problems using the time-domain finite element method (TDFEM). Recently, increasing attention has been paid to research on mechanisms for detecting hydrocarbon (HC) reservoirs in the seabed. Since China has vast ocean resources, the search for hydrocarbon reservoirs has become significant for the national economy. However, traditional seismic exploration methods face a crucial obstacle in detecting hydrocarbon reservoirs in seabeds with complex structure, owing to relatively high acquisition costs and high exploration risk. In addition, the development of EM simulations typically requires both deep knowledge of computational electromagnetics (CEM) and proper use of sophisticated techniques and tools from computer science. Moreover, large-scale EM simulations often require large memory because of the large amount of data, or long solution times for matrix solvers, function transforms, optimization, etc. The objective of this paper is to present a parallelized implementation of the time-domain finite element method for the analysis of three-dimensional (3D) marine controlled source electromagnetic problems. First, we established a three-dimensional basic background model according to the seismic data; then electromagnetic simulation of marine CSEM was carried out using the time-domain finite element method, which works on an MPI (Message Passing Interface) platform with exact orientation to allow fast detection of hydrocarbon targets in the ocean environment. To speed up the calculation, SuperLU_DIST, the MPI version of SuperLU, is employed in this approach. To represent the three-dimensional seabed terrain realistically, the region is discretized into an unstructured mesh rather than a uniform one in order to reduce the number of unknowns. Moreover, high-order Whitney
A Navier-Stokes solver using the LU-SSOR TVD algorithm
NASA Technical Reports Server (NTRS)
Yoon, Seokkwan
1987-01-01
A new Navier-Stokes solver is developed by combining the efficiency of the LU-SSOR scheme and the accuracy of the flux-limited dissipation scheme. Application to laminar and turbulent flows and hypersonic flows proves the reliability of the new algorithm.
Experimental validation of a coupled neutron-photon inverse radiation transport solver.
Mattingly, John K.; Harding, Lee; Mitchell, Dean James
2010-03-01
Forward radiation transport is the problem of calculating the radiation field given a description of the radiation source and transport medium. In contrast, inverse transport is the problem of inferring the configuration of the radiation source and transport medium from measurements of the radiation field. As such, the identification and characterization of special nuclear materials (SNM) is a problem of inverse radiation transport, and numerous techniques to solve this problem have been previously developed. The authors have developed a solver based on nonlinear regression applied to deterministic coupled neutron-photon transport calculations. The subject of this paper is the experimental validation of that solver. This paper describes a series of experiments conducted with a 4.5-kg sphere of alpha-phase, weapons-grade plutonium. The source was measured in six different configurations: bare, and reflected by high-density polyethylene (HDPE) spherical shells with total thicknesses of 1.27, 2.54, 3.81, 7.62, and 15.24 cm. Neutron and photon emissions from the source were measured using three instruments: a gross neutron counter, a portable neutron multiplicity counter, and a high-resolution gamma spectrometer. These measurements were used as input to the inverse radiation transport solver to characterize the solver's ability to correctly infer the configuration of the source from its measured signatures.
NASA Astrophysics Data System (ADS)
Lipnikov, Konstantin; Moulton, David; Svyatskiy, Daniil
2016-08-01
We develop a new approach for solving the nonlinear Richards' equation arising in variably saturated flow modeling. The growing complexity of geometric models for simulation of subsurface flows leads to the necessity of using unstructured meshes and advanced discretization methods. Typically, a numerical solution is obtained by first discretizing PDEs and then solving the resulting system of nonlinear discrete equations with a Newton-Raphson-type method. Efficiency and robustness of the existing solvers rely on many factors, including an empiric quality control of intermediate iterates, complexity of the employed discretization method and a customized preconditioner. We propose and analyze a new preconditioning strategy that is based on a stable discretization of the continuum Jacobian. We will show with numerical experiments for challenging problems in subsurface hydrology that this new preconditioner improves convergence of the existing Jacobian-free solvers 3-20 times. We also show that the Picard method with this preconditioner becomes a more efficient nonlinear solver than a few widely used Jacobian-free solvers.
Study of the microbunching instability in single-pass systemsusing a direct 2D Vlasov solver
Venturini, Marco
2007-06-30
We apply a recently developed Vlasov solver to the study of the microbunching instability generated by shot noise in the beam delivery systems of x-ray Free Electron Lasers (FELs). We discuss two lattices presently under consideration for the FEL FERMI project at Elettra and show that at least one of the two lattices appears capable of delivering a beam with the desired quality in the longitudinal phase space.
NASA Astrophysics Data System (ADS)
Wang, Y.; Shu, C.; Huang, H. B.; Teo, C. J.
2015-01-01
A multiphase lattice Boltzmann flux solver (MLBFS) is proposed in this paper for incompressible multiphase flows with low and large density ratios. In the solver, the flow variables at cell centers are obtained from the solution of the macroscopic governing differential equations (the Navier-Stokes equations recovered by the multiphase lattice Boltzmann (LB) model) by the finite volume method. At each cell interface, the viscous and inviscid fluxes are evaluated simultaneously by local reconstruction of the solution of the standard lattice Boltzmann equation (LBE). The forcing terms in the governing equations are directly treated by the finite volume discretization. The phase interfaces are captured by solving the phase-field Cahn-Hilliard equation with a fifth-order upwind scheme. Unlike conventional multiphase LB models, which are restricted to uniform grids with a fixed time step, the MLBFS can simulate multiphase flows on non-uniform grids. The proposed solver is validated by several benchmark problems, such as two-phase co-current flow, Taylor-Couette flow in an annulus, Rayleigh-Taylor instability, and droplet splashing on a thin film at a density ratio of 1000 with Reynolds numbers ranging from 20 to 1000. Numerical results show the reliability of the proposed solver for multiphase flows with high density ratio and high Reynolds number.
NASA Astrophysics Data System (ADS)
Heil, Matthias; Hazel, Andrew L.; Boyle, Jonathan
2008-12-01
We compare the relative performance of monolithic and segregated (partitioned) solvers for large-displacement fluid structure interaction (FSI) problems within the framework of oomph-lib, the object-oriented multi-physics finite-element library, available as open-source software at
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.
1992-01-01
An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.
A fast parallel solver for the forward problem in electrical impedance tomography.
Jehl, Markus; Dedner, Andreas; Betcke, Timo; Aristovich, Kirill; Klöfkorn, Robert; Holder, David
2015-01-01
Electrical impedance tomography (EIT) is a noninvasive imaging modality, where imperceptible currents are applied to the skin and the resulting surface voltages are measured. It has the potential to distinguish between ischaemic and haemorrhagic stroke with a portable and inexpensive device. The image reconstruction relies on an accurate forward model of the experimental setup. Because of the relatively small signal in stroke EIT, the finite-element modeling requires meshes of more than 10 million elements. To study the requirements in the forward modeling in EIT and also to reduce the time for experimental image acquisition, it is necessary to reduce the run time of the forward computation. We show the implementation of a parallel forward solver for EIT using the Dune-Fem C++ library and demonstrate its performance on many CPUs of a computer cluster. For a typical EIT application, a direct solver was significantly slower than, and not a viable alternative to, iterative solvers with multigrid preconditioning. With this new solver, we can compute the forward solutions and the Jacobian matrix of a typical EIT application with 30 electrodes on a 15-million element mesh in less than 15 min. This makes it a valuable tool for simulation studies and EIT applications with high precision requirements. It is freely available for download. PMID:25069109
Flowfield Comparisons from Three Navier-Stokes Solvers for an Axisymmetric Separate Flow Jet
NASA Technical Reports Server (NTRS)
Koch, L. Danielle; Bridges, James; Khavaran, Abbas
2002-01-01
To meet new noise reduction goals, many concepts to enhance mixing in the exhaust jets of turbofan engines are being studied. Accurate steady state flowfield predictions from state-of-the-art computational fluid dynamics (CFD) solvers are needed as input to the latest noise prediction codes. The main intent of this paper was to ascertain that similar Navier-Stokes solvers run at different sites would yield comparable results for an axisymmetric two-stream nozzle case. Predictions from the WIND and the NPARC codes are compared to previously reported experimental data and results from the CRAFT Navier-Stokes solver. Similar k-epsilon turbulence models were employed in each solver, and identical computational grids were used. Agreement between experimental data and predictions from each code was generally good for mean values. All three codes underpredict the maximum value of turbulent kinetic energy. The predicted locations of the maximum turbulent kinetic energy were farther downstream than seen in the data. A grid study was conducted using the WIND code, and comments about convergence criteria and grid requirements for CFD solutions to be used as input for noise prediction computations are given. Additionally, noise predictions from the MGBK code, using the CFD results from the CRAFT code, NPARC, and WIND as input are compared to data.
An Adaptive Flow Solver for Air-Borne Vehicles Undergoing Time-Dependent Motions/Deformations
NASA Technical Reports Server (NTRS)
Singh, Jatinder; Taylor, Stephen
1997-01-01
This report describes a concurrent Euler flow solver for flows around complex 3-D bodies. The solver is based on a cell-centered finite volume methodology on 3-D unstructured tetrahedral grids. In this algorithm, spatial discretization for the inviscid convective term is accomplished using an upwind scheme. A localized, second-order accurate reconstruction is done for the flow variables. Evolution in time is accomplished using an explicit three-stage Runge-Kutta method which has second-order temporal accuracy. This is adapted for concurrent execution using another proven methodology based on concurrent graph abstraction. This solver operates on heterogeneous network architectures. These architectures may include a broad variety of UNIX workstations and PCs running Windows NT, symmetric multiprocessors and distributed-memory multi-computers. The unstructured grid is generated using commercial grid generation tools. The grid is automatically partitioned using a concurrent algorithm based on heat diffusion. This results in memory requirements that are inversely proportional to the number of processors. The solver uses automatic granularity control and resource management techniques both to balance load and communication requirements and to deal with differing memory constraints. These ideas are again based on heat diffusion. Results are subsequently combined for visualization and analysis using commercial CFD tools. Flow simulation results are demonstrated for a constant section wing for subsonic, transonic, and supersonic cases. These results are compared with experimental data and numerical results of other researchers. Performance results are under way for a variety of network topologies.
NASA Technical Reports Server (NTRS)
Biedron, Robert T.; Vatsa, Veer N.; Atkins, Harold L.
2005-01-01
We apply an unsteady Reynolds-averaged Navier-Stokes (URANS) solver for unstructured grids to unsteady flows on moving and stationary grids. Example problems considered are relevant to active flow control and stability and control. Computational results are presented using the Spalart-Allmaras turbulence model and are compared to experimental data. The effects of grid and time-step refinement are examined.
A Flexible CUDA LU-based Solver for Small, Batched Linear Systems
Tumeo, Antonino; Gawande, Nitin A.; Villa, Oreste
2014-06-09
This chapter presents the implementation of a batched CUDA solver based on LU factorization for small linear systems. This solver may be used in applications such as reactive flow transport models, which apply the Newton-Raphson technique to linearize and iteratively solve the sets of nonlinear equations that represent the reactions for tens of thousands to millions of physical locations. The implementation exploits somewhat counterintuitive GPGPU programming techniques: it assigns the solution of a matrix (representing a system) to a single CUDA thread, does not exploit shared memory and employs dynamic memory allocation on the GPUs. These techniques enable our implementation to simultaneously solve sets of systems with over 100 equations and to employ LU decomposition with complete pivoting, providing the higher numerical accuracy required by certain applications. Other currently available solutions for batched linear solvers are limited by size and only support partial pivoting, although they may be faster in certain conditions. We discuss the code of our implementation and present a comparison with the other implementations, discussing the various tradeoffs in terms of performance and flexibility. This work will enable developers that need batched linear solvers to choose whichever implementation is more appropriate to the features and the requirements of their applications, and even to implement dynamic switching approaches that can choose the best implementation depending on the input data.
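To make the "one system per thread, complete pivoting" idea concrete, here is a minimal CPU-side sketch in Python with NumPy. It is not the chapter's CUDA code: each system is solved independently by Gaussian elimination with complete pivoting (the elimination form of an LU solve), and the batch loop stands in for the per-thread assignment. Function names are illustrative.

```python
import numpy as np


def solve_complete_pivoting(A, b):
    """Solve one dense system A x = b with complete (row and column) pivoting."""
    A = np.array(A, dtype=float)
    rhs = np.array(b, dtype=float)
    n = A.shape[0]
    col_order = np.arange(n)  # tracks column swaps, i.e. the unknown ordering
    for k in range(n):
        # Pivot: largest-magnitude entry of the trailing submatrix.
        sub = np.abs(A[k:, k:])
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        i, j = i + k, j + k
        if A[i, j] == 0.0:
            raise ValueError("matrix is singular")
        A[[k, i], :] = A[[i, k], :]          # row swap
        rhs[[k, i]] = rhs[[i, k]]
        A[:, [k, j]] = A[:, [j, k]]          # column swap
        col_order[[k, j]] = col_order[[j, k]]
        # Eliminate entries below the pivot.
        factors = A[k + 1:, k] / A[k, k]
        A[k + 1:, k:] -= np.outer(factors, A[k, k:])
        rhs[k + 1:] -= factors * rhs[k]
    # Back substitution on the upper-triangular system.
    y = np.zeros(n)
    for k in range(n - 1, -1, -1):
        y[k] = (rhs[k] - A[k, k + 1:] @ y[k + 1:]) / A[k, k]
    # Undo the column permutation to recover the original unknown order.
    x = np.empty(n)
    x[col_order] = y
    return x


def solve_batch(As, bs):
    """Solve every system in the batch independently (one per 'thread')."""
    return np.stack([solve_complete_pivoting(A, b) for A, b in zip(As, bs)])
```

Each call touches only its own matrix and right-hand side, which mirrors why the per-thread mapping needs no shared memory or inter-thread communication.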
A Comparison of the Intellectual Abilities of Good and Poor Problem Solvers: An Exploratory Study.
ERIC Educational Resources Information Center
Meyer, Ruth Ann
This study examined a selected sample of fourth-grade students who had been previously identified as good or poor problem solvers. The pupils were compared on variables considered as "reference tests" for Verbal, Induction, Numerical, Word Fluency, Memory, Spatial Visualization, and Perceptual Speed abilities. The data were compiled to indicate…
Mahinthakumar, G.; Saied, F.; Valocchi, A.J.
1997-03-01
Some popular iterative solvers for non-symmetric systems arising from the finite-element discretization of three-dimensional groundwater contaminant transport problems are implemented and compared on distributed memory parallel platforms. This paper attempts to determine which solvers are most suitable for the contaminant transport problem under varied conditions for large scale simulations on distributed parallel platforms. The original parallel implementation was targeted for the 1024 node Intel Paragon platform using explicit message passing with the NX library. This code was then ported to SGI Power Challenge Array, Convex Exemplar, and Origin 2000 machines using an MPI implementation. The performance of these solvers is studied for increasing problem size, roughness of the coefficients, and selected problem scenarios. These conditions affect the properties of the matrix and hence the difficulty level of the solution process. Performance is analyzed in terms of convergence behavior, overall time, parallel efficiency, and scalability. The solvers that are presented are BiCGSTAB, GMRES, ORTHOMIN, and CGS. A simple diagonal preconditioner is used in this parallel implementation for all the methods. The results indicate that all methods are comparable in performance with BiCGSTAB slightly outperforming the other methods for most problems. The authors achieved very good scalability in all the methods up to 1024 processors of the Intel Paragon XPS/150. They demonstrate scalability by solving 100 time steps of a 40 million element problem in about 5 minutes using either BiCGSTAB or GMRES.
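As an illustration of one of the solver and preconditioner pairings named above, the following is a serial NumPy sketch of BiCGSTAB with a diagonal (Jacobi) preconditioner. It is a textbook formulation written for clarity, not the authors' parallel code, and it omits the breakdown checks a production solver would need; the function name and default tolerances are assumptions.

```python
import numpy as np


def bicgstab_jacobi(A, b, tol=1e-10, max_iter=500):
    """BiCGSTAB with diagonal preconditioning for a dense non-symmetric A."""
    inv_diag = 1.0 / np.diag(A)          # Jacobi preconditioner M^{-1}
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    r_hat = r.copy()                      # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(r)
    p = np.zeros_like(r)
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        rho_new = r_hat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        y = inv_diag * p                  # apply preconditioner
        v = A @ y
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) <= tol * b_norm:
            return x + alpha * y, it      # converged at the half step
        z = inv_diag * s                  # apply preconditioner
        t = A @ z
        omega = (t @ s) / (t @ t)         # stabilizing minimization step
        x = x + alpha * y + omega * z
        r = s - omega * t
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        rho = rho_new
    return x, max_iter


# Small non-symmetric, diagonally dominant test matrix (illustrative only).
n = 50
A = 4.0 * np.eye(n) - 1.5 * np.eye(n, k=-1) - 0.5 * np.eye(n, k=1)
x, iters = bicgstab_jacobi(A, np.ones(n))
print(iters, np.linalg.norm(A @ x - np.ones(n)))
```

In a distributed setting the matrix-vector products and dot products are the operations that require communication, while the diagonal preconditioner is purely local.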
Determining the Optimal Values of Exponential Smoothing Constants--Does Solver Really Work?
ERIC Educational Resources Information Center
Ravinder, Handanhal V.
2013-01-01
A key issue in exponential smoothing is the choice of the values of the smoothing constants used. One approach that is becoming increasingly popular in introductory management science and operations management textbooks is the use of Solver, an Excel-based non-linear optimizer, to identify values of the smoothing constants that minimize a measure…
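As an illustration of the idea in this abstract, the following minimal sketch picks the smoothing constant of simple exponential smoothing by minimizing the mean squared one-step-ahead error. It is not the paper's procedure: a coarse grid search stands in for Excel's Solver, the demand series is made up, and the function names and the convention of initialising the first forecast to the first observation are assumptions.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead simple exponential smoothing forecasts.

    forecasts[t] is the forecast for series[t]. The first forecast is set
    to the first observation (a common convention, assumed here).
    """
    forecasts = [series[0]]
    for y in series[:-1]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


def mse(series, alpha):
    """Mean squared one-step-ahead error, skipping the initialisation point."""
    f = ses_forecasts(series, alpha)
    errors = [(y - fy) ** 2 for y, fy in zip(series[1:], f[1:])]
    return sum(errors) / len(errors)


def best_alpha(series, steps=100):
    """Grid search over alpha in [0, 1]; a stand-in for a nonlinear optimizer."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda a: mse(series, a))


demand = [20.0, 22.0, 21.0, 25.0, 27.0, 26.0, 30.0, 31.0]  # made-up data
alpha = best_alpha(demand)
print(alpha, mse(demand, alpha))
```

A grid search cannot get trapped in a local minimum the way a gradient-based optimizer can, which makes it a convenient cross-check on optimizer output for a single bounded parameter.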
Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.
1994-01-01
The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain-decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to compare identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine, for a problem size of 1024 K equations (256 K grid points).
Lipnikov, Konstantin; Moulton, David; Svyatskiy, Daniil
2016-04-29
We develop a new approach for solving the nonlinear Richards’ equation arising in variably saturated flow modeling. The growing complexity of geometric models for simulation of subsurface flows leads to the necessity of using unstructured meshes and advanced discretization methods. Typically, a numerical solution is obtained by first discretizing PDEs and then solving the resulting system of nonlinear discrete equations with a Newton-Raphson-type method. Efficiency and robustness of the existing solvers rely on many factors, including an empiric quality control of intermediate iterates, complexity of the employed discretization method and a customized preconditioner. We propose and analyze a new preconditioning strategy that is based on a stable discretization of the continuum Jacobian. We will show with numerical experiments for challenging problems in subsurface hydrology that this new preconditioner improves convergence of the existing Jacobian-free solvers 3-20 times. Furthermore, we show that the Picard method with this preconditioner becomes a more efficient nonlinear solver than a few widely used Jacobian-free solvers.
VDJSeq-Solver: in silico V(D)J recombination detection tool.
Paciello, Giulia; Acquaviva, Andrea; Pighi, Chiara; Ferrarini, Alberto; Macii, Enrico; Zamo', Alberto; Ficarra, Elisa
2015-01-01
In this paper we present VDJSeq-Solver, a methodology and tool to identify clonal lymphocyte populations from paired-end RNA Sequencing reads derived from the sequencing of mRNA neoplastic cells. The tool detects the main clone that characterises the tissue of interest by recognizing the most abundant V(D)J rearrangement among the existing ones in the sample under study. The exact sequence of the clone identified is capable of accounting for the modifications introduced by the enzymatic processes. The proposed tool overcomes limitations of currently available lymphocyte rearrangements recognition methods, working on a single sequence at a time, that are not applicable to high-throughput sequencing data. In this work, VDJSeq-Solver has been applied to correctly detect the main clone and identify its sequence on five Mantle Cell Lymphoma samples; then the tool has been tested on twelve Diffuse Large B-Cell Lymphoma samples. In order to comply with the privacy, ethics and intellectual property policies of the University Hospital and the University of Verona, data is available upon request to supporto.utenti@ateneo.univr.it after signing a mandatory Materials Transfer Agreement. VDJSeq-Solver JAVA/Perl/Bash software implementation is free and available at http://eda.polito.it/VDJSeq-Solver/. PMID:25799103
A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments
NASA Astrophysics Data System (ADS)
Fisicaro, G.; Genovese, L.; Andreussi, O.; Marzari, N.; Goedecker, S.
2016-01-01
The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann equation, allowing the minimization problem to be solved iteratively with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.
Testing the frozen flow approximation
NASA Technical Reports Server (NTRS)
Lucchin, Francesco; Matarrese, Sabino; Melott, Adrian L.; Moscardini, Lauro
1993-01-01
We investigate the accuracy of the frozen-flow approximation (FFA), recently proposed by Matarrese, et al. (1992), for following the nonlinear evolution of cosmological density fluctuations under gravitational instability. We compare a number of statistics between results of the FFA and n-body simulations, including those used by Melott, Pellman & Shandarin (1993) to test the Zel'dovich approximation. The FFA performs reasonably well in a statistical sense, e.g. in reproducing the counts-in-cell distribution, at small scales, but it does poorly in the crosscorrelation with n-body which means it is generally not moving mass to the right place, especially in models with high small-scale power.
Transonic Drag Prediction on a DLR-F6 Transport Configuration Using Unstructured Grid Solvers
NASA Technical Reports Server (NTRS)
Lee-Rausch, E. M.; Frink, N. T.; Mavriplis, D. J.; Rausch, R. D.; Milholen, W. E.
2004-01-01
A second international AIAA Drag Prediction Workshop (DPW-II) was organized and held in Orlando, Florida, on June 21-22, 2003. The primary purpose was to investigate the code-to-code uncertainty, address the sensitivity of the drag prediction to grid size, and quantify the uncertainty in predicting nacelle/pylon drag increments at a transonic cruise condition. This paper presents an in-depth analysis of the DPW-II computational results from three state-of-the-art unstructured grid Navier-Stokes flow solvers exercised on similar families of tetrahedral grids. The flow solvers are USM3D, a tetrahedral cell-centered upwind solver; FUN3D, a tetrahedral node-centered upwind solver; and NSU3D, a general element node-centered central-differenced solver. For the wingbody, the total drag predicted for a constant-lift transonic cruise condition showed a decrease in code-to-code variation with grid refinement as expected. For the same flight condition, the wing/body/nacelle/pylon total drag and the nacelle/pylon drag increment predicted showed an increase in code-to-code variation with grid refinement. Although the range in total drag for the wingbody fine grids was only 5 counts, a code-to-code comparison of surface pressures and surface restricted streamlines indicated that the three solvers were not all converging to the same flow solutions: different shock locations and separation patterns were evident. Similarly, the wing/body/nacelle/pylon solutions did not appear to be converging to the same flow solutions. Overall, grid refinement did not consistently improve the correlation with experimental data for either the wingbody or the wing/body/nacelle/pylon configuration. Although the absolute values of total drag predicted by two of the solvers for the medium and fine grids did not compare well with the experiment, the incremental drag predictions were within plus or minus 3 counts of the experimental data. The correlation with experimental incremental drag was not
Approximate line shapes for hydrogen
NASA Technical Reports Server (NTRS)
Sutton, K.
1978-01-01
Two independent methods are presented for calculating radiative transport within hydrogen lines. In Method 1, a simple equation is proposed for calculating the line shape. In Method 2, the line shape is assumed to be a dispersion profile and an equation is presented for calculating the half half-width. The results obtained for the line shapes and curves of growth by the two approximate methods are compared with similar results using the detailed line shapes by Vidal et al.
S4 : A free electromagnetic solver for layered periodic structures
NASA Astrophysics Data System (ADS)
Liu, Victor; Fan, Shanhui
2012-10-01
We describe S4, a free implementation of the Fourier modal method (FMM), which has also been commonly referred to as rigorous coupled wave analysis (RCWA), for simulating electromagnetic propagation through 3D structures with 2D periodicity. We detail design aspects that allow S4 to be a flexible platform for these types of simulations. In particular, we highlight the ability to select different FMM formulations, user scripting, and extensibility of program capabilities for eigenmode computations. Program summary Program title: S4 Catalogue identifier: AEMO_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEMO_v1_0..html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GNU General Public License, version 2 No. of lines in distributed program, including test data, etc.: 56910 No. of bytes in distributed program, including test data, etc.: 433883 Distribution format: Programming language: C, C++. Computer: Any computer with a Unix-like environment and a C++ compiler. Developed on 2.3 GHz AMD Phenom 9600. Operating system: Any Unix-like environment; developed under MinGW32 on Windows 7. Has the code been vectorized or parallelized?: Yes. Parallelized using MPI. RAM: Problem dependent (linearly proportional to number of layers and quadratic in number of Fourier components). A single layer calculation with approximately 100 Fourier components uses approximately 10 MB. Classification: 10. Electrostatics and Electromagnetics. External routines: Lua [1] and optionally exploits additional free software packages: FFTW [2], CHOLMOD [3], MPI message-passing interface [4], LAPACK and BLAS linear-algebra software [5], and Kiss FFT [6]. Nature of problem: Time-harmonic electromagnetism in layered bi-periodic structures. Solution method: The Fourier modal method (rigorous coupled wave analysis) and the scattering matrix method. Running time: Problem dependent and highly dependent on quality of the BLAS
Approximate reasoning using terminological models
NASA Technical Reports Server (NTRS)
Yen, John; Vaidya, Nitin
1992-01-01
Term Subsumption Systems (TSS) form a knowledge-representation scheme in AI that can express the defining characteristics of concepts through a formal language that has a well-defined semantics and incorporates a reasoning mechanism that can deduce whether one concept subsumes another. However, TSS's have very limited ability to deal with the issue of uncertainty in knowledge bases. The objective of this research is to address issues in combining approximate reasoning with term subsumption systems. To do this, we have extended an existing AI architecture (CLASP) that is built on the top of a term subsumption system (LOOM). First, the assertional component of LOOM has been extended for asserting and representing uncertain propositions. Second, we have extended the pattern matcher of CLASP for plausible rule-based inferences. Third, an approximate reasoning model has been added to facilitate various kinds of approximate reasoning. And finally, the issue of inconsistency in truth values due to inheritance is addressed using justification of those values. This architecture enhances the reasoning capabilities of expert systems by providing support for reasoning under uncertainty using knowledge captured in TSS. Also, as definitional knowledge is explicit and separate from heuristic knowledge for plausible inferences, the maintainability of expert systems could be improved.
Computer Experiments for Function Approximations
Chang, A; Izmailov, I; Rizzo, S; Wynter, S; Alexandrov, O; Tong, C
2007-10-15
This research project falls in the domain of response surface methodology, which seeks cost-effective ways to accurately fit an approximate function to experimental data. Modeling and computer simulation are essential tools in modern science and engineering. A computer simulation can be viewed as a function that receives input from a given parameter space and produces an output. Running the simulation repeatedly amounts to an equivalent number of function evaluations, and for complex models, such function evaluations can be very time-consuming. It is then of paramount importance to intelligently choose a relatively small set of sample points in the parameter space at which to evaluate the given function, and then use this information to construct a surrogate function that is close to the original function and takes little time to evaluate. This study was divided into two parts. The first part consisted of comparing four sampling methods and two function approximation methods in terms of efficiency and accuracy for simple test functions. The sampling methods used were Monte Carlo, Quasi-Random LP_τ, Maximin Latin Hypercubes, and Orthogonal-Array-Based Latin Hypercubes. The function approximation methods utilized were Multivariate Adaptive Regression Splines (MARS) and Support Vector Machines (SVM). The second part of the study concerned adaptive sampling methods with a focus on creating useful sets of sample points specifically for monotonic functions, functions with a single minimum and functions with a bounded first derivative.
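To make the sampling idea concrete, here is a minimal sketch of a plain Latin hypercube design. It is neither the maximin nor the orthogonal-array-based variant compared in the study, and the function name and parameters are illustrative assumptions. Each axis of the unit cube is cut into as many equal strata as there are samples, and each stratum is used exactly once per axis.

```python
import random


def latin_hypercube(n_samples, n_dims, rng=None):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum.

    Every axis is cut into n_samples equal intervals; each interval is hit
    exactly once per axis, with the pairing across axes chosen at random.
    """
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # Jitter each point uniformly inside its stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Transpose the per-axis columns into one tuple per sample point.
    return list(zip(*columns))


points = latin_hypercube(8, 2, random.Random(0))
for p in points:
    print(p)
```

A maximin design would generate many such candidates and keep the one whose smallest pairwise distance is largest. That selection step is omitted here.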
Ultrafast approximation for phylogenetic bootstrap.
Minh, Bui Quang; Nguyen, Minh Anh Thi; von Haeseler, Arndt
2013-05-01
Nonparametric bootstrap has been a widely used tool in phylogenetic analysis to assess the clade support of phylogenetic trees. However, with the rapidly growing amount of data, this task remains a computational bottleneck. Recently, approximation methods such as the RAxML rapid bootstrap (RBS) and the Shimodaira-Hasegawa-like approximate likelihood ratio test have been introduced to speed up the bootstrap. Here, we suggest an ultrafast bootstrap approximation approach (UFBoot) to compute the support of phylogenetic groups in maximum likelihood (ML) based trees. To achieve this, we combine the resampling estimated log-likelihood method with a simple but effective collection scheme of candidate trees. We also propose a stopping rule that assesses the convergence of branch support values to automatically determine when to stop collecting candidate trees. UFBoot achieves a median speed up of 3.1 (range: 0.66-33.3) to 10.2 (range: 1.32-41.4) compared with RAxML RBS for real DNA and amino acid alignments, respectively. Moreover, our extensive simulations show that UFBoot is robust against moderate model violations and the support values obtained appear to be relatively unbiased compared with the conservative standard bootstrap. This provides a more direct interpretation of the bootstrap support. We offer an efficient and easy-to-use software (available at http://www.cibiv.at/software/iqtree) to perform the UFBoot analysis with ML tree inference.
Approximate Counting of Graphical Realizations
2015-01-01
In 1999 Kannan, Tetali and Vempala proposed an MCMC method to uniformly sample all possible realizations of a given graphical degree sequence and conjectured its rapidly mixing nature. Recently their conjecture was proved affirmative for regular graphs (by Cooper, Dyer and Greenhill, 2007), for regular directed graphs (by Greenhill, 2011) and for half-regular bipartite graphs (by Miklós, Erdős and Soukup, 2013). Several heuristics on counting the number of possible realizations exist (via sampling processes), and while they work well in practice, so far no approximation guarantees exist for such an approach. This paper is the first to develop a method for counting realizations with a provable approximation guarantee. In fact, we solve a slightly more general problem; besides the graphical degree sequence a small set of forbidden edges is also given. We show that for the general problem (which contains the Greenhill problem and the Miklós, Erdős and Soukup problem as special cases) the derived MCMC process is rapidly mixing. Further, we show that this new problem is self-reducible; therefore, it provides a fully polynomial randomized approximation scheme (a.k.a. FPRAS) for counting of all realizations. PMID:26161994
Non-equilibrium Hybridization Expansion Impurity-solver
NASA Astrophysics Data System (ADS)
Dong, Qiaoyuan
2015-03-01
The study of non-equilibrium phenomena in strongly correlated systems has developed into one of the most active and exciting branches of condensed matter physics. Meanwhile, quantum impurity models play a prominent role as mathematical representations of quantum dots, single-molecule devices, and effective models for the dynamical mean field theory. We show results for a generalization of the hybridization expansion diagrammatic Monte Carlo technique for the Anderson impurity model. We also perform non-equilibrium calculations on the full Keldysh contour, where a dynamical sign problem vastly increases the complexity of real-time simulation. By further combining this method with a non-crossing approximation, our "bold-line" Monte Carlo can reach substantially longer times out of equilibrium than previously accessible, and provides an accurate description of quench and driven dynamics of correlated systems. Sponsored by the Department of Energy.
Yoon, E. S.; Chang, C. S.
2014-03-15
An approximate two-dimensional solver of the nonlinear Fokker-Planck-Landau collision operator has been developed using the assumption that the particle probability distribution function is independent of gyroangle in the limit of a strong magnetic field. The isotropic one-dimensional schemes developed for the nonlinear Fokker-Planck-Landau equation by Buet and Cordier [J. Comput. Phys. 179, 43 (2002)] and for the linear Fokker-Planck-Landau equation by Chang and Cooper [J. Comput. Phys. 6, 1 (1970)] have been modified and extended to the two-dimensional nonlinear equation. In addition, a method is suggested to apply the new velocity-grid based collision solver to Lagrangian particle-in-cell simulation by adjusting the weights of marker particles and is applied to a five-dimensional particle-in-cell code to calculate the neoclassical ion thermal conductivity in a tokamak plasma. Error verifications show practical aspects of the present scheme for both grid-based and particle-based kinetic codes.
NASA Astrophysics Data System (ADS)
Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike
The portable software FoSSI is introduced that—in combination with additional free solver software packages—allows for an efficient and scalable parallel solution of large sparse linear equations systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, that is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equations systems and associated preconditioners at the same time, even in parallel on separate MPI-communicators. The second special feature in FoSSI is the task specifier, being a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver over one unique subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is a community software, proven in an adaptive 2D-atmosphere model and a 3D-primitive equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP-implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI-variant. Code development in this direction is in an advanced state under
A comparative study on low-memory iterative solvers for FFT-based homogenization of periodic media
NASA Astrophysics Data System (ADS)
Mishra, Nachiketa; Vondřejc, Jaroslav; Zeman, Jan
2016-09-01
In this paper, we assess the performance of four iterative algorithms for solving non-symmetric rank-deficient linear systems arising in the FFT-based homogenization of heterogeneous materials defined by digital images. Our framework is based on the Fourier-Galerkin method with exact and approximate integrations that has recently been shown to generalize the Lippmann-Schwinger setting of the original work by Moulinec and Suquet from 1994. It follows from this variational format that the ensuing system of linear equations can be solved by general-purpose iterative algorithms for symmetric positive-definite systems, such as the Richardson, the Conjugate gradient, and the Chebyshev algorithms, that are compared here to the Eyre-Milton scheme - the most efficient specialized method currently available. Our numerical experiments, carried out for two-dimensional elliptic problems, reveal that the Conjugate gradient algorithm is the most efficient option, while the Eyre-Milton method performs comparably to the Chebyshev semi-iteration. The Richardson algorithm, equivalent to the still widely used original Moulinec-Suquet solver, exhibits the slowest convergence. Besides this, we hope that our study highlights the potential of the well-established techniques of numerical linear algebra to further increase the efficiency of FFT-based homogenization methods.
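The qualitative ranking reported in this abstract (conjugate gradients fastest, Richardson slowest) can be reproduced on a toy problem. The sketch below is an assumption-laden stand-in: it uses a small symmetric positive-definite 1D Laplacian matrix rather than the FFT-based homogenization operator, and the step size 0.5 is chosen for this matrix only, whose eigenvalues lie in (0, 4).

```python
def matvec(A, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def richardson(A, b, omega, tol=1e-8, max_iter=100000):
    """Stationary Richardson iteration x <- x + omega * (b - A x)."""
    x = [0.0] * len(b)
    for k in range(max_iter):
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        if dot(r, r) ** 0.5 < tol:
            return x, k
        x = [xi + omega * ri for xi, ri in zip(x, r)]
    return x, max_iter


def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
    """Textbook conjugate gradient method for symmetric positive-definite A."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = dot(r, r)
    for k in range(max_iter):
        if rr ** 0.5 < tol:
            return x, k
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter


n = 20
# Tridiagonal (-1, 2, -1) matrix: a toy SPD test problem.
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
x_rich, it_rich = richardson(A, b, omega=0.5)
x_cg, it_cg = conjugate_gradient(A, b)
print("Richardson iterations:", it_rich)
print("CG iterations:", it_cg)
```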
Approximately Independent Features of Languages
NASA Astrophysics Data System (ADS)
Holman, Eric W.
To facilitate the testing of models for the evolution of languages, the present paper offers a set of linguistic features that are approximately independent of each other. To find these features, the adjusted Rand index (R′) is used to estimate the degree of pairwise relationship among 130 linguistic features in a large published database. Many of the R′ values prove to be near zero, as predicted for independent features, and a subset of 47 features is found with an average R′ of -0.0001. These 47 features are recommended for use in statistical tests that require independent units of analysis.
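For readers unfamiliar with the statistic, the adjusted Rand index between two labelings of the same items can be computed as in the sketch below. It uses the standard Hubert-Arabie formula. The feature values in the example are hypothetical and are not taken from the database used in the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two partitions of the same items.

    Returns 1.0 for identical partitions and values near 0 when the two
    partitions agree only as much as expected by chance.
    """
    n = len(labels_a)
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions put every item in one class.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical values of two features across six languages.
word_order = ["SOV", "SOV", "SVO", "SVO", "VSO", "SOV"]
has_tone = ["no", "yes", "no", "yes", "no", "yes"]
print(adjusted_rand_index(word_order, has_tone))
```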
The structural physical approximation conjecture
NASA Astrophysics Data System (ADS)
Shultz, Fred
2016-01-01
It was conjectured that the structural physical approximation (SPA) of an optimal entanglement witness is separable (or equivalently, that the SPA of an optimal positive map is entanglement breaking). This conjecture was disproved, first for indecomposable maps and more recently for decomposable maps. The arguments in both cases are sketched along with important related results. This review includes background material on topics including entanglement witnesses, optimality, duality of cones, decomposability, and the statement and motivation for the SPA conjecture so that it should be accessible for a broad audience.
Generalized Gradient Approximation Made Simple
Perdew, J.P.; Burke, K.; Ernzerhof, M.
1996-10-01
Generalized gradient approximations (GGA's) for the exchange-correlation energy improve upon the local spin density (LSD) description of atoms, molecules, and solids. We present a simple derivation of a simple GGA, in which all parameters (other than those in LSD) are fundamental constants. Only general features of the detailed construction underlying the Perdew-Wang 1991 (PW91) GGA are invoked. Improvements over PW91 include an accurate description of the linear response of the uniform electron gas, correct behavior under uniform scaling, and a smoother potential. © 1996 The American Physical Society.
Quantum tunneling beyond semiclassical approximation
NASA Astrophysics Data System (ADS)
Banerjee, Rabin; Ranjan Majhi, Bibhas
2008-06-01
Hawking radiation as tunneling by Hamilton-Jacobi method beyond semiclassical approximation is analysed. We compute all quantum corrections in the single particle action revealing that these are proportional to the usual semiclassical contribution. We show that a simple choice of the proportionality constants reproduces the one loop back reaction effect in the spacetime, found by conformal field theory methods, which modifies the Hawking temperature of the black hole. Using the law of black hole mechanics we give the corrections to the Bekenstein-Hawking area law following from the modified Hawking temperature. Some examples are explicitly worked out.
Fermion tunneling beyond semiclassical approximation
NASA Astrophysics Data System (ADS)
Majhi, Bibhas Ranjan
2009-02-01
Applying the Hamilton-Jacobi method beyond the semiclassical approximation prescribed in R. Banerjee and B. R. Majhi, J. High Energy Phys. 06 (2008) 095 for the scalar particle, Hawking radiation as tunneling of the Dirac particle through an event horizon is analyzed. We show that, as before, all quantum corrections in the single particle action are proportional to the usual semiclassical contribution. We also compute the modifications to the Hawking temperature and Bekenstein-Hawking entropy for the Schwarzschild black hole. Finally, the coefficient of the logarithmic correction to entropy is shown to be related to the trace anomaly.
Notes on Newton-Krylov based Incompressible Flow Projection Solver
Robert Nourgaliev; Mark Christon; J. Bakosi
2012-09-01
The purpose of the present document is to formulate a Jacobian-free Newton-Krylov algorithm for the approximate projection method used in the Hydra-TH code. Hydra-TH is developed by Los Alamos National Laboratory (LANL) under the auspices of the Consortium for Advanced Simulation of Light-Water Reactors (CASL) for thermal-hydraulics applications ranging from grid-to-rod fretting (GTRF) to multiphase flow subcooled boiling. Currently, Hydra-TH is based on the semi-implicit projection method, which provides an excellent platform for simulation of transient single-phase thermal-hydraulics problems. This algorithm, however, is not efficient when applied to very slow or steady-state problems, or to highly nonlinear multiphase problems relevant to nuclear reactor thermal-hydraulics with boiling and condensation. These applications require fully implicit, tightly coupled algorithms. The major technical contribution of the present report is the formulation of a fully implicit projection algorithm which will fulfill this purpose. This includes the definition of the non-linear residuals used for GMRES-based linear iterations, as well as physics-based preconditioning techniques.
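The ingredient that makes a Newton-Krylov method "Jacobian-free" is that the Krylov solver only needs products of the Jacobian with a vector, and these can be approximated from two residual evaluations. The sketch below illustrates that product with a forward difference on a toy two-equation residual. The residual, the function names, and the fixed perturbation size are illustrative assumptions and have nothing to do with the actual Hydra-TH residuals.

```python
def residual(u):
    """Toy nonlinear residual F(u) (hypothetical, for illustration only)."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


def jacobian_vector(F, u, v, eps=1e-7):
    """Approximate J(u) v by a forward difference, without ever forming J.

    J(u) v ~ (F(u + eps * v) - F(u)) / eps
    """
    Fu = F(u)
    Fp = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(Fp, Fu)]


u = [1.0, 2.0]
v = [1.0, -1.0]
approx = jacobian_vector(residual, u, v)

# Analytic Jacobian of the toy residual, used only to check the approximation.
J = [[2.0 * u[0], 1.0], [1.0, 2.0 * u[1]]]
exact = [J[0][0] * v[0] + J[0][1] * v[1], J[1][0] * v[0] + J[1][1] * v[1]]
print(approx, exact)
```

In a full solver, a routine like `jacobian_vector` would be handed to GMRES in place of an assembled matrix. Production codes usually scale the perturbation with the norms of `u` and `v` rather than fixing it.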
PDRK: A General Kinetic Dispersion Relation Solver for Magnetized Plasma
NASA Astrophysics Data System (ADS)
Xie, Huasheng; Xiao, Yong
2016-02-01
A general, fast, and effective approach is developed for numerical calculation of kinetic plasma linear dispersion relations. The plasma dispersion function is approximated by J-pole expansion. Subsequently, the dispersion relation is transformed to a standard matrix eigenvalue problem of an equivalent linear system. Numerical solutions for the least damped or fastest growing modes using an 8-pole expansion are generally accurate; more strongly damped modes are less accurate, but are less likely to be of physical interest. In contrast to conventional approaches, such as Newton's iterative method, this approach can give either all the solutions in the system or a few solutions around the initial guess. It is also free from convergence problems. The approach is demonstrated for electrostatic dispersion equations with one-dimensional and two-dimensional wavevectors, and for electromagnetic kinetic magnetized plasma dispersion relation for bi-Maxwellian distribution with relative parallel velocity flows between species. Supported by the National Magnetic Confinement Fusion Science Program of China (Nos. 2015GB110003, 2011GB105001, 2013GB111000), National Natural Science Foundation of China (No. 91130031), and the Recruitment Program of Global Youth Experts.
NASA Astrophysics Data System (ADS)
Gainullin, I. K.; Sonkin, M. A.
2015-03-01
A parallelized three-dimensional (3D) time-dependent Schrödinger equation (TDSE) solver for one-electron systems is presented in this paper. The TDSE Solver is based on the finite-difference method (FDM) in Cartesian coordinates and uses a simple and explicit leap-frog numerical scheme. The simplicity of the numerical method provides very efficient parallelization and high performance of calculations using Graphics Processing Units (GPUs). For example, calculation of 10^6 time-steps on the 1000×1000×1000 numerical grid (10^9 points) takes only 16 hours on 16 Tesla M2090 GPUs. The TDSE Solver demonstrates scalability (parallel efficiency) close to 100% with some limitations on the problem size. The TDSE Solver is validated by calculation of energy eigenstates of the hydrogen atom (13.55 eV) and affinity level of H- ion (0.75 eV). The comparison with other TDSE solvers shows that a GPU-based TDSE Solver is 3 times faster for the problems of the same size and with the same cost of computational resources. The usage of a non-regular Cartesian grid or problem-specific non-Cartesian coordinates increases this benefit up to 10 times. The TDSE Solver was applied to the calculation of the resonant charge transfer (RCT) in nanosystems, including several related physical problems, such as electron capture during H+-H0 collision and electron tunneling between H- ion and thin metallic island film.
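The explicit leap-frog update mentioned in this abstract is simple enough to sketch in one dimension. The following is an illustrative CPU toy, not the authors' GPU code: it assumes atomic units, a harmonic potential, zero boundary values, and a second-order Taylor step to start the three-level recurrence. Grid and step sizes are chosen so that the time step times the largest eigenvalue of the discrete Hamiltonian stays below one, which the leap-frog scheme needs for stability.

```python
import math


def hamiltonian(psi, dx, potential):
    """Apply H = -1/2 d^2/dx^2 + V, with psi = 0 outside the grid."""
    n = len(psi)
    out = []
    for i in range(n):
        left = psi[i - 1] if i > 0 else 0.0
        right = psi[i + 1] if i < n - 1 else 0.0
        laplacian = (left - 2.0 * psi[i] + right) / (dx * dx)
        out.append(-0.5 * laplacian + potential[i] * psi[i])
    return out


def leapfrog(psi0, dx, dt, steps, potential):
    """Explicit leap-frog: psi(t + dt) = psi(t - dt) - 2i dt H psi(t)."""
    # Bootstrap the second time level with a second-order Taylor step.
    h0 = hamiltonian(psi0, dx, potential)
    h1 = hamiltonian(h0, dx, potential)
    prev = psi0
    curr = [p - 1j * dt * a - 0.5 * dt * dt * b
            for p, a, b in zip(psi0, h0, h1)]
    for _ in range(steps - 1):
        h = hamiltonian(curr, dx, potential)
        prev, curr = curr, [p - 2j * dt * a for p, a in zip(prev, h)]
    return curr


def norm(psi, dx):
    return math.sqrt(sum(abs(p) ** 2 for p in psi) * dx)


n, dx, dt = 200, 0.1, 0.001
x = [(i - n // 2) * dx for i in range(n)]
potential = [0.5 * xi * xi for xi in x]       # harmonic well (assumed)
psi0 = [math.exp(-0.5 * xi * xi) for xi in x]  # Gaussian initial state
scale = norm(psi0, dx)
psi0 = [p / scale for p in psi0]

psi = leapfrog(psi0, dx, dt, 1000, potential)
print("norm after 1000 steps:", norm(psi, dx))
```

Each grid point is updated from its immediate neighbours only, which is why the scheme maps well onto GPUs, as the abstract notes.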
Plasma Physics Approximations in Ares
Managan, R. A.
2015-01-08
Lee & More derived analytic forms for the transport properties of a plasma. Many hydro-codes use their formulae for electrical and thermal conductivity. The coefficients are complex functions of Fermi-Dirac integrals, F_{n}( μ/θ ), the chemical potential, μ or ζ = ln(1+e^{ μ/θ} ), and the temperature, θ = kT. Since these formulae are expensive to compute, rational function approximations were fit to them. Approximations are also used to find the chemical potential, either μ or ζ . The fits use ζ as the independent variable instead of μ/θ . New fits are provided for A^{α} (ζ ),A^{β} (ζ ), ζ, f(ζ ) = (1 + e^{-μ/θ})F_{1/2}(μ/θ), F_{1/2}'/F_{1/2}, F_{c}^{α}, and F_{c}^{β}. In each case the relative error of the fit is minimized since the functions can vary by many orders of magnitude. The new fits are designed to exactly preserve the limiting values in the non-degenerate and highly degenerate limits or as ζ→ 0 or ∞. The original fits due to Lee & More and George Zimmerman are presented for comparison.
Wavelet Approximation in Data Assimilation
NASA Technical Reports Server (NTRS)
Tangborn, Andrew; Atlas, Robert (Technical Monitor)
2002-01-01
Estimation of the state of the atmosphere with the Kalman filter remains a distant goal because of the high computational cost of evolving the error covariance for both linear and nonlinear systems. Wavelet approximation is presented here as a possible solution that efficiently compresses both global and local covariance information. We demonstrate the compression characteristics on the error correlation field from a global two-dimensional chemical constituent assimilation, and implement an adaptive wavelet approximation scheme on the assimilation of the one-dimensional Burgers' equation. In the former problem, we show that 99% of the error correlation can be represented by just 3% of the wavelet coefficients, with good representation of localized features. In the Burgers' equation assimilation, the discrete linearized equations (tangent linear model) and analysis covariance are projected onto a wavelet basis and truncated to just 6% of the coefficients. A nearly optimal forecast is achieved and we show that errors due to truncation of the dynamics are no greater than the errors due to covariance truncation.
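The idea of keeping only a small fraction of wavelet coefficients can be sketched as follows. The abstract does not say which wavelet family is used, so this example assumes the orthonormal Haar basis on a one-dimensional signal whose length is a power of two. The function names are hypothetical.

```python
import math

def haar_forward(signal):
    """Orthonormal Haar transform; len(signal) must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    root2 = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / root2 for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / root2 for i in range(half)]
        coeffs[:n] = avg + diff  # averages first, then details
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward."""
    signal = list(coeffs)
    root2 = math.sqrt(2.0)
    n = 1
    while n < len(signal):
        merged = []
        for a, d in zip(signal[:n], signal[n:2 * n]):
            merged += [(a + d) / root2, (a - d) / root2]
        signal[:2 * n] = merged
        n *= 2
    return signal

def truncate(coeffs, keep):
    """Zero all but the `keep` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

# Example: a step function is captured by two Haar coefficients out of sixteen.
step = [1.0] * 8 + [3.0] * 8
approx = haar_inverse(truncate(haar_forward(step), keep=2))
```

A covariance field with localized features behaves similarly in spirit: most of the energy concentrates in a few coefficients, and the rest can be dropped at small cost in accuracy.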
NASA Astrophysics Data System (ADS)
Eymet, V.; Poitou, D.; Galtier, M.; El Hafi, M.; Terrée, G.; Fournier, R.
2013-11-01
The Monte-Carlo method is often presented as a reference method for radiative transfer simulation when dealing with participating, inhomogeneous media. The reason is that numerical uncertainties are only of a statistical nature and are accurately evaluated by measuring the standard deviation of the Monte Carlo weight. But classical Monte-Carlo algorithms first sample optical thicknesses and then determine absorption or scattering locations by inverting the formal integral definition of optical thickness as an increasing function of path length. This function is only seldom analytically invertible and numerical inversion procedures are required. Most commonly, a volumic grid is introduced and optical properties within each cell are replaced by approximate homogeneous or linear fields. Simulation results are then sensitive to the grid and can no longer be considered as references. We propose a new algorithmic formulation based on the use of null-collisions that eliminate the need for numerical inversion: no volumic grid is required. Benchmark configurations are first considered in order to evaluate the effect of two free parameters: the amount of null-collisions, and the criterion used to decide at which stage a Russian Roulette is used to exit the path tracking process. Then the corresponding algorithm is implemented using a development environment allowing to deal with complex geometries (thanks to computer graphics techniques), leading to a Monte Carlo code that can be easily used for validation of fast radiative transfer solvers embedded in combustion simulators. "Easily" means here that the way the Monte Carlo algorithm deals with both the geometry and the temperature/pressure/concentration fields is independent of the choices made inside the combustion solver: there is no need for the design of a new path-tracking procedure adapted to each new CFD grid. The Monte Carlo simulator is ready for use as soon as combustion specialists provide a localization
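The null-collision idea can be sketched in its simplest rejection-sampling form (often called delta or Woodcock tracking). This is an illustration only, not the integral formulation or the Russian Roulette criterion studied in the paper. It assumes a one-dimensional ray starting at x = 0, a user-supplied extinction field `extinction(x)`, and a known upper bound `k_max` on that field; all names are hypothetical.

```python
import math
import random

def sample_collision_distance(extinction, k_max, rng):
    """Sample the distance to the first real collision along a ray.

    The heterogeneous field is padded with fictitious "null" colliders so the
    total extinction is the constant k_max. Free paths are then exponential,
    and each tentative collision is accepted as real with probability
    extinction(x) / k_max. No inversion of the optical thickness is needed.
    Requires 0 < extinction(x) <= k_max along the ray.
    """
    x = 0.0
    while True:
        x += -math.log(1.0 - rng.random()) / k_max
        if rng.random() < extinction(x) / k_max:
            return x

# Example: estimate transmission through a slab of thickness 2 for a
# smoothly varying field (illustrative values only).
rng = random.Random(12345)
field = lambda x: 1.0 + 0.5 * math.sin(x)
samples = 20000
escaped = sum(
    1 for _ in range(samples)
    if sample_collision_distance(field, 1.5, rng) > 2.0
)
transmission_estimate = escaped / samples
```

The estimate can be compared with exp(-τ), where τ is the integral of the extinction field over the slab. A looser bound `k_max` stays unbiased but produces more null collisions per path, which is the trade-off behind the first free parameter mentioned in the abstract.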
NASA Astrophysics Data System (ADS)
Augustin, Christoph M.; Neic, Aurel; Liebmann, Manfred; Prassl, Anton J.; Niederer, Steven A.; Haase, Gundolf; Plank, Gernot
2016-01-01
Electromechanical (EM) models of the heart have been used successfully to study fundamental mechanisms underlying a heart beat in health and disease. However, in all modeling studies reported so far numerous simplifications were made in terms of representing biophysical details of cellular function and its heterogeneity, gross anatomy and tissue microstructure, as well as the bidirectional coupling between electrophysiology (EP) and tissue distension. One limiting factor is the employed spatial discretization methods which are not sufficiently flexible to accommodate complex geometries or resolve heterogeneities, but, even more importantly, the limited efficiency of the prevailing solver techniques which is not sufficiently scalable to deal with the incurring increase in degrees of freedom (DOF) when modeling cardiac electromechanics at high spatio-temporal resolution. This study reports on the development of a novel methodology for solving the nonlinear equation of finite elasticity using human whole organ models of cardiac electromechanics, discretized at a high para-cellular resolution. Three patient-specific, anatomically accurate, whole heart EM models were reconstructed from magnetic resonance (MR) scans at resolutions of 220 μm, 440 μm and 880 μm, yielding meshes of approximately 184.6, 24.4 and 3.7 million tetrahedral elements and 95.9, 13.2 and 2.1 million displacement DOF, respectively. The same mesh was used for discretizing the governing equations of both electrophysiology (EP) and nonlinear elasticity. A novel algebraic multigrid (AMG) preconditioner for an iterative Krylov solver was developed to deal with the resulting computational load. The AMG preconditioner was designed under the primary objective of achieving favorable strong scaling characteristics for both setup and solution runtimes, as this is key for exploiting current high performance computing hardware. Benchmark results using the 220 μm, 440 μm and 880 μm meshes demonstrate
Approximating metal-insulator transitions
NASA Astrophysics Data System (ADS)
Danieli, Carlo; Rayanov, Kristian; Pavlov, Boris; Martin, Gaven; Flach, Sergej
2015-12-01
We consider quantum wave propagation in one-dimensional quasiperiodic lattices. We propose an iterative construction of quasiperiodic potentials from sequences of potentials with increasing spatial period. At each finite iteration step, the eigenstates reflect the properties of the limiting quasiperiodic potential up to a controlled maximum system size. We then observe approximate Metal-Insulator Transitions (MIT) at the finite iteration steps. We also report evidence of mobility edges, which are at variance with the celebrated Aubry-André model. The dynamics near the MIT shows a critical slowing down of the ballistic group velocity in the metallic phase, similar to the divergence of the localization length in the insulating phase.
New generalized gradient approximation functionals
NASA Astrophysics Data System (ADS)
Boese, A. Daniel; Doltsinis, Nikos L.; Handy, Nicholas C.; Sprik, Michiel
2000-01-01
New generalized gradient approximation (GGA) functionals are reported, using the expansion form of A. D. Becke, J. Chem. Phys. 107, 8554 (1997), with 15 linear parameters. Our original such GGA functional, called HCTH, was determined through a least squares refinement to data of 93 systems. Here, the data are extended to 120 systems and 147 systems, introducing electron and proton affinities, and weakly bound dimers to give the new functionals HCTH/120 and HCTH/147. HCTH/120 has already been shown to give high quality predictions for weakly bound systems. The functionals are applied in a comparative study of the addition reaction of water to formaldehyde and sulfur trioxide, respectively. Furthermore, the performance of the HCTH/120 functional in Car-Parrinello molecular dynamics simulations of liquid water is encouraging.
Interplay of approximate planning strategies.
Huys, Quentin J M; Lally, Níall; Faulkner, Paul; Eshel, Neir; Seifritz, Erich; Gershman, Samuel J; Dayan, Peter; Roiser, Jonathan P
2015-03-10
Humans routinely formulate plans in domains so complex that even the most powerful computers are taxed. To do so, they seem to avail themselves of many strategies and heuristics that efficiently simplify, approximate, and hierarchically decompose hard tasks into simpler subtasks. Theoretical and cognitive research has revealed several such strategies; however, little is known about their establishment, interaction, and efficiency. Here, we use model-based behavioral analysis to provide a detailed examination of the performance of human subjects in a moderately deep planning task. We find that subjects exploit the structure of the domain to establish subgoals in a way that achieves a nearly maximal reduction in the cost of computing values of choices, but then combine partial searches with greedy local steps to solve subtasks, and maladaptively prune the decision trees of subtasks in a reflexive manner upon encountering salient losses. Subjects come idiosyncratically to favor particular sequences of actions to achieve subgoals, creating novel complex actions or "options." PMID:25675480
Indexing the approximate number system.
Inglis, Matthew; Gilmore, Camilla
2014-01-01
Much recent research attention has focused on understanding individual differences in the approximate number system, a cognitive system believed to underlie human mathematical competence. To date researchers have used four main indices of ANS acuity, and have typically assumed that they measure similar properties. Here we report a study which questions this assumption. We demonstrate that the numerical ratio effect has poor test-retest reliability and that it does not relate to either Weber fractions or accuracy on nonsymbolic comparison tasks. Furthermore, we show that Weber fractions follow a strongly skewed distribution and that they have lower test-retest reliability than a simple accuracy measure. We conclude by arguing that in the future researchers interested in indexing individual differences in ANS acuity should use accuracy figures, not Weber fractions or numerical ratio effects. PMID:24361686
IONIS: Approximate atomic photoionization intensities
NASA Astrophysics Data System (ADS)
Heinäsmäki, Sami
2012-02-01
A program to compute relative atomic photoionization cross sections is presented. The code applies the output of the multiconfiguration Dirac-Fock method for atoms in the single active electron scheme, by computing the overlap of the bound electron states in the initial and final states. The contribution from the single-particle ionization matrix elements is assumed to be the same for each final state. This method gives rather accurate relative ionization probabilities provided the single-electron ionization matrix elements do not depend strongly on energy in the region considered. The method is especially suited for open shell atoms where electronic correlation in the ionic states is large. Program summary: Program title: IONIS; Catalogue identifier: AEKK_v1_0; Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEKK_v1_0.html; Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland; Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html; No. of lines in distributed program, including test data, etc.: 1149; No. of bytes in distributed program, including test data, etc.: 12 877; Distribution format: tar.gz; Programming language: Fortran 95; Computer: Workstations; Operating system: GNU/Linux, Unix; Classification: 2.2, 2.5. Nature of problem: Photoionization intensities for atoms. Solution method: The code applies the output of the multiconfiguration Dirac-Fock codes Grasp92 [1] or Grasp2K [2], to compute approximate photoionization intensities. The intensity is computed within the one-electron transition approximation and by assuming that the sum of the single-particle ionization probabilities is the same for all final ionic states. Restrictions: The program gives nonzero intensities for those transitions where only one electron is removed from the initial configuration(s). Shake-type many-electron transitions are not computed. The ionized shell must be closed in the initial state. Running time: Few seconds for a
Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression
Edwards, Richard E; Zhang, Hao; Parker, Lynne Edwards; New, Joshua Ryan
2013-01-01
Kernel methods have difficulties scaling to large modern data sets. The scalability issues are based on computational and memory requirements for working with a large matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity for solving a single model, and the overall computational complexity associated with tuning hyperparameters are still major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately 1 million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.
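The O(n log n) cost follows from the fact that a circulant matrix is diagonalized by the discrete Fourier transform. The sketch below shows this for a plain (single-level) circulant matrix, not the multi-level construction or the cross-validation procedure of the paper. It solves a ridge-type system (C + λI)α = y, with C defined by its first column. The function names and the radix-2 FFT helper are assumptions made for this illustration.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); length must be a power of two."""
    n = len(values)
    if n <= 1:
        return list(values)
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def solve_circulant_ridge(first_column, targets, ridge):
    """Solve (C + ridge * I) alpha = targets for circulant C.

    C[i][j] = first_column[(i - j) % n], so C acts as a circular convolution
    and its eigenvalues are the FFT of its first column. Assumes a real,
    symmetric first column and real targets, so the solution is real.
    """
    n = len(first_column)
    eigenvalues = fft(first_column)
    rhs_hat = fft(targets)
    sol_hat = [r / (e + ridge) for r, e in zip(rhs_hat, eigenvalues)]
    return [v.real / n for v in fft(sol_hat, inverse=True)]

# Example with a small symmetric first column (illustrative values only).
column = [2.0, 1.0, 0.5, 0.25, 0.1, 0.25, 0.5, 1.0]
y = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]
alpha = solve_circulant_ridge(column, y, ridge=0.5)
```

Three FFTs replace the O(n^3) dense factorization. The trade-off is that the kernel matrix has to be approximated by a circulant structure in the first place.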
Parallel performance of a preconditioned CG solver for unstructured finite element applications
Shadid, J.N.; Hutchinson, S.A.; Moffat, H.K.
1994-12-31
A parallel unstructured finite element (FE) implementation designed for message passing MIMD machines is described. This implementation employs automated problem partitioning algorithms for load balancing unstructured grids, a distributed sparse matrix representation of the global finite element equations and a parallel conjugate gradient (CG) solver. In this paper a number of issues related to the efficient implementation of parallel unstructured mesh applications are presented. These include the differences between structured and unstructured mesh parallel applications, major communication kernels for unstructured CG solvers, automatic mesh partitioning algorithms, and the influence of mesh partitioning metrics on parallel performance. Initial results are presented for example finite element (FE) heat transfer analysis applications on a 1024 processor nCUBE 2 hypercube. Results indicate over 95% scaled efficiencies are obtained for some large problems despite the required unstructured data communication.
Analysis, tuning and comparison of two general sparse solvers for distributed memory computers
Amestoy, P.R.; Duff, I.S.; L'Excellent, J.-Y.; Li, X.S.
2000-06-30
We describe the work performed in the context of a Franco-Berkeley funded project between NERSC-LBNL located in Berkeley (USA) and CERFACS-ENSEEIHT located in Toulouse (France). We discuss both the tuning and performance analysis of two distributed memory sparse solvers (superlu from Berkeley and mumps from Toulouse) on the 512 processor Cray T3E from NERSC (Lawrence Berkeley National Laboratory). This project gave us the opportunity to improve the algorithms and add new features to the codes. We then quite extensively analyze and compare the two approaches on a set of large problems from real applications. We further explain the main differences in the behavior of the approaches on artificial regular grid problems. As a conclusion to this activity report, we mention a set of parallel sparse solvers on which this type of study should be extended.
NASA Technical Reports Server (NTRS)
Wang, Xiao-Yen; Chow, Chuen-Yen; Chang, Sin-Chung
1996-01-01
The 1-D, quasi 1-D and 2-D Euler solvers based on the method of space-time conservation element and solution element are used to simulate various flow phenomena including shock waves, Mach stem, contact surface, expansion waves, and their intersections and reflections. Seven test problems are solved to demonstrate the capability of this method for handling unsteady compressible flows in various configurations. Numerical results so obtained are compared with exact solutions and/or numerical solutions obtained by schemes based on other established computational techniques. Comparisons show that the present Euler solvers can generate highly accurate numerical solutions to complex flow problems in a straightforward manner without using any ad hoc techniques in the scheme.
An assessment of the adaptive unstructured tetrahedral grid, Euler Flow Solver Code FELISA
NASA Technical Reports Server (NTRS)
Djomehri, M. Jahed; Erickson, Larry L.
1994-01-01
A three-dimensional solution-adaptive Euler flow solver for unstructured tetrahedral meshes is assessed, and the accuracy and efficiency of the method for predicting sonic boom pressure signatures about simple generic models are demonstrated. Comparison of computational and wind tunnel data and enhancement of numerical solutions by means of grid adaptivity are discussed. The mesh generation is based on the advancing front technique. The FELISA code consists of two solvers, the Taylor-Galerkin and the Runge-Kutta-Galerkin schemes, both of which are spatially discretized by the usual Galerkin weighted residual finite-element methods but with different explicit time-marching schemes to steady state. The solution-adaptive grid procedure is based on either remeshing or mesh refinement techniques. An alternative geometry adaptive procedure is also incorporated.
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; Vastano, John A.; Lomax, Harvard
1992-01-01
Generic shapes are subjected to pulsed plane waves of arbitrary shape. The resulting scattered electromagnetic fields are determined analytically. These fields are then computed efficiently at field locations for which numerically determined EM fields are required. Of particular interest are the pulsed waveform shapes typically utilized by radar systems. The results can be used to validate the accuracy of finite difference time domain Maxwell's equations solvers. A two-dimensional solver which is second- and fourth-order accurate in space and fourth-order accurate in time is examined. Dielectric media properties are modeled by a ramping technique which simplifies the associated gridding of body shapes. The attributes of the ramping technique are evaluated by comparison with the analytic solutions.
Courant Number and Mach Number Insensitive CE/SE Euler Solvers
NASA Technical Reports Server (NTRS)
Chang, Sin-Chung
2005-01-01
It has been known that the space-time CE/SE method can be used to obtain 1D, 2D, and 3D steady and unsteady flow solutions with Mach numbers ranging from 0.0028 to 10. However, it is also known that a CE/SE solution may become overly dissipative when the Mach number is very small. As an initial attempt to remedy this weakness, new 1D Courant number and Mach number insensitive CE/SE Euler solvers are developed using several key concepts underlying the recent successful development of Courant number insensitive CE/SE schemes. Numerical results indicate that the new solvers are capable of crisply resolving a contact discontinuity embedded in a flow with the maximum Mach number = 0.01.
Generating Combinatorial Test Cases by Efficient SAT Encodings Suitable for CDCL SAT Solvers
NASA Astrophysics Data System (ADS)
Banbara, Mutsunori; Matsunaka, Haruki; Tamura, Naoyuki; Inoue, Katsumi
Generating test cases for combinatorial testing amounts to finding a covering array in Combinatorial Designs. In this paper, we consider the problem of finding optimal covering arrays by SAT encoding. We present two encodings suitable for modern CDCL SAT solvers. One is based on the order encoding, which is efficient in the sense that unit propagation achieves bounds consistency in CSPs. The other is based on a combination of the order encoding and Hnich's encoding. CDCL SAT solvers have an important role in the latest SAT technology. The effective use of them is essential for enhancing efficiency. In our experiments, we found solutions that can be competitive with the previously known results for the arrays of strength two to six with small to moderate size of components and symbols. Moreover, we succeeded either in proving the optimality of known bounds or in improving known lower bounds for some arrays.
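The order encoding referred to above represents an integer variable x through Boolean variables meaning "x <= a". The sketch below shows only that basic building block, not the covering-array constraints of the paper. The DIMACS-style variable numbering and the function names are assumptions made for illustration, and the brute-force model enumerator is only there to check the encoding on tiny domains.

```python
from itertools import product

def order_encoding_axioms(first_var, domain_size):
    """Axiom clauses for an integer x in {0, ..., domain_size - 1}.

    Boolean variable (first_var + a) means "x <= a" for
    a = 0 .. domain_size - 2. Each clause [-p, q] states
    "x <= a implies x <= a + 1". Literals are signed integers.
    """
    return [[-(first_var + a), first_var + a + 1] for a in range(domain_size - 2)]

def decode(assignment, first_var, domain_size):
    """Recover x: the smallest a with "x <= a" true, else the maximum value."""
    for a in range(domain_size - 1):
        if assignment[first_var + a]:
            return a
    return domain_size - 1

def models(clauses, variables):
    """Enumerate satisfying assignments by brute force (tiny inputs only)."""
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            yield assignment

# Example: a variable with five values needs four Booleans and three clauses.
clauses = order_encoding_axioms(first_var=1, domain_size=5)
values = sorted(decode(m, 1, 5) for m in models(clauses, [1, 2, 3, 4]))
```

Because every satisfying assignment is a monotone sequence of truth values, a bound such as "x <= 2" is a single literal, which is what lets unit propagation tighten bounds directly.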
Extending Clause Learning of SAT Solvers with Boolean Gröbner Bases
NASA Astrophysics Data System (ADS)
Zengler, Christoph; Küchlin, Wolfgang
We extend clause learning as performed by most modern SAT Solvers by integrating the computation of Boolean Gröbner bases into the conflict learning process. Instead of learning only one clause per conflict, we compute and learn additional binary clauses from a Gröbner basis of the current conflict. We used the Gröbner basis engine of the logic package Redlog contained in the computer algebra system Reduce to extend the SAT solver MiniSAT with Gröbner basis learning. Our approach shows a significant reduction of conflicts and a reduction of restarts and computation time on many hard problems from the SAT 2009 competition.
Linear optical response of finite systems using multishift linear system solvers
Hübener, Hannes; Giustino, Feliciano
2014-07-28
We discuss the application of multishift linear system solvers to linear-response time-dependent density functional theory. Using this technique the complete frequency-dependent electronic density response of finite systems to an external perturbation can be calculated at the cost of a single solution of a linear system via conjugate gradients. We show that multishift time-dependent density functional theory yields excitation energies and oscillator strengths in perfect agreement with the standard diagonalization of the response matrix (Casida's method), while being computationally advantageous. We present test calculations for benzene, porphin, and chlorophyll molecules. We argue that multishift solvers may find broad applicability in the context of excited-state calculations within density-functional theory and beyond.
Progress Toward Overset-Grid Moving Body Capability for USM3D Unstructured Flow Solver
NASA Technical Reports Server (NTRS)
Pandyna, Mohagna J.; Frink, Neal T.; Noack, Ralph W.
2005-01-01
A static and dynamic Chimera overset-grid capability is added to an established NASA tetrahedral unstructured parallel Navier-Stokes flow solver, USM3D. Modifications to the solver primarily consist of a few strategic calls to the Donor interpolation Receptor Transaction library (DiRTlib) to facilitate communication of solution information between various grids. The assembly of multiple overlapping grids into a single-zone composite grid is performed by the Structured, Unstructured and Generalized Grid AssembleR (SUGGAR) code. Several test cases are presented to verify the implementation, assess overset-grid solution accuracy and convergence relative to single-grid solutions, and demonstrate the prescribed relative grid motion capability.
A steady-state solver and stability calculator for nonlinear internal wave flows
NASA Astrophysics Data System (ADS)
Viner, Kevin C.; Epifanio, Craig C.; Doyle, James D.
2013-10-01
A steady solver and stability calculator is presented for the problem of nonlinear internal gravity waves forced by topography. Steady-state solutions are obtained using Newton's method, as applied to a finite-difference discretization in terrain-following coordinates. The iteration is initialized using a boundary-inflation scheme, in which the nonlinearity of the flow is gradually increased over the first few Newton steps. The resulting method is shown to be robust over the full range of nonhydrostatic and rotating parameter space. Examples are given for both nonhydrostatic and rotating flows, as well as flows with realistic upstream shear and static stability profiles. With a modest extension, the solver also allows for a linear stability analysis of the steady-state wave fields. Unstable modes are computed using a shifted-inverse method, combined with a parameter-space search over a set of realistic target values. An example is given showing resonant instability in a nonhydrostatic mountain wave.
SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems
Li, Xiaoye S.; Demmel, James W.
2002-03-27
In this paper, we present the main algorithmic features in the software package SuperLU_DIST, a distributed-memory sparse direct solver for large sets of linear equations. We give in detail our parallelization strategies, with focus on scalability issues, and demonstrate the parallel performance and scalability on current machines. The solver is based on sparse Gaussian elimination, with an innovative static pivoting strategy proposed earlier by the authors. The main advantage of static pivoting over classical partial pivoting is that it permits a priori determination of data structures and communication pattern for sparse Gaussian elimination, which makes it more scalable on distributed memory machines. Based on this a priori knowledge, we designed highly parallel and scalable algorithms for both LU decomposition and triangular solve and we show that they are suitable for large-scale distributed memory machines.
A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities
O. Kononenko
2015-02-17
ACE3P is a 3D massively parallel simulation suite developed at SLAC National Accelerator Laboratory that can perform coupled electromagnetic, thermal and mechanical studies. Effectively utilizing supercomputer resources, ACE3P has become a key simulation tool for particle accelerator R&D. A new frequency domain solver to perform mechanical harmonic response analysis of accelerator components is developed within the existing parallel framework. This solver is designed to determine the frequency response of the mechanical system to external harmonic excitations for time-efficient accurate analysis of large-scale problems. Coupled with the ACE3P electromagnetic modules, this capability complements a set of multi-physics tools for a comprehensive study of microphonics in superconducting accelerating cavities in order to understand the RF response and feedback requirements for the operational reliability of a particle accelerator. (auth)
NASA Astrophysics Data System (ADS)
Cerroni, D.; Fancellu, L.; Manservisi, S.; Menghini, F.
2016-06-01
In this work we propose to study the behavior of a solid elastic object that interacts with a multiphase flow. Fluid-structure interaction and multiphase problems are of great interest in engineering and science because of many potential applications. The study of this interaction by coupling a fluid-structure interaction (FSI) solver with a multiphase problem could open a large range of possibilities in the investigation of realistic problems. We use an FSI solver based on a monolithic approach, while the two-phase interface advection and reconstruction is computed in the framework of a Volume of Fluid (VOF) method, which is one of the more popular algorithms for two-phase flow problems. The coupling between the FSI and VOF algorithms is efficiently handled with the use of MEDMEM libraries implemented in the computational platform Salome. The numerical results of a dam break problem over a deformable solid are reported in order to show the robustness and stability of this numerical approach.
Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP
NASA Technical Reports Server (NTRS)
Chan, Tony F.; Fatoohi, Rod A.
1990-01-01
The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.
Extending substructure based iterative solvers to multiple load and repeated analyses
NASA Technical Reports Server (NTRS)
Farhat, Charbel
1993-01-01
Direct solvers currently dominate commercial finite element structural software, but do not scale well in the fine granularity regime targeted by emerging parallel processors. Substructure based iterative solvers--also often called domain decomposition algorithms--lend themselves better to parallel processing, but must overcome several obstacles before earning their place in general purpose structural analysis programs. One such obstacle is the solution of systems with many or repeated right hand sides. Such systems arise, for example, in multiple load static analyses and in implicit linear dynamics computations. Direct solvers are well-suited for these problems because after the system matrix has been factored, the multiple or repeated solutions can be obtained through relatively inexpensive forward and backward substitutions. On the other hand, iterative solvers in general are ill-suited for these problems because they often must restart from scratch for every different right hand side. In this paper, we present a methodology for extending the range of applications of domain decomposition methods to problems with multiple or repeated right hand sides. Basically, we formulate the overall problem as a series of minimization problems over K-orthogonal and supplementary subspaces, and tailor the preconditioned conjugate gradient algorithm to solve them efficiently. The resulting solution method is scalable, whereas direct factorization schemes and forward and backward substitution algorithms are not. We illustrate the proposed methodology with the solution of static and dynamic structural problems, and highlight its potential to outperform forward and backward substitutions on parallel computers. As an example, we show that for a linear structural dynamics problem with 11640 degrees of freedom, every time-step beyond time-step 15 is solved in a single iteration and consumes 1.0 second on a 32 processor iPSC-860 system; for the same problem and the same parallel
IDSOLVER: A general purpose solver for nth-order integro-differential equations
NASA Astrophysics Data System (ADS)
Gelmi, Claudio A.; Jorquera, Héctor
2014-01-01
Many mathematical models of complex processes may be posed as integro-differential equations (IDE). Many numerical methods have been proposed for solving those equations, but most of them are ad hoc, so each new equation has to be solved from scratch by translating the IDE into the framework of the specific method chosen. Furthermore, there is a paucity of general-purpose numerical solvers that free the user from additional tasks.
A parallel scheduler for block iterative solvers in heterogeneous computing environments
Arioli, M.; Drummond, A.; Ruiz, D.
1995-12-01
We present a parallel scheduler for distributing work to a group of processors in a heterogeneous computing environment. Some of the processors in the heterogeneous computing environment can be clustered to take advantage of particular communication networks. Here, the scheduler has been used in the implementation of a parallel block iterative solver based on the Cimmino method. We have used PVM 3 to implement the communication between the heterogeneous processors.
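The abstract above does not describe the Cimmino iteration itself, so the following is only a minimal sketch of the classical row-wise Cimmino method, not the authors' block solver or scheduler. The function name `cimmino`, the relaxation parameter `omega`, and the test matrix are illustrative assumptions. Every row's correction is computed from the same old iterate, which is what makes the method easy to distribute across processors.

```python
def cimmino(A, b, omega=1.0, tol=1e-12, max_iter=10000):
    """Row-wise Cimmino iteration for A x = b (A given as a list of rows).

    Illustrative sketch only: each row plays the role of one "block".
    With omega = 1 the iteration converges for any nonsingular A.
    """
    m, n = len(A), len(A[0])
    row_norm2 = [sum(v * v for v in row) for row in A]
    x = [0.0] * n
    for _ in range(max_iter):
        # All residuals use the same (old) iterate x.
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        if max(abs(r) for r in residual) < tol:
            break
        # Average of the projections onto each row's hyperplane.
        # The m corrections are independent and could run in parallel.
        for i in range(m):
            scale = omega * residual[i] / (m * row_norm2[i])
            for j in range(n):
                x[j] += scale * A[i][j]
    return x


# Small hypothetical test problem: a 4x4 tridiagonal matrix.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
x_true = [1.0, 2.0, 3.0, 4.0]
b = [sum(A[i][j] * x_true[j] for j in range(4)) for i in range(4)]
x = cimmino(A, b)
```

In a block variant, each processor would own a group of rows and apply the pseudoinverse of its block instead of a single-row projection; how the work is assigned to heterogeneous processors is the scheduler's job and is not modeled here.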
A semi-direct solver for compressible 3-dimensional rotational flow
NASA Technical Reports Server (NTRS)
Chang, S. C.; Adamczyk, J. J.
1983-01-01
An iterative procedure is presented for solving steady inviscid 3-D subsonic rotational flow problems. The procedure combines concepts from classical secondary flow theory with an extension to 3-D of a novel semi-direct Cauchy-Riemann solver. It is developed for generalized coordinates and can be exercised using standard finite difference procedures. The stability criterion of the iterative procedure is discussed along with its ability to capture the evolution of inviscid secondary flow in a turning channel.
A semi-direct solver for compressible three-dimensional rotational flow
NASA Technical Reports Server (NTRS)
Chang, S.-C.; Adamczyk, J. J.
1983-01-01
An iterative procedure is presented for solving steady inviscid 3-D subsonic rotational flow problems. The procedure combines concepts from classical secondary flow theory with an extension to 3-D of a novel semi-direct Cauchy-Riemann solver. It is developed for generalized coordinates and can be exercised using standard finite difference procedures. The stability criterion of the iterative procedure is discussed along with its ability to capture the evolution of inviscid secondary flow in a turning channel.
Cummings, Julian C.
2013-05-15
This project was a collaboration between researchers at the California Institute of Technology and the University of California, Irvine to investigate the utility of a global field-aligned mesh and gyrokinetic field solver for simulations of the tokamak plasma edge region. Mesh generation software from UC Irvine was tested with specific tokamak edge magnetic geometry scenarios and the quality of the meshes and the solutions to the gyrokinetic Poisson equation were evaluated.
Hybrid MPI+OpenMP Programming of an Overset CFD Solver and Performance Investigations
NASA Technical Reports Server (NTRS)
Djomehri, M. Jahed; Jin, Haoqiang H.; Biegel, Bryan (Technical Monitor)
2002-01-01
This report describes a two-level parallelization of a Computational Fluid Dynamics (CFD) solver with multi-zone overset structured grids. The approach is based on a hybrid MPI+OpenMP programming model suitable for shared memory and clusters of shared memory machines. The performance investigations of the hybrid application on an SGI Origin2000 (O2K) machine are reported using medium and large scale test problems.
A Newton-Krylov Solver for Implicit Solution of Hydrodynamics in Core Collapse Supernovae
Reynolds, D R; Swesty, F D; Woodward, C S
2008-06-12
This paper describes an implicit approach and nonlinear solver for solution of radiation-hydrodynamic problems in the context of supernovae and proto-neutron star cooling. The robust approach applies Newton-Krylov methods and overcomes the difficulties of discontinuous limiters in the discretized equations and scaling of the equations over wide ranges of physical behavior. We discuss these difficulties, our approach for overcoming them, and numerical results demonstrating accuracy and efficiency of the method.
NASA Astrophysics Data System (ADS)
Mena, Andres; Ferrero, Jose M.; Rodriguez Matas, Jose F.
2015-11-01
Solving the electric activity of the heart poses a big challenge, not only because of the structural complexities inherent to the heart tissue, but also because of the complex electric behaviour of the cardiac cells. The multi-scale nature of the electrophysiology problem makes its numerical solution difficult, requiring temporal and spatial resolutions of 0.1 ms and 0.2 mm respectively for accurate simulations, leading to models with millions of degrees of freedom that need to be solved for thousands of time steps. Solution of this problem requires the use of algorithms with a higher level of parallelism in multi-core platforms. In this regard the newer programmable graphic processing units (GPU) have become a valid alternative due to their tremendous computational horsepower. This paper presents results obtained with a novel electrophysiology simulation software entirely developed in Compute Unified Device Architecture (CUDA). The software implements fully explicit and semi-implicit solvers for the monodomain model, using operator splitting. Performance is compared with classical multi-core MPI based solvers operating on dedicated high-performance computer clusters. Results obtained with the GPU based solver show enormous potential for this technology with accelerations over 50× for three-dimensional problems.
Premixed flame response to pressure fluctuations using an implicit solver with detailed chemistry
NASA Astrophysics Data System (ADS)
Malik, Nadeem
2015-11-01
A major challenge in combustion research is the coupling of the compressible flow field to the detailed thermochemistry. Recent advances in numerical solvers have met this challenge within an implicit numerical framework, retaining the full stiffness of the realistic comprehensive chemistry and multicomponent transport properties in the system. Here, the solver TARDIS (Transient Advection Reaction Diffusion Implicit Simulations) is demonstrated, first, by investigating the laminar flame speed in stoichiometric H2/air and CH4/air flames as a function of the flame curvature, which is found to follow non-linear regimes, contrary to previous thinking. Second, planar and curved laminar flames are subjected to pressure and equivalence ratio oscillations and found to respond through a spectrum of time and length scales. TARDIS has the potential to elucidate fundamental aspects of flame structure and thermochemistry, and could be the basis for a new generation of implicit DNS solvers. The author acknowledges financial support from SABIC, #SB101018, through the Dean of Scientific Research at KFUPM.
Parallel satellite orbital situational problems solver for space missions design and control
NASA Astrophysics Data System (ADS)
Atanassov, Atanas Marinov
2016-11-01
Solving scientific problems for space applications requires carrying out observations, measurements, or active experiments during time intervals in which specific geometric and physical conditions are fulfilled. Solving situational problems to determine the time intervals in which the satellite instruments work optimally is a very important part of the activities at every stage of the preparation and realization of space missions. Developing a universal, flexible, and robust approach to situation analysis that is easily portable to new satellite missions is significant for reducing mission preparation times and costs. Every situation problem can be based on one or more situation conditions. Simultaneously solving different kinds of situation problems, based on different numbers and types of situational conditions, each satisfied on different segments of the satellite orbit, requires irregular calculations. Three formal approaches are presented. The first concerns the description of situation problems, which allows flexibility in assembling situation problems and representing them in computer memory. The second concerns the development of a situation problem solver organized as a processor that executes specific code for each particular situational condition. The third concerns solver parallelization using threads and dynamic scheduling based on a "pool of threads" abstraction, which ensures good load balance. The developed situation problem solver is intended for incorporation into multi-physics, multi-satellite space mission design and simulation tools.
Coordinate-Space Hartree-Fock-Bogoliubov Solvers for Superfluid Fermi Systems in Large Boxes
Pei, J. C.; Fann, George I; Harrison, Robert J; Nazarewicz, W.; Hill, Judith C; Galindo, Diego A; Jia, Jun
2012-01-01
The self-consistent Hartree-Fock-Bogoliubov problem in large boxes can be solved accurately in the coordinate space with the recently developed solvers HFB-AX (2D) and MADNESS-HFB (3D). This is essential for the description of superfluid Fermi systems with complicated topologies and significant spatial extent, such as fissioning nuclei, weakly-bound nuclei, nuclear matter in the neutron star crust, and ultracold Fermi atoms in elongated traps. The HFB-AX solver, based on B-spline techniques, uses a hybrid MPI and OpenMP programming model for distributed parallel computation; within a node, multi-threaded LAPACK and BLAS libraries are used to further enable parallel calculations of large eigensystems. The MADNESS-HFB solver uses novel adaptive pseudo-spectral techniques based on multi-resolution analysis to enable fully parallel 3D calculations of very large systems. In this work we present benchmark results for HFB-AX and MADNESS-HFB on ultracold trapped fermions.
NASA Technical Reports Server (NTRS)
Eidson, T. M.; Erlebacher, G.
1994-01-01
While parallel computers offer significant computational performance, it is generally necessary to evaluate several programming strategies. Two programming strategies for a fairly common problem, a periodic tridiagonal solver, are developed and evaluated. Simple model calculations as well as timing results are presented to evaluate the various strategies. The particular tridiagonal solver evaluated is used in many computational fluid dynamics simulation codes. The feature that makes this algorithm unique is that these simulation codes usually require simultaneous solutions for multiple right-hand sides (RHS) of the system of equations. Each RHS solution is independent and thus can be computed in parallel. Thus a Gaussian elimination type algorithm can be used in a parallel computation, and more complicated approaches such as cyclic reduction are not required. The two strategies are a transpose strategy and a distributed solver strategy. For the transpose strategy, the data is moved so that a subset of all the RHS problems is solved on each of the several processors. This usually requires significant data movement between processor memories across a network. The second strategy attempts to have the algorithm pass the data across processor boundaries in a chained manner. This usually requires significantly less data movement. An approach to accomplish this second strategy in a near-perfect load-balanced manner is developed. In addition, an algorithm will be shown to directly transform a sequential Gaussian elimination type algorithm into the parallel chained, load-balanced algorithm.
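To make the multiple-RHS structure concrete, here is a minimal sequential sketch of a periodic tridiagonal solve in which the elimination factors are computed once and then reused for every right-hand side. It is an illustration under stated assumptions, not the transpose or chained algorithm of the report: the periodic corner entries are handled with the Sherman-Morrison formula (a standard technique that the abstract does not name), no pivoting is performed, and all function and variable names are hypothetical.

```python
def periodic_matvec(a, b, c, x):
    """Multiply the periodic tridiagonal matrix by x.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1], indices taken mod n.
    """
    n = len(b)
    return [a[i] * x[i - 1] + b[i] * x[i] + c[i] * x[(i + 1) % n] for i in range(n)]


def solve_periodic_tridiagonal(a, b, c, rhs_list):
    """Solve the same periodic tridiagonal system for several right-hand sides.

    Assumes n >= 3, b[0] != 0, and that elimination without pivoting is
    stable (for example, a diagonally dominant matrix).
    """
    n = len(b)
    alpha, beta = c[-1], a[0]        # bottom-left and top-right corner entries
    gamma = -b[0]                    # free parameter of the rank-one splitting
    diag = list(b)
    diag[0] = b[0] - gamma
    diag[-1] = b[-1] - alpha * beta / gamma

    # Factor the non-periodic tridiagonal part once.
    mult = [0.0] * n
    piv = [0.0] * n
    piv[0] = diag[0]
    for i in range(1, n):
        mult[i] = a[i] / piv[i - 1]
        piv[i] = diag[i] - mult[i] * c[i - 1]

    def substitute(d):
        """Forward elimination and back substitution with the stored factors."""
        y = list(d)
        for i in range(1, n):
            y[i] -= mult[i] * y[i - 1]
        y[-1] /= piv[-1]
        for i in range(n - 2, -1, -1):
            y[i] = (y[i] - c[i] * y[i + 1]) / piv[i]
        return y

    # Correction vector for the rank-one update, shared by all RHS.
    u = [0.0] * n
    u[0], u[-1] = gamma, alpha
    z = substitute(u)
    denom = 1.0 + z[0] + beta * z[-1] / gamma

    solutions = []
    for d in rhs_list:               # each RHS is independent of the others
        y = substitute(d)
        factor = (y[0] + beta * y[-1] / gamma) / denom
        solutions.append([yi - factor * zi for yi, zi in zip(y, z)])
    return solutions


# Hypothetical test data: a 5x5 diagonally dominant periodic system, two RHS.
n = 5
a, b, c = [-1.0] * n, [4.0] * n, [-1.0] * n
exact = [[1.0, 2.0, 3.0, 4.0, 5.0], [1.0, -1.0, 0.5, 0.0, 2.0]]
rhs_list = [periodic_matvec(a, b, c, x) for x in exact]
solutions = solve_periodic_tridiagonal(a, b, c, rhs_list)
```

The loop over `rhs_list` is the part that parallelizes trivially; the two strategies in the abstract differ in how the data for that loop is laid out across processor memories.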
How to Compute Green's Functions for Entire Mass Trajectories Within Krylov Solvers
NASA Astrophysics Data System (ADS)
Glässner, Uwe; Güsken, Stephan; Lippert, Thomas; Ritzenhöfer, Gero; Schilling, Klaus; Frommer, Andreas
The availability of efficient Krylov subspace solvers plays a vital role in the solution of a variety of numerical problems in computational science. Here we consider lattice field theory. We present a new general numerical method to compute many Green's functions for complex non-singular matrices within one iteration process. Our procedure applies to matrices of structure A = D - m, with m proportional to the unit matrix, and can be integrated within any Krylov subspace solver. We can compute the derivatives x^(n) of the solution vector x with respect to the parameter m and construct the Taylor expansion of x around m. We demonstrate the advantages of our method using a minimal residual solver. Here the procedure requires one intermediate vector for each Green's function to compute. As a real-life example, we determine a mass trajectory of the Wilson fermion matrix for lattice QCD. Here we find that we can obtain Green's functions at all masses ≥ m at the price of one inversion at mass m.
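The Taylor expansion mentioned in the abstract can be written out explicitly. The following is a short derivation under the assumption that the system solved is A(m) x(m) = b with a fixed source vector b (the symbol b and the shift δ are introduced here for illustration and do not appear in the abstract):

```latex
% Assumption: A(m) = D - m\,\mathbb{1}, \quad x(m) = A(m)^{-1} b .
% Since dA/dm = -\mathbb{1}, we have \frac{d}{dm} A^{-1} = A^{-2}, and by induction
\[
  x^{(n)}(m) \;=\; \frac{d^{n} x}{d m^{n}} \;=\; n!\, A(m)^{-(n+1)}\, b .
\]
% The Taylor series around m is therefore a geometric (Neumann) series:
\[
  x(m+\delta) \;=\; \sum_{n=0}^{\infty} \frac{\delta^{n}}{n!}\, x^{(n)}(m)
            \;=\; \sum_{n=0}^{\infty} \delta^{n} A(m)^{-(n+1)}\, b ,
  \qquad |\delta|\,\bigl\lVert A(m)^{-1} \bigr\rVert < 1 .
\]
```

The convergence condition stated here is only the sufficient condition for the plain series; how the terms are accumulated inside the Krylov iteration is specific to the paper and is not reproduced.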
Three-Dimensional Inverse Transport Solver Based on Compressive Sensing Technique
NASA Astrophysics Data System (ADS)
Cheng, Yuxiong; Wu, Hongchun; Cao, Liangzhi; Zheng, Youqi
2013-09-01
Based on direct exposure measurements from flash radiographic images, a compressive sensing-based method for the three-dimensional inverse transport problem is presented. The linear absorption coefficients and interface locations of objects are reconstructed directly at the same time. It is always very expensive to obtain enough measurements. With limited measurements, the compressive sensing sparse reconstruction technique orthogonal matching pursuit is applied to obtain the sparse coefficients by solving an optimization problem. A three-dimensional inverse transport solver is developed based on a compressive sensing-based technique. There are three features in this solver: (1) AutoCAD is employed as a geometry preprocessor due to its powerful graphics capabilities. (2) The forward projection matrix rather than a Gauss matrix is constructed by the visualization tool generator. (3) Fourier transform and Daubechies wavelet transform are adopted to convert an underdetermined system to a well-posed system in the algorithm. Simulations are performed, and numerical results for the pseudo-sine absorption problem, the two-cube problem and the two-cylinder problem obtained with the compressive sensing-based solver agree well with the reference values.
A User's Manual for ROTTILT Solver: Tiltrotor Fountain Flow Field Prediction
NASA Technical Reports Server (NTRS)
Tadghighi, Hormoz; Rajagopalan, R. Ganesh
1999-01-01
A CFD solver has been developed to provide the time averaged details of the fountain flow typical for tiltrotor aircraft in hover. This Navier-Stokes solver, designated as ROTTILT, assumes the 3-D fountain flowfield to be steady and incompressible. The theoretical background is described in this manual. In order to enable the rotor trim solution in the presence of tiltrotor aircraft components such as wing, nacelle, and fuselage, the solver is coupled with a set of trim routines which are highly efficient in CPU and suitable for CFD analysis. The Cartesian grid technique utilized provides the user with a unique capability for insertion or elimination of any components of the bodies considered for a given tiltrotor aircraft configuration. The flowfield associated with either a semi or full-span configuration can be computed through user options in the ROTTILT input file. Full details associated with the numerical solution implemented in ROTTILT and assumptions are presented. A description of input surface mesh topology is provided in the appendices along with a listing of all preprocessor programs. Input variable definitions and default values are provided for the V22 aircraft. Limited predicted results using the coupled ROTTILT/WOPWOP program for the V22 in hover are made and compared with measurement. To visualize the V22 aircraft and predictions, a preprocessor graphics program GNU-PLOT3D was used. This program is described and example graphic results presented.
Two-Dimensional Riemann Solver for Euler Equations of Gas Dynamics
NASA Astrophysics Data System (ADS)
Brio, M.; Zakharian, A. R.; Webb, G. M.
2001-02-01
We construct a Riemann solver based on two-dimensional linear wave contributions to the numerical flux that generalizes the one-dimensional method due to Roe (1981, J. Comput. Phys. 43, 157). The solver is based on a multistate Riemann problem and is suitable for arbitrary triangular grids or any other finite volume tessellations of the plane. We present numerical examples illustrating the performance of the method using both first- and second-order-accurate numerical solutions. The numerical flux contributions are due to one-dimensional waves and multidimensional waves originating from the corners of the computational cell. Under appropriate CFL restrictions, the contributions of one-dimensional waves dominate the flux, which explains good performance of dimensionally split solvers in practice. The multidimensional flux corrections increase the accuracy and stability, allowing a larger time step. The improvements are more pronounced on a coarse mesh and for large CFL numbers. For the second-order method, the improvements can be comparable to the improvements resulting from a less diffusive limiter.
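For readers unfamiliar with the one-dimensional Roe method that this paper generalizes, the idea can be sketched on a scalar model problem. The example below uses the inviscid Burgers equation rather than the Euler equations, has no entropy fix, and says nothing about the two-dimensional corner-wave corrections; the function names are illustrative assumptions.

```python
def burgers_flux(u):
    """Physical flux f(u) = u^2 / 2 of the inviscid Burgers equation."""
    return 0.5 * u * u


def roe_flux(u_left, u_right):
    """Roe-type numerical flux at an interface between two constant states.

    For Burgers' equation the Roe-averaged wave speed is the arithmetic
    mean of the two states; the flux is the central average minus an
    upwind dissipation term proportional to |speed| times the jump.
    """
    speed = 0.5 * (u_left + u_right)
    central = 0.5 * (burgers_flux(u_left) + burgers_flux(u_right))
    return central - 0.5 * abs(speed) * (u_right - u_left)


# A right-moving shock takes the left flux, a left-moving one the right flux.
flux_right_moving = roe_flux(2.0, 0.0)
flux_left_moving = roe_flux(-1.0, -3.0)
```

In the system case the scalar speed is replaced by the eigen-decomposition of the Roe-averaged flux Jacobian, and each wave contributes its own upwind term.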
An installed nacelle design code using a multiblock Euler solver. Volume 1: Theory document
NASA Technical Reports Server (NTRS)
Chen, H. C.
1992-01-01
An efficient multiblock Euler design code was developed for designing a nacelle installed on geometrically complex airplane configurations. This approach employed a design driver based on a direct iterative surface curvature method developed at LaRC. A general multiblock Euler flow solver was used for computing flow around complex geometries. The flow solver used a finite-volume formulation with explicit time-stepping to solve the Euler equations. It used a multiblock version of the multigrid method to accelerate the convergence of the calculations. The design driver successively updated the surface geometry to reduce the difference between the computed and target pressure distributions. In the flow solver, the change in surface geometry was simulated by applying surface transpiration boundary conditions to avoid repeated grid generation during design iterations. Smoothness of the designed surface was ensured by alternate application of streamwise and circumferential smoothings. The capability and efficiency of the code were demonstrated through the design of both an isolated nacelle and an installed nacelle at various flow conditions. Information on the execution of the computer program is provided in volume 2.
Uysal, Ismail E; Arda Ülkü, H; Bağci, Hakan
2016-09-01
Transient electromagnetic interactions on plasmonic nanostructures are analyzed by solving the Poggio-Miller-Chan-Harrington-Wu-Tsai (PMCHWT) surface integral equation (SIE). Equivalent (unknown) electric and magnetic current densities, which are introduced on the surfaces of the nanostructures, are expanded using Rao-Wilton-Glisson and polynomial basis functions in space and time, respectively. Inserting this expansion into the PMCHWT-SIE and Galerkin testing the resulting equation at discrete times yield a system of equations that is solved for the current expansion coefficients by a marching-on-in-time (MOT) scheme. The resulting MOT-PMCHWT-SIE solver calls for computation of additional convolutions between the temporal basis function and the plasmonic medium's permittivity and Green function. This computation is carried out with almost no additional cost and without changing the computational complexity of the solver. Time-domain samples of the permittivity and the Green function required by these convolutions are obtained from their frequency-domain samples using a fast relaxed vector fitting algorithm. Numerical results demonstrate the accuracy and applicability of the proposed MOT-PMCHWT solver. PMID:27607496
Evaluation of parallel direct sparse linear solvers in electromagnetic geophysical problems
NASA Astrophysics Data System (ADS)
Puzyrev, Vladimir; Koric, Seid; Wilkin, Scott
2016-04-01
High performance computing is absolutely necessary for large-scale geophysical simulations. In order to obtain a realistic image of a geologically complex area, industrial surveys collect vast amounts of data making the computational cost extremely high for the subsequent simulations. A major computational bottleneck of modeling and inversion algorithms is solving the large sparse systems of linear ill-conditioned equations in complex domains with multiple right hand sides. Recently, parallel direct solvers have been successfully applied to multi-source seismic and electromagnetic problems. These methods are robust and exhibit good performance, but often require large amounts of memory and have limited scalability. In this paper, we evaluate modern direct solvers on large-scale modeling examples that previously were considered unachievable with these methods. Performance and scalability tests utilizing up to 65,536 cores on the Blue Waters supercomputer clearly illustrate the robustness, efficiency and competitiveness of direct solvers compared to iterative techniques. Wide use of direct methods utilizing modern parallel architectures will allow modeling tools to accurately support multi-source surveys and 3D data acquisition geometries, thus promoting a more efficient use of the electromagnetic methods in geophysics.
Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems
Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; Thornquist, Heidi
2012-01-01
Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples the algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.
Two-dimensional flux-corrected transport solver for convectively dominated flows
Baer, M.R.; Gross, R.J.
1986-01-01
A numerical technique designed to solve a wide class of convectively dominated flow problems is presented. An attractive feature of the technique is its ability to resolve the behavior of field quantities possessing large gradients and/or shocks. The method is a finite-difference technique known as flux-corrected transport (FCT) that maintains four important numerical considerations: stability, accuracy, monotonicity, and conservation. The theory and methodology of two-dimensional FCT are presented. The method is applied in demonstrative example calculations of a 2-D Riemann problem with known exact solutions and to the Euler equations in a study of classical Rayleigh-Taylor and Kelvin-Helmholtz instability problems. The FCT solver has been vectorized for execution on the Cray 1S; a typical call to the routine with a 50 by 50 mesh requires about 0.00428 cpu seconds of execution time. Additionally, we have maintained a modular structure for the solver that eases its implementation. Fortran listings of two versions of the 2-D FCT solvers are appended with a driver main program illustrating the call sequence for the modules. 59 refs., 49 figs.
AQUAgpusph, a new free 3D SPH solver accelerated with OpenCL
NASA Astrophysics Data System (ADS)
Cercos-Pita, J. L.
2015-07-01
In this paper, AQUAgpusph, a new free Smoothed Particle Hydrodynamics (SPH) software accelerated with OpenCL, is described. The main differences and progress with respect to other existing alternatives are considered. These are the use of the Open Computing Language (OpenCL) framework instead of the Compute Unified Device Architecture (CUDA), the implementation of the most popular boundary conditions, the easy customization of the code to different problems, the extensibility with regard to Python scripts, and the runtime output, which allows the tracking of simulations in real time, or a higher frequency in saving some results without a significant performance loss. These modifications are shown to improve the solver speed and the quality of the results, and to allow for a wider area of application. AQUAgpusph has been designed to provide researchers and engineers with a valuable tool to test and apply the SPH method. Three practical applications are discussed in detail. The evolution of a dam break is used to quantify and compare the computational performance and modeling accuracy with the most popular SPH Graphics Processing Unit (GPU) accelerated alternatives. The dynamics of a coupled system, a Tuned Liquid Damper (TLD), is discussed in order to show the integration capabilities of the solver with external dynamics. Finally, the sloshing flow inside a nuclear reactor is simulated in order to show the capabilities of the solver to treat 3-D problems with complex geometries and of industrial interest.
Multidimensional stochastic approximation Monte Carlo.
Zablotskiy, Sergey V; Ivanov, Victor A; Paul, Wolfgang
2016-06-01
Stochastic Approximation Monte Carlo (SAMC) has been established as a mathematically founded powerful flat-histogram Monte Carlo method, used to determine the density of states, g(E), of a model system. We show here how it can be generalized for the determination of multidimensional probability distributions (or equivalently densities of states) of macroscopic or mesoscopic variables defined on the space of microstates of a statistical mechanical system. This establishes this method as a systematic way for coarse graining a model system, or, in other words, for performing a renormalization group step on a model. We discuss the formulation of the Kadanoff block spin transformation and the coarse-graining procedure for polymer models in this language. We also apply it to a standard case in the literature of two-dimensional densities of states, g(E_1, E_2), where two competing energetic effects are present. We show when and why care has to be exercised when obtaining the microcanonical density of states g(E_1 + E_2) from g(E_1, E_2). PMID:27415383
Interplay of approximate planning strategies
Huys, Quentin J. M.; Lally, Níall; Faulkner, Paul; Eshel, Neir; Seifritz, Erich; Gershman, Samuel J.; Dayan, Peter; Roiser, Jonathan P.
2015-01-01
Humans routinely formulate plans in domains so complex that even the most powerful computers are taxed. To do so, they seem to avail themselves of many strategies and heuristics that efficiently simplify, approximate, and hierarchically decompose hard tasks into simpler subtasks. Theoretical and cognitive research has revealed several such strategies; however, little is known about their establishment, interaction, and efficiency. Here, we use model-based behavioral analysis to provide a detailed examination of the performance of human subjects in a moderately deep planning task. We find that subjects exploit the structure of the domain to establish subgoals in a way that achieves a nearly maximal reduction in the cost of computing values of choices, but then combine partial searches with greedy local steps to solve subtasks, and maladaptively prune the decision trees of subtasks in a reflexive manner upon encountering salient losses. Subjects come idiosyncratically to favor particular sequences of actions to achieve subgoals, creating novel complex actions or “options.” PMID:25675480
Multidimensional stochastic approximation Monte Carlo
NASA Astrophysics Data System (ADS)
Zablotskiy, Sergey V.; Ivanov, Victor A.; Paul, Wolfgang
2016-06-01
Stochastic Approximation Monte Carlo (SAMC) has been established as a mathematically founded powerful flat-histogram Monte Carlo method, used to determine the density of states, g(E), of a model system. We show here how it can be generalized for the determination of multidimensional probability distributions (or equivalently densities of states) of macroscopic or mesoscopic variables defined on the space of microstates of a statistical mechanical system. This establishes this method as a systematic way for coarse graining a model system, or, in other words, for performing a renormalization group step on a model. We discuss the formulation of the Kadanoff block spin transformation and the coarse-graining procedure for polymer models in this language. We also apply it to a standard case in the literature of two-dimensional densities of states, g(E_1, E_2), where two competing energetic effects are present. We show when and why care has to be exercised when obtaining the microcanonical density of states g(E_1 + E_2) from g(E_1, E_2).
Semiclassics beyond the diagonal approximation
NASA Astrophysics Data System (ADS)
Turek, Marko
2004-05-01
The statistical properties of the energy spectrum of classically chaotic closed quantum systems are the central subject of this thesis. It has been conjectured by O. Bohigas, M.-J. Giannoni and C. Schmit that the spectral statistics of chaotic systems is universal and can be described by random-matrix theory. This conjecture has been confirmed in many experiments and numerical studies but a formal proof is still lacking. In this thesis we present a semiclassical evaluation of the spectral form factor which goes beyond M. V. Berry's diagonal approximation. To this end we extend a method developed by M. Sieber and K. Richter for a specific system: the motion of a particle on a two-dimensional surface of constant negative curvature. In particular we prove that these semiclassical methods reproduce the random-matrix theory predictions for the next-to-leading-order correction also for a much wider class of systems, namely non-uniformly hyperbolic systems with f>2 degrees of freedom. We achieve this result by extending the configuration-space approach of M. Sieber and K. Richter to a canonically invariant phase-space approach.
Randomized approximate nearest neighbors algorithm.
Jones, Peter Wilcox; Osipov, Andrei; Rokhlin, Vladimir
2011-09-20
We present a randomized algorithm for the approximate nearest neighbor problem in d-dimensional Euclidean space. Given N points {x_j} in R^d, the algorithm attempts to find k nearest neighbors for each of x_j, where k is a user-specified integer parameter. The algorithm is iterative, and its running time requirements are proportional to T·N·(d·(log d) + k·(d + log k)·(log N)) + N·k^2·(d + log k), with T the number of iterations performed. The memory requirements of the procedure are of the order N·(d + k). A by-product of the scheme is a data structure, permitting a rapid search for the k nearest neighbors among {x_j} for an arbitrary point x ∈ R^d. The cost of each such query is proportional to T·(d·(log d) + log(N/k)·k·(d + log k)), and the memory requirements for the requisite data structure are of the order N·(d + k) + T·(d + N). The algorithm utilizes random rotations and a basic divide-and-conquer scheme, followed by a local graph search. We analyze the scheme's behavior for certain types of distributions of {x_j} and illustrate its performance via several numerical examples.
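The cost expressions quoted in this abstract can be turned into a small calculator for comparing parameter choices. This is only a sketch of the stated proportionalities, not the algorithm itself: the function names are hypothetical, the constants of proportionality are unknown and set to 1, the logarithm base is assumed to be natural, and the "k squared" term of the running time is read as k**2.

```python
import math


def build_cost(n, d, k, t):
    """Running-time estimate, up to an unknown constant factor."""
    iterative = t * n * (d * math.log(d) + k * (d + math.log(k)) * math.log(n))
    graph_search = n * k**2 * (d + math.log(k))
    return iterative + graph_search


def query_cost(n, d, k, t):
    """Cost estimate of one query for an arbitrary point."""
    return t * (d * math.log(d) + math.log(n / k) * k * (d + math.log(k)))


def memory_points(n, d, k):
    """Memory order of the main procedure: N * (d + k)."""
    return n * (d + k)


def structure_memory(n, d, k, t):
    """Memory order of the query data structure."""
    return n * (d + k) + t * (d + n)


# Example: 1000 points in 10 dimensions, 5 neighbors, 3 iterations.
print(memory_points(1000, 10, 5))
print(structure_memory(1000, 10, 5, 3))
```

Because the query cost is linear in T, doubling the number of iterations doubles the estimated cost of each query, whereas the second term of the build cost does not depend on T.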
Bordner, J.; Saied, F.
1996-12-31
GLab3D is an enhancement of an interactive environment (MGLab) for experimenting with iterative solvers and multigrid algorithms. It is implemented in MATLAB. The new version has built-in 3D elliptic PDEs and several iterative methods and preconditioners that were not available in the original version. A sparse direct solver option has also been included. The multigrid solvers have also been extended to 3D. The discretization and PDE domains are restricted to standard finite differences on the unit square/cube. The power of this software lies in the fact that no programming is needed to solve, for example, the convection-diffusion equation in 3D with TFQMR and a customized V-cycle preconditioner, for a variety of problem sizes and mesh Reynolds numbers. In addition to the graphical user interface, some sample drivers are included to show how experiments can be composed using the underlying suite of problems and solvers.
A fast Poisson solver for unsteady incompressible Navier-Stokes equations on the half-staggered grid
NASA Technical Reports Server (NTRS)
Golub, G. H.; Huang, L. C.; Simon, H.; Tang, W. -P.
1995-01-01
In this paper, a fast Poisson solver for unsteady, incompressible Navier-Stokes equations with finite difference methods on the non-uniform, half-staggered grid is presented. To achieve this, new algorithms for diagonalizing a semi-definite pair are developed. Our fast solver can also be extended to the three dimensional case. The motivation and related issues in using this second kind of staggered grid are also discussed. Numerical testing has indicated the effectiveness of this algorithm.
Tezaur, Irina K.; Tuminaro, Raymond S.; Perego, Mauro; Salinger, Andrew G.; Price, Stephen F.
2015-01-01
We examine the scalability of the recently developed Albany/FELIX finite-element based code for the first-order Stokes momentum balance equations for ice flow. We focus our analysis on the performance of two possible preconditioners for the iterative solution of the sparse linear systems that arise from the discretization of the governing equations: (1) a preconditioner based on the incomplete LU (ILU) factorization, and (2) a recently-developed algebraic multigrid (AMG) preconditioner, constructed using the idea of semi-coarsening. A strong scalability study on a realistic, high resolution Greenland ice sheet problem reveals that, for a given number of processor cores, the AMG preconditioner results in faster linear solve times but the ILU preconditioner exhibits better scalability. A weak scalability study is performed on a realistic, moderate resolution Antarctic ice sheet problem, a substantial fraction of which contains floating ice shelves, making it fundamentally different from the Greenland ice sheet problem. Here, we show that as the problem size increases, the performance of the ILU preconditioner deteriorates whereas the AMG preconditioner maintains scalability. This is because the linear systems are extremely ill-conditioned in the presence of floating ice shelves, and the ill-conditioning has a greater negative effect on the ILU preconditioner than on the AMG preconditioner.
Stevens, D.E.; Bretherton, S.
1996-12-01
This paper presents a new forward-in-time advection method for nearly incompressible flow, MU, and its application to an adaptive multilevel flow solver for atmospheric flows. MU is a modification of Leonard et al.'s UTOPIA scheme. MU, like UTOPIA, is based on third-order accurate semi-Lagrangian multidimensional upwinding for constant velocity flows. For varying velocity fields, MU is a second-order conservative method. MU has greater stability and accuracy than UTOPIA and naturally decomposes into a monotone low-order method and a higher-order accurate correction for use with flux limiting. Its stability and accuracy make it a computationally efficient alternative to current finite-difference advection methods. We present a fully second-order accurate flow solver for the anelastic equations, a prototypical low Mach number flow. The flow solver is based on MU, which is used for both momentum and scalar transport equations. This flow solver can also be implemented with any forward-in-time advection scheme. The multilevel flow solver conserves discrete global integrals of advected quantities and includes adaptive mesh refinements. Its second-order accuracy is verified using a nonlinear energy conservation integral for the anelastic equations. For a typical geophysical problem in which the flow is most rapidly varying in a small part of the domain, the multilevel flow solver achieves global accuracy comparable to uniform-resolution simulation for 10% of the computational cost. 36 refs., 10 figs.
Evaluating the performance of the two-phase flow solver interFoam
NASA Astrophysics Data System (ADS)
Deshpande, Suraj S.; Anumolu, Lakshman; Trujillo, Mario F.
2012-01-01
The performance of the open source multiphase flow solver, interFoam, is evaluated in this work. The solver is based on a modified volume of fluid (VoF) approach, which incorporates an interfacial compression flux term to mitigate the effects of numerical smearing of the interface. It forms a part of the C++ libraries and utilities of OpenFOAM and is gaining popularity in the multiphase flow research community. However, to the best of our knowledge, the evaluation of this solver is confined to the validation tests of specific interest to the users of the code and the extent of its applicability to a wide range of multiphase flow situations remains to be explored. In this work, we have performed a thorough investigation of the solver performance using a variety of verification and validation test cases, which include (i) verification tests for pure advection (kinematics), (ii) dynamics in the high Weber number limit and (iii) dynamics of surface tension-dominated flows. With respect to (i), the kinematics tests show that the performance of interFoam is generally comparable with the recent algebraic VoF algorithms; however, it is noticeably worse than the geometric reconstruction schemes. For (ii), the simulations of inertia-dominated flows with large density ratios $\sim\mathcal{O}(10^3)$ yielded excellent agreement with analytical and experimental results. In regime (iii), where surface tension is important, consistency of pressure-surface tension formulation and accuracy of curvature are important, as established by Francois et al (2006 J. Comput. Phys. 213 141-73). Several verification tests were performed along these lines and the main findings are: (a) the algorithm of interFoam ensures a consistent formulation of pressure and surface tension; (b) the curvatures computed by the solver converge to a value slightly (10%) different from the analytical value and a scope for improvement exists in this respect. To reduce the disruptive effects of spurious
The development of a robust, efficient solver for spectral and spectral-element time discretizations
NASA Astrophysics Data System (ADS)
Mundis, Nathan L.
This work examines alternative time discretizations for the Euler equations and methods for the robust and efficient solution of these discretizations. Specifically, the time-spectral method (TS), quasi-periodic time-spectral method (BDFTS), and spectral-element method in time (SEMT) are derived and examined in detail. For the two time-spectral based methods, focus is given to expanding these methods for more complicated problems than have been typically solved by other authors, including problems with spectral content in a large number of harmonics, gust response problems, and aeroelastic problems. To solve these more complicated problems, it was necessary to implement the flexible variant of the Generalized Minimal Residual method (FGMRES), utilizing the full second-order accurate spatial Jacobian, complete temporal coupling of the chosen time discretization, and fully-implicit coupling of the aeroelastic equations in the cases where they are needed. The FGMRES solver developed utilizes a block-colored Gauss-Seidel (BCGS) preconditioner augmented by a defect-correction process to increase its effectiveness. Exploration of more efficient preconditioners for the FGMRES solver is an anticipated topic for future work in this field. It was a logical extension to apply this already developed FGMRES solver to the spectral-element method in time, which has some advantages over the spectral methods already discussed. Unlike purely-spectral methods, SEMT allows for both h- and p-refinement. This property could allow for element clustering around areas of sharp gradients and discontinuities, which in turn could make SEMT more efficient than TS for periodic problems that contain these sharp gradients and would require many time instances to produce a precise solution using the TS method. As such, a preliminary investigation of the SEMT method applied to the Euler equations is conducted and some areas for needed improvement in future work are identified. In this work, it is
NASA Astrophysics Data System (ADS)
Balsara, Dinshaw S.; Amano, Takanobu; Garain, Sudip; Kim, Jinho
2016-08-01
In various astrophysics settings it is common to have a two-fluid relativistic plasma that interacts with the electromagnetic field. While it is common to ignore the displacement current in the ideal, classical magnetohydrodynamic limit, when the flows become relativistic this approximation is less than absolutely well-justified. In such a situation, it is more natural to consider a positively charged fluid made up of positrons or protons interacting with a negatively charged fluid made up of electrons. The two fluids interact collectively with the full set of Maxwell's equations. As a result, a solution strategy for that coupled system of equations is sought and found here. Our strategy extends to higher orders, providing increasing accuracy. The primary variables in the Maxwell solver are taken to be the facially-collocated components of the electric and magnetic fields. Consistent with such a collocation, three important innovations are reported here. The first two pertain to the Maxwell solver. In our first innovation, the magnetic field within each zone is reconstructed in a divergence-free fashion while the electric field within each zone is reconstructed in a form that is consistent with Gauss' law. In our second innovation, a multidimensionally upwinded strategy is presented which ensures that the magnetic field can be updated via a discrete interpretation of Faraday's law and the electric field can be updated via a discrete interpretation of the generalized Ampere's law. This multidimensional upwinding is achieved via a multidimensional Riemann solver. The multidimensional Riemann solver automatically provides edge-centered electric field components for the Stokes law-based update of the magnetic field. It also provides edge-centered magnetic field components for the Stokes law-based update of the electric field. The update strategy ensures that the electric field is always consistent with Gauss' law and the magnetic field is always divergence-free. This
Beyond the Tamm-Dancoff approximation for extended systems using exact diagonalization
NASA Astrophysics Data System (ADS)
Sander, Tobias; Maggio, Emanuele; Kresse, Georg
2015-07-01
Linear optical properties can be accurately calculated using the Bethe-Salpeter equation. After introducing a suitable product basis for the electron-hole pairs, the Bethe-Salpeter equation is usually recast into a complex non-Hermitian eigenvalue problem that is difficult to solve using standard eigenvalue solvers. In solid-state physics, it is therefore common practice to neglect the problematic coupling between the positive- and negative-frequency branches, reducing the problem to a Hermitian eigenvalue problem [Tamm-Dancoff approximation (TDA)]. We use time-inversion symmetry to recast the full problem into a quadratic Hermitian eigenvalue problem, which can be solved routinely using standard eigenvalue solvers even at a finite wave vector q. This allows us to assess the importance of the coupling between the positive- and negative-frequency branches for prototypical solids. As a starting point for the Bethe-Salpeter calculations, we use self-consistent Green's-function methods (GW), making the present scheme entirely ab initio. We calculate the optical spectra of carbon (C), silicon (Si), lithium fluoride (LiF), and the cyclic dimer Li2F2 and discuss why the differences between the TDA and the full solution are tiny. However, at finite momentum transfer q, significant differences between the TDA and our exact treatment are found. The origin of these differences is explained.
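For orientation, the non-Hermitian problem and its Tamm-Dancoff reduction described in this abstract are commonly written in the following block form. This is the textbook notation, not notation taken from the paper: the symbols A (resonant block), B (coupling block), X, Y (eigenvector components) and ω (excitation energy) are assumptions introduced here for illustration.

```latex
% Full Bethe-Salpeter eigenvalue problem in the electron-hole product basis
\begin{pmatrix} A & B \\ -B^{*} & -A^{*} \end{pmatrix}
\begin{pmatrix} X \\ Y \end{pmatrix}
= \omega
\begin{pmatrix} X \\ Y \end{pmatrix},
\qquad A = A^{\dagger}, \quad B = B^{T}.

% Tamm-Dancoff approximation: neglect the coupling block B,
% which leaves a Hermitian eigenvalue problem for the resonant block alone
B \to 0 \quad \Longrightarrow \quad A X = \omega X .
```

The coupling between the positive- and negative-frequency branches is carried by the off-diagonal block B, so setting it to zero decouples the two branches.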
NASA Astrophysics Data System (ADS)
Shi, F.; Kirby, J. T.; Tehranirad, B.
2010-12-01
Recent progress in the development of Boussinesq-type wave models using TVD-MUSCL schemes has shown robust performance of the shock-capturing method in simulating breaking waves and coastal inundation (Tonelli and Petti, 2009, Roeber et al., 2010, Shiach and Mingham, 2009, Erduran et al., 2005, and others). Shock-capturing schemes make the treatment of wave breaking straightforward without an artificial viscosity adopted in some breaking wave models such as in Kennedy et al. (2000). The schemes are also able to capture the sharp wave front occurring in the swash zone. A high-order temporal scheme usually requires uniform time-stepping, decreasing model efficiency in applications to breaking waves and inundation where super-critical fluid conditions limit the time step associated with the CFL criterion. In this presentation, we describe the use of a higher order, adaptive time-stepping algorithm using the Runge-Kutta method in a fully nonlinear Boussinesq wave model. Higher-order numerical schemes in both space and time were applied in order to avoid contamination of the physical dispersive terms in Boussinesq equations resulting from truncation errors in the lower-order (second-order) approximation. The spatial derivatives are discretized using a combination of finite-volume and finite-difference methods. A fourth-order MUSCL reconstruction technique is used in the Riemann solver. The model code is parallelized for the MPI computational environment. We illustrate the model's application to the problems of wave runup and coastal inundation in the context of a standard suite of benchmark tests.
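A CFL-limited adaptive time step of the kind this abstract refers to can be sketched as follows. The abstract does not give the model's actual formula, so this uses the standard one-dimensional shallow-water estimate dt = CFL · dx / max(|u| + sqrt(g·h)) as an assumption; the function name `adaptive_dt`, the default CFL number and the dry-cell threshold `h_min` are all hypothetical.

```python
import math


def adaptive_dt(h, u, dx, cfl=0.5, g=9.81, h_min=1e-6):
    """Largest stable time step under an assumed shallow-water CFL limit.

    h  : water depths per cell (m)
    u  : depth-averaged velocities per cell (m/s)
    dx : uniform grid spacing (m)
    """
    max_speed = 0.0
    for depth, velocity in zip(h, u):
        if depth <= h_min:
            # Treat the cell as dry: it carries no wave signal.
            continue
        # Fastest characteristic speed in this cell.
        speed = abs(velocity) + math.sqrt(g * depth)
        max_speed = max(max_speed, speed)
    if max_speed == 0.0:
        raise ValueError("no wet cells: time step is unconstrained")
    return cfl * dx / max_speed


# Still water 1 m deep versus a deeper, faster-moving region.
dt_calm = adaptive_dt([1.0], [0.0], dx=1.0)
dt_fast = adaptive_dt([1.0, 4.0], [0.0, 2.0], dx=1.0)
```

Recomputing the step from the current state at every iteration lets it grow again once a fast, super-critical region has passed, which is the efficiency argument made in the abstract.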
NASA Technical Reports Server (NTRS)
Raju, M. S.
1998-01-01
The success of any solution methodology used in the study of gas-turbine combustor flows depends a great deal on how well it can model the various complex and rate controlling processes associated with the spray's turbulent transport, mixing, chemical kinetics, evaporation, and spreading rates, as well as convective and radiative heat transfer and other phenomena. The phenomena to be modeled, which are controlled by these processes, often strongly interact with each other at different times and locations. In particular, turbulence plays an important role in determining the rates of mass and heat transfer, chemical reactions, and evaporation in many practical combustion devices. The influence of turbulence in a diffusion flame manifests itself in several forms, ranging from the so-called wrinkled, or stretched, flamelets regime to the distributed combustion regime, depending upon how turbulence interacts with various flame scales. Conventional turbulence models have difficulty treating highly nonlinear reaction rates. A solution procedure based on the composition joint probability density function (PDF) approach holds the promise of modeling various important combustion phenomena relevant to practical combustion devices (such as extinction, blowoff limits, and emissions predictions) because it can account for nonlinear chemical reaction rates without making approximations. In an attempt to advance the state-of-the-art in multidimensional numerical methods, we at the NASA Lewis Research Center extended our previous work on the PDF method to unstructured grids, parallel computing, and sprays. EUPDF, which was developed by M.S. Raju of Nyma, Inc., was designed to be massively parallel and could easily be coupled with any existing gas-phase and/or spray solvers. EUPDF can use an unstructured mesh with mixed triangular, quadrilateral, and/or tetrahedral elements. The application of the PDF method showed favorable results when applied to several supersonic
NASA Astrophysics Data System (ADS)
Sanan, P.; Schnepp, S. M.; May, D.; Schenk, O.
2014-12-01
Geophysical applications require efficient forward models for non-linear Stokes flow on high resolution spatio-temporal domains. The bottleneck in applying the forward model is solving the linearized, discretized Stokes problem which takes the form of a large, indefinite (saddle point) linear system. Due to the heterogeneity of the effective viscosity in the elliptic operator, devising effective preconditioners for saddle point problems has proven challenging and highly problem-dependent. Nevertheless, at least three approaches show promise for preconditioning these difficult systems in an algorithmically scalable way using multigrid and/or domain decomposition techniques. The first is to work with a hierarchy of coarser or smaller saddle point problems. The second is to use the Schur complement method to decouple and sequentially solve for the pressure and velocity. The third is to use the Schur decomposition to devise preconditioners for the full operator. These involve sub-solves resembling inexact versions of the sequential solve. The choice of approach and sub-methods depends crucially on the motivating physics, the discretization, and available computational resources. Here we examine the performance trade-offs for preconditioning strategies applied to idealized models of mantle convection and lithospheric dynamics, characterized by large viscosity gradients. Due to the arbitrary topological structure of the viscosity field in geodynamical simulations, we utilize low order, inf-sup stable mixed finite element spatial discretizations which are suitable when sharp viscosity variations occur in element interiors. Particular attention is paid to possibilities within the decoupled and approximate Schur complement factorization-based monolithic approaches to leverage recently-developed flexible, communication-avoiding, and communication-hiding Krylov subspace methods in combination with `heavy' smoothers, which require solutions of large per-node sub-problems, well
ERIC Educational Resources Information Center
Melrose, Pamela
2010-01-01
In June 2009 Pamela Melrose was a recipient of a "Premier's Energy Australia Scholarship". Her study tour took her to Scandinavia. This paper is an account of that tour. In her report Pamela argues for the use of authentic studies in science teaching. She cites examples from museums, nature schools and research establishments to highlight the…
ERIC Educational Resources Information Center
Starkman, Neal
2007-01-01
US students continue to lag behind the rest of the world in science, technology, engineering, and math--taken together, STEM. Even as the US falls further and further behind other countries in these four critical academic areas, not everyone sees it as a crisis. Fortunately, there are those who do. One organization out front on the issue is,…
NASA Astrophysics Data System (ADS)
Lu, Benzhuo; Cheng, Xiaolin; Huang, Jingfang; McCammon, J. Andrew
2010-06-01
A Fortran program package is introduced for rapid evaluation of the electrostatic potentials and forces in biomolecular systems modeled by the linearized Poisson-Boltzmann equation. The numerical solver utilizes a well-conditioned boundary integral equation (BIE) formulation, a node-patch discretization scheme, a Krylov subspace iterative solver package with reverse communication protocols, and an adaptive new version of fast multipole method in which the exponential expansions are used to diagonalize the multipole-to-local translations. The program and its full description, as well as several closely related libraries and utility tools are available at http://lsec.cc.ac.cn/~lubz/afmpb.html and a mirror site at http://mccammon.ucsd.edu/. This paper is a brief summary of the program: the algorithms, the implementation and the usage. Program summary. Program title: AFMPB: Adaptive fast multipole Poisson-Boltzmann solver. Catalogue identifier: AEGB_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGB_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GPL 2.0. No. of lines in distributed program, including test data, etc.: 453 649. No. of bytes in distributed program, including test data, etc.: 8 764 754. Distribution format: tar.gz. Programming language: Fortran. Computer: Any. Operating system: Any. RAM: Depends on the size of the discretized biomolecular system. Classification: 3. External routines: Pre- and post-processing tools are required for generating the boundary elements and for visualization. Users can use MSMS ( http://www.scripps.edu/~sanner/html/msms_home.html) for pre-processing, and VMD ( http://www.ks.uiuc.edu/Research/vmd/) for visualization. Sub-programs included: An iterative Krylov subspace solvers package from SPARSKIT by Yousef Saad ( http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html), and the fast multipole methods subroutines from FMMSuite ( http
NASA Astrophysics Data System (ADS)
Zhou, Yong; Ni, Sidao; Chu, Risheng; Yao, Huajian
2016-06-01
Numerical solvers of wave equations have been widely used to simulate global seismic waves including PP waves for modeling 410/660 km discontinuity and Rayleigh waves for imaging crustal structure. In order to avoid extra computation cost due to ocean water effects, these numerical solvers usually adopt water column approximation, whose accuracy depends on frequency and needs to be investigated quantitatively. In this paper, we describe a unified representation of accurate and approximate forms of the equivalent water column boundary condition as well as the free boundary condition. Then we derive an analytical form of the PP-wave reflection coefficient with the unified boundary condition, and quantify the effects of water column approximation on amplitude and phase shift of the PP waves. We also study the effects of water column approximation on phase velocity dispersion of the fundamental mode Rayleigh wave with a propagation matrix method. We find that with the water column approximation: (1) The error of PP amplitude and phase shift is less than 5% and 9 ° at periods greater than 25 s for most oceanic regions. But at periods of 15 s or less, PP is inaccurate up to 10% in amplitude and a few seconds in time shift for deep oceans. (2) The error in Rayleigh wave phase velocity is less than 1% at periods greater than 30 s in most oceanic regions, but the error is up to 2% for deep oceans at periods of 20 s or less. This study confirms that the water column approximation is only accurate at long periods and it needs to be improved at shorter periods.
NASA Astrophysics Data System (ADS)
Zhou, Yong; Ni, Sidao; Chu, Risheng; Yao, Huajian
2016-08-01
Numerical solvers of wave equations have been widely used to simulate global seismic waves including PP waves for modelling 410/660 km discontinuity and Rayleigh waves for imaging crustal structure. In order to avoid extra computation cost due to ocean water effects, these numerical solvers usually adopt water column approximation, whose accuracy depends on frequency and needs to be investigated quantitatively. In this paper, we describe a unified representation of accurate and approximate forms of the equivalent water column boundary condition as well as the free boundary condition. Then we derive an analytical form of the PP-wave reflection coefficient with the unified boundary condition, and quantify the effects of water column approximation on amplitude and phase shift of the PP waves. We also study the effects of water column approximation on phase velocity dispersion of the fundamental mode Rayleigh wave with a propagation matrix method. We find that with the water column approximation: (1) The error of PP amplitude and phase shift is less than 5 per cent and 9° at periods greater than 25 s for most oceanic regions. But at periods of 15 s or less, PP is inaccurate up to 10 per cent in amplitude and a few seconds in time shift for deep oceans. (2) The error in Rayleigh wave phase velocity is less than 1 per cent at periods greater than 30 s in most oceanic regions, but the error is up to 2 per cent for deep oceans at periods of 20 s or less. This study confirms that the water column approximation is only accurate at long periods and it needs to be improved at shorter periods.
NASA Technical Reports Server (NTRS)
Lee-Rausch, Elizabeth M.; Hammond, Dana P.; Nielsen, Eric J.; Pirzadeh, S. Z.; Rumsey, Christopher L.
2010-01-01
FUN3D Navier-Stokes solutions were computed for the 4th AIAA Drag Prediction Workshop grid convergence study, downwash study, and Reynolds number study on a set of node-based mixed-element grids. All of the baseline tetrahedral grids were generated with the VGRID (developmental) advancing-layer and advancing-front grid generation software package following the gridding guidelines developed for the workshop. With maximum grid sizes exceeding 100 million nodes, the grid convergence study was particularly challenging for the node-based unstructured grid generators and flow solvers. At the time of the workshop, the super-fine grid with 105 million nodes and 600 million elements was the largest grid known to have been generated using VGRID. FUN3D Version 11.0 has a completely new pre- and post-processing paradigm that has been incorporated directly into the solver and functions entirely in a parallel, distributed memory environment. This feature allowed for practical pre-processing and solution times on the largest unstructured-grid size requested for the workshop. For the constant-lift grid convergence case, the convergence of total drag is approximately second-order on the finest three grids. The variation in total drag between the finest two grids is only 2 counts. At the finest grid levels, only small variations in wing and tail pressure distributions are seen with grid refinement. Similarly, a small wing side-of-body separation also shows little variation at the finest grid levels. Overall, the FUN3D results compare well with the structured-grid code CFL3D. The FUN3D downwash study and Reynolds number study results compare well with the range of results shown in the workshop presentations.
Adaptation of the anelastic solver EULAG to high performance computing architectures.
NASA Astrophysics Data System (ADS)
Wójcik, Damian; Ciżnicki, Miłosz; Kopta, Piotr; Kulczewski, Michał; Kurowski, Krzysztof; Piotrowski, Zbigniew; Rojek, Krzysztof; Rosa, Bogdan; Szustak, Łukasz; Wyrzykowski, Roman
2014-05-01
In recent years there has been widespread interest in employing heterogeneous and hybrid supercomputing architectures for geophysical research. An especially promising application for modern supercomputing architectures is numerical weather prediction (NWP). Adapting traditional NWP codes to the new machines based on multi- and many-core processors, such as GPUs, makes it possible to increase computational efficiency and decrease energy consumption. This offers a unique opportunity to develop simulations with finer grid resolutions and computational domains larger than ever before. Further, it makes it possible to extend the range of scales represented in the model so that the accuracy of representation of the simulated atmospheric processes can be improved. Consequently, it can improve the quality of weather forecasts. A coalition of Polish scientific institutions launched a project aimed at adapting the EULAG fluid solver for future high-performance computing platforms. EULAG is currently being implemented as a new dynamical core of the COSMO Consortium weather prediction framework. The solver code combines features of stencil and point-wise computations. Its communication scheme consists of both halo exchange subroutines and global reduction functions. Within the project, two main modules of EULAG, namely MPDATA advection and the iterative GCR elliptic solver, are analyzed and optimized. Relevant techniques have been chosen and applied to accelerate code execution on modern HPC architectures: stencil decomposition, block decomposition (with weighting analysis between computation and communication), reduction of inter-cache communication by partitioning of cores into independent teams, cache reusing and vectorization. Experiments with matching computational domain topology to cluster topology are performed as well. The parallel formulation was extended from pure MPI to a hybrid MPI - OpenMP approach. Porting to GPU using CUDA directives is in progress. Preliminary results of performance of the
The a(4) Scheme-A High Order Neutrally Stable CESE Solver
NASA Technical Reports Server (NTRS)
Chang, Sin-Chung
2009-01-01
The CESE development is driven by a belief that a solver should (i) enforce conservation laws in both space and time, and (ii) be built from a nondissipative (i.e., neutrally stable) core scheme so that the numerical dissipation can be controlled effectively. To provide a solid foundation for a systematic CESE development of high order schemes, in this paper we describe a new high order (4-5th order) and neutrally stable CESE solver of a 1D advection equation with a constant advection speed a. The space-time stencil of this two-level explicit scheme is formed by one point at the upper time level and two points at the lower time level. Because it is associated with four independent mesh variables (the numerical analogues of the dependent variable and its first, second, and third-order spatial derivatives) and four equations per mesh point, the new scheme is referred to as the a(4) scheme. As in the case of other similar CESE neutrally stable solvers, the a(4) scheme enforces conservation laws in space-time locally and globally, and it has the basic, forward marching, and backward marching forms. Except for a singular case, these forms are equivalent and satisfy a space-time inversion (STI) invariant property which is shared by the advection equation. Based on the concept of STI invariance, a set of algebraic relations is developed and used to prove the a(4) scheme must be neutrally stable when it is stable. Numerically, it has been established that the scheme is stable if the value of the Courant number is less than 1/3
A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers
NASA Technical Reports Server (NTRS)
Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)
1997-01-01
The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from robust implementations that are portable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented in the widely used NASA multi-block CFD packages ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages is identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on
LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS
Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.
2014-01-01
We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to $\min_{x \in \mathbb{R}^n} \|Ax - b\|_2$, where $A \in \mathbb{R}^{m \times n}$ with $m \gg n$ or $m \ll n$, and where A may be rank-deficient. Tikhonov regularization may also be included. Since A is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when A is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size ⌈γ min(m, n)⌉ × min(m, n), where γ is moderately larger than 1, e.g., γ = 2. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK’s DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094
NASA Technical Reports Server (NTRS)
Padovan, J.; Lackney, J.
1986-01-01
The current paper develops a constrained hierarchical least square nonlinear equation solver. The procedure can handle the response behavior of systems which possess indefinite tangent stiffness characteristics. Due to the generality of the scheme, this can be achieved at various hierarchical application levels. For instance, in the case of finite element simulations, various combinations of degree of freedom, nodal, elemental, substructural, and global level iterations are possible. Overall, this enables a solution methodology which is highly stable and storage efficient. To demonstrate the capability of the constrained hierarchical least square methodology, benchmarking examples are presented which treat structures exhibiting highly nonlinear pre- and postbuckling behavior wherein several indefinite stiffness transitions occur.
Two-Dimensional Unsteady Euler-Equation Solver for Arbitrarily Shaped Flow Regions
NASA Technical Reports Server (NTRS)
Hindman, R. G.; Kutler, Paul; Anderson, Dale
1981-01-01
A new technique is described for solving supersonic fluid dynamics problems containing multiple regions of continuous flow, each bounded by a permeable or impermeable surface. Region boundaries are, in general, arbitrarily shaped and time dependent. Discretization of such a region for solution by conventional finite difference procedures is accomplished using an elliptic solver which alleviates the dependence on a particular base coordinate system. Multiple regions are coupled together through the boundary conditions. The technique has been applied to a variety of problems including a shock diffraction problem and supersonic flow over a pointed ogive.
A speciation solver for cement paste modeling and the semismooth Newton method
Georget, Fabien; Prévost, Jean H.; Vanderbei, Robert J.
2015-02-15
The mineral assemblage of a cement paste may vary considerably with its environment. In addition, the water content of a cement paste is relatively low and the ionic strength of the interstitial solution is often high. These conditions are extreme with respect to the common assumptions made in speciation problems. Furthermore, the common trial-and-error algorithm to find the phase assemblage does not provide any guarantee of convergence. We propose a speciation solver based on a semismooth Newton method adapted to the thermodynamic modeling of cement paste. The strong theoretical properties associated with these methods offer practical advantages. Results of numerical experiments indicate that the algorithm is reliable, robust, and efficient.
Proteus-MOC: A 3D deterministic solver incorporating 2D method of characteristics
Marin-Lafleche, A.; Smith, M. A.; Lee, C.
2013-07-01
A new transport solution methodology was developed by combining the two-dimensional method of characteristics with the discontinuous Galerkin method for the treatment of the axial variable. The method, which can be applied to arbitrary extruded geometries, was implemented in PROTEUS-MOC and includes parallelization in group, angle, plane, and space using a top level GMRES linear algebra solver. Verification tests were performed to show accuracy and stability of the method with the increased number of angular directions and mesh elements. Good scalability with parallelism in angle and axial planes is displayed. (authors)
Application of a Scalable, Parallel, Unstructured-Grid-Based Navier-Stokes Solver
NASA Technical Reports Server (NTRS)
Parikh, Paresh
2001-01-01
A parallel version of an unstructured-grid based Navier-Stokes solver, USM3Dns, previously developed for efficient operation on a variety of parallel computers, has been enhanced to incorporate upgrades made to the serial version. The resultant parallel code has been extensively tested on a variety of problems of aerospace interest and on two sets of parallel computers to understand and document its characteristics. An innovative grid renumbering construct and use of non-blocking communication are shown to produce superlinear computing performance. Preliminary results from parallelization of a recently introduced "porous surface" boundary condition are also presented.
Parallelization of Unsteady Adaptive Mesh Refinement for Unstructured Navier-Stokes Solvers
NASA Technical Reports Server (NTRS)
Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.
2014-01-01
This paper explores the implementation of the MPI parallelization in a Navier-Stokes solver using adaptive mesh refinement. Viscous and inviscid test problems are considered for the purpose of benchmarking, as are implicit and explicit time advancement methods. The main test problem for comparison includes effects from boundary layers and other viscous features and requires a large number of grid points for accurate computation. Experimental validation against double cone experiments in hypersonic flow is shown. The adaptive mesh refinement shows promise for a staple test problem in the hypersonic community. Extension to more advanced techniques for more complicated flows is described.
On the Performance of an Algebraic Multigrid Solver on Multicore Clusters
Baker, A; Schulz, M; Yang, U M
2009-11-24
Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG's performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread and process pinning and correct memory associations. We have implemented most of the techniques in a MultiCore SUPport library (MCSup), which helps to map OpenMP applications to multicore machines. We present results using both an MPI-only and a hybrid MPI/OpenMP model.
On the Performance of an Algebraic Multigrid Solver on Multicore Clusters
Baker, A H; Schulz, M; Yang, U M
2010-04-29
Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG's performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread and process pinning and correct memory associations. We have implemented most of the techniques in a MultiCore SUPport library (MCSup), which helps to map OpenMP applications to multicore machines. We present results using both an MPI-only and a hybrid MPI/OpenMP model.
Robust and automated solution for correcting hotspots locally using cost-function based OPC solver
NASA Astrophysics Data System (ADS)
Babcock, Carl; Yang, Dongok; McGowan, Sarah; Ye, Jun; Yan, Bo; Qiu, Jianhong; Baron, Stanislas; Pandey, Taksh; Kapasi, Sanjay; Aquino, Chris
2014-03-01
In previous work [1], we introduced a new technology called Flexible Mask Optimization (FMO) that was successfully used for localized OPC correction. OPC/RET techniques such as model-based assist feature and process-window-based OPC solvers have become essential for addressing critical patterning issues at 2× and lower technology nodes. With an FMO flow, critical patterns were identified, classified and corrected in localized areas only, using advanced techniques. One challenge with this flow is that once the hotspots are identified, a user still has to come up with OPC solutions to address the hotspots. This process can be cumbersome and time consuming as different types of hotspots with new designs may require different recipes, causing delays to tapeout. What is required is a robust, powerful and automated OPC technique that can handle various types of hotspots, so an automatic hotspot correction flow can be established. In this work, we introduce a new cost-function-based OPC technique called Co-optimization OPC that can be used to correct various types of hotspots with minimum tuning effort. In this approach, the OPC solver simultaneously solves for all the segments in a patch including main and sub-resolution assist features (SRAF), applying additional user-defined cost function constraints such as MEEF, PV band, MRC and SRAF printability. Unlike conventional OPC solvers, Co-optimization solvers can also move and grow SRAFs, which further improves the process window. The key benefit of the Co-optimization OPC solution is that it can be used in a standard recipe to resolve many different hotspots encountered across various designs for a given layer. In this study, we demonstrate that Co-optimization OPC can be successfully used to address various types of hotspots across designs for selected 2× nm node line/space layers, as an example. These layers have been particularly challenging as they use single-exposure lithography with k1 around 0.3. Aggressive RET
Performance of the block-Krylov energy group solvers in Jaguar
Watson, A. M.; Kennedy, R. A.
2012-07-01
A new method of coupling the inner and outer iterations for deterministic transport problems is proposed. This method is termed the Multigroup Energy Blocking Method (MEBM) and has been implemented in the deterministic transport solver Jaguar, which is currently under development at KAPL. The method is derived for both fixed-source and eigenvalue problems. The method is then applied to a PWR pin cell model, both in fixed-source mode and eigenvalue mode. The results show that the MEBM improves the convergence of both types of problems when applied to the thermal (up-scattering) groups. (authors)
iAPBS: a programming interface to Adaptive Poisson-Boltzmann Solver (APBS).
Konecny, Robert; Baker, Nathan A; McCammon, J Andrew
2012-07-26
The Adaptive Poisson-Boltzmann Solver (APBS) is a state-of-the-art suite for performing Poisson-Boltzmann electrostatic calculations on biomolecules. The iAPBS package provides a modular programmatic interface to the APBS library of electrostatic calculation routines. The iAPBS interface library can be linked with a FORTRAN or C/C++ program thus making all of the APBS functionality available from within the application. Several application modules for popular molecular dynamics simulation packages - Amber, NAMD and CHARMM are distributed with iAPBS allowing users of these packages to perform implicit solvent electrostatic calculations with APBS. PMID:22905037
Gerris: a tree-based adaptive solver for the incompressible Euler equations in complex geometries
NASA Astrophysics Data System (ADS)
Popinet, Stéphane
2003-09-01
An adaptive mesh projection method for the time-dependent incompressible Euler equations is presented. The domain is spatially discretised using quad/octrees and a multilevel Poisson solver is used to obtain the pressure. Complex solid boundaries are represented using a volume-of-fluid approach. Second-order convergence in space and time is demonstrated on regular, statically and dynamically refined grids. The quad/octree discretisation proves to be very flexible and allows accurate and efficient tracking of flow features. The source code of the method implementation is freely available.
Development of a New and Fast Linear Solver for Multi-component Reactive Transport Simulation
NASA Astrophysics Data System (ADS)
Qiao, C.; Li, L.; Bao, C.; Hu, X.; Johns, R.; Xu, J.
2013-12-01
Reactive transport models (RTM) have been extensively used to understand the coupling between solute transport and (bio)geochemical reactions in complex earth systems. RTM typically involves a large number of primary and secondary species with a complex reaction network in large domains. The computational expense increases significantly with the number of grid blocks and the number of chemical species. Within both the operator splitting approach (OS) and the global implicit approach (GI) that are commonly used, the steps that involve the Newton-Raphson method are typically among the most time-consuming parts (up to 80% to 90% of CPU time). Under such circumstances, accelerating reactive transport simulation is essential. In this research, we present a physics-based linear system solution strategy for general reactive transport models with many species. We observed up to a fivefold speedup for the linear solver portion of the simulations in our test cases. Our new linear solver takes advantage of the sparsity of the Jacobian matrix arising from the reaction network. The Jacobian matrix for the speciation problem is typically treated as a dense matrix and solved with a direct method such as Gaussian elimination. For the reactive transport problem, the graph of the local Jacobian matrix has a one-to-one correspondence to the reaction network graph. The Jacobian matrix is commonly sparse and has the same sparsity structure for the same reaction network. We developed a strategy that performs a minimum degree reordering and symbolic factorization to determine the non-zero pattern at the beginning of the OS and GI simulation. During the speciation calculation in OS, we calculate the L and U factors and solve the triangular systems according to the non-zero pattern. For GI, our strategy can be applied to invert the diagonal blocks in the block-Jacobi preconditioner and smoothers of the multigrid preconditioners in iterative solvers. Our strategy is naturally
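The idea of fixing the non-zero pattern once and reusing it for every numeric factorization can be illustrated with a small sketch. This is not the authors' solver: the function names `symbolic_fill` and `factor_and_solve` are hypothetical, the minimum degree reordering step is omitted, and the elimination assumes no pivoting is needed (for example, a diagonally dominant matrix).

```python
def symbolic_fill(n, pattern):
    """Symbolic elimination without pivoting: per-row sorted columns of L+U."""
    rows = [{j for (i, j) in pattern if i == r} | {r} for r in range(n)]
    for k in range(n):
        upper = {j for j in rows[k] if j > k}
        for i in range(k + 1, n):
            if k in rows[i]:
                rows[i] |= upper  # fill-in from eliminating column k in row i
    return [sorted(r) for r in rows]


def factor_and_solve(n, filled, entries, b):
    """Numeric LU restricted to the precomputed pattern, then two triangular solves."""
    a = [{j: entries.get((i, j), 0.0) for j in filled[i]} for i in range(n)]
    for k in range(n):
        for i in range(k + 1, n):
            if k in a[i]:
                a[i][k] /= a[k][k]  # multiplier stored in the L part
                for j in filled[k]:
                    if j > k:
                        a[i][j] -= a[i][k] * a[k][j]
    x = list(b)
    for i in range(n):  # forward substitution, L has a unit diagonal
        for j in filled[i]:
            if j < i:
                x[i] -= a[i][j] * x[j]
    for i in reversed(range(n)):  # backward substitution with U
        for j in filled[i]:
            if j > i:
                x[i] -= a[i][j] * x[j]
        x[i] /= a[i][i]
    return x
```

In this sketch `symbolic_fill` would be called once per reaction network, while `factor_and_solve` would be called at every Newton step with new numerical values on the same pattern.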
Benchmarks of 3D Laplace Equation Solvers in a Cubic Configuration for Streamer Simulation
NASA Astrophysics Data System (ADS)
Plewa, Joseph-Marie; Ducasse, Olivier; Dessante, Philippe; Jacobs, Carolyn; Eichwald, Olivier; Renon, Nicolas; Yousfi, Mohammed
2016-05-01
The aim of this paper is to test a developed red-black successive over-relaxation (SOR R&B) method using the Chebyshev accelerator algorithm to solve the Laplace equation in a cubic 3D configuration. Comparisons are made in terms of precision and computing time with other elliptic equation solvers proposed in the open source LIS library. The first results, obtained by using a single core on an HPC system, show that the developed SOR R&B method is efficient when the spectral radius needed for the Chebyshev acceleration is carefully pre-estimated. Preliminary results obtained with a parallelized code using the MPI library are also discussed when the calculation is distributed over one hundred cores.
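As a rough illustration of the technique named in this abstract, the sketch below applies red-black SOR with Chebyshev acceleration to the Laplace equation on a unit cube with Dirichlet boundaries. It is a minimal single-core sketch, not the authors' code: the function name and arguments are hypothetical, and the Jacobi spectral radius is taken as cos(π/n), which holds for this uniform model problem only.

```python
import math


def sor_red_black_chebyshev(n, boundary, tol=1e-10, max_sweeps=500):
    """Solve Laplace's equation on the unit cube with n intervals per side."""
    h = 1.0 / n
    edge = (0, n)
    # u[i][j][k] is the value at (i*h, j*h, k*h); boundary nodes stay fixed.
    u = [[[boundary(i * h, j * h, k * h)
           if (i in edge or j in edge or k in edge) else 0.0
           for k in range(n + 1)] for j in range(n + 1)] for i in range(n + 1)]
    rho = math.cos(math.pi / n)  # Jacobi spectral radius (assumed model problem)
    omega = 1.0
    first_half_sweep = True
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for colour in (0, 1):  # 0 = "red" nodes, 1 = "black" nodes
            for i in range(1, n):
                for j in range(1, n):
                    for k in range(1, n):
                        if (i + j + k) % 2 != colour:
                            continue
                        avg = (u[i - 1][j][k] + u[i + 1][j][k]
                               + u[i][j - 1][k] + u[i][j + 1][k]
                               + u[i][j][k - 1] + u[i][j][k + 1]) / 6.0
                        change = omega * (avg - u[i][j][k])
                        u[i][j][k] += change
                        max_change = max(max_change, abs(change))
            # Chebyshev acceleration: update omega after every half sweep.
            if first_half_sweep:
                omega = 1.0 / (1.0 - 0.5 * rho * rho)
                first_half_sweep = False
            else:
                omega = 1.0 / (1.0 - 0.25 * rho * rho * omega)
        if max_change < tol:
            return u, sweep
    return u, max_sweeps
```

Because nodes of one colour depend only on nodes of the other colour, each half sweep could be distributed across processes, which is what makes the red-black ordering attractive for an MPI version.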
Saddlepoint distribution function approximations in biostatistical inference.
Kolassa, J E
2003-01-01
Applications of saddlepoint approximations to distribution functions are reviewed. Calculations are provided for marginal distributions and conditional distributions. These approximations are applied to problems of testing and generating confidence intervals, particularly in canonical exponential families.
Development of a grid-independent approximate Riemann solver. Ph.D. Thesis - Michigan Univ.
NASA Technical Reports Server (NTRS)
Rumsey, Christopher Lockwood
1991-01-01
A grid-independent approximate Riemann solver for use with the Euler and Navier-Stokes equations was introduced and explored. The two-dimensional Euler and Navier-Stokes equations are described in Cartesian and generalized coordinates, as well as the traveling wave form of the Euler equations. The spatial and temporal discretization are described for both explicit and implicit time-marching schemes. The grid-aligned flux function of Roe is outlined, while the 5-wave grid-independent flux function is derived. The stability and monotonicity analysis of the 5-wave model are presented. Two-dimensional results are provided and extended to three dimensions. The corresponding results are presented.
An approximation technique for jet impingement flow
Najafi, Mahmoud; Fincher, Donald; Rahni, Taeibi; Javadi, KH.; Massah, H.
2015-03-10
The analytical approximate solution of a non-linear jet impingement flow model will be demonstrated. We will show that this is an improvement over the series approximation obtained via the Adomian decomposition method, which is itself a powerful method for analysing non-linear differential equations. The results of these approximations will be compared to the Runge-Kutta approximation in order to demonstrate their validity.
NASA Astrophysics Data System (ADS)
Bhardwaj, Rajneesh; Mittal, Rajat
2011-11-01
The modeling of complex biological phenomena such as cardiac mechanics is challenging. It involves complex three-dimensional geometries, moving structure boundaries inside the fluid domain and large flow-induced deformations of the structure. We present a fluid-structure interaction (FSI) solver which couples a sharp-interface immersed boundary method for flow simulation with a powerful finite-element based structure dynamics solver. An implicit partitioned (or segregated) approach is implemented to ensure the stability of the solver. We validate the FSI solver against a published benchmark for a configuration which involves a thin elastic plate attached to a rigid cylinder. The frequency and amplitude of the oscillations of the plate are in good agreement with published results, and the non-linear dynamics of the plate and its coupling with the flow field are discussed. The FSI solver is used to understand left-ventricular hemodynamics and flow-induced dynamics of mitral leaflets during early diastolic filling, and results from this study are presented.
Veijola, Timo; Råback, Peter
2007-01-01
We present a straightforward method to solve gas damping problems for perforated structures in two dimensions (2D) utilising a Perforation Profile Reynolds (PPR) solver. The PPR equation is an extended Reynolds equation that includes additional terms modelling the leakage flow through the perforations, and variable diffusivity and compressibility profiles. The solution method consists of two phases: 1) determination of the specific admittance profile and relative diffusivity (and relative compressibility) profiles due to the perforation, and 2) solution of the PPR equation with a FEM solver in 2D. Rarefied gas corrections in the slip-flow region are also included. Analytic profiles for circular and square holes with slip conditions are presented in the paper. To verify the method, square perforated dampers with 16–64 holes were simulated with a three-dimensional (3D) Navier-Stokes solver, a homogenised extended Reynolds solver, and a 2D PPR solver. Cases for both translational (normal to the surfaces) and torsional motion were simulated. The presented method extends the region of accurate simulation of perforated structures to cases where the homogenisation method is inaccurate and the full 3D Navier-Stokes simulation is too time-consuming.
A unified approach to the Darwin approximation
Krause, Todd B.; Apte, A.; Morrison, P. J.
2007-10-15
There are two basic approaches to the Darwin approximation. The first involves solving the Maxwell equations in Coulomb gauge and then approximating the vector potential to remove retardation effects. The second approach approximates the Coulomb gauge equations themselves, then solves these exactly for the vector potential. There is no a priori reason that these should result in the same approximation. Here, the equivalence of these two approaches is investigated and a unified framework is provided in which to view the Darwin approximation. Darwin's original treatment is variational in nature, but subsequent applications of his ideas in the context of Vlasov's theory are not. We present here action principles for the Darwin approximation in the Vlasov context, and this serves as a consistency check on the use of the approximation in this setting.
TemperSAT: A new efficient fair-sampling random k-SAT solver
NASA Astrophysics Data System (ADS)
Fang, Chao; Zhu, Zheng; Katzgraber, Helmut G.
The set membership problem is of great importance to many applications and, in particular, database searches for target groups. Recently, an approach to speed up set membership searches based on the NP-hard constraint-satisfaction problem (random k-SAT) has been developed. However, the bottleneck of the approach lies in finding the solution to a large SAT formula efficiently and, in particular, a large number of independent solutions is needed to reduce the probability of false positives. Unfortunately, traditional random k-SAT solvers such as WalkSAT are biased when seeking solutions to the Boolean formulas. By porting parallel tempering Monte Carlo to the sampling of binary optimization problems, we introduce a new algorithm (TemperSAT) whose performance is comparable to current state-of-the-art SAT solvers for large k with the added benefit that theoretically it can find many independent solutions quickly. We illustrate our results by comparing to the currently fastest implementation of WalkSAT, WalkSATlm.
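To make the parallel tempering idea concrete, the sketch below runs Metropolis single-variable flips on several replicas of a k-SAT formula at different temperatures and periodically proposes swaps between neighbouring temperatures. It is an illustrative sketch only and not TemperSAT itself: the function names, the temperature ladder and the energy definition (number of unsatisfied clauses) are assumptions, and the fair-sampling property discussed in the abstract is not examined here.

```python
import math
import random


def unsatisfied(clauses, assignment):
    """Count clauses with no true literal (v > 0 means x_v, v < 0 means NOT x_|v|)."""
    return sum(
        1 for clause in clauses
        if not any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
    )


def parallel_tempering_sat(clauses, n_vars, temperatures, sweeps, seed=0):
    """Metropolis replicas at several temperatures with neighbour swaps."""
    rng = random.Random(seed)
    replicas = [[rng.random() < 0.5 for _ in range(n_vars)] for _ in temperatures]
    energies = [unsatisfied(clauses, r) for r in replicas]
    best_energy = min(energies)
    best = list(replicas[energies.index(best_energy)])
    for _ in range(sweeps):
        if best_energy == 0:
            break
        for r, temp in enumerate(temperatures):
            for _ in range(n_vars):
                v = rng.randrange(n_vars)
                replicas[r][v] = not replicas[r][v]  # propose a single flip
                delta = unsatisfied(clauses, replicas[r]) - energies[r]
                if delta <= 0 or rng.random() < math.exp(-delta / temp):
                    energies[r] += delta  # accept
                else:
                    replicas[r][v] = not replicas[r][v]  # reject, undo the flip
            if energies[r] < best_energy:
                best_energy, best = energies[r], list(replicas[r])
        for r in range(len(temperatures) - 1):
            # Swap acceptance: min(1, exp((beta_r - beta_{r+1}) * (E_r - E_{r+1}))).
            d_beta = 1.0 / temperatures[r] - 1.0 / temperatures[r + 1]
            d_energy = energies[r] - energies[r + 1]
            if d_beta * d_energy >= 0 or rng.random() < math.exp(d_beta * d_energy):
                replicas[r], replicas[r + 1] = replicas[r + 1], replicas[r]
                energies[r], energies[r + 1] = energies[r + 1], energies[r]
    return best, best_energy
```

A call such as `parallel_tempering_sat([(1, 2), (-1, 2), (1, -2)], 2, [0.3, 1.0, 2.0], 200)` returns the best assignment found together with its number of unsatisfied clauses.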
Aerodynamics Simulations for the D8 "Double Bubble" Aircraft Using the LAVA Unstructured Solver
NASA Astrophysics Data System (ADS)
Ballinger, Sean
2013-11-01
The D8 "double bubble" is a proposed design for quieter and more efficient domestic passenger aircraft of the Boeing 737 class. It features boundary layer-ingesting engines located under a non-load-bearing π-tail and a lightweight low-sweep wing for flight around Mach 0.7. The D8's wide lifting body is expected to supply 15% of its total lift, while a Boeing 737's fuselage contributes only 8%. The tapering rear of the fuselage is also predicted to experience a negative moment resulting in positive pitch, produce a thicker boundary layer for ingestion by distortion-tolerant engines, and act as a noise shield. To investigate these predictions, unstructured grids generated over a fine surface triangulation using Star-CCM+ are used to model the unpowered D8 with flow conditions mimicking those in the MIT Wright Brothers Wind Tunnel at angles of attack from -2 to 14 degrees. LAVA, the recently developed Launch Ascent and Vehicle Aerodynamics solver, is used to carry out simulations on an unstructured grid. The results are compared to wind tunnel data, and to data from structured grid simulations using the LAVA, Overflow, and Cart3D solvers. Applied Modeling and Simulation Branch, NASA Advanced Supercomputing Division, funded by New York Space Grant.
Extension of the Time-Spectral Approach to Overset Solvers for Arbitrary Motion
NASA Technical Reports Server (NTRS)
Leffell, Joshua Isaac; Murman, Scott M.; Pulliam, Thomas H.
2012-01-01
Forced periodic flows arise in a broad range of aerodynamic applications such as rotorcraft, turbomachinery, and flapping wing configurations. Standard practice involves solving the unsteady flow equations forward in time until the initial transient exits the domain and a statistically stationary flow is achieved. It is often required to simulate through several periods to remove the initial transient making unsteady design optimization prohibitively expensive for most realistic problems. An effort to reduce the computational cost of these calculations led to the development of the Harmonic Balance method [1, 2] which capitalizes on the periodic nature of the solution. The approach exploits the fact that forced temporally periodic flow, while varying in the time domain, is invariant in the frequency domain. Expanding the temporal variation at each spatial node into a Fourier series transforms the unsteady governing equations into a steady set of equations in integer harmonics that can be tackled with the acceleration techniques afforded to steady-state flow solvers. Other similar approaches, such as the Nonlinear Frequency Domain [3,4,5], Reduced Frequency [6] and Time-Spectral [7, 8, 9] methods, were developed shortly thereafter. Additionally, adjoint-based optimization techniques can be applied [10, 11] as well as frequency-adaptive methods [12, 13, 14] to provide even more flexibility to the method. The Fourier temporal basis functions imply spectral convergence as the number of harmonic modes, and correspondingly number of time samples, N, is increased. Some elect to solve the equations in the frequency domain directly, while others choose to transform the equations back into the time domain to simplify the process of adding this capability to existing solvers, but each harnesses the underlying steady solution in the frequency domain. These temporal projection methods will herein be collectively referred to as Time-Spectral methods. Time-Spectral methods have
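The core of the time-domain variant described above is a dense operator that differentiates a periodic signal from its N equally spaced samples, so the time derivative at each sample couples all samples in the period. The sketch below builds the standard Fourier collocation derivative matrix for an odd number of samples; the function names are hypothetical and the sketch is not taken from any of the cited solvers.

```python
import math


def time_spectral_matrix(n_samples, period):
    """Fourier collocation derivative matrix for an odd number of time samples."""
    if n_samples % 2 == 0:
        raise ValueError("this sketch covers only odd sample counts")
    scale = 2.0 * math.pi / period
    d = [[0.0] * n_samples for _ in range(n_samples)]
    for j in range(n_samples):
        for k in range(n_samples):
            if j != k:
                m = j - k
                sign = -1.0 if m % 2 else 1.0
                # D_jk = (2*pi/T) * 0.5 * (-1)^(j-k) / sin(pi*(j-k)/N)
                d[j][k] = scale * 0.5 * sign / math.sin(math.pi * m / n_samples)
    return d


def apply_operator(d, samples):
    """Approximate du/dt at every time sample as a matrix-vector product."""
    return [sum(d_jk * u_k for d_jk, u_k in zip(row, samples)) for row in d]
```

With N samples the operator differentiates exactly any signal containing harmonics up to (N - 1)/2, which is the spectral convergence property mentioned in the abstract. Replacing the time derivative by this operator turns the unsteady equations into a coupled steady system over the N time samples.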
Intervertebral disc creep behavior assessment through an open source finite element solver.
Castro, A P G; Wilson, W; Huyghe, J M; Ito, K; Alves, J L
2014-01-01
Degenerative Disc Disease (DDD) is one of the largest health problems faced worldwide, based on lost working time and associated costs. Motivated by this, the present work aims to evaluate a biomimetic Finite Element (FE) model of the Intervertebral Disc (IVD). Recent studies have emphasized the importance of accurate biomechanical modeling of the IVD, as it is a highly complex multiphasic medium. Poroelastic models of the disc are mostly implemented in commercial finite element packages with limited access to the algorithms. Therefore, a novel poroelastic formulation implemented in a home-developed open source FE solver is briefly addressed in this paper. The combination of this formulation with biphasic osmotic swelling behavior is also taken into account. Numerical simulations were devoted to the analysis of the time-dependent behavior of the non-degenerated human lumbar IVD. The results of the tests performed for creep assessment were within the range of the experimental data, with a remarkable improvement in numerical accuracy when compared with previously published results obtained with ABAQUS®. In brief, this in-development open-source FE solver was validated with experimental data from the literature and aims to be a valuable tool to study IVD biomechanics and DDD mechanisms. PMID:24210477
NASA Astrophysics Data System (ADS)
Sun, Y.; Shu, C.; Teo, C. J.; Wang, Y.; Yang, L. M.
2015-11-01
In this paper, a gas-kinetic flux solver (GKFS) is presented for the simulation of incompressible and compressible viscous flows. In this solver, the finite volume method is applied to discretize the Navier-Stokes equations. The inviscid and viscous fluxes at the interface are obtained simultaneously via the gas-kinetic scheme, which locally reconstructs the solution of the continuous Boltzmann equation. Different from the conventional gas-kinetic BGK scheme [1], a simple way is presented in this work to evaluate the non-equilibrium distribution function, which is calculated from the difference of equilibrium distribution functions at the cell interface and its surrounding points. As a consequence, explicit formulations for computing the conservative flow variables and fluxes are simply derived. In particular, three specific schemes are proposed and validated via several incompressible and compressible test examples. Numerical results show that all three schemes can provide accurate numerical results for incompressible flows. On the other hand, Scheme III is much more stable and consistent in the simulation of compressible flows.
A Comparison of Three Navier-Stokes Solvers for Exhaust Nozzle Flowfields
NASA Technical Reports Server (NTRS)
Georgiadis, Nicholas J.; Yoder, Dennis A.; Debonis, James R.
1999-01-01
A comparison of the NPARC, PAB, and WIND (previously known as NASTD) Navier-Stokes solvers is made for two flow cases with turbulent mixing as the dominant flow characteristic, a two-dimensional ejector nozzle and a Mach 1.5 elliptic jet. The objective of the work is to determine if comparable predictions of nozzle flows can be obtained from different Navier-Stokes codes employed in a multiple site research program. A single computational grid was constructed for each of the two flows and used for all of the Navier-Stokes solvers. In addition, similar k-ε based turbulence models were employed in each code, and boundary conditions were specified as similarly as possible across the codes. Comparisons of mass flow rates, velocity profiles, and turbulence model quantities are made between the computations and experimental data. The computational cost of obtaining converged solutions with each of the codes is also documented. Results indicate that all of the codes provided similar predictions for the two nozzle flows. Agreement of the Navier-Stokes calculations with experimental data was good for the ejector nozzle. However, for the Mach 1.5 elliptic jet, the calculations were unable to accurately capture the development of the three dimensional elliptic mixing layer.
NASA Astrophysics Data System (ADS)
Prihantoro, Rudy; Sutarno, Doddy; Nurhasan
2016-08-01
In this work, we seek a numerical solution of the 3-D magnetotelluric (MT) problem using the edge-based finite element method. This approach is a variant of the standard finite element method and is commonly referred to as the vector finite-element (VFE) method. Nonphysical solutions usually occur when the solution is sought using the standard finite element method, which is node based. The vector finite element method attempts to overcome those nonphysical solutions by using the edges of the element as the vector basis. The proposed approach solves the second-order Maxwell differential equation of 3-D MT with a direct solver rather than an iterative method. Therefore, the divergence correction used to accelerate the convergence of an iterative solution is no longer needed. The use of the direct solver has been verified previously by comparing the resulting solutions with analytical solutions, as well as with solutions from other numerical methods, for a layered-earth model, 2-D models, and the COMMEMI 3D-2 model. In this work, further verification using the recent comparison model Dublin Test Model 1 (DTM1) is presented.
Verification and Validation of a Chemical Reaction Solver Coupled to the Piecewise Parabolic Method
NASA Astrophysics Data System (ADS)
Attal, Nitesh; Ramaprabhu, Praveen; Hossain, Jahed; Karkhanis, Varad; Roy, Sukesh; Gord, James; Uddin, Mesbah
2012-11-01
We present a detailed chemical kinetics reaction solver coupled to the Piecewise Parabolic Method (PPM) embedded in the widely used astrophysical FLASH code. The FLASH code solves the compressible Euler equations with a directionally split PPM scheme with Adaptive Mesh Refinement (AMR). The reaction network is solved using a library of coupled ODE solvers, specialized for handling stiff systems of equations. Finally, the diffusion of heat, mass, and momentum is handled either through an update of the fluxes of each quantity, or by directly solving a diffusion equation for each. The resulting product is capable of handling a variety of physics such as gas-phase chemical kinetics, diffusive transport of mass, momentum, and heat, shocks, sharp interfaces, multi-species mixtures, and thermal radiation. We will present results from verification and validation of the above capabilities through comparison with analytical solutions, and published numerical and experimental data. Our validation cases include advection of reacting fronts in 1-D and 2-D, laminar premixed flames in a Bunsen burner configuration, and shock-driven combustion. We acknowledge funding from Spectral Energies LLC.
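Stiff reaction systems are typically advanced with implicit integrators, because explicit steps become unstable once the time step exceeds the fastest reaction time scale. The following is a minimal sketch of that idea, not code from FLASH or from the authors' ODE library: a single linear relaxation dy/dt = -rate * (y - y_eq) is advanced with backward Euler and, for contrast, with forward Euler. The function names, the rate, and the step size are all hypothetical.

```python
def backward_euler(y0, y_eq, rate, dt, steps):
    """Implicit (backward) Euler for dy/dt = -rate * (y - y_eq)."""
    y = y0
    for _ in range(steps):
        # Solve y_new = y + dt * (-rate * (y_new - y_eq)) for y_new.
        y = (y + dt * rate * y_eq) / (1.0 + dt * rate)
    return y


def forward_euler(y0, y_eq, rate, dt, steps):
    """Explicit (forward) Euler for the same equation."""
    y = y0
    for _ in range(steps):
        y = y + dt * (-rate * (y - y_eq))
    return y


# Assumed values: dt * rate = 100, far beyond the explicit stability limit of 2.
rate, dt = 1.0e4, 1.0e-2
y_implicit = backward_euler(1.0, 0.2, rate, dt, 10)
y_explicit = forward_euler(1.0, 0.2, rate, dt, 10)
```

With these values the implicit result relaxes to the equilibrium value y_eq, while the explicit result grows without bound, which is the behavior that motivates specialized stiff solvers.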
Turbulence boundary conditions for shear flow analysis, using the DTNS flow solver
NASA Technical Reports Server (NTRS)
Mizukami, M.
1995-01-01
The effects of different turbulence boundary conditions were examined for two classical flows: a turbulent plane free shear layer and a flat plate turbulent boundary layer with zero pressure gradient. The flow solver used was DTNS, an incompressible Reynolds averaged Navier-Stokes solver with k-epsilon turbulence modeling, developed at the U.S. Navy David Taylor Research Center. Six different combinations of turbulence boundary conditions at the inflow boundary were investigated: in case 1, 'exact' k and epsilon profiles were used; in case 2, the 'exact' k profile was used, and epsilon was extrapolated upstream; in case 3, both k and epsilon were extrapolated; in case 4, the turbulence intensity (I) was 1 percent, and the turbulent viscosity (mu_t) was equal to the laminar viscosity; in case 5, the 'exact' k profile was used and mu_t was equal to the laminar viscosity; in case 6, I was 1 percent, and epsilon was extrapolated. Comparisons were made with experimental data, direct numerical simulation results, or theoretical predictions as applicable. Results obtained with DTNS showed that turbulence boundary conditions can have significant impacts on the solutions, especially for the free shear layer.
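As a rough illustration of case 4 above, inflow values of k and epsilon can be derived from a turbulence intensity and a turbulent-to-laminar viscosity ratio using the standard k-epsilon relations k = 1.5 (I U)^2 and mu_t = C_mu rho k^2 / epsilon. The sketch below is not taken from DTNS: the function name, the freestream values, and the constant C_mu = 0.09 are assumptions made for illustration.

```python
C_MU = 0.09  # common k-epsilon model constant (assumed; not stated in the abstract)


def inflow_k_epsilon(u_inf, intensity, rho, mu_laminar, viscosity_ratio=1.0):
    """Return (k, epsilon) for a uniform inflow.

    intensity       : turbulence intensity I as a fraction (0.01 = 1 percent)
    viscosity_ratio : mu_t / mu_laminar (1.0 means mu_t equals the laminar value)
    """
    # Isotropic estimate of turbulent kinetic energy from the intensity.
    k = 1.5 * (intensity * u_inf) ** 2
    # Invert mu_t = C_mu * rho * k^2 / epsilon for epsilon.
    mu_t = viscosity_ratio * mu_laminar
    epsilon = C_MU * rho * k ** 2 / mu_t
    return k, epsilon


# Hypothetical freestream: 10 m/s, I = 1 percent, rho = 1 kg/m^3, mu = 1e-5 Pa s.
k, eps = inflow_k_epsilon(u_inf=10.0, intensity=0.01, rho=1.0, mu_laminar=1.0e-5)
```

Substituting the returned k and epsilon back into mu_t = C_mu rho k^2 / epsilon recovers the laminar viscosity, which is a quick consistency check on the inflow specification.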