Parallelizable approximate solvers for recursions arising in preconditioning
Shapira, Y.
1996-12-31
For the recursions used in the Modified Incomplete LU (MILU) preconditioner, namely, the incomplete decomposition, forward elimination and back substitution processes, a parallelizable approximate solver is presented. The present analysis shows that the solutions of the recursions depend only weakly on their initial conditions and may be interpreted to indicate that the inexact solution is close, in some sense, to the exact one. The method is based on a domain decomposition approach, suitable for parallel implementations with message passing architectures. It requires a fixed number of communication steps per preconditioned iteration, independent of the number of subdomains or the size of the problem. The overlapping subdomains are either cubes (suitable for mesh-connected arrays of processors) or constructed by the data-flow rule of the recursions (suitable for line-connected arrays with possibly SIMD or vector processors). Numerical examples show that, in both cases, the overhead in the number of iterations required for convergence of the preconditioned iteration is small relative to the speed-up gained.
An approximate Riemann solver for hypervelocity flows
NASA Technical Reports Server (NTRS)
Jacobs, Peter A.
1991-01-01
We describe an approximate Riemann solver for the computation of hypervelocity flows in which there are strong shocks and viscous interactions. The scheme has three stages, the first of which computes the intermediate states assuming isentropic waves. A second stage, based on the strong shock relations, may then be invoked if the pressure jump across either wave is large. The third stage interpolates the interface state from the two initial states and the intermediate states. The solver is used as part of a finite-volume code and is demonstrated on two test cases. The first is a high Mach number flow over a sphere while the second is a flow over a slender cone with an adiabatic boundary layer. In both cases the solver performs well.
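The first stage described above (intermediate states assuming isentropic waves) can be illustrated with a minimal sketch. It assumes a calorically perfect gas with constant ratio of specific heats, which the abstract does not state, and uses the standard two-rarefaction closed form rather than the author's own formulation. The function name and argument order are hypothetical.

```python
import math


def isentropic_star_state(rho_l, u_l, p_l, rho_r, u_r, p_r, gamma=1.4):
    """Intermediate pressure and velocity, treating both waves as isentropic.

    Assumption: ideal gas with constant gamma (not taken from the abstract).
    """
    a_l = math.sqrt(gamma * p_l / rho_l)  # left sound speed
    a_r = math.sqrt(gamma * p_r / rho_r)  # right sound speed
    z = (gamma - 1.0) / (2.0 * gamma)

    # Sum of the two Riemann invariants; non-positive means a vacuum forms.
    numerator = a_l + a_r - 0.5 * (gamma - 1.0) * (u_r - u_l)
    if numerator <= 0.0:
        raise ValueError("states separate fast enough to generate vacuum")

    denominator = a_l / p_l**z + a_r / p_r**z
    p_star = (numerator / denominator) ** (1.0 / z)

    # Velocity from the left-running Riemann invariant.
    u_star = u_l + 2.0 * a_l / (gamma - 1.0) * (1.0 - (p_star / p_l) ** z)
    return p_star, u_star
```

A second stage of the kind the abstract mentions would compare `p_star` with `p_l` and `p_r` and switch to shock relations when the ratio is large; that threshold is not given in the abstract and is left out here.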
Approximate Riemann solvers for the Godunov SPH (GSPH)
NASA Astrophysics Data System (ADS)
Puri, Kunal; Ramachandran, Prabhu
2014-08-01
The Godunov Smoothed Particle Hydrodynamics (GSPH) method is coupled with non-iterative, approximate Riemann solvers for solutions to the compressible Euler equations. The use of approximate solvers avoids the expensive solution of the non-linear Riemann problem for every interacting particle pair, as required by GSPH. In addition, we establish an equivalence between the dissipative terms of GSPH and the signal-based SPH artificial viscosity, under the restriction of a class of approximate Riemann solvers. This equivalence is used to explain the anomalous “wall heating” experienced by GSPH and we provide some suggestions to overcome it. Numerical tests in one and two dimensions are used to validate the proposed Riemann solvers. A general SPH pairing instability is observed for two-dimensional problems when using unequal mass particles. In general, the Ducowicz, Roe and HLLC approximate Riemann solvers are found to be suitable replacements for the iterative Riemann solver in the original GSPH scheme.
A 3D approximate maximum likelihood localization solver
2016-09-23
A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with acoustic transmitters and vocalizing marine mammals to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives and support Marine Renewable Energy. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
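As a rough illustration of the quantity such a solver works with, the sketch below evaluates a sum-of-squares misfit between measured and predicted time differences of arrival. It assumes independent Gaussian timing errors of equal variance, in which case maximizing the likelihood is equivalent to minimizing this misfit. The function names, the choice of the first hydrophone as reference, and the nominal sound speed are all assumptions for illustration and are not taken from the described software.

```python
import math


def simulate_tdoas(position, hydrophones, sound_speed=1480.0):
    """Noise-free TDOAs relative to the first hydrophone (hypothetical helper)."""
    ref_range = math.dist(position, hydrophones[0])
    return [
        (math.dist(position, h) - ref_range) / sound_speed
        for h in hydrophones[1:]
    ]


def tdoa_misfit(position, hydrophones, measured_tdoas, sound_speed=1480.0):
    """Sum of squared differences between predicted and measured TDOAs.

    sound_speed is an assumed nominal value in m/s, not a value from the source.
    """
    predicted = simulate_tdoas(position, hydrophones, sound_speed)
    return sum((p - m) ** 2 for p, m in zip(predicted, measured_tdoas))
```

A location estimate would then be the position that minimizes `tdoa_misfit`, found with whatever optimizer is at hand; the abstract does not say which one the authors use.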
Parallel iterative solvers and preconditioners using approximate hierarchical methods
Grama, A.; Kumar, V.; Sameh, A.
1996-12-31
In this paper, we report results of the performance, convergence, and accuracy of a parallel GMRES solver for Boundary Element Methods. The solver uses a hierarchical approximate matrix-vector product based on a hybrid Barnes-Hut / Fast Multipole Method. We study the impact of various accuracy parameters on the convergence and show that with minimal loss in accuracy, our solver yields significant speedups. We demonstrate the excellent parallel efficiency and scalability of our solver. The combined speedups from approximation and parallelism represent an improvement of several orders of magnitude in solution time. We also develop fast and parallelizable preconditioners for this problem. We report on the performance of an inner-outer scheme and a preconditioner based on a truncated Green's function. Experimental results on a 256-processor Cray T3D are presented.
NONLINEAR MULTIGRID SOLVER EXPLOITING AMGe COARSE SPACES WITH APPROXIMATION PROPERTIES
Christensen, Max La Cour; Villa, Umberto E.; Engsig-Karup, Allan P.; Vassilevski, Panayot S.
2016-01-22
The paper introduces a nonlinear multigrid solver for mixed finite element discretizations based on the Full Approximation Scheme (FAS) and element-based Algebraic Multigrid (AMGe). The main motivation to use FAS for unstructured problems is the guaranteed approximation property of the AMGe coarse spaces that were developed recently at Lawrence Livermore National Laboratory. These give the ability to derive stable and accurate coarse nonlinear discretization problems. The previous attempts (including ones with the original AMGe method, [5, 11]) were less successful due to the lack of such good approximation properties of the coarse spaces. With coarse spaces with approximation properties, our FAS approach on unstructured meshes should be as powerful/successful as FAS on geometrically refined meshes. For comparison, Newton's method and Picard iterations with an inner state-of-the-art linear solver are compared to FAS on a nonlinear saddle point problem with applications to porous media flow. It is demonstrated that FAS is faster than Newton's method and Picard iterations for the experiments considered here. Due to the guaranteed approximation properties of our AMGe, the coarse spaces are very accurate, providing a solver with the potential for mesh-independent convergence on general unstructured meshes.
Approximate Riemann solvers for the cosmic ray magnetohydrodynamical equations
NASA Astrophysics Data System (ADS)
Kudoh, Yuki; Hanawa, Tomoyuki
2016-11-01
We analyse the cosmic ray magnetohydrodynamic (CR MHD) equations to improve the numerical simulations. We propose to solve them in the fully conservation form, which is equivalent to the conventional CR MHD equations. In the fully conservation form, the CR energy equation is replaced with the CR 'number' conservation, where the CR number density is defined as the three-fourths power of the CR energy density. The former contains an extra source term, while the latter does not. An approximate Riemann solver is derived from the CR MHD equations in the fully conservation form. Based on the analysis, we propose a numerical scheme whose solutions satisfy the Rankine-Hugoniot relation at any shock. We demonstrate that it reproduces the Riemann solution derived by Pfrommer et al. for a 1D CR hydrodynamic shock tube problem. We compare the solution with those obtained by solving the CR energy equation. The latter solutions deviate seriously from the Riemann solution when the CR pressure dominates over the gas pressure in the post-shocked gas. The former solutions converge to the Riemann solution and are second-order accurate in space and time. Our numerical examples include the expansion of a high-pressure sphere in a magnetized medium. Fast and slow shocks are sharply resolved in the example. We also discuss a possible extension of the CR MHD equations to evaluate the average CR energy.
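A short worked derivation may help show why a three-fourths power yields a source-free conservation law. It assumes a CR adiabatic index of 4/3, advection of CRs with the gas velocity, and no CR diffusion; the symbols are illustrative notation, not necessarily the authors'.

```latex
\begin{align}
\frac{\partial E_{cr}}{\partial t} + \nabla\cdot\left(E_{cr}\,\mathbf{v}\right)
  &= -P_{cr}\,\nabla\cdot\mathbf{v},
  \qquad P_{cr} = (\gamma_{cr}-1)\,E_{cr}, \quad \gamma_{cr} = \tfrac{4}{3}, \\
\rho_{cr} &\equiv E_{cr}^{1/\gamma_{cr}} = E_{cr}^{3/4}, \\
\frac{\partial \rho_{cr}}{\partial t}
  &= \frac{1}{\gamma_{cr}}\,E_{cr}^{1/\gamma_{cr}-1}
     \left(-\mathbf{v}\cdot\nabla E_{cr} - \gamma_{cr}\,E_{cr}\,\nabla\cdot\mathbf{v}\right)
   = -\nabla\cdot\left(\rho_{cr}\,\mathbf{v}\right).
\end{align}
```

The first line carries the source term on its right-hand side, while the last line is in pure conservation form, which is the property the abstract relies on.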
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.
2014-11-27
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.
2014-01-01
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature. PMID:25427517
Parallelizable adiabatic gate teleportation
NASA Astrophysics Data System (ADS)
Nakago, Kosuke; Hajdušek, Michal; Nakayama, Shojun; Murao, Mio
2015-12-01
To investigate how a temporally ordered gate sequence can be parallelized in adiabatic implementations of quantum computation, we modify adiabatic gate teleportation, a model of quantum computation proposed by Bacon and Flammia [Phys. Rev. Lett. 103, 120504 (2009), 10.1103/PhysRevLett.103.120504], to a form deterministically simulating parallelized gate teleportation, which is achievable only by postselection. We introduce a twisted Heisenberg-type interaction Hamiltonian, a Heisenberg-type spin interaction where the coordinates of the second qubit are twisted according to a unitary gate. We develop parallelizable adiabatic gate teleportation (PAGT) where a sequence of unitary gates is performed in a single step of the adiabatic process. In PAGT, numeric calculations suggest the necessary time for the adiabatic evolution implementing a sequence of L unitary gates increases at most as O(L^5). However, we show that it has the interesting property that it can map the temporal order of gates to the spatial order of interactions specified by the final Hamiltonian. Using this property, we present a controlled-PAGT scheme to manipulate the order of gates by a control qubit. In the controlled-PAGT scheme, two differently ordered sequential unitary gates FG and GF are coherently performed depending on the state of a control qubit by simultaneously applying the twisted Heisenberg-type interaction Hamiltonians implementing unitary gates F and G. We investigate why the twisted Heisenberg-type interaction Hamiltonian allows PAGT. We show that the twisted Heisenberg-type interaction Hamiltonian has an ability to perform a transposed unitary gate by just modifying the space ordering of the final Hamiltonian implementing a unitary gate in adiabatic gate teleportation. The dynamics generated by the time-reversed Hamiltonian represented by the transposed unitary gate enables deterministic simulation of a postselected event of parallelized gate teleportation in adiabatic
On Using a Fast Multipole Method-based Poisson Solver in an Approximate Projection Method
Williams, Sarah A.; Almgren, Ann S.; Puckett, E. Gerry
2006-03-28
Approximate projection methods are useful computational tools for solving the equations of time-dependent incompressible flow. In this report we will present a new discretization of the approximate projection in an approximate projection method. The discretizations of divergence and gradient will be identical to those in existing approximate projection methodology using cell-centered values of pressure; however, we will replace inversion of the five-point cell-centered discretization of the Laplacian operator by a Fast Multipole Method-based Poisson Solver (FMM-PS). We will show that the FMM-PS solver can be an accurate and robust component of an approximate projection method for constant density, inviscid, incompressible flow problems. Computational examples exhibiting second-order accuracy for smooth problems will be shown. The FMM-PS solver will be found to be more robust than inversion of the standard five-point cell-centered discretization of the Laplacian for certain time-dependent problems that challenge the robustness of the approximate projection methodology.
An approximate Riemann solver for real gas parabolized Navier-Stokes equations
Urbano, Annafederica; Nasuti, Francesco
2013-01-15
Under specific assumptions, parabolized Navier-Stokes equations are a suitable means to study channel flows. A special case is that of high pressure flow of real gases in cooling channels where large crosswise gradients of thermophysical properties occur. To solve the parabolized Navier-Stokes equations by a space marching approach, the hyperbolicity of the system of governing equations is obtained, even for very low Mach number flow, by recasting the equations such that the streamwise pressure gradient is considered as a source term. For this system of equations an approximate Roe's Riemann solver is developed as the core of a Godunov type finite volume algorithm. The properties of the approximate Riemann solver, which is a modification of Roe's Riemann solver for the parabolized Navier-Stokes equations, are presented and discussed with emphasis given to its original features introduced to handle fluids governed by a generic real gas EoS. Sample solutions are obtained for low Mach number, highly compressible flows of transcritical methane, heated in straight long channels, to prove the solver's ability to describe flows dominated by complex thermodynamic phenomena.
NASA Astrophysics Data System (ADS)
Castro, Manuel J.; Gallardo, José M.; Marquina, Antonio
2017-10-01
We present recent advances in PVM (Polynomial Viscosity Matrix) methods based on internal approximations to the absolute value function, and compare them with Chebyshev-based PVM solvers. These solvers only require a bound on the maximum wave speed, so no spectral decomposition is needed. Another important feature of the proposed methods is that they are suitable to be written in Jacobian-free form, in which only evaluations of the physical flux are used. This is particularly interesting when considering systems for which the Jacobians involve complex expressions, e.g., the relativistic magnetohydrodynamics (RMHD) equations. On the other hand, the proposed Jacobian-free solvers have also been extended to the case of approximate DOT (Dumbser-Osher-Toro) methods, which can be regarded as simple and efficient approximations to the classical Osher-Solomon method, sharing most of its interesting features and being applicable to general hyperbolic systems. To test the properties of our schemes a number of numerical experiments involving the RMHD equations are presented, both in one and two dimensions. The obtained results are in good agreement with those found in the literature and show that our schemes are robust and accurate, running stably under a satisfactory time step restriction. It is worth emphasizing that, although this work focuses on RMHD, the proposed schemes are suitable to be applied to general hyperbolic systems.
An approximate Riemann solver for magnetohydrodynamics (that works in more than one dimension)
NASA Technical Reports Server (NTRS)
Powell, Kenneth G.
1994-01-01
An approximate Riemann solver is developed for the governing equations of ideal magnetohydrodynamics (MHD). The Riemann solver has an eight-wave structure, where seven of the waves are those used in previous work on upwind schemes for MHD, and the eighth wave is related to the divergence of the magnetic field. The structure of the eighth wave is not immediately obvious from the governing equations as they are usually written, but arises from a modification of the equations that is presented in this paper. The addition of the eighth wave allows multidimensional MHD problems to be solved without the use of staggered grids or a projection scheme, one or the other of which was necessary in previous work on upwind schemes for MHD. A test problem made up of a shock tube with rotated initial conditions is solved to show that the two-dimensional code yields answers consistent with the one-dimensional methods developed previously.
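For orientation, the modification referred to above is commonly quoted as a source term proportional to the divergence of the magnetic field added to the conservative ideal MHD system. The form below is that commonly quoted version, written in units where the magnetic permeability is absorbed into B; the paper's own notation and normalization may differ.

```latex
\begin{equation}
\frac{\partial}{\partial t}
\begin{pmatrix} \rho \\ \rho\,\mathbf{u} \\ \mathbf{B} \\ E \end{pmatrix}
+ \nabla\cdot\mathbf{F}
= -\left(\nabla\cdot\mathbf{B}\right)
\begin{pmatrix} 0 \\ \mathbf{B} \\ \mathbf{u} \\ \mathbf{u}\cdot\mathbf{B} \end{pmatrix}
\end{equation}
```

Here F denotes the usual ideal MHD flux tensor. With this source term the system gains an eighth wave that advects the divergence of B with the flow velocity u, and the right-hand side vanishes whenever the field is divergence-free.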
NASA Astrophysics Data System (ADS)
Jouvet, Guillaume
2015-04-01
In this paper, a multilayer generalisation of the Shallow Shelf Approximation (SSA) is considered. In this recent hybrid ice flow model, the ice thickness is divided into thin layers, which can spread out, contract and slide over each other in such a way that the velocity profile is layer-wise constant. Like the SSA (1-layer model), the multilayer model can be reformulated as a minimisation problem. However, unlike the SSA, the functional to be minimised involves a new penalisation term for the interlayer jumps of the velocity, which represents the vertical shear stresses induced by interlayer sliding. Taking advantage of this reformulation, numerical solvers developed for the SSA can be naturally extended layer-wise or column-wise. Numerical results show that the column-wise extension of a Newton multigrid solver proves to be robust in the sense that its convergence is barely influenced by the number of layers and the type of ice flow. In addition, the multilayer formulation appears to be naturally better conditioned than the one of the first-order approximation to face the anisotropic conditions of the sliding-dominant ice flow of ISMIP-HOM experiments.
Gorpas, Dimitris; Andersson-Engels, Stefan
2012-12-01
The solution of the forward problem in fluorescence molecular imaging strongly influences the successful convergence of the fluorophore reconstruction. The most common approach to this problem has been to apply the diffusion approximation. However, this model is a first-order angular approximation of the radiative transfer equation, and thus is subject to some well-known limitations. This manuscript proposes a methodology that confronts these limitations by applying the radiative transfer equation in spatial regions in which the diffusion approximation gives decreased accuracy. The explicit integro-differential equations that formulate this model were solved by applying the Galerkin finite element approximation. The required spatial discretization of the investigated domain was implemented through the Delaunay triangulation, while the azimuthal discretization scheme was used for the angular space. This model has been evaluated on two simulation geometries and the results were compared with results from an independent Monte Carlo method and the radiative transfer equation by calculating the absolute values of the relative errors between these models. The results show that the proposed forward solver can approximate the radiative transfer equation and the Monte Carlo method with better than 95% accuracy, while the accuracy of the diffusion approximation is approximately 10% lower.
Sequentially Optimized Meshfree Approximation as a New Computational Fluid Dynamics Solver
NASA Astrophysics Data System (ADS)
Wilkinson, Matthew
This thesis presents the Sequentially Optimized Meshfree Approximation (SOMA) method, a new and powerful Computational Fluid Dynamics (CFD) solver. While standard computational methods can be faster and cheaper than physical experimentation, both in cost and work time, these methods do have some time and user interaction overhead which SOMA eliminates. As a meshfree method which could use adaptive domain refinement methods, SOMA avoids the need for user generated and/or analyzed grids, volumes, and meshes. Incremental building of a feed-forward artificial neural network through machine learning to solve the flow problem significantly reduces user interaction and reduces computational cost. This is done by avoiding the creation and inversion of possibly dense block diagonal matrices and by focusing computational work on regions where the flow changes and ignoring regions where no changes occur.
Regnier, D.; Verriere, M.; Dubray, N.; Schunck, N.
2015-11-30
In this study, we describe the software package FELIX that solves the equations of the time-dependent generator coordinate method (TDGCM) in N dimensions (N ≥ 1) under the Gaussian overlap approximation. The numerical resolution is based on the Galerkin finite element discretization of the collective space and the Crank–Nicolson scheme for time integration. The TDGCM solver is implemented entirely in C++. Several additional tools written in C++, Python or bash scripting language are also included for convenience. In this paper, the solver is tested with a series of benchmark calculations. We also demonstrate the ability of our code to handle a realistic calculation of fission dynamics.
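The combination of a Galerkin discretization with Crank–Nicolson time stepping can be written generically as follows. This is a textbook form for a Schrödinger-like evolution equation, given only as a sketch; the matrices M (overlap or mass matrix), H (discretized collective Hamiltonian) and the coefficient vector G are illustrative symbols, and the actual equations solved by FELIX may be arranged differently.

```latex
\begin{align}
i\hbar\, M\,\frac{dG}{dt} &= H\,G, \\
\left(M + \frac{i\,\Delta t}{2\hbar}\,H\right) G^{n+1}
  &= \left(M - \frac{i\,\Delta t}{2\hbar}\,H\right) G^{n}.
\end{align}
```

Each time step therefore requires one sparse linear solve. For Hermitian M and H the update conserves the norm of G measured in the M inner product, which is one common reason for choosing this scheme.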
NASA Astrophysics Data System (ADS)
Lin, Xue-lei; Lu, Xin; Ng, Micheal K.; Sun, Hai-Wei
2016-10-01
A fast, accurate approximation method with a multigrid solver is proposed to solve a two-dimensional fractional sub-diffusion equation. Using the finite difference discretization of the fractional time derivative, a block lower triangular Toeplitz matrix is obtained where each main diagonal block contains a two-dimensional matrix for the Laplacian operator. Our idea is to make use of the block ε-circulant approximation via fast Fourier transforms, so that the resulting task is to solve a block diagonal system, where each diagonal block matrix is the sum of a complex scalar times the identity matrix and a Laplacian matrix. We show that the accuracy of the approximation scheme is of O(ε). Because of the special diagonal block structure, we employ the multigrid method to solve the resulting linear systems. The convergence of the multigrid method is studied. Numerical examples are presented to illustrate the accuracy of the proposed approximation scheme and the efficiency of the proposed solver.
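The ε-circulant idea can be illustrated on a scalar lower triangular Toeplitz system, used here as a stand-in for the block case in the abstract. In this minimal sketch the entries that would wrap around in a circulant matrix are multiplied by ε, and a diagonal scaling by powers of ε^(1/n) turns the matrix into an ordinary circulant that a discrete Fourier transform diagonalizes. The function names are hypothetical, and a naive O(n²) transform is used to keep the example self-contained.

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), enough for a small demo."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for row in range(n):
        acc = 0j
        for col in range(n):
            acc += values[col] * cmath.exp(sign * 2j * cmath.pi * row * col / n)
        out.append(acc / n if inverse else acc)
    return out


def solve_eps_circulant(first_column, rhs, eps):
    """Solve C_eps x = rhs, where C_eps is the eps-circulant matrix built
    from the first column of a lower triangular Toeplitz matrix."""
    n = len(first_column)
    delta = eps ** (1.0 / n)
    scale = [delta**k for k in range(n)]  # diagonal of D = diag(delta^k)
    # D C_eps D^{-1} is circulant with first column delta^k * t_k.
    symbol = dft([s * t for s, t in zip(scale, first_column)])
    rhs_hat = dft([s * b for s, b in zip(scale, rhs)])
    y = dft([b / lam for b, lam in zip(rhs_hat, symbol)], inverse=True)
    return [(yk / s).real for yk, s in zip(y, scale)]


def solve_lower_toeplitz(first_column, rhs):
    """Reference solution by forward substitution."""
    x = []
    for i in range(len(rhs)):
        acc = rhs[i] - sum(first_column[i - j] * x[j] for j in range(i))
        x.append(acc / first_column[0])
    return x
```

In the block setting of the abstract, each division by `lam` would instead become a linear solve with a shifted Laplacian, which is where the multigrid method enters.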
Divergence-free approximate Riemann solver for the quasi-neutral two-fluid plasma model
NASA Astrophysics Data System (ADS)
Amano, Takanobu
2015-10-01
A numerical method for the quasi-neutral two-fluid (QNTF) plasma model is described. The basic equations are ion and electron fluid equations and the Maxwell equations without displacement current. The neglect of displacement current is consistent with the assumption of charge neutrality. Therefore, Langmuir waves and electromagnetic waves are eliminated from the system, which is in clear contrast to the fully electromagnetic two-fluid model. It thus reduces to the ideal magnetohydrodynamic (MHD) equations in the long wavelength limit, but the two-fluid effect appearing at ion and electron inertial scales is fully taken into account. It is shown that the basic equations may be rewritten in a form that has formally the same structure as the MHD equations. The total mass, momentum, and energy are all written in the conservative form. A new three-dimensional numerical simulation code has been developed for the QNTF equations. The HLL (Harten-Lax-van Leer) approximate Riemann solver combined with the upwind constrained transport (UCT) scheme is applied. The method was originally developed for MHD [25], but works quite well for the present model as well. The simulation code is able to capture sharp multidimensional discontinuities as well as dispersive waves arising from the two-fluid effect at small scales without producing ∇ · B errors. It is well known that conventional Hall-MHD codes often suffer a numerical stability issue associated with short wavelength whistler waves. On the other hand, since finite electron inertia introduces an upper bound to the phase speed of whistler waves in the present model, our code is free from the issue even without explicit dissipation terms or implicit time integration. Numerical experiments have confirmed that there is no need to resolve characteristic time scales such as plasma frequency or cyclotron frequency for numerical stability. Consequently, the QNTF model offers a better alternative to the Hall-MHD or fully
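The HLL flux named in this abstract has a compact generic form, sketched here for the one-dimensional ideal-gas Euler equations rather than for the QNTF system itself. The simple minimum and maximum wave-speed estimates, the fixed value of gamma, and all function names are assumptions chosen for illustration.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats for this demo


def conserved(rho, u, p):
    """Conserved variables (mass, momentum, total energy)."""
    return (rho, rho * u, p / (GAMMA - 1.0) + 0.5 * rho * u * u)


def euler_flux(rho, u, p):
    """Physical flux of the 1D Euler equations."""
    energy = p / (GAMMA - 1.0) + 0.5 * rho * u * u
    return (rho * u, rho * u * u + p, u * (energy + p))


def hll_flux(left, right):
    """HLL numerical flux between primitive states (rho, u, p)."""
    rho_l, u_l, p_l = left
    rho_r, u_r, p_r = right
    a_l = math.sqrt(GAMMA * p_l / rho_l)
    a_r = math.sqrt(GAMMA * p_r / rho_r)

    # Simple bounds on the slowest and fastest signal speeds.
    s_l = min(u_l - a_l, u_r - a_r)
    s_r = max(u_l + a_l, u_r + a_r)

    f_l, f_r = euler_flux(*left), euler_flux(*right)
    if s_l >= 0.0:   # all waves move right: upwind on the left state
        return f_l
    if s_r <= 0.0:   # all waves move left: upwind on the right state
        return f_r

    q_l, q_r = conserved(*left), conserved(*right)
    return tuple(
        (s_r * fl - s_l * fr + s_l * s_r * (qr - ql)) / (s_r - s_l)
        for fl, fr, ql, qr in zip(f_l, f_r, q_l, q_r)
    )
```

Only the two bounding wave speeds and evaluations of the physical flux are needed, with no eigenvector decomposition, which is part of why this family of solvers carries over easily to larger systems.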
NASA Astrophysics Data System (ADS)
Lochon, H.; Daude, F.; Galon, P.; Hérard, J.-M.
2016-12-01
The computation of compressible two-phase flows with the Baer-Nunziato model is addressed. Only the convective part of the model that exhibits non-conservative products is considered and the source terms of the model that represent the exchange between phases are neglected. Based on the solver proposed by Tokareva & Toro [1], a new HLLC-type Riemann solver is built. The key idea of this new solver lies in an approximation of the two-phase contact discontinuity of the model. Thus the Riemann invariants of the wave are approximated in the "subsonic" case. A major consequence of this approximation is that the resulting solver can deal with any Equation Of State. It also makes it possible to bypass the resolution of a non-linear equation based on those Riemann invariants. We assess the solver and compare it with others on 1D Riemann problems including grid convergence and efficiency studies. The ability of the proposed solver to deal with complex Equations Of State is also investigated. Finally, the different solvers have been compared on 2D test cases that are challenging due to the presence of both material interfaces and shock waves: a shock-bubble interaction and underwater explosions. When compared with others, the present solver appears to be accurate, efficient and robust.
NASA Technical Reports Server (NTRS)
Rumsey, Christopher L.; Van Leer, Bram; Roe, Philip L.
1991-01-01
A new two-dimensional approximate Riemann solver has been developed that obtains fluxes on grid faces via wave decomposition. By utilizing information propagation in the velocity-difference directions rather than in the grid-normal directions, this flux function more appropriately interprets and hence more sharply resolves shock and shear waves when they lie oblique to the grid. The model uses five waves to describe the difference in states at a grid face. Two acoustic waves, one shear wave, and one entropy wave propagate in the direction defined by the local velocity difference vector, while the fifth wave is a shear wave that propagates at a right angle to the other four. Test cases presented include a shock reflecting off a wall, a pure shear wave, supersonic flow over an airfoil, and viscous separated airfoil flow. Results using the new model give significantly sharper shock and shear contours than a grid-aligned solver. Navier-Stokes computations over an airfoil show reduced pressure distortions in the separated region as a result of the grid-independent upwinding.
Fast solvers for finite difference approximations for the Stokes and Navier-Stokes equations
Shin, D.
1992-01-01
The authors consider several methods for solving the linear equations arising from finite difference discretizations of the Stokes equations. The pressure equation method, apparently presented here for the first time, and the method presented by Bramble and Pasciak are shown to have computational effort that grows slowly with the number of grid points. The methods work with second-order accurate discretizations. Computational results are shown for both the Stokes equations and the incompressible Navier-Stokes equations at low Reynolds number. The inf-sup conditions resulting from three finite difference approximations of the Stokes equations are proven. These conditions are used to prove that the Schur complement Q_h of the linear system generated by each of these approximations is bounded uniformly away from zero. For the pressure equation method, this guarantees that the conjugate gradient method applied to Q_h converges in a finite number of iterations that is independent of mesh size. The fact that Q_h is bounded below is used to prove convergence estimates for the solutions generated by these finite difference approximations. One of the estimates is for a staggered grid, and it shows that both the pressure and the velocity parts of the solution are second-order accurate. Iterative methods are compared using the regularized central differencing introduced by Strikwerda. Several finite difference approximations of the Stokes equations solved by the SOR method are compared, and the superiority of regularized central differencing over the other finite difference approximations is noted. This differencing gives rise to a linear equation with a matrix that is slightly non-symmetric. Convergence is proven for the typical steepest descent method and for a conjugate gradient method, almost the same as the typical conjugate gradient method, applied to slightly non-symmetric positive definite matrices.
The Boundary Riemann Solver Coming from the Real Vanishing Viscosity Approximation
NASA Astrophysics Data System (ADS)
Bianchini, Stefano; Spinolo, Laura V.
2009-01-01
We study the limit of the hyperbolic-parabolic approximation $$\left\{\begin{array}{l} v^{\varepsilon}_t + \tilde{A}\left(v^{\varepsilon}, \varepsilon v^{\varepsilon}_x\right) v^{\varepsilon}_x = \varepsilon \tilde{B}(v^{\varepsilon}) v^{\varepsilon}_{xx}, \quad v^{\varepsilon} \in \mathbb{R}^N \\ \tilde{\text{\ss}}(v^{\varepsilon}(t, 0)) \equiv \bar{g} \\ v^{\varepsilon}(0, x) \equiv \bar{v}_0. \end{array}\right.$$ The function $\tilde{\text{\ss}}$ is defined in such a way as to guarantee that the initial boundary value problem is well posed even if $\tilde{B}$ is not invertible. The data $\bar{g}$ and $\bar{v}_0$ are constant. When $\tilde{B}$ is invertible, the previous problem takes the simpler form $$\left\{\begin{array}{l} v^{\varepsilon}_t + \tilde{A}\left(v^{\varepsilon}, \varepsilon v^{\varepsilon}_x\right) v^{\varepsilon}_x = \varepsilon \tilde{B}(v^{\varepsilon}) v^{\varepsilon}_{xx}, \quad v^{\varepsilon} \in \mathbb{R}^N \\ v^{\varepsilon}(t, 0) \equiv \bar{v}_b \\ v^{\varepsilon}(0, x) \equiv \bar{v}_0. \end{array}\right.$$ Again, the data $\bar{v}_b$ and $\bar{v}_0$ are constant. The conservative case is included in the previous formulations. Convergence of the $v^{\varepsilon}$, smallness of the total variation and other technical hypotheses are assumed, and a complete characterization of the limit is provided. The most interesting points are the following: First, the boundary characteristic case is considered, that is, one eigenvalue of $\tilde{A}$ can be 0. Second, as pointed out before, we take into account the possibility that $\tilde{B}$ is not invertible. To deal with this case, we take as hypotheses conditions that were introduced by Kawashima and Shizuta relying on physically meaningful examples. We also introduce a new condition of block linear degeneracy. We prove that, if this condition is not satisfied, then pathological behaviors may occur.
Connections in sub-Riemannian geometry of parallelizable distributions
NASA Astrophysics Data System (ADS)
Youssef, Nabil L.; Taha, Ebtsam H.
The notion of a parallelizable distribution has been introduced and investigated. A non-integrable parallelizable distribution carries a natural sub-Riemannian structure. The geometry of this structure has been studied from the bi-viewpoint of absolute parallelism geometry and sub-Riemannian geometry. Two remarkable linear connections have been constructed on a sub-Riemannian parallelizable distribution, namely, the Weitzenböck connection and the sub-Riemannian connection. The obtained results have been applied to two concrete examples: the spheres $S^3$ and $S^7$.
NASA Astrophysics Data System (ADS)
Bauer, Petr; Klement, Vladimír; Oberhuber, Tomáš; Žabka, Vítězslav
2016-03-01
We present a complete GPU implementation of a geometric multigrid solver for the numerical solution of the Navier-Stokes equations for incompressible flow. The approximate solution is constructed on a two-dimensional unstructured triangular mesh. The problem is discretized by means of the mixed finite element method with semi-implicit timestepping. The linear saddle-point problem arising from the scheme is solved by the geometric multigrid method with a Vanka-type smoother. The parallel solver is based on the red-black coloring of the mesh triangles. We achieved a speed-up of 11 compared to a parallel (4 threads) code based on OpenMP and 19 compared to a sequential code.
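The red-black coloring idea mentioned in the abstract above can be illustrated on a much simpler problem. The following is a minimal sketch, not the authors' GPU code: it assumes a structured square grid, the 5-point Laplace stencil and a plain Gauss-Seidel update instead of an unstructured triangular mesh with a Vanka-type smoother. The function names `red_black_sweep` and `solve_laplace` are hypothetical.

```python
def red_black_sweep(u):
    """One red-black Gauss-Seidel sweep for the 5-point Laplace stencil.

    u is a square list of lists; boundary entries hold fixed Dirichlet data.
    """
    n = len(u)
    for color in (0, 1):  # 0 = "red" points, 1 = "black" points
        # A point of one color only reads neighbors of the other color,
        # so all updates of the same color are independent of each other
        # and could be executed in parallel.
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == color:
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1])
    return u


def solve_laplace(n=8, sweeps=200):
    """Laplace problem with boundary value 1 everywhere (exact solution: 1)."""
    u = [[1.0 if i in (0, n - 1) or j in (0, n - 1) else 0.0
          for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        red_black_sweep(u)
    return u
```

In a parallel code, each color's loop nest would be handed to a separate kernel or thread team; the sequential loops here only show why that is safe.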
Tezaur, I. K.; Perego, M.; Salinger, A. G.; ...
2015-04-27
This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
Tezaur, I. K.; Perego, M.; Salinger, A. G.; Tuminaro, R. S.; Price, S. F.
2015-04-27
This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
NASA Astrophysics Data System (ADS)
Tezaur, I. K.; Perego, M.; Salinger, A. G.; Tuminaro, R. S.; Price, S. F.
2015-04-01
This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
NASA Astrophysics Data System (ADS)
Kalashnikova, I.; Perego, M.; Salinger, A. G.; Tuminaro, R. S.; Price, S. F.
2014-11-01
This paper describes a new parallel, scalable and robust finite-element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and Template-Based Generic Programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using: (1) new test cases derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution is then studied on problems involving a realistic Greenland ice sheet geometry discretized using structured and unstructured meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.
Homman, Ahmed-Amine; Maillet, Jean-Bernard; Roussel, Julien; Stoltz, Gabriel
2016-01-14
This work presents new parallelizable numerical schemes for the integration of dissipative particle dynamics with energy conservation. So far, no numerical scheme introduced in the literature is able to correctly preserve the energy over long times and give rise to small errors on average properties for moderately small time steps, while being straightforwardly parallelizable. We present in this article two new methods, both straightforwardly parallelizable, that correctly preserve the total energy of the system. We illustrate the accuracy and performance of these new schemes on both equilibrium and nonequilibrium parallel simulations.
Development and Characterization of a Parallelizable Perfusion Bioreactor for 3D Cell Culture
Egger, Dominik; Fischer, Monica; Clementi, Andreas; Ribitsch, Volker; Hansmann, Jan; Kasper, Cornelia
2017-01-01
The three dimensional (3D) cultivation of stem cells in dynamic bioreactor systems is essential in the context of regenerative medicine. Still, there is a lack of bioreactor systems that allow the cultivation of multiple independent samples under different conditions while ensuring comprehensive control over the mechanical environment. Therefore, we developed a miniaturized, parallelizable perfusion bioreactor system with two different bioreactor chambers. Pressure sensors were also implemented to determine the permeability of biomaterials which allows us to approximate the shear stress conditions. To characterize the flow velocity and shear stress profile of a porous scaffold in both bioreactor chambers, a computational fluid dynamics analysis was performed. Furthermore, the mixing behavior was characterized by acquisition of the residence time distributions. Finally, the effects of the different flow and shear stress profiles of the bioreactor chambers on osteogenic differentiation of human mesenchymal stem cells were evaluated in a proof of concept study. In conclusion, the data from computational fluid dynamics and shear stress calculations were found to be predictable for relative comparison of the bioreactor geometries, but not for final determination of the optimal flow rate. However, we suggest that the system is beneficial for parallel dynamic cultivation of multiple samples for 3D cell culture processes.
Hierarchically Parallelized Constrained Nonlinear Solvers with Automated Substructuring
NASA Technical Reports Server (NTRS)
Padovan, Joe; Kwang, Abel
1994-01-01
This paper develops a parallelizable multilevel multiple constrained nonlinear equation solver. The substructuring process is automated to yield appropriately balanced partitioning of each succeeding level. Due to the generality of the procedure, sequential, as well as partially and fully parallel environments can be handled. This includes both single and multiprocessor assignment per individual partition. Several benchmark examples are presented. These illustrate the robustness of the procedure as well as its capability to yield significant reductions in memory utilization and calculational effort due both to updating and inversion.
Real-space method for highly parallelizable electronic transport calculations
NASA Astrophysics Data System (ADS)
Feldman, Baruch; Seideman, Tamar; Hod, Oded; Kronik, Leeor
2014-07-01
We present a real-space method for first-principles nanoscale electronic transport calculations. We use the nonequilibrium Green's function method with density functional theory and implement absorbing boundary conditions (ABCs, also known as complex absorbing potentials, or CAPs) to represent the effects of the semi-infinite leads. In real space, the Kohn-Sham Hamiltonian matrix is highly sparse. As a result, the transport problem parallelizes naturally and can scale favorably with system size, enabling the computation of conductance in relatively large molecular junction models. Our use of ABCs circumvents the demanding task of explicitly calculating the leads' self-energies from surface Green's functions, and is expected to be more accurate than the use of the jellium approximation. In addition, we take advantage of the sparsity in real space to solve efficiently for the Green's function over the entire energy range relevant to low-bias transport. We illustrate the advantages of our method with calculations on several challenging test systems and find good agreement with reference calculation results.
NASA Technical Reports Server (NTRS)
Ilin, Andrew V.
2006-01-01
The Magnetic Field Solver computer program calculates the magnetic field generated by a group of collinear, cylindrical axisymmetric electromagnet coils. Given the current flowing in, and the number of turns, axial position, and axial and radial dimensions of each coil, the program calculates matrix coefficients for a finite-difference system of equations that approximates a two-dimensional partial differential equation for the magnetic potential contributed by the coil. The program iteratively solves these finite-difference equations by use of the modified incomplete Cholesky preconditioned-conjugate-gradient method. The total magnetic potential as a function of axial (z) and radial (r) position is then calculated as a sum of the magnetic potentials of the individual coils, using a high-accuracy interpolation scheme. Then the r and z components of the magnetic field as functions of r and z are calculated from the total magnetic potential by use of a high-accuracy finite-difference scheme. Notably, for the finite-difference calculations, the program generates nonuniform two-dimensional computational meshes from nonuniform one-dimensional meshes. Each mesh is generated in such a way as to minimize the numerical error for a benchmark one-dimensional magnetostatic problem.
Solving block linear systems with low-rank off-diagonal blocks is easily parallelizable
Menkov, V.
1996-12-31
An easily and efficiently parallelizable direct method is given for solving a block linear system Bx = y, where B = D + Q is the sum of a non-singular block diagonal matrix D and a matrix Q with low-rank blocks. This implicitly defines a new preconditioning method with an operation count close to the cost of calculating a matrix-vector product Qw for some w, plus at most twice the cost of calculating Qw for some w. When implemented on a parallel machine the processor utilization can be as good as that of those operations. Order estimates are given for the general case, and an implementation is compared to block SSOR preconditioning.
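To make the structure B = D + Q concrete, here is a minimal sketch of a much simpler special case than the one treated in the abstract above: D is assumed to be diagonal and Q is assumed to be a single rank-one matrix u v^T, so the system can be solved with the classical Sherman-Morrison formula. This is a standard identity swapped in for illustration, not necessarily the author's method, and the function names are hypothetical.

```python
def solve_diag_plus_rank_one(d, u, v, y):
    """Solve (D + u v^T) x = y with D = diag(d), via Sherman-Morrison.

    x = D^{-1} y - D^{-1} u * (v^T D^{-1} y) / (1 + v^T D^{-1} u)
    """
    dinv_y = [yi / di for yi, di in zip(y, d)]  # D^{-1} y, elementwise
    dinv_u = [ui / di for ui, di in zip(u, d)]  # D^{-1} u, elementwise
    vt_dinv_y = sum(vi * zi for vi, zi in zip(v, dinv_y))
    denom = 1.0 + sum(vi * wi for vi, wi in zip(v, dinv_u))
    if denom == 0.0:
        raise ZeroDivisionError("D + u v^T is singular")
    scale = vt_dinv_y / denom
    return [zi - scale * wi for zi, wi in zip(dinv_y, dinv_u)]


def apply_matrix(d, u, v, x):
    """Compute (D + u v^T) x without forming the dense matrix."""
    vt_x = sum(vi * xi for vi, xi in zip(v, x))
    return [di * xi + ui * vt_x for di, ui, xi in zip(d, u, x)]
```

The two elementwise divisions are independent across entries, and the only global communication is the pair of dot products, which is why solves of this kind parallelize well.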
Matrix decomposition graphics processing unit solver for Poisson image editing
NASA Astrophysics Data System (ADS)
Lei, Zhao; Wei, Li
2012-10-01
In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computationally and memory intensive task, which makes it unsuitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to address this problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS takes full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (it combines direct and iterative techniques) and has a two-level architecture. These features enable MDGS to generate solutions identical to those of the common Poisson methods and to achieve a high convergence rate in most cases. The approach is advantageous in terms of parallelizability, real-time image processing, low memory use and breadth of application.
A Fast Poisson Solver with Periodic Boundary Conditions for GPU Clusters in Various Configurations
NASA Astrophysics Data System (ADS)
Rattermann, Dale Nicholas
Fast Poisson solvers using the Fast Fourier Transform on uniform grids are especially suited for parallel implementation, making them appropriate for portability on graphical processing unit (GPU) devices. The goal of the following work was to implement, test, and evaluate a fast Poisson solver for periodic boundary conditions for use on a variety of GPU configurations. The solver used in this research was FLASH, an immersed-boundary-based method, which is well suited for complex, time-dependent geometries, has robust adaptive mesh refinement/de-refinement capabilities to capture evolving flow structures, and has been successfully implemented on conventional, parallel supercomputers. However, these solvers are still computationally costly to employ, and the total solver time is dominated by the solution of the pressure Poisson equation using state-of-the-art multigrid methods. FLASH improves the performance of its multigrid solvers by integrating a parallel FFT solver on a uniform grid during a coarse level. This hybrid solver could then be theoretically improved by replacing the highly-parallelizable FFT solver with one that utilizes GPUs, and, thus, was the motivation for my research. In the present work, the CPU-utilizing parallel FFT solver (PFFT) used in the base version of FLASH for solving the Poisson equation on uniform grids has been modified to enable parallel execution on CUDA-enabled GPU devices. New algorithms have been implemented to replace the Poisson solver that decompose the computational domain and send each new block to a GPU for parallel computation. One-dimensional (1-D) decomposition of the computational domain minimizes the amount of network traffic involved in this bandwidth-intensive computation by limiting the amount of all-to-all communication required between processes. Advanced techniques have been incorporated and implemented in a GPU-centric code design, while allowing end users the flexibility of parameter control at runtime in
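The FFT-based approach to the periodic Poisson problem described above can be sketched in one dimension. This is a minimal CPU-only illustration under stated assumptions, not the FLASH/PFFT or CUDA code: it solves u'' = f on a periodic interval, requires the number of grid points to be a power of two, fixes the mean of u to zero, and uses a small hand-written radix-2 FFT. All function names are hypothetical.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def solve_poisson_periodic(f, length):
    """Solve u'' = f with periodic boundary conditions and zero-mean u."""
    n = len(f)
    f_hat = fft([complex(v) for v in f])
    u_hat = [0j] * n  # the k = 0 mode stays zero: this fixes the mean of u
    for k in range(1, n):
        m = k if k <= n // 2 else k - n      # signed mode number
        kappa = 2.0 * math.pi * m / length   # physical wavenumber
        u_hat[k] = -f_hat[k] / (kappa * kappa)
    u = fft(u_hat, inverse=True)
    return [v.real / n for v in u]           # apply the 1/n normalization
```

Each Fourier mode is divided by its own eigenvalue independently of all others, so the only step needing communication in a distributed version is the transform itself.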
Parallel Multigrid Equation Solver
Adams, Mark
2001-09-07
Prometheus is a fully parallel multigrid equation solver for matrices that arise in unstructured grid finite element applications. It includes a geometric and an algebraic multigrid method and has solved problems of up to 76 million degrees of freedom, problems in linear elasticity on the ASCI Blue Pacific and ASCI Red machines.
Murasaki: a fast, parallelizable algorithm to find anchors from multiple genomes.
Popendorf, Kris; Tsuyoshi, Hachiya; Osana, Yasunori; Sakakibara, Yasubumi
2010-09-24
With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly become a limiting factor as the number and scale of genomes grow. Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in a few minutes using a single CPU. Two advanced features of Murasaki are (1) adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds) and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2) parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow) in 21 hours CPU time (42 minutes wall time). This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy. Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with computational efficiency significantly greater than existing methods. Murasaki is available
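The notion of a spaced seed used in the abstract above can be shown with a toy example. The sketch below is not Murasaki's algorithm (it has no adaptive hash function generation and no parallel execution); it only assumes that a pattern such as "1101" marks which positions of a window must match, indexes every window of each sequence by its masked key, and reports keys present in all sequences. The function names are hypothetical.

```python
def spaced_keys(seq, pattern):
    """Index every window of seq by the characters at the '1' positions."""
    care = [i for i, ch in enumerate(pattern) if ch == "1"]
    width = len(pattern)
    table = {}
    for start in range(len(seq) - width + 1):
        key = "".join(seq[start + i] for i in care)
        table.setdefault(key, []).append(start)
    return table


def find_anchors(seqs, pattern):
    """Return {key: [positions in seq 0, positions in seq 1, ...]}
    for every masked key that occurs in all sequences."""
    tables = [spaced_keys(s, pattern) for s in seqs]
    shared = set(tables[0])
    for t in tables[1:]:
        shared &= set(t)
    return {key: [t[key] for t in tables] for key in shared}
```

For instance, with pattern "1101" the windows "ACGT" and "ACCT" share the key "ACT", so a single mismatch at the third position does not prevent the match. Building the table for each sequence is independent work, which is one place where parallel execution could be applied.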
A non-conforming 3D spherical harmonic transport solver
Van Criekingen, S.
2006-07-01
A new 3D transport solver for the time-independent Boltzmann transport equation has been developed. This solver is based on the second-order even-parity form of the transport equation. The angular discretization is performed through the expansion of the angular neutron flux in spherical harmonics (PN method). The novelty of this solver is the use of non-conforming finite elements for the spatial discretization. Such elements lead to a discontinuous flux approximation. This interface continuity requirement relaxation property is shared with mixed-dual formulations such as the ones based on Raviart-Thomas finite elements. Encouraging numerical results are presented. (authors)
Scalable solvers and applications
Ribbens, C J
2000-10-27
The purpose of this report is to summarize research activities carried out under Lawrence Livermore National Laboratory (LLNL) research subcontract B501073. This contract supported the principal investigator (PI), Dr. Calvin Ribbens, during his sabbatical visit to LLNL from August 1999 through June 2000. Results and conclusions from the work are summarized below in two major sections. The first section covers contributions to the Scalable Linear Solvers and hypre projects in the Center for Applied Scientific Computing (CASC). The second section describes results from collaboration with Patrice Turchi of LLNL's Chemistry and Materials Science Directorate (CMS). A list of publications supported by this subcontract appears at the end of the report.
Euler solvers for transonic applications
NASA Technical Reports Server (NTRS)
Vanleer, Bram
1989-01-01
The 1980s may well be called the Euler era of applied aerodynamics. Computer codes based on discrete approximations of the Euler equations are now routinely used to obtain solutions of transonic flow problems in which the effects of entropy and vorticity production are significant. Such codes can even predict separation from a sharp edge, owing to the inclusion of artificial dissipation, intended to lend numerical stability to the calculation but at the same time enforcing the Kutta condition. One effect not correctly predictable by Euler codes is the separation from a smooth surface, and neither is viscous drag; for these some form of the Navier-Stokes equations is needed. It therefore comes as no surprise to observe that the Navier-Stokes era had already begun before Euler solutions were fully exploited. Moreover, most numerical developments for the Euler equations are now constrained by the requirement that the techniques introduced, notably artificial dissipation, must not interfere with the new physics added when going from an Euler to a full Navier-Stokes approximation. In order to appreciate the contributions of Euler solvers to the understanding of transonic aerodynamics, it is useful to review the components of these computational tools. The essential constituents, namely space discretization, time or pseudo-time marching, and boundary procedures, are discussed. The subject of grid generation and grid adaptation to the solution is touched upon only where relevant. A list of unanswered questions and an outlook for the future are covered.
Implicit Riemann solvers for the Pn equations.
Mehlhorn, Thomas Alan; McClarren, Ryan; Brunner, Thomas A.; Holloway, James Paul
2005-03-01
The spherical harmonics (P_n) approximation to the transport equation for time dependent problems has previously been treated using Riemann solvers and explicit time integration. Here we present an implicit time integration method for the P_n equations using Riemann solvers. Both first-order and high-resolution spatial discretization schemes are detailed. One facet of the high-resolution scheme is that a system of nonlinear equations must be solved at each time step. This nonlinearity is the result of slope reconstruction techniques necessary to avoid the introduction of artificial extrema in the numerical solution. Results are presented that show auspicious agreement with analytical solutions using time steps well beyond the CFL limit.
NASA Astrophysics Data System (ADS)
Holmström, M.; Nilsson, H.
2012-09-01
We present a hybrid plasma solver (particle ions, fluid mass-less electrons). The software is built on the publicly available FLASH software, developed at the University of Chicago [1], which provides adaptive grids and is fully parallelized. FLASH is a general parallel solver for compressible flow problems. It is written in Fortran 90, well structured into modules, has good support, and is open source. The parallelization is done using a block-structured adaptive Cartesian grid with the Message-Passing Interface (MPI) library as the underlying communication layer. The hybrid solver in FLASH uses cell-centered finite differences [2] and conserves energy well [3]. Recently we have added to the hybrid solver the capability of handling vacuum regions, non-uniform resistivity, external fields, and hyperresistivity. We also present an application of the solver to the interaction between the Moon and the solar wind [4], as illustrated in Fig. 1.
Parallel tridiagonal equation solvers
NASA Technical Reports Server (NTRS)
Stone, H. S.
1974-01-01
Three parallel algorithms were compared for the direct solution of tridiagonal linear systems of equations. The algorithms are suitable for computers such as ILLIAC 4 and CDC STAR. For array computers similar to ILLIAC 4, cyclic odd-even reduction has the least operation count for highly structured sets of equations, and recursive doubling has the least count for relatively unstructured sets of equations. Since the difference in operation counts for these two algorithms is not substantial, their relative running times may be more related to overhead operations, which are not measured in this paper. The third algorithm, based on Buneman's Poisson solver, has more arithmetic operations than the others, and appears to be the least favorable. For pipeline computers similar to CDC STAR, cyclic odd-even reduction appears to be the most preferable algorithm for all cases.
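Of the three algorithms compared above, cyclic odd-even reduction is the easiest to sketch. The following is a minimal sequential illustration, not the paper's implementation: it assumes nonzero pivots (for example a diagonally dominant matrix), eliminates the even-indexed unknowns from the odd-indexed rows, recurses on the half-sized system, and back-substitutes. The function name and argument layout are hypothetical.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by cyclic odd-even reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[-1] == 0. Pivots are assumed nonzero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: use rows i-1 and i+1 to remove x[i-1] and x[i+1] from
    # every odd row i. The iterations of this loop are independent.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1]
                  - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1]
                  - (gamma * d[i + 1] if has_next else 0.0))

    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)  # odd-indexed unknowns

    # Back-substitution for the even-indexed unknowns, again independent.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Both loops touch disjoint rows at each level, so on an array or vector machine each level could be executed as one parallel step, giving about log2(n) levels in total.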
Parallel, Implicit, Finite Element Solver
NASA Astrophysics Data System (ADS)
Lowrie, Weston; Shumlak, Uri; Meier, Eric; Marklin, George
2007-11-01
A parallel, implicit, finite element solver is described for solutions to the ideal MHD equations and the Pseudo-1D Euler equations. The solver uses the conservative flux source form of the equations. This helps simplify the discretization of the finite element method by keeping the specification of the physics separate. An implicit time advance is used to allow sufficiently large time steps. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is used for parallel matrix solvers and parallel data structures. Results for several test cases are described, as well as the accuracy of the method.
A multigrid solver for the semiconductor equations
NASA Technical Reports Server (NTRS)
Bachmann, Bernhard
1993-01-01
We present a multigrid solver for the exponential fitting method. The solver is applied to the current continuity equations of semiconductor device simulation in two dimensions. The exponential fitting method is based on a mixed finite element discretization using the lowest-order Raviart-Thomas triangular element. This discretization method yields a good approximation of front layers and guarantees current conservation. The corresponding stiffness matrix is an M-matrix. 'Standard' multigrid solvers, however, cannot be applied to the resulting system, as this is dominated by an unsymmetric part, which is due to the presence of strong convection in part of the domain. To overcome this difficulty, we explore the connection between Raviart-Thomas mixed methods and the nonconforming Crouzeix-Raviart finite element discretization. In this way we can construct nonstandard prolongation and restriction operators using easily computable weighted L^2-projections based on suitable quadrature rules and the upwind effects of the discretization. The resulting multigrid algorithm shows very good results, even for real-world problems and for locally refined grids.
Sherlock Holmes, Master Problem Solver.
ERIC Educational Resources Information Center
Ballew, Hunter
1994-01-01
Shows the connections between Sherlock Holmes's investigative methods and mathematical problem solving, including observations, characteristics of the problem solver, importance of data, questioning the obvious, learning from experience, learning from errors, and indirect proof. (MKR)
Modiri, A; Gu, X; Sawant, A
2014-06-15
Purpose: We present a particle swarm optimization (PSO)-based 4D IMRT planning technique designed for dynamic MLC tracking delivery to lung tumors. The key idea is to utilize the temporal dimension as an additional degree of freedom rather than a constraint in order to achieve improved sparing of organs at risk (OARs). Methods: The target and normal structures were manually contoured on each of the ten phases of a 4DCT scan acquired from a lung SBRT patient who exhibited 1.5 cm tumor motion despite the use of abdominal compression. Corresponding ten IMRT plans were generated using the Eclipse treatment planning system. These plans served as initial guess solutions for the PSO algorithm. Fluence weights were optimized over the entire solution space, i.e., 10 phases × 12 beams × 166 control points. The size of the solution space motivated our choice of PSO, which is a highly parallelizable stochastic global optimization technique that is well-suited for such large problems. A summed fluence map was created using an in-house B-spline deformable image registration. Each plan was compared with a corresponding, internal target volume (ITV)-based IMRT plan. Results: The PSO 4D IMRT plan yielded comparable PTV coverage and significantly higher dose-sparing for parallel and serial OARs compared to the ITV-based plan. The dose-sparing achieved via PSO-4DIMRT was: lung Dmean = 28%; lung V20 = 90%; spinal cord Dmax = 23%; esophagus Dmax = 31%; heart Dmax = 51%; heart Dmean = 64%. Conclusion: Truly 4D IMRT that uses the temporal dimension as an additional degree of freedom can achieve significant dose sparing of serial and parallel OARs. Given the large solution space, PSO represents an attractive, parallelizable tool to achieve globally optimal solutions for such problems. This work was supported through funding from the National Institutes of Health and Varian Medical Systems. Amit Sawant has research funding from Varian Medical Systems, VisionRT Ltd. and Elekta.
Fast wavelet based sparse approximate inverse preconditioner
Wan, W.L.
1996-12-31
Incomplete LU factorization is a robust preconditioner for both general and PDE problems but is unfortunately not easy to parallelize. Recent studies by Huckle and Grote and by Chow and Saad showed that a sparse approximate inverse could be a potential alternative that is readily parallelizable. However, for the special class of matrices A that come from elliptic PDE problems, their preconditioners are not optimal in the sense of being independent of the mesh size. A reason may be that no good sparse approximate inverse exists for the dense inverse matrix. Our observation is that, for this kind of matrix, the inverse entries typically have piecewise smooth changes. We can take advantage of this fact and use wavelet compression techniques to construct a better sparse approximate inverse preconditioner. We shall show numerically that our approach is effective for this kind of matrix.
An approximate Riemann solver for thermal and chemical nonequilibrium flows
NASA Technical Reports Server (NTRS)
Prabhu, Ramadas K.
1994-01-01
Among the many methods available for the determination of inviscid fluxes across a surface of discontinuity, the flux-difference-splitting technique that employs Roe-averaged variables has been used extensively by the CFD community because of its simplicity and its ability to capture shocks exactly. This method, originally developed for perfect gas flows, has since been extended to equilibrium as well as nonequilibrium flows. Determination of the Roe-averaged variables for the case of a perfect gas flow is a simple task; however, for thermal and chemical nonequilibrium flows, some of the variables are not uniquely defined. Methods available in the literature to determine these variables seem to lack sound bases. The present paper describes a simple, yet accurate, method to determine all the variables for nonequilibrium flows in the Roe-average state. The basis for this method is the requirement that the Roe-averaged variables form a consistent set of thermodynamic variables. The present method satisfies the requirement that the square of the speed of sound be positive.
Scalable Parallel Algebraic Multigrid Solvers
Bank, R; Lu, S; Tong, C; Vassilevski, P
2005-03-23
The authors propose a parallel algebraic multilevel algorithm (AMG), which has the novel feature that the subproblem residing in each processor is defined over the entire partition domain, although the vast majority of unknowns for each subproblem are associated with the partition owned by the corresponding processor. This feature ensures that a global coarse description of the problem is contained within each of the subproblems. The advantages of this approach are that interprocessor communication is minimized in the solution process while an optimal order of convergence rate is preserved; and the speed of local subproblem solvers can be maximized using the best existing sequential algebraic solvers.
Self-correcting Multigrid Solver
Jerome L.V. Lewandowski
2004-06-29
A new multigrid algorithm based on the method of self-correction for the solution of elliptic problems is described. The method exploits information contained in the residual to dynamically modify the source term (right-hand side) of the elliptic problem. It is shown that the self-correcting solver is more efficient at damping the short wavelength modes of the algebraic error than its standard equivalent. When used in conjunction with a multigrid method, the resulting solver displays an improved convergence rate with no additional computational work.
Numerical comparison of Riemann solvers for astrophysical hydrodynamics
NASA Astrophysics Data System (ADS)
Klingenberg, Christian; Schmidt, Wolfram; Waagan, Knut
2007-11-01
The idea of this work is to compare a new positive and entropy stable approximate Riemann solver by Francois Bouchut with a state-of-the-art algorithm for astrophysical fluid dynamics. We implemented the new Riemann solver into an astrophysical PPM-code, the Prometheus code, and also made a version with a different, more theoretically grounded higher order algorithm than PPM. We present shock tube tests, two-dimensional instability tests and forced turbulence simulations in three dimensions. We find subtle differences between the codes in the shock tube tests, and in the statistics of the turbulence simulations. The new Riemann solver increases the computational speed without significant loss of accuracy.
NASA Technical Reports Server (NTRS)
Mineck, Raymond E.; Thomas, James L.; Biedron, Robert T.; Diskin, Boris
2005-01-01
FMG3D (full multigrid 3 dimensions) is a pilot computer program that solves equations of fluid flow using a finite difference representation on a structured grid. Infrastructure exists for three dimensions but the current implementation treats only two dimensions. Written in Fortran 90, FMG3D takes advantage of the recursive subroutine feature, dynamic memory allocation, and structured-programming constructs of that language. FMG3D supports multi-block grids with three types of block-to-block interfaces: periodic, C-zero, and C-infinity. For all three types, grid points must match at interfaces. For periodic and C-infinity types, derivatives of grid metrics must be continuous at interfaces. The available equation sets are as follows: scalar elliptic equations, scalar convection equations, and the pressure-Poisson formulation of the Navier-Stokes equations for an incompressible fluid. All the equation sets are implemented with nonzero forcing functions to enable the use of user-specified solutions to assist in verification and validation. The equations are solved with a full multigrid scheme using a full approximation scheme to converge the solution on each succeeding grid level. Restriction to the next coarser mesh uses direct injection for variables and full weighting for residual quantities; prolongation of the coarse grid correction from the coarse mesh to the fine mesh uses bilinear interpolation; and prolongation of the coarse grid solution uses bicubic interpolation.
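As an illustrative aside, the grid-transfer operators named in this abstract can be sketched in one dimension. The code below is not taken from FMG3D (which is written in Fortran 90 and works on two-dimensional grids); the function names and the 1D setting are assumptions made for brevity. It shows full-weighting restriction and linear interpolation, the 1D analogue of bilinear prolongation.

```python
def restrict_full_weighting(fine):
    """Restrict a fine-grid vector with 2*m + 1 points to m + 1 coarse points.

    Interior coarse values are the (1/4, 1/2, 1/4) weighted average of the
    three nearest fine values; boundary values are copied (an assumption).
    """
    m = (len(fine) - 1) // 2
    coarse = [0.0] * (m + 1)
    coarse[0] = fine[0]
    coarse[m] = fine[-1]
    for j in range(1, m):
        coarse[j] = 0.25 * fine[2 * j - 1] + 0.5 * fine[2 * j] + 0.25 * fine[2 * j + 1]
    return coarse


def prolong_linear(coarse):
    """Interpolate m + 1 coarse points onto 2*m + 1 fine points."""
    m = len(coarse) - 1
    fine = [0.0] * (2 * m + 1)
    for j in range(m + 1):
        fine[2 * j] = coarse[j]  # coincident points are copied
    for j in range(m):
        fine[2 * j + 1] = 0.5 * (coarse[j] + coarse[j + 1])  # midpoints are averaged
    return fine
```

Both operators reproduce linear data exactly, which is a convenient sanity check for any implementation of them.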
Linear iterative solvers for implicit ODE methods
NASA Technical Reports Server (NTRS)
Saylor, Paul E.; Skeel, Robert D.
1990-01-01
The numerical solution of stiff initial value problems, which leads to the problem of solving large systems of mildly nonlinear equations, is considered. For many problems derived from engineering and science, a solution is possible only with methods derived from iterative linear equation solvers. A common approach to solving the nonlinear equations is to employ an approximate solution obtained from an explicit method. The error is examined to determine how it is distributed among the stiff and non-stiff components, which bears on the choice of an iterative method. The conclusion is that error is (roughly) uniformly distributed, a fact that suggests the Chebyshev method (and the accompanying Manteuffel adaptive parameter algorithm). This method is described, also commenting on Richardson's method and its advantages for large problems. Richardson's method and the Chebyshev method with the Manteuffel algorithm are applied to the solution of the nonlinear equations by Newton's method.
SU-E-T-22: A Deterministic Solver of the Boltzmann-Fokker-Planck Equation for Dose Calculation
Hong, X; Gao, H; Paganetti, H
2015-06-15
Purpose: The Boltzmann-Fokker-Planck equation (BFPE) accurately models the migration of photons/charged particles in tissues. While the Monte Carlo (MC) method is popular for solving BFPE in a statistical manner, we aim to develop a deterministic BFPE solver based on various state-of-the-art numerical acceleration techniques for rapid and accurate dose calculation. Methods: Our BFPE solver is based on the structured grid that is maximally parallelizable, with the discretization in energy, angle and space, and its cross section coefficients are derived or directly imported from the Geant4 database. The physical processes that are taken into account are Compton scattering, photoelectric effect, pair production for photons, and elastic scattering, ionization and bremsstrahlung for charged particles. While the spatial discretization is based on the diamond scheme, the angular discretization synergizes finite element method (FEM) and spherical harmonics (SH). Thus, SH is used to globally expand the scattering kernel and FEM is used to locally discretize the angular sphere. As a result, this hybrid method (FEM-SH) is both accurate in dealing with forward-peaking scattering via FEM, and efficient for multi-energy-group computation via SH. In addition, FEM-SH enables the analytical integration in energy variable of delta scattering kernel for elastic scattering with reduced truncation error from the numerical integration based on the classic SH-based multi-energy-group method. Results: The accuracy of the proposed BFPE solver was benchmarked against Geant4 for photon dose calculation. In particular, FEM-SH had improved accuracy compared to FEM, while both were within 2% of the results obtained with Geant4. Conclusion: A deterministic solver of the Boltzmann-Fokker-Planck equation is developed for dose calculation, and benchmarked against Geant4. Xiang Hong and Hao Gao were partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000) and the Shanghai Pujiang
On unstructured grids and solvers
NASA Technical Reports Server (NTRS)
Barth, T. J.
1990-01-01
The fundamentals and the state-of-the-art technology for unstructured grids and solvers are highlighted. Algorithms and techniques pertinent to mesh generation are discussed. It is shown that grid generation and grid manipulation schemes rely on fast multidimensional searching. Flow solution techniques for the Euler equations, which can be derived from the integral form of the equations are discussed. Sample calculations are also provided.
Benchmarking ICRF Full-wave Solvers for ITER
R. V. Budny, L. Berry, R. Bilato, P. Bonoli, M. Brambilla, R. J. Dumont, A. Fukuyama, R. Harvey, E. F. Jaeger, K. Indireshkumar, E. Lerche, D. McCune, C. K. Phillips, V. Vdovin, J. Wright, and members of the ITPA-IOS
2011-01-06
Benchmarking of full-wave solvers for ICRF simulations is performed using plasma profiles and equilibria obtained from integrated self-consistent modeling predictions of four ITER plasmas. One is for a high performance baseline (5.3 T, 15 MA) DT H-mode. The others are for half-field, half-current plasmas of interest for the pre-activation phase with bulk plasma ion species being either hydrogen or He4. The predicted profiles are used by six full-wave solver groups to simulate the ICRF electromagnetic fields and heating, and by three of these groups to simulate the current-drive. Approximate agreement is achieved for the predicted heating power for the DT and He4 cases. Factor of two disagreements are found for the cases with second harmonic He3 heating in bulk H cases. Approximate agreement is achieved simulating the ICRF current drive.
Parallelized solvers for heat conduction formulations
NASA Technical Reports Server (NTRS)
Padovan, Joe; Kwang, Abel
1991-01-01
Based on multilevel partitioning, this paper develops a structural parallelizable solution methodology that enables a significant reduction in computational effort and memory requirements for very large scale linear and nonlinear steady and transient thermal (heat conduction) models. Due to the generality of the formulation of the scheme, both finite element and finite difference simulations can be treated. Diverse model topologies can thus be handled, including both simply and multiply connected (branched/perforated) geometries. To verify the methodology, analytical and numerical benchmark trends are verified in both sequential and parallel computer environments.
Pelanti, Marica; Bouchut, Francois; Mangeney, Anne
2011-02-01
We present a Riemann solver derived by a relaxation technique for classical single-phase shallow flow equations and for a two-phase shallow flow model describing a mixture of solid granular material and fluid. Our primary interest is the numerical approximation of this two-phase solid/fluid model, whose complexity poses numerical difficulties that cannot be efficiently addressed by existing solvers. In particular, we are concerned with ensuring a robust treatment of dry bed states. The relaxation system used by the proposed solver is formulated by introducing auxiliary variables that replace the momenta in the spatial gradients of the original model systems. The resulting relaxation solver is related to Roe solver in that its Riemann solution for the flow height and relaxation variables is formally computed as Roe's Riemann solution. The relaxation solver has the advantage of a certain degree of freedom in the specification of the wave structure through the choice of the relaxation parameters. This flexibility can be exploited to handle robustly vacuum states, which is a well known difficulty of standard Roe's method, while maintaining Roe's low diffusivity. For the single-phase model positivity of flow height is rigorously preserved. For the two-phase model positivity of volume fractions in general is not ensured, and a suitable restriction on the CFL number might be needed. Nonetheless, numerical experiments suggest that the proposed two-phase flow solver efficiently models wet/dry fronts and vacuum formation for a large range of flow conditions. As a corollary of our study, we show that for single-phase shallow flow equations the relaxation solver is formally equivalent to the VFRoe solver with conservative variables of Gallouet and Masella [T. Gallouet, J.-M. Masella, Un schema de Godunov approche C.R. Acad. Sci. Paris, Serie I, 323 (1996) 77-84]. The relaxation interpretation allows establishing positivity conditions for this VFRoe method.
A new fast direct solver for the boundary element method
NASA Astrophysics Data System (ADS)
Huang, S.; Liu, Y. J.
2017-04-01
A new fast direct linear equation solver for the boundary element method (BEM) is presented in this paper. The idea of the new fast direct solver stems from the concept of the hierarchical off-diagonal low-rank matrix. The hierarchical off-diagonal low-rank matrix can be decomposed into the multiplication of several diagonal block matrices. The inverse of the hierarchical off-diagonal low-rank matrix can be calculated efficiently with the Sherman-Morrison-Woodbury formula. In this paper, a more general and efficient approach to approximate the coefficient matrix of the BEM with the hierarchical off-diagonal low-rank matrix is proposed. Compared to the current fast direct solver based on the hierarchical off-diagonal low-rank matrix, the proposed method is suitable for solving general 3-D boundary element models. Several numerical examples of 3-D potential problems with the total number of unknowns up to above 200,000 are presented. The results show that the new fast direct solver can be applied to solve large 3-D BEM models accurately and with better efficiency compared with the conventional BEM.
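For reference, the Sherman-Morrison-Woodbury formula mentioned in this abstract can be stated as follows. The symbols are generic and are not taken from the paper: A is an invertible n-by-n matrix, C is an invertible k-by-k matrix, and U and V are n-by-k and k-by-n matrices, with the inner matrix below assumed invertible.

```latex
\[
  (A + U C V)^{-1}
  = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}
\]
```

When k is much smaller than n and A is cheap to invert (for example, block diagonal), the right-hand side needs only a k-by-k inverse, which is why low-rank off-diagonal structure makes a direct solve affordable.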
Assessment of Linear Finite-Difference Poisson-Boltzmann Solvers
Wang, Jun; Luo, Ray
2009-01-01
CPU time and memory usage are two vital issues that any numerical solvers for the Poisson-Boltzmann equation have to face in biomolecular applications. In this study we systematically analyzed the CPU time and memory usage of five commonly used finite-difference solvers with a large and diversified set of biomolecular structures. Our comparative analysis shows that modified incomplete Cholesky conjugate gradient and geometric multigrid are the most efficient in the diversified test set. For the two efficient solvers, our test shows that their CPU times increase approximately linearly with the numbers of grids. Their CPU times also increase almost linearly with the negative logarithm of the convergence criterion at very similar rate. Our comparison further shows that geometric multigrid performs better in the large set of tested biomolecules. However, modified incomplete Cholesky conjugate gradient is superior to geometric multigrid in molecular dynamics simulations of tested molecules. We also investigated other significant components in numerical solutions of the Poisson-Boltzmann equation. It turns out that the time-limiting step is the free boundary condition setup for the linear systems for the selected proteins if the electrostatic focusing is not used. Thus, development of future numerical solvers for the Poisson-Boltzmann equation should balance all aspects of the numerical procedures in realistic biomolecular applications. PMID:20063271
Le, Mai; Fessler, Jeffrey A
2017-03-01
Undersampling is an effective method for reducing scan acquisition time for MRI. Strategies for accelerated MRI such as parallel MRI and Compressed Sensing MRI present challenging image reconstruction problems with non-differentiable cost functions and computationally demanding operations. Variable splitting (VS) can simplify implementation of difficult image reconstruction problems, such as the combination of parallel MRI and Compressed Sensing, CS-SENSE-MRI. Combined with augmented Lagrangian (AL) and alternating minimization strategies, variable splitting can yield iterative minimization algorithms with simpler auxiliary variable updates. However, arbitrary variable splitting schemes are not guaranteed to converge. Many variable splitting strategies are combined with periodic boundary conditions. The resultant circulant Hessians enable O(n log n) computation but may compromise image accuracy at the spatial boundaries. We propose two methods for CS-SENSE-MRI that use regularization with non-periodic boundary conditions to prevent wrap-around artifacts. Each algorithm computes one of the resulting variable updates efficiently in O(n) time using a parallelizable tridiagonal solver. AL-tridiag is a VS method designed to enable efficient computation for non-periodic boundary conditions. Another proposed algorithm, ADMM-tridiag, uses a similar VS scheme but also ensures convergence to a minimizer of the proposed cost function using the Alternating Direction Method of Multipliers (ADMM). AL-tridiag and ADMM-tridiag show speeds competitive with the previous VS CS-SENSE-MRI reconstruction algorithm AL-P2. We also apply the tridiagonal VS approach to a simple image inpainting problem.
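To illustrate why a tridiagonal system can be solved in linear time, here is a minimal sketch of the standard serial Thomas algorithm. It is not the authors' solver: the function name and argument layout are assumptions, no pivoting is performed, and the abstract's parallelism would typically come from solving many such independent systems at once.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm in O(n) time.

    All four arguments have length n. lower[i] multiplies x[i-1] in row i
    (lower[0] is unused) and upper[i] multiplies x[i+1] (upper[n-1] is unused).
    The matrix is assumed safe to eliminate without pivoting, for example
    diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: one pass down the rows.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: one pass up the rows.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Each row is visited once in each pass, so the cost grows linearly with the number of unknowns.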
Finite Element Interface to Linear Solvers
Williams, Alan
2005-03-18
Sparse systems of linear equations arise in many engineering applications, including finite elements, finite volumes, and others. The solution of linear systems is often the most computationally intensive portion of the application. Depending on the complexity of problems addressed by the application, there may be no single solver capable of solving all of the linear systems that arise. This motivates the desire to switch an application from one solver library to another, depending on the problem being solved. The interfaces provided by solver libraries differ greatly, making it difficult to switch an application code from one library to another. The amount of library-specific code in an application can be greatly reduced by having an abstraction layer between solver libraries and the application, putting a common "face" on various solver libraries. One such abstraction layer is the Finite Element Interface to Linear Solvers (FEI), which has seen significant use by finite element applications at Sandia National Laboratories and Lawrence Livermore National Laboratory.
NASA Astrophysics Data System (ADS)
Gong, Weiwei; Zhou, Xu
2017-06-01
In computer science, the Boolean Satisfiability Problem (SAT) is the problem of determining whether there exists an interpretation that satisfies a given Boolean formula. SAT was one of the first problems proven to be NP-complete, and it is fundamental to artificial intelligence, algorithm and hardware design. This paper reviews the main algorithms of the SAT solver in recent years, including serial SAT algorithms, parallel SAT algorithms, SAT algorithms based on GPU, and SAT algorithms based on FPGA. The development of SAT is analyzed comprehensively in this paper. Finally, several possible directions for the development of the SAT problem are proposed.
Analysis Tools for CFD Multigrid Solvers
NASA Technical Reports Server (NTRS)
Mineck, Raymond E.; Thomas, James L.; Diskin, Boris
2004-01-01
Analysis tools are needed to guide the development and evaluate the performance of multigrid solvers for the fluid flow equations. Classical analysis tools, such as local mode analysis, often fail to accurately predict performance. Two-grid analysis tools, herein referred to as Idealized Coarse Grid and Idealized Relaxation iterations, have been developed and evaluated within a pilot multigrid solver. These new tools are applicable to general systems of equations and/or discretizations and point to problem areas within an existing multigrid solver. Idealized Relaxation and Idealized Coarse Grid are applied in developing textbook-efficient multigrid solvers for incompressible stagnation flow problems.
A Wavelet Technique For Multi-grid Solver For Large Linear Systems
NASA Astrophysics Data System (ADS)
Keller, W.
In general, large systems of linear equations cannot be solved directly. An iterative solver has to be applied instead. Unfortunately, iterative solvers have a notoriously slow convergence rate, which in the worst case can prevent convergence altogether, owing to the unavoidable rounding errors. Multi-grid iteration schemes are meant to guarantee a sufficiently high convergence rate, independent of the dimension of the linear system. The idea behind multi-grid solvers is that the traditional iterative solvers eliminate only the short-wavelength error constituents in the initial guess for the solution. For the elimination of the remaining long-wavelength error constituents a much coarser grid is sufficient. On the coarse grid the dimension of the problem is much smaller, so that the elimination can be done by a direct solver. The paper shows that wavelet techniques can be successfully applied for the following steps of a multi-grid procedure: generation of an approximation of the problem on a coarse grid from a given approximation on the fine grid; restriction of a signal on a fine grid to its approximation on a coarse grid; and uplift of a signal from the coarse to the fine grid. The paper starts with a theoretical explanation of the links between wavelets and multi-grid solvers. Based on this investigation, the class of operators that are suitable for a multi-grid solution strategy can be characterized. The numerical efficiency of the approach will be tested for the Planar Stokes problem.
Newton Solver Stabilization for Stokes Solvers in Geodynamic Problems
NASA Astrophysics Data System (ADS)
Fraters, Menno; Bangerth, Wolfgang; Thieulot, Cedric; Spakman, Wim
2017-04-01
The most commonly used method by the geodynamical community for solving non-linear equations is the Picard fixed-point iteration. However, the Newton method has recently gained interest within this community because it formally leads to quadratic convergence close to the solution as compared to the global linear convergence of the Picard iteration. In mantle dynamics, a blend of pressure and strain-rate dependent visco-plastic rheologies is often used. While for power-law rheologies the Jacobian is guaranteed to be Symmetric Positive Definite (SPD), for more complex (compressible) rheologies, the Jacobian may become non-SPD. Here we present a new method for efficiently enforcing that the Jacobian is SPD, as required by our current highly efficient Stokes solvers, with a minimum loss in convergence rate. Furthermore, we show results for both incompressible and compressible models.
The impact of improved sparse linear solvers on industrial engineering applications
Heroux, M.; Baddourah, M.; Poole, E.L.; Yang, Chao Wu
1996-12-31
There are usually many factors that ultimately determine the quality of computer simulation for engineering applications. Some of the most important are the quality of the analytical model and approximation scheme, the accuracy of the input data and the capability of the computing resources. However, in many engineering applications the characteristics of the sparse linear solver are the key factors in determining how complex a problem a given application code can solve. Therefore, the advent of a dramatically improved solver often brings with it dramatic improvements in our ability to do accurate and cost effective computer simulations. In this presentation we discuss the current status of sparse iterative and direct solvers in several key industrial CFD and structures codes, and show the impact that recent advances in linear solvers have made on both our ability to perform challenging simulations and the cost of those simulations. We also present some of the current challenges we have and the constraints we face in trying to improve these solvers. Finally, we discuss future requirements for sparse linear solvers on high performance architectures and try to indicate the opportunities that exist if we can develop even more improvements in linear solver capabilities.
NITSOL: A Newton iterative solver for nonlinear systems
Pernice, M.; Walker, H.F.
1996-12-31
Newton iterative methods, also known as truncated Newton methods, are implementations of Newton's method in which the linear systems that characterize Newton steps are solved approximately using iterative linear algebra methods. Here, we outline a well-developed Newton iterative algorithm together with a Fortran implementation called NITSOL. The basic algorithm is an inexact Newton method globalized by backtracking, in which each initial trial step is determined by applying an iterative linear solver until an inexact Newton criterion is satisfied. In the implementation, the user can specify inexact Newton criteria in several ways and select an iterative linear solver from among several popular "transpose-free" Krylov subspace methods. Jacobian-vector products used by the Krylov solver can be either evaluated analytically with a user-supplied routine or approximated using finite differences of function values. A flexible interface permits a wide variety of preconditioning strategies and allows the user to define a preconditioner and optionally update it periodically. We give details of these and other features and demonstrate the performance of the implementation on a representative set of test problems.
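The finite-difference option mentioned in this abstract can be illustrated with a short sketch. This is not NITSOL's Fortran interface: the function name, the plain-list vectors, and the fixed step size eps are all assumptions for illustration, and a production code would normally scale the step to the sizes of x and v.

```python
def jacobian_vector_product(F, x, v, eps=1e-7):
    """Approximate J(x) v by a forward difference of function values.

    F maps a list of floats to a list of floats. The Jacobian is never
    formed; only two evaluations of F are needed. eps is an assumed,
    unscaled step size.
    """
    Fx = F(x)
    x_perturbed = [xi + eps * vi for xi, vi in zip(x, v)]
    F_perturbed = F(x_perturbed)
    return [(fp - f0) / eps for fp, f0 in zip(F_perturbed, Fx)]


def example_system(x):
    """Hypothetical test function with Jacobian [[2*x0, 1], [x1, x0]]."""
    return [x[0] ** 2 + x[1], x[0] * x[1]]
```

Because a Krylov method only needs the action of the Jacobian on a vector, this kind of approximation lets the nonlinear solver run without a user-supplied Jacobian routine.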
MACSYMA's symbolic ordinary differential equation solver
NASA Technical Reports Server (NTRS)
Golden, J. P.
1977-01-01
MACSYMA's symbolic ordinary differential equation solver, ODE2, is described. The code for this routine is delineated, which is of interest because it is written in top-level MACSYMA language, and may serve as a good example of programming in that language. Other symbolic ordinary differential equation solvers are mentioned.
Elliptic Solvers with Adaptive Mesh Refinement on Complex Geometries
Phillip, B.
2000-07-24
Adaptive Mesh Refinement (AMR) is a numerical technique for locally tailoring the resolution of computational grids. Multilevel algorithms for solving elliptic problems on adaptive grids include the Fast Adaptive Composite grid method (FAC) and its parallel variants (AFAC and AFACx). Theory that confirms the independence of the convergence rates of FAC and AFAC on the number of refinement levels exists under certain ellipticity and approximation property conditions. Similar theory needs to be developed for AFACx. The effectiveness of multigrid-based elliptic solvers such as FAC, AFAC, and AFACx on adaptively refined overlapping grids is not clearly understood. Finally, a non-trivial eye model problem will be solved by combining the power of using overlapping grids for complex moving geometries, AMR, and multilevel elliptic solvers.
A spectral Poisson solver for kinetic plasma simulation
NASA Astrophysics Data System (ADS)
Szeremley, Daniel; Obberath, Jens; Brinkmann, Ralf
2011-10-01
Plasma resonance spectroscopy is a well established plasma diagnostic method, realized in several designs. One of these designs is the multipole resonance probe (MRP). In its idealized - geometrically simplified - version it consists of two dielectrically shielded, hemispherical electrodes to which an RF signal is applied. A numerical tool is under development which is capable of simulating the dynamics of the plasma surrounding the MRP in electrostatic approximation. In this contribution we concentrate on the specialized Poisson solver for that tool. The plasma is represented by an ensemble of point charges. By expanding both the charge density and the potential into spherical harmonics, a largely analytical solution of the Poisson problem can be employed. For a practical implementation, the expansion must be appropriately truncated. With this spectral solver we are able to efficiently solve the Poisson equation in a kinetic plasma simulation without the need of introducing a spatial discretization.
NASA Technical Reports Server (NTRS)
Diosady, Laslo; Murman, Scott; Blonigan, Patrick; Garai, Anirban
2017-01-01
Presented space-time adjoint solver for turbulent compressible flows. Confirmed failure of traditional sensitivity methods for chaotic flows. Assessed rate of exponential growth of adjoint for practical 3D turbulent simulation. Demonstrated failure of short-window sensitivity approximations.
A robust multilevel simultaneous eigenvalue solver
NASA Technical Reports Server (NTRS)
Costiner, Sorin; Taasan, Shlomo
1993-01-01
Multilevel (ML) algorithms for eigenvalue problems are often faced with several types of difficulties such as: the mixing of approximated eigenvectors by the solution process, the approximation of incomplete clusters of eigenvectors, the poor representation of solution on coarse levels, and the existence of close or equal eigenvalues. Algorithms that do not treat these difficulties appropriately usually fail, or their performance degrades when facing them. These issues motivated the development of a robust adaptive ML algorithm which treats these difficulties, for the calculation of a few eigenvectors and their corresponding eigenvalues. The main techniques used in the new algorithm include: the adaptive completion and separation of the relevant clusters on different levels, the simultaneous treatment of solutions within each cluster, and the robustness tests which monitor the algorithm's efficiency and convergence. The eigenvectors' separation efficiency is based on a new ML projection technique generalizing the Rayleigh Ritz projection, combined with a technique, the backrotations. These separation techniques, when combined with an FMG formulation, in many cases lead to algorithms of O(qN) complexity, for q eigenvectors of size N on the finest level. Previously developed ML algorithms are less focused on the mentioned difficulties. Moreover, algorithms which employ fine level separation techniques are of O(q^2 N) complexity and usually do not overcome all these difficulties. Computational examples are presented where Schrödinger type eigenvalue problems in 2-D and 3-D, having equal and closely clustered eigenvalues, are solved with the efficiency of the Poisson multigrid solver. A second order approximation is obtained in O(qN) work, where the total computational work is equivalent to only a few fine level relaxations per eigenvector.
GARDNER, P.R.
2006-04-01
Sudoku, also known as Number Place, is a logic-based placement puzzle. The aim of the puzzle is to enter a numerical digit from 1 through 9 in each cell of a 9 x 9 grid made up of 3 x 3 subgrids (called "regions"), starting with various digits given in some cells (the "givens"). Each row, column, and region must contain only one instance of each numeral. Completing the puzzle requires patience and logical ability. Although first published in a U.S. puzzle magazine in 1979, Sudoku initially caught on in Japan in 1986 and attained international popularity in 2005. Last fall, after noticing Sudoku puzzles in some newspapers and magazines, I attempted a few just to see how hard they were. Of course, the difficulties varied considerably. "Obviously" one could use Trial and Error but all the advice was to "Use Logic". Thinking to flex, and strengthen, those powers, I began to tackle the puzzles systematically. That is, when I discovered a new tactical rule, I would write it down, eventually generating a list of ten or so, with some having overlap. They served pretty well except for the more difficult puzzles, but even then I managed to develop an additional three rules that covered all of them until I hit the Oregonian puzzle shown. With all of my rules, I could not seem to solve that puzzle. Initially putting my failure down to rapid mental fatigue (being unable to hold a sufficient quantity of information in my mind at one time), I decided to write a program to implement my rules and see what I had failed to notice earlier. The solver, too, failed. That is, my rules were insufficient to solve that particular puzzle. I happened across a book written by a fellow who constructs such puzzles and who claimed that, sometimes, the only tactic left was trial and error. With a trial and error routine implemented, my solver successfully completed the Oregonian puzzle, and has successfully solved every puzzle submitted to it since.
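The trial-and-error tactic mentioned in this abstract is commonly implemented as backtracking search. The following Python sketch is an illustration only and is not the author's program: the function names (`candidates`, `most_constrained_cell`, `solve`) and the choice to try the emptiest-looking cell first are assumptions made here, and the author's tactical rules are not reproduced.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # 9 x 9 grid; 0 marks an empty cell


def candidates(grid: Grid, row: int, col: int) -> List[int]:
    """Digits not yet used in the cell's row, column, or 3 x 3 region."""
    used = set(grid[row]) | {grid[r][col] for r in range(9)}
    r0, c0 = 3 * (row // 3), 3 * (col // 3)
    used |= {grid[r][c] for r in range(r0, r0 + 3) for c in range(c0, c0 + 3)}
    return [d for d in range(1, 10) if d not in used]


def most_constrained_cell(grid: Grid) -> Optional[Tuple[int, int, List[int]]]:
    """Empty cell with the fewest candidates, or None if the grid is full."""
    best = None
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                opts = candidates(grid, r, c)
                if best is None or len(opts) < len(best[2]):
                    best = (r, c, opts)
    return best


def solve(grid: Grid) -> bool:
    """Fill the grid in place; return True if a complete solution was found."""
    cell = most_constrained_cell(grid)
    if cell is None:
        return True  # no empty cells remain
    row, col, opts = cell
    for digit in opts:  # trial
        grid[row][col] = digit
        if solve(grid):
            return True
    grid[row][col] = 0  # error: undo the guess and backtrack
    return False
```

Calling `solve(grid)` on a list of nine rows of nine integers fills the grid in place. Picking the cell with the fewest candidates first is a common heuristic that tends to keep the search tree small; any deterministic cell order would also work.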
ALPS - A LINEAR PROGRAM SOLVER
NASA Technical Reports Server (NTRS)
Viterna, L. A.
1994-01-01
Linear programming is a widely-used engineering and management tool. Scheduling, resource allocation, and production planning are all well-known applications of linear programs (LP's). Most LP's are too large to be solved by hand, so over the decades many computer codes for solving LP's have been developed. ALPS, A Linear Program Solver, is a full-featured LP analysis program. ALPS can solve plain linear programs as well as more complicated mixed integer and pure integer programs. ALPS also contains an efficient solution technique for pure binary (0-1 integer) programs. One of the many weaknesses of LP solvers is the lack of interaction with the user. ALPS is a menu-driven program with no special commands or keywords to learn. In addition, ALPS contains a full-screen editor to enter and maintain the LP formulation. These formulations can be written to and read from plain ASCII files for portability. For those less experienced in LP formulation, ALPS contains a problem "parser" which checks the formulation for errors. ALPS creates fully formatted, readable reports that can be sent to a printer or output file. ALPS is written entirely in IBM's APL2/PC product, Version 1.01. The APL2 workspace containing all the ALPS code can be run on any APL2/PC system (AT or 386). On a 32-bit system, this configuration can take advantage of all extended memory. The user can also examine and modify the ALPS code. The APL2 workspace has also been "packed" to be run on any DOS system (without APL2) as a stand-alone "EXE" file, but has limited memory capacity on a 640K system. A numeric coprocessor (80X87) is optional but recommended. The standard distribution medium for ALPS is a 5.25 inch 360K MS-DOS format diskette. IBM, IBM PC and IBM APL2 are registered trademarks of International Business Machines Corporation. MS-DOS is a registered trademark of Microsoft Corporation.
SIERRA framework version 4 : solver services.
Williams, Alan B.
2005-02-01
Several SIERRA applications make use of third-party libraries to solve systems of linear and nonlinear equations, and to solve eigenproblems. The classes and interfaces in the SIERRA framework that provide linear system assembly services and access to solver libraries are collectively referred to as solver services. This paper provides an overview of SIERRA's solver services including the design goals that drove the development, and relationships and interactions among the various classes. The process of assembling and manipulating linear systems will be described, as well as access to solution methods and other operations.
NASA Technical Reports Server (NTRS)
Ferencz, Donald C.; Viterna, Larry A.
1991-01-01
ALPS is a computer program which can be used to solve general linear program (optimization) problems. ALPS was designed for those who have minimal linear programming (LP) knowledge and features a menu-driven scheme to guide the user through the process of creating and solving LP formulations. Once created, the problems can be edited and stored in standard DOS ASCII files to provide portability to various word processors or even other linear programming packages. Unlike many math-oriented LP solvers, ALPS contains an LP parser that reads through the LP formulation and reports several types of errors to the user. ALPS provides a large amount of solution data which is often useful in problem solving. In addition to pure linear programs, ALPS can solve for integer, mixed integer, and binary type problems. Pure linear programs are solved with the revised simplex method. Integer or mixed integer programs are solved initially with the revised simplex method, and then completed using the branch-and-bound technique. Binary programs are solved with the method of implicit enumeration. This manual describes how to use ALPS to create, edit, and solve linear programming problems. Instructions for installing ALPS on a PC compatible computer are included in the appendices along with a general introduction to linear programming. A programmer's guide is also included for assistance in modifying and maintaining the program.
Parallelizing alternating direction implicit solver on GPUs
USDA-ARS's Scientific Manuscript database
We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...
Improved Stiff ODE Solvers for Combustion CFD
NASA Astrophysics Data System (ADS)
Imren, A.; Haworth, D. C.
2016-11-01
Increasingly large chemical mechanisms are needed to predict autoignition, heat release and pollutant emissions in computational fluid dynamics (CFD) simulations of in-cylinder processes in compression-ignition engines and other applications. Calculation of chemical source terms usually dominates the computational effort, and several strategies have been proposed to reduce the high computational cost associated with realistic chemistry in CFD. Central to most strategies is a stiff ordinary differential equation (ODE) solver to compute the change in composition due to chemical reactions over a computational time step. Most work to date on stiff ODE solvers for computational combustion has focused on backward differentiation formula (BDF) methods, and has not explicitly considered the implications of how the stiff ODE solver couples with the CFD algorithm. In this work, a fresh look at stiff ODE solvers is taken that includes how the solver is integrated into a turbulent combustion CFD code, and the advantages of extrapolation-based solvers in this regard are demonstrated. Benefits in CPU time and accuracy are demonstrated for homogeneous systems and compression-ignition engines, for chemical mechanisms that range in size from fewer than 50 to more than 7,000 species.
A parallel PCG solver for MODFLOW.
Dong, Yanhui; Li, Guomin
2009-01-01
In order to simulate large-scale ground water flow problems more efficiently with MODFLOW, the OpenMP programming paradigm was used in this study to parallelize the preconditioned conjugate-gradient (PCG) solver. Incremental parallelization, a significant advantage of OpenMP on shared-memory computers, allowed the solver to be converted to a parallel program smoothly, one block of code at a time. The parallel PCG solver, suitable for both MODFLOW-2000 and MODFLOW-2005, is verified using an 8-processor computer. The impact of both compilers and model domain sizes was considered in the numerical experiments. Based on the timing results, execution with the parallel PCG solver is typically about 1.40 to 5.31 times faster than with the serial one. In addition, the simulation results are exactly the same as those of the original PCG solver, because the majority of the serial code was not changed. It is worth noting that this parallelizing approach reduces cost in terms of software maintenance because only a single source PCG solver code needs to be maintained in the MODFLOW source tree.
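For readers unfamiliar with the method, the preconditioned conjugate-gradient iteration can be sketched in a few lines of serial Python. This is not the MODFLOW code: the Jacobi (diagonal) preconditioner, the dense list-of-lists matrix, and the names `pcg`, `dot`, and `matvec` are assumptions chosen here for brevity. The dot products and the matrix-vector product are the loops that a shared-memory parallelization would typically target.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(x: Vector, y: Vector) -> float:
    return sum(a * b for a, b in zip(x, y))


def matvec(A: Matrix, x: Vector) -> Vector:
    return [dot(row, x) for row in A]


def pcg(A: Matrix, b: Vector, tol: float = 1e-10, max_iter: int = 1000) -> Vector:
    """Solve A x = b for symmetric positive-definite A using
    conjugate gradients with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # M^{-1} for M = diag(A)
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break                                 # relative residual is small enough
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # makes the next direction A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

A production solver would store the matrix in a sparse format and use a stronger preconditioner; the structure of the loop stays the same.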
Approximating the Generalized Voronoi Diagram of Closely Spaced Objects
Edwards, John; Daniel, Eric; Pascucci, Valerio; Bajaj, Chandrajit
2015-06-22
We present an algorithm to compute an approximation of the generalized Voronoi diagram (GVD) on arbitrary collections of 2D or 3D geometric objects. In particular, we focus on datasets with closely spaced objects; GVD approximation is expensive and sometimes intractable on these datasets using previous algorithms. With our approach, the GVD can be computed using commodity hardware even on datasets with many, extremely tightly packed objects. Our approach is to subdivide the space with an octree that is represented with an adjacency structure. We then use a novel adaptive distance transform to compute the distance function on octree vertices. The computed distance field is sampled more densely in areas of close object spacing, enabling robust and parallelizable GVD surface generation. We demonstrate our method on a variety of data and show example applications of the GVD in 2D and 3D.
Approximating the Generalized Voronoi Diagram of Closely Spaced Objects
Edwards, John; Daniel, Eric; Pascucci, Valerio; Bajaj, Chandrajit
2016-01-01
We present an algorithm to compute an approximation of the generalized Voronoi diagram (GVD) on arbitrary collections of 2D or 3D geometric objects. In particular, we focus on datasets with closely spaced objects; GVD approximation is expensive and sometimes intractable on these datasets using previous algorithms. With our approach, the GVD can be computed using commodity hardware even on datasets with many, extremely tightly packed objects. Our approach is to subdivide the space with an octree that is represented with an adjacency structure. We then use a novel adaptive distance transform to compute the distance function on octree vertices. The computed distance field is sampled more densely in areas of close object spacing, enabling robust and parallelizable GVD surface generation. We demonstrate our method on a variety of data and show example applications of the GVD in 2D and 3D. PMID:27540272
A general second order complete active space self-consistent-field solver for large-scale systems
NASA Astrophysics Data System (ADS)
Sun, Qiming; Yang, Jun; Chan, Garnet Kin-Lic
2017-09-01
We present a new second order complete active space self-consistent field implementation to converge wavefunctions for both large active spaces and large atomic orbital (AO) bases. Our algorithm decouples the active space wavefunction solver from the orbital optimization in the microiterations, and thus may be easily combined with various modern active space solvers. We also introduce efficient approximate orbital gradient and Hessian updates, and step size determination. We demonstrate its capabilities by calculating the low-lying states of the Fe(II)-porphine complex with modest resources using a density matrix renormalization group solver in a CAS(22, 27) active space and a 3000 AO basis.
An advanced implicit solver for MHD
NASA Astrophysics Data System (ADS)
Udrea, Bogdan
A new implicit algorithm has been developed for the solution of the time-dependent, viscous and resistive single fluid magnetohydrodynamic (MHD) equations. The algorithm is based on an approximate Riemann solver for the hyperbolic fluxes and central differencing applied on a staggered grid for the parabolic fluxes. The algorithm employs a locally aligned coordinate system that allows the Riemann problems to be solved in a natural direction, normal to cell interfaces. The result is an original scheme that is robust and reduces the complexity of the flux formulas. The evaluation of the parabolic fluxes is also implemented using a locally aligned coordinate system, this time on the staggered grid. The implicit formulation employed by WARP3 is a two level scheme that was applied for the first time to the single fluid MHD model. The flux Jacobians that appear in the implicit scheme are evaluated numerically. The linear system that results from the implicit discretization is solved using a robust symmetric Gauss-Seidel method. The code has an explicit mode capability so that implementation and testing of new algorithms or new physics can be performed in this simpler mode. Last but not least, the code was designed and written to run on parallel computers so that complex, high resolution runs can be performed in hours rather than days. The code has been benchmarked against analytical and experimental gas dynamics and MHD results. The benchmarks consisted of one-dimensional Riemann problems and diffusion dominated problems, two-dimensional supersonic flow over a wedge, axisymmetric magnetoplasmadynamic (MPD) thruster simulation and three-dimensional supersonic flow over intersecting wedges and spheromak stability simulation. The code has been proven to be robust and the results of the simulations showed excellent agreement with analytical and experimental results. Parallel performance studies showed that the code performs as expected when run on parallel
NASA Astrophysics Data System (ADS)
Balsara, Dinshaw S.; Nkonga, Boniface
2017-10-01
Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The fastest way of endowing such sub-structure consists of making a multidimensional extension of the HLLI Riemann solver for hyperbolic conservation laws. Presenting such a multidimensional analogue of the HLLI Riemann solver with linear sub-structure for use on structured meshes is the goal of this work. The multidimensional MuSIC Riemann solver documented here is universal in the sense that it can be applied to any hyperbolic conservation law. The multidimensional Riemann solver is made to be consistent with constraints that emerge naturally from the Galerkin projection of the self-similar states within the wave model. When the full eigenstructure in both directions is used in the present Riemann solver, it becomes a complete Riemann solver in a multidimensional sense. I.e., all the intermediate waves are represented in the multidimensional wave model. The work also presents, for the very first time, an important analysis of the dissipation characteristics of multidimensional Riemann solvers. The present Riemann solver results in the most efficient implementation of a multidimensional Riemann solver with sub-structure. Because it preserves stationary linearly degenerate waves, it might also help with well-balancing. Implementation-related details are presented in pointwise fashion for the one-dimensional HLLI Riemann solver as well as the multidimensional MuSIC Riemann solver.
Inductive ionospheric solver for magnetospheric MHD simulations
NASA Astrophysics Data System (ADS)
Vanhamäki, H.
2011-01-01
We present a new scheme for solving the ionospheric boundary conditions required in magnetospheric MHD simulations. In contrast to the electrostatic ionospheric solvers currently in use, the new solver takes ionospheric induction into account by solving Faraday's law simultaneously with Ohm's law and current continuity. From the viewpoint of an MHD simulation, the new inductive solver is similar to the electrostatic solvers, as the same input data is used (field-aligned current [FAC] and ionospheric conductances) and similar output is produced (ionospheric electric field). The inductive solver is tested using realistic, data-based models of an omega-band and westward traveling surge. Although the tests were performed with local models and MHD simulations require a global ionospheric solution, we may nevertheless conclude that the new solution scheme is feasible also in practice. In the test cases the difference between static and electrodynamic solutions is up to ~10 V km^-1 in certain locations, or up to 20-40% of the total electric field. This is in agreement with previous estimates. It should also be noted that if FAC is replaced by the ground magnetic field (or ionospheric equivalent current) in the input data set, exactly the same formalism can be used to construct an inductive version of the KRM method originally developed by Kamide et al. (1981).
Using SPARK as a Solver for Modelica
Wetter, Michael; Haves, Philip; Moshier, Michael A.; Sowell, Edward F.
2008-06-30
Modelica is an object-oriented acausal modeling language that is well positioned to become a de-facto standard for expressing models of complex physical systems. To simulate a model expressed in Modelica, it needs to be translated into executable code. For generating run-time efficient code, such a translation needs to employ algebraic formula manipulations. As the SPARK solver has been shown to be competitive for generating such code but currently cannot be used with the Modelica language, we report in this paper how SPARK's symbolic and numerical algorithms can be implemented in OpenModelica, an open-source implementation of a Modelica modeling and simulation environment. We also report benchmark results that show that for our air flow network simulation benchmark, the SPARK solver is competitive with Dymola, which is believed to provide the best solver for Modelica.
New iterative solvers for the NAG Libraries
Salvini, S.; Shaw, G.
1996-12-31
The purpose of this paper is to introduce the work which has been carried out at NAG Ltd to update the iterative solvers for sparse systems of linear equations, both symmetric and unsymmetric, in the NAG Fortran 77 Library. Our current plans to extend this work and include it in our other numerical libraries in our range are also briefly mentioned. We have added to the Library the new Chapter F11, entirely dedicated to sparse linear algebra. At Mark 17, the F11 Chapter includes sparse iterative solvers, preconditioners, utilities and black-box routines for sparse symmetric (both positive-definite and indefinite) linear systems. Mark 18 will add solvers, preconditioners, utilities and black-boxes for sparse unsymmetric systems: the development of these has already been completed.
NASA Technical Reports Server (NTRS)
Martin, E. D.; Lomax, H.
1977-01-01
Revised and extended versions of a fast, direct (noniterative) numerical Cauchy-Riemann solver are presented for solving finite difference approximations of first order systems of partial differential equations. Although the difference operators treated are linear and elliptic, one significant application of these extended direct Cauchy-Riemann solvers is in the fast, semidirect (iterative) solution of fluid dynamic problems governed by the nonlinear mixed elliptic-hyperbolic equations of transonic flow. Different versions of the algorithms are derived and the corresponding FORTRAN computer programs for a simple example problem are described and listed. The algorithms are demonstrated to be efficient and accurate.
DG-FDF solver for large eddy simulation of compressible flows
NASA Astrophysics Data System (ADS)
Sammak, Shervin; Brazell, Michael; Mavriplis, Dimitri; Givi, Peyman
2016-11-01
A new computational scheme is developed for large eddy simulation (LES) of compressible turbulent flows with the filtered density function (FDF) subgrid scale closure. This is a hybrid scheme, combining the discontinuous Galerkin (DG) Eulerian solver with a Lagrangian Monte Carlo FDF simulator. The methodology is shown to be suitable for LES, as a larger portion of the resolved energy is captured as the order of spectral approximation increases. Simulations are conducted of both subsonic and supersonic flows. The consistency and the overall performance of the DG-FDF solver are demonstrated, together with its shock capturing capabilities.
Development of a parallel implicit solver of fluid modeling equations for gas discharges
NASA Astrophysics Data System (ADS)
Hung, Chieh-Tsan; Chiu, Yuan-Ming; Hwang, Feng-Nan; Wu, Jong-Shinn
2011-01-01
A parallel fully implicit PETSc-based fluid modeling equations solver for simulating gas discharges is developed. Fluid modeling equations include: the neutral species continuity equation, the charged species continuity equation with drift-diffusion approximation for mass fluxes, the electron energy density equation, and Poisson's equation for electrostatic potential. Except for Poisson's equation, all model equations are discretized by the fully implicit backward Euler method as a time integrator, and finite differences with the Scharfetter-Gummel scheme for mass fluxes on the spatial domain. At each time step, the resulting large sparse algebraic nonlinear system is solved by the Newton-Krylov-Schwarz algorithm. A 2D-GEC RF discharge is used as a benchmark to validate our solver by comparing the numerical results with both the published experimental data and the theoretical prediction. The parallel performance of the solver is investigated.
Code Verification of the HIGRAD Computational Fluid Dynamics Solver
Van Buren, Kendra L.; Canfield, Jesse M.; Hemez, Francois M.; Sauer, Jeremy A.
2012-05-04
The purpose of this report is to outline code and solution verification activities applied to HIGRAD, a Computational Fluid Dynamics (CFD) solver of the compressible Navier-Stokes equations developed at the Los Alamos National Laboratory, and used to simulate various phenomena such as the propagation of wildfires and atmospheric hydrodynamics. Code verification efforts, as described in this report, are an important first step to establish the credibility of numerical simulations. They provide evidence that the mathematical formulation is properly implemented without significant mistakes that would adversely impact the application of interest. Highly accurate analytical solutions are derived for four code verification test problems that exercise different aspects of the code. These test problems are referred to as: (i) the quiet start, (ii) the passive advection, (iii) the passive diffusion, and (iv) the piston-like problem. These problems are simulated using HIGRAD with different levels of mesh discretization and the numerical solutions are compared to their analytical counterparts. In addition, the rates of convergence are estimated to verify the numerical performance of the solver. The first three test problems produce numerical approximations as expected. The fourth test problem (piston-like) indicates the extent to which the code is able to simulate a 'mild' discontinuity, which is a condition that would typically be better handled by a Lagrangian formulation. The current investigation concludes that the numerical implementation of the solver performs as expected. The quality of solutions is sufficient to provide credible simulations of fluid flows around wind turbines. The main caveat associated with these findings is the low coverage provided by these four problems, and somewhat limited verification activities. A more comprehensive evaluation of HIGRAD may be beneficial for future studies.
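Estimating a rate of convergence from errors on two mesh levels, as this kind of verification study does, usually relies on the standard relation p = log(e_coarse / e_fine) / log(r), where r is the refinement ratio. The Python sketch below illustrates that relation on a stand-in problem (a central-difference derivative of sin at x = 1), which is an assumption made here and has nothing to do with the HIGRAD test problems; the function names are hypothetical.

```python
import math


def observed_order(err_coarse: float, err_fine: float, refinement_ratio: float = 2.0) -> float:
    """Observed order of accuracy p, assuming error ~ C * h**p on both meshes."""
    return math.log(err_coarse / err_fine) / math.log(refinement_ratio)


def central_difference_error(h: float, x: float = 1.0) -> float:
    """Error of the central-difference derivative of sin at x, compared
    with the analytical derivative cos(x)."""
    approx = (math.sin(x + h) - math.sin(x - h)) / (2.0 * h)
    return abs(approx - math.cos(x))


# Halve the spacing and compare the two errors.
p = observed_order(central_difference_error(0.1), central_difference_error(0.05))
print(f"observed order: {p:.3f}")  # expected to be close to the formal order 2
```

An observed order close to the formal order of the scheme is evidence that the discretization is implemented as intended; a persistent shortfall usually points to a coding error or to a solution that is not smooth enough for the asymptotic estimate to apply.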
A modified global Newton solver for viscous-plastic sea ice models
NASA Astrophysics Data System (ADS)
Mehlmann, C.; Richter, T.
2017-08-01
We present and analyze a modified Newton solver, the so-called operator-related damped Jacobian method, with a line search globalization for the solution of the strongly nonlinear momentum equation in a viscous-plastic (VP) sea ice model. Due to large variations in the viscosities, the resulting nonlinear problem is very difficult to solve. The development of fast, robust and converging solvers is subject to present research. There are mainly three approaches for solving the nonlinear momentum equation of the VP model: a fixed-point method denoted as Picard solver, an inexact Newton method, and a subcycling procedure based on an elastic-viscous-plastic model approximation. All methods tend to have problems on fine meshes because of sharp structures in the solution. Convergence rates deteriorate such that either too many iterations are required to reach sufficient accuracy or convergence is not obtained at all. To improve robustness, globalization and acceleration approaches, which increase the area of fast convergence, are needed. We develop an implicit scheme with improved convergence properties by combining an inexact Newton method with a Picard solver. We derive the full Jacobian of the viscous-plastic sea ice momentum equation and show that the Jacobian is a positive definite matrix, guaranteeing global convergence of a properly damped Newton iteration. We compare our modified Newton solver with line search damping to an inexact Newton method with established globalization and acceleration techniques. We present a test case that shows improved robustness of our new approach, in particular on fine meshes.
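The idea of damping a Newton step with a line search can be illustrated on a scalar toy problem. The Python sketch below is a generic backtracking variant and is not the operator-related damped Jacobian method of the abstract: the function name `damped_newton`, the halving strategy, and the test equation arctan(x) = 0 are all assumptions made here. The equation is chosen because, from x0 = 3, the first full Newton step increases the residual, so the damping is actually exercised.

```python
import math
from typing import Callable


def damped_newton(
    f: Callable[[float], float],
    df: Callable[[float], float],
    x0: float,
    tol: float = 1e-12,
    max_iter: int = 50,
) -> float:
    """Newton's method with a backtracking line search on |f|."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) <= tol:
            break
        step = -fx / df(x)  # full Newton step
        damping = 1.0
        # Halve the step until the residual decreases (or damping is tiny).
        while abs(f(x + damping * step)) >= abs(fx) and damping > 1e-8:
            damping *= 0.5
        x += damping * step
    return x


root = damped_newton(math.atan, lambda x: 1.0 / (1.0 + x * x), x0=3.0)
print(root)  # a value very close to 0
```

In a PDE setting, `f` would be the nonlinear residual vector, `df` the Jacobian (applied through a linear solve), and the absolute value a residual norm; the accept-or-halve logic carries over unchanged.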
Towards an efficient meshfree solver
NASA Astrophysics Data System (ADS)
Ala, Guido; Francomano, Elisa; Paliaga, Marta
2016-10-01
In this paper we focus on enhancing the accuracy of approximating a function and its derivatives via smoothed particle hydrodynamics. We discuss improvements in the solution obtained by reformulating the original method by means of the Taylor series expansion and by projecting with the kernel function and its derivatives. The accuracy of a function and its derivatives, up to a fixed order, can be simultaneously improved by assuming them as unknowns of a linear system. The improved formulation has been assessed with gridded and scattered data point distributions, and the convergence has been analyzed referring to a case study in a 2D domain.
Newton-Raphson preconditioner for Krylov type solvers on GPU devices.
Kushida, Noriyuki
2016-01-01
A new Newton-Raphson method based preconditioner for Krylov type linear equation solvers for GPGPU is developed, and the performance is investigated. Conventional preconditioners improve the convergence of Krylov type solvers, and perform well on CPUs. However, they do not perform well on GPGPUs, because of the complexity of implementing powerful preconditioners. The developed preconditioner is based on the BFGS Hessian matrix approximation technique, which is well known as a robust and fast nonlinear equation solver. Because the Hessian matrix in the BFGS represents the coefficient matrix of a system of linear equations in some sense, the approximated Hessian matrix can be a preconditioner. On the other hand, BFGS is required to store dense matrices and to invert them, which should be avoided on modern computers and supercomputers. To overcome these disadvantages, we therefore introduce a limited memory BFGS, which requires less memory space and less computational effort than the BFGS. In addition, a limited memory BFGS can be implemented with BLAS libraries, which are well optimized for target architectures. There are advantages and disadvantages to the Hessian matrix approximation becoming better as the Krylov solver iteration continues. The preconditioning matrix varies through Krylov solver iterations, and only flexible Krylov solvers can work well with the developed preconditioner. The GCR method, which is a flexible Krylov solver, is employed because of the prevalence of GCR as a Krylov solver with a variable preconditioner. As a result of the performance investigation, the new preconditioner indicates the following benefits: (1) The new preconditioner is robust; i.e., it converges while conventional preconditioners (the diagonal scaling, and the SSOR preconditioners) fail. (2) In the best case scenarios, it is over 10 times faster than conventional preconditioners on a CPU. (3) Because it requires only simple operations, it performs well on a GPGPU. In
Novel Scalable 3-D MT Inverse Solver
NASA Astrophysics Data System (ADS)
Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.
2016-12-01
We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine a highly-scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits an adjoint-source approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem set up. To parameterize an inverse domain a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms ranging from modern laptops to high-performance clusters demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.
Equation solvers for distributed-memory computers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1994-01-01
A large number of scientific and engineering problems require the rapid solution of large systems of simultaneous equations. The performance of parallel computers in this area now dwarfs traditional vector computers by nearly an order of magnitude. This talk describes the major issues involved in parallel equation solvers with particular emphasis on the Intel Paragon, IBM SP-1 and SP-2 processors.
NASA Astrophysics Data System (ADS)
Lafferty, Nathan; Badreddine, Hassan; Niceno, Bojan; Prasser, Horst-Michael
2015-11-01
A parallelizable flood fill algorithm is developed for identifying and tracking closed regions of fluids, dispersed phases, in CFD simulations of multiphase flows. It is used in conjunction with a newly developed method, corrective interface tracking, for simulating finite size dispersed bubbly flows in which the bubbles are too small relative to the grid to be simulated accurately with interface tracking techniques and too large relative to the grid for Lagrangian particle tracking techniques. The latter situation arises if local bubble induced turbulence is resolved, or modeled with LES. With corrective interface tracking the governing equations are solved on a static Eulerian grid. A correcting force, derived from empirical correlation based hydrodynamic forces, is applied to the bubble which is then advected using interface tracking techniques. This method results in accurate fluid-gas two-way coupling, bubble shapes, and terminal rise velocities. The flood fill algorithm and corrective interface tracking technique are applied to an air/water simulation of multiple bubbles rising and merging with a free surface. They are then validated against the same simulation performed using only interface tracking with a much finer grid.
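As a minimal serial sketch of the region-identification step described above, the following Python labels connected regions of a dispersed phase on a 2-D grid with a breadth-first flood fill. The grid layout, the 4-neighbour connectivity, and the name `label_regions` are assumptions for illustration; the abstract does not describe how the authors parallelize the algorithm.

```python
from collections import deque

def label_regions(mask):
    """Label 4-connected regions of True cells; returns (labels, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]   # 0 means "not in any region"
    count = 0
    for i in range(rows):
        for j in range(cols):
            if not mask[i][j] or labels[i][j]:
                continue
            count += 1                            # start a new region at this seed
            labels[i][j] = count
            queue = deque([(i, j)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not labels[nr][nc]):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count

# Two separate "bubbles" (1 = gas cell) in a small liquid domain.
grid = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
mask = [[bool(v) for v in row] for row in grid]
labels, count = label_regions(mask)
```

Each labelled region can then be treated as one bubble, for example to accumulate its volume or centroid.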
Implicit solvers for unstructured meshes
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Mavriplis, Dimitri J.
1991-01-01
Implicit methods were developed and tested for unstructured mesh computations. The approximate system which arises from the Newton linearization of the nonlinear evolution operator is solved by using the preconditioned GMRES (Generalized Minimum Residual) technique. Three different preconditioners were studied, namely, the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over relaxation (SSOR). The preconditioners were optimized to have good vectorization properties. SSOR and ILU were also studied as iterative schemes. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also studied. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
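To make the incomplete LU preconditioner mentioned above concrete, here is a small Python sketch of a zero fill-in factorization, ILU(0), on a matrix stored densely for readability. The fill level, the function name `ilu0`, and the test matrix are assumptions; the abstract does not state which ILU variant or data structure the authors used.

```python
def ilu0(A):
    """Return the ILU(0) factors of A packed in one matrix.

    The strict lower triangle holds L (unit diagonal implied) and the upper
    triangle holds U. Entries outside the sparsity pattern of A stay zero.
    """
    n = len(A)
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if A[i][k] == 0:
                continue                      # outside the pattern: no multiplier
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                if A[i][j] != 0:              # update only entries inside the pattern
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU

# Five-point Laplacian on a 2 x 2 grid; an exact LU would create fill at (1,2) and (2,1).
A = [
    [4.0, -1.0, -1.0, 0.0],
    [-1.0, 4.0, 0.0, -1.0],
    [-1.0, 0.0, 4.0, -1.0],
    [0.0, -1.0, -1.0, 4.0],
]
LU = ilu0(A)
```

Applying the preconditioner then amounts to a forward substitution with L followed by a back substitution with U inside each GMRES iteration.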
CASTRO: A NEW COMPRESSIBLE ASTROPHYSICAL SOLVER. III. MULTIGROUP RADIATION HYDRODYNAMICS
Zhang, W.; Almgren, A.; Bell, J.; Howell, L.; Burrows, A.; Dolence, J.
2013-01-15
We present a formulation for multigroup radiation hydrodynamics that is correct to order O(v/c) using the comoving-frame approach and the flux-limited diffusion approximation. We describe a numerical algorithm for solving the system, implemented in the compressible astrophysics code, CASTRO. CASTRO uses a Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. In our multigroup radiation solver, the system is split into three parts: one part that couples the radiation and fluid in a hyperbolic subsystem, another part that advects the radiation in frequency space, and a parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem and the frequency space advection are solved explicitly with high-order Godunov schemes, whereas the parabolic part is solved implicitly with a first-order backward Euler method. Our multigroup radiation solver works for both neutrino and photon radiation.
Jia, Jingfei
2015-01-01
It is well known that radiative transfer equation (RTE) provides more accurate tomographic results than its diffusion approximation (DA). However, RTE-based tomographic reconstruction codes have limited applicability in practice due to their high computational cost. In this article, we propose a new efficient method for solving the RTE forward problem with multiple light sources in an all-at-once manner instead of solving it for each source separately. To this end, we introduce here a novel linear solver called block biconjugate gradient stabilized method (block BiCGStab) that makes full use of the shared information between different right hand sides to accelerate solution convergence. Two parallelized block BiCGStab methods are proposed for additional acceleration under limited threads situation. We evaluate the performance of this algorithm with numerical simulation studies involving the Delta-Eddington approximation to the scattering phase function. The results show that the single threading block RTE solver proposed here reduces computation time by a factor of 1.5~3 as compared to the traditional sequential solution method and the parallel block solver by a factor of 1.5 as compared to the traditional parallel sequential method. This block linear solver is, moreover, independent of discretization schemes and preconditioners used; thus further acceleration and higher accuracy can be expected when combined with other existing discretization schemes or preconditioners. PMID:26345531
Jia, Jingfei; Kim, Hyun K; Hielscher, Andreas H
2015-12-01
It is well known that radiative transfer equation (RTE) provides more accurate tomographic results than its diffusion approximation (DA). However, RTE-based tomographic reconstruction codes have limited applicability in practice due to their high computational cost. In this article, we propose a new efficient method for solving the RTE forward problem with multiple light sources in an all-at-once manner instead of solving it for each source separately. To this end, we introduce here a novel linear solver called block biconjugate gradient stabilized method (block BiCGStab) that makes full use of the shared information between different right hand sides to accelerate solution convergence. Two parallelized block BiCGStab methods are proposed for additional acceleration under limited threads situation. We evaluate the performance of this algorithm with numerical simulation studies involving the Delta-Eddington approximation to the scattering phase function. The results show that the single threading block RTE solver proposed here reduces computation time by a factor of 1.5~3 as compared to the traditional sequential solution method and the parallel block solver by a factor of 1.5 as compared to the traditional parallel sequential method. This block linear solver is, moreover, independent of discretization schemes and preconditioners used; thus further acceleration and higher accuracy can be expected when combined with other existing discretization schemes or preconditioners.
Schulz, Andreas S.; Shmoys, David B.; Williamson, David P.
1997-01-01
Increasing global competition, rapidly changing markets, and greater consumer awareness have altered the way in which corporations do business. To become more efficient, many industries have sought to model some operational aspects by gigantic optimization problems. It is not atypical to encounter models that capture 10^6 separate “yes” or “no” decisions to be made. Although one could, in principle, try all 2^(10^6) possible solutions to find the optimal one, such a method would be impractically slow. Unfortunately, for most of these models, no algorithms are known that find optimal solutions with reasonable computation times. Typically, industry must rely on solutions of unguaranteed quality that are constructed in an ad hoc manner. Fortunately, for some of these models there are good approximation algorithms: algorithms that produce solutions quickly that are provably close to optimal. Over the past 6 years, there has been a sequence of major breakthroughs in our understanding of the design of approximation algorithms and of limits to obtaining such performance guarantees; this area has been one of the most flourishing areas of discrete mathematics and theoretical computer science. PMID:9370525
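The abstract does not name a specific algorithm, so as one textbook illustration of "provably close to optimal", the Python sketch below implements the classical maximal-matching heuristic for minimum vertex cover, which is guaranteed to return a cover at most twice the optimal size. The function name and the example graph are assumptions for illustration.

```python
def vertex_cover_2approx(edges):
    """Return a vertex cover whose size is at most twice the minimum.

    Greedily builds a maximal matching and takes both endpoints of every
    matched edge; any cover must contain at least one endpoint per matched edge.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))      # edge (u, v) joins the matching
    return cover

edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
cover = vertex_cover_2approx(edges)   # the optimum here is {0, 3}, of size 2
```

The run time is linear in the number of edges, in contrast to enumerating every yes/no assignment.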
Implicit compressible flow solvers on unstructured meshes
NASA Astrophysics Data System (ADS)
Nagaoka, Makoto; Horinouchi, Nariaki
1993-09-01
An implicit solver for compressible flows using Bi-CGSTAB method is proposed. The Euler equations are discretized with the delta-form by the finite volume method on the cell-centered triangular unstructured meshes. The numerical flux is calculated by Roe's upwind scheme. The linearized simultaneous equations with the irregular nonsymmetric sparse matrix are solved by the Bi-CGSTAB method with the preconditioner of incomplete LU factorization. This method is also vectorized by the multi-colored ordering. Although the solver requires more computational memory, it shows faster and more robust convergence than the other conventional methods: three-stage Runge-Kutta method, point Gauss-Seidel method, and Jacobi method for two-dimensional inviscid steady flows.
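For readers unfamiliar with the Bi-CGSTAB iteration named above, the following is a minimal unpreconditioned Python sketch for a small nonsymmetric system. The abstract's solver adds an incomplete LU preconditioner and multi-colored vectorization, neither of which is reproduced here; the dense storage, tolerance, and test matrix are assumptions.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bicgstab(A, b, tol=1e-10, max_iter=100):
    """Unpreconditioned Bi-CGSTAB for A x = b with a zero initial guess."""
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual for the zero initial guess
    r_hat = r[:]                  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = matvec(A, p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 <= tol * b_norm:
            # The half step already converged; skip the stabilizing step.
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            break
        t = matvec(A, s)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        rho = rho_new
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
    return x

A = [[4.0, 1.0, 0.0], [-1.0, 4.0, 1.0], [0.0, -1.0, 4.0]]
b = [6.0, 10.0, 10.0]             # equals A times [1, 2, 3]
x = bicgstab(A, b)
```

Unlike the biconjugate gradient method, the iteration needs only products with A, not with its transpose, which is convenient for the irregular sparse matrices that arise on unstructured meshes.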
Aleph Field Solver Challenge Problem Results Summary
Hooper, Russell; Moore, Stan Gerald
2015-01-01
Aleph models continuum electrostatic and steady and transient thermal fields using a finite-element method. Much work has gone into expanding the core solver capability to support enriched modeling consisting of multiple interacting fields, special boundary conditions and two-way interfacial coupling with particles modeled using Aleph's complementary particle-in-cell capability. This report provides quantitative evidence for correct implementation of Aleph's field solver via order-of-convergence assessments on a collection of problems of increasing complexity. It is intended to provide Aleph with a pedigree and to establish a basis for confidence in results for more challenging problems important to Sandia's mission that Aleph was specifically designed to address.
A perspective on unstructured grid flow solvers
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.
1995-01-01
This survey paper assesses the status of compressible Euler and Navier-Stokes solvers on unstructured grids. Different spatial and temporal discretization options for steady and unsteady flows are discussed. The integration of these components into an overall framework to solve practical problems is addressed. Issues such as grid adaptation, higher order methods, hybrid discretizations and parallel computing are briefly discussed. Finally, some outstanding issues and future research directions are presented.
Domain decomposition for the SPN solver MINOS
Jamelot, Erell; Baudron, Anne-Marie; Lautard, Jean-Jacques
2012-07-01
In this article we present a domain decomposition method for the mixed SPN equations, discretized with Raviart-Thomas-Nedelec finite elements. This domain decomposition is based on the iterative Schwarz algorithm with Robin interface conditions to handle communications. After having described this method, we give details on how to optimize the convergence. Finally, we give some numerical results computed in a realistic 3D domain. The computations are done with the MINOS solver of the APOLLO3 (R) code. (authors)
The Openpipeflow Navier-Stokes solver
NASA Astrophysics Data System (ADS)
Willis, Ashley P.
Pipelines are used in a huge range of industrial processes involving fluids, and the ability to accurately predict properties of the flow through a pipe is of fundamental engineering importance. Armed with parallel MPI, Arnoldi and Newton-Krylov solvers, the Openpipeflow code can be used in a range of settings, from large-scale simulation of highly turbulent flow, to the detailed analysis of nonlinear invariant solutions (equilibria and periodic orbits) and their influence on the dynamics of the flow.
Domain Decomposition for the SPN Solver MINOS
NASA Astrophysics Data System (ADS)
Jamelot, Erell; Baudron, Anne-Marie; Lautard, Jean-Jacques
2012-12-01
In this article we present a domain decomposition method for the mixed SPN equations, discretized with Raviart-Thomas-Nédélec finite elements. This domain decomposition is based on the iterative Schwarz algorithm with Robin interface conditions to handle communications. After having described this method, we give details on how to optimize the convergence. Finally, we give some numerical results computed in a realistic 3D domain. The computations are done with the MINOS solver of the APOLLO3® code.
Gerris Flow Solver: Implementation and Application
2013-05-12
Zienkiewicz, 1966). It is the solver for the Imperial College Ocean Model (ICOM), which uses 3D adaptive mesh methods (Ford et al., 2004). The finite...method (Popinet, 2003). The 3D Gerris model was used to study air turbulence associated with a complex shape with good match to observations (Popinet...et al., 2004). The Ocean module of Gerris was described by Popinet and Rickard (2004) as an adaptive, finite-volume, 3D , incompressible, N-S fluid
User documentation for PVODE, an ODE solver for parallel computers
Hindmarsh, A.C., LLNL
1998-05-01
PVODE is a general purpose ordinary differential equation (ODE) solver for stiff and nonstiff ODEs. It is based on CVODE [5] [6], which is written in ANSI-standard C. PVODE uses MPI (Message-Passing Interface) [8] and a revised version of the vector module in CVODE to achieve parallelism and portability. PVODE is intended for the SPMD (Single Program Multiple Data) environment with distributed memory, in which all vectors are identically distributed across processors. In particular, the vector module is designed to help the user assign a contiguous segment of a given vector to each of the processors for parallel computation. The idea is for each processor to solve a certain fixed subset of the ODEs. To better understand PVODE, we first need to understand CVODE and its historical background. The ODE solver CVODE, which was written by Cohen and Hindmarsh, combines features of two earlier Fortran codes, VODE [1] and VODPK [3]. Those two codes were written by Brown, Byrne, and Hindmarsh. Both use variable-coefficient multi-step integration methods, and address both stiff and nonstiff systems. (Stiffness is defined as the presence of one or more very small damping time constants.) VODE uses direct linear algebraic techniques to solve the underlying banded or dense linear systems of equations in conjunction with a modified Newton method in the stiff ODE case. On the other hand, VODPK uses a preconditioned Krylov iterative method [2] to solve the underlying linear system. User-supplied preconditioners directly address the dominant source of stiffness. Consequently, CVODE implements both the direct and iterative methods. Currently, with regard to the nonlinear and linear system solution, PVODE has three method options available: functional iteration, Newton iteration with a diagonal approximate Jacobian, and Newton iteration with the iterative method SPGMR (Scaled Preconditioned Generalized Minimal Residual method). Both CVODE and PVODE are written in such a way that other linear
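To show why stiff problems call for an implicit method with a Newton iteration, here is a deliberately simplified Python sketch. It uses fixed-step backward Euler (the first-order member of the BDF family) on a scalar ODE, not the variable-coefficient multi-step methods of CVODE or PVODE, and it does not call the PVODE interface. The function name `backward_euler_newton` and the test problem are assumptions.

```python
import math

def backward_euler_newton(f, dfdy, y0, t0, t1, steps, newton_iters=20, tol=1e-12):
    """Integrate y' = f(t, y) from t0 to t1 with fixed-step backward Euler.

    Each step solves g(y) = y - y_old - h * f(t_new, y) = 0 by Newton iteration.
    """
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        t_new = t + h
        y_new = y                              # predictor: previous value
        for _ in range(newton_iters):
            g = y_new - y - h * f(t_new, y_new)
            dg = 1.0 - h * dfdy(t_new, y_new)  # derivative of g with respect to y
            delta = g / dg
            y_new -= delta
            if abs(delta) < tol:
                break
        t, y = t_new, y_new
    return y

# Stiff test problem: y' = -1000 * (y - cos(t)), y(0) = 0.
lam = 1000.0
f = lambda t, y: -lam * (y - math.cos(t))
dfdy = lambda t, y: -lam
y_end = backward_euler_newton(f, dfdy, 0.0, 0.0, 1.0, 50)
```

With h = 0.02 the product h * lam is 20, so explicit Euler would amplify errors by a factor of 19 per step, whereas the implicit scheme damps them and the solution tracks cos(t) closely.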
Galerkin CFD solvers for use in a multi-disciplinary suite for modeling advanced flight vehicles
NASA Astrophysics Data System (ADS)
Moffitt, Nicholas J.
This work extends existing Galerkin CFD solvers for use in a multi-disciplinary suite. The suite is proposed as a means of modeling advanced flight vehicles, which exhibit strong coupling between aerodynamics, structural dynamics, controls, rigid body motion, propulsion, and heat transfer. Such applications include aeroelastics, aeroacoustics, stability and control, and other highly coupled applications. The suite uses NASA STARS for modeling structural dynamics and heat transfer. Aerodynamics, propulsion, and rigid body dynamics are modeled in one of the five CFD solvers below. Euler2D and Euler3D are Galerkin CFD solvers created at OSU by Cowan (2003). These solvers are capable of modeling compressible inviscid aerodynamics with modal elastics and rigid body motion. This work reorganized these solvers to improve efficiency during editing and at run time. Simple and efficient propulsion models were added, including rocket, turbojet, and scramjet engines. Viscous terms were added to the previous solvers to create NS2D and NS3D. The viscous contributions were demonstrated in the inertial and non-inertial frames. Variable viscosity (Sutherland's equation) and heat transfer boundary conditions were added to both solvers but not verified in this work. Two turbulence models were implemented in NS2D and NS3D: the Spalart-Allmaras (SA) model of Deck, et al. (2002) and Menter's SST model (1994). A rotation correction term (Shur, et al., 2000) was added to the production of turbulence. Local time stepping and artificial dissipation were adapted to each model. CFDsol is a Taylor-Galerkin solver with an SA turbulence model. This work improved the time accuracy, far field stability, viscous terms, Sutherland's equation, and SA model with NS3D as a guideline and added the propulsion models from Euler3D to CFDsol. Simple geometries were demonstrated to utilize current meshing and processing capabilities. Air-breathing hypersonic flight vehicles (AHFVs) represent the ultimate
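The variable-viscosity model referred to above as Sutherland's equation can be written as mu(T) = mu_ref * (T / T_ref)^(3/2) * (T_ref + S) / (T + S). The short Python sketch below evaluates it with commonly quoted constants for air; these constants and the function name are assumptions, since the abstract does not list the values used in NS2D or NS3D.

```python
def sutherland_viscosity(T, mu_ref=1.716e-5, T_ref=273.15, S=110.4):
    """Dynamic viscosity [Pa*s] at temperature T [K] from Sutherland's law.

    Default constants are commonly quoted values for air (an assumption here).
    """
    return mu_ref * (T / T_ref) ** 1.5 * (T_ref + S) / (T + S)

mu_300 = sutherland_viscosity(300.0)   # roughly 1.85e-5 Pa*s for air
```

In a flow solver the function would be evaluated from the local temperature wherever the viscous terms are assembled.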
Domain decomposed preconditioners with Krylov subspace methods as subdomain solvers
Pernice, M.
1994-12-31
Domain decomposed preconditioners for nonsymmetric partial differential equations typically require the solution of problems on the subdomains. Most implementations employ exact solvers to obtain these solutions. Consequently work and storage requirements for the subdomain problems grow rapidly with the size of the subdomain problems. Subdomain solves constitute the single largest computational cost of a domain decomposed preconditioner, and improving the efficiency of this phase of the computation will have a significant impact on the performance of the overall method. The small local memory available on the nodes of most message-passing multicomputers motivates consideration of the use of an iterative method for solving subdomain problems. For large-scale systems of equations that are derived from three-dimensional problems, memory considerations alone may dictate the need for using iterative methods for the subdomain problems. In addition to reduced storage requirements, use of an iterative solver on the subdomains allows flexibility in specifying the accuracy of the subdomain solutions. Substantial savings in solution time is possible if the quality of the domain decomposed preconditioner is not degraded too much by relaxing the accuracy of the subdomain solutions. While some work in this direction has been conducted for symmetric problems, similar studies for nonsymmetric problems appear not to have been pursued. This work represents a first step in this direction, and explores the effectiveness of performing subdomain solves using several transpose-free Krylov subspace methods, GMRES, transpose-free QMR, CGS, and a smoothed version of CGS. Depending on the difficulty of the subdomain problem and the convergence tolerance used, a reduction in solution time is possible in addition to the reduced memory requirements. The domain decomposed preconditioner is a Schur complement method in which the interface operators are approximated using interface probing.
High Energy Boundary Conditions for a Cartesian Mesh Euler Solver
NASA Technical Reports Server (NTRS)
Pandya, Shishir; Murman, Scott; Aftosmis, Michael
2003-01-01
Inlets and exhaust nozzles are commonplace in the world of flight. Yet, many aerodynamic simulation packages do not provide a method of modelling such high energy boundaries in the flow field. For the purposes of aerodynamic simulation, inlets and exhausts are often faired over and it is assumed that the flow differences resulting from this assumption are minimal. While this is an adequate assumption for the prediction of lift, the lack of a plume behind the aircraft creates an evacuated base region, thus affecting both drag and pitching moment values. In addition, the flow in the base region is often mis-predicted, resulting in incorrect base drag. In order to accurately predict these quantities, a method for specifying inlet and exhaust conditions needs to be available in aerodynamic simulation packages. A method for a first approximation of a plume without accounting for chemical reactions is added to the Cartesian mesh based aerodynamic simulation package CART3D. The method consists of 3 steps. In the first step, a components approach where each triangle is assigned a component number is used. Here, a method for marking the inlet or exhaust plane triangles as separate components is discussed. In step two, the flow solver is modified to accept a reference state for the components marked inlet or exhaust. In the third step, the flow solver uses these separated components and the reference state to compute the correct flow condition at that triangle. The present method is implemented in the CART3D package which consists of a set of tools for generating a Cartesian volume mesh from a set of component triangulations. The Euler equations are solved on the resulting unstructured Cartesian mesh. The usefulness of the method is demonstrated with two validation cases. A generic missile body is also presented to show the usefulness of the method on a real world geometry.
Updates to the NEQAIR Radiation Solver
NASA Technical Reports Server (NTRS)
Cruden, Brett A.; Brandis, Aaron M.
2014-01-01
The NEQAIR code is one of the original heritage solvers for radiative heating prediction in aerothermal environments, and is still used today for mission design purposes. This paper discusses the implementation of the first major revision to the NEQAIR code in the last five years, NEQAIR v14.0. The most notable features of NEQAIR v14.0 are the parallelization of the radiation computation, reducing runtimes by about 30×, and the inclusion of mid-wave CO2 infrared radiation.
Some topics of Navier-Stokes solvers
NASA Astrophysics Data System (ADS)
Honma, H.; Nishikawa, N.
1990-03-01
The process of numerical simulation consists of selection of some items: a mathematical model, a numerical scheme, the level of the computer, and post processing. From this point of view, recent numerical studies of viscous flows are described especially for the fluid engineering laboratories in the Chiba University. The examples of simulations are Mach reflection on a wedge using a kinetic model equation and a cylinder-plate juncture flow using incompressible Navier Stokes equation. Some attempts at graphic monitoring of fluid mechanical calculations are also shown for some combinations of computers with Computational Fluid Dynamics (CFD) solvers.
A finite element field solver for dipole modes
Nelson, E.M.
1992-08-01
A finite element field solver for dipole modes in axisymmetric structures has been written. The second-order elements used in this formulation yield accurate mode frequencies with no spurious modes. Quasi-periodic boundaries are included to allow travelling waves in periodic structures. The solver is useful in applications requiring precise frequency calculations such as detuned accelerator structures for linear colliders. Comparisons are made with measurements and with the popular but less accurate field solver URMEL.
Guerin, P.; Baudron, A. M.; Lautard, J. J.
2006-07-01
This paper describes a new technique for determining the pin power in heterogeneous core calculations. It is based on a domain decomposition with overlapping sub-domains and a component mode synthesis technique for the global flux determination. Local basis functions are used to span a discrete space that allows fundamental global mode approximation through a Galerkin technique. Two approaches are given to obtain these local basis functions: in the first one (Component Mode Synthesis method), the first few spatial eigenfunctions are computed on each sub-domain, using periodic boundary conditions. In the second one (Factorized Component Mode Synthesis method), only the fundamental mode is computed, and we use a factorization principle for the flux in order to replace the higher order Eigenmodes. These different local spatial functions are extended to the global domain by defining them as zero outside the sub-domain. These methods are well-fitted for heterogeneous core calculations because the spatial interface modes are taken into account in the domain decomposition. Although these methods could be applied to higher order angular approximations - particularly easily to a SPN approximation - the numerical results we provide are obtained using a diffusion model. We show the methods' accuracy for reactor cores loaded with UOX and MOX assemblies, for which standard reconstruction techniques are known to perform poorly. Furthermore, we show that our methods are highly and easily parallelizable. (authors)
Explicit solvers in an implicit code
NASA Astrophysics Data System (ADS)
Martinez Montesinos, Beatriz; Kaus, Boris J. P.; Popov, Anton
2017-04-01
Many geodynamic processes occur over long timescales (millions of years), and are best solved with implicit solvers. Yet, some processes, such as hydrofracking, or wave propagation, occur over smaller timescales. In those cases, it might be advantageous to use an explicit rather than an implicit approach as it requires significantly less memory and computational cost. Here, we discuss our ongoing work to include explicit solvers in the parallel software package LaMEM (Lithosphere and Mantle Evolution Model). As a first step, we focus on modelling seismic wave propagation in heterogeneous 3D poro-elasto-plastic models. To do that, we add inertial terms to the momentum equations as well as elastic compressibility to the mass conservation equations in an explicit way using the staggered grid finite difference discretization method. Results are similar to those of existing wave propagation codes, and the implementation is capable of simulating wave propagation in heterogeneous media. To simulate geomechanical problems, timestep restrictions posed by the seismic wave speed are usually too severe to allow simulating deformation on a timescale of months to years. The classical (FLAC) method introduces a mass-density scaling in which a non-physical (larger) density is employed in the momentum equations. We will discuss how this method fits simple benchmarks for elastic and elastoplastic deformation. As an application, we use the code to model different complex media subject to compression and we investigate how mass scaling influences our results.
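The effect of mass-density scaling on the explicit timestep can be illustrated with a generic CFL-type estimate, dt = cfl * dx / v_p, where v_p = sqrt((K + 4G/3) / rho) is the P-wave speed. The Python sketch below is not LaMEM code; the material values, the CFL factor, and the function name `stable_timestep` are assumptions for illustration.

```python
import math

def stable_timestep(dx, bulk_modulus, shear_modulus, density, cfl=0.5):
    """CFL-type estimate dt = cfl * dx / v_p for an explicit elastic solver."""
    v_p = math.sqrt((bulk_modulus + 4.0 * shear_modulus / 3.0) / density)
    return cfl * dx / v_p

dx = 100.0                 # grid spacing [m]
K, G = 5.0e10, 3.0e10      # bulk and shear moduli [Pa]
rho = 2700.0               # physical density [kg/m^3]

dt_physical = stable_timestep(dx, K, G, rho)          # about 8.7e-3 s
dt_scaled = stable_timestep(dx, K, G, rho * 1.0e12)   # non-physical, larger density
```

Because the wave speed scales with the inverse square root of density, multiplying the density by 1e12 lengthens the admissible timestep by a factor of 1e6, which is what makes quasi-static deformation over months to years reachable with an explicit scheme.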
Two-Dimensional Ffowcs Williams/Hawkings Equation Solver
NASA Technical Reports Server (NTRS)
Lockard, David P.
2005-01-01
FWH2D is a Fortran 90 computer program that solves a two-dimensional (2D) version of the equation, derived by J. E. Ffowcs Williams and D. L. Hawkings, for sound generated by turbulent flow. FWH2D was developed especially for estimating noise generated by airflows around such approximately 2D airframe components as slats. The user provides input data on fluctuations of pressure, density, and velocity on some surface. These data are combined with information about the geometry of the surface to calculate histories of thickness and loading terms. These histories are fast-Fourier-transformed into the frequency domain. For each frequency of interest and each observer position specified by the user, kernel functions are integrated over the surface by use of the trapezoidal rule to calculate a pressure signal. The resulting frequency-domain signals are inverse-fast-Fourier-transformed back into the time domain. The output of the code consists of the time- and frequency-domain representations of the pressure signals at the observer positions. Because of its approximate nature, FWH2D overpredicts the noise from a finite-length (3D) component. The advantage of FWH2D is that it requires a fraction of the computation time of a 3D Ffowcs Williams/Hawkings solver.
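The surface integration step mentioned above uses the trapezoidal rule. As a minimal Python sketch of that quadrature only (not of the FWH2D kernel functions, which the abstract does not specify), the helper below integrates sampled values over possibly non-uniform points; the name `trapezoid` and the sine integrand are assumptions for illustration.

```python
import math

def trapezoid(xs, ys):
    """Composite trapezoidal rule for samples ys at (possibly non-uniform) xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

n = 200
xs = [math.pi * i / n for i in range(n + 1)]
ys = [math.sin(x) for x in xs]
approx = trapezoid(xs, ys)     # the exact integral of sin over [0, pi] is 2
```

In a frequency-domain solver the same rule would be applied to complex-valued integrands, once per frequency and observer position.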
NASA Astrophysics Data System (ADS)
Guillen, Ph.; Borrel, M.; Dormieux, M.
1990-10-01
A numerical scheme of the MUSCL type used for the numerical simulation of gas flow of different types around complex configurations is described. Approximate Riemann solvers of the Van Leer, Roe, and Osher types, developed for perfect gas flows, are used. These solvers have been extended to non-reactive mixtures of two species and real gas flows by Abgrall, Montagne and Vinokur. The architecture of the code, dictated by constraints in geometrical considerations, computational aspects, the specific nature of the flow, and ergonomy, is described.
Computational results for flows over 2-D ramp and 3-D obstacle with an upwind Navier-Stokes solver
NASA Technical Reports Server (NTRS)
Venkatapathy, Ethiraj
1990-01-01
An implicit, finite-difference, upwind, full Navier-Stokes solver was applied to supersonic/hypersonic flows over two-dimensional ramps and a three-dimensional obstacle. Some of the computed results are presented. The numerical scheme used in the study is an implicit, spatially second order accurate, upwind, LU-ADI scheme based on Roe's approximate Riemann solver with MUSCL differencing of Van Leer. An algebraic grid generation scheme based on a generalized interpolation scheme was used in generating the grids for the various 2-D and 3-D problems.
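The MUSCL differencing mentioned above reconstructs limited left and right states at cell interfaces before they are passed to the Riemann solver. The Python sketch below shows a scalar, one-dimensional version with the Van Leer limiter; the choice of limiter, the uniform grid, and the function names are assumptions, since the abstract does not give these details.

```python
def van_leer(r):
    """Van Leer limiter phi(r) = (r + |r|) / (1 + |r|)."""
    return (r + abs(r)) / (1.0 + abs(r))

def muscl_interface_states(u):
    """Limited (left, right) states at the interfaces between cells i and i+1.

    Covers i = 1 .. len(u) - 3, where both neighbouring cells have a full stencil.
    """
    def slope(i):
        # Limited difference in cell i, built from its backward and forward differences.
        d_minus = u[i] - u[i - 1]
        d_plus = u[i + 1] - u[i]
        if d_plus == 0.0:
            return 0.0
        return van_leer(d_minus / d_plus) * d_plus

    states = []
    for i in range(1, len(u) - 2):
        u_left = u[i] + 0.5 * slope(i)            # extrapolate from cell i
        u_right = u[i + 1] - 0.5 * slope(i + 1)   # extrapolate from cell i+1
        states.append((u_left, u_right))
    return states

smooth = muscl_interface_states([0.0, 1.0, 2.0, 3.0, 4.0])
zigzag = muscl_interface_states([0.0, 1.0, 0.0, 1.0, 0.0])
```

On linear data the reconstruction is exact and both states coincide, while at local extrema the limiter sets the slope to zero and the scheme falls back to first order, which is how oscillations near shocks are suppressed.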
NASA Astrophysics Data System (ADS)
Vincenti, H.; Vay, J.-L.
2016-03-01
Very high order or pseudo-spectral Maxwell solvers are the method of choice to reduce discretization effects (e.g. numerical dispersion) that are inherent to low order Finite-Difference Time-Domain (FDTD) schemes. However, due to their large stencils, these solvers are often subject to truncation errors in many electromagnetic simulations. These truncation errors come from non-physical modifications of Maxwell's equations in space that may generate spurious signals affecting the overall accuracy of the simulation results. Such modifications for instance occur when Perfectly Matched Layers (PMLs) are used at simulation domain boundaries to simulate open media. Another example is the use of arbitrary order Maxwell solver with domain decomposition technique that may under some condition involve stencil truncations at subdomain boundaries, resulting in small spurious errors that do eventually build up. In each case, a careful evaluation of the characteristics and magnitude of the errors resulting from these approximations, and their impact at any frequency and angle, requires detailed analytical and numerical studies. To this end, we present a general analytical approach that enables the evaluation of numerical errors of fully three-dimensional arbitrary order finite-difference Maxwell solver, with arbitrary modification of the local stencil in the simulation domain. The analytical model is validated against simulations of domain decomposition technique and PMLs, when these are used with very high-order Maxwell solver, as well as in the infinite order limit of pseudo-spectral solvers. Results confirm that the new analytical approach enables exact predictions in each case. It also confirms that the domain decomposition technique can be used with very high-order Maxwell solvers and a reasonably low number of guard cells with negligible effects on the whole accuracy of the simulation.
Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry; Williams, Samuel; Napov, Artem
2016-10-27
We present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.
Experiences with linear solvers for oil reservoir simulation problems
Joubert, W.; Janardhan, R.; Biswas, D.; Carey, G.
1996-12-31
This talk will focus on practical experiences with iterative linear solver algorithms used in conjunction with Amoco Production Company's Falcon oil reservoir simulation code. The goal of this study is to determine the best linear solver algorithms for these types of problems. The results of numerical experiments will be presented.
NASA Astrophysics Data System (ADS)
Wille, S. Ø.
1996-06-01
An iterative adaptive equation multigrid solver for solving the implicit Navier-Stokes equations simultaneously with tri-tree grid generation is developed. The tri-tree grid generator builds a hierarchical grid structure which is mapped to a finite element grid at each hierarchical level. For each hierarchical finite element multigrid the Navier-Stokes equations are solved approximately. The solution at each level is projected onto the next finer grid and used as a start vector for the iterative equation solver at the finer level. When the finest grid is reached, the equation solver is iterated until a tolerated solution is reached. The iterative multigrid equation solver is preconditioned by incomplete LU factorization with coupled node fill-in. The non-linear Navier-Stokes equations are linearized by both the Newton method and grid adaption. The efficiency and behaviour of the present adaptive method are compared with those of the previously developed iterative equation solver which is preconditioned by incomplete LU factorization with coupled node fill-in.
Vincenti, H.; Vay, J.-L.
2015-11-22
Due to discretization effects and truncation to finite domains, many electromagnetic simulations present non-physical modifications of Maxwell's equations in space that may generate spurious signals affecting the overall accuracy of the result. Such modifications for instance occur when Perfectly Matched Layers (PMLs) are used at simulation domain boundaries to simulate open media. Another example is the use of arbitrary order Maxwell solver with domain decomposition technique that may under some condition involve stencil truncations at subdomain boundaries, resulting in small spurious errors that do eventually build up. In each case, a careful evaluation of the characteristics and magnitude of the errors resulting from these approximations, and their impact at any frequency and angle, requires detailed analytical and numerical studies. To this end, we present a general analytical approach that enables the evaluation of numerical discretization errors of fully three-dimensional arbitrary order finite-difference Maxwell solver, with arbitrary modification of the local stencil in the simulation domain. The analytical model is validated against simulations of domain decomposition technique and PMLs, when these are used with very high-order Maxwell solver, as well as in the infinite order limit of pseudo-spectral solvers. Results confirm that the new analytical approach enables exact predictions in each case. It also confirms that the domain decomposition technique can be used with very high-order Maxwell solver and a reasonably low number of guard cells with negligible effects on the whole accuracy of the simulation.
A matrix-form GSM-CFD solver for incompressible fluids and its application to hemodynamics
NASA Astrophysics Data System (ADS)
Yao, Jianyao; Liu, G. R.
2014-10-01
A GSM-CFD solver for incompressible flows is developed based on the gradient smoothing method (GSM). A matrix-form algorithm and corresponding data structure for GSM are devised to efficiently approximate the spatial gradients of field variables using the gradient smoothing operation. The calculated gradient values on various test fields show that the proposed GSM is capable of exactly reproducing linear field and of second order accuracy on all kinds of meshes. It is found that the GSM is much more robust to mesh deformation and therefore more suitable for problems with complicated geometries. Integrated with the artificial compressibility approach, the GSM is extended to solve the incompressible flows. As an example, the flow simulation of carotid bifurcation is carried out to show the effectiveness of the proposed GSM-CFD solver. The blood is modeled as incompressible Newtonian fluid and the vessel is treated as rigid wall in this paper.
A real-time impurity solver for DMFT
NASA Astrophysics Data System (ADS)
Kim, Hyungwon; Aron, Camille; Han, Jong E.; Kotliar, Gabriel
Dynamical mean-field theory (DMFT) offers a non-perturbative approach to problems with strongly correlated electrons. The method heavily relies on the ability to numerically solve an auxiliary Anderson-type impurity problem. While powerful Matsubara-frequency solvers have been developed over the past two decades to tackle equilibrium situations, the status of real-time impurity solvers that could compete with Matsubara-frequency solvers and be readily generalizable to non-equilibrium situations is still premature. We present a real-time solver which is based on a quantum Master equation description of the dissipative dynamics of the impurity and its exact diagonalization. As a benchmark, we illustrate the strengths of our solver in the context of the equilibrium Mott-insulator transition of the one-band Hubbard model and compare it with iterative perturbation theory (IPT) method. Finally, we discuss its direct application to a nonequilibrium situation.
Shape reanalysis and sensitivities utilizing preconditioned iterative boundary solvers
NASA Technical Reports Server (NTRS)
Guru Prasad, K.; Kane, J. H.
1992-01-01
The computational advantages associated with the utilization of preconditioned iterative equation solvers are quantified for the reanalysis of perturbed shapes using continuum structural boundary element analysis (BEA). Both single- and multi-zone three-dimensional problems are examined. Significant reductions in computer time are obtained by making use of previously computed solution vectors and preconditioners in subsequent analyses. The effectiveness of this technique is demonstrated for the computation of shape response sensitivities required in shape optimization. Computer times and accuracies achieved using the preconditioned iterative solvers are compared with those obtained via direct solvers and implicit differentiation of the boundary integral equations. It is concluded that this approach employing preconditioned iterative equation solvers in reanalysis and sensitivity analysis can be competitive with, if not superior to, those involving direct solvers.
General purpose nonlinear system solver based on Newton-Krylov method.
2013-12-01
KINSOL is part of a software family called SUNDIALS: SUite of Nonlinear and Differential/Algebraic equation Solvers [1]. KINSOL is a general-purpose nonlinear system solver based on Newton-Krylov and fixed-point solver technologies [2].
Optimising a parallel conjugate gradient solver
Field, M.R.
1996-12-31
This work arises from the introduction of a parallel iterative solver to a large structural analysis finite element code. The code is called FEX and it was developed at Hitachi's Mechanical Engineering Laboratory. The FEX package can deal with a large range of structural analysis problems using a large number of finite element techniques. FEX can solve either stress or thermal analysis problems of a range of different types from plane stress to a full three-dimensional model. These problems can consist of a number of different materials which can be modelled by a range of material models. The structure being modelled can have the load applied at either a point or a surface, or by a pressure, a centrifugal force or just gravity. Alternatively a thermal load can be applied with a given initial temperature. The displacement of the structure can be constrained by having a fixed boundary or by prescribing the displacement at a boundary.
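For readers unfamiliar with the solver named in the title of this record, the following is a minimal serial sketch of the textbook conjugate gradient iteration for a symmetric positive definite system. It is not the FEX implementation, which the abstract does not describe; the function name `conjugate_gradient`, the dense list-of-lists matrix format, and the 2x2 test system are assumptions made purely for illustration.

```python
import math

def conjugate_gradient(A, b, tol=1e-12, max_iter=100):
    """Textbook CG for a symmetric positive definite matrix A (list of rows)."""
    n = len(b)
    x = [0.0] * n                      # start from the zero vector
    r = list(b)                        # residual r = b - A x = b
    p = list(r)                        # first search direction
    rs = sum(ri * ri for ri in r)
    if math.sqrt(rs) < tol:
        return x
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        if math.sqrt(rs_new) < tol:    # converged
            break
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x

# Illustrative 2x2 SPD system (hypothetical data, not from the report).
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

In a parallel setting the matrix-vector product and the dot products are the operations that would typically be distributed; the sketch above keeps everything serial for clarity.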
A composite grid solver for conjugate heat transfer in fluid-structure systems
Henshaw, William D.; Chand, Kyle K.
2009-06-01
We describe a numerical method for modeling temperature-dependent fluid flow coupled to heat transfer in solids. This approach to conjugate heat transfer can be used to compute transient and steady state solutions to a wide range of fluid-solid systems in complex two- and three-dimensional geometry. Fluids are modeled with the temperature-dependent incompressible Navier-Stokes equations using the Boussinesq approximation. Solids with heat transfer are modeled with the heat equation. Appropriate interface equations are applied to couple the solutions across different domains. The computational region is divided into a number of sub-domains corresponding to fluid domains and solid domains. There may be multiple fluid domains and multiple solid domains. Each fluid or solid sub-domain is discretized with an overlapping grid. The entire region is associated with a composite grid which is the union of the overlapping grids for the sub-domains. A different physics solver (fluid solver or solid solver) is associated with each sub-domain. A higher-level multi-domain solver manages the entire solution process. We propose and analyze some centered discrete approximations to the interface equations that have some desirable stability properties. The coupled interface equations may be solved directly when using explicit time-stepping methods in the sub-domains, resulting in a strongly coupled approach. The stability of the interface treatment in this case is independent of the relative sizes of the material properties in the two domains with the time-step only depending on the usual von Neumann conditions for each sub-domain. For implicit time-stepping methods we solve the interface equations in a weakly coupled fashion to avoid forming a coupled implicit system across all sub-domains. The convergence of this approach does depend on the relative sizes of the thermal conductivities and diffusivities. We analyze different iteration strategies for solving these implicit equations
Comparison of open-source linear programming solvers.
Gearhart, Jared Lee; Adair, Kristin Lynn; Durfee, Justin David; Jones, Katherine A.; Martin, Nathaniel; Detry, Richard Joseph
2013-10-01
When developing linear programming models, issues such as budget limitations, customer requirements, or licensing may preclude the use of commercial linear programming solvers. In such cases, one option is to use an open-source linear programming solver. A survey of linear programming tools was conducted to identify potential open-source solvers. From this survey, four open-source solvers were tested using a collection of linear programming test problems and the results were compared to IBM ILOG CPLEX Optimizer (CPLEX) [1], an industry standard. The solvers considered were: COIN-OR Linear Programming (CLP) [2], [3], GNU Linear Programming Kit (GLPK) [4], lp_solve [5] and Modular In-core Nonlinear Optimization System (MINOS) [6]. As no open-source solver outperforms CPLEX, this study demonstrates the power of commercial linear programming software. CLP was found to be the top performing open-source solver considered in terms of capability and speed. GLPK also performed well but cannot match the speed of CLP or CPLEX. lp_solve and MINOS were considerably slower and encountered issues when solving several test problems.
Neutrino transport in type II supernovae: Boltzmann solver vs. Monte Carlo method
NASA Astrophysics Data System (ADS)
Yamada, Shoichi; Janka, Hans-Thomas; Suzuki, Hideyuki
1999-04-01
We have coded a Boltzmann solver based on a finite difference scheme (S_N method) aiming at calculations of neutrino transport in type II supernovae. Close comparison between the Boltzmann solver and a Monte Carlo transport code has been made for realistic atmospheres of post bounce core models under the assumption of a static background. We have also investigated in detail the dependence of the results on the numbers of radial, angular, and energy grid points and the way to discretize the spatial advection term which is used in the Boltzmann solver. A general relativistic calculation has been done for one of the models. We find good overall agreement between the two methods. This gives credibility to both methods which are based on completely different formulations. In particular, the number and energy fluxes and the mean energies of the neutrinos show remarkably good agreement, because these quantities are determined in a region where the angular distribution of the neutrinos is nearly isotropic and they are essentially frozen in later on. On the other hand, because of a relatively small number of angular grid points (which is inevitable due to limitations of the computation time) the Boltzmann solver tends to slightly underestimate the flux factor and the Eddington factor outside the (mean) "neutrinosphere" where the angular distribution of the neutrinos becomes highly anisotropic. As a result, the neutrino number (and energy) density is somewhat overestimated in this region. This fact suggests that the Boltzmann solver should be applied to calculations of the neutrino heating in the hot-bubble region with some caution because there might be a tendency to overestimate the energy deposition rate in disadvantageous situations. A comparison shows that this trend is opposite to the results obtained with a multi-group flux-limited diffusion approximation of neutrino transport. Employing three different flux limiters, we find that all of them lead to a significant
A dynamic-solver-consistent minimum action method: With an application to 2D Navier-Stokes equations
NASA Astrophysics Data System (ADS)
Wan, Xiaoliang; Yu, Haijun
2017-02-01
This paper discusses the necessity and strategy to unify the development of a dynamic solver and a minimum action method (MAM) for a spatially extended system when employing the large deviation principle (LDP) to study the effects of small random perturbations. A dynamic solver is used to approximate the unperturbed system, and a minimum action method is used to approximate the LDP, which corresponds to solving an Euler-Lagrange equation related to but more complicated than the unperturbed system. We will clarify possible inconsistencies induced by independent numerical approximations of the unperturbed system and the LDP, based on which we propose to define both the dynamic solver and the MAM on the same approximation space for spatial discretization. The semi-discrete LDP can then be regarded as the exact LDP of the semi-discrete unperturbed system, which is a finite-dimensional ODE system. We achieve this methodology for the two-dimensional Navier-Stokes equations using a divergence-free approximation space. The method developed can be used to study the nonlinear instability of wall-bounded parallel shear flows, and be generalized straightforwardly to three-dimensional cases. Numerical experiments are presented.
Robust large-scale parallel nonlinear solvers for simulations.
Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson
2005-11-01
This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write
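The abstract above notes that Broyden's method replaces the Jacobian with an approximation that is updated from iterate to iterate. As a hedged illustration of that idea only, the sketch below applies the classical ("good") Broyden rank-one update to a small two-equation system. It is a dense, full-memory toy, not the limited-memory large-scale method developed in the report; the function name `broyden_2d`, the test system, and the starting point are assumptions chosen for illustration.

```python
def broyden_2d(F, x0, B0, tol=1e-10, max_iter=50):
    """Classical Broyden iteration for a 2x2 nonlinear system F(x) = 0."""
    x = list(x0)
    B = [list(row) for row in B0]      # working copy of the Jacobian approximation
    fx = F(x)
    for _ in range(max_iter):
        if max(abs(fx[0]), abs(fx[1])) < tol:
            break
        # Solve B s = -F(x) with Cramer's rule (adequate for a 2x2 sketch).
        det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
        s = [(-fx[0] * B[1][1] + fx[1] * B[0][1]) / det,
             (-B[0][0] * fx[1] + B[1][0] * fx[0]) / det]
        x = [x[0] + s[0], x[1] + s[1]]
        f_new = F(x)
        y = [f_new[0] - fx[0], f_new[1] - fx[1]]
        Bs = [B[0][0] * s[0] + B[0][1] * s[1],
              B[1][0] * s[0] + B[1][1] * s[1]]
        ss = s[0] * s[0] + s[1] * s[1]
        # Rank-one secant update: B <- B + (y - B s) s^T / (s^T s).
        for i in range(2):
            for j in range(2):
                B[i][j] += (y[i] - Bs[i]) * s[j] / ss
        fx = f_new
    return x

def example_system(x):
    # Hypothetical test problem: a line intersecting a circle of radius 3.
    return [x[0] + x[1] - 3.0, x[0] ** 2 + x[1] ** 2 - 9.0]

# Start at (1, 5) with the exact Jacobian at that point as the initial B.
root = broyden_2d(example_system, [1.0, 5.0], [[1.0, 1.0], [2.0, 10.0]])
```

Only the initial matrix uses derivative information here; every later step relies on function values alone, which is the property the abstract highlights for codes whose Jacobian is unavailable or inaccurate.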
An iterative solver for the 3D Helmholtz equation
NASA Astrophysics Data System (ADS)
Belonosov, Mikhail; Dmitriev, Maxim; Kostin, Victor; Neklyudov, Dmitry; Tcheverda, Vladimir
2017-09-01
We develop a frequency-domain iterative solver for numerical simulation of acoustic waves in 3D heterogeneous media. It is based on the application of a unique preconditioner to the Helmholtz equation that ensures convergence for Krylov subspace iteration methods. Effective inversion of the preconditioner involves the Fast Fourier Transform (FFT) and numerical solution of a series of boundary value problems for ordinary differential equations. Matrix-by-vector multiplication for iterative inversion of the preconditioned matrix involves inversion of the preconditioner and pointwise multiplication of grid functions. Our solver has been verified by benchmarking against exact solutions and a time-domain solver.
GPU accelerated kinetic solvers for rarefied gas dynamics
NASA Astrophysics Data System (ADS)
Zabelok, Sergey A.; Kolobov, Vladimir I.; Arslanbekov, Robert R.
2012-11-01
GPU-acceleration is applied to the Boltzmann solver with adaptive Cartesian mesh in the Unified Flow Solver framework. NVIDIA CUDA technology is used with threads being grouped in thread blocks by points of Korobov sequences in each cell for computing the collision integral and by points in coordinate space for the free-molecular flow stage. GPU-accelerated Boltzmann solver with octree Cartesian mesh has been tested on several computer systems. Speedup of several times for GPU-based code compared to single-core CPU computations on the same machines has been observed.
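The abstract above mentions points of Korobov sequences as the nodes used for the collision integral. As a rough sketch of the standard construction only (a rank-1 lattice with generating vector (1, a, a^2, ...) mod n), the snippet below generates such points in the unit cube. The function name `korobov_points` and the parameter values are illustrative assumptions; they are not taken from the Unified Flow Solver.

```python
def korobov_points(n, a, dim):
    """Return the n points of a Korobov lattice in [0, 1)^dim.

    Point k has coordinates frac(k * a**j / n) for j = 0 .. dim-1.
    """
    z = [pow(a, j, n) for j in range(dim)]          # generating vector mod n
    return [[((k * zj) % n) / n for zj in z] for k in range(n)]

# Hypothetical small example: 8 points in 2D with multiplier a = 3.
points = korobov_points(8, 3, 2)

# Equal-weight quadrature over the lattice, here for f(x, y) = x + y.
estimate = sum(p[0] + p[1] for p in points) / len(points)
```

Because every point is independent of the others, the evaluation of an integrand at the lattice nodes maps naturally onto one GPU thread per point, which is consistent with the thread grouping described in the abstract.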
A high order multi-resolution solver for the Poisson equation with application to vortex methods
NASA Astrophysics Data System (ADS)
Hejlesen, Mads Mølholm; Spietz, Henrik Juul; Walther, Jens Honore
2015-11-01
A high order method is presented for solving the Poisson equation subject to mixed free-space and periodic boundary conditions by using fast Fourier transforms (FFT). The high order convergence is achieved by deriving mollified Green's functions from a high order regularization function which provides a correspondingly smooth solution to the Poisson equation. The high order regularization function may be obtained analogous to the approximate deconvolution method used in turbulence models and strongly relates to deblurring algorithms used in image processing. At first we show that the regularized solver can be combined with a short range particle-particle correction for evaluating discrete particle interactions in the context of a particle-particle particle-mesh (P3M) method. By a similar approach we extend the regularized solver to handle multi-resolution patches in continuum field simulations by super-positioning an inter-mesh correction. For sufficiently smooth vector fields this multi-resolution correction can be achieved without the loss of convergence rate. An implementation of the multi-resolution solver in a two-dimensional re-meshed particle-mesh based vortex method is presented and validated.
Development and Application of a Parallel Implicit Solver for Unsteady Viscous Flows
NASA Astrophysics Data System (ADS)
Morgan, P. E.; Visbal, M. R.; Sadayappan, P.
This work investigates the performance and application of a parallel version of a three-dimensional second-order time accurate Navier-Stokes solver based on an implicit approximate-factorization Beam-Warming algorithm. A systematic incremental approach for parallelizing the serial code was developed which ensures that the parallel version of the code produces identical results to the original serial code. The current parallel scheme decomposes the grid using two-dimensional multipartitioning to evenly distribute the work across multiple processors with parallel communication via Message-Passing Interface (MPI) library. The code's performance has been assessed on three supercomputers: the IBM SP2, IBM SP3 and the Silicon Graphics Origin 2000. The solver is validated for Couette flow, and both steady and unsteady flow over a circular cylinder. Additional applications include both two- and three-dimensional flow over a stationary and a rotationally oscillating circular cylinder. This new solver enables the efficient simulation of large-scale unsteady viscous flows employing grids containing on the order of 10^7 points using available parallel supercomputers.
Parallel O(N) Stokes' Solver Towards Scalable Brownian Dynamics in General Geometries.
Zhao, Xujun; Li, Jiyuan; Jiang, Xikai; ...
2017-01-01
An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. A scalable parallel computational approach is presented, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the General geometry Ewald-like method. Our approach employs a highly-efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem results in an O(N) parallel algorithm. We illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an Immersed Boundary approach.
A mimetic spectral element solver for the Grad-Shafranov equation
NASA Astrophysics Data System (ADS)
Palha, A.; Koren, B.; Felici, F.
2016-07-01
In this work we present a robust and accurate arbitrary order solver for the fixed-boundary plasma equilibria in toroidally axisymmetric geometries. To achieve this we apply the mimetic spectral element formulation presented in [56] to the solution of the Grad-Shafranov equation. This approach combines a finite volume discretization with the mixed finite element method. In this way the discrete differential operators (∇, ∇×, ∇·) can be represented exactly and metric and all approximation errors are present in the constitutive relations. The result of this formulation is an arbitrary order method even on highly curved meshes. Additionally, the integral of the toroidal current J_ϕ is exactly equal to the boundary integral of the poloidal field over the plasma boundary. This property can play an important role in the coupling between equilibrium and transport solvers. The proposed solver is tested on a varied set of plasma cross sections (smooth and with an X-point) and also for a wide range of pressure and toroidal magnetic flux profiles. Equilibria accurate up to machine precision are obtained. Optimal algebraic convergence rates of order p + 1 and geometric convergence rates are shown for Soloviev solutions (including high Shafranov shifts), field-reversed configuration (FRC) solutions and spheromak analytical solutions. The robustness of the method is demonstrated for non-linear test cases, in particular on an equilibrium solution with a pressure pedestal.
NASA Astrophysics Data System (ADS)
Balsara, Dinshaw S.; Vides, Jeaniffer; Gurski, Katharine; Nkonga, Boniface; Dumbser, Michael; Garain, Sudip; Audit, Edouard
2016-01-01
Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The self-similar formulation of Balsara [16] proves especially useful for this purpose. While that work is based on a Galerkin projection, in this paper we present an analogous self-similar formulation that is based on a different interpretation. In the present formulation, we interpret the shock jumps at the boundary of the strongly-interacting state quite literally. The enforcement of the shock jump conditions is done with a least squares projection (Vides, Nkonga and Audit [67]). With that interpretation, we again show that the multidimensional Riemann solver can be endowed with sub-structure. However, we find that the most efficient implementation arises when we use a flux vector splitting and a least squares projection. An alternative formulation that is based on the full characteristic matrices is also presented. The multidimensional Riemann solvers that are demonstrated here use one-dimensional HLLC Riemann solvers as building blocks. Several stringent test problems drawn from hydrodynamics and MHD are presented to show that the method works. Results from structured and unstructured meshes demonstrate the versatility of our method. The reader is also invited to watch a video introduction to multidimensional Riemann solvers on http://www.nd.edu/~dbalsara/Numerical-PDE-Course.
Flow Solver for Incompressible Rectangular Domains
NASA Technical Reports Server (NTRS)
Kalb, Virginia L.
2008-01-01
This is an extension of the Flow Solver for Incompressible 2-D Drive Cavity software described in the preceding article. It solves the Navier-Stokes equations for incompressible flow using finite differencing on a uniform, staggered grid. There is a runtime choice of either central differencing or modified upwinding for the convective term. The domain must be rectangular, but may have a rectangular walled region within it. Currently, the position of the interior region and exterior boundary conditions are changed by modifying parameters in the code and recompiling. These features make it possible to solve a variety of classical fluid flow problems such as an L-shaped cavity, channel flow, or wake flow past a square cylinder. The code uses fourth-order Runge-Kutta time-stepping and overall second-order spatial accuracy. This software permits the walled region to be positioned such that flow past a square cylinder, an L-shaped cavity, and the flow over a back-facing step can all be solved by reconfiguration. Also, this extension has an automatic detection of periodicity, as well as use of specialized data structure for ease of configuring domain decomposition and computing convergence in overlap regions.
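The abstract above states that the code advances the solution with fourth-order Runge-Kutta time-stepping. The sketch below shows one step of the classical four-stage scheme for a scalar ODE; it assumes the classical variant (the abstract does not say which RK4 tableau is used) and is not taken from the NASA software. The names `rk4_step` and `decay` are hypothetical.

```python
import math

def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one step of size h with classical RK4."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def decay(t, y):
    # Illustrative right-hand side: y' = -y, exact solution exp(-t).
    return -y

# Integrate from t = 0 to t = 1 in ten steps and compare with exp(-1).
t, y, h = 0.0, 1.0, 0.1
for _ in range(10):
    y = rk4_step(decay, t, y, h)
    t += h
error = abs(y - math.exp(-1.0))
```

In a flow solver the scalar `y` would be replaced by the full discrete velocity field and `f` by the spatially discretized right-hand side, but the four-stage structure of the step is the same.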
Advanced Multigrid Solvers for Fluid Dynamics
NASA Technical Reports Server (NTRS)
Brandt, Achi
1999-01-01
The main objective of this project has been to support the development of multigrid techniques in computational fluid dynamics that can achieve "textbook multigrid efficiency" (TME), which is several orders of magnitude faster than current industrial CFD solvers. Toward that goal we have assembled a detailed table which lists every foreseen kind of computational difficulty for achieving it, together with the possible ways for resolving the difficulty, their current state of development, and references. We have developed several codes to test and demonstrate, in the framework of simple model problems, several approaches for overcoming the most important of the listed difficulties that had not been resolved before. In particular, TME has been demonstrated for incompressible flows on one hand, and for near-sonic flows on the other hand. General approaches were advanced for the relaxation of stagnation points and boundary conditions under various situations. Also, new algebraic multigrid techniques were formed for treating unstructured grid formulations. More details on all these are given below.
Elliptic Solvers for Adaptive Mesh Refinement Grids
Quinlan, D.J.; Dendy, J.E., Jr.; Shapira, Y.
1999-06-03
We are developing multigrid methods that will efficiently solve elliptic problems with anisotropic and discontinuous coefficients on adaptive grids. The final product will be a library that provides for the simplified solution of such problems. This library will directly benefit the efforts of other Laboratory groups. The focus of this work is research on serial and parallel elliptic algorithms and the inclusion of our black-box multigrid techniques into this new setting. The approach applies the Los Alamos object-oriented class libraries that greatly simplify the development of serial and parallel adaptive mesh refinement applications. In the final year of this LDRD, we focused on putting the software together; in particular we completed the final AMR++ library, we wrote tutorials and manuals, and we built example applications. We implemented the Fast Adaptive Composite Grid method as the principal elliptic solver. We presented results at the Overset Grid Conference and other more AMR specific conferences. We worked on optimization of serial and parallel performance and published several papers on the details of this work. Performance remains an important issue and is the subject of continuing research work.
Generic task problem solvers in Soar
NASA Technical Reports Server (NTRS)
Johnson, Todd R.; Smith, Jack W., Jr.; Chandrasekaran, B.
1989-01-01
Two trends can be discerned in research in problem solving architectures in the last few years. On one hand, interest in task-specific architectures has grown, wherein types of problems of general utility are identified, and special architectures that support the development of problem solving systems for those types of problems are proposed. These architectures help in the acquisition and specification of knowledge by providing inference methods that are appropriate for the type of problem. However, knowledge based systems which use only one type of problem solving method are very brittle, and adding more types of methods requires a principled approach to integrating them in a flexible way. Contrasting with this trend is the proposal for a flexible, general architecture contained in the work on Soar. Soar has features which make it attractive for flexible use of all potentially relevant knowledge or methods. But as a theory, Soar does not make commitments to specific types of problem solvers or provide guidance for their construction. It was investigated how task-specific architectures can be constructed in Soar to retain as many of the advantages as possible of both approaches. Examples were used from the Generic Task approach for building knowledge based systems. Though this approach was developed and applied for a number of problems, the ideas are applicable to other task-specific approaches as well.
NASA Technical Reports Server (NTRS)
Raju, Manthena S.
1998-01-01
Sprays occur in a wide variety of industrial and power applications and in the processing of materials. A liquid spray is a two-phase flow with a gas as the continuous phase and a liquid as the dispersed phase (in the form of droplets or ligaments). Interactions between the two phases, which are coupled through exchanges of mass, momentum, and energy, can occur in different ways at different times and locations involving various thermal, mass, and fluid dynamic factors. An understanding of the flow, combustion, and thermal properties of a rapidly vaporizing spray requires careful modeling of the rate-controlling processes associated with the spray's turbulent transport, mixing, chemical kinetics, evaporation, and spreading rates, as well as other phenomena. In an attempt to advance the state-of-the-art in multidimensional numerical methods, we at the NASA Lewis Research Center extended our previous work on sprays to unstructured grids and parallel computing. LSPRAY, which was developed by M.S. Raju of Nyma, Inc., is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and/or Monte Carlo probability density function (PDF) solver. The LSPRAY solver accommodates the use of an unstructured mesh with mixed triangular, quadrilateral, and/or tetrahedral elements in the gas-phase solvers. It is used specifically for fuel sprays within gas turbine combustors, but it has many other uses. The spray model used in LSPRAY provided favorable results when applied to stratified-charge rotary combustion (Wankel) engines and several other confined and unconfined spray flames. The source code will be available with the National Combustion Code (NCC) as a complete package.
Performance of NASA Equation Solvers on Computational Mechanics Applications
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1996-01-01
This paper describes the performance of a new family of NASA-developed equation solvers used for large-scale (i.e., 551,705 equations) structural analysis. To minimize computer time and memory, the solvers are divided by application and matrix characteristics (sparse/dense, real/complex, symmetric/nonsymmetric, size: in-core/out-of-core) and exploit the hardware features of current and future computers. In this paper, the equation solvers, which are written in FORTRAN and are therefore easily transportable, are shown to be faster than specialized computer library routines utilizing assembly code. Twenty NASA structural benchmark models with NASA solver timings reside on the World Wide Web with a challenge to beat them.
Experiences Running a Parallel Answer Set Solver on Blue Gene
NASA Astrophysics Data System (ADS)
Schneidenbach, Lars; Schnor, Bettina; Gebser, Martin; Kaminski, Roland; Kaufmann, Benjamin; Schaub, Torsten
This paper presents the concept of parallelisation of a solver for Answer Set Programming (ASP). While there already exist some approaches to parallel ASP solving, there was a lack of a parallel version of the powerful clasp solver. We implemented a parallel version of clasp based on message-passing. Experimental results on Blue Gene P/L indicate the potential of such an approach.
NASA Astrophysics Data System (ADS)
Antón, Luis; Miralles, Juan A.; Martí, José M.; Ibáñez, José M.; Aloy, Miguel A.; Mimica, Petar
2010-05-01
We obtain renormalized sets of right and left eigenvectors of the flux vector Jacobians of the relativistic MHD equations, which are regular and span a complete basis in any physical state including degenerate ones. The renormalization procedure relies on the characterization of the degeneracy types in terms of the normal and tangential components of the magnetic field to the wave front in the fluid rest frame. Proper expressions of the renormalized eigenvectors in conserved variables are obtained through the corresponding matrix transformations. Our work completes previous analyses that present different sets of right eigenvectors for non-degenerate and degenerate states, and can be seen as a relativistic generalization of earlier work performed in classical MHD. Based on the full wave decomposition (FWD) provided by the renormalized set of eigenvectors in conserved variables, we have also developed a linearized (Roe-type) Riemann solver. Extensive testing against one- and two-dimensional standard numerical problems allows us to conclude that our solver is very robust. When compared with a family of simpler solvers that avoid the knowledge of the full characteristic structure of the equations in the computation of the numerical fluxes, our solver turns out to be less diffusive than HLL and HLLC, and comparable in accuracy to the HLLD solver. The number of operations needed by the FWD solver makes it less efficient computationally than those of the HLL family in one-dimensional problems. However, its relative efficiency increases in multidimensional simulations.
Multilevel solvers of first-order system least-squares for Stokes equations
Lai, Chen-Yao G.
1996-12-31
Recently, the use of the first-order system least-squares principle for the approximate solution of Stokes problems has been extensively studied by Cai, Manteuffel, and McCormick. In this paper, we study multilevel solvers of the first-order system least-squares method for the generalized Stokes equations based on the velocity-vorticity-pressure formulation in three dimensions. The least-squares functional is defined to be the sum of the L^2-norms of the residuals, which is weighted appropriately by the Reynolds number. We develop convergence analysis for additive and multiplicative multilevel methods applied to the resulting discrete equations.
Benchmarking transport solvers for fracture flow problems
NASA Astrophysics Data System (ADS)
Olkiewicz, Piotr; Dabrowski, Marcin
2015-04-01
Fracture flow may dominate in rocks with low porosity and it can accompany both industrial and natural processes. Typical examples of such processes are natural flows in crystalline rocks and industrial flows in geothermal systems or hydraulic fracturing. Fracture flow provides an important mechanism for transporting mass and energy. For example, geothermal energy is primarily transported by the flow of the heated water or steam rather than by thermal diffusion. The geometry of the fracture network and the distribution of the mean apertures of individual fractures are the key parameters with regard to the fracture network transmissivity. Transport in fractures can occur through the combination of advection and diffusion processes, as in the case of dissolved chemical components. The local distribution of the fracture aperture may play an important role for both flow and transport processes. In this work, we benchmark various numerical solvers for flow and transport processes in a single fracture in 2D and 3D. Fracture aperture distributions are generated by a number of synthetic methods. We examine a single-phase flow of an incompressible viscous Newtonian fluid in the low Reynolds number limit. Periodic boundary conditions are used and a pressure difference is imposed in the background. The velocity field is primarily found using the Stokes equations. We systematically compare the obtained velocity field to the results obtained by solving the Reynolds equation. This allows us to examine the impact of the aperture distribution on the permeability of the medium and the local velocity distribution for two different mathematical descriptions of the fracture flow. Furthermore, we analyse the impact of aperture distribution on the front characteristics such as the standard deviation and the fractal dimension for systems in 2D and 3D.
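To make the Reynolds-equation side of this comparison concrete, the sketch below computes an effective hydraulic aperture for a 1-D channel made of equal-length segments in series. It is an illustration under stated assumptions, not the authors' benchmark: the function names are hypothetical, the geometry is reduced to one dimension, and the local flux per unit width is taken from the lubrication (cubic-law) form q = -(h^3 / (12 mu)) dp/dx.

```python
def hydraulic_aperture(apertures):
    """Effective aperture of equal-length segments connected in series.

    With a constant flux q through every segment, the pressure drops add,
    so h_eff**3 is the harmonic mean of the local h**3 values.
    """
    if not apertures or any(h <= 0.0 for h in apertures):
        raise ValueError("apertures must be a non-empty list of positive values")
    mean_inverse_cube = sum(1.0 / h**3 for h in apertures) / len(apertures)
    return (1.0 / mean_inverse_cube) ** (1.0 / 3.0)


def flux_per_unit_width(apertures, length, pressure_drop, viscosity):
    """Magnitude of the cubic-law flux through the whole channel."""
    h_eff = hydraulic_aperture(apertures)
    return h_eff**3 / (12.0 * viscosity) * pressure_drop / length
```

Because of the harmonic mean, the narrowest segments control the result: in this simplified setting a rough profile transmits less than a smooth one with the same arithmetic mean aperture.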
A Comparative Study of Randomized Constraint Solvers for Random-Symbolic Testing
NASA Technical Reports Server (NTRS)
Takaki, Mitsuo; Cavalcanti, Diego; Gheyi, Rohit; Iyoda, Juliano; d'Amorim, Marcelo; Prudencio, Ricardo
2009-01-01
The complexity of constraints is a major obstacle for constraint-based software verification. Automatic constraint solvers are fundamentally incomplete: input constraints often build on some undecidable theory or some theory the solver does not support. This paper proposes and evaluates several randomized solvers to address this issue. We compare the effectiveness of a symbolic solver (CVC3), a random solver, three hybrid solvers (i.e., a mix of random and symbolic), and two heuristic search solvers. We evaluate the solvers on two benchmarks: one consisting of manually generated constraints and another generated with a concolic execution of 8 subjects. In addition to fully decidable constraints, the benchmarks include constraints with non-linear integer arithmetic, integer modulo and division, bitwise arithmetic, and floating-point arithmetic. As expected, symbolic solving (in particular, CVC3) subsumes the other solvers for the concolic execution of subjects that only generate decidable constraints. For the remaining subjects the solvers are complementary.
Quantitative analysis of numerical solvers for oscillatory biomolecular system models
Quo, Chang F; Wang, May D
2008-01-01
Background: This article provides guidelines for selecting optimal numerical solvers for biomolecular system models. Because various parameters of the same system could have drastically different ranges from 10^-15 to 10^10, the ODEs can be stiff and ill-conditioned, resulting in non-unique, non-existing, or non-reproducible modeling solutions. Previous studies have not examined in depth how to best select numerical solvers for biomolecular system models, which makes it difficult to experimentally validate the modeling results. To address this problem, we have chosen one of the well-known stiff initial value problems with limit cycle behavior as a test-bed system model. Solving this model, we have illustrated that different answers may result from different numerical solvers. We use MATLAB numerical solvers because they are optimized and widely used by the modeling community. We have also conducted a systematic study of numerical solver performances by using qualitative and quantitative measures such as convergence, accuracy, and computational cost (i.e., in terms of function evaluation, partial derivative, LU decomposition, and "take-off" points). The results show that the modeling solutions can be drastically different using different numerical solvers. Thus, it is important to intelligently select numerical solvers when solving biomolecular system models. Results: The classic Belousov-Zhabotinskii (BZ) reaction is described by the Oregonator model and is used as a case study. We report two guidelines in selecting optimal numerical solver(s) for stiff, complex oscillatory systems: (i) for problems with unknown parameters, ode45 is the optimal choice regardless of the relative error tolerance; (ii) for known stiff problems, both ode113 and ode15s are good choices under strict relative tolerance conditions. Conclusions: For any given biomolecular model, by building a library of numerical solvers with quantitative performance assessment metric, we show that it is possible
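Why different solvers can give drastically different answers on a stiff problem can be illustrated with a much simpler model than the Oregonator. The sketch below is an assumption-laden stand-in, not the study's MATLAB setup: it applies explicit and implicit Euler (hypothetical helper names) to the linear test equation y' = -lam * y, whose exact solution decays to zero.

```python
def explicit_euler(lam, y0, h, steps):
    """Explicit Euler for y' = -lam * y; stable only if h * lam <= 2."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def implicit_euler(lam, y0, h, steps):
    """Implicit (backward) Euler for y' = -lam * y.

    Solving y_new = y + h * (-lam * y_new) for y_new gives the update
    below, which decays for every positive step size.
    """
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y
```

With lam = 1000 and h = 0.01, each explicit step multiplies the solution by -9, so the computed values grow without bound, while the implicit update shrinks them by a factor of 11 per step. Stiff solvers rely on implicit formulas of this kind, at the cost of solving an equation at every step.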
A Comparison of Stiff ODE Solvers for Astrochemical Kinetics Problems
NASA Astrophysics Data System (ADS)
Nejad, Lida A. M.
2005-09-01
The time dependent chemical rate equations arising from astrochemical kinetics problems are described by a system of stiff ordinary differential equations (ODEs). In this paper, using three astrochemical models of varying physical and computational complexity, and hence different degrees of stiffness, we present a comprehensive performance survey of a set of well-established ODE solver packages from the ODEPACK collection, namely LSODE, LSODES, VODE and VODPK. For completeness, we include results from the GEAR package in one of the test models. The results demonstrate that significant performance improvements can be obtained over GEAR, which is still being used by many astrochemists by default. We show that a simple appropriate ordering of the species set results in a substantial improvement in the performance of the tested ODE solvers. The sparsity of the associated Jacobian matrix can be exploited and results using the sparse direct solver routine LSODES show an extensive reduction in CPU time without any loss in accuracy. We compare the performance and the computed abundances of one model with a 175 species set and a reduced set of 88 species, keeping all physical and chemical parameters identical with both sets. We found that the calculated abundances using two different size models agree quite well. However, with no extra computational effort and more reliable results, it is possible for the computation to be many times faster with the larger species set than the reduced set, depending on the use of solvers, the ordering and the chosen options. It is also shown that though a particular solver with certain chosen parameters may have severe difficulty or even fail to complete a run over the required integration time, another solver can easily complete the run with a wider range of control parameters and options. As a result of the superior performance of LSODES for the solution of astrochemical kinetics systems, we have tailor-made a sparse version of the VODE
Solving Upwind-Biased Discretizations. 2; Multigrid Solver Using Semicoarsening
NASA Technical Reports Server (NTRS)
Diskin, Boris
1999-01-01
This paper studies a novel multigrid approach to the solution for a second order upwind biased discretization of the convection equation in two dimensions. This approach is based on semi-coarsening and well balanced explicit correction terms added to coarse-grid operators to maintain on coarse grids the same cross-characteristic interaction as on the target (fine) grid. Colored relaxation schemes are used on all the levels, allowing a very efficient parallel implementation. The results of the numerical tests can be summarized as follows: 1) The residual asymptotic convergence rate of the proposed V(0, 2) multigrid cycle is about 3 per cycle. This convergence rate far surpasses the theoretical limit (4/3) predicted for standard multigrid algorithms using full coarsening. The reported efficiency does not deteriorate with increasing the cycle depth (number of levels) and/or refining the target-grid mesh spacing. 2) The full multigrid algorithm (FMG) with two V(0, 2) cycles on the target grid and just one V(0, 2) cycle on all the coarse grids always provides an approximate solution with the algebraic error less than the discretization error. Estimates of the total work in the FMG algorithm range between 18 and 30 minimal work units (depending on the target discretization). Thus, the overall efficiency of the FMG solver closely approaches (if it does not achieve) the goal of the textbook multigrid efficiency. 3) A novel approach to deriving a discrete solution approximating the true continuous solution with a relative accuracy given in advance is developed. An adaptive multigrid algorithm (AMA) using comparison of the solutions on two successive target grids to estimate the accuracy of the current target-grid solution is defined. A desired relative accuracy is accepted as an input parameter. The final target grid on which this accuracy can be achieved is chosen automatically in the solution process. The actual relative accuracy of the discrete solution approximation
Euler/Navier-Stokes Solvers Applied to Ducted Fan Configurations
NASA Technical Reports Server (NTRS)
Keith, Theo G., Jr.; Srivastava, Rakesh
1997-01-01
Due to noise considerations, ultra high bypass ducted fans have become a more viable design. These ducted fans typically consist of a rotor stage containing a wide chord fan and a stator stage. One of the concerns for this design is the classical flutter that keeps occurring in various unducted fan blade designs. Such flutter is catastrophic and is to be avoided in the flight envelope of the engine. Some numerical investigations by Williams, Cho and Dalton have suggested that a duct around a propeller makes it more unstable. This needs to be further investigated. In order to design an engine to safely perform a set of desired tasks, accurate information on the stresses on the blade during the entire cycle of blade motion is required. This requirement in turn demands that accurate knowledge of steady and unsteady blade loading be available. Aerodynamic solvers based on unsteady three-dimensional analysis will provide accurate and fast solutions and are best suited for aeroelastic analysis. The Euler solvers capture significant physics of the flowfield and are reasonably fast. An aerodynamic solver based on the Euler equations had been developed under a separate grant from NASA Lewis in the past. Under the current grant, this solver has been modified to calculate the aeroelastic characteristics of unducted and ducted rotors. Even though the aeroelastic solver based on the three-dimensional Euler equations is computationally efficient, it is still very expensive to investigate the effects of multiple stages on the aeroelastic characteristics. In order to investigate the effects of multiple stages, a two-dimensional multi-stage aeroelastic solver was also developed under this task, in collaboration with Dr. T. S. R. Reddy of the University of Toledo. Both of these solvers were applied to several test cases and validated against experimental data, where available.
Performance Models for the Spike Banded Linear System Solver
Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; ...
2011-01-01
With the availability of large-scale parallel platforms comprised of tens of thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners, compared to the state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, (iii) we show excellent prediction capabilities of our model, based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated
A robust HLLC-type Riemann solver for strong shock
NASA Astrophysics Data System (ADS)
Shen, Zhijun; Yan, Wei; Yuan, Guangwei
2016-03-01
It is well known that for the Eulerian equations the numerical schemes that can accurately capture contact discontinuities usually suffer from the disastrous carbuncle phenomenon, while some more dissipative schemes, such as the HLL scheme, are free from this kind of shock instability. Hybrid schemes that combine a dissipative flux with a less dissipative flux can cure the shock instability, but may also lead to other problems, such as a certain arbitrariness in choosing switching parameters or the contact interface becoming smeared. In order to overcome these drawbacks, this paper proposes a simple and robust HLLC-type Riemann solver for inviscid, compressible gas flows, which is capable of preserving a sharp contact surface and is free from instability. The main work is to construct an HLL-type Riemann solver and an HLLC-type Riemann solver by modifying the shear viscosity of the original HLL and HLLC methods. Both new schemes are positively conservative under some typical wavespeed estimations. Moreover, a linear matrix stability analysis for the proposed schemes is accomplished, which illustrates that the HLLC-type solver with shear viscosity is stable whereas the HLL-type solver with vorticity wave is unstable. Our arguments and numerical experiments demonstrate that the inadequate dissipation associated with the shear wave may be a unique reason to cause the instability.
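For readers unfamiliar with the baseline that this work modifies, the following is a minimal sketch of the standard HLL flux for the 1-D Euler equations of an ideal gas. It is not the modified HLL-type or HLLC-type solver proposed in the paper. The function names, the value of gamma, and the simple min/max wave-speed estimate (often attributed to Davis) are assumptions chosen for this illustration.

```python
import math

GAMMA = 1.4  # ratio of specific heats, assumed ideal diatomic gas


def conserved(rho, u, p, gamma=GAMMA):
    """Conserved variables (density, momentum, total energy)."""
    energy = p / (gamma - 1.0) + 0.5 * rho * u * u
    return (rho, rho * u, energy)


def physical_flux(rho, u, p, gamma=GAMMA):
    """Exact Euler flux for a primitive state (rho, u, p)."""
    energy = p / (gamma - 1.0) + 0.5 * rho * u * u
    return (rho * u, rho * u * u + p, u * (energy + p))


def hll_flux(left, right, gamma=GAMMA):
    """Standard HLL flux between primitive states left and right."""
    rho_l, u_l, p_l = left
    rho_r, u_r, p_r = right
    a_l = math.sqrt(gamma * p_l / rho_l)  # sound speeds
    a_r = math.sqrt(gamma * p_r / rho_r)
    s_l = min(u_l - a_l, u_r - a_r)       # slowest wave estimate
    s_r = max(u_l + a_l, u_r + a_r)       # fastest wave estimate
    f_l = physical_flux(*left, gamma=gamma)
    f_r = physical_flux(*right, gamma=gamma)
    if s_l >= 0.0:
        return f_l                        # all waves move right
    if s_r <= 0.0:
        return f_r                        # all waves move left
    q_l = conserved(*left, gamma=gamma)
    q_r = conserved(*right, gamma=gamma)
    return tuple(
        (s_r * fl - s_l * fr + s_l * s_r * (qr - ql)) / (s_r - s_l)
        for fl, fr, ql, qr in zip(f_l, f_r, q_l, q_r)
    )
```

HLL keeps only the two outermost waves, which is why it smears contact discontinuities; HLLC restores the middle wave, at the price of the shock instability that the abstract discusses.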
Adaptive kinetic-fluid solvers for heterogeneous computing architectures
NASA Astrophysics Data System (ADS)
Zabelok, Sergey; Arslanbekov, Robert; Kolobov, Vladimir
2015-12-01
We show the feasibility and benefits of porting an adaptive multi-scale kinetic-fluid code to CPU-GPU systems. Challenges are due to the irregular data access for adaptive Cartesian mesh, the vast difference in computational cost between kinetic and fluid cells, and the desire to evenly load all CPUs and GPUs during grid adaptation and algorithm refinement. Our Unified Flow Solver (UFS) combines Adaptive Mesh Refinement (AMR) with automatic cell-by-cell selection of kinetic or fluid solvers based on continuum breakdown criteria. Using GPUs enables hybrid simulations of mixed rarefied-continuum flows with a million Boltzmann cells, each having a 24 × 24 × 24 velocity mesh. We describe the implementation of CUDA kernels for three modules in UFS: the direct Boltzmann solver using the discrete velocity method (DVM), the Direct Simulation Monte Carlo (DSMC) solver, and a mesoscopic solver based on the Lattice Boltzmann Method (LBM), all using adaptive Cartesian mesh. Double-digit speedups on a single GPU and good scaling for multi-GPUs have been demonstrated.
The novel high-performance 3-D MT inverse solver
NASA Astrophysics Data System (ADS)
Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey
2016-04-01
We present a novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in a multi-language paradigm to make it as efficient, readable and maintainable as possible. The separation of concerns and single responsibility concepts run through the implementation of the solver. As a forward modelling engine, a modern scalable solver, extrEMe, based on the contracting integral equation approach, is used. An iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for the (regularized) inverse problem solution, and the adjoint source approach is used to calculate the gradient of the misfit efficiently. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize an inverse domain, the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms ranging from modern laptops to HPC Piz Daint (6th supercomputer in the world) demonstrate practically linear scalability of the code up to thousands of nodes.
Interpolation and Approximation Theory.
ERIC Educational Resources Information Center
Kaijser, Sten
1991-01-01
Introduced are the basic ideas of interpolation and approximation theory through a combination of theory and exercises written for extramural education at the university level. Topics treated are spline methods, Lagrange interpolation, trigonometric approximation, Fourier series, and polynomial approximation. (MDH)
Continuous-time quantum Monte Carlo impurity solvers
NASA Astrophysics Data System (ADS)
Gull, Emanuel; Werner, Philipp; Fuchs, Sebastian; Surer, Brigitte; Pruschke, Thomas; Troyer, Matthias
2011-04-01
Continuous-time quantum Monte Carlo impurity solvers are algorithms that sample the partition function of an impurity model using diagrammatic Monte Carlo techniques. The present paper describes codes that implement the interaction expansion algorithm originally developed by Rubtsov, Savkin, and Lichtenstein, as well as the hybridization expansion method developed by Werner, Millis, Troyer, et al. These impurity solvers are part of the ALPS-DMFT application package and are accompanied by an implementation of dynamical mean-field self-consistency equations for (single orbital single site) dynamical mean-field problems with arbitrary densities of states.
Program summary
Program title: dmft
Catalogue identifier: AEIL_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIL_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: ALPS LIBRARY LICENSE version 1.1
No. of lines in distributed program, including test data, etc.: 899 806
No. of bytes in distributed program, including test data, etc.: 32 153 916
Distribution format: tar.gz
Programming language: C++
Operating system: The ALPS libraries have been tested on the following platforms and compilers: Linux with GNU Compiler Collection (g++ version 3.1 and higher), and Intel C++ Compiler (icc version 7.0 and higher); MacOS X with GNU Compiler (g++ Apple-version 3.1, 3.3 and 4.0); IBM AIX with Visual Age C++ (xlC version 6.0) and GNU (g++ version 3.1 and higher) compilers; Compaq Tru64 UNIX with Compaq C++ Compiler (cxx); SGI IRIX with MIPSpro C++ Compiler (CC); HP-UX with HP C++ Compiler (aCC); Windows with Cygwin or coLinux platforms and GNU Compiler Collection (g++ version 3.1 and higher)
RAM: 10 MB-1 GB
Classification: 7.3
External routines: ALPS [1], BLAS/LAPACK, HDF5
Nature of problem: (See [2].) Quantum impurity models describe an atom or molecule embedded in a host material with which it can exchange electrons. They are basic to nanoscience as
An adaptive fast multipole accelerated Poisson solver for complex geometries
NASA Astrophysics Data System (ADS)
Askham, T.; Cerfon, A. J.
2017-09-01
We present a fast, direct and adaptive Poisson solver for complex two-dimensional geometries based on potential theory and fast multipole acceleration. More precisely, the solver relies on the standard decomposition of the solution as the sum of a volume integral to account for the source distribution and a layer potential to enforce the desired boundary condition. The volume integral is computed by applying the FMM on a square box that encloses the domain of interest. For the sake of efficiency and convergence acceleration, we first extend the source distribution (the right-hand side in the Poisson equation) to the enclosing box as a C^0 function using a fast, boundary integral-based method. We demonstrate on multiply connected domains with irregular boundaries that this continuous extension leads to high accuracy without excessive adaptive refinement near the boundary and, as a result, to an extremely efficient "black box" fast solver.
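The decomposition mentioned in the abstract can be written out as follows. This is a sketch under assumptions the abstract does not state: a Dirichlet problem, \Delta u = f in \Omega with u = g on \partial\Omega, the sign convention \Delta G = \delta for the free-space Green's function, and the symbols B for the enclosing box and f_e for the extended source, which are introduced here for illustration.

```latex
\begin{aligned}
u &= v + w, \\
v(\mathbf{x}) &= \int_{B} G(\mathbf{x},\mathbf{y})\, f_e(\mathbf{y})\, d\mathbf{y},
\qquad
G(\mathbf{x},\mathbf{y}) = \frac{1}{2\pi}\,\ln\lvert \mathbf{x}-\mathbf{y}\rvert, \\
\Delta w &= 0 \quad \text{in } \Omega,
\qquad
w = g - v \quad \text{on } \partial\Omega .
\end{aligned}
```

Here v carries the source term, since \Delta v = f_e = f inside \Omega, and the harmonic correction w, represented as a layer potential on \partial\Omega, restores the boundary data.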
Overset Techniques for Hypersonic Multibody Configurations with the DPLR Solver
NASA Technical Reports Server (NTRS)
Hyatt, Andrew James; Prabhu, Dinesh K.; Boger, David A.
2010-01-01
Three unit problems in shock-shock/shock-boundary layer interactions are considered in the evaluation of overset techniques with the Data Parallel Line Relaxation (DPLR) computational fluid dynamics solver, a three-dimensional Navier-Stokes solver. The unit problems considered are those of two stacked hemispherical cylinders (of different diameters and lengths, and at various orientations relative to each other or relative to the nozzle axis) tested in a hypersonic wind tunnel. These problems are taken as representative of a Two-Stage-To-Orbit design. The objective of the present presentation would be to discuss the techniques used to develop suitable overset grid systems and then evaluate their respective solutions by comparing to corresponding point matched grid solutions and experimental data. Both successful and unsuccessful techniques would be discussed. All solutions would be calculated using the DPLR solver and SUGGAR will be used to develop the domain connectivity information.
General Equation Set Solver for Compressible and Incompressible Turbomachinery Flows
NASA Technical Reports Server (NTRS)
Sondak, Douglas L.; Dorney, Daniel J.
2002-01-01
Turbomachines for propulsion applications operate with many different working fluids and flow conditions. The flow may be incompressible, such as in the liquid hydrogen pump in a rocket engine, or supersonic, such as in the turbine which may drive the hydrogen pump. Separate codes have traditionally been used for incompressible and compressible flow solvers. The General Equation Set (GES) method can be used to solve both incompressible and compressible flows, and it is not restricted to perfect gases, as are many compressible-flow turbomachinery solvers. An unsteady GES turbomachinery flow solver has been developed and applied to both air and water flows through turbines. It has been shown to be an excellent alternative to maintaining two separate codes.
Advanced Fast 3D Electromagnetic Solver for Microwave Tomography Imaging.
Simonov, Nikolai; Kim, Bo-Ra; Lee, Kwang-Jae; Jeon, Soon-Ik; Son, Seong-Ho
2017-06-07
This paper describes a fast forward electromagnetic solver (FFS) for the image reconstruction algorithm of our microwave tomography (MT) system. Our apparatus is a preclinical prototype of a biomedical imaging system, designed for the purpose of early breast cancer detection. It operates in the 3-6 GHz frequency band using a circular array of probe antennas immersed in a matching liquid; it produces image reconstructions of the permittivity and conductivity profiles of the breast under examination. Our reconstruction algorithm solves the electromagnetic inverse problem and takes into account the real electromagnetic properties of the probe antenna array as well as the influence of the patient's body and that of the upper metal screen sheet. This FFS algorithm is much faster than conventional electromagnetic simulation solvers. For comparison, on the same PC, the CST solver takes ~45 min, while the FFS takes ~1 s of effective simulation time for the same electromagnetic model of a numerical breast phantom.
Two Solvers for Tractable Temporal Constraints with Preferences
NASA Technical Reports Server (NTRS)
Rossi, F.; Khatib, L.; Morris, P.; Morris, R.; Clancy, Daniel (Technical Monitor)
2002-01-01
A number of reasoning problems involving the manipulation of temporal information can naturally be viewed as implicitly inducing an ordering of potential local decisions involving time on the basis of preferences. Soft temporal constraint problems allow one to describe in a natural way scenarios where events happen over time and preferences are associated with event distances and durations. In general, solving soft temporal problems requires exponential time in the worst case, but there are interesting subclasses of problems which are polynomially solvable. We describe two solvers based on two different approaches for solving the same tractable subclass. For each solver we present the theoretical results it stands on, a description of the algorithm and some experimental results. The random generator used to build the problems on which tests are performed is also described. Finally, we compare the two solvers, highlighting the tradeoff between performance and representational power.
Numerical System Solver Developed for the National Cycle Program
NASA Technical Reports Server (NTRS)
Binder, Michael P.
1999-01-01
As part of the National Cycle Program (NCP), a powerful new numerical solver has been developed to support the simulation of aeropropulsion systems. This software uses a hierarchical object-oriented design. It can provide steady-state and time-dependent solutions to nonlinear and even discontinuous problems typically encountered when aircraft and spacecraft propulsion systems are simulated. It also can handle constrained solutions, in which one or more factors may limit the behavior of the engine system. Time-dependent simulation capabilities include adaptive time-stepping and synchronization with digital control elements. The NCP solver is playing an important role in making the NCP a flexible, powerful, and reliable simulation package.
Nonlinear Least Squares Curve Fitting with Microsoft Excel Solver
NASA Astrophysics Data System (ADS)
Harris, Daniel C.
1998-01-01
"Solver" is a powerful tool in the Microsoft Excel spreadsheet that provides a simple means of fitting experimental data to nonlinear functions. The procedure is so easy to use and its mode of operation is so obvious that it is excellent for students to learn the underlying principle of least squares curve fitting. This article introduces the method of fitting nonlinear functions with Solver and extends the treatment to weighted least squares and to the estimation of uncertainties in the least-squares parameters.
An Easy Method To Accelerate An Iterative Algebraic Equation Solver
Yao, Jin
2014-01-06
This article proposes adding a simple term to an iterative algebraic equation solver with an order n convergence rate, raising the order of convergence to (2n - 1). In particular, a simple algebraic equation solver with 5th-order convergence that uses only 4 function values in each iteration is described in detail. When this scheme is applied to a Newton-Raphson method with quadratic convergence for a system of algebraic equations, cubic convergence can be achieved with a low overhead cost of function evaluation that can be ignored as the size of the system increases.
An evaluation of parallel multigrid as a solver and a preconditioner for singular perturbed problems
Oosterlee, C.W.; Washio, T.
1996-12-31
In this paper we try to achieve h-independent convergence with preconditioned GMRES and BiCGSTAB for 2D singular perturbed equations. Three recently developed multigrid methods are adopted as a preconditioner. They are also used as solution methods in order to compare the performance of the methods as solvers and as preconditioners. Two of the multigrid methods differ only in the transfer operators. One uses standard matrix-dependent prolongation operators from. The second uses "upwind" prolongation operators, developed. Both employ the Galerkin coarse grid approximation and an alternating zebra line Gauss-Seidel smoother. The third method is based on the block LU decomposition of a matrix and on an approximate Schur complement. This multigrid variant is presented in. All three multigrid algorithms are algebraic methods.
Albers, Robert C; Julien, Jean P
2008-01-01
We have developed a new efficient and accurate impurity solver for the single impurity Anderson model (SIAM), which is based on a non-perturbative recursion technique in a space of operators and involves expanding the self-energy as a continued fraction. The method has no special occupation number or temperature restrictions; the only approximation is the number of levels of the continued fraction retained in the expansion. We also show how this method can be used as a new approach to Dynamical Mean Field Theory (DMFT) and illustrate this with the Hubbard model. The three lowest orders of recursion give the Hartree-Fock, Hubbard I, and Hubbard III approximations. A higher level of recursion is able to reproduce the expected 3-peak structure in the spectral function and Fermi liquid behavior.
A Time Dependent Transport Equation Solver
1991-05-01
Using TWIGL Mesh Spacing ............. 63 11 Initial FEMP2D Flux Using 2X TWIGL Mesh Spacing ........ .. 64 12 Time Dependent Thermal Absorption...energy group, and g = G is the lowest (thermal) energy group. ?oo(r, E, t) the coefficient in the P approximation that physically represents the total...than these MrPs. This suggests that the thermal flux calculations could be suspect. Indeed, both the FEMP2D and FMP2DT calculations showed that the
A Robust Multilevel Simultaneous Eigenvalue Solver
1993-06-01
same efficiency is obtained for problems in 3-D as for problem in 2-D. In all examples the periodic boundary conditions Schrödinger eigenvalue problem (A...coarse level work on levels 1, 2, took approximately 1/6 of the computer time and on levels 1, 2, 3, approximately 1/4 of the computer time. This is a...eigenvalues. The results of the numerical tests for Schrödinger eigenvalue problems show that the algorithm achieved the same accuracy, using the same
Intellectual Abilities That Discriminate Good and Poor Problem Solvers.
ERIC Educational Resources Information Center
Meyer, Ruth Ann
1981-01-01
This study compared good and poor fourth-grade problem solvers on a battery of 19 "reference" tests for verbal, induction, numerical, word fluency, memory, perceptual speed, and simple visualization abilities. Results suggest verbal, numerical, and especially induction abilities are important to successful mathematical problem solving.…
Navier-Stokes Solvers and Generalizations for Reacting Flow Problems
Elman, Howard C
2013-01-27
This is an overview of our accomplishments during the final term of this grant (1 September 2008 -- 30 June 2012). These fall mainly into three categories: fast algorithms for linear eigenvalue problems; solution algorithms and modeling methods for partial differential equations with uncertain coefficients; and preconditioning methods and solvers for models of computational fluid dynamics (CFD).
PSH3D fast Poisson solver for petascale DNS
NASA Astrophysics Data System (ADS)
Adams, Darren; Dodd, Michael; Ferrante, Antonino
2016-11-01
Direct numerical simulation (DNS) of high Reynolds number, Re ≥ O(10^5), turbulent flows requires computational meshes of ≥ O(10^12) grid points and, thus, the use of petascale supercomputers. DNS often requires the solution of a Helmholtz (or Poisson) equation for pressure, which constitutes the bottleneck of the solver. We have developed a parallel solver of the Helmholtz equation in 3D, PSH3D. The numerical method underlying PSH3D combines a parallel 2D Fast Fourier transform in two spatial directions, and a parallel linear solver in the third direction. For computational meshes up to 8192^3 grid points, our numerical results show that PSH3D scales up to at least 262k cores of Cray XT5 (Blue Waters). PSH3D has a peak performance 6× faster than 3D FFT-based methods when used with the 'partial-global' optimization, and for an 8192^3 mesh solves the Poisson equation in 1 s using 128k cores. Also, we have verified that the use of PSH3D with the 'partial-global' optimization in our DNS solver does not reduce the accuracy of the numerical solution of the incompressible Navier-Stokes equations.
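The splitting described in this abstract (a Fourier transform in the periodic directions, then an independent linear solve along the remaining direction for each Fourier mode) can be illustrated with a small 2D analogue. The sketch below is not PSH3D: it assumes a second-order finite-difference Laplacian, periodic boundaries in x, homogeneous Dirichlet boundaries in y, a naive O(n^2) DFT in place of an FFT, and serial execution. The names `dft`, `thomas`, `solve_poisson` and `apply_laplacian` are hypothetical helpers introduced only for this illustration.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier transform (sign=-1 forward, sign=+1 unnormalised inverse)."""
    n = len(values)
    return [sum(values[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
                for m in range(n)) for k in range(n)]

def thomas(a, b, c, rhs):
    """Solve a tridiagonal system with constant sub-, main and super-diagonals a, b, c."""
    n = len(rhs)
    cp, dp = [0j] * n, [0j] * n
    cp[0], dp[0] = c / b, rhs[0] / b
    for j in range(1, n):
        denom = b - a * cp[j - 1]
        cp[j] = c / denom
        dp[j] = (rhs[j] - a * dp[j - 1]) / denom
    x = [0j] * n
    x[-1] = dp[-1]
    for j in range(n - 2, -1, -1):
        x[j] = dp[j] - cp[j] * x[j + 1]
    return x

def solve_poisson(f, hx, hy):
    """Solve the discrete Poisson problem L u = f; f[i][j] with i periodic, j Dirichlet."""
    nx, ny = len(f), len(f[0])
    # Step 1: transform every line of constant j along the periodic direction.
    fhat = [dft([f[i][j] for i in range(nx)], -1) for j in range(ny)]
    # Step 2: one independent tridiagonal solve per Fourier mode k.
    uhat = []
    for k in range(nx):
        lam = (2.0 * math.cos(2.0 * math.pi * k / nx) - 2.0) / hx ** 2
        rhs = [fhat[j][k] for j in range(ny)]
        uhat.append(thomas(1.0 / hy ** 2, lam - 2.0 / hy ** 2, 1.0 / hy ** 2, rhs))
    # Step 3: transform back to physical space.
    u = [[0.0] * ny for _ in range(nx)]
    for j in range(ny):
        line = dft([uhat[k][j] for k in range(nx)], +1)
        for i in range(nx):
            u[i][j] = line[i].real / nx
    return u

def apply_laplacian(u, hx, hy):
    """Apply the same discrete operator, used here only to manufacture a test problem."""
    nx, ny = len(u), len(u[0])
    f = [[0.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            up = u[i][j + 1] if j + 1 < ny else 0.0
            down = u[i][j - 1] if j > 0 else 0.0
            f[i][j] = ((u[(i + 1) % nx][j] - 2.0 * u[i][j] + u[i - 1][j]) / hx ** 2
                       + (up - 2.0 * u[i][j] + down) / hy ** 2)
    return f

if __name__ == "__main__":
    u_ref = [[math.sin(0.7 * i + 1.3 * j) for j in range(6)] for i in range(8)]
    u_num = solve_poisson(apply_laplacian(u_ref, 0.5, 0.25), 0.5, 0.25)
    err = max(abs(u_num[i][j] - u_ref[i][j]) for i in range(8) for j in range(6))
    print("max error:", err)
```

Because each Fourier mode decouples, the tridiagonal solves in step 2 are independent of one another, which is the property a parallel implementation can exploit.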
Development of multiphase CFD flow solver in OpenFOAM
NASA Astrophysics Data System (ADS)
Rollins, Chad; Luo, Hong; Dinh, Nam
2016-11-01
We are developing a pressure-based multiphase (Eulerian) CFD solver using OpenFOAM with Reynolds-averaged turbulence stress modeling. Our goal is the evaluation and improvement of the current OpenFOAM two-fluid (Eulerian) solver in boiling channels with a motivation to produce a more consistent modeling and numerics treatment. The difficulty lies in the presence of the many forces and models that are tightly non-linearly coupled in the solver. Therefore, the solver platform will allow not only the modeling, but the tracking as well, of the effects of the individual components (various interfacial forces/heat transfer models) and their interactions. This is essential for the development of a robust and efficient solution method. There has been a lot of work already performed in related areas that generally indicates a lack of robustness of the solution methods. The objective here is therefore to identify and develop remedies for numerical/modeling issues through a systematic approach to verification and validation, taking advantage of the open source nature of OpenFOAM. The presentation will discuss major findings, and suggest strategies for robust and consistent modeling (probably, a more consistent treatment of heat transfer models with two-fluid models in the near-wall cells).
Thinking Process of Naive Problem Solvers to Solve Mathematical Problems
ERIC Educational Resources Information Center
Mairing, Jackson Pasini
2017-01-01
Solving problems is not only a goal of mathematical learning. Students acquire ways of thinking, habits of persistence and curiosity, and confidence in unfamiliar situations by learning to solve problems. In fact, there were students who had difficulty in solving problems. The students were naive problem solvers. This research aimed to describe…
Hypersonic simulations using open-source CFD and DSMC solvers
NASA Astrophysics Data System (ADS)
Casseau, V.; Scanlon, T. J.; John, B.; Emerson, D. R.; Brown, R. E.
2016-11-01
Hypersonic hybrid hydrodynamic-molecular gas flow solvers are required to satisfy the two essential requirements of any high-speed reacting code, these being physical accuracy and computational efficiency. The James Weir Fluids Laboratory at the University of Strathclyde is currently developing an open-source hybrid code which will eventually reconcile the direct simulation Monte-Carlo method, making use of the OpenFOAM application called dsmcFoam, and the newly coded open-source two-temperature computational fluid dynamics solver named hy2Foam. In conjunction with employing the CVDV chemistry-vibration model in hy2Foam, novel use is made of the QK rates in a CFD solver. In this paper, further testing is performed, in particular with the CFD solver, to ensure its efficacy before considering more advanced test cases. The hy2Foam and dsmcFoam codes have been shown to compare reasonably well, thus providing a useful basis for other codes to compare against.
Coordinate Projection-based Solver for ODE with Invariants
Serban, Radu
2008-04-08
CPODES is a general purpose (serial and parallel) solver for systems of ordinary differential equations (ODEs) with invariants. It implements a coordinate projection approach using different types of projection (orthogonal or oblique) and one of several methods for the decomposition of the Jacobian of the invariant equations.
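The idea of coordinate projection can be shown on a toy problem. The sketch below does not use CPODES or its API; it assumes the rotation system x' = -y, y' = x, whose invariant is x^2 + y^2 = 1, an explicit Euler integrator, and an orthogonal projection that reduces to rescaling onto the unit circle. The function names are hypothetical and chosen only for this illustration.

```python
import math

def euler_step(x, y, h):
    """One explicit Euler step for x' = -y, y' = x (exact solutions stay on a circle)."""
    return x - h * y, y + h * x

def project_onto_invariant(x, y):
    """Orthogonal projection onto x^2 + y^2 = 1: the closest point on the unit circle."""
    r = math.hypot(x, y)
    return x / r, y / r

def integrate(steps, h, use_projection):
    """Integrate from (1, 0), optionally projecting back onto the invariant after each step."""
    x, y = 1.0, 0.0
    for _ in range(steps):
        x, y = euler_step(x, y, h)
        if use_projection:
            x, y = project_onto_invariant(x, y)
    return x, y

if __name__ == "__main__":
    for flag in (False, True):
        x, y = integrate(1000, 0.01, flag)
        print("projection" if flag else "no projection", "radius =", math.hypot(x, y))
```

Without projection, each Euler step multiplies the squared radius by (1 + h^2), so the numerical solution drifts steadily off the circle; with projection the invariant is restored after every step, at the price of one extra correction per step.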
Time-varying Riemann solvers for conservation laws on networks
NASA Astrophysics Data System (ADS)
Garavello, Mauro; Piccoli, Benedetto
We consider a conservation law on a network and generic Riemann solvers at nodes depending on parameters, which can be seen as control functions. Assuming that the parameters have bounded variation as functions of time, we prove existence of solutions to Cauchy problems on the whole network.
Parallel Solver for H(div) Problems Using Hybridization and AMG
Lee, Chak S.; Vassilevski, Panayot S.
2016-01-15
In this paper, a scalable parallel solver is proposed for H(div) problems discretized by arbitrary order finite elements on general unstructured meshes. The solver is based on hybridization and algebraic multigrid (AMG). Unlike some previously studied H(div) solvers, the hybridization solver does not require discrete curl and gradient operators as additional input from the user. Instead, only some element information is needed in the construction of the solver. The hybridization results in an H^1-equivalent symmetric positive definite system, which is then rescaled and solved by AMG solvers designed for H^1 problems. Weak and strong scaling of the method are examined through several numerical tests. Our numerical results show that the proposed solver provides a promising alternative to ADS, a state-of-the-art solver [12], for H(div) problems. In fact, it outperforms ADS for higher order elements.
A fast Laplace solver approach to pore scale permeability
NASA Astrophysics Data System (ADS)
Arns, Christoph; Adler, Pierre
2017-04-01
The permeability of a porous medium can be derived by solving the Stokes equations in the pore space with no slip at the walls. The resulting velocity averaged over the pore volume yields the permeability K_S by application of the Darcy law. The Stokes equations can be solved by a number of different techniques such as finite differences, finite volume, Lattice Boltzmann, but whatever the technique it remains a heavy task since there are four unknowns at each node (the three velocity components and the pressure) which necessitate the solution of four equations (the projection of Newton's law on each axis and mass conservation). By comparison, the Laplace equation is scalar with a single unknown at each node. The objective of this work is to replace the Stokes equations by an elliptic equation with a space dependent permeability. More precisely, the local permeability k is supposed to be proportional to (r - alpha)^2 where r is the distance of the voxel to the closest wall, and alpha a constant; k is zero in the solid phase. The elliptic equation is div(k grad p) = 0. A macroscopic pressure gradient is assumed to be exerted on the medium and again the resulting velocity averaged over space yields a permeability K_L. In order to validate this method, systematic calculations have been performed. First, elementary shapes (plane channel, circular pipe, rectangular channels) were studied for which flow occurs along parallel lines in which case K_L is the arithmetic average of the k's. K_L was calculated for various discretizations of the pore space and various values of alpha. For alpha = 0.5, the agreement with the exact analytical value of K_S is excellent for the plane and rectangular channels while it is only approximate for circular pipes. Second, the permeability K_L of channels with sinusoidal walls was calculated and compared with analytical results and numerical ones provided by a Lattice Boltzmann algorithm. Generally speaking, the discrepancy does not exceed 25% when
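For the plane-channel case mentioned above, the arithmetic-average rule can be checked by hand. The sketch below is an illustration under stated assumptions, not the authors' code: the channel is n unit voxels wide, r is counted in whole voxels from the wall so that r - 0.5 is the distance from the voxel centre to the nearest wall, and the proportionality constant in k is taken to be 1. The comparison value h^2/12 is the classical permeability of plane Poiseuille flow between walls a distance h apart. All function names are hypothetical.

```python
def local_permeabilities(n):
    """k for each of n unit voxels across the channel: squared centre-to-wall distance."""
    return [min(i + 0.5, n - i - 0.5) ** 2 for i in range(n)]

def laplace_permeability(n):
    """K_L for parallel flow lines: arithmetic average of the local permeabilities."""
    ks = local_permeabilities(n)
    return sum(ks) / len(ks)

def stokes_permeability(h):
    """Plane Poiseuille permeability for a channel of width h."""
    return h * h / 12.0

if __name__ == "__main__":
    for n in (4, 16, 100):
        k_l = laplace_permeability(n)
        k_s = stokes_permeability(float(n))
        print(n, k_l, k_s, abs(k_l - k_s) / k_s)
```

Under these assumptions the discrete average works out to (n^2 - 1)/12, so the relative gap to h^2/12 shrinks like 1/n^2 as the channel is resolved with more voxels; in the continuum limit the mean of the squared wall distance over the channel is exactly h^2/12.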
Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers
NASA Astrophysics Data System (ADS)
Tang, Yu-Hang; Kudo, Shuhei; Bian, Xin; Li, Zhen; Karniadakis, George Em
2015-09-01
Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant difficulties in practice. In this paper we present a C++ library, i.e. the Multiscale Universal Interface (MUI), which is capable of facilitating the coupling effort for a wide range of multiscale simulations. The library adopts a header-only form with minimal external dependency and hence can be easily dropped into existing codes. A data sampler concept is introduced, combined with a hybrid dynamic/static typing mechanism, to create an easily customizable framework for solver-independent data interpretation. The library integrates MPI MPMD support and an asynchronous communication protocol to handle inter-solver information exchange irrespective of the solvers' own MPI awareness. Template metaprogramming is heavily employed to simultaneously improve runtime performance and code flexibility. We validated the library by solving three different multiscale problems, which also serve to demonstrate the flexibility of the framework in handling heterogeneous models and solvers. In the first example, a Couette flow was simulated using two concurrently coupled Smoothed Particle Hydrodynamics (SPH) simulations of different spatial resolutions. In the second example, we coupled the deterministic SPH method with the stochastic Dissipative Particle Dynamics (DPD) method to study the effect of surface grafting on the hydrodynamics properties on the surface. In the third example, we consider conjugate heat transfer between a solid domain and a fluid domain by coupling the particle-based energy-conserving DPD (eDPD) method with the Finite Element Method (FEM).
Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers
Tang, Yu-Hang; Kudo, Shuhei; Bian, Xin; Li, Zhen; Karniadakis, George Em
2015-09-15
Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant difficulties in practice. In this paper we present a C++ library, i.e. the Multiscale Universal Interface (MUI), which is capable of facilitating the coupling effort for a wide range of multiscale simulations. The library adopts a header-only form with minimal external dependency and hence can be easily dropped into existing codes. A data sampler concept is introduced, combined with a hybrid dynamic/static typing mechanism, to create an easily customizable framework for solver-independent data interpretation. The library integrates MPI MPMD support and an asynchronous communication protocol to handle inter-solver information exchange irrespective of the solvers' own MPI awareness. Template metaprogramming is heavily employed to simultaneously improve runtime performance and code flexibility. We validated the library by solving three different multiscale problems, which also serve to demonstrate the flexibility of the framework in handling heterogeneous models and solvers. In the first example, a Couette flow was simulated using two concurrently coupled Smoothed Particle Hydrodynamics (SPH) simulations of different spatial resolutions. In the second example, we coupled the deterministic SPH method with the stochastic Dissipative Particle Dynamics (DPD) method to study the effect of surface grafting on the hydrodynamics properties on the surface. In the third example, we consider conjugate heat transfer between a solid domain and a fluid domain by coupling the particle-based energy-conserving DPD (eDPD) method with the Finite Element Method (FEM).
Migration of vectorized iterative solvers to distributed memory architectures
Pommerell, C.; Ruehl, R.
1994-12-31
Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved; smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this "natural" parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorize or parallelize the dominating operation (structured or unstructured sparse matrix-vector multiplication), or to increase locality and parallelism by reformulating the algorithm (reducing global synchronization in inner products or local data exchange in preconditioners). Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.
Decision Engines for Software Analysis Using Satisfiability Modulo Theories Solvers
NASA Technical Reports Server (NTRS)
Bjorner, Nikolaj
2010-01-01
The area of software analysis, testing and verification is now undergoing a revolution thanks to the use of automated and scalable support for logical methods. A well-recognized premise is that at the core of software analysis engines is invariably a component using logical formulas for describing states and transformations between system states. The process of using this information for discovering and checking program properties (including such important properties as safety and security) amounts to automatic theorem proving. In particular, theorem provers that directly support common software constructs offer a compelling basis. Such provers are commonly called satisfiability modulo theories (SMT) solvers. Z3 is a state-of-the-art SMT solver. It is developed at Microsoft Research. It can be used to check the satisfiability of logical formulas over one or more theories such as arithmetic, bit-vectors, lists, records and arrays. The talk describes some of the technology behind modern SMT solvers, including the solver Z3. Z3 is currently mainly targeted at solving problems that arise in software analysis and verification. It has been applied to various contexts, such as systems for dynamic symbolic simulation (Pex, SAGE, Vigilante), for program verification and extended static checking (Spec#/Boogie, VCC, HAVOC), for software model checking (Yogi, SLAM), model-based design (FORMULA), security protocol code (F7), program run-time analysis and invariant generation (VS3). We will describe how it integrates support for a variety of theories that arise naturally in the context of the applications. There are several new promising avenues and the talk will touch on some of these and the challenges related to SMT solvers.
Solvers for $\mathcal{O}(N)$ Electronic Structure in the Strong Scaling Limit
Bock, Nicolas; Challacombe, William M.; Kale, Laxmikant
2016-01-26
Here we present a hybrid OpenMP/Charm++ framework for solving the $\mathcal{O}(N)$ self-consistent-field eigenvalue problem with parallelism in the strong scaling regime, $P \gg N$, where $P$ is the number of cores, and $N$ is a measure of system size, i.e., the number of matrix rows/columns, basis functions, atoms, molecules, etc. This result is achieved with a nested approach to spectral projection and the sparse approximate matrix multiply [Bock and Challacombe, SIAM J. Sci. Comput., 35 (2013), pp. C72--C98], and involves a recursive, task-parallel algorithm, often employed by generalized $N$-Body solvers, to occlusion and culling of negligible products in the case of matrices with decay. Lastly, employing classic technologies associated with generalized $N$-Body solvers, including overdecomposition, recursive task parallelism, orderings that preserve locality, and persistence-based load balancing, we obtain scaling beyond hundreds of cores per molecule for small water clusters ([H$_2$O]$_N$, $N \in \{30, 90, 150\}$, $P/N \approx \{819, 273, 164\}$) and find support for an increasingly strong scalability with increasing system size $N$.
GPU accelerated flow solver for direct numerical simulation of turbulent flows
Salvadore, Francesco; Botti, Michela
2013-02-15
Graphical processing units (GPUs), characterized by significant computing performance, are nowadays very appealing for the solution of computationally demanding tasks in a wide variety of scientific applications. However, to run on GPUs, existing codes need to be ported and optimized, a procedure which is not yet standardized and may require non trivial efforts, even to high-performance computing specialists. In the present paper we accurately describe the porting to CUDA (Compute Unified Device Architecture) of a finite-difference compressible Navier–Stokes solver, suitable for direct numerical simulation (DNS) of turbulent flows. Porting and validation processes are illustrated in detail, with emphasis on computational strategies and techniques that can be applied to overcome typical bottlenecks arising from the porting of common computational fluid dynamics solvers. We demonstrate that a careful optimization work is crucial to get the highest performance from GPU accelerators. The results show that the overall speedup of one NVIDIA Tesla S2070 GPU is approximately 22 compared with one AMD Opteron 2352 Barcelona chip and 11 compared with one Intel Xeon X5650 Westmere core. The potential of GPU devices in the simulation of unsteady three-dimensional turbulent flows is proved by performing a DNS of a spatially evolving compressible mixing layer.
GPU accelerated flow solver for direct numerical simulation of turbulent flows
NASA Astrophysics Data System (ADS)
Salvadore, Francesco; Bernardini, Matteo; Botti, Michela
2013-02-01
Graphical processing units (GPUs), characterized by significant computing performance, are nowadays very appealing for the solution of computationally demanding tasks in a wide variety of scientific applications. However, to run on GPUs, existing codes need to be ported and optimized, a procedure which is not yet standardized and may require non trivial efforts, even to high-performance computing specialists. In the present paper we accurately describe the porting to CUDA (Compute Unified Device Architecture) of a finite-difference compressible Navier-Stokes solver, suitable for direct numerical simulation (DNS) of turbulent flows. Porting and validation processes are illustrated in detail, with emphasis on computational strategies and techniques that can be applied to overcome typical bottlenecks arising from the porting of common computational fluid dynamics solvers. We demonstrate that a careful optimization work is crucial to get the highest performance from GPU accelerators. The results show that the overall speedup of one NVIDIA Tesla S2070 GPU is approximately 22 compared with one AMD Opteron 2352 Barcelona chip and 11 compared with one Intel Xeon X5650 Westmere core. The potential of GPU devices in the simulation of unsteady three-dimensional turbulent flows is proved by performing a DNS of a spatially evolving compressible mixing layer.
A rapid fast ion Fokker-Planck solver for integrated modelling of tokamaks
NASA Astrophysics Data System (ADS)
Schneider, M.; Eriksson, L.-G.; Johnson, T.; Futtersack, R.; Artaud, J. F.; Dumont, R.; Wolle, B.; Contributors, ITM-TF
2015-01-01
The RISK (rapid ion solver for tokamaks) code for simulating the evolution of the distribution function of neutral beam injected ions (NBI) in tokamak plasmas is described. The code has been especially developed for use in integrated modelling frameworks. Within this context, a code needs to be modular, machine independent and fast. RISK fulfils all these conditions. The RISK code solves the bounce averaged Fokker-Planck equation for the species of the injected ions by expanding the distribution function in the eigenfunctions of the collisional pitch angle scattering operator. The velocity dependent coefficient functions are calculated with a finite element solver. Finite orbit width effects are handled by an ad hoc broadening algorithm of the NBI ionization source. In order to assess the validity of the approximations employed in RISK, a comparison with a full orbit following Monte Carlo code is presented. RISK is integrated into the CRONOS transport suite of codes (Artaud et al 2010 Nucl. Fusion 50 043001) and the European integrated modelling (EU-IM) framework (Falchetto et al 2014 Nucl. Fusion 54 043018). The RISK implementation in this platform is discussed and exemplified to show the strength of running simulation codes in a modular and machine independent environment for simulation of fusion plasmas.
Rasin, A.
1994-04-01
We discuss the idea of approximate flavor symmetries. Relations between approximate flavor symmetries and natural flavor conservation and democracy models are explored. Implications for neutrino physics are also discussed.
Efficient three-dimensional Poisson solvers in open rectangular conducting pipe
NASA Astrophysics Data System (ADS)
Qiang, Ji
2016-06-01
The three-dimensional (3D) Poisson solver plays an important role in the study of space-charge effects on charged particle beam dynamics in particle accelerators. In this paper, we propose three new 3D Poisson solvers for a charged particle beam in an open rectangular conducting pipe. These three solvers include a spectral integrated Green function (IGF) solver, a 3D spectral solver, and a 3D integrated Green function solver. These solvers effectively handle the longitudinal open boundary condition using a finite computational domain that contains the beam itself. This saves the computational cost of using an extra larger longitudinal domain in order to set up an appropriate finite boundary condition. Using an integrated Green function also avoids the need to resolve rapid variation of the Green function inside the beam. The numerical operational cost of the spectral IGF solver and the 3D IGF solver scales as O(N log N), where N is the number of grid points. The cost of the 3D spectral solver scales as O(N_n N), where N_n is the maximum longitudinal mode number. We compare these three solvers using several numerical examples and discuss the advantageous regime of each solver in the physical application.
Performance of algebraic multi-grid solvers based on unsmoothed and smoothed aggregation schemes
NASA Astrophysics Data System (ADS)
Webster, R.
2001-08-01
A comparison is made of the performance of two algebraic multi-grid (AMG0 and AMG1) solvers for the solution of discrete, coupled, elliptic field problems. In AMG0, the basis functions for each coarse grid/level approximation (CGA) are obtained directly by unsmoothed aggregation, an appropriate scaling being applied to each CGA to improve consistency. In AMG1 they are assembled using a smoothed aggregation with a constrained energy optimization method providing the smoothing. Although more costly, smoothed basis functions provide a better (more consistent) CGA. Thus, AMG1 might be viewed as a benchmark for the assessment of the simpler AMG0. Selected test problems for D'Arcy flow in pipe networks, Fick diffusion, plane strain elasticity and Navier-Stokes flow (in a Stokes approximation) are used in making the comparison. They are discretized on the basis of both structured and unstructured finite element meshes. The range of discrete equation sets covers both symmetric positive definite systems and systems that may be non-symmetric and/or indefinite. Both global and local mesh refinements to at least one order of resolving power are examined. Some of these include anisotropic refinements involving elements of large aspect ratio; in some hydrodynamics cases, the anisotropy is extreme, with aspect ratios exceeding two orders. As expected, AMG1 delivers typical multi-grid convergence rates, which for all practical purposes are independent of mesh bandwidth. AMG0 rates are slower. They may also be more discernibly mesh-dependent. However, for the range of mesh bandwidths examined, the overall cost effectiveness of the two solvers is remarkably similar when a full convergence to machine accuracy is demanded. Thus, the shorter solution times for AMG1 do not necessarily compensate for the extra time required for its costly grid generation. This depends on the severity of the problem and the demanded level of convergence. For problems requiring few iterations, where grid
NASA Astrophysics Data System (ADS)
Lanti, E.; Dominski, J.; Brunner, S.; McMillan, B. F.; Villard, L.
2016-11-01
This work aims at completing the implementation of a solver for the quasineutrality equation using a Padé approximation in the global gyrokinetic code ORB5. Initially [Dominski, Ph.D. thesis, 2016], the Padé approximation was only implemented for the kinetic electron model. To enable runs with adiabatic or hybrid electron models while using a Padé approximation to the polarization response, the adiabatic response term of the quasineutrality equation must be consistently modified. It is shown that the Padé solver is in good agreement with the arbitrary wavelength solver of ORB5 [Dominski, Ph.D. thesis, 2016]. To perform this verification, the linear dispersion relation of an ITG-TEM transition is computed for both solvers and the linear growth rates and frequencies are compared.
NASA Astrophysics Data System (ADS)
Niiniluoto, Ilkka
2014-03-01
Approximation of laws is an important theme in the philosophy of science. If we can make sense of the idea that two scientific laws are "close" to each other, then we can also analyze such methodological notions as approximate explanation of laws, approximate reduction of theories, approximate empirical success of theories, and approximate truth of laws. Proposals for measuring the distance between quantitative scientific laws were given in Niiniluoto (1982, 1987). In this paper, these definitions are reconsidered as a response to the interesting critical remarks by Liu (1999).
Development of advanced Navier-Stokes solver
NASA Technical Reports Server (NTRS)
Yoon, Seokkwan
1994-01-01
The objective of research was to develop and validate new computational algorithms for solving the steady and unsteady Euler and Navier-Stokes equations. The end-products are new three-dimensional Euler and Navier-Stokes codes that are faster, more reliable, more accurate, and easier to use. The three-dimensional Euler and full/thin-layer Reynolds-averaged Navier-Stokes equations for compressible/incompressible flows are solved on structured hexahedral grids. The Baldwin-Lomax algebraic turbulence model is used for closure. The space discretization is based on a cell-centered finite-volume method augmented by a variety of numerical dissipation models with optional total variation diminishing limiters. The governing equations are integrated in time by an implicit method based on lower-upper factorization and symmetric Gauss-Seidel relaxation. The algorithm is vectorized on diagonal planes of sweep using two-dimensional indices in three dimensions. Convergence rates and the robustness of the codes are enhanced by the use of an implicit full approximation storage multigrid method.
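The symmetric Gauss-Seidel relaxation named in this abstract can be illustrated on a small linear system. What follows is a minimal sketch under stated assumptions, not the report's lower-upper factorization scheme: the function name `symmetric_gauss_seidel` and the 3x3 test matrix are hypothetical and chosen only for illustration. Each sweep is a forward Gauss-Seidel pass followed by a backward pass.

```python
def symmetric_gauss_seidel(a, b, x, sweeps=1):
    """Apply symmetric Gauss-Seidel sweeps to a x = b (dense lists).

    One sweep = a forward pass over rows 0..n-1 followed by a
    backward pass over rows n-1..0, always using the latest values.
    """
    n = len(b)
    x = list(x)  # work on a copy so the caller's guess is untouched
    for _ in range(sweeps):
        for order in (range(n), range(n - 1, -1, -1)):
            for i in order:
                sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
                x[i] = (b[i] - sigma) / a[i][i]
    return x


if __name__ == "__main__":
    # Hypothetical diagonally dominant test system with solution [1, 2, 3].
    a = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [2.0, 4.0, 10.0]
    print(symmetric_gauss_seidel(a, b, [0.0, 0.0, 0.0], sweeps=30))
```

In an implicit flow solver the same forward and backward passes would act on the block system produced by the linearized time step; the dense scalar system here only shows the sweep ordering.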
Accurate and efficient computation of nonlocal potentials based on Gaussian-sum approximation
NASA Astrophysics Data System (ADS)
Exl, Lukas; Mauser, Norbert J.; Zhang, Yong
2016-12-01
We introduce an accurate and efficient method for the numerical evaluation of nonlocal potentials, including the 3D/2D Coulomb, 2D Poisson and 3D dipole-dipole potentials. Our method is based on a Gaussian-sum approximation of the singular convolution kernel combined with a Taylor expansion of the density. Starting from the convolution formulation of the nonlocal potential, for smooth and fast decaying densities, we make a full use of the Fourier pseudospectral (plane wave) approximation of the density and a separable Gaussian-sum approximation of the kernel in an interval where the singularity (the origin) is excluded. The potential is separated into a regular integral and a near-field singular correction integral. The first is computed with the Fourier pseudospectral method, while the latter is well resolved utilizing a low-order Taylor expansion of the density. Both parts are accelerated by fast Fourier transforms (FFT). The method is accurate (14-16 digits), efficient (O (Nlog N) complexity), low in storage, easily adaptable to other different kernels, applicable for anisotropic densities and highly parallelizable.
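The idea of a Gaussian-sum approximation of a singular kernel can be sketched for the 3D Coulomb kernel 1/r. The example below is an assumption-laden illustration, not the construction used in the paper: it starts from the identity 1/r = (2/sqrt(pi)) * integral over t from 0 to infinity of exp(-r^2 t^2), substitutes t = exp(s), and applies an equally spaced quadrature in s. The function names, the interval [-20, 6] and the step 0.25 are hypothetical choices that target r roughly between 0.1 and 1, away from the singularity at the origin.

```python
import math


def gaussian_sum_for_inverse_r(s_min=-20.0, s_max=6.0, step=0.25):
    """Return (weights, exponents) with 1/r ~= sum(w * exp(-a * r**2)).

    Based on 1/r = (2/sqrt(pi)) * int_0^inf exp(-r^2 t^2) dt with the
    substitution t = exp(s), discretised on an equally spaced grid in s.
    """
    count = int(round((s_max - s_min) / step)) + 1
    weights, exponents = [], []
    for k in range(count):
        s = s_min + k * step
        weights.append(2.0 / math.sqrt(math.pi) * step * math.exp(s))
        exponents.append(math.exp(2.0 * s))
    return weights, exponents


def approx_inverse_r(r, weights, exponents):
    """Evaluate the Gaussian sum at radius r."""
    return sum(w * math.exp(-a * r * r) for w, a in zip(weights, exponents))


if __name__ == "__main__":
    w, a = gaussian_sum_for_inverse_r()
    for r in (0.1, 0.3, 1.0):
        print(r, approx_inverse_r(r, w, a), 1.0 / r)
```

Because every term is a Gaussian, the 3D kernel separates into products of 1D Gaussians in x, y and z, which is what makes such sums attractive for tensor-product and FFT-based evaluation.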
Approximate symmetries of Hamiltonians
NASA Astrophysics Data System (ADS)
Chubb, Christopher T.; Flammia, Steven T.
2017-08-01
We explore the relationship between approximate symmetries of a gapped Hamiltonian and the structure of its ground space. We start by considering approximate symmetry operators, defined as unitary operators whose commutators with the Hamiltonian have norms that are sufficiently small. We show that approximate symmetry operators can be restricted to the ground space while approximately preserving certain mutual commutation relations. We generalize the Stone-von Neumann theorem to matrices that approximately satisfy the canonical (Heisenberg-Weyl-type) commutation relations and use this to show that approximate symmetry operators can certify the degeneracy of the ground space even though they only approximately form a group. Importantly, the notions of "approximate" and "small" are all independent of the dimension of the ambient Hilbert space and depend only on the degeneracy in the ground space. Our analysis additionally holds for any gapped band of sufficiently small width in the excited spectrum of the Hamiltonian, and we discuss applications of these ideas to topological quantum phases of matter and topological quantum error correcting codes. Finally, in our analysis, we also provide an exponential improvement upon bounds concerning the existence of shared approximate eigenvectors of approximately commuting operators under an added normality constraint, which may be of independent interest.
Zhao, Xujun; Li, Jiyuan; Jiang, Xikai; ...
2017-06-29
An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. We present a scalable parallel computational approach, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the general geometry Ewald-like method. Our approach employs a highly efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem results in an O(N) parallel algorithm. We also illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an immersed boundary approach.
NASA Astrophysics Data System (ADS)
Zhao, Xujun; Li, Jiyuan; Jiang, Xikai; Karpeev, Dmitry; Heinonen, Olle; Smith, Barry; Hernandez-Ortiz, Juan P.; de Pablo, Juan J.
2017-06-01
An efficient parallel Stokes' solver has been developed for complete description of hydrodynamic interactions between Brownian particles in bulk and confined geometries. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green's function formalism. A scalable parallel computational approach is presented, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the general geometry Ewald-like method. Our approach employs a highly efficient iterative finite-element Stokes' solver for the accurate treatment of long-range hydrodynamic interactions in arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes' solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem leads to an O(N) parallel algorithm. We illustrate the new algorithm in the context of the dynamics of confined polymer solutions under equilibrium and non-equilibrium conditions. The method is then extended to treat suspended finite size particles of arbitrary shape in any geometry using an immersed boundary approach.
NASA Astrophysics Data System (ADS)
Ferrari, Alessia; Vacondio, Renato; Dazzi, Susanna; Mignosa, Paolo
2017-09-01
A novel augmented Riemann Solver capable of handling porosity discontinuities in 1D and 2D Shallow Water Equation (SWE) models is presented. With the aim of accurately approximating the porosity source term, a Generalized Riemann Problem is derived by adding an additional fictitious equation to the SWEs system and imposing mass and momentum conservation across the porosity discontinuity. The modified Shallow Water Equations are theoretically investigated, and the implementation of an augmented Roe Solver in a 1D Godunov-type finite volume scheme is presented. Robust treatment of transonic flows is ensured by introducing an entropy fix based on the wave pattern of the Generalized Riemann Problem. An Exact Riemann Solver is also derived in order to validate the numerical model. As an extension of the 1D scheme, an analogous 2D numerical model is also derived and validated through test cases with radial symmetry. The capability of the 1D and 2D numerical models to capture different wave patterns is assessed against several Riemann Problems with different wave patterns.
Zhao, Xujun; Li, Jiyuan; Jiang, Xikai; Karpeev, Dmitry; Heinonen, Olle; Smith, Barry; Hernandez-Ortiz, Juan P; de Pablo, Juan J
2017-06-28
An efficient parallel Stokes' solver has been developed for complete description of hydrodynamic interactions between Brownian particles in bulk and confined geometries. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green's function formalism. A scalable parallel computational approach is presented, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the general geometry Ewald-like method. Our approach employs a highly efficient iterative finite-element Stokes' solver for the accurate treatment of long-range hydrodynamic interactions in arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes' solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem leads to an O(N) parallel algorithm. We illustrate the new algorithm in the context of the dynamics of confined polymer solutions under equilibrium and non-equilibrium conditions. The method is then extended to treat suspended finite size particles of arbitrary shape in any geometry using an immersed boundary approach.
A Nonlinear Modal Aeroelastic Solver for FUN3D
NASA Technical Reports Server (NTRS)
Goldman, Benjamin D.; Bartels, Robert E.; Biedron, Robert T.; Scott, Robert C.
2016-01-01
A nonlinear structural solver has been implemented internally within the NASA FUN3D computational fluid dynamics code, allowing for some new aeroelastic capabilities. Using a modal representation of the structure, a set of differential or differential-algebraic equations are derived for general thin structures with geometric nonlinearities. ODEPACK and LAPACK routines are linked with FUN3D, and the nonlinear equations are solved at each CFD time step. The existing predictor-corrector method is retained, whereby the structural solution is updated after mesh deformation. The nonlinear solver is validated using a test case for a flexible aeroshell at transonic, supersonic, and hypersonic flow conditions. Agreement with linear theory is seen for the static aeroelastic solutions at relatively low dynamic pressures, but structural nonlinearities limit deformation amplitudes at high dynamic pressures. No flutter was found at any of the tested trajectory points, though LCO may be possible in the transonic regime.
Verification and Validation Studies for the LAVA CFD Solver
NASA Technical Reports Server (NTRS)
Moini-Yekta, Shayan; Barad, Michael F; Sozer, Emre; Brehm, Christoph; Housman, Jeffrey A.; Kiris, Cetin C.
2013-01-01
The verification and validation of the Launch Ascent and Vehicle Aerodynamics (LAVA) computational fluid dynamics (CFD) solver is presented. A modern strategy for verification and validation is described incorporating verification tests, validation benchmarks, continuous integration and version control methods for automated testing in a collaborative development environment. The purpose of the approach is to integrate the verification and validation process into the development of the solver and improve productivity. This paper uses the Method of Manufactured Solutions (MMS) for the verification of 2D Euler equations, 3D Navier-Stokes equations as well as turbulence models. A method for systematic refinement of unstructured grids is also presented. Verification using inviscid vortex propagation and flow over a flat plate is highlighted. Simulation results using laminar and turbulent flow past a NACA 0012 airfoil and ONERA M6 wing are validated against experimental and numerical data.
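The Method of Manufactured Solutions mentioned above can be shown on a far simpler problem than the Euler or Navier-Stokes equations. The sketch below is an illustration under stated assumptions and is unrelated to the LAVA code: it picks u(x) = sin(pi x) as the manufactured solution of -u'' = f on (0, 1), derives the source f = pi^2 sin(pi x) analytically, solves with a second-order central difference scheme, and measures the observed order of accuracy from two grids. All function names are hypothetical.

```python
import math


def solve_poisson_1d(n, source):
    """Solve -u'' = source on (0, 1), u(0) = u(1) = 0, n interior points."""
    h = 1.0 / (n + 1)
    # Tridiagonal system: -u[i-1] + 2 u[i] - u[i+1] = h^2 f[i]
    diag = [2.0] * n
    rhs = [h * h * source((i + 1) * h) for i in range(n)]
    # Thomas algorithm: forward elimination (off-diagonals are -1).
    for i in range(1, n):
        m = -1.0 / diag[i - 1]
        diag[i] += m
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u


def manufactured_solution(x):
    return math.sin(math.pi * x)


def manufactured_source(x):
    # f = -u'' for the manufactured solution above.
    return math.pi ** 2 * math.sin(math.pi * x)


def max_error(n):
    h = 1.0 / (n + 1)
    u = solve_poisson_1d(n, manufactured_source)
    return max(abs(u[i] - manufactured_solution((i + 1) * h)) for i in range(n))


def observed_order(n_coarse):
    n_fine = 2 * n_coarse + 1  # halves the mesh spacing
    return math.log(max_error(n_coarse) / max_error(n_fine), 2)


if __name__ == "__main__":
    print("observed order:", observed_order(15))
```

An observed order close to the formal order of the scheme (two here) is the pass criterion; a lower value would point to a coding or discretization error.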
Parallel Auxiliary Space AMG Solver for $H(div)$ Problems
Kolev, Tzanio V.; Vassilevski, Panayot S.
2012-12-18
We present a family of scalable preconditioners for matrices arising in the discretization of $H(div)$ problems using the lowest order Raviart-Thomas finite elements. Our approach belongs to the class of “auxiliary space”-based methods and requires only the finite element stiffness matrix plus some minimal additional discretization information about the topology and orientation of mesh entities. Also, we provide a detailed algebraic description of the theory, parallel implementation, and different variants of this parallel auxiliary space divergence solver (ADS) and discuss its relations to the Hiptmair-Xu (HX) auxiliary space decomposition of $H(div)$ [SIAM J. Numer. Anal., 45 (2007), pp. 2483-2509] and to the auxiliary space Maxwell solver AMS [J. Comput. Math., 27 (2009), pp. 604-623]. Finally, an extensive set of numerical experiments demonstrates the robustness and scalability of our implementation on large-scale $H(div)$ problems with large jumps in the material coefficients.
An Upwind Solver for the National Combustion Code
NASA Technical Reports Server (NTRS)
Sockol, Peter M.
2011-01-01
An upwind solver is presented for the unstructured grid National Combustion Code (NCC). The compressible Navier-Stokes equations with time-derivative preconditioning and preconditioned flux-difference splitting of the inviscid terms are used. First order derivatives are computed on cell faces and used to evaluate the shear stresses and heat fluxes. A new flux limiter uses these same first order derivatives in the evaluation of left and right states used in the flux-difference splitting. The k-epsilon turbulence equations are solved with the same second-order method. The new solver has been installed in a recent version of NCC and the resulting code has been tested successfully in 2D on two laminar cases with known solutions and one turbulent case with experimental data.
On improving linear solver performance: a block variant of GMRES
Baker, A H; Dennis, J M; Jessup, E R
2004-05-10
The increasing gap between processor performance and memory access time warrants the re-examination of data movement in iterative linear solver algorithms. For this reason, we explore and establish the feasibility of modifying a standard iterative linear solver algorithm in a manner that reduces the movement of data through memory. In particular, we present an alternative to the restarted GMRES algorithm for solving a single right-hand side linear system Ax = b based on solving the block linear system AX = B. Algorithm performance, i.e. time to solution, is improved by using the matrix A in operations on groups of vectors. Experimental results demonstrate the importance of implementation choices on data movement as well as the effectiveness of the new method on a variety of problems from different application areas.
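The data-movement argument above, that applying A to a group of vectors reuses each matrix entry once it has been loaded, can be sketched with a sparse matrix in compressed sparse row (CSR) form. This is only an illustration of the memory-access pattern, not the block GMRES variant itself; the function names and the small test matrix are assumptions.

```python
def csr_matvec(indptr, indices, data, x):
    """y = A x for a CSR matrix; A is streamed once per vector."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def csr_matmat(indptr, indices, data, block):
    """Y = A X where X is a list of rows, each holding s vector entries.

    Every nonzero of A is read once and reused for all s columns,
    so A is streamed through memory once instead of s times.
    """
    s = len(block[0])
    result = [[0.0] * s for _ in range(len(indptr) - 1)]
    for row in range(len(result)):
        for k in range(indptr[row], indptr[row + 1]):
            a = data[k]
            x_row = block[indices[k]]
            for j in range(s):
                result[row][j] += a * x_row[j]
    return result


if __name__ == "__main__":
    # Hypothetical 3x3 tridiagonal matrix [[4,-1,0],[-1,4,-1],[0,-1,4]].
    indptr = [0, 2, 5, 7]
    indices = [0, 1, 0, 1, 2, 1, 2]
    data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]
    block = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]  # two vectors side by side
    print(csr_matmat(indptr, indices, data, block))
```

Both routines perform the same arithmetic; the difference is that the blocked version touches `data` and `indices` once for all vectors, which is where a time-to-solution gain could come from on a memory-bound machine.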
LDRD report : parallel repartitioning for optimal solver performance.
Heaphy, Robert; Devine, Karen Dragon; Preis, Robert; Hendrickson, Bruce Alan; Heroux, Michael Allen; Boman, Erik Gunnar
2004-02-01
We have developed infrastructure, utilities and partitioning methods to improve data partitioning in linear solvers and preconditioners. Our efforts included incorporation of data repartitioning capabilities from the Zoltan toolkit into the Trilinos solver framework, (allowing dynamic repartitioning of Trilinos matrices); implementation of efficient distributed data directories and unstructured communication utilities in Zoltan and Trilinos; development of a new multi-constraint geometric partitioning algorithm (which can generate one decomposition that is good with respect to multiple criteria); and research into hypergraph partitioning algorithms (which provide up to 56% reduction of communication volume compared to graph partitioning for a number of emerging applications). This report includes descriptions of the infrastructure and algorithms developed, along with results demonstrating the effectiveness of our approaches.
A 3-D upwind Euler solver for unstructured meshes
NASA Technical Reports Server (NTRS)
Barth, Timothy J.
1991-01-01
A three-dimensional finite-volume upwind Euler solver is developed for unstructured meshes. The finite-volume scheme solves for solution variables at vertices of the mesh and satisfies the integral conservation law on nonoverlapping polyhedral control volumes surrounding vertices of the mesh. The scheme achieves improved solution accuracy by assuming a piecewise linear variation of the solution in each control volume. This improved spatial accuracy hinges heavily upon the calculation of the solution gradient in each control volume given pointwise values of the solution at vertices of the mesh. Several algorithms are discussed for obtaining these gradients. Details concerning implementation procedures and data structures are discussed. Sample calculations for inviscid Euler flow about isolated aircraft wings at subsonic and transonic speeds are compared with established Euler solvers as well as experiment.
A functional implementation of the Jacobi eigen-solver
Boehm, A.P.W.; Hiromoto, R.E.
1993-02-01
In this paper, we describe the systematic development of two implementations of the Jacobi eigen-solver and give performance results for the MIT/Motorola Monsoon dataflow machine. Our study is carried out using MINT, the MIT Monsoon simulator. The design of these implementations follows from the mathematics of the Jacobi method, and not from a translation of an existing sequential code. The functional semantics with respect to array updates, which cause excessive array copying, has led us to a new implementation of a parallel "group-rotations" algorithm first described by Sameh. Our version of this algorithm requires O(n^3) operations, whereas Sameh's original version requires O(n^4) operations. The implementations are programmed in the language Id, and although Id has non-functional features, we have restricted the development of our eigen-solvers to the functional subset of the language.
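For readers unfamiliar with the underlying method, the classical Jacobi eigenvalue iteration for a real symmetric matrix can be sketched as follows. This is a plain sequential, cyclic-by-row version written for illustration; it is neither the Id program nor the parallel group-rotations algorithm of the paper, and the function name and test matrix are assumptions. Each rotation zeroes one off-diagonal pair (p, q) by a plane rotation with angle theta satisfying tan(2 theta) = 2 a_pq / (a_qq - a_pp).

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle with |theta| <= pi/4 that zeroes a[p][q].
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4.0
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                # Column update: A <- A J
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # Row update: A <- J^T A
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


if __name__ == "__main__":
    print(jacobi_eigenvalues([[2.0, -1.0, 0.0],
                              [-1.0, 2.0, -1.0],
                              [0.0, -1.0, 2.0]]))
```

Rotations acting on disjoint index pairs (p, q) are independent of one another, which is the property that parallel orderings of the method exploit.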
A functional implementation of the Jacobi eigen-solver
Boehm, A.P.W. (Dept. of Computer Science); Hiromoto, R.E.
1993-01-01
In this paper, we describe the systematic development of two implementations of the Jacobi eigen-solver and give performance results for the MIT/Motorola Monsoon dataflow machine. Our study is carried out using MINT, the MIT Monsoon simulator. The design of these implementations follows from the mathematics of the Jacobi method, and not from a translation of an existing sequential code. The functional semantics with respect to array updates, which cause excessive array copying, has led us to a new implementation of a parallel "group-rotations" algorithm first described by Sameh. Our version of this algorithm requires O(n^3) operations, whereas Sameh's original version requires O(n^4) operations. The implementations are programmed in the language Id, and although Id has non-functional features, we have restricted the development of our eigen-solvers to the functional subset of the language.
CASTRO: A NEW COMPRESSIBLE ASTROPHYSICAL SOLVER. II. GRAY RADIATION HYDRODYNAMICS
Zhang, W.; Almgren, A.; Bell, J.; Howell, L.; Burrows, A.
2011-10-01
We describe the development of a flux-limited gray radiation solver for the compressible astrophysics code, CASTRO. CASTRO uses an Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. The gray radiation solver is based on a mixed-frame formulation of radiation hydrodynamics. In our approach, the system is split into two parts, one part that couples the radiation and fluid in a hyperbolic subsystem, and another parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem is solved explicitly with a high-order Godunov scheme, whereas the parabolic part is solved implicitly with a first-order backward Euler method.
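The implicit treatment of the parabolic part can be illustrated with a first-order backward Euler step for a 1D linear diffusion equation u_t = D u_xx. This is a minimal sketch under simplifying assumptions (constant diffusivity, zero Dirichlet boundaries, uniform grid) and is not CASTRO's radiation solver; the function name and parameters are hypothetical. The tridiagonal system is solved with the Thomas algorithm.

```python
import math


def backward_euler_diffusion_step(u, dt, dx, diffusivity):
    """Advance interior values u by one implicit step of u_t = D u_xx.

    Solves (1 + 2r) u_i - r u_{i-1} - r u_{i+1} = u_i_old with
    r = D dt / dx^2 and zero values outside the interior points.
    """
    n = len(u)
    r = diffusivity * dt / (dx * dx)
    diag = [1.0 + 2.0 * r] * n
    rhs = list(u)
    # Forward elimination (both off-diagonals equal -r).
    for i in range(1, n):
        m = -r / diag[i - 1]
        diag[i] -= m * (-r)
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    new = [0.0] * n
    new[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        new[i] = (rhs[i] + r * new[i + 1]) / diag[i]
    return new


if __name__ == "__main__":
    dx = 0.1
    u0 = [math.sin(math.pi * (i + 1) * dx) for i in range(9)]
    # dt = 1 gives r = 100, far beyond the explicit limit r <= 1/2.
    print(backward_euler_diffusion_step(u0, dt=1.0, dx=dx, diffusivity=1.0))
```

The step stays bounded for any time step size, which is the usual reason for treating stiff diffusion terms implicitly while the hyperbolic part is advanced explicitly.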
Scalable Out-of-Core Solvers on Xeon Phi Cluster
D'Azevedo, Ed F; Chan, Ki Shing; Su, Shiquan; Wong, Kwai
2015-01-01
This paper documents the implementation of a distributed out-of-core (OOC) solver for performing LU and Cholesky factorizations of a large dense matrix on clusters of many-core programmable co-processors. The out-of-core algorithm combines both the left-looking and right-looking schemes, aiming to minimize the movement of data between the CPU host and the co-processor, optimizing data locality as well as computing throughput. The OOC solver is built to align with the format of the ScaLAPACK software library, making it readily portable to any existing codes using ScaLAPACK. A runtime analysis conducted on Beacon (an Intel Xeon plus Intel Xeon Phi cluster composed of 48 nodes of multi-core CPU and MIC) at the National Institute for Computational Sciences is presented. A comparison of the performance on the Intel Xeon Phi and GPU clusters is also provided.
Brittle Solvers: Lessons and insights into effective solvers for visco-plasticity in geodynamics
NASA Astrophysics Data System (ADS)
Spiegelman, M. W.; May, D.; Wilson, C. R.
2014-12-01
Plasticity/Fracture and rock failure are essential ingredients in geodynamic models as terrestrial rocks do not possess an infinite yield strength. Numerous physical mechanisms have been proposed to limit the strength of rocks, including low temperature plasticity and brittle fracture. While ductile and creep behavior of rocks at depth is largely accepted, the constitutive relations associated with brittle failure, or shear localisation, are more controversial. Nevertheless, there are really only a few macroscopic constitutive laws for visco-plasticity that are regularly used in geodynamics models. Independent of derivation, all of these can be cast as simple effective viscosities which act as stress limiters with different choices for yield surfaces; the most common being a von Mises (constant yield stress) or Drucker-Prager (pressure dependent yield-stress) criterion. The choice of plasticity model, however, can have significant consequences for the degree of non-linearity in a problem and the choice and efficiency of non-linear solvers. Here we describe a series of simplified 2 and 3-D model problems to elucidate several issues associated with obtaining accurate description and solution of visco-plastic problems. We demonstrate that: 1) Picard/successive substitution schemes for solution of the non-linear problems can often stall at large values of the non-linear residual, thus producing spurious solutions. 2) Combined Picard/Newton schemes can be effective for a range of plasticity models; however, they can produce serious convergence problems for strongly pressure dependent plasticity models such as Drucker-Prager. 3) Nevertheless, full Drucker-Prager may not be the plasticity model of choice for strong materials as the dynamic pressures produced in these layers can develop pathological behavior with Drucker-Prager, leading to stress strengthening rather than stress weakening behavior. 4) In general, for any incompressible Stokes problem, it is highly advisable to
A contribution to the great Riemann solver debate
NASA Technical Reports Server (NTRS)
Quirk, James J.
1992-01-01
The aims of this paper are threefold: to increase the level of awareness within the shock capturing community to the fact that many Godunov-type methods contain subtle flaws that can cause spurious solutions to be computed; to identify one mechanism that might thwart attempts to produce very high resolution simulations; and to proffer a simple strategy for overcoming the specific failings of individual Riemann solvers.
A chemical reaction network solver for the astrophysics code NIRVANA
NASA Astrophysics Data System (ADS)
Ziegler, U.
2016-02-01
Context. Chemistry often plays an important role in astrophysical gases. It regulates thermal properties by changing species abundances and via ionization processes. This way, time-dependent cooling mechanisms and other chemistry-related energy sources can have a profound influence on the dynamical evolution of an astrophysical system. Modeling those effects with the underlying chemical kinetics in realistic magneto-gasdynamical simulations provide the basis for a better link to observations. Aims: The present work describes the implementation of a chemical reaction network solver into the magneto-gasdynamical code NIRVANA. For this purpose a multispecies structure is installed, and a new module for evolving the rate equations of chemical kinetics is developed and coupled to the dynamical part of the code. A small chemical network for a hydrogen-helium plasma was constructed including associated thermal processes which is used in test problems. Methods: Evolving a chemical network within time-dependent simulations requires the additional solution of a set of coupled advection-reaction equations for species and gas temperature. Second-order Strang-splitting is used to separate the advection part from the reaction part. The ordinary differential equation (ODE) system representing the reaction part is solved with a fourth-order generalized Runge-Kutta method applicable for stiff systems inherent to astrochemistry. Results: A series of tests was performed in order to check the correctness of numerical and technical implementation. Tests include well-known stiff ODE problems from the mathematical literature in order to confirm accuracy properties of the solver used as well as problems combining gasdynamics and chemistry. Overall, very satisfactory results are achieved. Conclusions: The NIRVANA code is now ready to handle astrochemical processes in time-dependent simulations. An easy-to-use interface allows implementation of complex networks including thermal processes
Menu-Driven Solver Of Linear-Programming Problems
NASA Technical Reports Server (NTRS)
Viterna, L. A.; Ferencz, D.
1992-01-01
Program assists inexperienced user in formulating linear-programming problems. A Linear Program Solver (ALPS) computer program is full-featured LP analysis program. Solves plain linear-programming problems as well as more-complicated mixed-integer and pure-integer programs. Also contains efficient technique for solution of purely binary linear-programming problems. Written entirely in IBM's APL2/PC software, Version 1.01. Packed program contains licensed material, property of IBM (copyright 1988, all rights reserved).
Direct linear programming solver in C for structural applications
NASA Astrophysics Data System (ADS)
Damkilde, L.; Hoyer, O.; Krenk, S.
1994-08-01
An optimization problem can be characterized by an object-function, which is maximized, and restrictions, which limit the variation of the variables. A subclass of optimization is Linear Programming (LP), where both the object-function and the restrictions are linear functions of the variables. The traditional solution methods for LP problems are based on the simplex method, and it is customary to allow only non-negative variables. Compared to other optimization routines the LP solvers are more robust and the optimum is reached in a finite number of steps and is not sensitive to the starting point. For structural applications many optimization problems can be linearized and solved by LP routines. However, the structural variables are not always non-negative, and this requires a reformulation, where a variable x is substituted by the difference of two non-negative variables, x^+ and x^-. The transformation causes a doubling of the number of variables, and in a computer implementation the memory allocation doubles and for a typical problem the execution time at least doubles. This paper describes an LP solver written in C, which can handle a combination of non-negative variables and unlimited variables. The LP solver also allows restart, and this may reduce the computational costs if the solution to a similar LP problem is known a priori. The algorithm is based on the simplex method, and differs only in the logical choices. Application of the new LP solver will at the same time give both a more direct problem formulation and a more efficient program.
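The traditional reformulation described in the abstract above, in which each sign-unrestricted variable is replaced by the difference of two non-negative variables, can be sketched as follows. This is only an illustration of the textbook transformation that the paper's solver is designed to avoid; the function names split_free_variables and recover, and the list-of-lists data layout, are assumptions and do not come from the paper.

```python
def split_free_variables(c, A, free):
    """Rewrite LP data so that every variable is non-negative.

    c    : objective coefficients, one per original variable
    A    : constraint rows (list of lists), one column per original variable
    free : indices of the variables that may take either sign

    Each free variable x_j is replaced by x_j_plus - x_j_minus. The column
    for x_j_plus reuses the original column; the column for x_j_minus is
    its negation and is appended at the end.
    """
    c_new = list(c) + [-c[j] for j in free]
    A_new = [list(row) + [-row[j] for j in free] for row in A]
    return c_new, A_new

def recover(x_split, n, free):
    """Map a solution of the split LP back to the n original variables."""
    x = list(x_split[:n])
    for k, j in enumerate(free):
        x[j] -= x_split[n + k]
    return x

if __name__ == "__main__":
    c_new, A_new = split_free_variables([1, 2], [[1, 1], [1, -1]], [1])
    print(c_new)                        # objective for (x0, x1_plus, x1_minus)
    print(A_new)                        # constraint rows with the extra column
    print(recover([3, 0, 4], 2, [1]))   # original variables (x0, x1)
```

When every variable is free, the appended block is as wide as the original matrix, which is the doubling of variables and memory that the abstract refers to.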
Boltzmann Solver with Adaptive Mesh in Velocity Space
Kolobov, Vladimir I.; Arslanbekov, Robert R.; Frolova, Anna A.
2011-05-20
We describe the implementation of a direct Boltzmann solver with Adaptive Mesh in Velocity Space (AMVS) using a quad/octree data structure. The benefits of the AMVS technique are demonstrated for charged particle transport in weakly ionized plasmas, where the collision integral is linear. We also describe the implementation of AMVS for the nonlinear Boltzmann collision integral. Test computations demonstrate both advantages and deficiencies of the current method for calculations of narrow-kernel distributions.
Scaling Algebraic Multigrid Solvers: On the Road to Exascale
Baker, A H; Falgout, R D; Gamblin, T; Kolev, T; Schulz, M; Yang, U M
2010-12-12
Algebraic Multigrid (AMG) solvers are an essential component of many large-scale scientific simulation codes. Their continued numerical scalability and efficient implementation are critical for preparing these codes for exascale. Our experiences on modern multi-core machines show that significant challenges must be addressed for AMG to perform well on such machines. We discuss our experiences and describe the techniques we have used to overcome scalability challenges for AMG on hybrid architectures in preparation for exascale.
A Discontinuous Galerkin Chimera Overset Solver
NASA Astrophysics Data System (ADS)
Galbraith, Marshall Christopher
geometries. The large stencil associated with these high-order schemes can significantly complicate the inter-grid communication and hole cutting processes. Unlike these high-order schemes, the DG method always retains a small stencil regardless of the order of approximation. The small stencil of the DG method simplifies the inter-grid communication scheme as well as hole cutting procedures. The DG-Chimera scheme does not require a separate interpolation method because the DG scheme represents the solution as cell local polynomials. Hence, the DG-Chimera method does not require fringe points to maintain the interior stencil across inter-grid boundaries. Thus, inter-grid communication can be established as long as the receiving boundary is enclosed by or abuts the donor mesh. This makes the inter-grid communication procedure applicable to both Chimera and zonal meshes. The small stencil implies hole cutting can be performed without regard to maintaining a minimum stencil and thereby greatly simplifies hole cutting. Hence, the DG-Chimera scheme has the potential to greatly simplify the overset grid generation process. Furthermore, the DG-Chimera scheme is capable of using curved cells to represent geometric features. The curved cells resolve issues associated with linear Chimera viscous meshes used for finite volume and finite difference schemes. Finally, the convergence rate of the Chimera schemes is dramatically increased by linearization of the inter-grid communication.
Parallel CFD Algorithms for Aerodynamical Flow Solvers on Unstructured Meshes. Parts 1 and 2
NASA Technical Reports Server (NTRS)
Barth, Timothy J.; Kwak, Dochan (Technical Monitor)
1995-01-01
The Advisory Group for Aerospace Research and Development (AGARD) has requested my participation in the lecture series entitled Parallel Computing in Computational Fluid Dynamics to be held at the von Karman Institute in Brussels, Belgium on May 15-19, 1995. In addition, a request has been made from the US Coordinator for AGARD at the Pentagon for NASA Ames to hold a repetition of the lecture series on October 16-20, 1995. I have been asked to be a local coordinator for the Ames event. All AGARD lecture series events have attendance limited to NATO allied countries. A brief of the lecture series is provided in the attached enclosure. Specifically, I have been asked to give two lectures of approximately 75 minutes each on the subject of parallel solution techniques for the fluid flow equations on unstructured meshes. The title of my lectures is "Parallel CFD Algorithms for Aerodynamical Flow Solvers on Unstructured Meshes" (Parts I-II). The contents of these lectures will be largely review in nature and will draw upon previously published work in this area. Topics of my lectures will include: (1) Mesh partitioning algorithms. Recursive techniques based on coordinate bisection, Cuthill-McKee level structures, and spectral bisection. (2) Newton's method for large scale CFD problems. Size and complexity estimates for Newton's method, modifications for ensuring global convergence. (3) Techniques for constructing the Jacobian matrix. Analytic and numerical techniques for Jacobian matrix-vector products, constructing the transposed matrix, extensions to optimization and homotopy theories. (4) Iterative solution algorithms. Practical experience with GMRES and BICG-STAB matrix solvers. (5) Parallel matrix preconditioning. Incomplete Lower-Upper (ILU) factorization, domain-decomposed ILU, approximate Schur complement strategies.
Transonic Drag Prediction Using an Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Levy, David W.
2001-01-01
This paper summarizes the results obtained with the NSU-3D unstructured multigrid solver for the AIAA Drag Prediction Workshop held in Anaheim, CA, June 2001. The test case for the workshop consists of a wing-body configuration at transonic flow conditions. Flow analyses for a complete test matrix of lift coefficient values and Mach numbers at a constant Reynolds number are performed, thus producing a set of drag polars and drag rise curves which are compared with experimental data. Results were obtained independently by both authors using an identical baseline grid and different refined grids. Most cases were run in parallel on commodity cluster-type machines while the largest cases were run on an SGI Origin machine using 128 processors. The objective of this paper is to study the accuracy of the subject unstructured grid solver for predicting drag in the transonic cruise regime, to assess the efficiency of the method in terms of convergence, cpu time, and memory, and to determine the effects of grid resolution on this predictive ability and its computational efficiency. A good predictive ability is demonstrated over a wide range of conditions, although accuracy was found to degrade for cases at higher Mach numbers and lift values where increasing amounts of flow separation occur. The ability to rapidly compute large numbers of cases at varying flow conditions using an unstructured solver on inexpensive clusters of commodity computers is also demonstrated.
An immersed interface vortex particle-mesh solver
NASA Astrophysics Data System (ADS)
Marichal, Yves; Chatelain, Philippe; Winckelmans, Gregoire
2014-11-01
An immersed interface-enabled vortex particle-mesh (VPM) solver is presented for the simulation of 2-D incompressible viscous flows, in the framework of external aerodynamics. Considering the simulation of free vortical flows, such as wakes and jets, vortex particle-mesh methods already provide a valuable alternative to standard CFD methods, thanks to the interesting numerical properties arising from its Lagrangian nature. Yet, accounting for solid bodies remains challenging, despite the extensive research efforts that have been made for several decades. The present immersed interface approach aims at improving the consistency and the accuracy of one very common technique (based on Lighthill's model) for the enforcement of the no-slip condition at the wall in vortex methods. Targeting a sharp treatment of the wall calls for substantial modifications at all computational levels of the VPM solver. More specifically, the solution of the underlying Poisson equation, the computation of the diffusion term and the particle-mesh interpolation are adapted accordingly and the spatial accuracy is assessed. The immersed interface VPM solver is subsequently validated on the simulation of some challenging impulsively started flows, such as the flow past a cylinder and that past an airfoil. Research Fellow (PhD student) of the F.R.S.-FNRS of Belgium.
A Survey of Solver-Related Geometry and Meshing Issues
NASA Technical Reports Server (NTRS)
Masters, James; Daniel, Derick; Gudenkauf, Jared; Hine, David; Sideroff, Chris
2016-01-01
There is a concern in the computational fluid dynamics community that mesh generation is a significant bottleneck in the CFD workflow. This is one of several papers that will help set the stage for a moderated panel discussion addressing this issue. Although certain general "rules of thumb" and a priori mesh metrics can be used to ensure that some base level of mesh quality is achieved, inadequate consideration is often given to the type of solver or particular flow regime on which the mesh will be utilized. This paper explores how an analyst may want to think differently about a mesh based on considerations such as if a flow is compressible vs. incompressible or hypersonic vs. subsonic or if the solver is node-centered vs. cell-centered. This paper is a high-level investigation intended to provide general insight into how considering the nature of the solver or flow when performing mesh generation has the potential to increase the accuracy and/or robustness of the solution and drive the mesh generation process to a state where it is no longer a hindrance to the analysis process.
Error control of iterative linear solvers for integrated groundwater models.
Dixon, Matthew F; Bai, Zhaojun; Brush, Charles F; Chung, Francis I; Dogrul, Emin C; Kadir, Tariq N
2011-01-01
An open problem that arises when using modern iterative linear solvers, such as the preconditioned conjugate gradient method or Generalized Minimum RESidual (GMRES) method, is how to choose the residual tolerance in the linear solver to be consistent with the tolerance on the solution error. This problem is especially acute for integrated groundwater models, which are implicitly coupled to another model, such as surface water models, and resolve both multiple scales of flow and temporal interaction terms, giving rise to linear systems with variable scaling. This article uses the theory of "forward error bound estimation" to explain the correspondence between the residual error in the preconditioned linear system and the solution error. Using examples of linear systems from models developed by the US Geological Survey and the California State Department of Water Resources, we observe that this error bound guides the choice of a practical measure for controlling the error in linear systems. We implemented a preconditioned GMRES algorithm and benchmarked it against the Successive Over-Relaxation (SOR) method, the most widely known iterative solver for nonsymmetric coefficient matrices. With forward error control, GMRES can easily replace the SOR method in legacy groundwater modeling packages, resulting in the overall simulation speedups as large as 7.74×. This research is expected to broadly impact groundwater modelers through the demonstration of a practical and general approach for setting the residual tolerance in line with the solution error tolerance and presentation of GMRES performance benchmarking results.
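The gap between residual tolerance and solution error discussed in the abstract above can be made concrete with the standard textbook forward error bound, relative error <= cond(A) * ||r|| / ||b||. The sketch below is an illustration under stated assumptions, not the estimator used in the article: the 2x2 badly scaled matrix, the infinity norm, and all function names are hypothetical choices made here.

```python
def matvec(A, x):
    """Dense matrix-vector product for lists of lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm_inf_vec(v):
    """Infinity norm of a vector."""
    return max(abs(vi) for vi in v)

def norm_inf_mat(A):
    """Infinity norm of a matrix (largest absolute row sum)."""
    return max(sum(abs(a) for a in row) for row in A)

def inverse_2x2(A):
    """Explicit inverse of a 2x2 matrix; adequate for this small demo only."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def relative_residual(A, b, x_approx):
    """||b - A x_approx|| / ||b|| in the infinity norm."""
    r = [bi - yi for bi, yi in zip(b, matvec(A, x_approx))]
    return norm_inf_vec(r) / norm_inf_vec(b)

def forward_error_bound(A, b, x_approx):
    """Condition number times relative residual: an upper bound on the
    relative solution error."""
    kappa = norm_inf_mat(A) * norm_inf_mat(inverse_2x2(A))
    return kappa * relative_residual(A, b, x_approx)

# Badly scaled system with known solution x_true = [1, 1] (illustrative data).
A = [[1.0, 1000.0], [0.0, 1.0]]
x_true = [1.0, 1.0]
b = matvec(A, x_true)
x_approx = [1.5, 1.0]

rel_residual = relative_residual(A, b, x_approx)
rel_error = norm_inf_vec([xt - xa for xt, xa in zip(x_true, x_approx)]) / norm_inf_vec(x_true)
bound = forward_error_bound(A, b, x_approx)

if __name__ == "__main__":
    print(rel_residual, rel_error, bound)
```

Here the relative residual is below 1e-3 while the first solution component is off by 50 percent, so a residual tolerance alone says little about the solution error when the system is poorly scaled.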
QED multi-dimensional vacuum polarization finite-difference solver
NASA Astrophysics Data System (ADS)
Carneiro, Pedro; Grismayer, Thomas; Silva, Luís; Fonseca, Ricardo
2015-11-01
The Extreme Light Infrastructure (ELI) is expected to deliver peak intensities of 10^{23} - 10^{24} W/cm^2, allowing one to probe nonlinear Quantum Electrodynamics (QED) phenomena in an unprecedented regime. Within the framework of QED, the second order process of photon-photon scattering leads to a set of extended Maxwell's equations [W. Heisenberg and H. Euler, Z. Physik 98, 714] effectively creating nonlinear polarization and magnetization terms that account for the nonlinear response of the vacuum. To model this in a self-consistent way, we present a multi-dimensional generalized Maxwell equation finite difference solver with significantly enhanced dispersive properties, which was implemented in the OSIRIS particle-in-cell code [R.A. Fonseca et al. LNCS 2331, pp. 342-351, 2002]. We present a detailed numerical analysis of this electromagnetic solver. As an illustration of the properties of the solver, we explore several examples in extreme conditions. We confirm the theoretical prediction of vacuum birefringence of a pulse propagating in the presence of an intense static background field [arXiv:1301.4918 [quant-ph
Non-linear curve fitting using Microsoft Excel solver.
Walsh, S; Diamond, D
1995-04-01
Solver, an analysis tool incorporated into Microsoft Excel V 5.0 for Windows, has been evaluated for solving non-linear equations. Test and experimental data sets have been processed, and the results suggest that solver can be successfully used for modelling data obtained in many analytical situations (e.g. chromatography and FIA peaks, fluorescence decays and ISE response characteristics). The relatively simple user interface, and the fact that Excel is commonly bundled free with new PCs makes it an ideal tool for those wishing to experiment with solving non-linear equations without having to purchase and learn a completely new package. The dynamic display of the iterative search process enables the user to monitor location of the optimum solution by the search algorithm. This, together with the almost universal availability of Excel, makes solver an ideal vehicle for teaching the principles of iterative non-linear curve fitting techniques. In addition, complete control of the modelling process lies with the user, who must present the raw data and enter the equation of the model, in contrast to many commercial packages bundled with instruments which perform these operations with a 'black-box' approach.
IGA-ADS: Isogeometric analysis FEM using ADS solver
NASA Astrophysics Data System (ADS)
Łoś, Marcin M.; Woźniak, Maciej; Paszyński, Maciej; Lenharth, Andrew; Hassaan, Muhamm Amber; Pingali, Keshav
2017-08-01
In this paper we present a fast explicit solver for the solution of non-stationary problems using L2 projections with the isogeometric finite element method. The solver has been implemented within the GALOIS framework. It enables parallel multi-core simulations of different time-dependent problems, in 1D, 2D, or 3D. We have prepared the solver framework in a way that enables direct implementation of the selected PDE and corresponding boundary conditions. In this paper we describe the installation, the implementation of three exemplary PDEs, and the execution of the simulations on multi-core Linux cluster nodes. We consider three case studies, including heat transfer, linear elasticity, as well as non-linear flow in heterogeneous media. The presented package generates output suitable for interfacing with Gnuplot and ParaView visualization software. The exemplary simulations show near perfect scalability on the Gilbert shared-memory node with four Intel® Xeon® CPU E7-4860 processors, each possessing 10 physical cores (for a total of 40 cores).
NASA Technical Reports Server (NTRS)
Dutta, Soumitra
1988-01-01
A model for approximate spatial reasoning using fuzzy logic to represent the uncertainty in the environment is presented. Algorithms are developed which can be used to reason about spatial information expressed in the form of approximate linguistic descriptions similar to the kind of spatial information processed by humans. Particular attention is given to static spatial reasoning.
NASA Astrophysics Data System (ADS)
Barry, D. A.; Parlange, J.-Y.; Li, L.; Jeng, D.-S.; Crapper, M.
2005-10-01
The solution to the Green and Ampt infiltration equation is expressible in terms of the Lambert W_{-1} function. Approximations for Green and Ampt infiltration are thus derivable from approximations for the W_{-1} function and vice versa. An infinite family of asymptotic expansions to W_{-1} is presented. Although these expansions do not converge near the branch point of the W function (corresponds to Green-Ampt infiltration with immediate ponding), a method is presented for approximating W_{-1} that is exact at the branch point and asymptotically, with interpolation between these limits. Some existing and several new simple and compact yet robust approximations applicable to Green-Ampt infiltration and flux are presented, the most accurate of which has a maximum relative error of 5 × 10^{-5}%. This error is orders of magnitude lower than any existing analytical approximations.
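The link between Green-Ampt infiltration and the lower branch of the Lambert W function (W_{-1}) stated in the abstract above can be checked numerically. The sketch below does not reproduce the paper's approximations; it evaluates W_{-1} by a plain Newton iteration. It assumes the usual dimensionless form of the Green-Ampt equation, t* = F* - ln(1 + F*), whose solution is F* = -1 - W_{-1}(-exp(-1 - t*)). The scaling behind t* and F* and the function names are assumptions made here for illustration.

```python
import math

def lambert_w_minus1(x, tol=1e-14, max_iter=200):
    """Lower branch W_{-1}(x) for -1/e <= x < 0, by Newton iteration.

    With u = -W, the defining relation W * exp(W) = x becomes
    u - ln(u) = L, where L = -ln(-x) >= 1 and the root satisfies u >= 1.
    """
    if not (-math.exp(-1.0) <= x < 0.0):
        raise ValueError("W_{-1} is real only for -1/e <= x < 0")
    L = -math.log(-x)
    if L <= 1.0:
        return -1.0          # branch point (allowing for rounding)
    u = 2.0 * L              # lies to the right of the root, so the iterates decrease
    for _ in range(max_iter):
        slope = 1.0 - 1.0 / u
        if slope <= 0.0:     # reached the branch point within rounding
            break
        step = (u - math.log(u) - L) / slope
        u -= step
        if abs(step) <= tol * u:
            break
    return -u

def green_ampt_cumulative(t_star):
    """Dimensionless cumulative infiltration F* solving t* = F* - ln(1 + F*)."""
    return -1.0 - lambert_w_minus1(-math.exp(-1.0 - t_star))

if __name__ == "__main__":
    for t_star in (0.1, 1.0, 10.0):
        F = green_ampt_cumulative(t_star)
        print(t_star, F, F - math.log(1.0 + F))  # last column should match t_star
```

Newton's method slows down close to the branch point (small t*), which is the same region where the abstract notes that the asymptotic expansions fail to converge.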
Fisher, A. C.; Bailey, D. S.; Kaiser, T. B.; Eder, D. C.; Gunney, B. T. N.; Masters, N. D.; Koniges, A. E.; Anderson, R. W.
2015-02-01
Here, we present a novel method for the solution of the diffusion equation on a composite AMR mesh. This approach is suitable for including diffusion based physics modules to hydrocodes that support ALE and AMR capabilities. To illustrate, we proffer our implementations of diffusion based radiation transport and heat conduction in a hydrocode called ALE-AMR. Numerical experiments conducted with the diffusion solver and associated physics packages yield 2nd order convergence in the L_{2} norm.
A High-Order Accurate Parallel Solver for Maxwell's Equations on Overlapping Grids
Henshaw, W D
2005-09-23
A scheme for the solution of the time dependent Maxwell's equations on composite overlapping grids is described. The method uses high-order accurate approximations in space and time for Maxwell's equations written as a second-order vector wave equation. High-order accurate symmetric difference approximations to the generalized Laplace operator are constructed for curvilinear component grids. The modified equation approach is used to develop high-order accurate approximations that only use three time levels and have the same time-stepping restriction as the second-order scheme. Discrete boundary conditions for perfect electrical conductors and for material interfaces are developed and analyzed. The implementation is optimized for component grids that are Cartesian, resulting in a fast and efficient method. The solver runs on parallel machines with each component grid distributed across one or more processors. Numerical results in two- and three-dimensions are presented for the fourth-order accurate version of the method. These results demonstrate the accuracy and efficiency of the approach.
Intrinsic Nilpotent Approximation.
1985-06-01
RD-A1II58 265. Intrinsic Nilpotent Approximation (U). Massachusetts Institute of Technology, Cambridge, Laboratory for Information and Decision Systems. Technical Report LIDS-R-1482. ...certain infinite-dimensional filtered Lie algebras L by (finite-dimensional) graded nilpotent Lie algebras or g . where x ∈ M, (x,,Z) ∈ T*M/O. It
Anomalous diffraction approximation limits
NASA Astrophysics Data System (ADS)
Videen, Gorden; Chýlek, Petr
It has been reported in a recent article [Liu, C., Jonas, P.R., Saunders, C.P.R., 1996. Accuracy of the anomalous diffraction approximation to light scattering by column-like ice crystals. Atmos. Res., 41, pp. 63-69] that the anomalous diffraction approximation (ADA) accuracy does not depend on particle refractive index, but instead is dependent on the particle size parameter. Since this is at odds with previous research, we thought these results warranted further discussion.
NASA Technical Reports Server (NTRS)
Dutta, Soumitra
1988-01-01
Much of human reasoning is approximate in nature. Formal models of reasoning traditionally try to be precise and reject the fuzziness of concepts in natural use and replace them with non-fuzzy scientific explicata by a process of precisiation. As an alternative to this approach, it has been suggested that rather than regard human reasoning processes as themselves approximating to some more refined and exact logical process that can be carried out with mathematical precision, the essence and power of human reasoning is in its capability to grasp and use inexact concepts directly. This view is supported by the widespread fuzziness of simple everyday terms (e.g., near, tall) and the complexity of ordinary tasks (e.g., cleaning a room). Spatial reasoning is an area where humans consistently reason approximately with demonstrably good results. Consider the case of crossing a traffic intersection. We have only an approximate idea of the locations and speeds of various obstacles (e.g., persons and vehicles), but we nevertheless manage to cross such traffic intersections without any harm. The details of our mental processes which enable us to carry out such intricate tasks in such an apparently simple manner are not well understood. However, it is essential that we try to incorporate such approximate reasoning techniques in our computer systems. Approximate spatial reasoning is very important for intelligent mobile agents (e.g., robots), especially for those operating in uncertain or unknown or dynamic domains.
Approximate kernel competitive learning.
Wu, Jian-Sheng; Zheng, Wei-Shi; Lai, Jian-Huang
2015-03-01
Kernel competitive learning has been successfully used to achieve robust clustering. However, kernel competitive learning (KCL) is not scalable for large-scale data processing, because (1) it has to calculate and store the full kernel matrix, which is too large to be calculated and kept in memory, and (2) it cannot be computed in parallel. In this paper we develop a framework of approximate kernel competitive learning for processing large-scale datasets. The proposed framework consists of two parts. First, it derives an approximate kernel competitive learning (AKCL), which learns kernel competitive learning in a subspace via sampling. We provide solid theoretical analysis on why the proposed approximation modelling would work for kernel competitive learning, and furthermore, we show that the computational complexity of AKCL is largely reduced. Second, we propose a pseudo-parallelled approximate kernel competitive learning (PAKCL) based on a set-based kernel competitive learning strategy, which overcomes the obstacle of using parallel programming in kernel competitive learning and significantly accelerates the approximate kernel competitive learning for large-scale clustering. The empirical evaluation on publicly available datasets shows that the proposed AKCL and PAKCL can perform comparably to KCL, with a large reduction in computational cost. Also, the proposed methods achieve more effective clustering performance in terms of clustering precision against related approximate clustering approaches.
A Fast and Robust Poisson-Boltzmann Solver Based on Adaptive Cartesian Grids.
Boschitsch, Alexander H; Fenley, Marcia O
2011-05-10
An adaptive Cartesian grid (ACG) concept is presented for the fast and robust numerical solution of the 3D Poisson-Boltzmann Equation (PBE) governing the electrostatic interactions of large-scale biomolecules and highly charged multi-biomolecular assemblies such as ribosomes and viruses. The ACG offers numerous advantages over competing grid topologies such as regular 3D lattices and unstructured grids. For very large biological molecules and multi-biomolecule assemblies, the total number of grid-points is several orders of magnitude less than that required in a conventional lattice grid used in the current PBE solvers thus allowing the end user to obtain accurate and stable nonlinear PBE solutions on a desktop computer. Compared to tetrahedral-based unstructured grids, ACG offers a simpler hierarchical grid structure, which is naturally suited to multigrid, relieves indirect addressing requirements and uses fewer neighboring nodes in the finite difference stencils. Construction of the ACG and determination of the dielectric/ionic maps are straightforward, fast and require minimal user intervention. Charge singularities are eliminated by reformulating the problem to produce the reaction field potential in the molecular interior and the total electrostatic potential in the exterior ionic solvent region. This approach minimizes grid-dependency and alleviates the need for fine grid spacing near atomic charge sites. The technical portion of this paper contains three parts. First, the ACG and its construction for general biomolecular geometries are described. Next, a discrete approximation to the PBE upon this mesh is derived. Finally, the overall solution procedure and multigrid implementation are summarized. Results obtained with the ACG-based PBE solver are presented for: (i) a low dielectric spherical cavity, containing interior point charges, embedded in a high dielectric ionic solvent - analytical solutions are available for this case, thus allowing rigorous
A Fast and Robust Poisson-Boltzmann Solver Based on Adaptive Cartesian Grids
Boschitsch, Alexander H.; Fenley, Marcia O.
2011-01-01
An adaptive Cartesian grid (ACG) concept is presented for the fast and robust numerical solution of the 3D Poisson-Boltzmann Equation (PBE) governing the electrostatic interactions of large-scale biomolecules and highly charged multi-biomolecular assemblies such as ribosomes and viruses. The ACG offers numerous advantages over competing grid topologies such as regular 3D lattices and unstructured grids. For very large biological molecules and multi-biomolecule assemblies, the total number of grid-points is several orders of magnitude less than that required in a conventional lattice grid used in the current PBE solvers thus allowing the end user to obtain accurate and stable nonlinear PBE solutions on a desktop computer. Compared to tetrahedral-based unstructured grids, ACG offers a simpler hierarchical grid structure, which is naturally suited to multigrid, relieves indirect addressing requirements and uses fewer neighboring nodes in the finite difference stencils. Construction of the ACG and determination of the dielectric/ionic maps are straightforward, fast and require minimal user intervention. Charge singularities are eliminated by reformulating the problem to produce the reaction field potential in the molecular interior and the total electrostatic potential in the exterior ionic solvent region. This approach minimizes grid-dependency and alleviates the need for fine grid spacing near atomic charge sites. The technical portion of this paper contains three parts. First, the ACG and its construction for general biomolecular geometries are described. Next, a discrete approximation to the PBE upon this mesh is derived. Finally, the overall solution procedure and multigrid implementation are summarized. Results obtained with the ACG-based PBE solver are presented for: (i) a low dielectric spherical cavity, containing interior point charges, embedded in a high dielectric ionic solvent – analytical solutions are available for this case, thus allowing rigorous
Robust parallel iterative solvers for linear and least-squares problems, Final Technical Report
Saad, Yousef
2014-01-16
The primary goal of this project is to study and develop robust iterative methods for solving linear systems of equations and least squares systems. The focus of the Minnesota team is on algorithms development, robustness issues, and on tests and validation of the methods on realistic problems. 1. The project began with an investigation on how to practically update a preconditioner obtained from an ILU-type factorization, when the coefficient matrix changes. 2. We investigated strategies to improve robustness in parallel preconditioners in a specific case of a PDE with discontinuous coefficients. 3. We explored ways to adapt standard preconditioners for solving linear systems arising from the Helmholtz equation. These are often difficult linear systems to solve by iterative methods. 4. We have also worked on purely theoretical issues related to the analysis of Krylov subspace methods for linear systems. 5. We developed an effective strategy for performing ILU factorizations for the case when the matrix is highly indefinite. The strategy uses shifting in some optimal way. The method was extended to the solution of Helmholtz equations by using complex shifts, yielding very good results in many cases. 6. We addressed the difficult problem of preconditioning sparse systems of equations on GPUs. 7. A by-product of the above work is a software package consisting of an iterative solver library for GPUs based on CUDA. This was made publicly available. It was the first such library that offers complete iterative solvers for GPUs. 8. We considered another form of ILU which blends coarsening techniques from Multigrid with algebraic multilevel methods. 9. We have released a new version of our parallel solver, called pARMS [new version is version 3]. As part of this we have tested the code in complex settings, including the solution of Maxwell and Helmholtz equations and a problem of crystal growth. 10. As an application of polynomial preconditioning we considered the
A Robust Compressible Flow Solver for Studies on Solar Fuel Production in Microwave Plasma
NASA Astrophysics Data System (ADS)
Tadayon Mousavi, Samaneh; Koelman, Peter; Groen, Pieter Willem; van Dijk, Jan; EPG/Applied Physics/Eindhoven University of Technology Team; Dutch Institute for Fundamental Energy Research (DIFFER) Team
2016-09-01
In order to simulate the dissociation of CO2 with H2O admixture by microwave plasma for the production of solar fuels, we need a multicomponent solver that is able to capture the complex nature of the plasma by combining the chemistry, flow, and electromagnetic field. To achieve this goal, first we developed a robust finite volume compressible flow solver in C++. The solver is implemented in the framework of the PLASIMO software and will be used in complete plasma simulations later on. Due to the compressible nature of the solver, it can be used for simulation of dissociation of CO2 with H2O admixture by supersonic expansion in microwave plasmas. A spatially second order version of this solver is able to reveal the vortex flow structure of the plasmas. Capabilities of this solver are presented by benchmarking against well-established analytical and numerical test cases.
A New Robust Solver for Saturated-Unsaturated Richards' Equation
NASA Astrophysics Data System (ADS)
Barajas-Solano, D. A.; Tartakovsky, D. M.
2012-12-01
We present a novel approach for the numerical integration of the saturated-unsaturated Richards' equation, a degenerate parabolic partial differential equation that models flow in porous media. The method is based on the mixed (pore pressure-water content) form of RE, written as a set of differential algebraic equations (DAEs) of index-1 for the fully saturated case and index-2 for the partially saturated case. A DAE-based approach allows us to overcome the numerical challenges posed by the degenerate nature of the Richards' equation. The resulting set of DAEs is solved using the stiffly-accurate, single-step, 3-stage implicit Runge-Kutta method Radau IIA, chosen for its favorable accuracy and stability properties, and its ease of implementation. For each time step a nonlinear system of equations on the intermediate Runge-Kutta states of the pore pressure is solved, written so to ensure that the next step pore pressure and water content correspond to one another correctly. The implementation of our approach compares favorably to state-of-the-art DAE-based solvers in both one- and two-dimensional simulations. These solvers use multi-step backward difference formulas together with a pressure-based form of Richards' equation. To the best of our knowledge, our method is the first instance of a successful DAE-based solver that uses the mixed form of Richards' equation. We consider this a promising line of research, with future work to be done on the use of globally convergent methods for the solution of the occurring nonlinear systems of equations.
Application of Aeroelastic Solvers Based on Navier Stokes Equations
NASA Technical Reports Server (NTRS)
Keith, Theo G., Jr.; Srivastava, Rakesh
2001-01-01
The propulsion element of the NASA Advanced Subsonic Technology (AST) initiative is directed towards increasing the overall efficiency of current aircraft engines. This effort requires an increase in the efficiency of various components, such as fans, compressors, turbines etc. Improvement in engine efficiency can be accomplished through the use of lighter materials, larger diameter fans and/or higher-pressure ratio compressors. However, each of these has the potential to result in aeroelastic problems such as flutter or forced response. To address the aeroelastic problems, the Structural Dynamics Branch of NASA Glenn has been involved in the development of numerical capabilities for analyzing the aeroelastic stability characteristics and forced response of wide chord fans, multi-stage compressors and turbines. In order to design an engine to safely perform a set of desired tasks, accurate information of the stresses on the blade during the entire cycle of blade motion is required. This requirement in turn demands that accurate knowledge of steady and unsteady blade loading is available. To obtain the steady and unsteady aerodynamic forces for the complex flows around the engine components, for the flow regimes encountered by the rotor, an advanced compressible Navier-Stokes solver is required. A finite volume based Navier-Stokes solver has been developed at Mississippi State University (MSU) for solving the flow field around multistage rotors. The focus of the current research effort, under NASA Cooperative Agreement NCC3-596 was on developing an aeroelastic analysis code (entitled TURBO-AE) based on the Navier-Stokes solver developed by MSU. The TURBO-AE code has been developed for flutter analysis of turbomachine components and delivered to NASA and its industry partners. The code has been verified, validated and is being applied by NASA Glenn and by aircraft engine manufacturers to analyze the aeroelastic stability characteristics of modern fans, compressors
Working towards a numerical solver for seismic wave propagation in unsaturated porous media
NASA Astrophysics Data System (ADS)
Boxberg, Marc S.; Friederich, Wolfgang
2017-04-01
Modeling the propagation of seismic waves in porous media is becoming increasingly popular in the seismological community. However, it is still a challenging task in the field of computational seismology. Nevertheless, it is important to account for the fluid content of, e.g., reservoir rocks or soils, and the interaction between the fluid and the rock or between different immiscible fluids to accurately describe seismic wave propagation through such porous media. Often, numerical models are based on the elastic wave equation and some might include artificially introduced attenuation. This simplifies the computation, because it only approximates the physics behind that problem. However, the results are also simplified and could miss phenomena and lack accuracy in some applications. We present a numerical solver for wave propagation in porous media saturated by two immiscible fluids. It is based on Biot's theory of poroelasticity and accounts for macroscopic flow that occurs on the same scale as the wavelength of the seismic waves. Fluid flow is described by a Darcy type flow law and interactions between the fluids by means of capillary pressure curve models. In addition, consistent boundary conditions on interfaces between poroelastic media and elastic or acoustic media are derived from this poroelastic theory itself. The poroelastic solver is integrated into the larger software package NEXD that uses the nodal discontinuous Galerkin method to solve wave equations in 1D, 2D, and 3D on a mesh of linear (1D), triangular (2D), or tetrahedral (3D) elements. Triangular and tetrahedral elements have great advantages as soon as the model has a complex structure, as is often the case for geologic models. We illustrate the capabilities of the codes by numerical examples. This work can be applied to various scientific questions in, e.g., exploration and monitoring of hydrocarbon or geothermal reservoirs as well as CO2 storage sites.
Preconditioned CG-solvers and finite element grids
Bauer, R.; Selberherr, S.
1994-12-31
To extract parasitic capacitances in wiring structures of integrated circuits the authors developed the two- and three-dimensional finite element program SCAP (Smart Capacitance Analysis Program). The program computes the task of the electrostatic field from a solution of Poisson's equation via finite elements and calculates the energies from which the capacitance matrix is extracted. The unknown potential vector, which has for three-dimensional applications 5000-50000 unknowns, is computed by an ICCG solver. Currently three- and six-node triangular, four- and ten-node tetrahedral elements are supported.
Evaluating Sparse Linear System Solvers on Scalable Parallel Architectures
2008-10-01
iterations will be necessary to assure sufficient accuracy whenever we do not use a direct method to solve (1.3) or (1.5). The overall SPIKE algorithm...boosting is activated, SPIKE is not used as a direct solver but rather as a preconditioner. In this case outer iterations via a Krylov subspace method ...robustness. Preconditioning aims to improve the robustness of iterative methods by transforming the system into M^{-1}Ax = M^{-1}f, or AM^{-1}(Mx) = f. (3.2
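The left-preconditioned form M^{-1}Ax = M^{-1}f quoted above can be illustrated with a minimal sketch. It assumes the simplest possible choice, M = diag(A) (the Jacobi preconditioner), and a plain Richardson iteration instead of a Krylov method; neither choice comes from the report, and the function name and the 3x3 test system are hypothetical.

```python
def jacobi_preconditioned_richardson(A, f, iters=50):
    """Iterate x <- x + M^{-1} (f - A x) with M = diag(A).

    This is Richardson iteration applied to M^{-1} A x = M^{-1} f.
    M = diag(A) is an illustrative assumption, not the SPIKE preconditioner.
    """
    n = len(f)
    x = [0.0] * n
    for _ in range(iters):
        # Residual of the original system, r = f - A x.
        r = [f[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Apply M^{-1}: divide each residual entry by the diagonal of A.
        x = [x[i] + r[i] / A[i][i] for i in range(n)]
    return x


# Hypothetical diagonally dominant system whose exact solution is (1, 1, 1).
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
f = [5.0, 6.0, 5.0]
x = jacobi_preconditioned_richardson(A, f)
```

In practice the same M^{-1} application would be handed to a Krylov solver such as CG or GMRES; the fixed-point loop here only makes the transformation visible.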
Algorithms for parallel flow solvers on message passing architectures
NASA Technical Reports Server (NTRS)
Vanderwijngaart, Rob F.
1995-01-01
The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique that has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for the concentration on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these it can be determined what is the optimal first-processor retardation that leads to the shortest total completion time for the pipeline process. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of uncareful grid partitioning in flow solvers that employ pipeline algorithms. If grid blocks at boundaries are not at least as large in the wall-normal direction as those
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Youcef
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Some fast elliptic solvers on parallel architectures and their complexities
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Y.
1989-01-01
The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
Advances in the hydrodynamics solver of CO5BOLD
NASA Astrophysics Data System (ADS)
Freytag, Bernd
Many features of the Roe solver used in the hydrodynamics module of CO5BOLD have recently been added or overhauled, including the reconstruction methods (by adding the new second-order "Frankenstein's method"), the treatment of transversal velocities, energy-flux averaging and entropy-wave treatment at small Mach numbers, the CTU scheme to combine the one-dimensional fluxes, and additional safety measures. All this results in a significantly better behavior at low Mach number flows, and an improved stability at larger Mach numbers requiring less (or no) additional tensor viscosity, which then leads to a noticeable increase in effective resolution.
A Simple Quantum Integro-Differential Solver (SQuIDS)
NASA Astrophysics Data System (ADS)
Argüelles Delgado, Carlos A.; Salvado, Jordi; Weaver, Christopher N.
2015-11-01
Simple Quantum Integro-Differential Solver (SQuIDS) is a C++ code designed to solve semi-analytically the evolution of a set of density matrices and scalar functions. This is done efficiently by expressing all operators in an SU(N) basis. SQuIDS provides a base class from which users can derive new classes to include new non-trivial terms from the right hand sides of density matrix equations. The code was designed in the context of solving neutrino oscillation problems, but can be applied to any problem that involves solving the quantum evolution of a collection of particles with Hilbert space of dimension up to six.
High Energy Boundary Conditions for a Cartesian Mesh Euler Solver
NASA Technical Reports Server (NTRS)
Pandya, Shishir A.; Murman, Scott M.; Aftosmis, Michael J.
2004-01-01
Inlets and exhaust nozzles are often omitted or faired over in aerodynamic simulations of aircraft due to the complexities involved in the modeling of engine details such as complex geometry and flow physics. However, the assumption is often improper as inlet or plume flows have a substantial effect on vehicle aerodynamics. A tool for specifying inlet and exhaust plume conditions through the use of high-energy boundary conditions in an established inviscid flow solver is presented. The effects of the plume on the flow fields near the inlet and plume are discussed.
A Navier-Stokes solver for cascade flows
NASA Technical Reports Server (NTRS)
Arnone, A.; Swanson, R. C.
1988-01-01
A computer code for solving the Reynolds averaged full Navier-Stokes equations has been developed and applied using sheared H-type grids. The Baldwin-Lomax eddy-viscosity model is used for turbulence closure. The integration in time is based on an explicit four-stage Runge-Kutta scheme. Local time stepping, variable coefficient implicit residual smoothing, and a full multigrid method have been implemented to accelerate steady state calculations. Comparisons with experimental data show that the code is an accurate viscous solver and can give very good blade-to-blade predictions for engineering applications in less than 100 multigrid cycles on the finest mesh.
Reformulation of the Fourier-Bessel steady state mode solver
NASA Astrophysics Data System (ADS)
Gauthier, Robert C.
2016-09-01
The Fourier-Bessel resonator state mode solver is reformulated using Maxwell's field coupled curl equations. The matrix-generating expressions are greatly simplified and the number of pre-computed tables is reduced, making the technique simpler to implement on a desktop computer. The reformulation maintains the theoretical equivalence of the permittivity and permeability, and as such structures containing both electric and magnetic properties can be examined. Computation examples are presented for a surface nanoscale axial photonic resonator and a hybrid {ε, μ} quasi-crystal resonator.
Performance issues for iterative solvers in device simulation
NASA Technical Reports Server (NTRS)
Fan, Qing; Forsyth, P. A.; Mcmacken, J. R. F.; Tang, Wei-Pai
1994-01-01
Due to memory limitations, iterative methods have become the method of choice for large scale semiconductor device simulation. However, it is well known that these methods still suffer from reliability problems. The linear systems which appear in numerical simulation of semiconductor devices are notoriously ill-conditioned. In order to produce robust algorithms for practical problems, careful attention must be given to many implementation issues. This paper concentrates on strategies for developing robust preconditioners. In addition, effective data structures and convergence check issues are also discussed. These algorithms are compared with a standard direct sparse matrix solver on a variety of problems.
NASA Astrophysics Data System (ADS)
Müller, Lucas O.; Blanco, Pablo J.
2015-11-01
We present a methodology for the high order approximation of hyperbolic conservation laws in networks by using the Dumbser-Enaux-Toro solver and exact solvers for the classical Riemann problem at junctions. The proposed strategy can be applied to any hyperbolic system, conservative or non-conservative, and possibly with flux functions containing discontinuous parameters, as long as an exact or approximate Riemann problem solver is available. The methodology is implemented for a one-dimensional blood flow model that considers discontinuous variations of mechanical and geometrical properties of vessels. The achievement of formal order of accuracy, as well as the robustness of the resulting numerical scheme, is verified through the simulation of both academic tests and physiological flows.
Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method
NASA Astrophysics Data System (ADS)
Kruglyakov, M.; Geraskin, A.; Kuvshinov, A.
2016-11-01
We present a novel, open source 3-D MT forward solver based on a method of integral equations (IE) with contracting kernel. Special attention in the solver is paid to accurate calculations of Green's functions and their integrals which are cornerstones of any IE solution. The solver supports massive parallelization and is able to deal with highly detailed and contrasting models. We report results of a 3-D numerical experiment aimed at analyzing the accuracy and scalability of the code.
Covariant approximation averaging
NASA Astrophysics Data System (ADS)
Shintani, Eigo; Arthur, Rudy; Blum, Thomas; Izubuchi, Taku; Jung, Chulwoo; Lehner, Christoph
2015-06-01
We present a new class of statistical error reduction techniques for Monte Carlo simulations. Using covariant symmetries, we show that correlation functions can be constructed from inexpensive approximations without introducing any systematic bias in the final result. We introduce a new class of covariant approximation averaging techniques, known as all-mode averaging (AMA), in which the approximation takes account of contributions of all eigenmodes through the inverse of the Dirac operator computed from the conjugate gradient method with a relaxed stopping condition. In this paper we compare the performance and computational cost of our new method with traditional methods using correlation functions and masses of the pion, nucleon, and vector meson in Nf=2+1 lattice QCD using domain-wall fermions. This comparison indicates that AMA significantly reduces statistical errors in Monte Carlo calculations over conventional methods for the same cost.
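The idea of combining one expensive evaluation with many cheap approximate ones, without biasing the result, can be sketched on toy data. The example below is an illustration under stated assumptions, not the authors' lattice QCD implementation: the "exact" and "approximate" observables are synthetic numbers, the 16 source positions are assumed to be related by a symmetry so that each has the same expectation, and the names `ama_estimate` and `toy_config` are hypothetical.

```python
import random
import statistics


def ama_estimate(exact_x0, approx_x0, approx_all):
    """Improved estimator: (exact - approx) at one source plus the
    average of the cheap approximation over all symmetry-related sources."""
    return (exact_x0 - approx_x0) + statistics.fmean(approx_all)


def toy_config(rng, mu=1.0, n_sources=16):
    """Synthetic 'configuration': noisy exact values at each source and a
    deliberately biased but strongly correlated approximation of them."""
    exact = [mu + rng.gauss(0.0, 1.0) for _ in range(n_sources)]
    approx = [0.9 * e + 0.05 for e in exact]  # cheap, biased stand-in
    return exact, approx


rng = random.Random(3)
improved, cheap_only = [], []
for _ in range(2000):
    exact, approx = toy_config(rng)
    # Only exact[0] is treated as "paid for"; the rest use the approximation.
    improved.append(ama_estimate(exact[0], approx[0], approx))
    cheap_only.append(statistics.fmean(approx))

improved_mean = statistics.fmean(improved)      # estimates mu without bias
cheap_only_mean = statistics.fmean(cheap_only)  # inherits the approximation's bias
```

The correction term removes the bias of the approximation in expectation, while the average over many cheap evaluations supplies most of the variance reduction.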
Approximate Bayesian Computation
NASA Astrophysics Data System (ADS)
Cisewski, Jessi
2015-08-01
Explicitly specifying a likelihood function is becoming increasingly difficult for many problems in astronomy. Astronomers often specify a simpler approximate likelihood - leaving out important aspects of a more realistic model. Approximate Bayesian computation (ABC) provides a framework for performing inference in cases where the likelihood is not available or intractable. I will introduce ABC and explain how it can be a useful tool for astronomers. In particular, I will focus on the eccentricity distribution for a sample of exoplanets with multiple sub-populations.
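A minimal sketch of the simplest ABC variant, rejection sampling, may help make the framework concrete. It assumes a toy problem that is not from the talk (inferring the mean of a normal distribution with known unit variance), a uniform prior, the sample mean as summary statistic, and a hand-picked tolerance; all function and parameter names are hypothetical.

```python
import random
import statistics


def abc_rejection(observed, prior_sampler, simulator, summary,
                  tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lands within `tolerance`
    of the observed summary; no likelihood is ever evaluated."""
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        simulated = simulator(theta, len(observed), rng)
        if abs(summary(simulated) - target) < tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
# Synthetic "observations": 50 draws from N(3, 1); the true mean is assumed.
observed = [rng.gauss(3.0, 1.0) for _ in range(50)]

accepted = abc_rejection(
    observed,
    prior_sampler=lambda r: r.uniform(0.0, 6.0),
    simulator=lambda mu, n, r: [r.gauss(mu, 1.0) for _ in range(n)],
    summary=statistics.fmean,
    tolerance=0.1,
    n_draws=5000,
    rng=rng,
)
posterior_mean = statistics.fmean(accepted)
```

The accepted values approximate the posterior; shrinking the tolerance improves the approximation at the price of a lower acceptance rate.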
Multiply scaled constrained nonlinear equation solvers. [for nonlinear heat conduction problems
NASA Technical Reports Server (NTRS)
Padovan, Joe; Krishna, Lala
1986-01-01
To improve the numerical stability of nonlinear equation solvers, a partitioned multiply scaled constraint scheme is developed. This scheme enables hierarchical levels of control for nonlinear equation solvers. To complement the procedure, partitioned convergence checks are established along with self-adaptive partitioning schemes. Overall, such procedures greatly enhance the numerical stability of the original solvers. To demonstrate and motivate the development of the scheme, the problem of nonlinear heat conduction is considered. In this context the main emphasis is given to successive substitution-type schemes. To verify the improved numerical characteristics associated with partitioned multiply scaled solvers, results are presented for several benchmark examples.
On the implicit density based OpenFOAM solver for turbulent compressible flows
NASA Astrophysics Data System (ADS)
Fürst, Jiří
The contribution deals with the development of a coupled implicit density-based solver for compressible flows in the framework of the open source package OpenFOAM. Although the standard distribution of OpenFOAM contains several ready-made segregated solvers for compressible flows, the performance of those solvers is rather weak in the case of transonic flows. Therefore we extend the work of Shen [15] and develop an implicit semi-coupled solver. The main flow field variables are updated using the lower-upper symmetric Gauss-Seidel method (LU-SGS), whereas the turbulence model variables are updated using the implicit Euler method.
NASA Astrophysics Data System (ADS)
Vides, Jeaniffer; Nkonga, Boniface; Audit, Edouard
2015-01-01
We derive a simple method to numerically approximate the solution of the two-dimensional Riemann problem for gas dynamics, using the literal extension of the well-known HLL formalism as its basis. Essentially, any strategy attempting to extend the three-state HLL Riemann solver to multiple space dimensions will by some means involve a piecewise constant approximation of the complex two-dimensional interaction of waves, and our numerical scheme is not the exception. In order to determine closed form expressions for the involved fluxes, we rely on the equivalence between the consistency condition and the use of Rankine-Hugoniot conditions that hold across the outermost waves. The proposed scheme is carefully designed to simplify its eventual numerical implementation and its advantages are analytically attested. In addition, we show that the proposed solver can be applied to obtain the edge-centered electric fields needed in the constrained transport technique for the ideal magnetohydrodynamic (MHD) equations. We present several numerical results for hydrodynamics and magnetohydrodynamics that display the scheme's accuracy and its ability to be applied to various systems of conservation laws.
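For orientation, the one-dimensional HLL flux that the abstract takes as its starting point can be sketched as follows. This is the classical 1D formula for the Euler equations of an ideal gas, not the authors' two-dimensional scheme; the simple min/max wave-speed estimates, the value gamma = 1.4, and all function names are assumptions made for the example.

```python
import math

GAMMA = 1.4  # ratio of specific heats; an assumed ideal-gas value


def conserved(rho, u, p):
    """Conserved variables (density, momentum, total energy)."""
    energy = p / (GAMMA - 1.0) + 0.5 * rho * u * u
    return (rho, rho * u, energy)


def euler_flux(rho, u, p):
    """Physical flux of the 1D Euler equations."""
    energy = p / (GAMMA - 1.0) + 0.5 * rho * u * u
    return (rho * u, rho * u * u + p, u * (energy + p))


def hll_flux(left, right):
    """HLL flux between primitive states left/right = (rho, u, p)."""
    rho_l, u_l, p_l = left
    rho_r, u_r, p_r = right
    a_l = math.sqrt(GAMMA * p_l / rho_l)  # sound speeds
    a_r = math.sqrt(GAMMA * p_r / rho_r)
    # Simple estimates of the outermost wave speeds (assumed choice).
    s_l = min(u_l - a_l, u_r - a_r)
    s_r = max(u_l + a_l, u_r + a_r)
    f_l, f_r = euler_flux(*left), euler_flux(*right)
    if s_l >= 0.0:
        return f_l  # all waves move right
    if s_r <= 0.0:
        return f_r  # all waves move left
    q_l, q_r = conserved(*left), conserved(*right)
    # Flux consistent with Rankine-Hugoniot conditions across s_l and s_r.
    return tuple(
        (s_r * fl - s_l * fr + s_l * s_r * (qr - ql)) / (s_r - s_l)
        for fl, fr, ql, qr in zip(f_l, f_r, q_l, q_r)
    )


# Sod shock-tube initial states as a usage example.
sod_flux = hll_flux((1.0, 0.0, 1.0), (0.125, 0.0, 0.1))
```

When both states coincide the formula reduces to the physical flux, which is the consistency property the abstract refers to.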
Multicriteria approximation through decomposition
Burch, C.; Krumke, S.; Marathe, M.; Phillips, C.; Sundberg, E.
1998-06-01
The authors propose a general technique called solution decomposition to devise approximation algorithms with provable performance guarantees. The technique is applicable to a large class of combinatorial optimization problems that can be formulated as integer linear programs. Two key ingredients of their technique involve finding a decomposition of a fractional solution into a convex combination of feasible integral solutions and devising generic approximation algorithms based on calls to such decompositions as oracles. The technique is closely related to randomized rounding. Their method yields as corollaries unified solutions to a number of well studied problems and it provides the first approximation algorithms with provable guarantees for a number of new problems. The particular results obtained in this paper include the following: (1) the authors demonstrate how the technique can be used to provide more understanding of previous results and new algorithms for classical problems such as Multicriteria Spanning Trees, and Suitcase Packing; (2) they also show how the ideas can be extended to apply to multicriteria optimization problems, in which they wish to minimize a certain objective function subject to one or more budget constraints. As corollaries they obtain first non-trivial multicriteria approximation algorithms for problems including the k-Hurdle and the Network Inhibition problems.
Multicriteria approximation through decomposition
Burch, C.; Krumke, S.; Marathe, M.; Phillips, C.; Sundberg, E.
1997-12-01
The authors propose a general technique called solution decomposition to devise approximation algorithms with provable performance guarantees. The technique is applicable to a large class of combinatorial optimization problems that can be formulated as integer linear programs. Two key ingredients of the technique involve finding a decomposition of a fractional solution into a convex combination of feasible integral solutions and devising generic approximation algorithms based on calls to such decompositions as oracles. The technique is closely related to randomized rounding. The method yields as corollaries unified solutions to a number of well studied problems and it provides the first approximation algorithms with provable guarantees for a number of new problems. The particular results obtained in this paper include the following: (1) The authors demonstrate how the technique can be used to provide more understanding of previous results and new algorithms for classical problems such as Multicriteria Spanning Trees, and Suitcase Packing. (2) They show how the ideas can be extended to apply to multicriteria optimization problems, in which they wish to minimize a certain objective function subject to one or more budget constraints. As corollaries they obtain first non-trivial multicriteria approximation algorithms for problems including the k-Hurdle and the Network Inhibition problems.
ERIC Educational Resources Information Center
Wolff, Hans
This paper deals with a stochastic process for the approximation of the root of a regression equation. This process was first suggested by Robbins and Monro. The main result here is a necessary and sufficient condition on the iteration coefficients for convergence of the process (convergence with probability one and convergence in the quadratic…
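The Robbins-Monro process itself can be sketched in a few lines. The example assumes the classical step sizes a_n = c/n (whose sum diverges while the sum of squares converges) and a hypothetical noisy linear regression function with root at x = 2; the necessary and sufficient condition derived in the paper is not reproduced here.

```python
import random


def robbins_monro(noisy_response, target, x0, n_steps, c=1.0):
    """Iterate x_{n+1} = x_n - a_n * (Y_n - target) with a_n = c / n,
    where Y_n is a noisy observation of the regression function at x_n."""
    x = x0
    for n in range(1, n_steps + 1):
        a_n = c / n
        x -= a_n * (noisy_response(x) - target)
    return x


rng = random.Random(1)


def noisy_line(x):
    """Hypothetical increasing regression function M(x) = 2x - 4 plus noise."""
    return 2.0 * x - 4.0 + rng.gauss(0.0, 1.0)


root = robbins_monro(noisy_line, target=0.0, x0=0.0, n_steps=20000)
```

Each step uses only one noisy measurement, so the iterate settles near the root as the step sizes shrink.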
Approximating Integrals Using Probability
ERIC Educational Resources Information Center
Maruszewski, Richard F., Jr.; Caudle, Kyle A.
2005-01-01
As part of a discussion on Monte Carlo methods, which outlines how to use probability expectations to approximate the value of a definite integral. The purpose of this paper is to elaborate on this technique and then to show several examples using Visual Basic as a programming tool. It is an interesting method because it combines two branches of…
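The expectation-based estimate described above can be sketched briefly. The paper uses Visual Basic; Python is used here instead, and the integrand x^2 on [0, 1] (exact value 1/3) and the function name are illustrative choices rather than examples taken from the paper.

```python
import random


def mc_integrate(f, a, b, n_samples, rng):
    """Estimate the integral of f over [a, b] as (b - a) * E[f(U)],
    with U uniform on [a, b], using a sample mean for the expectation."""
    total = sum(f(rng.uniform(a, b)) for _ in range(n_samples))
    return (b - a) * total / n_samples


rng = random.Random(2)
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000, rng)
```

The statistical error decreases like one over the square root of the number of samples, so each additional digit of accuracy costs roughly a hundred times more samples.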
FIESTA 2: Parallelizeable multiloop numerical calculations
NASA Astrophysics Data System (ADS)
Smirnov, A. V.; Smirnov, V. A.; Tentyukov, M.
2011-03-01
The program FIESTA has been completely rewritten. Now it can be used not only as a tool to evaluate Feynman integrals numerically, but also to expand Feynman integrals automatically in limits of momenta and masses with the use of sector decompositions and Mellin-Barnes representations. Other important improvements to the code are complete parallelization (even to multiple computers), high-precision arithmetics (allowing to calculate integrals which were undoable before), new integrators, Speer sectors as a strategy, the possibility to evaluate more general parametric integrals.
Program summary
Program title: FIESTA 2
Catalogue identifier: AECP_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AECP_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU GPL version 2
No. of lines in distributed program, including test data, etc.: 39 783
No. of bytes in distributed program, including test data, etc.: 6 154 515
Distribution format: tar.gz
Programming language: Wolfram Mathematica 6.0 (or higher) and C
Computer: From a desktop PC to a supercomputer
Operating system: Unix, Linux, Windows, Mac OS X
Has the code been vectorised or parallelized?: Yes, the code has been parallelized for use on multi-kernel computers as well as clusters via Mathlink over the TCP/IP protocol. The program can work successfully with a single processor, however, it is ready to work in a parallel environment and the use of multi-kernel processor and multi-processor computers significantly speeds up the calculation; on clusters the calculation speed can be improved even further.
RAM: Depends on the complexity of the problem
Classification: 4.4, 4.12, 5, 6.5
Catalogue identifier of previous version: AECP_v1_0
Journal reference of previous version: Comput. Phys. Comm. 180 (2009) 735
External routines: QLink [1], Cuba library [2], MPFR [3]
Does the new version supersede the previous version?: Yes
Nature of problem: The sector decomposition approach to evaluating Feynman integrals falls apart into the sector decomposition itself, where one has to minimize the number of sectors; the pole resolution and epsilon expansion; and the numerical integration of the resulting expression.
Solution method: The sector decomposition is based on a new strategy as well as on classical strategies such as Speer sectors. The sector decomposition, pole resolution and epsilon-expansion are performed in Wolfram Mathematica 6.0 or, preferably, 7.0 (enabling parallelization) [4]. The data is stored on hard disk via a special program, QLink [1]. The expression for integration is passed to the C-part of the code, that parses the string and performs the integration by one of the algorithms in the Cuba library package [2]. This part of the evaluation is perfectly parallelized on multi-kernel computers.
Riemann solvers and Alfven waves in black hole magnetospheres
NASA Astrophysics Data System (ADS)
Punsly, Brian; Balsara, Dinshaw; Kim, Jinho; Garain, Sudip
2016-09-01
In the magnetosphere of a rotating black hole, an inner Alfven critical surface (IACS) must be crossed by inflowing plasma. Inside the IACS, Alfven waves are inward directed toward the black hole. The majority of the proper volume of the active region of spacetime (the ergosphere) is inside of the IACS. The charge and the totally transverse momentum flux (the momentum flux transverse to both the wave normal and the unperturbed magnetic field) are both determined exclusively by the Alfven polarization. Thus, it is important for numerical simulations of black hole magnetospheres to minimize the dissipation of Alfven waves. Elements of the dissipated wave emerge in adjacent cells regardless of the IACS; there is no mechanism to prevent Alfvenic information from crossing outward. Thus, numerical dissipation can affect how simulated magnetospheres attain the substantial Goldreich-Julian charge density associated with the rotating magnetic field. In order to help minimize dissipation of Alfven waves in relativistic numerical simulations we have formulated a one-dimensional Riemann solver, called HLLI, which incorporates the Alfven discontinuity and the contact discontinuity. We have also formulated a multidimensional Riemann solver, called MuSIC, that enables low dissipation propagation of Alfven waves in multiple dimensions. The importance of higher order schemes in lowering the numerical dissipation of Alfven waves is also catalogued.
A massively parallel fractional step solver for incompressible flows
Houzeaux, G.; Vazquez, M.; Aubry, R.; Cela, J.M.
2009-09-20
This paper presents a parallel implementation of fractional step solvers for the incompressible Navier-Stokes equations using an algebraic approach. Under this framework, predictor-corrector and incremental projection schemes are seen as sub-classes of the same class, making apparent their differences and similarities. An additional advantage of this approach is to set a common basis for a parallelization strategy, which can be extended to other split techniques or to compressible flows. The predictor-corrector scheme consists in solving the momentum equation and a modified 'continuity' equation (namely a simple iteration for the pressure Schur complement) consecutively in order to converge to the monolithic solution, thus avoiding fractional errors. On the other hand, the incremental projection scheme solves only one iteration of the predictor-corrector per time step and adds a correction equation to fulfill the mass conservation. As shown in the paper, these two schemes are very well suited for massively parallel implementation. In fact, when compared with monolithic schemes, simpler solvers and preconditioners can be used to solve the non-symmetric momentum equations (GMRES, Bi-CGSTAB) and to solve the symmetric continuity equation (CG, Deflated CG). This gives the algorithm good speedup properties. The implementation of the mesh partitioning technique is presented, as well as the parallel performance and speedups for thousands of processors.
Agglomeration Multigrid for an Unstructured-Grid Flow Solver
NASA Technical Reports Server (NTRS)
Frink, Neal; Pandya, Mohagna J.
2004-01-01
An agglomeration multigrid scheme has been implemented into the sequential version of the NASA code USM3Dns, a tetrahedral cell-centered finite volume Euler/Navier-Stokes flow solver. Efficiency and robustness of the multigrid-enhanced flow solver have been assessed for three configurations assuming an inviscid flow and one configuration assuming a viscous fully turbulent flow. The inviscid studies include a transonic flow over the ONERA M6 wing and a generic business jet with flow-through nacelles and a low subsonic flow over a high-lift trapezoidal wing. The viscous case includes a fully turbulent flow over the RAE 2822 rectangular wing. The multigrid solutions converged with 12%-33% of the Central Processing Unit (CPU) time required by the solutions obtained without multigrid. For all of the inviscid cases, multigrid in conjunction with an explicit time-stepping scheme performed the best with regard to the run time memory and CPU time requirements. However, for the viscous case multigrid had to be used with an implicit backward Euler time-stepping scheme that increased the run time memory requirement by 22% as compared to the run made without multigrid.
A Hybrid Stiff Solver for the Rayleigh-Plesset Equation
NASA Astrophysics Data System (ADS)
Alsayegh, Mutaz; Lee, Chung-Min
2011-11-01
We seek to apply efficient computational algorithms to investigate the locations of bubble concentrations in liquid flow. In flows with large velocities, bubbles tend to form in concentrated areas. Moreover, experiments show that bubbles formed at high velocities release a large amount of energy once they collapse, causing damage to equipment and objects that are in the path of the flow. To gain more insight into the formation of these bubbles, we will first study the dynamics of a single bubble and assume the bubble is a sphere. The dynamics of the bubble in terms of its radius and the driving pressure is modeled by the Rayleigh-Plesset (RP) equation. The RP equation is a second-order nonlinear stiff ordinary differential equation (ODE) and, theoretically, its solution can be obtained numerically using Finite Difference (FD) methods. However, under large pressure variations, the rate of change of the bubble's radius approaches infinity when the bubble is collapsing. Explicit numerical integration methods require time steps on the order of $10^{-12}$ s to achieve stable solutions. Iterations at this time scale are highly impractical and require immense CPU time. A stiff ODE solver is therefore needed to reduce the computational cost, and we would like to devise a hybrid algorithm that automatically selects between an explicit method and the stiff ODE solver. Once we have a robust implementation, we will use it to process the data and analyze the relations between bubble locations and flow structures.
Using computer algebra and SMT solvers in algebraic biology
NASA Astrophysics Data System (ADS)
Pineda Osorio, Mateo
2014-05-01
Biological processes are represented as Boolean networks in discrete time. The dynamics within these networks are approached with the help of SMT solvers and computer algebra. Software such as Maple and Z3 was used in this case. The number of stationary states for each network was calculated. The network studied here corresponds to the immune system under the effects of drastic mood changes. Mood is considered as a Boolean variable that affects the entire dynamics of the immune system, changing the Boolean satisfiability and the number of stationary states of the immune network. The results obtained show Z3's great potential as an SMT solver. Some of these results were verified in Maple, even though it proved less suitable for this approach to the problem. The solving code was constructed using Z3-Python and Z3-SMT-LiB. The results obtained are important in biological systems and are expected to help in the design of immune therapies. As a future line of research, more complex Boolean network representations of the immune system as well as the whole psychological apparatus are suggested.
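To make the notion of counting stationary states concrete, the following minimal Python sketch enumerates the fixed points of a small Boolean network by brute force. The three-node network, its update rules and the names `step` and `stationary_states` are hypothetical and invented for illustration only; they are not the immune network of the abstract, and exhaustive enumeration is swapped in here for the SMT-based approach the author describes.

```python
from itertools import product

# Hypothetical three-node network, invented for illustration only.
# State order is (a, b, c); `mood` is an external Boolean input.
def step(state, mood):
    """Apply one synchronous update of the toy network."""
    a, b, c = state
    return (
        b and not mood,             # a follows b unless mood is set
        a or c,                     # b is activated by a or by c
        c and not a and not mood,   # c persists only if a and mood are off
    )

def stationary_states(mood):
    """Return every state s with step(s, mood) == s (exhaustive search)."""
    return [s for s in product((False, True), repeat=3) if step(s, mood) == s]

fixed_calm = stationary_states(mood=False)
fixed_stressed = stationary_states(mood=True)
print(len(fixed_calm), len(fixed_stressed))
```

In this toy network, switching the mood variable changes the number of stationary states. Exhaustive search scales as 2^n in the number of nodes, which is one reason a solver-based formulation is attractive for larger networks.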
Parareal in time 3D numerical solver for the LWR Benchmark neutron diffusion transient model
Baudron, Anne-Marie; Riahi, Mohamed Kamel; Salomon, Julien
2014-12-15
In this paper we present a time-parallel algorithm for the 3D neutron calculation of a transient model in a nuclear reactor core. The neutron calculation consists in numerically solving the time-dependent diffusion approximation equation, which is a simplified transport equation. The numerical resolution is done with a finite element method based on a tetrahedral meshing of the computational domain, representing the reactor core, and time discretization is achieved using a θ-scheme. The transient model presents moving control rods during the time of the reaction. Therefore, cross-sections (piecewise constants) are taken into account by interpolations with respect to the velocity of the control rods. The parallelism across time is achieved by an adequate use of the parareal in time algorithm for the problem at hand. This parallel method is a predictor-corrector scheme that iteratively combines the use of two kinds of numerical propagators, one coarse and one fine. Our method is made efficient by means of a coarse solver defined with a large time step and a fixed-position control rod model, while the fine propagator is assumed to be a high-order numerical approximation of the full model. The parallel implementation of our method provides good scalability of the algorithm. Numerical results show the efficiency of the parareal method on a large light water reactor transient model corresponding to the Langenbuch–Maurer–Werner benchmark.
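The predictor-corrector structure of parareal described above can be sketched on a toy problem. The Python example below is a minimal sketch, assuming the scalar test equation y' = LAMBDA * y in place of the neutron diffusion model; the function names, the choice of implicit Euler as the coarse propagator and Crank-Nicolson as the fine propagator are assumptions made for illustration, not the authors' implementation.

```python
import math

LAMBDA = -1.0  # decay rate of the toy problem y' = LAMBDA * y (an assumption)

def theta_step(y, dt, theta=0.5):
    """One theta-scheme step for y' = LAMBDA * y (theta=0.5 is Crank-Nicolson)."""
    return y * (1 + (1 - theta) * dt * LAMBDA) / (1 - theta * dt * LAMBDA)

def coarse(y, dt):
    """Coarse propagator: a single implicit Euler step over the whole window."""
    return theta_step(y, dt, theta=1.0)

def fine(y, dt, substeps=100):
    """Fine propagator: many small Crank-Nicolson steps over the same window."""
    for _ in range(substeps):
        y = theta_step(y, dt / substeps)
    return y

def parareal(y0, t_end, windows, iterations):
    """Return the values at the window boundaries after the given iterations."""
    dt = t_end / windows
    u = [y0]
    for n in range(windows):          # initial guess: sequential coarse sweep
        u.append(coarse(u[n], dt))
    for _ in range(iterations):
        # The fine solves are independent and could run in parallel.
        f_old = [fine(u[n], dt) for n in range(windows)]
        g_old = [coarse(u[n], dt) for n in range(windows)]
        new = [y0]
        for n in range(windows):
            # Correction: new coarse prediction + (fine - old coarse).
            new.append(coarse(new[n], dt) + f_old[n] - g_old[n])
        u = new
    return u

solution = parareal(y0=1.0, t_end=2.0, windows=10, iterations=3)
error = abs(solution[-1] - math.exp(LAMBDA * 2.0))
print(error)
```

Only the cheap coarse sweep is sequential; after as many iterations as there are windows, the result coincides with the serial fine solution up to rounding, so a speed-up requires convergence in far fewer iterations.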
Optimizing the Zeldovich approximation
NASA Technical Reports Server (NTRS)
Melott, Adrian L.; Pellman, Todd F.; Shandarin, Sergei F.
1994-01-01
We have recently learned that the Zeldovich approximation can be successfully used for a far wider range of gravitational instability scenarios than formerly proposed; we study here how to extend this range. In previous work (Coles, Melott and Shandarin 1993, hereafter CMS) we studied the accuracy of several analytic approximations to gravitational clustering in the mildly nonlinear regime. We found that what we called the 'truncated Zeldovich approximation' (TZA) was better than any other (except in one case the ordinary Zeldovich approximation) over a wide range from linear to mildly nonlinear ($\sigma \approx 3$) regimes. TZA was specified by setting Fourier amplitudes equal to zero for all wavenumbers greater than $k_{nl}$, where $k_{nl}$ marks the transition to the nonlinear regime. Here, we study the cross correlation of generalized TZA with a group of n-body simulations for three shapes of window function: sharp k-truncation (as in CMS), a tophat in coordinate space, or a Gaussian. We also study the variation in the cross correlation as a function of initial truncation scale within each type. We find that k-truncation, which was so much better than other things tried in CMS, is the worst of these three window shapes. We find that a Gaussian window $e^{-k^2/2k_G^2}$ applied to the initial Fourier amplitudes is the best choice. It produces a greatly improved cross correlation in those cases which most needed improvement, e.g. those with more small-scale power in the initial conditions. The optimum choice of $k_G$ for the Gaussian window is (a somewhat spectrum-dependent) 1 to 1.5 times $k_{nl}$. Although all three windows produce similar power spectra and density distribution functions after application of the Zeldovich approximation, the agreement of the phases of the Fourier components with the n-body simulation is better for the Gaussian window. We therefore ascribe the success of the best-choice Gaussian window to its superior treatment
NASA Technical Reports Server (NTRS)
Merrill, W. C.
1978-01-01
The Routh approximation technique for reducing the complexity of system models was applied in the frequency domain to a 16th order, state variable model of the F100 engine and to a 43d order, transfer function model of a launch vehicle boost pump pressure regulator. The results motivate extending the frequency domain formulation of the Routh method to the time domain in order to handle the state variable formulation directly. The time domain formulation was derived and a characterization that specifies all possible Routh similarity transformations was given. The characterization was computed by solving two eigenvalue-eigenvector problems. The application of the time domain Routh technique to the state variable engine model is described, and some results are given. Additional computational problems are discussed, including an optimization procedure that can improve the approximation accuracy by taking advantage of the transformation characterization.
Topics in Metric Approximation
NASA Astrophysics Data System (ADS)
Leeb, William Edward
This thesis develops effective approximations of certain metrics that occur frequently in pure and applied mathematics. We show that distances that often arise in applications, such as the Earth Mover's Distance between two probability measures, can be approximated by easily computed formulas for a wide variety of ground distances. We develop simple and easily computed characterizations both of norms measuring a function's regularity -- such as the Lipschitz norm -- and of their duals. We are particularly concerned with the tensor product of metric spaces, where the natural notion of regularity is not the Lipschitz condition but the mixed Lipschitz condition. A theme that runs throughout this thesis is that snowflake metrics (metrics raised to a power less than 1) are often better-behaved than ordinary metrics. For example, we show that snowflake metrics on finite spaces can be approximated by the average of tree metrics with a distortion bounded by intrinsic geometric characteristics of the space and not the number of points. Many of the metrics for which we characterize the Lipschitz space and its dual are snowflake metrics. We also present applications of the characterization of certain regularity norms to the problem of recovering a matrix that has been corrupted by noise. We are able to achieve an optimal rate of recovery for certain families of matrices by exploiting the relationship between mixed-variable regularity conditions and the decay of a function's coefficients in a certain orthonormal basis.
Miller, Gregory H.
2003-08-06
In this paper we present a general iterative method for the solution of the Riemann problem for hyperbolic systems of PDEs. The method is based on the multiple shooting method for free boundary value problems. We demonstrate the method by solving one-dimensional Riemann problems for hyperelastic solid mechanics. Even for conditions representative of routine laboratory conditions and military ballistics, dramatic differences are seen between the exact and approximate Riemann solution. The greatest discrepancy arises from misallocation of energy between compressional and thermal modes by the approximate solver, resulting in nonphysical entropy and temperature estimates. Several pathological conditions arise in common practice, and modifications to the method to handle these are discussed. These include points where genuine nonlinearity is lost, degeneracies, and eigenvector deficiencies that occur upon melting.
A fast solver for systems of reaction-diffusion equations.
Garbey, M.; Kaper, H. G.; Romanyukha, N.
2001-04-20
In this paper we present a fast algorithm for the numerical solution of systems of reaction-diffusion equations, $\partial_t u + a \cdot \nabla u = \Delta u + f(x,t,u)$, $x \in \Omega \subset \mathbb{R}^3$, $t > 0$. Here, u is a vector-valued function, $u \equiv u(x,t) \in \mathbb{R}^m$, where m is large, and the corresponding system of ODEs, $\partial_t u = F(x,t,u)$, is stiff. Typical examples arise in air pollution studies, where a is the given wind field and the nonlinear function F models the atmospheric chemistry. The time integration of Eq. (1) is best handled by the method of characteristics. The problem is thus reduced to designing for the reaction-diffusion part a fast solver that has good stability properties for the given time step and does not require the computation of the full Jacobi matrix. An operator-splitting technique, even a high-order one, combining a fast nonlinear ODE solver with an efficient solver for the diffusion operator is less effective when the reaction term is stiff. In fact, the classical Strang splitting method may underperform a first-order source splitting method. The algorithm we propose in this paper uses an a posteriori filtering technique to stabilize the computation of the diffusion term. The algorithm parallelizes well, because the solution of the large system of ODEs is done pointwise; however, the integration of the chemistry may lead to load-balancing problems. The Tchebycheff acceleration technique proposed in offers an alternative that complements the approach presented here. To facilitate the presentation, we limit the discussion to domains $\Omega$ that either admit a regular discretization grid or decompose into subdomains that admit regular discretization grids. We describe the algorithm for one-dimensional domains in Section 2 and for multidimensional domains in Section 3. Section 4 briefly outlines future work.
NASA Astrophysics Data System (ADS)
Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María
2014-06-01
We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed a series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with the biconjugate gradient stabilized method. The results have shown that the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, in cases where other preconditioners succeed in converging to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code, by up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures a grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the
Evaluation of linear solvers for oil reservoir simulation problems. Part 2: The fully implicit case
Joubert, W.; Janardhan, R.
1997-12-01
A previous paper [Joubert/Biswas 1997] contained investigations of linear solver performance for matrices arising from Amoco's Falcon parallel oil reservoir simulation code using the IMPES formulation (implicit pressure, explicit saturation). In this companion paper, similar issues are explored for linear solvers applied to matrices arising from more difficult fully implicit problems. The results of numerical experiments are given.
T2CG1, a package of preconditioned conjugate gradient solvers for TOUGH2
Moridis, G.; Pruess, K.; Antunez, E.
1994-03-01
Most of the computational work in the numerical simulation of fluid and heat flows in permeable media arises in the solution of large systems of linear equations. The simplest technique for solving such equations is by direct methods. However, because of large storage requirements and accumulation of roundoff errors, the application of direct solution techniques is limited, depending on matrix bandwidth, to systems of a few hundred to at most a few thousand simultaneous equations. T2CG1, a package of preconditioned conjugate gradient solvers, has been added to TOUGH2 to complement its direct solver and significantly increase the size of problems tractable on PCs. T2CG1 includes three different solvers: a Bi-Conjugate Gradient (BCG) solver, a Bi-Conjugate Gradient Squared (BCGS) solver, and a Generalized Minimum Residual (GMRES) solver. Results from six test problems with up to 30,000 equations show that (1) T2CG1 is significantly (and invariably) faster and requires far less memory than the MA28 direct solver, (2) it makes possible the solution of very large three-dimensional problems on PCs, and (3) the BCGS solver is the fastest of the three in the tested problems. Sample problems are presented related to heat and fluid flow at Yucca Mountain and WIPP, environmental remediation by the Thermal Enhanced Vapor Extraction System, and geothermal resources.
Chalasani, P.; Saias, I.; Jha, S.
1996-04-08
As increasingly large volumes of sophisticated options (called derivative securities) are traded in world financial markets, determining a fair price for these options has become an important and difficult computational problem. Many valuation codes use the binomial pricing model, in which the stock price is driven by a random walk. In this model, the value of an n-period option on a stock is the expected time-discounted value of the future cash flow on an n-period stock price path. Path-dependent options are particularly difficult to value since the future cash flow depends on the entire stock price path rather than on just the final stock price. Currently such options are approximately priced by Monte Carlo methods with error bounds that hold only with high probability and which are reduced by increasing the number of simulation runs. In this paper the authors show that pricing an arbitrary path-dependent option is #P-hard. They show that certain types of path-dependent options can be valued exactly in polynomial time. Asian options are path-dependent options that are particularly hard to price, and for these they design deterministic polynomial-time approximate algorithms. They show that the value of a perpetual American put option (which can be computed in constant time) is in many cases a good approximation to the value of an otherwise identical n-period American put option. In contrast to Monte Carlo methods, the algorithms have guaranteed error bounds that are polynomially small (and in some cases exponentially small) in the maturity n. For the error analysis they derive large-deviation results for random walks that may be of independent interest.
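The difficulty of path-dependent pricing in the binomial model can be seen in a minimal Python sketch that values an arithmetic-average Asian call by summing over every price path. This is the naive exponential-time baseline, not the authors' polynomial-time algorithm; the function name, the parameter names and the convention of averaging the prices at periods 1 to n are assumptions made for illustration.

```python
from itertools import product

def asian_call_by_enumeration(s0, strike, up, down, rate, periods):
    """Price an arithmetic-average Asian call by summing over all 2**n paths.

    Assumes a binomial model with per-period factors `up` and `down`,
    a per-period interest rate `rate`, and an average taken over the
    prices at periods 1..n (a convention chosen for this sketch).
    """
    # Risk-neutral probability of an up move.
    p = (1 + rate - down) / (up - down)
    value = 0.0
    for moves in product((True, False), repeat=periods):
        price, total, prob = s0, 0.0, 1.0
        for went_up in moves:
            price *= up if went_up else down
            total += price
            prob *= p if went_up else 1 - p
        payoff = max(total / periods - strike, 0.0)
        value += prob * payoff
    # Discount the expected payoff back to time zero.
    return value / (1 + rate) ** periods

price = asian_call_by_enumeration(100.0, 100.0, 1.1, 0.9, 0.0, 10)
print(price)
```

The loop visits 2^n paths, so each extra period doubles the cost. A European option needs only the n + 1 terminal prices, which is why the dependence on the whole path is what makes the problem hard.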
Beyond the Kirchhoff approximation
NASA Technical Reports Server (NTRS)
Rodriguez, Ernesto
1989-01-01
The three most successful models for describing scattering from random rough surfaces are the Kirchhoff approximation (KA), the small-perturbation method (SPM), and the two-scale-roughness (or composite roughness) surface-scattering (TSR) models. In this paper it is shown how these three models can be derived rigorously from one perturbation expansion based on the extinction theorem for scalar waves scattering from a perfectly rigid surface. It is also shown how corrections to the KA proportional to the surface curvature and higher-order derivatives may be obtained. Using these results, the scattering cross section is derived for various surface models.
A Newton-Krylov solver for fast spin-up of online ocean tracers
NASA Astrophysics Data System (ADS)
Lindsay, Keith
2017-01-01
We present a Newton-Krylov based solver to efficiently spin up tracers in an online ocean model. We demonstrate that the solver converges, that tracer simulations initialized with the solution from the solver have small drift, and that the solver takes orders of magnitude less computational time than the brute force spin-up approach. To demonstrate the application of the solver, we use it to efficiently spin up the tracer ideal age with respect to the circulation from different time intervals in a long physics run. We then evaluate how the spun-up ideal age tracer depends on the duration of the physics run, i.e., on how equilibrated the circulation is.
High-performance equation solvers and their impact on finite element analysis
NASA Technical Reports Server (NTRS)
Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. Dale, Jr.
1990-01-01
The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
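As an illustration of a preconditioned conjugate gradient iteration, the Python sketch below uses a diagonal (Jacobi) preconditioner on a small dense system. It is a minimal sketch and not the vectorized solvers of the paper: the function name `jacobi_pcg`, the dense list-of-rows storage and the tolerance are assumptions chosen for brevity, and a diagonal preconditioner is not necessarily the same thing as the "diagonal matrix storage scheme" mentioned in the abstract.

```python
def jacobi_pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b for a symmetric positive definite matrix `a`.

    `a` is a dense list of rows; the preconditioner is the inverse of
    the diagonal of `a`. Returns the solution and the iteration count.
    """
    n = len(b)

    def matvec(v):
        return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / a[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                  # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]   # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, iteration
        ap = matvec(p)
        alpha = rz / dot(p, ap)                  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

x, iterations = jacobi_pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(x, iterations)
```

A stronger preconditioner typically lowers the iteration count at the price of more storage and work per iteration, which is the trade-off between the two preconditioners that the abstract describes.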
Experimental validation of a coupled neutron-photon inverse radiation transport solver
NASA Astrophysics Data System (ADS)
Mattingly, John; Mitchell, Dean J.; Harding, Lee T.
2011-10-01
Sandia National Laboratories has developed an inverse radiation transport solver that applies nonlinear regression to coupled neutron-photon deterministic transport models. The inverse solver uses nonlinear regression to fit a radiation transport model to gamma spectrometry and neutron multiplicity counting measurements. The subject of this paper is the experimental validation of that solver. This paper describes a series of experiments conducted with a 4.5 kg sphere of α-phase, weapons-grade plutonium. The source was measured bare and reflected by high-density polyethylene (HDPE) spherical shells with total thicknesses between 1.27 and 15.24 cm. Neutron and photon emissions from the source were measured using three instruments: a gross neutron counter, a portable neutron multiplicity counter, and a high-resolution gamma spectrometer. These measurements were used as input to the inverse radiation transport solver to evaluate the solver's ability to correctly infer the configuration of the source from its measured radiation signatures.
A finite element Poisson solver for gyrokinetic particle simulations in a global field aligned mesh
Nishimura, Y.; Lin, Z.; Lewandowski, J.L.V.; Ethier, S.
2006-05-20
A new finite element Poisson solver is developed and applied to a global gyrokinetic toroidal code (GTC) which employs the field aligned mesh and thus a logically non-rectangular grid in a general geometry. The finite element solver has been verified on test cases where the analytical solutions are known. The scaling of CPU time with matrix size is promising when the Portable, Extensible Toolkit for Scientific Computation (PETSc) is used to solve the sparse matrix. Taking the ion temperature gradient (ITG) modes as an example, the solution from the new finite element solver has been compared to the solution from the original GTC's iterative solver, which is only efficient for adiabatic electrons. Linear and nonlinear simulation results from the two different forms of the gyrokinetic Poisson equation (the integral form and the differential form) coincide with each other. The new finite element solver enables the implementation of advanced kinetic electron models for global electromagnetic simulations.
Experimental validation of GADRAS's coupled neutron-photon inverse radiation transport solver.
Mattingly, John K.; Mitchell, Dean James; Harding, Lee T.
2010-08-01
Sandia National Laboratories has developed an inverse radiation transport solver that applies nonlinear regression to coupled neutron-photon deterministic transport models. The inverse solver uses nonlinear regression to fit a radiation transport model to gamma spectrometry and neutron multiplicity counting measurements. The subject of this paper is the experimental validation of that solver. This paper describes a series of experiments conducted with a 4.5 kg sphere of α-phase, weapons-grade plutonium. The source was measured bare and reflected by high-density polyethylene (HDPE) spherical shells with total thicknesses between 1.27 and 15.24 cm. Neutron and photon emissions from the source were measured using three instruments: a gross neutron counter, a portable neutron multiplicity counter, and a high-resolution gamma spectrometer. These measurements were used as input to the inverse radiation transport solver to evaluate the solver's ability to correctly infer the configuration of the source from its measured radiation signatures.
A High-Order Direct Solver for Helmholtz Equations with Neumann Boundary Conditions
NASA Technical Reports Server (NTRS)
Sun, Xian-He; Zhuang, Yu
1997-01-01
In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments are then introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solver. Analytical and experimental results show that this newly proposed solver is comparable to the conventional second-order elliptic solver when accuracy is not a primary concern, and is significantly faster than the conventional solver if a highly accurate solution is required. In addition, this newly proposed fourth-order Helmholtz solver is parallel in nature. It is readily available for parallel and distributed computers. The compact scheme introduced in this study is likely extendible to sixth-order accurate algorithms and to more general elliptic equations.
Polyurethanes: versatile materials and sustainable problem solvers for today's challenges.
Engels, Hans-Wilhelm; Pirkl, Hans-Georg; Albers, Reinhard; Albach, Rolf W; Krause, Jens; Hoffmann, Andreas; Casselmann, Holger; Dormish, Jeff
2013-09-02
Polyurethanes are the only class of polymers that display thermoplastic, elastomeric, and thermoset behavior depending on their chemical and morphological makeup. In addition to compact polyurethanes, foamed variations in particular are very widespread, and they achieve their targeted properties at very low weights. The simple production of sandwich structures and material composites in a single processing step is a key advantage of polyurethane technology. The requirement of energy and resource efficiency increasingly demands lightweight structures. Polyurethanes can serve this requirement by acting as matrix materials or as flexible adhesives for composites. Polyurethanes are indispensable when it comes to high-quality decorative coatings or maintaining the value of numerous objects. They are extremely adaptable and sustainable problem solvers for today's challenges facing our society, all of which impose special demands on materials.
Progress in developing Poisson-Boltzmann equation solvers
Li, Chuan; Li, Lin; Petukh, Marharyta; Alexov, Emil
2013-01-01
This review outlines the recent progress made in developing more accurate and efficient solutions to model electrostatics in systems comprised of bio-macromolecules and nano-objects, the last one referring to objects that do not have biological function themselves but nowadays are frequently used in biophysical and medical approaches in conjunction with bio-macromolecules. The problem of modeling macromolecular electrostatics is reviewed from two different angles: as a mathematical task provided the specific definition of the system to be modeled and as a physical problem aiming to better capture the phenomena occurring in the real experiments. In addition, specific attention is paid to methods to extend the capabilities of the existing solvers to model large systems toward applications of calculations of the electrostatic potential and energies in molecular motors, mitochondria complex, photosynthetic machinery and systems involving large nano-objects. PMID:24199185
Status Of The UPS Space-Marching Flow Solver
NASA Technical Reports Server (NTRS)
Lawerence, Scott L.; VanDalsem, William (Technical Monitor)
1995-01-01
The status of the three-dimensional parabolized Navier-Stokes solver UPS is described. The UPS code, initiated at NASA Ames Research Center in 1986, continues to develop and evolve through application to supersonic and hypersonic flow fields. Hypersonic applications have motivated enhancement of the physical modeling capabilities of the code, specifically real gas modeling, boundary conditions, and turbulence and transition modeling. The UPS code has also been modified to enhance robustness and efficiency in order to be practically used in concert with an optimization code for supersonic transport design. These developments are briefly described along with some relevant results for generic test problems obtained during verification of the enhancements. Included developments and results have previously been published and widely disseminated domestically.
Workload Characterization of CFD Applications Using Partial Differential Equation Solvers
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
Workload characterization is used for modeling and evaluating computing systems at different levels of detail. We present workload characterization for a class of Computational Fluid Dynamics (CFD) applications that solve Partial Differential Equations (PDEs). This workload characterization focuses on three high performance computing platforms: SGI Origin2000, IBM SP-2, and a cluster of Intel Pentium Pro based PCs. We execute extensive measurement-based experiments on these platforms to gather statistics of system resource usage, which results in workload characterization. Our workload characterization approach yields a coarse-grain resource utilization behavior that is being applied for performance modeling and evaluation of distributed high performance metacomputing systems. In addition, this study enhances our understanding of interactions between PDE solver workloads and high performance computing platforms and is useful for tuning these applications.
Performance evaluation of a parallel sparse lattice Boltzmann solver
Axner, L.; Bernsdorf, J.; Zeiser, T.; Lammers, P.; Linxweiler, J.; Hoekstra, A.G.
2008-05-01
We develop a performance prediction model for a parallelized sparse lattice Boltzmann solver and present performance results for simulations of flow in a variety of complex geometries. A special focus is on partitioning and memory/load balancing strategy for geometries with a high solid fraction and/or complex topology such as porous media, fissured rocks and geometries from medical applications. The topology of the lattice nodes representing the fluid fraction of the computational domain is mapped on a graph. Graph decomposition is performed with both multilevel recursive-bisection and multilevel k-way schemes based on modified Kernighan-Lin and Fiduccia-Mattheyses partitioning algorithms. Performance results and optimization strategies are presented for a variety of platforms, showing a parallel efficiency of almost 80% for the largest problem size. A good agreement between the performance model and experimental results is demonstrated.
GPU accelerated FDTD solver and its application in MRI.
Chi, J; Liu, F; Jin, J; Mason, D G; Crozier, S
2010-01-01
The finite difference time domain (FDTD) method is a popular technique for computational electromagnetics (CEM). The large computational power often required, however, has been a limiting factor for its applications. In this paper, we will present a graphics processing unit (GPU)-based parallel FDTD solver and its successful application to the investigation of a novel B1 shimming scheme for high-field magnetic resonance imaging (MRI). The optimized shimming scheme exhibits considerably improved transmit B(1) profiles. The GPU implementation dramatically shortened the runtime of FDTD simulation of electromagnetic field compared with its CPU counterpart. The acceleration in runtime has made such investigation possible, and will pave the way for other studies of large-scale computational electromagnetic problems in modern MRI which were previously impractical.
Large-scale linear nonparallel support vector machine solver.
Tian, Yingjie; Ping, Yuan
2014-02-01
Twin support vector machines (TWSVMs), the representative nonparallel hyperplane classifiers, have been shown to be more effective than standard SVMs in some respects. However, they still have serious defects that restrict their further study and real applications: (1) they have to compute and store inverse matrices before training, which is intractable for the many applications where the data have a huge number of instances as well as features; (2) TWSVMs lose sparseness by using a quadratic loss function that makes the proximal hyperplane close enough to the class itself. This paper proposes a Sparse Linear Nonparallel Support Vector Machine, termed L1-NPSVM, to deal with large-scale data, based on an efficient solver, the dual coordinate descent (DCD) method. Both theoretical analysis and experiments indicate that our method is not only suitable for large-scale problems, but also performs as well as TWSVMs and SVMs.
A Coupled Finite Volume Solver for Incompressible Flows
NASA Astrophysics Data System (ADS)
Moukalled, F.; Darwish, M.
2008-09-01
This paper reports on a pressure-based coupled algorithm for the solution of laminar incompressible flow problems. The implicit pressure-velocity coupling is accomplished by deriving a pressure equation in a way similar to a segregated SIMPLE algorithm with the extended set of equations solved simultaneously and having diagonally dominant coefficients. The superiority of the coupled approach over the segregated approach is demonstrated by solving the lid-driven flow in a square cavity problem using both methodologies and comparing their computational costs. Results indicate that the number of iterations needed by the coupled solver is grid independent. Moreover, recorded CPU time values reveal that the coupled approach substantially reduces the computational cost with the reduction rate for the problem solved increasing as the grid size increases and reaching a value as high as 115.
Extending the QUDA Library with the eigCG Solver
Strelchenko, Alexei; Stathopoulos, Andreas
2014-12-12
While the incremental eigCG algorithm [1] is included in many LQCD software packages, its realization on GPU micro-architectures was still missing. In this session we report our experience of the eigCG implementation in the QUDA library. In particular, we will focus on how to employ the mixed precision technique to accelerate solutions of large sparse linear systems with multiple right-hand sides on GPUs. Although application of mixed precision techniques is a well-known optimization approach for linear solvers, its utilization for the eigenvector computing within eigCG requires special consideration. We will discuss implementation aspects of the mixed precision deflation and illustrate its numerical behavior on the example of the Wilson twisted mass fermion matrix inversions.
AN ADAPTIVE PARTICLE-MESH GRAVITY SOLVER FOR ENZO
Passy, Jean-Claude; Bryan, Greg L.
2014-11-01
We describe and implement an adaptive particle-mesh algorithm to solve the Poisson equation for grid-based hydrodynamics codes with nested grids. The algorithm is implemented and extensively tested within the astrophysical code Enzo against the multigrid solver available by default. We find that while both algorithms show similar accuracy for smooth mass distributions, the adaptive particle-mesh algorithm is more accurate for the case of point masses, and is generally less noisy. We also demonstrate that the two-body problem can be solved accurately in a configuration with nested grids. In addition, we discuss the effect of subcycling, and demonstrate that evolving all the levels with the same timestep yields even greater precision.
Using parallel banded linear system solvers in generalized eigenvalue problems
NASA Technical Reports Server (NTRS)
Zhang, Hong; Moss, William F.
1993-01-01
Subspace iteration is a reliable and cost effective method for solving positive definite banded symmetric generalized eigenproblems, especially in the case of large scale problems. This paper discusses an algorithm that makes use of two parallel banded solvers in subspace iteration. A shift is introduced to decompose the banded linear systems into relatively independent subsystems and to accelerate the iterations. With this shift, an eigenproblem is mapped efficiently into the memories of a multiprocessor and a high speed-up is obtained for parallel implementations. An optimal shift is a shift that balances total computation and communication costs. Under certain conditions, we show how to estimate an optimal shift analytically using the decay rate for the inverse of a banded matrix, and how to improve this estimate. Computational results on iPSC/2 and iPSC/860 multiprocessors are presented.
Blade design and analysis using a modified Euler solver
NASA Technical Reports Server (NTRS)
Leonard, O.; Vandenbraembussche, R. A.
1991-01-01
An iterative method for blade design, based on an Euler solver and described in an earlier paper, is used to design compressor and turbine blades providing shock-free transonic flows. The method converges rapidly and indicates how sensitive the flow is to small modifications of the blade geometry, which the classical iterative use of analysis methods might not be able to define. The relationship between the required Mach number distribution and the resulting geometry is discussed. Examples show how geometrical constraints imposed upon the blade shape can be respected by using free geometrical parameters or by relaxing the required Mach number distribution. The same code is used both for the design of the required geometry and for the off-design calculations. Examples illustrate the difficulty of designing blade shapes with optimal performance also outside of the design point.
Accurate derivative evaluation for any Grad–Shafranov solver
Ricketson, L.F.; Cerfon, A.J.; Rachh, M.; Freidberg, J.P.
2016-01-15
We present a numerical scheme that can be combined with any fixed boundary finite element based Poisson or Grad–Shafranov solver to compute the first and second partial derivatives of the solution to these equations with the same order of convergence as the solution itself. At the heart of our scheme is an efficient and accurate computation of the Dirichlet to Neumann map through the evaluation of a singular volume integral and the solution to a Fredholm integral equation of the second kind. Our numerical method is particularly useful for magnetic confinement fusion simulations, since it allows the evaluation of quantities such as the magnetic field, the parallel current density and the magnetic curvature with much higher accuracy than has been previously feasible on the affordable coarse grids that are usually implemented.
A fast solver for systems of axisymmetric ring vortices
Strickland, J.H.; Amos, D.E.
1990-09-01
A method which is capable of efficient calculation of the axisymmetric flow field produced by a large system of ring vortices is presented in this report. The system of ring vortices can, in turn, be used to model body surfaces and wakes in incompressible unsteady axisymmetric flow fields. This method takes advantage of source point and field point series expansions which enable one to make calculations for interactions between groups of vortices which are in well separated spatial domains rather than having to consider interactions between every pair of vortices. In this work, series expansions for the stream function of the ring vortex system are obtained. Such expansions explicitly contain the radial and axial velocity components. A Fortran computer code RSOLV has been written to execute the fast solution technique to calculate the stream function and the axial and radial velocity components at points in the flow field. Test cases have been run to optimize the code and to benchmark the truncation errors and CPU time savings associated with the method. Non-dimensional truncation errors for the stream function and total velocity field are on the order of $5 \times 10^{-5}$ and $3 \times 10^{-3}$ respectively. Single precision accuracy produces errors in these quantities up to about $1 \times 10^{-5}$. For 100 vortices in the field, there is virtually no CPU time savings with the fast solver. For 10,000 vortices in the flow, the fast solver obtains solutions in about 1% to 3% of the time required for the direct solution technique. Simulations of vortices with square and circular cores were run in order to obtain expressions for the self-induced velocities of such vortices. 8 refs., 26 figs.
Tightly Coupled Geodynamic Systems: Software, Implicit Solvers & Applications
NASA Astrophysics Data System (ADS)
May, D.; Le Pourhiet, L.; Brown, J.
2011-12-01
The generic term "multi-physics" is used to define physical processes which are described by a collection of partial differential equations, or "physics". Numerous processes in geodynamics fall into this category. Examples include the evolution of viscous fluid flow and heat transport within the mantle (Stokes flow + energy conservation), the dynamics of melt migration (Stokes flow + Darcy flow + porosity evolution) and landscape evolution (Stokes + diffusion/advection over a surface). Software developed to numerically investigate processes described through the composition of different physics components is typically (a) designed for one particular set of physics and never intended to be extended or coupled to other processes, or (b) written so that certain non-linearities (or couplings) are explicitly removed from the system for reasons of computational efficiency or due to the lack of a robust non-linear solver (e.g. most models in the mantle convection community). We describe a software infrastructure which enables us to easily introduce new physics with minimal code modifications; tightly couple all physics without introducing splitting errors; exploit modern linear/non-linear solvers and permit the re-use of monolithic preconditioners for individual physics blocks (e.g. saddle point preconditioners for Stokes). Here we present a number of examples to illustrate the flexibility and importance of using this software infrastructure. Using the Stokes system as a prototype, we show results illustrating (i) visco-plastic shear banding experiments, (ii) how coupling Stokes flow with the evolution of the material coordinates can yield temporal stability in the free surface evolution and (iii) the discretisation error associated with decoupling the Stokes equation from the heat transport equation in models of mantle convection with various rheologies.
A three-dimensional fast solver for arbitrary vorton distributions
Strickland, J.H.; Baty, R.S.
1994-05-01
A method which is capable of an efficient calculation of the three-dimensional flow field produced by a large system of vortons (discretized regions of vorticity) is presented in this report. The system of vortons can, in turn, be used to model body surfaces, container boundaries, free-surfaces, plumes, jets, and wakes in unsteady three-dimensional flow fields. This method takes advantage of multipole and local series expansions which enable one to make calculations for interactions between groups of vortons which are in well-separated spatial domains rather than having to consider interactions between every pair of vortons. In this work, series expansions for the vector potential of the vorton system are obtained. From such expansions, the three components of velocity can be obtained explicitly. A Fortran computer code FAST3D has been written to calculate the vector potential and the velocity components at selected points in the flow field. In this code, the evaluation points do not have to coincide with the location of the vortons themselves. Test cases have been run to benchmark the truncation errors and CPU time savings associated with the method. Non-dimensional truncation errors for the magnitudes of the vector potential and velocity fields are on the order of $10^{-4}$ and $10^{-3}$ respectively. Single precision accuracy produces errors in these quantities of up to $10^{-5}$. For fewer than 1,000 to 2,000 vortons in the field, there is virtually no CPU time savings with the fast solver. For 100,000 vortons in the flow, the fast solver obtains solutions in 1% to 10% of the time required for the direct solution technique, depending upon the configuration.
Relaxation approximations to second-order traffic flow models by high-resolution schemes
Nikolos, I.K.; Delis, A.I.; Papageorgiou, M.
2015-03-10
A relaxation-type approximation of second-order non-equilibrium traffic models, written in conservation or balance law form, is considered. Using the relaxation approximation, the nonlinear equations are transformed to a semi-linear diagonalizable problem with linear characteristic variables and stiff source terms, with the attractive feature that neither Riemann solvers nor characteristic decompositions are needed. In particular, it is only necessary to provide the flux and source term functions and an estimate of the characteristic speeds. To discretize the resulting relaxation system, high-resolution reconstructions in space are considered. Emphasis is placed on a fifth-order WENO scheme and its performance. The computations reported demonstrate the simplicity and versatility of relaxation schemes as numerical solvers.
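To make the "only a flux function and a speed estimate" point concrete, here is a minimal sketch of a first-order relaxed scheme (relaxation time taken to zero, upwinding on the linear characteristic variables). It is an illustration under assumptions, not the authors' method: it uses a scalar first-order traffic model with the hypothetical flux f(rho) = rho(1 - rho) rather than a second-order model, a periodic road, and no WENO reconstruction. The names `flux`, `relaxed_step`, and the speed bound `a` are chosen for this example.

```python
def flux(rho):
    """Assumed flux of a normalized scalar traffic model: f(rho) = rho * (1 - rho)."""
    return rho * (1.0 - rho)


def relaxed_step(rho, dt, dx, a):
    """One step of the first-order relaxed scheme on a periodic grid.

    In the zero-relaxation limit the upwind scheme for the relaxation system
    reduces to a central flux difference plus a diffusion term scaled by a,
    where a is an upper estimate of the characteristic speed |f'(rho)|.
    No Riemann solver or characteristic decomposition of f is used.
    """
    n = len(rho)
    lam = dt / dx
    f = [flux(r) for r in rho]
    new_rho = []
    for j in range(n):
        left, right = (j - 1) % n, (j + 1) % n
        central = 0.5 * lam * (f[right] - f[left])
        diffusion = 0.5 * a * lam * (rho[right] - 2.0 * rho[j] + rho[left])
        new_rho.append(rho[j] - central + diffusion)
    return new_rho


# A dense platoon (rho = 0.8) on an otherwise light road (rho = 0.2).
n = 100
dx = 1.0 / n
a = 1.0            # |f'(rho)| = |1 - 2 rho| <= 1 for rho in [0, 1]
dt = 0.5 * dx / a  # CFL number 0.5
rho0 = [0.8 if 0.4 <= (j + 0.5) * dx < 0.6 else 0.2 for j in range(n)]

rho = rho0
for _ in range(50):
    rho = relaxed_step(rho, dt, dx, a)

print(f"total mass before: {sum(rho0) * dx:.6f}, after: {sum(rho) * dx:.6f}")
```

With `a` bounding the characteristic speed and the CFL number at most one, this first-order update conserves mass and keeps the density between its initial minimum and maximum. Higher-order reconstructions such as WENO would replace the piecewise-constant cell values used here.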
Hierarchical Approximate Bayesian Computation
Turner, Brandon M.; Van Zandt, Trisha
2013-01-01
Approximate Bayesian computation (ABC) is a powerful technique for estimating the posterior distribution of a model’s parameters. It is especially important when the model to be fit has no explicit likelihood function, which happens for computational (or simulation-based) models such as those that are popular in cognitive neuroscience and other areas in psychology. However, ABC is usually applied only to models with few parameters. Extending ABC to hierarchical models has been difficult because high-dimensional hierarchical models add computational complexity that conventional ABC cannot accommodate. In this paper we summarize some current approaches for performing hierarchical ABC and introduce a new algorithm called Gibbs ABC. This new algorithm incorporates well-known Bayesian techniques to improve the accuracy and efficiency of the ABC approach for estimation of hierarchical models. We then use the Gibbs ABC algorithm to estimate the parameters of two models of signal detection, one with and one without a tractable likelihood function. PMID:24297436
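For orientation, the basic likelihood-free idea behind ABC can be sketched with plain rejection sampling. This is not the Gibbs ABC algorithm of the paper and it is not hierarchical. It assumes a toy model (normal data with unknown mean and unit variance), a uniform prior on [-5, 5], the sample mean as summary statistic, and a hypothetical tolerance `epsilon`.

```python
import random


def rejection_abc(observed_mean, n_obs, n_draws, epsilon, rng):
    """Plain rejection ABC for the mean of a unit-variance normal model.

    Each prior draw is kept only if data simulated from it have a sample
    mean within epsilon of the observed one; the likelihood is never evaluated.
    """
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from the assumed prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
        simulated_mean = sum(simulated) / n_obs
        if abs(simulated_mean - observed_mean) < epsilon:
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed so the run is repeatable
posterior_sample = rejection_abc(
    observed_mean=2.0, n_obs=20, n_draws=20_000, epsilon=0.2, rng=rng
)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
print(f"accepted {len(posterior_sample)} draws, posterior mean ~ {posterior_mean:.2f}")
```

The acceptance rate falls quickly as parameters are added, which is one way to see why conventional ABC struggles with high-dimensional hierarchical models.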
Roy, Swapnoneel; Thakur, Ashok Kumar
2008-01-01
Genome rearrangements have been modelled by a variety of primitives such as reversals, transpositions, block moves and block interchanges. We consider one such genome rearrangement primitive, strip exchanges. A strip exchanging move interchanges the positions of two chosen strips so that they merge with other strips. The strip exchange problem is to sort a given permutation using the minimum number of strip exchanges. We present here the first non-trivial 2-approximation algorithm for this problem. We also observe that sorting by strip exchanges is fixed-parameter tractable. Lastly, we discuss the application of strip exchanges in a different area, Optical Character Recognition (OCR), with an example.
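A small sketch may help fix the terminology. It assumes that a strip is a maximal run of consecutive increasing values (the abstract does not define it), and it shows only a single exchange move, not the 2-approximation algorithm. The helper names `split_into_strips` and `exchange_strips` are hypothetical.

```python
def split_into_strips(perm):
    """Split a permutation into maximal runs of consecutive increasing values."""
    if not perm:
        return []
    strips = [[perm[0]]]
    for value in perm[1:]:
        if value == strips[-1][-1] + 1:
            strips[-1].append(value)  # value extends the current strip
        else:
            strips.append([value])    # value starts a new strip
    return strips


def exchange_strips(perm, i, j):
    """Return the permutation obtained by swapping the i-th and j-th strips."""
    strips = split_into_strips(perm)
    strips[i], strips[j] = strips[j], strips[i]
    return [value for strip in strips for value in strip]


perm = [3, 4, 1, 2, 5]
print(split_into_strips(perm))      # three strips: [3, 4], [1, 2], [5]
print(exchange_strips(perm, 0, 1))  # one exchange sorts this permutation
```

In this example the two exchanged strips merge with each other and with the trailing strip, so the strip count drops from three to one in a single move.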
NASA Astrophysics Data System (ADS)
Simmons, Alex; Yang, Qianqian; Moroney, Timothy
2015-04-01
The numerical solution of fractional partial differential equations poses significant computational challenges in regard to efficiency as a result of the spatial nonlocality of the fractional differential operators. The dense coefficient matrices that arise from spatial discretisation of these operators mean that even one-dimensional problems can be difficult to solve using standard methods on grids comprising thousands of nodes or more. In this work we address this issue of efficiency for one-dimensional, nonlinear space-fractional reaction-diffusion equations with fractional Laplacian operators. We apply variable-order, variable-stepsize backward differentiation formulas in a Jacobian-free Newton-Krylov framework to advance the solution in time. A key advantage of this approach is the elimination of any requirement to form the dense matrix representation of the fractional Laplacian operator. We show how a banded approximation to this matrix, which can be formed and factorised efficiently, can be used as part of an effective preconditioner that accelerates convergence of the Krylov subspace iterative solver. Our approach also captures the full contribution from the nonlinear reaction term in the preconditioner, which is crucial for problems that exhibit stiff reactions. Numerical examples are presented to illustrate the overall effectiveness of the solver.
Simulation of an Isolated Tiltrotor in Hover with an Unstructured Overset-Grid RANS Solver
NASA Technical Reports Server (NTRS)
Lee-Rausch, Elizabeth M.; Biedron, Robert T.
2009-01-01
An unstructured overset-grid Reynolds Averaged Navier-Stokes (RANS) solver, FUN3D, is used to simulate an isolated tiltrotor in hover. An overview of the computational method is presented as well as the details of the overset-grid systems. Steady-state computations within a noninertial reference frame define the performance trends of the rotor across a range of the experimental collective settings. Results are presented to show the effects of off-body grid refinement and blade grid refinement. The computed performance and blade loading trends show good agreement with experimental results and previously published structured overset-grid computations. Off-body flow features indicate a significant improvement in the resolution of the first perpendicular blade vortex interaction with background grid refinement across the collective range. Considering experimental data uncertainty and effects of transition, the prediction of figure of merit on the baseline and refined grid is reasonable at the higher collective range, within 3 percent of the measured values. At the lower collective settings, the computed figure of merit is approximately 6 percent lower than the experimental data. A comparison of steady and unsteady results shows that, with temporal refinement, the dynamic results closely match the steady-state noninertial results, which gives confidence in the accuracy of the dynamic overset-grid approach.
A New Equation Solver for Modeling Turbulent Flow in Coupled Matrix-Conduit Flow Models.
Hubinger, Bernhard; Birk, Steffen; Hergarten, Stefan
2016-07-01
Karst aquifers represent dual flow systems consisting of a highly conductive conduit system embedded in a less permeable rock matrix. Hybrid models iteratively coupling both flow systems generally consume much time, especially because of the nonlinearity of turbulent conduit flow. To reduce calculation times compared to those of existing approaches, a new iterative equation solver for the conduit system is developed based on an approximated Newton-Raphson expression and a Gauß-Seidel or successive over-relaxation scheme with a single iteration step at the innermost level. It is implemented and tested in the research code CAVE but should be easily adaptable to similar models such as the Conduit Flow Process for MODFLOW-2005. It substantially reduces the computational effort as demonstrated by steady-state benchmark scenarios as well as by transient karst genesis simulations. Water balance errors are found to be acceptable in most of the test cases. However, the performance and accuracy may deteriorate under unfavorable conditions such as sudden, strong changes of the flow field at some stages of the karst genesis simulations.
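As background for the relaxation scheme named above, the following is a generic successive over-relaxation (SOR) sweep for a small linear system. It is a textbook sketch, not the CAVE solver: the nonlinear Newton-Raphson approximation for turbulent conduit flow and the single inner iteration step are not modelled, and the matrix, right-hand side, and relaxation factor `omega` are made up for illustration.

```python
def sor_solve(A, b, omega=1.2, tol=1e-10, max_sweeps=10_000):
    """Solve A x = b by successive over-relaxation; omega = 1 gives Gauss-Seidel."""
    n = len(b)
    x = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in range(n):
            # Entries x[j] with j < i were already updated in this sweep.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel_value = (b[i] - sigma) / A[i][i]
            new_value = (1.0 - omega) * x[i] + omega * gauss_seidel_value
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            return x, sweep
    raise RuntimeError("SOR did not converge within max_sweeps")


# Small symmetric, diagonally dominant test system with exact solution (1, 2, 3).
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]

solution, sweeps = sor_solve(A, b)
print(solution, "after", sweeps, "sweeps")
```

For symmetric positive definite systems such as this one, the iteration converges for any relaxation factor strictly between 0 and 2; the best value is problem dependent.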
NASA Astrophysics Data System (ADS)
Fosas de Pando, Miguel; Schmid, Peter J.; Sipp, Denis
2016-11-01
Nonlinear model reduction for large-scale flows is an essential component in many fluid applications such as flow control, optimization, parameter space exploration and statistical analysis. In this article, we generalize the POD-DEIM method, introduced by Chaturantabut & Sorensen [1], to address nonlocal nonlinearities in the equations without loss of performance or efficiency. The nonlinear terms are represented by nested DEIM-approximations using multiple expansion bases based on the Proper Orthogonal Decomposition. These extensions are imperative, for example, for applications of the POD-DEIM method to large-scale compressible flows. The efficient implementation of the presented model-reduction technique follows our earlier work [2] on linearized and adjoint analyses and takes advantage of the modular structure of our compressible flow solver. The efficacy of the nonlinear model-reduction technique is demonstrated on the flow around an airfoil and its acoustic footprint. We obtain an accurate and robust low-dimensional model that captures the main features of the full flow.
NASA Astrophysics Data System (ADS)
Sun, Yifei; Kumar, Mrinal
2015-05-01
In this paper, a tensor decomposition approach combined with Chebyshev spectral differentiation is presented to solve the high dimensional transient Fokker-Planck equations (FPE) arising in the simulation of polymeric fluids via multi-bead-spring (MBS) model. Generalizing the authors' previous work on the stationary FPE, the transient solution is obtained in a single CANDECOMP/PARAFAC decomposition (CPD) form for all times via the alternating least squares algorithm. This is accomplished by treating the temporal dimension in the same manner as all other spatial dimensions, thereby decoupling it from them. As a result, the transient solution is obtained without resorting to expensive time stepping schemes. A new, relaxed approach for imposing the vanishing boundary conditions is proposed, improving the quality of the approximation. The asymptotic behavior of the temporal basis functions is studied. The proposed solver scales very well with the dimensionality of the MBS model. Numerical results for systems up to 14 dimensional state space are successfully obtained on a regular personal computer and compared with the corresponding matrix Riccati differential equation (for linear models) or Monte Carlo simulations (for nonlinear models).
The value of continuity: Refined isogeometric analysis and fast direct solvers
Garcia, Daniel; Pardo, David; Dalcin, Lisandro; Paszynski, Maciej; Collier, Nathan; Calo, Victor M.
2016-08-24
Here, we propose the use of highly continuous finite element spaces interconnected with low continuity hyperplanes to maximize the performance of direct solvers. Starting from a highly continuous Isogeometric Analysis (IGA) discretization, we introduce C0-separators to reduce the interconnection between degrees of freedom in the mesh. By doing so, both the solution time and best approximation errors are simultaneously improved. We call the resulting method “refined Isogeometric Analysis (rIGA)”. To illustrate the impact of the continuity reduction, we analyze the number of Floating Point Operations (FLOPs), computational times, and memory required to solve the linear system obtained by discretizing the Laplace problem with structured meshes and uniform polynomial orders. Theoretical estimates demonstrate that an optimal continuity reduction may decrease the total computational time by a factor between p^{2} and p^{3}, with p being the polynomial order of the discretization. Numerical results indicate that our proposed refined isogeometric analysis delivers a speed-up factor proportional to p^{2}. In a 2D mesh with four million elements and p=5, the linear system resulting from rIGA is solved 22 times faster than the one from highly continuous IGA. In a 3D mesh with one million elements and p=3, the linear system is solved 15 times faster for the refined than the maximum continuity isogeometric analysis.
The value of continuity: Refined isogeometric analysis and fast direct solvers
Garcia, Daniel; Pardo, David; Dalcin, Lisandro; ...
2016-08-24
Here, we propose the use of highly continuous finite element spaces interconnected with low continuity hyperplanes to maximize the performance of direct solvers. Starting from a highly continuous Isogeometric Analysis (IGA) discretization, we introduce C0-separators to reduce the interconnection between degrees of freedom in the mesh. By doing so, both the solution time and best approximation errors are simultaneously improved. We call the resulting method “refined Isogeometric Analysis (rIGA)”. To illustrate the impact of the continuity reduction, we analyze the number of Floating Point Operations (FLOPs), computational times, and memory required to solve the linear system obtained by discretizing themore » Laplace problem with structured meshes and uniform polynomial orders. Theoretical estimates demonstrate that an optimal continuity reduction may decrease the total computational time by a factor between p2 and p3, with pp being the polynomial order of the discretization. Numerical results indicate that our proposed refined isogeometric analysis delivers a speed-up factor proportional to p2. In a 2D mesh with four million elements and p=5, the linear system resulting from rIGA is solved 22 times faster than the one from highly continuous IGA. In a 3D mesh with one million elements and p=3, the linear system is solved 15 times faster for the refined than the maximum continuity isogeometric analysis.« less
The value of continuity: Refined isogeometric analysis and fast direct solvers
Garcia, Daniel; Pardo, David; Dalcin, Lisandro; Paszynski, Maciej; Collier, Nathan; Calo, Victor M.
2016-08-24
Here, we propose the use of highly continuous finite element spaces interconnected with low continuity hyperplanes to maximize the performance of direct solvers. Starting from a highly continuous Isogeometric Analysis (IGA) discretization, we introduce C0-separators to reduce the interconnection between degrees of freedom in the mesh. By doing so, both the solution time and best approximation errors are simultaneously improved. We call the resulting method “refined Isogeometric Analysis (rIGA)”. To illustrate the impact of the continuity reduction, we analyze the number of Floating Point Operations (FLOPs), computational times, and memory required to solve the linear system obtained by discretizing the Laplace problem with structured meshes and uniform polynomial orders. Theoretical estimates demonstrate that an optimal continuity reduction may decrease the total computational time by a factor between p^{2} and p^{3}, with p being the polynomial order of the discretization. Numerical results indicate that our proposed refined isogeometric analysis delivers a speed-up factor proportional to p^{2}. In a 2D mesh with four million elements and p=5, the linear system resulting from rIGA is solved 22 times faster than the one from highly continuous IGA. In a 3D mesh with one million elements and p=3, the linear system is solved 15 times faster for the refined than the maximum continuity isogeometric analysis.
Hybrid Approximate Message Passing
NASA Astrophysics Data System (ADS)
Rangan, Sundeep; Fletcher, Alyson K.; Goyal, Vivek K.; Byrne, Evan; Schniter, Philip
2017-09-01
The standard linear regression (SLR) problem is to recover a vector $\mathbf{x}^0$ from noisy linear observations $\mathbf{y}=\mathbf{Ax}^0+\mathbf{w}$. The approximate message passing (AMP) algorithm recently proposed by Donoho, Maleki, and Montanari is a computationally efficient iterative approach to SLR that has a remarkable property: for large i.i.d. sub-Gaussian matrices $\mathbf{A}$, its per-iteration behavior is rigorously characterized by a scalar state-evolution whose fixed points, when unique, are Bayes optimal. AMP, however, is fragile in that even small deviations from the i.i.d. sub-Gaussian model can cause the algorithm to diverge. This paper considers a "vector AMP" (VAMP) algorithm and shows that VAMP has a rigorous scalar state-evolution that holds under a much broader class of large random matrices $\mathbf{A}$: those that are right-rotationally invariant. After performing an initial singular value decomposition (SVD) of $\mathbf{A}$, the per-iteration complexity of VAMP can be made similar to that of AMP. In addition, the fixed points of VAMP's state evolution are consistent with the replica prediction of the minimum mean-squared error recently derived by Tulino, Caire, Verdú, and Shamai. The effectiveness and state evolution predictions of VAMP are confirmed in numerical experiments.
Oasis: A high-level/high-performance open source Navier-Stokes solver
NASA Astrophysics Data System (ADS)
Mortensen, Mikael; Valen-Sendstad, Kristian
2015-03-01
Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a high-level, programmable user interface through the creation of highly flexible Python modules for new problems. Through the high-level Python interface the user is placed in complete control of every aspect of the solver. A version of the solver that uses piecewise linear elements for both velocity and pressure is shown to reproduce very well the classical, spectral, turbulent channel simulations of Moser et al. (1999). The computational speed is strongly dominated by the iterative solvers provided by the linear algebra backend, which is arguably the best performance any similar implicit solver using PETSc may hope for. Higher order accuracy is also demonstrated and new solvers may be easily added within the same framework.
Acceleration of FDTD mode solver by high-performance computing techniques.
Han, Lin; Xi, Yanping; Huang, Wei-Ping
2010-06-21
A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigenmode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigenmode solver, yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigenvalue mode solvers are no longer applicable due to memory limitations.
A parallel 3D Poisson solver for space charge simulation in cylindrical coordinates.
Xu, J.; Ostroumov, P. N.; Nolen, J.
2008-02-01
This paper presents the development of a parallel three-dimensional Poisson solver in a cylindrical coordinate system for the electrostatic potential of a charged particle beam in a circular tube. The Poisson solver uses Fourier expansions in the longitudinal and azimuthal directions, and Spectral Element discretization in the radial direction. A Dirichlet boundary condition is used on the cylinder wall, a natural boundary condition is used on the cylinder axis and a Dirichlet or periodic boundary condition is used in the longitudinal direction. A parallel 2D domain decomposition was implemented in the (r,θ) plane. This solver was incorporated into the parallel code PTRACK for beam dynamics simulations. Detailed benchmark results for the parallel solver and a beam dynamics simulation in a high-intensity proton LINAC are presented. When the transverse beam size is small relative to the aperture of the accelerator line, using the Poisson solver in a Cartesian coordinate system and in a cylindrical coordinate system produced similar results. When the transverse beam size is large or the beam center is located off-axis, the result from the Poisson solver in the Cartesian coordinate system is not accurate because a different boundary condition is used. With the new solver, a circular boundary condition can be applied easily and accurately for beam dynamics simulations in accelerator devices.
GORRAM: Introducing accurate operational-speed radiative transfer Monte Carlo solvers
NASA Astrophysics Data System (ADS)
Buras-Schnell, Robert; Schnell, Franziska; Buras, Allan
2016-06-01
We present a new approach for solving the radiative transfer equation in horizontally homogeneous atmospheres. The motivation was to develop a fast yet accurate radiative transfer solver to be used in operational retrieval algorithms for next generation meteorological satellites. The core component is the program GORRAM (Generator Of Really Rapid Accurate Monte-Carlo) which generates solvers individually optimized for the intended task. These solvers consist of a Monte Carlo model capable of path recycling and a representative set of photon paths. The latter is generated using the simulated annealing technique. GORRAM automatically takes advantage of limitations on the variability of the atmosphere. Due to this optimization the number of photon paths necessary for accurate results can be reduced by several orders of magnitude. For the shown example of a forward model intended for an aerosol satellite retrieval, comparison with an exact yet slow solver shows that a precision of better than 1% can be achieved with only 36 photons. The computational time is at least an order of magnitude faster than any other type of radiative transfer solver. Only the lookup table approach often used in satellite retrieval is faster, but it suffers from limited accuracy. This makes GORRAM-generated solvers an eligible candidate as forward model in operational-speed retrieval algorithms and data assimilation applications. GORRAM also has the potential to create fast solvers of other integrable equations.
Implementation of density-based solver for all speeds in the framework of OpenFOAM
NASA Astrophysics Data System (ADS)
Shen, Chun; Sun, Fengxian; Xia, Xinlin
2014-10-01
In the framework of the open source CFD code OpenFOAM, a density-based solver for all-speed flow fields is developed. In this solver the preconditioned all-speed AUSM+(P) scheme is adopted and the dual time scheme is implemented to complete the unsteady process. Parallel computation can be used to accelerate the solving process. Different interface reconstruction algorithms are implemented, and their accuracy with respect to convection is compared. Three benchmark tests of lid-driven cavity flow, flow crossing over a bump, and flow over a forward-facing step are presented to show the accuracy of the AUSM+(P) solver for low-speed incompressible flow, transonic flow, and supersonic/hypersonic flow. Firstly, for the lid-driven cavity flow, the computational results obtained by different interface reconstruction algorithms are compared. It is indicated that the one-dimensional reconstruction scheme adopted in this solver possesses high accuracy and the solver developed in this paper can effectively capture the features of low-speed incompressible flow. Then, via the test cases regarding the flow crossing over a bump and over a forward-facing step, the ability to capture characteristics of the transonic and supersonic/hypersonic flows is confirmed. The forward-facing step proves to be the most challenging for the preconditioned solvers with and without the dual time scheme. Nonetheless, the solvers described in this paper reproduce the main features of this flow, including the evolution of the initial transient.
NASA Astrophysics Data System (ADS)
Marshall, David D.
With the renewed interest in Cartesian gridding methodologies for the ease and speed of gridding complex geometries in addition to the simplicity of the control volumes used in the computations, it has become important to investigate ways of extending the existing Cartesian grid solver functionalities. This includes developing methods of modeling the viscous effects in order to utilize Cartesian grid solvers for accurate drag predictions and addressing the issues related to the distributed memory parallelization of Cartesian solvers. This research presents advances in two areas of interest in Cartesian grid solvers: viscous effects modeling and MPI parallelization. The development of viscous effects modeling using solely Cartesian grids has been hampered by the widely varying control volume sizes associated with the mesh refinement and the cut cells associated with the solid surface. This problem is being addressed by using physically based modeling techniques to update the state vectors of the cut cells and removing them from the finite volume integration scheme. This work is performed on a new Cartesian grid solver, NASCART-GT, with modifications to its cut cell functionality. The development of MPI parallelization addresses issues associated with utilizing Cartesian solvers on distributed memory parallel environments. This work is performed on an existing Cartesian grid solver, CART3D, with modifications to its parallelization methodology.
Countably QC-Approximating Posets
Mao, Xuxin; Xu, Luoshan
2014-01-01
As a generalization of countably C-approximating posets, the concept of countably QC-approximating posets is introduced. With the countably QC-approximating property, some characterizations of generalized completely distributive lattices and generalized countably approximating posets are given. The main results are as follows: (1) a complete lattice is generalized completely distributive if and only if it is countably QC-approximating and weakly generalized countably approximating; (2) a poset L having countably directed joins is generalized countably approximating if and only if the lattice $\sigma_c(L)^{op}$ of all σ-Scott-closed subsets of L is weakly generalized countably approximating. PMID:25165730
Bounded fractional diffusion in geological media: Definition and Lagrangian approximation
NASA Astrophysics Data System (ADS)
Zhang, Yong; Green, Christopher T.; LaBolle, Eric M.; Neupauer, Roseanna M.; Sun, HongGuang
2016-11-01
Spatiotemporal fractional-derivative models (FDMs) have been increasingly used to simulate non-Fickian diffusion, but methods have not been available to define boundary conditions for FDMs in bounded domains. This study defines boundary conditions and then develops a Lagrangian solver to approximate bounded, one-dimensional fractional diffusion. Both the zero-value and nonzero-value Dirichlet, Neumann, and mixed Robin boundary conditions are defined, where the sign of Riemann-Liouville fractional derivative (capturing nonzero-value spatial-nonlocal boundary conditions with directional superdiffusion) remains consistent with the sign of the fractional-diffusive flux term in the FDMs. New Lagrangian schemes are then proposed to track solute particles moving in bounded domains, where the solutions are checked against analytical or Eulerian solutions available for simplified FDMs. Numerical experiments show that the particle-tracking algorithm for non-Fickian diffusion differs from Fickian diffusion in relocating the particle position around the reflective boundary, likely due to the nonlocal and nonsymmetric fractional diffusion. For a nonzero-value Neumann or Robin boundary, a source cell with a reflective face can be applied to define the release rate of random-walking particles at the specified flux boundary. Mathematical definitions of physically meaningful nonlocal boundaries combined with bounded Lagrangian solvers in this study may provide the only viable techniques at present to quantify the impact of boundaries on anomalous diffusion, expanding the applicability of FDMs from infinite domains to those with any size and boundary conditions.
A new set of direct and iterative solvers for the TOUGH2 family of codes
Moridis, G.J.
1995-04-01
Two new solvers are discussed. LUBAND, the first routine, is a direct solver for banded systems and is based on an LU decomposition with partial pivoting and row interchange. BCGSTB, the second routine, is a Preconditioned Conjugate Gradient (PCG) solver with improved speed and convergence characteristics. Bandwidth minimization and gridblock ordering schemes are also introduced into TOUGH2 to improve speed and accuracy. TOUGH2 simulates fluid and heat flows in permeable media and is used for the evaluation of WIPP and TEVES (Thermal Enhanced Vapor Extraction System) that will be used to extract solvents from the Chemical Waste Landfill at Sandia National Laboratories.
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver
Zhang, Bo; Lu, Benzhuo; Huang, Jingfang; Pitsianis, Nikos P.; Sun, Xiaobai; McCammon, J. Andrew
2013-01-01
This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numerical results.
Application of an unstructured grid flow solver to planes, trains and automobiles
NASA Technical Reports Server (NTRS)
Spragle, Gregory S.; Smith, Wayne A.; Yadlin, Yoram
1993-01-01
Rampant, an unstructured flow solver developed at Fluent Inc., is used to compute three-dimensional, viscous, turbulent, compressible flow fields within complex solution domains. Rampant is an explicit, finite-volume flow solver capable of computing flow fields using either triangular (2d) or tetrahedral (3d) unstructured grids. Local time stepping, implicit residual smoothing, and multigrid techniques are used to accelerate the convergence of the explicit scheme. The paper describes the Rampant flow solver and presents flow field solutions about a plane, train, and automobile.
Orthogonal polynomial approximation in higher dimensions: Applications in astrodynamics
NASA Astrophysics Data System (ADS)
Bani Younes, Ahmad Hani Abd Alqader
We propose novel methods to utilize orthogonal polynomial approximation in higher dimension spaces, which enable us to modify classical differential equation solvers to perform high precision, long-term orbit propagation. These methods have immediate application to efficient propagation of catalogs of Resident Space Objects (RSOs) and improved accounting for the uncertainty in the ephemeris of these objects. More fundamentally, the methodology promises to be of broad utility in solving initial and two point boundary value problems from a wide class of mathematical representations of problems arising in engineering, optimal control, physical sciences and applied mathematics. We unify and extend classical results from function approximation theory and consider their utility in astrodynamics. Least square approximation, using the classical Chebyshev polynomials as basis functions, is reviewed for discrete samples of the to-be-approximated function. We extend the orthogonal approximation ideas to n-dimensions in a novel way, through the use of array algebra and Kronecker operations. Approximation of test functions illustrates the resulting algorithms and provides insight into the errors of approximation, as well as the associated errors arising when the approximations are differentiated or integrated. Two sets of applications are considered that are challenges in astrodynamics. The first application addresses local approximation of high degree and order geopotential models, replacing the global spherical harmonic series by a family of locally precise orthogonal polynomial approximations for efficient computation. A method is introduced which adapts the approximation degree radially, compatible with the truth that the highest degree approximations (to ensure maximum acceleration error < $10^{-9}$ m s$^{-2}$, globally) are required near the Earth's surface, whereas lower degree approximations are required as radius increases. We show that a four order of magnitude speedup is
NASA Astrophysics Data System (ADS)
Han, Song; Zhang, Wei; Zhang, Jie
2017-09-01
A fast sweeping method (FSM) determines the first arrival traveltimes of seismic waves by sweeping the velocity model in different directions while applying a local solver. It is an efficient way to numerically solve Hamilton-Jacobi equations for traveltime calculations. In this study, we develop an improved FSM to calculate the first arrival traveltimes of quasi-P (qP) waves in 2-D tilted transversely isotropic (TTI) media. A local solver utilizes the coupled slowness surface of qP and quasi-SV (qSV) waves to form a quartic equation, and solves it numerically to obtain possible traveltimes of the qP-wave. The proposed quartic solver utilizes Fermat's principle to limit the range of the possible solution, then uses the bisection procedure to efficiently determine the real roots. With causality enforced during sweepings, our FSM converges fast in a few iterations, with the exact number depending on the complexity of the velocity model. To improve the accuracy, we employ high-order finite difference schemes and derive the second-order formulae. There is no weak anisotropy assumption, and no approximation is made to the complex slowness surface of the qP-wave. The validity and accuracy of our FSM are demonstrated by comparison with the traveltimes calculated by a horizontal slowness shooting method.
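The sweeping idea in the abstract above (alternating sweep directions combined with a causal local solver) can be illustrated with a minimal sketch. It assumes a much simpler setting than the paper: an isotropic medium, a first-order upwind local solver for |∇T| = s, and a single point source. The quartic TTI solver and the second-order formulae are not reproduced here, and the function name `fast_sweep_eikonal` and its parameters are hypothetical.

```python
import math

def fast_sweep_eikonal(slowness, h, source, max_iters=50, tol=1e-12):
    """First-order fast sweeping for |grad T| = s on a uniform 2-D grid.

    slowness: list of rows of slowness values, h: grid spacing,
    source: (row, col) index of the point source where T = 0.
    """
    ny, nx = len(slowness), len(slowness[0])
    inf = float("inf")
    T = [[inf] * nx for _ in range(ny)]
    si, sj = source
    T[si][sj] = 0.0

    # The four alternating sweep orderings of a 2-D grid.
    orders = [
        (range(ny), range(nx)),
        (range(ny), range(nx - 1, -1, -1)),
        (range(ny - 1, -1, -1), range(nx)),
        (range(ny - 1, -1, -1), range(nx - 1, -1, -1)),
    ]
    for _ in range(max_iters):
        change = 0.0
        for rows, cols in orders:
            for i in rows:
                for j in cols:
                    if (i, j) == (si, sj):
                        continue
                    # Smallest neighbour in each grid direction (upwind values).
                    a = min(T[i - 1][j] if i > 0 else inf,
                            T[i + 1][j] if i < ny - 1 else inf)
                    b = min(T[i][j - 1] if j > 0 else inf,
                            T[i][j + 1] if j < nx - 1 else inf)
                    if a == inf and b == inf:
                        continue  # no information has reached this node yet
                    f = slowness[i][j] * h
                    if abs(a - b) >= f:
                        # Wavefront arrives from one direction only.
                        t_new = min(a, b) + f
                    else:
                        # Two-sided update: root of (T-a)^2 + (T-b)^2 = f^2.
                        t_new = 0.5 * (a + b + math.sqrt(2.0 * f * f - (a - b) ** 2))
                    if t_new < T[i][j]:  # causality: values only decrease
                        change = max(change, T[i][j] - t_new)
                        T[i][j] = t_new
        if change < tol:
            break
    return T
```

For a uniform medium with a corner source, the traveltimes along the grid axes are reproduced exactly, while off-axis values carry the usual first-order overestimate.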
Fast approximate stochastic tractography.
Iglesias, Juan Eugenio; Thompson, Paul M; Liu, Cheng-Yi; Tu, Zhuowen
2012-01-01
Many different probabilistic tractography methods have been proposed in the literature to overcome the limitations of classical deterministic tractography: (i) lack of quantitative connectivity information; and (ii) robustness to noise, partial volume effects and selection of seed region. However, these methods rely on Monte Carlo sampling techniques that are computationally very demanding. This study presents an approximate stochastic tractography algorithm (FAST) that can be used interactively, as opposed to having to wait several minutes to obtain the output after marking a seed region. In FAST, tractography is formulated as a Markov chain that relies on a transition tensor. The tensor is designed to mimic the features of a well-known probabilistic tractography method based on a random walk model and Monte-Carlo sampling, but can also accommodate other propagation rules. Compared to the baseline algorithm, our method circumvents the sampling process and provides a deterministic solution at the expense of partially sacrificing sub-voxel accuracy. Therefore, the method is strictly speaking not stochastic, but provides a probabilistic output in the spirit of stochastic tractography methods. FAST was compared with the random walk model using real data from 10 patients in two different ways: 1. the probability maps produced by the two methods on five well-known fiber tracts were directly compared using metrics from the image registration literature; and 2. the connectivity measurements between different regions of the brain given by the two methods were compared using the correlation coefficient ρ. The results show that the connectivity measures provided by the two algorithms are well-correlated (ρ = 0.83), and so are the probability maps (normalized cross correlation 0.818 ± 0.081). The maps are also qualitatively (i.e., visually) very similar. The proposed method achieves a 60x speed-up (7 s vs. 7 min) over the Monte Carlo sampling scheme, therefore
Cwik, T.; Jamnejad, V.; Zuffada, C.
1994-12-31
The usefulness of finite element modeling follows from the ability to accurately simulate the geometry and three-dimensional fields on the scale of a fraction of a wavelength. To make this modeling practical for engineering design, it is necessary to integrate the stages of geometry modeling and mesh generation, numerical solution of the fields (a stage heavily dependent on the efficient use of a sparse matrix equation solver), and display of field information. The stages of geometry modeling, mesh generation, and field display are commonly completed using commercially available software packages. Algorithms for the numerical solution of the fields need to be written for the specific class of problems considered. Interior problems, i.e. simulating fields in waveguides and cavities, have been successfully solved using finite element methods. Exterior problems, i.e. simulating fields scattered or radiated from structures, are more difficult to model because of the need to numerically truncate the finite element mesh. To practically compute a solution to exterior problems, the domain must be truncated at some finite surface where the Sommerfeld radiation condition is enforced, either approximately or exactly. Approximate methods attempt to truncate the mesh using only local field information at each grid point, whereas exact methods are global, needing information from the entire mesh boundary. In this work, a method is developed that couples three-dimensional finite element (FE) solutions interior to the bounding surface with an efficient integral equation (IE) solution that exactly enforces the Sommerfeld radiation condition. The bounding surface is taken to be a surface of revolution (SOR) to greatly reduce computational expense in the IE portion of the modeling.
NASA Technical Reports Server (NTRS)
Chang, S. C.; Wang, X. Y.; Chow, C. Y.; Himansu, A.
1995-01-01
The method of space-time conservation element and solution element is a nontraditional numerical method designed from a physicist's perspective, i.e., its development is based more on physics than numerics. It uses only the simplest approximation techniques and yet is capable of generating nearly perfect solutions for a 2-D shock reflection problem used by Helen Yee and others. In addition to providing an overall view of the new method, we introduce a new concept in the design of implicit schemes, and use it to construct a highly accurate solver for a convection-diffusion equation. It is shown that, in the inviscid case, this new scheme becomes explicit and its amplification factors are identical to those of the Leapfrog scheme. On the other hand, in the pure diffusion case, its principal amplification factor becomes the amplification factor of the Crank-Nicolson scheme.
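For reference, the two amplification factors named in the abstract above have standard textbook forms. The notation is an assumption and does not come from the abstract: θ is the phase angle per grid cell, ν = aΔt/Δx is the Courant number for the advection equation u_t + a u_x = 0, and r = μΔt/Δx² is the diffusion number for u_t = μ u_xx.

```latex
% Leapfrog scheme for u_t + a u_x = 0 (two roots, |G| = 1 when |\nu| \le 1):
G_{\pm}(\theta) = -\,i\,\nu \sin\theta \;\pm\; \sqrt{1 - \nu^{2}\sin^{2}\theta}

% Crank-Nicolson scheme for u_t = \mu u_{xx} (|G| \le 1 for every r > 0):
G(\theta) = \frac{1 - 2r\sin^{2}(\theta/2)}{1 + 2r\sin^{2}(\theta/2)}
```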
A parallel explicit solver for unsteady compressible flows
NASA Astrophysics Data System (ADS)
Akay, H. U.; Ecer, A.; Kemle, W. B.
A previously developed sequential solver for the unsteady compressible Euler equations is implemented on an Intel iPSC/860 parallel computer. An explicit finite element formulation using the Clebsch variable form of the Euler equations is presented. A streamwise upwinding technique is employed for introducing artificial diffusion to convective terms. Applications are presented for the solution of transonic potential equations. For parallel implementation of the method, the three-dimensional solution domain is partitioned into a number of subdomains, with each subdomain residing on a separate processor for parallel computations. Information is exchanged between the solution blocks through overlapped boundaries at the block interfaces. The same algorithm can also be applied to steady flows by continuing the time integrations until the steady flow conditions are reached. It has been observed that the convergence rate to steady state is affected little by an increased number of solution blocks. Efficiency curves for nearly-balanced loads are obtained for different partitioning algorithms. The partition efficiency is shown to affect the central processing unit (CPU) efficiency of the algorithm directly.
An implicit-explicit flow solver for complex unsteady flows
NASA Astrophysics Data System (ADS)
Hsu, John Ming-Jey
2005-12-01
Current calculations of complex unsteady flows are prohibitively expensive for use in real engineering applications. Typical flow solvers for unsteady integration employ a fully implicit time stepping scheme, in which the equations are solved by an inner iteration. In order to achieve convergence within each physical time step, a substantial number of pseudo-time steps (typically between 30 and 100, depending on the case) are required. Another unfavorable characteristic of the dual time stepping method is that no error estimates for time accuracy are available unless the inner iterations are fully converged, although numerical experiments have demonstrated second order accuracy in time. The approach in this thesis is to construct hybrid type schemes by combining implicit and explicit schemes in a manner that guarantees second order accuracy in time. An initial time accurate ADI step is introduced, followed by a small number of cycles of the dual-time stepping scheme augmented by multigrid. The formal second order accuracy in time should be retained without the need for large numbers of inner iterations. The number of inner iterations required for convergence can thus be reduced while maintaining the same overall error levels. To investigate the effectiveness of the proposed scheme, several pitching airfoil test cases were examined, offering a close look at possible reductions in computational cost by adopting the present approach.
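The baseline dual time stepping procedure described in the abstract above (an implicit physical time step converged by pseudo-time inner iterations) can be sketched on a scalar model problem. This is only an illustration under stated assumptions: a scalar ODE du/dt = f(u) stands in for the flow equations, the implicit scheme is second-order backward differencing (BDF2) with a backward Euler start-up step, and the inner iteration is plain explicit pseudo-time marching. The hybrid ADI/multigrid scheme of the thesis is not reproduced, and the name `dual_time_bdf2` is hypothetical.

```python
def dual_time_bdf2(f, u0, dt, n_steps, dtau, inner_iters):
    """Integrate du/dt = f(u) with an implicit scheme in physical time,
    converging each step by explicit marching in pseudo-time tau.

    Returns the list [u(0), u(dt), ..., u(n_steps * dt)].
    """
    history = [u0]
    u_prev, u_curr = u0, u0
    for n in range(n_steps):
        u = u_curr  # initial guess for the new time level
        for _ in range(inner_iters):
            if n == 0:
                # Start-up step: backward Euler (only one old level exists).
                ddt = (u - u_curr) / dt
            else:
                # Second-order backward difference in physical time.
                ddt = (3.0 * u - 4.0 * u_curr + u_prev) / (2.0 * dt)
            residual = ddt - f(u)   # unsteady residual, zero at convergence
            u -= dtau * residual    # explicit pseudo-time update
        u_prev, u_curr = u_curr, u
        history.append(u)
    return history
```

With f(u) = -u, dt = 0.1 and dtau = 0.05, each inner iteration contracts the residual by a fixed factor, so a few dozen pseudo-time steps per physical step are enough for the result to match the closed-form BDF2 update; this is the inner-iteration cost that the thesis aims to reduce.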
Verification of continuum drift kinetic equation solvers in NIMROD
NASA Astrophysics Data System (ADS)
Held, E. D.; Kruger, S. E.; Ji, J.-Y.; Belli, E. A.; Lyons, B. C.
2015-03-01
Verification of continuum solutions to the electron and ion drift kinetic equations (DKEs) in NIMROD [C. R. Sovinec et al., J. Comp. Phys. 195, 355 (2004)] is demonstrated through comparison with several neoclassical transport codes, most notably NEO [E. A. Belli and J. Candy, Plasma Phys. Controlled Fusion 54, 015015 (2012)]. The DKE solutions use NIMROD's spatial representation, 2D finite-elements in the poloidal plane and a 1D Fourier expansion in toroidal angle. For 2D velocity space, a novel 1D expansion in finite elements is applied for the pitch angle dependence and a collocation grid is used for the normalized speed coordinate. The full, linearized Coulomb collision operator is kept and shown to be important for obtaining quantitative results. Bootstrap currents, parallel ion flows, and radial particle and heat fluxes show quantitative agreement between NIMROD and NEO for a variety of tokamak equilibria. In addition, velocity space distribution function contours for ions and electrons show nearly identical detailed structure and agree quantitatively. A Θ-centered, implicit time discretization and a block-preconditioned, iterative linear algebra solver provide efficient electron and ion DKE solutions that ultimately will be used to obtain closures for NIMROD's evolving fluid model.
An optimal iterative solver for the Stokes problem
Wathen, A.; Silvester, D.
1994-12-31
Discretisations of the classical Stokes problem for slow viscous incompressible flow give rise to systems of equations in matrix form for the velocity u and the pressure p, where the coefficient matrix is symmetric but necessarily indefinite. The square submatrix A is symmetric and positive definite and represents a discrete (vector) Laplacian, and the submatrix C may be the zero matrix or, more generally, will be symmetric positive semi-definite. For 'stabilised' discretisations (C ≠ 0) and discretisations which are inherently 'stable' (C = 0), and so do not admit spurious pressure components even as the mesh size h approaches zero, the Schur complement of the matrix has spectral condition number independent of h (given also that B is bounded). Here the authors will show how this property, together with a multigrid preconditioner only for the Laplacian block A, yields an optimal solver for the Stokes problem through use of the Minimum Residual iteration. That is, combining Minimum Residual iteration for the matrix equation with a block preconditioner which comprises a small number of multigrid V-cycles for the Laplacian block A together with a simple diagonal scaling block provides an iterative solution procedure for which the computational work grows only linearly with the problem size.
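The matrix equation itself is not shown in the abstract above. A form consistent with the blocks A, B and C that the text names is the usual saddle-point system below; the right-hand sides f and g, the sign placed on C, and the names P, M_A and D are assumptions made for illustration, not taken from the abstract.

```latex
% Assumed block form of the discrete Stokes system (A SPD, C symmetric PSD):
\begin{pmatrix} A & B^{T} \\ B & -C \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f \\ g \end{pmatrix},
\qquad
S = B A^{-1} B^{T} + C \quad \text{(Schur complement)}

% Block-diagonal preconditioner of the kind described:
% M_A \approx A from a few multigrid V-cycles, D a simple diagonal scaling.
P = \begin{pmatrix} M_A & 0 \\ 0 & D \end{pmatrix}
```

Because P is symmetric positive definite while the coefficient matrix is symmetric indefinite, the Minimum Residual iteration is applicable where conjugate gradients would not be.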
A multiblock multigrid three-dimensional Euler equation solver
NASA Technical Reports Server (NTRS)
Cannizzaro, Frank E.; Elmiligui, Alaa; Melson, N. Duane; Vonlavante, E.
1990-01-01
Current aerodynamic designs are often quite complex (geometrically). Flexible computational tools are needed for the analysis of a wide range of configurations with both internal and external flows. In the past, geometrically dissimilar configurations required different analysis codes with different grid topologies in each. The duplication of codes can be avoided with the use of a general multiblock formulation which can handle any grid topology. Rather than hard wiring the grid topology into the program, it is instead dictated by input to the program. In this work, the compressible Euler equations, written in a body-fitted finite-volume formulation, are solved using a pseudo-time-marching approach. Two upwind methods (van Leer's flux-vector-splitting and Roe's flux-differencing) were investigated. Two types of explicit solvers (a two-step predictor-corrector and a modified multistage Runge-Kutta) were used with multigrid acceleration to enhance convergence. A multiblock strategy is used to allow greater geometric flexibility. A report on simple explicit upwind schemes for solving compressible flows is included.
Cooperative solutions coupling a geometry engine and adaptive solver codes
NASA Technical Reports Server (NTRS)
Dickens, Thomas P.
1995-01-01
Follow-on work has progressed in using Aero Grid and Paneling System (AGPS), a geometry and visualization system, as a dynamic real time geometry monitor, manipulator, and interrogator for other codes. In particular, AGPS has been successfully coupled with adaptive flow solvers which iterate, refining the grid in areas of interest, and continuing on to a solution. With the coupling to the geometry engine, the new grids represent the actual geometry much more accurately since they are derived directly from the geometry and do not use refits to the first-cut grids. Additional work has been done with design runs where the geometric shape is modified to achieve a desired result. Various constraints are used to point the solution in a reasonable direction which also more closely satisfies the desired results. Concepts and techniques are presented, as well as examples of sample case studies. Issues such as distributed operation of the cooperative codes versus running all codes locally and pre-calculation for performance are discussed. Future directions are considered which will build on these techniques in light of changing computer environments.
Incremental planning to control a blackboard-based problem solver
NASA Technical Reports Server (NTRS)
Durfee, E. H.; Lesser, V. R.
1987-01-01
To control problem solving activity, a planner must resolve uncertainty about which specific long-term goals (solutions) to pursue and about which sequences of actions will best achieve those goals. A planner is described that abstracts the problem solving state to recognize possible competing and compatible solutions and to roughly predict the importance and expense of developing these solutions. With this information, the planner plans sequences of problem solving activities that most efficiently resolve its uncertainty about which of the possible solutions to work toward. The planner only details actions for the near future because the results of these actions will influence how (and whether) a plan should be pursued. As problem solving proceeds, the planner adds new details to the plan incrementally, and monitors and repairs the plan to ensure it achieves its goals whenever possible. Through experiments, researchers illustrate how these new mechanisms significantly improve problem solving decisions and reduce overall computation. They briefly discuss current research directions, including how these mechanisms can improve a problem solver's real-time response and can enhance cooperation in a distributed problem solving network.
A generalized Poisson solver for first-principles device simulations
Bani-Hashemian, Mohammad Hossein; VandeVondele, Joost; Brück, Sascha; Luisier, Mathieu
2016-01-28
Electronic structure calculations of atomistic systems based on density functional theory involve solving the Poisson equation. In this paper, we present a plane-wave based algorithm for solving the generalized Poisson equation subject to periodic or homogeneous Neumann conditions on the boundaries of the simulation cell and Dirichlet type conditions imposed at arbitrary subdomains. In this way, source, drain, and gate voltages can be imposed across atomistic models of electronic devices. Dirichlet conditions are enforced as constraints in a variational framework giving rise to a saddle point problem. The resulting system of equations is then solved using a stationary iterative method in which the generalized Poisson operator is preconditioned with the standard Laplace operator. The solver can make use of any sufficiently smooth function modelling the dielectric constant, including density dependent dielectric continuum models. For all the boundary conditions, consistent derivatives are available and molecular dynamics simulations can be performed. The convergence behaviour of the scheme is investigated and its capabilities are demonstrated.
Solute solver 'what if' module for modeling urea kinetics.
Daugirdas, John T
2016-11-01
The publicly available Solute Solver module allows calculation of a variety of two-pool urea kinetic measures of dialysis adequacy using pre- and postdialysis plasma urea and estimated dialyzer clearance or estimated urea distribution volumes as inputs. However, the existing program does not have a 'what if' module, which would estimate the plasma urea values as well as commonly used measures of hemodialysis adequacy for a patient with a given urea distribution volume and urea nitrogen generation rate dialyzed according to a particular dialysis schedule. Conventional variable extracellular volume 2-pool urea kinetic equations were used. A javascript-HTML Web form was created that can be used on any personal computer equipped with internet browsing software, to compute commonly used Kt/V-based measures of hemodialysis adequacy for patients with differing amounts of residual kidney function and following a variety of treatment schedules. The completed Web form calculator may be particularly useful in computing equivalent continuous clearances for incremental hemodialysis strategies. © The Author 2016. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Development and acceleration of unstructured mesh-based CFD solver
NASA Astrophysics Data System (ADS)
Emelyanov, V.; Karpenko, A.; Volkov, K.
2017-06-01
The study was undertaken as part of a larger effort to establish a common computational fluid dynamics (CFD) code for simulation of internal and external flows and involves some basic validation studies. The governing equations are solved with a finite volume code on unstructured meshes. The computational procedure involves reconstruction of the solution in each control volume and extrapolation of the unknowns to find the flow variables on the faces of the control volume, solution of the Riemann problem for each face of the control volume, and evolution of the time step. The nonlinear CFD solver works in an explicit time-marching fashion, based on a three-step Runge-Kutta stepping procedure. Convergence to a steady state is accelerated by the use of geometric technique and by the application of Jacobi preconditioning for high-speed flows, with a separate low Mach number preconditioning method for use with low-speed flows. The CFD code is implemented on graphics processing units (GPUs). Speedup of solution on GPUs with respect to solution on central processing units (CPUs) is compared with the use of different meshes and different methods of distribution of input data into blocks. The results obtained provide promising perspective for designing a GPU-based software framework for applications in CFD.
Algorithmic Enhancements to the VULCAN Navier-Stokes Solver
NASA Technical Reports Server (NTRS)
Litton, D. K.; Edwards, J. R.; White, J. A.
2003-01-01
VULCAN (Viscous Upwind aLgorithm for Complex flow ANalysis) is a cell centered, finite volume code used to solve high speed flows related to hypersonic vehicles. Two algorithms are presented for expanding the range of applications of the current Navier-Stokes solver implemented in VULCAN. The first addition is a highly implicit approach that uses subiterations to enhance block to block connectivity between adjacent subdomains. The addition of this scheme allows more efficient solution of viscous flows on highly-stretched meshes. The second algorithm addresses the shortcomings associated with density-based schemes by the addition of a time-derivative preconditioning strategy. High speed, compressible flows are typically solved with density based schemes, which show a high level of degradation in accuracy and convergence at low Mach numbers (M less than or equal to 0.1). With the addition of preconditioning and associated modifications to the numerical discretization scheme, the eigenvalues will scale with the local velocity, and the above problems will be eliminated. With these additions, VULCAN now has improved convergence behavior for multi-block, highly-stretched meshes and also can solve the Navier-Stokes equations for very low Mach numbers.
Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver
NASA Astrophysics Data System (ADS)
Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre
2014-06-01
This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
A `metric' semi-Lagrangian Vlasov-Poisson solver
NASA Astrophysics Data System (ADS)
Colombi, Stéphane; Alard, Christophe
2017-06-01
We propose a new semi-Lagrangian Vlasov-Poisson solver. It employs metric elements to follow locally the flow and its deformation, allowing one to find quickly and accurately the initial phase-space position of any test particle, by expanding at second order the geometry of the motion in the vicinity of the closest element. It is thus possible to reconstruct accurately the phase-space distribution function at any time and position by proper interpolation of initial conditions, following Liouville theorem. When distortion of the elements of metric becomes too large, it is necessary to create new initial conditions along with isotropic elements and repeat the procedure again until next resampling. To speed up the process, interpolation of the phase-space distribution is performed at second order during the transport phase, while third-order splines are used at the moments of remapping. We also show how to compute accurately the region of influence of each element of metric with the proper percolation scheme. The algorithm is tested here in the framework of one-dimensional gravitational dynamics but is implemented in such a way that it can be extended easily to four- or six-dimensional phase space. It can also be trivially generalised to plasmas.
A New Parallel N-Body Gravity Solver: TPM
NASA Astrophysics Data System (ADS)
Xu, Guohong
1995-05-01
We have developed a gravity solver based on combining the particle-mesh (PM) method and TREE methods. It is designed for and has been implemented on parallel computer architectures. The new code can deal with tens of millions of particles on current computers, with the calculation done on a parallel supercomputer or a group of workstations. Typically, the spatial resolution is enhanced by more than a factor of 20 over the pure PM code with mass resolution retained at nearly the PM level. This code runs much faster than a pure TREE code with the same number of particles and maintains almost the same resolution in high-density regions. Multiple time step integration has also been implemented with the code, with second-order time accuracy. The performance of the code has been checked in several kinds of parallel computer configurations, including IBM SP1, SGI Challenge, and a group of workstations, with the speedup of the parallel code on a 32 processor IBM SP2 supercomputer nearly linear (efficiency ≈ 80%) in the number of processors. The computation/communication ratio is also very high (~50), which means the code spends 95% of its CPU time in computation.
New numerical solver for flows at various Mach numbers
NASA Astrophysics Data System (ADS)
Miczek, F.; Röpke, F. K.; Edelmann, P. V. F.
2015-04-01
Context. Many problems in stellar astrophysics feature flows at low Mach numbers. Conventional compressible hydrodynamics schemes frequently used in the field have been developed for the transonic regime and exhibit excessive numerical dissipation for these flows. Aims: While schemes were proposed that solve hydrodynamics strictly in the low Mach regime and thus restrict their applicability, we aim at developing a scheme that correctly operates in a wide range of Mach numbers. Methods: Based on an analysis of the asymptotic behavior of the Euler equations in the low Mach limit we propose a novel scheme that is able to maintain a low Mach number flow setup while retaining all effects of compressibility. This is achieved by a suitable modification of the well-known Roe solver. Results: Numerical tests demonstrate the capability of this new scheme to reproduce slow flow structures even in moderate numerical resolution. Conclusions: Our scheme provides a promising approach to a consistent multidimensional hydrodynamical treatment of astrophysical low Mach number problems such as convection, instabilities, and mixing in stellar evolution.
Towards Batched Linear Solvers on Accelerated Hardware Platforms
Haidar, Azzam; Dong, Tingzing Tim; Tomov, Stanimire; Dongarra, Jack J
2015-01-01
As hardware evolves, an increasingly effective approach to developing energy-efficient, high-performance solvers is to design them to work on many small and independent problems. Indeed, many applications already need this functionality, especially for GPUs, which are known to be currently about four to five times more energy efficient than multicore CPUs for every floating-point operation. In this paper, we describe the development of the main one-sided factorizations (LU, QR, and Cholesky) that are needed for a set of small dense matrices to work in parallel. We refer to such algorithms as batched factorizations. Our approach is based on representing the algorithms as a sequence of batched BLAS routines for GPU-contained execution. Note that this is similar in functionality to the LAPACK and the hybrid MAGMA algorithms for large-matrix factorizations. But it is different from a straightforward approach, whereby each of the GPU's symmetric multiprocessors factorizes a single problem at a time. We illustrate how our performance analysis together with the profiling and tracing tools guided the development of batched factorizations to achieve up to 2-fold speedup and 3-fold better energy efficiency compared to our highly optimized batched CPU implementations based on the MKL library on a two-sockets, Intel Sandy Bridge server. Compared to a batched LU factorization featured in NVIDIA's CUBLAS library for GPUs, we achieve up to 2.5-fold speedup on the K40 GPU.
Intrusive Method for Uncertainty Quantification in a Multiphase Flow Solver
NASA Astrophysics Data System (ADS)
Turnquist, Brian; Owkes, Mark
2016-11-01
Uncertainty quantification (UQ) is a necessary, interesting, and often neglected aspect of fluid flow simulations. To determine the significance of uncertain initial and boundary conditions, a multiphase flow solver is being created which extends a single phase, intrusive, polynomial chaos scheme into multiphase flows. Reliably estimating the impact of input uncertainty on design criteria can help identify and minimize unwanted variability in critical areas, and has the potential to help advance knowledge in atomizing jets, jet engines, pharmaceuticals, and food processing. Use of an intrusive polynomial chaos method has been shown to significantly reduce computational cost over non-intrusive collocation methods such as Monte-Carlo. This method requires transforming the model equations into a weak form through substitution of stochastic (random) variables. Ultimately, the model deploys a stochastic Navier Stokes equation, a stochastic conservative level set approach including reinitialization, as well as stochastic normals and curvature. By implementing these approaches together in one framework, basic problems may be investigated which shed light on model expansion, uncertainty theory, and fluid flow in general. NSF Grant Number 1511325.
Fault tolerance in an inner-outer solver: A GVR-enabled case study
Zhang, Ziming; Chien, Andrew A.; Teranishi, Keita
2015-04-18
Resilience is a major challenge for large-scale systems. It is particularly important for iterative linear solvers, since they take much of the time of many scientific applications. We show that single bit flip errors in the Flexible GMRES iterative linear solver can lead to high computational overhead or even failure to converge to the right answer. Informed by these results, we design and evaluate several strategies for fault tolerance in both inner and outer solvers appropriate across a range of error rates. We implement them, extending Trilinos’ solver library with the Global View Resilience (GVR) programming model, which provides multi-stream snapshots, multi-version data structures with portable and rich error checking/recovery. Lastly, experimental results validate correct execution with low performance overhead under varied error conditions.
User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.
NASA Technical Reports Server (NTRS)
Reddy, C. J.
2000-01-01
PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to make the user acquainted with the installation and operation of the code. Driver routines are given to aid the users to integrate PCSMS routines in their own codes.
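The complex-to-real conversion described above can be illustrated with the standard real-equivalent formulation. The following is a minimal dense sketch, not PCSMS's actual routine: the function names `solve_real` and `solve_complex_via_real` are hypothetical, and a small Gaussian elimination stands in for the real sparse direct solver.

```python
def solve_real(M, rhs):
    """Solve a dense real system by Gaussian elimination with partial pivoting.

    Stand-in (assumption) for whatever real solver is actually available.
    """
    n = len(M)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def solve_complex_via_real(Z, b):
    """Solve Z x = b for complex Z and b using only a real solver.

    With Z = A + iB, x = u + iv and b = f + ig, the complex system is
    equivalent to the real 2n-by-2n system [[A, -B], [B, A]] [u; v] = [f; g].
    """
    n = len(Z)
    real_matrix = []
    for i in range(n):  # top block row: [A, -B]
        real_matrix.append([Z[i][j].real for j in range(n)] +
                           [-Z[i][j].imag for j in range(n)])
    for i in range(n):  # bottom block row: [B, A]
        real_matrix.append([Z[i][j].imag for j in range(n)] +
                           [Z[i][j].real for j in range(n)])
    real_rhs = [b[i].real for i in range(n)] + [b[i].imag for i in range(n)]
    uv = solve_real(real_matrix, real_rhs)
    # Reassemble the complex solution from its real and imaginary parts.
    return [complex(uv[i], uv[n + i]) for i in range(n)]
```

The real system has twice the dimension of the complex one, which is the price paid for reusing a real-only solver.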
Cognitive Distance Learning Problem Solver Reduces Search Cost through Learning Processes
NASA Astrophysics Data System (ADS)
Yamakawa, Hiroshi; Miyamoto, Yuji; Baba, Takayuki; Okada, Hiroyuki
Our proposed cognitive distance learning problem solver generates a sequence of actions from an initial state to goal states in a problem state space. The problem solver learns the cognitive distance (path cost) between arbitrary pairs of states. Action generation at each state is the selection of the next state that has the minimum cognitive distance to the goal, as in a Q-learning agent. In this paper, first, we show by analytical simulation in a spherical state space that our proposed method reduces search cost compared with a conventional search method. Second, we show by computer simulation in a tile-world state space that the average search cost decreases further the longer the prior learning term is and the more familiar the problem solver is with the environment. Third, we show by computer simulation that the proposed problem solver is superior to reinforcement learning techniques when the goal is changed. Fourth, we find that our simulation results are consistent with psychological experimental results.
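The action-selection rule in this abstract (move to the neighbouring state with the smallest distance to the goal) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: a breadth-first search from the goal stands in for the learning process, the state graph is assumed undirected with unit action costs, and the names `learn_distances`, `next_state`, and `plan` are hypothetical.

```python
from collections import deque


def learn_distances(neighbors, goal):
    """Return the number of actions needed to reach `goal` from each state.

    Assumption: breadth-first search replaces the learned cognitive distance.
    """
    distance = {goal: 0}
    queue = deque([goal])
    while queue:
        state = queue.popleft()
        for other in neighbors[state]:
            if other not in distance:
                distance[other] = distance[state] + 1
                queue.append(other)
    return distance


def next_state(neighbors, distance, state):
    """Pick the neighbouring state with the minimum distance to the goal."""
    return min(neighbors[state], key=lambda s: distance.get(s, float("inf")))


def plan(neighbors, start, goal):
    """Generate a state sequence from start to goal with no search at run time."""
    distance = learn_distances(neighbors, goal)
    if start not in distance:
        raise ValueError("goal is not reachable from start")
    path = [start]
    while path[-1] != goal:
        path.append(next_state(neighbors, distance, path[-1]))
    return path
```

Once the distances are available, each step costs only a comparison over the current state's neighbours, which is the sense in which search cost is reduced.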
A novel high-order, entropy stable, 3D AMR MHD solver with guaranteed positive pressure
NASA Astrophysics Data System (ADS)
Derigs, Dominik; Winters, Andrew R.; Gassner, Gregor J.; Walch, Stefanie
2016-07-01
We describe a high-order numerical magnetohydrodynamics (MHD) solver built upon a novel non-linear entropy stable numerical flux function that supports eight travelling wave solutions. By construction the solver conserves mass, momentum, and energy and is entropy stable. The method is designed to treat the divergence-free constraint on the magnetic field in a similar fashion to a hyperbolic divergence cleaning technique. The solver described herein is especially well-suited for flows involving strong discontinuities. Furthermore, we present a new formulation to guarantee positivity of the pressure. We present the underlying theory and implementation of the new solver into the multi-physics, multi-scale adaptive mesh refinement (AMR) simulation code FLASH (http://flash.uchicago.edu).
NASA Astrophysics Data System (ADS)
Guda, A. A.; Guda, S. A.; Soldatov, M. A.; Lomachenko, K. A.; Bugaev, A. L.; Lamberti, C.; Gawelda, W.; Bressler, C.; Smolentsev, G.; Soldatov, A. V.; Joly, Y.
2016-05-01
The finite difference method (FDM) implemented in the FDMNES software [Phys. Rev. B, 2001, 63, 125120] was revised. Thorough analysis shows that the calculated diagonal in the FDM matrix consists of about 96% zero elements. Thus a sparse solver would be more suitable for the problem than traditional Gaussian elimination for the diagonal neighbourhood. We tried several iterative sparse solvers, and the direct MUMPS solver with METIS ordering turned out to be the best. Compared to the Gaussian solver, the present method is up to 40 times faster and allows XANES simulations for complex systems already on personal computers. We show the applicability of the software for the metal-organic [Fe(bpy)3]2+ complex, both for low-spin and high-spin states populated after laser excitation.
Development of a Flow Solver with Complex Kinetics on the Graphic Processing Units
2011-09-22
Physics 109, 11 (2011), 113308. [9] Klockner, A., Warburton, T., Bridge, J., and Hesthaven, J. Nodal Discontinuous Galerkin Methods on Graphics...Graphic Processing Units (GPU) to model reactive gas mixture with detailed chemical kinetics. The solver incorporates high-order finite volume methods...method. We explored different approaches in implementing a fast kinetics solver on the GPU. The detail of the implementation is discussed in the
NASA Technical Reports Server (NTRS)
Mahajan, Aparajit J.; Dowell, Earl H.; Bliss, Donald B.
1991-01-01
A Lanczos procedure is presently applied to a Navier-Stokes (N-S) solver for eigenvalues and eigenvectors associated with the small-perturbation analysis of the N-S equations' finite-difference representation for airfoil flows; the matrix used is very large, sparse, real, and nonsymmetric. The Lanczos procedure is shown to furnish complete spectral information for the eigenvalues, as required for transient-stability analysis of N-S solvers.
The development of an intelligent interface to a computational fluid dynamics flow-solver code
NASA Technical Reports Server (NTRS)
Williams, Anthony D.
1988-01-01
Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, 3-D, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.
Implementation of a parallel unstructured Euler solver on the CM-5
NASA Technical Reports Server (NTRS)
Morano, Eric; Mavriplis, D. J.
1995-01-01
An efficient unstructured 3D Euler solver is parallelized on a Thinking Machines Corporation Connection Machine 5, a distributed memory computer with vectoring capability. In this paper, the single instruction multiple data (SIMD) strategy is employed through the use of the CM Fortran language and the CMSSL scientific library. The performance of the CMSSL mesh partitioner is evaluated and the overall efficiency of the parallel flow solver is discussed.
A New Block Solver for Large, Full, Unsymmetric, Complex Systems of Linear Algebraic Equations.
1988-02-01
THE COEFFICIENT C MATRIX IN THAT ORDER. ON OUTPUT, UTI CONTAINS THE SOLUTION C MATRIX. C C THE NASTRAN DMAP INSTRUCTIONS TO INTERFACE WITH ’OCSOLVE...developed. Although OCSOLVE was developed for use with the finite element program NASTRAN, it is designed to be easily adapted for other applications...solve such a system of 500 equations with complex-valued coefficients to about 5% of the time required by the equation solver in NASTRAN. The solver
The development of an intelligent interface to a computational fluid dynamics flow-solver code
NASA Technical Reports Server (NTRS)
Williams, Anthony D.
1988-01-01
Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, three-dimensional, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.
THE USE OF CLASSICAL LAX-FRIEDRICHS RIEMANN SOLVERS WITH DISCONTINUOUS GALERKIN METHODS
W. J. RIDER; R. B. LOWRIE
2001-03-01
While conducting a von Neumann stability analysis of discontinuous Galerkin methods we found that the standard Lax-Friedrichs (LxF) Riemann solver is unstable for all time-step sizes. A simple modification of the Riemann solver's dissipation returns the method to stability. Furthermore, the method has a smaller truncation error than the corresponding method with an upwind flux for the RK2-DG(1) method. These results are confirmed upon testing.
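For orientation, the general form of a Lax-Friedrichs-type interface flux is sketched below for a scalar conservation law: a central average of the physical fluxes minus a dissipation term. This is a generic textbook form under stated assumptions; the specific modification of the dissipation mentioned in the abstract is not given there and is not reproduced here. The names `lax_friedrichs_flux`, `advection_flux`, and `alpha` are hypothetical.

```python
def lax_friedrichs_flux(flux, u_left, u_right, alpha):
    """Interface flux: central average minus dissipation scaled by alpha.

    alpha = dx/dt gives the classical Lax-Friedrichs dissipation;
    alpha = |wave speed| recovers the upwind flux for linear advection.
    """
    central = 0.5 * (flux(u_left) + flux(u_right))
    dissipation = 0.5 * alpha * (u_right - u_left)
    return central - dissipation


def advection_flux(u, speed=1.0):
    """Physical flux f(u) = a u for linear advection (illustrative choice)."""
    return speed * u
```

For example, with unit wave speed and `alpha = 1.0` the function returns the left state's flux (pure upwinding), while a larger `alpha`, such as `dx/dt` for a time step well below the stability limit, adds more dissipation. The amount of dissipation is the single knob that distinguishes these fluxes.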
Wavelet-based Poisson solver for use in particle-in-cell simulations.
Terzić, Balsa; Pogorelov, Ilya V
2005-06-01
We report on a successful implementation of a wavelet-based Poisson solver for use in three-dimensional particle-in-cell simulations. Our method harnesses advantages afforded by the wavelet formulation, such as sparsity of operators and data sets, existence of effective preconditioners, and the ability to simultaneously remove numerical noise and further compress relevant data sets. We present and discuss preliminary results relating to the application of the new solver to test problems in accelerator physics and astrophysics.
Dynamic Linear Solver Selection for Transient Simulations Using Multi-label Classifiers
2012-01-01
Conference on Computational Science, ICCS 2012 Dynamic linear solver selection for transient simulations using multi-label classifiers Paul R. Eller...preconditioned linear solver as the output. Email addresses: Paul.R.Eller@usace.army.mil (Paul R. Eller), Ruth.C.Cheng@usace.army.mil (Jing-Ru C...1524 Paul R. Eller et al. / Procedia
A 3D Unstructured Mesh Euler Solver Based on the Fourth-Order CESE Method
2013-06-01
conservation in space and time without using a one-dimensional Riemann solver, (ii) genuinely multi-dimensional treatment without dimensional splitting (iii...of the original second-order CESE method, including: (i) flux conservation in space and time without using a one-dimensional Riemann solver, (ii...treated in a unified manner. The geometry for a three-dimensional CESE method is more difficult to visualize than the one- and two-dimensional methods
A new 3D Eikonal solver for accurate traveltimes, take-off angles and amplitudes
NASA Astrophysics Data System (ADS)
Noble, Mark; Gesret, Alexandrine
2013-04-01
The finite-difference approximation to the eikonal equation was first introduced by J. Vidale in 1988 to propagate first-arrival times throughout a 2D or 3D gridded velocity model. Even today this method remains very attractive from a computational point of view when dealing with large datasets. Among many domains of application, the eikonal solver may be used for 2D or 3D depth migration, tomography, or microseismicity data analysis. The original 3D method proposed by Vidale in 1990 exhibited some degree of travel-time error that may lead to poor image focusing in migration or inaccurate velocities estimated via tomographic inversion. The method even failed when large and sharp velocity contrasts were encountered. To overcome these limitations, many authors proposed alternative algorithms, incorporating new finite-difference operators and/or new schemes for implementing the operators to propagate the travel times through the velocity model. Although many recently published algorithms for solving the 3D eikonal equation do yield fairly accurate travel times for most applications, the spatial derivatives of the travel times remain very approximate and prevent reliable computation of auxiliary quantities such as take-off angle and amplitude. This limitation is due to the fact that the finite-difference operators locally assume that the wavefront is flat (plane wave). This assumption is particularly wrong close to the source, where a spherical approximation would be more suitable. To overcome this singularity at the source, some authors proposed an adaptive method that reduces inaccuracies, at the cost of greater algorithmic complexity. The objective of this study is to develop an efficient, simple 3D eikonal solver that is able to: overcome the problem of the source singularity, handle velocity models that exhibit strong vertical and horizontal velocity variations, and use different grid spacings along the x, y and z axes of the model. The final goal is of course to
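The near-source inaccuracy described in this abstract can be seen in a small experiment. The sketch below is a generic first-order fast-sweeping eikonal solver in 2D; it is neither Vidale's scheme nor the solver proposed in this study, and the function name `fast_sweep_eikonal` and its parameters are hypothetical. It assumes a uniform grid spacing `h` and a point source at a grid node.

```python
import math


def fast_sweep_eikonal(nx, ny, h, slowness, source, rounds=2):
    """First-arrival times on an nx-by-ny grid by first-order fast sweeping."""
    inf = float("inf")
    T = [[inf] * ny for _ in range(nx)]
    T[source[0]][source[1]] = 0.0
    # Four alternating sweep directions cover all characteristic directions.
    orders = [
        (range(nx), range(ny)),
        (range(nx - 1, -1, -1), range(ny)),
        (range(nx), range(ny - 1, -1, -1)),
        (range(nx - 1, -1, -1), range(ny - 1, -1, -1)),
    ]
    for _ in range(rounds):
        for xs, ys in orders:
            for i in xs:
                for j in ys:
                    if (i, j) == tuple(source):
                        continue
                    # Smallest neighbouring time along each axis (upwind values).
                    a = min(T[i - 1][j] if i > 0 else inf,
                            T[i + 1][j] if i < nx - 1 else inf)
                    b = min(T[i][j - 1] if j > 0 else inf,
                            T[i][j + 1] if j < ny - 1 else inf)
                    if a == inf and b == inf:
                        continue  # no information has reached this node yet
                    fs = h * slowness[i][j]
                    if abs(a - b) >= fs:
                        t = min(a, b) + fs  # arrival from one axis only
                    else:
                        # Locally planar wavefront arriving from both axes.
                        t = 0.5 * (a + b + math.sqrt(2.0 * fs * fs - (a - b) ** 2))
                    if t < T[i][j]:
                        T[i][j] = t
    return T
```

In a homogeneous medium with unit slowness and unit spacing, the times along the grid axes through the source are exact, but the node diagonally adjacent to the source receives 1 + sqrt(2)/2 ≈ 1.707 instead of the true sqrt(2) ≈ 1.414, because the local update treats the strongly curved wavefront as flat.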
Head and neck 192Ir HDR-brachytherapy dosimetry using a grid-based Boltzmann solver
Wolf, Sabine; Kóvacs, George
2013-01-01
Purpose To compare dosimetry for head and neck cancer patients, calculated with TG-43 formalism and a commercially available grid-based Boltzmann solver. Material and methods This study included 3D-dosimetry of 49 consecutive brachytherapy head and neck cancer patients, computed by a grid-based Boltzmann solver that takes into account tissue inhomogeneities as well as TG-43 formalism. 3D-treatment planning was carried out by using computed tomography. Results Dosimetric indices D90 and V100 for target volume were about 3% lower (median value) for the grid-based Boltzmann solver relative to TG-43-based computation (p < 0.01). The V150 dose parameter showed 1.6% increase from grid-based Boltzmann solver to TG-43 (p < 0.01). Conclusions Dose differences between results of a grid-based Boltzmann solver and TG-43 formalism for high-dose-rate head and neck brachytherapy patients to the target volume were found. Distinctions in D90 of CTV were low (2.63 Gy for grid-based Boltzmann solver vs. 2.71 Gy TG-43 in mean). In our clinical practice, prescription doses remain unchanged for high-dose-rate head and neck brachytherapy for the time being. PMID:24474973
Domain decomposition solvers for PDEs : some basics, practical tools, and new developments.
Dohrmann, Clark R.
2010-11-01
The first part of this talk provides a basic introduction to the building blocks of domain decomposition solvers. Specific details are given for both the classical overlapping Schwarz (OS) algorithm and a recent iterative substructuring (IS) approach called balancing domain decomposition by constraints (BDDC). A more recent hybrid OS-IS approach is also described. The success of domain decomposition solvers depends critically on the coarse space. Similarities and differences between the coarse spaces for OS and BDDC approaches are discussed, along with how they can be obtained from discrete harmonic extensions. Connections are also made between coarse spaces and multiscale modeling approaches from computational mechanics. As a specific example, details are provided on constructing coarse spaces for incompressible fluid problems. The next part of the talk deals with a variety of implementation details for domain decomposition solvers. These include mesh partitioning options, local and global solver options, reducing the coarse space dimension, dealing with constraint equations, residual weighting to accelerate the convergence of OS methods, and recycling of Krylov spaces to efficiently solve problems with multiple right hand sides. Some potential bottlenecks and remedies for domain decomposition solvers are also discussed. The final part of the talk concerns some recent theoretical advances, new algorithms, and open questions in the analysis of domain decomposition solvers. The focus will be primarily on the work of the speaker and his colleagues on elasticity, fluid mechanics, problems in H(curl), and the analysis of subdomains with irregular boundaries.
NASA Astrophysics Data System (ADS)
Yosui, Kuniaki; Iwashita, Takeshi; Mori, Michiya; Kobayashi, Eiichi
Finite element analyses of electromagnetic fields are commonly used for designing various electronic devices. The scale of the analyses keeps growing; therefore, a fast linear solver is needed to solve the linear equations arising from the finite element method. Since a multigrid solver is the fastest linear solver for these problems, parallelization of a multigrid solver is a quite useful approach. From the viewpoint of industrial applications, effective usage of a small-scale PC cluster is important due to the initial cost of introducing parallel computers. In this paper, a distributed parallel multigrid solver for a small-scale PC cluster is developed. In high frequency electromagnetic field analyses, a special block Gauss-Seidel smoother is used for the multigrid solver instead of general smoothers such as the Gauss-Seidel smoother or the Jacobi smoother in order to improve the convergence rate. The block multicolor ordering technique is applied to parallelize the smoother. A numerical example shows that a 3.7-fold speed-up in computational time and a 3.0-fold increase in the scale of the analysis were attained when the number of CPUs was increased from one to five.
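The multicolor ordering mentioned in the abstract above can be illustrated with a much simpler relative: a two-color (red-black) point Gauss-Seidel smoother for the 5-point Laplacian on a square grid. This is a minimal sketch under stated assumptions, not the block smoother of the paper; the model problem, the function names and the grid size are all hypothetical choices made for illustration. Points of one color depend only on points of the other color, so each half-sweep could be updated in parallel.

```python
def red_black_sweep(u, f, h):
    """One red-black Gauss-Seidel sweep for -Laplace(u) = f (5-point stencil).

    u and f are square lists of lists; boundary values of u are held fixed.
    All points with (i + j) even are updated first, then all points with
    (i + j) odd. Updates within one color are mutually independent.
    """
    n = len(u)
    for color in (0, 1):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                if (i + j) % 2 == color:
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1]
                                      + h * h * f[i][j])


def residual_max(u, f, h):
    """Maximum absolute residual of the discrete equations at interior points."""
    n = len(u)
    worst = 0.0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            lap = (4.0 * u[i][j] - u[i - 1][j] - u[i + 1][j]
                   - u[i][j - 1] - u[i][j + 1]) / (h * h)
            worst = max(worst, abs(f[i][j] - lap))
    return worst


def smooth_demo(n=9, sweeps=50):
    """Hypothetical driver: unit source, zero boundary values, zero initial guess."""
    h = 1.0 / (n - 1)
    u = [[0.0] * n for _ in range(n)]
    f = [[1.0] * n for _ in range(n)]
    before = residual_max(u, f, h)
    for _ in range(sweeps):
        red_black_sweep(u, f, h)
    return before, residual_max(u, f, h)
```

In a multigrid setting only a few such sweeps would be applied per level; the driver runs many sweeps only to show that the residual decreases.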
Head and neck (192)Ir HDR-brachytherapy dosimetry using a grid-based Boltzmann solver.
Siebert, Frank-André; Wolf, Sabine; Kóvacs, George
2013-12-01
To compare dosimetry for head and neck cancer patients, calculated with TG-43 formalism and a commercially available grid-based Boltzmann solver. This study included 3D-dosimetry of 49 consecutive brachytherapy head and neck cancer patients, computed by a grid-based Boltzmann solver that takes into account tissue inhomogeneities as well as TG-43 formalism. 3D-treatment planning was carried out by using computed tomography. Dosimetric indices D90 and V100 for target volume were about 3% lower (median value) for the grid-based Boltzmann solver relative to TG-43-based computation (p < 0.01). The V150 dose parameter showed 1.6% increase from grid-based Boltzmann solver to TG-43 (p < 0.01). Dose differences between results of a grid-based Boltzmann solver and TG-43 formalism for high-dose-rate head and neck brachytherapy patients to the target volume were found. Distinctions in D90 of CTV were low (2.63 Gy for grid-based Boltzmann solver vs. 2.71 Gy TG-43 in mean). In our clinical practice, prescription doses remain unchanged for high-dose-rate head and neck brachytherapy for the time being.
Fast Poisson, Fast Helmholtz and fast linear elastostatic solvers on rectangular parallelepipeds
Wiegmann, A.
1999-06-01
FFT-based fast Poisson and fast Helmholtz solvers on rectangular parallelepipeds for periodic boundary conditions in one-, two and three space dimensions can also be used to solve Dirichlet and Neumann boundary value problems. For non-zero boundary conditions, this is the special, grid-aligned case of jump corrections used in the Explicit Jump Immersed Interface method. Fast elastostatic solvers for periodic boundary conditions in two and three dimensions can also be based on the FFT. From the periodic solvers we derive fast solvers for the new 'normal' boundary conditions and essential boundary conditions on rectangular parallelepipeds. The periodic case allows a simple proof of existence and uniqueness of the solutions to the discretization of normal boundary conditions. Numerical examples demonstrate the efficiency of the fast elastostatic solvers for non-periodic boundary conditions. More importantly, the fast solvers on rectangular parallelepipeds can be used together with the Immersed Interface Method to solve problems on non-rectangular domains with general boundary conditions. Details of this are reported in the preprint The Explicit Jump Immersed Interface Method for 2D Linear Elastostatics by the author.
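The periodic building block described above can be sketched in one dimension. The example below assumes the standard second-order finite-difference discretization of -u'' = f on a periodic grid, which the discrete Fourier transform diagonalizes; the small recursive radix-2 FFT and all function names are illustrative assumptions, not code from the report, and the jump corrections needed for Dirichlet or Neumann data are not shown.

```python
import cmath
import math


def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def solve_periodic_poisson(f, h):
    """Solve (-u[j-1] + 2 u[j] - u[j+1]) / h^2 = f[j] with periodic wrap-around.

    The right-hand side is assumed to have zero mean; the returned solution
    is the one with zero mean (the k = 0 Fourier mode is set to zero).
    """
    n = len(f)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    f_hat = fft([complex(x) for x in f])
    u_hat = [0j] * n
    for k in range(1, n):
        # Eigenvalue of the periodic second-difference operator for mode k.
        lam = (2.0 - 2.0 * math.cos(2.0 * math.pi * k / n)) / (h * h)
        u_hat[k] = f_hat[k] / lam
    u = fft(u_hat, inverse=True)
    return [x.real / n for x in u]
```

For f sampled from sin(x) on [0, 2*pi) the result approximates sin(x) to second order in h. The same pattern (transform, divide by the symbol, transform back) extends to two and three dimensions by transforming along each axis in turn.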
Basis Function Approximation of Transonic Aerodynamic Influence Coefficient Matrix
NASA Technical Reports Server (NTRS)
Li, Wesley Waisang; Pak, Chan-Gi
2010-01-01
A technique for approximating the modal aerodynamic influence coefficients [AIC] matrices by using basis functions has been developed and validated. An application of the resulting approximated modal AIC matrix for a flutter analysis in transonic speed regime has been demonstrated. This methodology can be applied to the unsteady subsonic, transonic and supersonic aerodynamics. The method requires the unsteady aerodynamics in frequency-domain. The flutter solution can be found by the classic methods, such as rational function approximation, k, p-k, p, root-locus et cetera. The unsteady aeroelastic analysis for design optimization using unsteady transonic aerodynamic approximation is being demonstrated using the ZAERO(TradeMark) flutter solver (ZONA Technology Incorporated, Scottsdale, Arizona). The technique presented has been shown to offer consistent flutter speed prediction on an aerostructures test wing [ATW] 2 configuration with negligible loss in precision in transonic speed regime. These results may have practical significance in the analysis of aircraft aeroelastic calculation and could lead to a more efficient design optimization cycle
Basis Function Approximation of Transonic Aerodynamic Influence Coefficient Matrix
NASA Technical Reports Server (NTRS)
Li, Wesley W.; Pak, Chan-gi
2011-01-01
A technique for approximating the modal aerodynamic influence coefficients matrices by using basis functions has been developed and validated. An application of the resulting approximated modal aerodynamic influence coefficients matrix for a flutter analysis in transonic speed regime has been demonstrated. This methodology can be applied to the unsteady subsonic, transonic, and supersonic aerodynamics. The method requires the unsteady aerodynamics in frequency-domain. The flutter solution can be found by the classic methods, such as rational function approximation, k, p-k, p, root-locus et cetera. The unsteady aeroelastic analysis for design optimization using unsteady transonic aerodynamic approximation is being demonstrated using the ZAERO flutter solver (ZONA Technology Incorporated, Scottsdale, Arizona). The technique presented has been shown to offer consistent flutter speed prediction on an aerostructures test wing 2 configuration with negligible loss in precision in transonic speed regime. These results may have practical significance in the analysis of aircraft aeroelastic calculation and could lead to a more efficient design optimization cycle.
Implicit lower-upper/approximate-factorization schemes for incompressible flows
Briley, W.R.; Neerarambam, S.S.; Whitfield, D.L.
1996-10-01
A lower-upper/approximate-factorization (LU/AF) scheme is developed for the incompressible Euler or Navier-Stokes equations. The LU/AF scheme contains an iteration parameter that can be adjusted to improve iterative convergence rate. The LU/AF scheme is to be used in conjunction with linearized implicit approximations and artificial compressibility to compute steady solutions, and within sub-iterations to compute unsteady solutions. Formulations based on time linearization with and without sub-iteration and on Newton linearization are developed using spatial difference operators. The spatial approximation used includes upwind differencing based on Roe's approximate Riemann solver and van Leer's MUSCL scheme, with numerically computed implicit flux linearizations. Simple one-dimensional diffusion and advection/diffusion problems are first studied analytically to provide insight for development of the Navier-Stokes algorithm. The optimal values of both time step and LU/AF parameter are determined for a test problem consisting of two-dimensional flow past a NACA 0012 airfoil, with a highly stretched grid. The optimal parameter provides a consistent improvement in convergence rate for four test cases having different grids and Reynolds numbers and, also, for an inviscid case. The scheme can be easily extended to three dimensions and adapted for compressible flows. 24 refs., 11 figs., 2 tabs.
A parallel solver for huge dense linear systems
NASA Astrophysics Data System (ADS)
Badia, J. M.; Movilla, J. L.; Climente, J. I.; Castillo, M.; Marqués, M.; Mayo, R.; Quintana-Ortí, E. S.; Planelles, J.
2011-11-01
HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) that facilitates the parallel solution of very large dense systems for scientists and engineers. The API makes use of parallelism to yield an efficient solution of the systems on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies to leverage the secondary memory in order to solve huge linear systems, O(100 000). The API is based on the parallel linear algebra library PLAPACK, and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API provides a friendly interface to the users, hiding almost all the technical aspects related to the parallel execution of the code and the use of the secondary memory to solve the systems. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOP with 64 cores to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors. New version program summary. Program title: Huge Dense System Solver (HDSS). Catalogue identifier: AEHU_v1_1. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 87 062. No. of bytes in distributed program, including test data, etc.: 1 069 110. Distribution format: tar.gz. Programming language: Fortran90, C. Computer: Parallel architectures: multiprocessors, computer clusters. Operating system
Woodward, Carol S.; Gardner, David J.; Evans, Katherine J.
2015-01-01
Efficient solutions of global climate models require effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a step size governed by accuracy of the processes of interest rather than by stability of the fastest time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, a Newton's method is applied to solve these systems. Each iteration of the Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but this Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference approximation which may introduce inaccuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite difference approximations of these matrix-vector products for climate dynamics within the spectral element shallow water dynamical core of the Community Atmosphere Model.
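The finite difference approximation of a Jacobian-vector product mentioned above is commonly written as J(u) v ≈ (F(u + eps v) - F(u)) / eps. The sketch below applies that formula to a toy two-variable residual; the residual, the function names and the step-size heuristic are assumptions chosen for illustration and are not taken from the Community Atmosphere Model code.

```python
import math


def toy_residual(u):
    """Hypothetical nonlinear residual F(u) standing in for a model residual."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + math.sin(u[1])]


def jacobian_vector_fd(F, u, v, eps=None):
    """Approximate J(u) v with a forward difference, without forming J."""
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    if eps is None:
        # A common heuristic: scale sqrt(machine epsilon) by the sizes of u and v.
        norm_u = math.sqrt(sum(x * x for x in u))
        eps = math.sqrt(2.2e-16) * (1.0 + norm_u) / norm_v
    shifted = [ui + eps * vi for ui, vi in zip(u, v)]
    f0, f1 = F(u), F(shifted)
    return [(a - b) / eps for a, b in zip(f1, f0)]


def jacobian_vector_exact(u, v):
    """Analytic J(u) v for toy_residual, used only to check the approximation."""
    return [2.0 * u[0] * v[0] + v[1], v[0] + math.cos(u[1]) * v[1]]
```

The step size trades truncation error (which grows with eps) against round-off error (which grows as eps shrinks), which is one source of the inaccuracy the abstract refers to. Each product costs one extra residual evaluation.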
A RADIATION TRANSFER SOLVER FOR ATHENA USING SHORT CHARACTERISTICS
Davis, Shane W.; Stone, James M.; Jiang Yanfei
2012-03-01
We describe the implementation of a module for the Athena magnetohydrodynamics (MHD) code that solves the time-independent, multi-frequency radiative transfer (RT) equation on multidimensional Cartesian simulation domains, including scattering and non-local thermodynamic equilibrium (LTE) effects. The module is based on well known and well tested algorithms developed for modeling stellar atmospheres, including the method of short characteristics to solve the RT equation, accelerated Lambda iteration to handle scattering and non-LTE effects, and parallelization via domain decomposition. The module serves several purposes: it can be used to generate spectra and images, to compute a variable Eddington tensor (VET) for full radiation MHD simulations, and to calculate the heating and cooling source terms in the MHD equations in flows where radiation pressure is small compared with gas pressure. For the latter case, the module is combined with the standard MHD integrators using operator splitting: we describe this approach in detail, including a new constraint on the time step for stability due to radiation diffusion modes. Implementation of the VET method for radiation pressure dominated flows is described in a companion paper. We present results from a suite of test problems for both the RT solver itself and for dynamical problems that include radiative heating and cooling. These tests demonstrate that the radiative transfer solution is accurate and confirm that the operator split method is stable, convergent, and efficient for problems of interest. We demonstrate there is no need to adopt ad hoc assumptions of questionable accuracy to solve RT problems in concert with MHD: the computational cost for our general-purpose module for simple (e.g., LTE gray) problems can be comparable to or less than a single time step of Athena's MHD integrators, and only few times more expensive than that for more general (non-LTE) problems.
A Radiation Transfer Solver for Athena Using Short Characteristics
NASA Astrophysics Data System (ADS)
Davis, Shane W.; Stone, James M.; Jiang, Yan-Fei
2012-03-01
We describe the implementation of a module for the Athena magnetohydrodynamics (MHD) code that solves the time-independent, multi-frequency radiative transfer (RT) equation on multidimensional Cartesian simulation domains, including scattering and non-local thermodynamic equilibrium (LTE) effects. The module is based on well known and well tested algorithms developed for modeling stellar atmospheres, including the method of short characteristics to solve the RT equation, accelerated Lambda iteration to handle scattering and non-LTE effects, and parallelization via domain decomposition. The module serves several purposes: it can be used to generate spectra and images, to compute a variable Eddington tensor (VET) for full radiation MHD simulations, and to calculate the heating and cooling source terms in the MHD equations in flows where radiation pressure is small compared with gas pressure. For the latter case, the module is combined with the standard MHD integrators using operator splitting: we describe this approach in detail, including a new constraint on the time step for stability due to radiation diffusion modes. Implementation of the VET method for radiation pressure dominated flows is described in a companion paper. We present results from a suite of test problems for both the RT solver itself and for dynamical problems that include radiative heating and cooling. These tests demonstrate that the radiative transfer solution is accurate and confirm that the operator split method is stable, convergent, and efficient for problems of interest. We demonstrate there is no need to adopt ad hoc assumptions of questionable accuracy to solve RT problems in concert with MHD: the computational cost for our general-purpose module for simple (e.g., LTE gray) problems can be comparable to or less than a single time step of Athena's MHD integrators, and only few times more expensive than that for more general (non-LTE) problems.
Patients as partners, patients as problem-solvers.
Young, Amanda; Flower, Linda
2002-01-01
This article reports our ongoing work in developing a model of health care communication called collaborative interpretation, which we define as a rhetorical practice that generates building blocks for a more complete and coherent diagnostic story and for a collaborative treatment plan. It does this by situating patients as problem-solvers. Our study begins with an analysis of provider-patient interactions in a specific setting, the emergency department (ED) of an urban trauma-level hospital, where we observed patients and providers miscommunicating in at least 3 distinct areas: over the meaning of key terms, in the framing of the immediate problem, and over the perceived role of the ED in serving the individual and the community. From our observations, we argue that all of these miscommunications and missed opportunities are rooted in mismatched expectations on the part of both provider and patient and the lack of explicit comparison and negotiation of expectations; in other words, a failure to see the patient-provider interaction as a rhetorical, knowledge-building event. In the process of observing interactions, conversing with patients and providers, and working with a team of providers and patients, we have developed an operational model of communication that could narrow the gap between the lay public and the medical profession, a gap that is especially critical in intercultural settings like the one we have studied. This model of collaborative interpretation (CI) provides strategies to help patients to represent their medical problems in the context of their life experiences and to share the logic behind their health care decisions. In addition, CI helps both patient and provider identify their goals and expectations in treatment, the obstacles that each party perceives, and the available options. It is adaptable to various settings, including short, structured conversations in the emergency room, extended dialogue between a health educator and a patient in a
DALI: Derivative Approximation for LIkelihoods
NASA Astrophysics Data System (ADS)
Sellentin, Elena
2015-07-01
DALI (Derivative Approximation for LIkelihoods) is a fast approximation of non-Gaussian likelihoods. It extends the Fisher Matrix in a straightforward way and allows for a wider range of posterior shapes. The code is written in C/C++.
Taylor Approximations and Definite Integrals
ERIC Educational Resources Information Center
Gordon, Sheldon P.
2007-01-01
We investigate the possibility of approximating the value of a definite integral by approximating the integrand rather than using numerical methods to approximate the value of the definite integral. Particular cases considered include examples where the integral is improper, such as an elliptic integral. (Contains 4 tables and 2 figures.)
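As a worked instance of integrating a Taylor approximation of the integrand, consider the integral of exp(-x^2) from 0 to 1. Integrating the Maclaurin series term by term gives the sum over n of (-1)^n / (n! (2n + 1)). The sketch below evaluates that partial sum; the choice of integrand and the function name are assumptions made for illustration and may not be among the cases treated in the article.

```python
import math


def taylor_integral_exp_neg_x2(terms):
    """Integrate the Maclaurin polynomial of exp(-x^2) over [0, 1] term by term.

    exp(-x^2) = sum_n (-1)^n x^(2n) / n!, so the integral over [0, 1] is
    sum_n (-1)^n / (n! * (2n + 1)). `terms` is the number of series terms kept.
    """
    total = 0.0
    for n in range(terms):
        total += (-1) ** n / (math.factorial(n) * (2 * n + 1))
    return total
```

Because the series is alternating with decreasing terms, the error after a given number of terms is bounded by the first omitted term, so the accuracy can be controlled without any quadrature rule. The exact value for comparison is sqrt(pi)/2 * erf(1).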
Approximate equilibria for Bayesian games
NASA Astrophysics Data System (ADS)
Mallozzi, Lina; Pusillo, Lucia; Tijs, Stef
2008-07-01
In this paper the problem of the existence of approximate equilibria in mixed strategies is central. Sufficient conditions are given under which approximate equilibria exist for non-finite Bayesian games. Further one possible approach is suggested to the problem of the existence of approximate equilibria for the class of multicriteria Bayesian games.
Efficient IMRT inverse planning with a new L1-solver: template for first-order conic solver
NASA Astrophysics Data System (ADS)
Kim, Hojin; Suh, Tae-Suk; Lee, Rena; Xing, Lei; Li, Ruijiang
2012-07-01
Intensity modulated radiation therapy (IMRT) inverse planning using total-variation (TV) regularization has been proposed to reduce the complexity of fluence maps and facilitate dose delivery. Conventionally, the optimization problem with L-1 norm is solved with quadratic programming (QP), which is time consuming and memory expensive due to the second-order Newton update. This study proposes to use a new algorithm, template for first-order conic solver (TFOCS), for fast and memory-efficient optimization in IMRT inverse planning. The TFOCS utilizes dual-variable updates and first-order approaches for TV minimization without the need to compute and store the enlarged Hessian matrix required for Newton update in the QP technique. To evaluate the effectiveness and efficiency of the proposed method, two clinical cases were used for IMRT inverse planning: a head and neck case and a prostate case. For comparison, the conventional QP-based method for the TV form was adopted to solve the fluence map optimization problem in the above two cases. The convergence criteria and algorithm parameters were selected to achieve similar dose conformity for a fair comparison between the two methods. Compared with conventional QP-based approach, the proposed TFOCS-based method shows a remarkable improvement in computational efficiency for fluence map optimization, while maintaining the conformal dose distribution. Compared with QP-based algorithms, the computational speed using TFOCS for fluence optimization is increased by a factor of 4 to 6, and at the same time the memory requirement is reduced by a factor of 3 to 4. Therefore, TFOCS provides an effective, fast and memory-efficient method for IMRT inverse planning. The unique features of the approach should be particularly important in inverse planning involving a large number of beams, such as in VMAT and dense angularly sampled and sparse intensity modulated radiation therapy (DASSIM-RT).
Modeling of photon migration in the human lung using a finite volume solver
NASA Astrophysics Data System (ADS)
Sikorski, Zbigniew; Furmanczyk, Michal; Przekwas, Andrzej J.
2006-02-01
The application of the frequency domain and steady-state diffusive optical spectroscopy (DOS) and steady-state near infrared spectroscopy (NIRS) to diagnosis of the human lung injury challenges many elements of these techniques. These include the DOS/NIRS instrument performance and accurate models of light transport in heterogeneous thorax tissue. The thorax tissue not only consists of different media (e.g. chest wall with ribs, lungs) but its optical properties also vary with time due to respiration and changes in thorax geometry with contusion (e.g. pneumothorax or hemothorax). This paper presents a finite volume solver developed to model photon migration in the diffusion approximation in heterogeneous complex 3D tissues. The code applies boundary conditions that account for Fresnel reflections. We propose an effective diffusion coefficient for the void volumes (pneumothorax) based on the assumption of the Lambertian diffusion of photons entering the pleural cavity and accounting for the local pleural cavity thickness. The code has been validated using the MCML Monte Carlo code as a benchmark. The code environment enables a semi-automatic preparation of 3D computational geometry from medical images and its rapid automatic meshing. We present the application of the code to analysis/optimization of the hybrid DOS/NIRS/ultrasound technique in which ultrasound provides data on the localization of thorax tissue boundaries. The code effectiveness (3D complex case computation takes 1 second) enables its use to quantitatively relate detected light signal to absorption and reduced scattering coefficients that are indicators of the pulmonary physiologic state (hemoglobin concentration and oxygenation).
A Computationally Efficient Multicomponent Equilibrium Solver for Aerosols (MESA)
Zaveri, Rahul A.; Easter, Richard C.; Peters, Len K.
2005-12-23
This paper describes the development and application of a new multicomponent equilibrium solver for aerosol-phase (MESA) to predict the complex solid-liquid partitioning in atmospheric particles containing H+, NH4+, Na+, Ca2+, SO4=, HSO4-, NO3-, and Cl- ions. The algorithm of MESA involves integrating the set of ordinary differential equations describing the transient precipitation and dissolution reactions for each salt until the system satisfies the equilibrium or mass convergence criteria. Arbitrary values are chosen for the dissolution and precipitation rate constants such that their ratio is equal to the equilibrium constant. Numerically, this approach is equivalent to iterating all the equilibrium reactions simultaneously with a single iteration loop. Because CaSO4 is sparingly soluble, it is assumed to exist as a solid over the entire RH range to simplify the algorithm for calcium containing particles. Temperature-dependent mutual deliquescence relative humidity polynomials (valid from 240 to 310 K) for all the possible salt mixtures were constructed using the comprehensive Pitzer-Simonson-Clegg (PSC) activity coefficient model at 298.15 K and temperature-dependent equilibrium constants in MESA. Performance of MESA is evaluated for 16 representative mixed-electrolyte systems commonly found in tropospheric aerosols using PSC and two other multicomponent activity coefficient methods – Multicomponent Taylor Expansion Method (MTEM) of Zaveri et al. [2004], and the widely-used Kusik and Meissner method (KM), and the results are compared against the predictions of the Web-based AIM Model III or available experimental data. Excellent agreement was found between AIM, MESA-PSC, and MESA-MTEM predictions of the multistage deliquescence growth as a function of RH. On the other hand, MESA-KM displayed up to 20% deviations in the mass growth factors for common salt mixtures in the sulfate-poor cases while significant discrepancies were found in the predicted multistage
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
NASA Technical Reports Server (NTRS)
Sun, Xian-He
1997-01-01
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conducting preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporating newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, (2) using a compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
Transonic adaptive flutter suppression using approximate unsteady time domain aerodynamics
NASA Technical Reports Server (NTRS)
Pak, Chan-Gi; Friedmann, Peretz P.; Livne, Eli
1991-01-01
A digital adaptive controller is applied to the active flutter suppression problem of a wing under time-varying flight conditions in subsonic and transonic flow. The linear quadratic controller gain at each time step is obtained using an iterative Riccati solver. The digital adaptive optimal controller is robust with respect to the unknown external loads. Flutter and divergence instabilities are simultaneously suppressed using a trailing-edge control surface and displacement sensing. A new transonic unsteady aerodynamic approximation methodology is developed which enables one to carry out the rapid calculation required for transonic aeroservoelastic applications. This approximation is based on unsteady subsonic aerodynamics combined with a transonic correction procedure. Aeroservoelastic transient time response is obtained using Roger's approximation, state transition matrices and an iterative time marching algorithm. The aeroservoelastic system in the time domain is modeled using a deterministic ARMA model together with a parameter estimator. Transonic flutter boundaries of a wing structure are computed, in the time domain, using an estimated aeroelastic system matrix and are in good agreement with experimental data for the low transonic Mach number range.
Wilson, John D.; Naff, Richard L.
2004-01-01
A geometric multigrid solver (GMG), based on the preconditioned conjugate gradient algorithm, has been developed for solving systems of equations resulting from applying the cell-centered finite difference algorithm to flow in porous media. This solver has been adapted to the U.S. Geological Survey ground-water flow model MODFLOW-2000. The documentation herein is a description of the solver and the adaptation to MODFLOW-2000.
Combining global and local approximations
NASA Technical Reports Server (NTRS)
Haftka, Raphael T.
1991-01-01
A method based on a linear approximation to a scaling factor, designated the 'global-local approximation' (GLA) method, is presented and shown capable of extending the range of usefulness of derivative-based approximations to a more refined model. The GLA approach refines the conventional scaling factor by means of a linearly varying, rather than constant, scaling factor. The capabilities of the method are demonstrated for a simple beam example with a crude and more refined FEM model.
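The idea of a linearly varying scaling factor can be sketched in a few lines. The example below is a minimal illustration, not the beam example from the abstract: the functions `crude` and `refined` are hypothetical stand-ins for the two models, and the central-difference `derivative` helper is an assumption standing in for whatever sensitivities the models provide. It compares a constant scaling factor beta(x0) = refined(x0)/crude(x0) with a scaling factor linearized about x0.

```python
def crude(x):
    """Hypothetical crude model response (illustrative only)."""
    return 1.0 / x**3


def refined(x):
    """Hypothetical refined model response (illustrative only)."""
    return 1.0 / x**3 + 0.1 / x


def derivative(f, x, h=1e-6):
    """Central-difference derivative, standing in for model sensitivities."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def constant_scaling(x, x0):
    """Conventional approach: scale the crude model by the constant beta(x0)."""
    beta0 = refined(x0) / crude(x0)
    return beta0 * crude(x)


def gla(x, x0):
    """Global-local approximation: scale by a linearly varying beta."""
    fc, fr = crude(x0), refined(x0)
    beta0 = fr / fc
    # Quotient rule for d(beta)/dx evaluated at x0.
    dbeta = (derivative(refined, x0) * fc - fr * derivative(crude, x0)) / fc**2
    return (beta0 + dbeta * (x - x0)) * crude(x)


x0, x = 1.0, 1.2
err_const = abs(constant_scaling(x, x0) - refined(x))
err_gla = abs(gla(x, x0) - refined(x))
print(err_const, err_gla)
```

For these toy functions the linearly varying factor tracks the refined model more closely away from x0 than the constant factor does. How large the gain is in practice depends on how nearly linear the true ratio of the two models is.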
Combining global and local approximations
Haftka, R.T.
1991-09-01
A method based on a linear approximation to a scaling factor, designated the 'global-local approximation' (GLA) method, is presented and shown capable of extending the range of usefulness of derivative-based approximations to a more refined model. The GLA approach refines the conventional scaling factor by means of a linearly varying, rather than constant, scaling factor. The capabilities of the method are demonstrated for a simple beam example with a crude and more refined FEM model. 6 refs.
Phenomenological applications of rational approximants
NASA Astrophysics Data System (ADS)
Gonzàlez-Solís, Sergi; Masjuan, Pere
2016-08-01
We illustrate the power of Padé approximants (PAs) as a summation method and explore one of their extensions, the so-called quadratic approximants (QAs), to access both space- and (low-energy) time-like (TL) regions. As an introductory and pedagogical exercise, the function (1/z) ln(1 + z) is approximated by both kinds of approximants. Then, PAs are applied to predict pseudoscalar meson Dalitz decays and to extract V_ub from the semileptonic B → πℓν_ℓ decays. Finally, the π vector form factor in the TL region is explored using QAs.
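As a small worked illustration of Padé summation, the sketch below compares the lowest non-trivial approximant of ln(1 + z)/z with the Taylor polynomial built from the same three series coefficients (1, -1/2, 1/3). The [1/1] approximant (1 + z/6)/(1 + 2z/3) follows from matching the series through order z^2. The function names are hypothetical, and the example is not taken from the paper.

```python
import math


def target(z):
    """The function ln(1 + z) / z."""
    return math.log1p(z) / z


def taylor2(z):
    """Taylor polynomial through order z^2: 1 - z/2 + z^2/3."""
    return 1.0 - z / 2.0 + z * z / 3.0


def pade11(z):
    """[1/1] Pade approximant built from the same three coefficients."""
    return (1.0 + z / 6.0) / (1.0 + 2.0 * z / 3.0)


z = 1.0
err_taylor = abs(taylor2(z) - target(z))
err_pade = abs(pade11(z) - target(z))
print(err_taylor, err_pade)
```

At z = 1, the edge of the Taylor series' radius of convergence, the rational form is noticeably closer to ln 2 than the truncated series, even though both use the same input information.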
M2Di: MATLAB 2D Stokes solvers using the Finite Difference method
NASA Astrophysics Data System (ADS)
Räss, Ludovic; Duretz, Thibault; Schmalholz, Stefan; Podladchikov, Yury
2017-04-01
The study of coupled processes in Earth Sciences leads to the development of multiphysics modelling tools. Mechanical solvers represent the essential ingredient of any of these tools such that their performance and robustness are generally dictated by those of the mechanical solver. Here, we present M2Di, a collection of MATLAB routines designed for studying 2D linear and power law incompressible viscous flow using Finite Difference discretisation. The scripts are written in a concise vectorised MATLAB fashion and rely on fast and robust linear and non-linear solvers (Picard and Newton iterations). As a result, a time to solution of 22 seconds for linear viscous flow with a 10^4 viscosity jump on 1000^2 grid points can be achieved on a standard personal computer. We will present numerous example applications that span from high resolution crystal-melt dynamics, deformation of heterogeneous power law viscous fluids, and instantaneous mantle flow patterns in cylindrical coordinates, to the calculation of pressure gradients around inclusions using variable grid spacing. We use an analytical solution for linear viscous flow with highly variable viscosity to validate the linear flow solver. Validation of the non-linear solver is achieved by comparing the numerical solution to analytic and benchmark solutions of power law viscous folding and necking. The M2Di codes are open source and can hence be used for research or educational purposes.
A comparison of viscous-plastic sea ice solvers with and without replacement pressure
NASA Astrophysics Data System (ADS)
Kimmritz, Madlen; Losch, Martin; Danilov, Sergey
2017-07-01
Recent developments of the explicit elastic-viscous-plastic (EVP) solvers call for a new comparison with implicit solvers for the equations of viscous-plastic sea ice dynamics. In Arctic sea ice simulations, the modified and the adaptive EVP solvers, and the implicit Jacobian-free Newton-Krylov (JFNK) solver are compared against each other. The adaptive EVP method shows convergence rates that are generally similar or even better than those of the modified EVP method, but the convergence of the EVP methods is found to depend dramatically on the use of the replacement pressure (RP). Apparently, using the RP can affect the pseudo-elastic waves in the EVP methods by introducing extra non-physical oscillations so that, in the extreme case, convergence to the VP solution can be lost altogether. The JFNK solver also suffers from higher failure rates with RP implying that with RP the momentum equations are stiffer and more difficult to solve. For practical purposes, both EVP methods can be used efficiently with an unexpectedly low number of sub-cycling steps without compromising the solutions. The differences between the RP solutions and the NoRP solutions (when the RP is not being used) can be reduced with lower thresholds of viscous regularization at the cost of increasing stiffness of the equations, and hence the computational costs of solving them.
Fork Tensor-Product States: Efficient Multiorbital Real-Time DMFT Solver
NASA Astrophysics Data System (ADS)
Bauernfeind, Daniel; Zingl, Manuel; Triebl, Robert; Aichhorn, Markus; Evertz, Hans Gerd
2017-07-01
We present a tensor network especially suited for multi-orbital Anderson impurity models and as an impurity solver for multi-orbital dynamical mean-field theory (DMFT). The solver works directly on the real-frequency axis and yields high spectral resolution at all frequencies. We use a large number (O(100)) of bath sites and therefore achieve an accurate representation of the bath. The solver can treat full rotationally invariant interactions with reasonable numerical effort. We show the efficiency and accuracy of the method by a benchmark for the three-orbital test-bed material SrVO3. There we observe multiplet structures in the high-energy spectrum, which are almost impossible to resolve by other multi-orbital methods. The resulting structure of the Hubbard bands can be described as a broadened atomic spectrum with rescaled interaction parameters. Additional features emerge when U is increased. Finally, we show that our solver can be applied even to models with five orbitals. This impurity solver offers a new route to the calculation of precise real-frequency spectral functions of correlated materials.
Application of NASA General-Purpose Solver to Large-Scale Computations in Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, Willie R.; Storaasli, Olaf O.
2004-01-01
Of several iterative and direct equation solvers evaluated previously for computations in aeroacoustics, the most promising was the NASA-developed General-Purpose Solver (winner of NASA's 1999 software of the year award). This paper presents detailed, single-processor statistics of the performance of this solver, which has been tailored and optimized for large-scale aeroacoustic computations. The statistics, compiled using an SGI ORIGIN 2000 computer with 12 Gb available memory (RAM) and eight available processors, are the central processing unit time, RAM requirements, and solution error. The equation solver is capable of solving 10 thousand complex unknowns in as little as 0.01 sec using 0.02 Gb RAM, and 8.4 million complex unknowns in slightly less than 3 hours using all 12 Gb. This latter solution is the largest aeroacoustics problem solved to date with this technique. The study was unable to detect any noticeable error in the solution, since noise levels predicted from these solution vectors are in excellent agreement with the noise levels computed from the exact solution. The equation solver provides a means for obtaining numerical solutions to aeroacoustics problems in three dimensions.
Finite Element Interface to Linear Solvers (FEI) version 2.9 : users guide and reference manual.
Williams, Alan B.
2005-02-01
The Finite Element Interface to Linear Solvers (FEI) is a linear system assembly library. Sparse systems of linear equations arise in many computational engineering applications, and the solution of linear systems is often the most computationally intensive portion of the application. Depending on the complexity of problems addressed by the application, there may be no single solver package capable of solving all of the linear systems that arise. This motivates the need to switch an application from one solver library to another, depending on the problem being solved. The interfaces provided by various solver libraries for data assembly and problem solution differ greatly, making it difficult to switch an application code from one library to another. The amount of library-specific code in an application can be greatly reduced by having an abstraction layer that puts a 'common face' on various solver libraries. The FEI has seen significant use by finite element applications at Sandia National Laboratories and Lawrence Livermore National Laboratory. The original FEI offered several advantages over using linear algebra libraries directly, but also imposed significant limitations and disadvantages. A new set of interfaces has been added with the goal of removing the limitations of the original FEI while maintaining and extending its strengths.
NASA Astrophysics Data System (ADS)
Liu, Yang; Guo, Han; Michielssen, Eric
A butterfly-based fast direct integral equation solver for analyzing high-frequency scattering from two-dimensional objects is presented. The solver leverages a randomized butterfly scheme to compress blocks corresponding to near- and far-field interactions in the discretized forward and inverse electric field integral operators. The observed memory requirements and computational cost of the proposed solver scale as O(N log^2 N) and O(N^1.5 log N), respectively. The solver is applied to the analysis of scattering from electrically large objects spanning over ten thousand wavelengths and modeled in terms of five million unknowns.
NASA Astrophysics Data System (ADS)
Daude, F.; Galon, P.
2016-01-01
Computation of compressible two-phase flows with the unsteady compressible Baer-Nunziato model in conjunction with the moving grid approach is discussed in this paper. Both HLL- and HLLC-type Finite-Volume methods are presented and implemented in the context of Arbitrary Lagrangian-Eulerian formulation in a multidimensional framework. The construction of suitable numerical methods is linked to proper approximations of the non-conservative terms on moving grids. The HLL discretization follows global conservation properties such as free-stream preservation and uniform pressure and velocity profiles preservation on moving grids. The HLLC solver initially proposed by Tokareva and Toro [1] for the Baer-Nunziato model is based on an approximate solution of local Riemann problems containing all the characteristic fields present in the exact solution. Both "subsonic" and "supersonic" configurations are considered in the construction of the present HLLC solver. In addition, an adaptive 6-wave HLLC scheme is also proposed for computational efficiency. The methods are first assessed on a variety of 1-D Riemann problems including both fixed and moving grid applications. The methods are finally tested on 2-D and 3-D applications: 2-D Riemann problems, a 2-D shock-bubble interaction and finally a 3-D fluid-structure interaction problem, with good agreement with the experiments.
Gardner, David; Woodward, Carol S.; Evans, Katherine J
2015-01-01
Efficient solution of global climate models requires effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a time step dictated by accuracy of the processes of interest rather than by stability governed by the fastest of the time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, Newton's method is applied to these systems. Each iteration of Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but this Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, only the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference, which may cause a loss of accuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite-difference approximations of these matrix-vector products for climate dynamics within the spectral-element based shallow-water dynamical-core of the Community Atmosphere Model (CAM).
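The finite-difference matrix-vector product mentioned here is commonly written as J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below is a minimal, assumption-laden illustration: the two-unknown `residual` is a hypothetical toy problem, not the shallow-water residual from CAM, and the fixed step `eps` is a simplification (practical codes typically scale ε with the norms of u and v).

```python
import math


def residual(u):
    """Hypothetical nonlinear residual F(u) for a two-unknown toy system."""
    x, y = u
    return [x * x + y - 3.0, x + math.sin(y)]


def exact_jacobian_vector(u, v):
    """Analytic J(u) v for the toy residual, used only as a reference."""
    x, y = u
    return [2.0 * x * v[0] + v[1], v[0] + math.cos(y) * v[1]]


def fd_jacobian_vector(F, u, v, eps=1e-7):
    """Forward-difference approximation J(u) v ~ (F(u + eps*v) - F(u)) / eps."""
    Fu = F(u)
    Fp = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(Fp, Fu)]


u = [1.0, 0.5]
v = [0.3, -0.2]
approx = fd_jacobian_vector(residual, u, v)
exact = exact_jacobian_vector(u, v)
error = max(abs(a - e) for a, e in zip(approx, exact))
print(error)
```

The approximation costs one extra residual evaluation per product and never forms the Jacobian. Its error combines truncation (growing with ε) and round-off (growing as ε shrinks), which is the accuracy trade-off the abstract alludes to.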
Naff, Richard L.; Banta, Edward R.
2008-01-01
The preconditioned conjugate gradient with improved nonlinear control (PCGN) package provides additional means by which the solution of nonlinear ground-water flow problems can be controlled as compared to existing solver packages for MODFLOW. Picard iteration is used to solve nonlinear ground-water flow equations by iteratively solving a linear approximation of the nonlinear equations. The linear solution is provided by means of the preconditioned conjugate gradient algorithm where preconditioning is provided by the modified incomplete Cholesky algorithm. The incomplete Cholesky scheme incorporates two levels of fill, 0 and 1, in which the pivots can be modified so that the row sums of the preconditioning matrix and the original matrix are approximately equal. A relaxation factor is used to implement the modified pivots, which determines the degree of modification allowed. The effects of fill level and degree of pivot modification are briefly explored by means of a synthetic, heterogeneous finite-difference matrix; results are reported in the final section of this report. The preconditioned conjugate gradient method is coupled with Picard iteration so as to efficiently solve the nonlinear equations associated with many ground-water flow problems. The description of this coupling of the linear solver with Picard iteration is a primary concern of this document.
Bounded fractional diffusion in geological media: Definition and Lagrangian approximation
Zhang, Yong; Green, Christopher T.; LaBolle, Eric M.; Neupauer, Roseanna M.; Sun, HongGuang
2016-01-01
Spatiotemporal Fractional-Derivative Models (FDMs) have been increasingly used to simulate non-Fickian diffusion, but methods have not been available to define boundary conditions for FDMs in bounded domains. This study defines boundary conditions and then develops a Lagrangian solver to approximate bounded, one-dimensional fractional diffusion. Both the zero-value and non-zero-value Dirichlet, Neumann, and mixed Robin boundary conditions are defined, where the sign of the Riemann-Liouville fractional derivative (capturing non-zero-value spatial-nonlocal boundary conditions with directional super-diffusion) remains consistent with the sign of the fractional-diffusive flux term in the FDMs. New Lagrangian schemes are then proposed to track solute particles moving in bounded domains, where the solutions are checked against analytical or Eulerian solutions available for simplified FDMs. Numerical experiments show that the particle-tracking algorithm for non-Fickian diffusion differs from Fickian diffusion in relocating the particle position around the reflective boundary, likely due to the non-local and non-symmetric fractional diffusion. For a non-zero-value Neumann or Robin boundary, a source cell with a reflective face can be applied to define the release rate of random-walking particles at the specified flux boundary. Mathematical definitions of physically meaningful nonlocal boundaries combined with bounded Lagrangian solvers in this study may provide the only viable techniques at present to quantify the impact of boundaries on anomalous diffusion, expanding the applicability of FDMs from infinite domains to those with any size and boundary conditions.
Approximating Functions with Exponential Functions
ERIC Educational Resources Information Center
Gordon, Sheldon P.
2005-01-01
The possibility of approximating a function with a linear combination of exponential functions of the form e^x, e^(2x), ... is considered as a parallel development to the notion of Taylor polynomials which approximate a function with a linear combination of power function terms. The sinusoidal functions sin x and cos x…
Structural optimization with approximate sensitivities
NASA Technical Reports Server (NTRS)
Patnaik, S. N.; Hopkins, D. A.; Coroneos, R.
1994-01-01
Computational efficiency in structural optimization can be enhanced if the intensive computations associated with the calculation of the sensitivities, that is, gradients of the behavior constraints, are reduced. An approximation to the gradients of the behavior constraints that can be generated with a small amount of numerical calculation is proposed. Structural optimization with these approximate sensitivities produced the correct optimum solution. Approximate gradients performed well for different nonlinear programming methods, such as the sequence of unconstrained minimization technique, method of feasible directions, sequence of quadratic programming, and sequence of linear programming. Structural optimization with approximate gradients can reduce by one third the CPU time that would otherwise be required to solve the problem with explicit closed-form gradients. The proposed gradient approximation shows potential to reduce the intensive computation that has been associated with traditional structural optimization.
Approximate circuits for increased reliability
Hamlet, Jason R.; Mayo, Jackson R.
2015-12-22
Embodiments of the invention describe a Boolean circuit having a voter circuit and a plurality of approximate circuits each based, at least in part, on a reference circuit. The approximate circuits are each to generate one or more output signals based on values of received input signals. The voter circuit is to receive the one or more output signals generated by each of the approximate circuits, and is to output one or more signals corresponding to a majority value of the received signals. At least some of the approximate circuits are to generate an output value different than the reference circuit for one or more input signal values; however, for each possible input signal value, the majority values of the one or more output signals generated by the approximate circuits and received by the voter circuit correspond to output signal result values of the reference circuit.
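The voting property described above can be checked on a toy case. The following sketch is illustrative only: the three-input `reference` function and the choice of which truth-table entry each approximate circuit gets wrong are hypothetical, not taken from the patent. Each approximate circuit deviates from the reference on a different input, so at most one circuit is wrong for any given input and the majority still reproduces the reference.

```python
from itertools import product


def reference(a, b, c):
    """Hypothetical reference circuit: (a AND b) XOR c."""
    return (a & b) ^ c


def make_approximate(flipped_input):
    """Return a circuit that matches the reference except on one input."""
    def circuit(a, b, c):
        out = reference(a, b, c)
        return out ^ 1 if (a, b, c) == flipped_input else out
    return circuit


def majority(bits):
    """Voter: output 1 when more than half of the received bits are 1."""
    return 1 if sum(bits) * 2 > len(bits) else 0


# Three approximate circuits, each wrong on a distinct input combination.
approx_circuits = [make_approximate(x) for x in [(0, 0, 1), (1, 0, 0), (1, 1, 1)]]


def voted(a, b, c):
    """Majority vote over the outputs of the approximate circuits."""
    return majority([circ(a, b, c) for circ in approx_circuits])


all_inputs = list(product((0, 1), repeat=3))
matches_reference = all(voted(*x) == reference(*x) for x in all_inputs)
each_differs = all(
    any(circ(*x) != reference(*x) for x in all_inputs) for circ in approx_circuits
)
print(matches_reference, each_differs)
```

The design point is that the errors must not coincide: if two of the three circuits were wrong on the same input, the voter would output the wrong value there.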
Approximate circuits for increased reliability
Hamlet, Jason R.; Mayo, Jackson R.
2015-08-18
Embodiments of the invention describe a Boolean circuit having a voter circuit and a plurality of approximate circuits each based, at least in part, on a reference circuit. The approximate circuits are each to generate one or more output signals based on values of received input signals. The voter circuit is to receive the one or more output signals generated by each of the approximate circuits, and is to output one or more signals corresponding to a majority value of the received signals. At least some of the approximate circuits are to generate an output value different than the reference circuit for one or more input signal values; however, for each possible input signal value, the majority values of the one or more output signals generated by the approximate circuits and received by the voter circuit correspond to output signal result values of the reference circuit.
Gao, Hao; Phan, Lan; Lin, Yuting
2012-09-01
A graphics processing unit-based parallel multigrid solver for a radiative transfer equation with vacuum boundary condition or reflection boundary condition is presented for heterogeneous media with complex geometry based on two-dimensional triangular meshes or three-dimensional tetrahedral meshes. The computational complexity of this parallel solver is linearly proportional to the degrees of freedom in both angular and spatial variables, while the full multigrid method is utilized to minimize the number of iterations. The overall gain of speed is roughly 30 to 300 fold with respect to our prior multigrid solver, which depends on the underlying regime and the parallelization. The numerical validations are presented with the MATLAB codes at https://sites.google.com/site/rtefastsolver/.
Wu, Jue; Chung, Albert C S
2005-01-01
This paper introduces a novel solver, namely cross entropy (CE), into the MRF theory for medical image segmentation. The solver, which is based on the theory of rare event simulation, is general and stochastic. Unlike some popular optimization methods such as belief propagation and graph cuts, CE makes no assumption on the form of objective functions and thus can be applied to any type of MRF models. Furthermore, it achieves higher performance of finding more global optima because of its stochastic property. In addition, it is more efficient than other stochastic methods like simulated annealing. We tested the new solver in 4 series of segmentation experiments on synthetic and clinical, vascular and cerebral images. The experiments show that CE can give more accurate segmentation results.
Numerical Investigation of Vertical Plunging Jet Using a Hybrid Multifluid–VOF Multiphase CFD Solver
Shonibare, Olabanji Y.; Wardle, Kent E.
2015-06-28
A novel hybrid multiphase flow solver has been used to conduct simulations of a vertical plunging liquid jet. This solver combines a multifluid methodology with selective interface sharpening to enable simulation of both the initial jet impingement and the long-time entrained bubble plume phenomena. Models are implemented for variable bubble size capturing and dynamic switching of interface sharpened regions to capture transitions between the initially fully segregated flow types into the dispersed bubbly flow regime. It was found that the solver was able to capture the salient features of the flow phenomena under study and areas for quantitative improvement have been explored and identified. In particular, a population balance approach is employed and detailed calibration of the underlying models with experimental data is required to enable quantitative prediction of bubble size and distribution to capture the transition between segregated and dispersed flow types with greater fidelity.
A finite-volume Euler solver for computing rotary-wing aerodynamics on unstructured meshes
NASA Technical Reports Server (NTRS)
Strawn, Roger C.; Barth, Timothy J.
1992-01-01
An unstructured-grid solver for the unsteady Euler equations has been developed for predicting the aerodynamics of helicopter rotor blades. This flow solver is a finite-volume scheme that computes flow quantities at the vertices of the mesh. Special treatments are used for the flux differencing and boundary conditions in order to compute rotary-wing flowfields, and these are detailed in the paper. The unstructured-grid solver permits adaptive grid refinement in order to improve the resolution of flow features such as shocks, rotor wakes and acoustic waves. These capabilities are demonstrated in the paper. Example calculations are presented for two hovering rotors. In both cases, adaptive-grid refinement is used to resolve high gradients near the rotor surface and also to capture the vortical regions in the rotor wake. The computed results show good agreement with experimental results for surface airloads and wake geometry.
A fast parallel Poisson solver on irregular domains applied to beam dynamics simulations
Adelmann, A.; Arbenz, P.; Ineichen, Y.
2010-06-20
We discuss the scalable parallel solution of the Poisson equation within a Particle-In-Cell (PIC) code for the simulation of electron beams in particle accelerators of irregular shape. The problem is discretized by Finite Differences. Depending on the treatment of the Dirichlet boundary the resulting system of equations is symmetric or 'mildly' nonsymmetric positive definite. In all cases, the system is solved by the preconditioned conjugate gradient algorithm with smoothed aggregation (SA) based algebraic multigrid (AMG) preconditioning. We investigate variants of the implementation of SA-AMG that lead to considerable improvements in the execution times. We demonstrate good scalability of the solver on distributed memory parallel processor with up to 2048 processors. We also compare our iterative solver with an FFT-based solver that is more commonly used for applications in beam dynamics.
Phan, Lan; Lin, Yuting
2012-01-01
A graphics processing unit–based parallel multigrid solver for a radiative transfer equation with vacuum boundary condition or reflection boundary condition is presented for heterogeneous media with complex geometry based on two-dimensional triangular meshes or three-dimensional tetrahedral meshes. The computational complexity of this parallel solver is linearly proportional to the degrees of freedom in both angular and spatial variables, while the full multigrid method is utilized to minimize the number of iterations. The overall gain of speed is roughly 30 to 300 fold with respect to our prior multigrid solver, which depends on the underlying regime and the parallelization. The numerical validations are presented with the MATLAB codes at https://sites.google.com/site/rtefastsolver/. PMID:23085905
Plasma wave simulation based on versatile FEM solver on Alcator C-mod
Shiraiwa, S.; Meneghini, O.; Parker, R.; Wallace, G.; Wilson, J.
2009-11-26
The finite element method (FEM) has the potential of simulating plasma waves seamlessly from the core to the vacuum and antenna regions. We explored the possibility of using a versatile FEM solver package, COMSOL, for lower hybrid (LH) wave simulation. Special care was taken with boundary conditions to satisfy toroidal symmetry. The non-trivial issue of introducing hot plasma effects was addressed by an iterative algorithm. These techniques are verified both analytically and numerically. In the LH grill antenna coupling problem, the FEM solver successfully reproduced the solution that was obtained analytically. Propagation of LH waves on the Alcator C and Alcator C-MOD plasmas was compared with a ray-tracing code, showing good consistency. The approach based on the FEM is computationally less intensive compared to spectral domain solvers, and more suitable for the simulation of larger devices such as ITER.
A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids
NASA Technical Reports Server (NTRS)
Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)
2001-01-01
This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.
EUPDF: An Eulerian-Based Monte Carlo Probability Density Function (PDF) Solver. User's Manual
NASA Technical Reports Server (NTRS)
Raju, M. S.
1998-01-01
EUPDF is an Eulerian-based Monte Carlo PDF solver developed for application with sprays, combustion, parallel computing and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and spray solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type. The manual provides the user with the coding required to couple the PDF code to any given flow code and a basic understanding of the EUPDF code structure as well as the models involved in the PDF formulation. The source code of EUPDF will be available with the release of the National Combustion Code (NCC) as a complete package.
Parallel performance investigations of an unstructured mesh Navier-Stokes solver
NASA Technical Reports Server (NTRS)
Mavriplis, Dimitri J.
2000-01-01
A Reynolds-averaged Navier-Stokes solver based on unstructured mesh techniques for analysis of high-lift configurations is described. The method makes use of an agglomeration multigrid solver for convergence acceleration. Implicit line-smoothing is employed to relieve the stiffness associated with highly stretched meshes. A GMRES technique is also implemented to speed convergence at the expense of additional memory usage. The solver is cache efficient and fully vectorizable, and is parallelized using a two-level hybrid MPI-OpenMP implementation suitable for shared and/or distributed memory architectures, as well as clusters of shared memory machines. Convergence and scalability results are illustrated for various high-lift cases.
dugksFoam: An open source OpenFOAM solver for the Boltzmann model equation
NASA Astrophysics Data System (ADS)
Zhu, Lianhua; Chen, Songze; Guo, Zhaoli
2017-04-01
A deterministic Boltzmann model equation solver called dugksFoam has been developed in the framework of the open source CFD toolbox OpenFOAM. The solver adopts the discrete unified gas kinetic scheme (Guo et al., 2015) with the Shakhov collision model. It has been validated by simulating several test cases covering different flow regimes including the one-dimensional shock tube problem, a two-dimensional thermally induced flow and the three-dimensional lid-driven cavity flow. The solver features a parallel computing ability based on velocity space decomposition, which is different from the physical space decomposition based approach provided by the OpenFOAM framework. The two decomposition approaches have been compared in both two- and three-dimensional cases. The parallel performance improves significantly using the newly implemented approach. A speed-up of two orders of magnitude has been observed using 256 cores on a small cluster.
Flutter and Forced Response Analyses of Cascades using a Two-Dimensional Linearized Euler Solver
NASA Technical Reports Server (NTRS)
Reddy, T. S. R.; Srivastava, R.; Mehmed, O.
1999-01-01
Flutter and forced response analyses for a cascade of blades in subsonic and transonic flow are presented. The structural model for each blade is a typical section with bending and torsion degrees of freedom. The unsteady aerodynamic forces due to bending and torsion motions and due to a vortical gust disturbance are obtained by solving unsteady linearized Euler equations. The unsteady linearized equations are obtained by linearizing the unsteady nonlinear equations about the steady flow. The predicted unsteady aerodynamic forces include the effect of steady aerodynamic loading due to airfoil shape, thickness and angle of attack. The aeroelastic equations are solved in the frequency domain by coupling the unsteady aerodynamic forces to the aeroelastic solver MISER. The present unsteady aerodynamic solver showed good correlation with published results for both flutter and forced response predictions. Further improvements are required to use the unsteady aerodynamic solver in a design cycle.
Prediction of ship resistance in head waves using RANS based solver
NASA Astrophysics Data System (ADS)
Islam, Hafizul; Akimoto, Hiromichi
2016-07-01
Maneuverability prediction of ships using CFD has gained high popularity over the years because of its improving accuracy and economics. This paper discusses the estimation of calm water and added resistance properties of a KVLCC2 model using a light and economical RaNS based solver, called SHIP_Motion. The solver operates on overset structured meshes using the finite volume method. In the calm water test, total drag coefficient, sinkage and trim values were predicted together with mesh dependency analysis and compared with experimental data. For added resistance in head sea, short wave cases were simulated and compared with experimental and other simulation data. Overall the results were well predicted and showed good agreement with comparative data. The paper concludes that it is possible to predict ship maneuverability characteristics using the present solver with reasonable accuracy, using minimal computational resources and within acceptable time.
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver
Zhang, Bo; Lu, Benzhuo; Cheng, Xiaolin; ...
2013-01-01
This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numerical results.
A Krylov-Schwarz iterative solver for the shallow water equations
NASA Astrophysics Data System (ADS)
Goossens, Serge; Tan, Kian; Roose, Dirk
In the DELFT3D-FLOW software time integration is done by an ADI method, in which the ordering of explicit and implicit steps at every time step leads to a system of equations for the water elevation. Until recently this system was solved by an ADI iteration process, which does not converge very well for large time steps and small mesh widths. We implemented a robust solver by using a Krylov subspace method with the ADI method acting as a preconditioner. This solver is used as the subdomain solver in a domain decomposition method, which is also accelerated by a Krylov subspace method. In this case certain vectors from the subspace, constructed during the solution process, can be reused in the solution of the subsequent linear systems and this makes the method even more efficient. The adopted domain decomposition method is an additive preconditioner so it is inherently parallel.
Parallelization of the preconditioned IDR solver for modern multicore computer systems
NASA Astrophysics Data System (ADS)
Bessonov, O. A.; Fedoseyev, A. I.
2012-10-01
This paper presents the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK for modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs the iterative IDR algorithm with ILU preconditioning (user chosen ILU preconditioning order). CNSPACK has been successfully used during the last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, there was a dramatic change in processor architectures and computer system organization in recent years. Due to this, performance criteria and methods have been revisited, together with the parallelization of the solver and preconditioner using the OpenMP environment. Results of the successful implementation for efficient parallelization are presented for the most advanced computer systems (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).
Nonlinear vector eigen-solver and parallel reassembly processing for structural nonlinear vibration
NASA Astrophysics Data System (ADS)
Xue, D. Y.; Mei, Chuh
1993-12-01
In the frequency domain solution of large amplitude nonlinear vibration, two operations are computationally costly. They are: (1) the iterative eigen-solution and (2) the iterative nonlinear matrix reassembly. This study introduces a nonlinear eigen-solver which greatly speeds up the solution procedure by using a combination of vector iteration and nonlinear matrix updating. A feature of this new method is that it avoids repeatedly using a costly eigen-solver or equation solver. This solution procedure has also been engaged in parallel processing to further speed up the computation. Parallel nonlinear matrix reassembly is the main interest in this parallel processing. Force Macro is used in the parallel program on a CRAY-2S supercomputer.
Wavelet-based Poisson Solver for use in Particle-In-Cell Simulations
Terzic, B.; Mihalcea, D.; Bohn, C.L.; Pogorelov, I.V.
2005-05-13
We report on a successful implementation of a wavelet based Poisson solver for use in 3D particle-in-cell (PIC) simulations. One new aspect of our algorithm is its ability to treat the general (inhomogeneous) Dirichlet boundary conditions (BCs). The solver harnesses advantages afforded by the wavelet formulation, such as sparsity of operators and data sets, existence of effective preconditioners, and the ability simultaneously to remove numerical noise and further compress relevant data sets. Having tested our method as a stand-alone solver on two model problems, we merged it into IMPACT-T to obtain a fully functional serial PIC code. We present and discuss preliminary results of application of the new code to the modeling of the Fermilab/NICADD and AES/JLab photoinjectors.
Edcsmoke: A new combustion solver for stiff chemistry based on OpenFOAM
NASA Astrophysics Data System (ADS)
Li, Zhiyi; Malik, Mohammad Rafi; Cuoci, Alberto; Parente, Alessandro
2017-07-01
In the present work, two new OpenFOAM solvers for combustion problems requiring detailed kinetic mechanisms are presented. The Eddy Dissipation Concept (EDC) [1] is used for turbulence-chemistry interactions and for the integration of detailed chemistry. The solvers, called 'edcSimpleSMOKE' for steady state problems and 'edcPimpleSMOKE' for unsteady ones, were developed for a robust handling of large and detailed chemical mechanisms in the context of RANS simulations. The solvers were validated using high-fidelity experimental data from several Sandia flames and a Jet in Hot Co-flow burner. In general, good agreement is observed between the simulations and the experimental results, for both temperature and species mass fraction profiles. In addition, different formulations of the EDC model are tested and the results are compared.
NASA Technical Reports Server (NTRS)
Hartung, Lin C.
1991-01-01
A method for predicting radiation absorption and emission coefficients in thermochemical nonequilibrium flows is developed. The method is called the Langley optimized radiative nonequilibrium code (LORAN). It applies the smeared band approximation for molecular radiation to produce moderately detailed results and is intended to fill the gap between detailed but costly prediction methods and very fast but highly approximate methods. The optimization of the method to provide efficient solutions allowing coupling to flowfield solvers is discussed. Representative results are obtained and compared to previous nonequilibrium radiation methods, as well as to ground- and flight-measured data. Reasonable agreement is found in all cases. A multidimensional radiative transport method is also developed for axisymmetric flows. Its predictions for wall radiative flux are 20 to 25 percent lower than those of the tangent slab transport method, as expected, though additional investigation of the symmetry and outflow boundary conditions is indicated. The method was applied to the peak heating condition of the aeroassist flight experiment (AFE) trajectory, with results comparable to predictions from other methods. The LORAN method was also applied in conjunction with the computational fluid dynamics (CFD) code LAURA to study the sensitivity of the radiative heating prediction to various models used in nonequilibrium CFD. This study suggests that radiation measurements can provide diagnostic information about the detailed processes occurring in a nonequilibrium flowfield because radiation phenomena are very sensitive to these processes.
NASA Astrophysics Data System (ADS)
Iungo, Giacomo Valerio; Camarri, Simone; Ciri, Umberto; El-Asha, Said; Leonardi, Stefano; Rotea, Mario A.; Santhanagopalan, Vignesh; Viola, Francesco; Zhan, Lu
2016-11-01
Site conditions, such as topography and local climate, as well as wind farm layout strongly affect performance of a wind power plant. Therefore, predictions of wake interactions and their effects on power production still remain a great challenge in wind energy. For this study, an onshore wind turbine array was monitored through lidar measurements, SCADA and met-tower data. Power losses due to wake interactions were estimated to be approximately 4% and 2% of the total power production under stable and convective conditions, respectively. This dataset was then leveraged for the calibration of a data driven RANS (DDRANS) solver, which is a compelling tool for prediction of wind turbine wakes and power production. DDRANS is characterized by a computational cost as low as that for engineering wake models, and adequate accuracy achieved through data-driven tuning of the turbulence closure model. DDRANS is based on a parabolic formulation, axisymmetry and boundary layer approximations, which allow achieving low computational costs. The turbulence closure model consists in a mixing length model, which is optimally calibrated with the experimental dataset. Assessment of DDRANS is then performed through lidar and SCADA data for different atmospheric conditions. This material is based upon work supported by the National Science Foundation under the I/UCRC WindSTAR, NSF Award IIP 1362033.
NASA Astrophysics Data System (ADS)
Hintermüller, M.; Hinze, M.; Kahle, C.
2013-02-01
An adaptive a posteriori error estimator based finite element method for the numerical solution of a coupled Cahn-Hilliard/Navier-Stokes system with a double-obstacle homogeneous free (interfacial) energy density is proposed. A semi-implicit Euler scheme for the time-integration is applied which results in a system coupling a quasi-Stokes or Oseen-type problem for the fluid flow to a variational inequality for the concentration and the chemical potential according to the Cahn-Hilliard model [16]. A Moreau-Yosida regularization is employed which relaxes the constraints contained in the variational inequality and, thus, enables semi-smooth Newton solvers with locally superlinear convergence in function space. Moreover, upon discretization this yields a mesh independent method for a fixed relaxation parameter. For the finite dimensional approximation of the concentration and the chemical potential piecewise linear and globally continuous finite elements are used, and for the numerical approximation of the fluid velocity Taylor-Hood finite elements are employed. The paper ends with a report on numerical examples showing the efficiency of the new method.
The role and status of Euler solvers in impulsive rotor noise computations
NASA Technical Reports Server (NTRS)
Baeder, James D.
1995-01-01
Several recent applications (in the last five years) of Euler solvers in the computation of impulsive noise from rotor blades emphasize their emerging role in complementing other methods and experimental work. In the area of high-speed impulsive noise the use of Euler solvers as research tools has become fairly mature with very favorable comparisons with experimental data, especially in hover. The grid sizes and resulting computational times are reasonable when compared to those required for accurate surface aerodynamics alone. Furthermore, Euler solvers have provided a rich database with the resolution and accuracy needed for input to Kirchhoff and acoustic analogy methods for predicting the far-field noise. On the other hand, the application of Euler solvers to calculate blade-vortex interaction noise is still far from mature. The computational resources required for accurate calculations away from the blade are much larger than for high-speed impulsive noise. Current calculations help improve the basic understanding of the phenomena involved, but to date no comparisons with experiment have been made. Fortunately, the use of coupled Euler solver/Kirchhoff methods seems to offer promise for a robust and efficient technique for predicting both high-speed impulsive noise and blade-vortex interaction noise. Finally, a simple model problem of an isolated vortex interacting with an arbitrarily prescribed pitching airfoil demonstrates the feasibility of using Euler solvers to examine noise reduction techniques. The use of simple aerodynamic quasi-static theory and the computed lift time history as feedback to determine the required pitching motion appears sufficient to significantly dampen the unsteady loading and subsequent acoustics by an order of magnitude within a few blade passages.
Convergence Acceleration of a Navier-Stokes Solver for Efficient Static Aeroelastic Computations
NASA Technical Reports Server (NTRS)
Obayashi, Shigeru; Guruswamy, Guru P.
1995-01-01
New capabilities have been developed for a Navier-Stokes solver to perform steady-state simulations more efficiently. The flow solver for solving the Navier-Stokes equations is based on a combination of the lower-upper factored symmetric Gauss-Seidel implicit method and the modified Harten-Lax-van Leer-Einfeldt upwind scheme. A numerically stable and efficient pseudo-time-marching method is also developed for computing steady flows over flexible wings. Results are demonstrated for transonic flows over rigid and flexible wings.
Adaptively truncated Hilbert space based impurity solver for dynamical mean-field theory
NASA Astrophysics Data System (ADS)
Go, Ara; Millis, Andrew J.
2017-08-01
We present an impurity solver based on adaptively truncated Hilbert spaces. The solver is particularly suitable for dynamical mean-field theory in circumstances where quantum Monte Carlo approaches are ineffective. It exploits the sparsity structure of quantum impurity models, in which the interactions couple only a small subset of the degrees of freedom. We further introduce an adaptive truncation of the particle or hole excited spaces, which enables computations of Green functions with an accuracy needed to avoid unphysical (sign change of imaginary part) self-energies. The method is benchmarked on the one-dimensional Hubbard model.
Nearly Interactive Parabolized Navier-Stokes Solver for High Speed Forebody and Inlet Flows
NASA Technical Reports Server (NTRS)
Benson, Thomas J.; Liou, May-Fun; Jones, William H.; Trefny, Charles J.
2009-01-01
A system of computer programs is being developed for the preliminary design of high speed inlets and forebodies. The system comprises four functions: geometry definition, flow grid generation, flow solver, and graphics post-processor. The system runs on a dedicated personal computer using the Windows operating system and is controlled by graphical user interfaces written in MATLAB (The Mathworks, Inc.). The flow solver uses the Parabolized Navier-Stokes equations to compute millions of mesh points in several minutes. Sample two-dimensional and three-dimensional calculations are demonstrated in the paper.
A weakly compressible SPH method based on a low-dissipation Riemann solver
NASA Astrophysics Data System (ADS)
Zhang, C.; Hu, X. Y.; Adams, N. A.
2017-04-01
We present a low-dissipation weakly-compressible SPH method for modeling free-surface flows exhibiting violent events such as impact and breaking. The key idea is to modify a Riemann solver which determines the interaction between particles by a simple limiter to decrease the intrinsic numerical dissipation. The modified Riemann solver is also extended for imposing wall boundary conditions. Numerical tests show that the method resolves free-surface flows accurately and produces smooth, accurate pressure fields. The method is compatible with the hydrostatic solution and exhibits considerably less numerical damping of the mechanical energy than previous methods.
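The key idea in the abstract above, limiting the dissipative part of a Riemann solution at a particle-pair interface, can be illustrated with a minimal sketch. It assumes a linearised (acoustic) Riemann solver along the line joining two particles. The function name, the limiter form `beta = min(eta * max(u_l - u_r, 0), c0)` and the constant `eta = 3` are assumptions made for illustration and are not taken from the abstract.

```python
def riemann_low_dissipation(rho_l, u_l, p_l, rho_r, u_r, p_r, c0, eta=3.0):
    """Interface velocity and pressure from a linearised Riemann solver.

    Hypothetical sketch: the limiter `beta` replaces the sound speed c0 in the
    dissipative pressure term. With beta == c0 the unlimited linearised solver
    is recovered; with beta == 0 the interface pressure is a plain average.
    """
    rho_bar = 0.5 * (rho_l + rho_r)
    # Interface velocity: average velocity plus a pressure-difference correction.
    u_star = 0.5 * (u_l + u_r) + 0.5 * (p_l - p_r) / (rho_bar * c0)
    # Limiter (assumed form): no dissipation when the two states move apart,
    # reduced dissipation for weak compression, full dissipation when strong.
    beta = min(eta * max(u_l - u_r, 0.0), c0)
    p_star = 0.5 * (p_l + p_r) + 0.5 * beta * rho_bar * (u_l - u_r)
    return u_star, p_star


# Expansion (states separating): the dissipative term is switched off.
u_exp, p_exp = riemann_low_dissipation(1000.0, -1.0, 0.0, 1000.0, 1.0, 0.0, c0=10.0)
# Weak compression: dissipation is scaled down relative to the unlimited solver.
u_weak, p_weak = riemann_low_dissipation(1000.0, 0.5, 0.0, 1000.0, -0.5, 0.0, c0=10.0)
```

In this sketch the weak-compression interface pressure is 1500.0, whereas the unlimited solver (`beta = c0`) would give 5000.0 for the same states. That difference is the sense in which the limiter reduces numerical dissipation.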
A second-order Grad-Shafranov solver with accurate derivative computation
NASA Astrophysics Data System (ADS)
Eshghi, Iraj; Ricketson, Lee; Cerfon, Antoine
2016-10-01
We present progress on a fast Grad-Shafranov and Poisson solver that uses the finite element method with linear elements to find equilibria of the electro-magnetic potentials inside tokamaks. The code converges with second-order errors, and we introduce a module which can take derivatives of the potential at no increase in error. Thus, this code can be much faster than most higher-order finite element solvers, while still retaining a sufficiently small error margin in the physically relevant quantities.
Parallel Schwarz domain decomposition solvers with applications in elasticity and poroelasticity
NASA Astrophysics Data System (ADS)
Blaheta, Radim; Starý, Jiří; Jakl, Ondřej
2017-07-01
The paper addresses the construction of parallel iterative solvers for problems of elasticity and poroelasticity. It is shown that such solvers can be built on the basis of conjugate gradient (CG) or another Krylov space iterative method with preconditioning by one- or two-level additive Schwarz methods. The special points of interest are efficient implementation of the two-level Schwarz method on supercomputers and new application of the Schwarz method in three-field poroelasticity formulated in displacements, fluid velocities and pressures.
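As an illustration of the approach described above, the following is a minimal sketch of conjugate gradients preconditioned by a one-level additive Schwarz method. It uses a 1D Poisson matrix `tridiag(-1, 2, -1)` as a stand-in for the elasticity and poroelasticity systems of the paper, runs serially, and omits the coarse level of the two-level method. All function names, the subdomain layout and the overlap width are assumptions chosen for the example.

```python
def apply_A(u):
    """Matrix-free product with the 1D Poisson matrix tridiag(-1, 2, -1)."""
    n = len(u)
    return [2.0 * u[i]
            - (u[i - 1] if i > 0 else 0.0)
            - (u[i + 1] if i < n - 1 else 0.0) for i in range(n)]


def solve_local(r):
    """Thomas algorithm for a local tridiag(-1, 2, -1) block."""
    m = len(r)
    c, d = [0.0] * m, [0.0] * m
    c[0], d[0] = -0.5, r[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (r[i] + d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def make_subdomains(n, parts, overlap):
    """Overlapping index ranges [lo, hi) covering 0..n-1."""
    size = n // parts
    return [(max(0, p * size - overlap),
             n if p == parts - 1 else min(n, (p + 1) * size + overlap))
            for p in range(parts)]


def additive_schwarz(r, subs):
    """z = sum_i R_i^T A_i^{-1} R_i r; each local solve is independent."""
    z = [0.0] * len(r)
    for lo, hi in subs:
        for k, v in enumerate(solve_local(r[lo:hi])):
            z[lo + k] += v
    return z


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(b, precond, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for A x = b with A from apply_A."""
    x, r = [0.0] * len(b), list(b)
    z = precond(r)
    p, rz = list(z), dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        Ap = apply_A(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 64
subs = make_subdomains(n, parts=4, overlap=2)
x, iters = pcg([1.0] * n, lambda r: additive_schwarz(r, subs))
```

Because the local solves inside `additive_schwarz` do not depend on one another, each could be assigned to its own process in a parallel implementation. The sum over subdomains is then the only step that needs communication.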
Fast methods incorporating direct elliptic solvers for nonlinear applications in fluid dynamics
NASA Technical Reports Server (NTRS)
Martin, E. D.
1977-01-01
Semidirect methods are discussed, their present role, as well as some developments for their application in computational fluid dynamics. A semidirect method is a computational scheme that uses a fast, direct, elliptic solver as the driving algorithm for the iterative solution of finite difference equations. Specific subtopics include: (1) direct Cauchy Riemann solvers for first order elliptic equations; (2) application of the semidirect method to the mixed elliptic hyperbolic problem of steady, inviscid transonic flow; and (3) the treatment of interior conditions, such as those on an airfoil or wing, in semidirect methods.
Trust-region based solver for nonlinear transport in heterogeneous porous media
NASA Astrophysics Data System (ADS)
Wang, Xiaochen; Tchelepi, Hamdi A.
2013-11-01
We describe a new nonlinear solver for immiscible two-phase transport in porous media, where viscous, buoyancy, and capillary forces are significant. The flux (fractional flow) function, F, is a nonlinear function of saturation and typically has inflection points and can be non-monotonic. The non-convexity and non-monotonicity of F are major sources of difficulty for nonlinear solvers of coupled multiphase flow and transport in natural porous media. We describe a modified Newton algorithm that employs trust regions of the flux function to guide the Newton iterations. The flux function is divided into saturation trust regions delineated by the inflection, unit-flux, and end points. The updates are performed such that two successive iterations cannot cross any trust-region boundary. If a crossing is detected, the saturation value is chopped back to the appropriate trust-region boundary. The proposed trust-region Newton solver, which is demonstrated across the parameter space of viscous, buoyancy and capillary effects, is a significant extension of the inflection-point strategy of Jenny et al. (JCP, 2009) [5] for viscous dominated flows. We analyze the discrete nonlinear transport equation obtained using finite-volume discretization with phase-based upstream weighting. Then, we prove convergence of the trust-region Newton method irrespective of the timestep size for single-cell problems. Numerical results across the full range of the parameter space of viscous, gravity and capillary forces indicate that our trust-region scheme is unconditionally convergent for 1D transport. That is, for a given choice of timestep size, the unique discrete solution is found independently of the initial guess. For problems dominated by buoyancy and capillarity, the trust-region Newton solver overcomes the often severe limits on timestep size associated with existing methods. To validate the effectiveness of the new nonlinear solver for large reservoir models with strong heterogeneity
An accurate predictor-corrector HOC solver for the two dimensional Riemann problem of gas dynamics
NASA Astrophysics Data System (ADS)
Gogoi, Bidyut B.
2016-10-01
The work in the present manuscript is concerned with the simulation of the two-dimensional (2D) Riemann problem of gas dynamics. We extend our recently developed higher order compact (HOC) method from a one-dimensional (1D) to a 2D solver and simulate the problem on a square geometry with different initial conditions. The method is fourth order accurate in space and second order accurate in time. We then compare our results with the available benchmark results. The comparison shows an excellent agreement of our results with the existing ones in the literature. Being a finite difference solver, it is quite straightforward and simple.
Approximating subtree distances between phylogenies.
Bonet, Maria Luisa; St John, Katherine; Mahindru, Ruchi; Amenta, Nina
2006-10-01
We give a 5-approximation algorithm to the rooted Subtree-Prune-and-Regraft (rSPR) distance between two phylogenies, which was recently shown to be NP-complete. This paper presents the first approximation result for this important tree distance. The algorithm follows a standard format for tree distances. The novel ideas are in the analysis. In the analysis, the cost of the algorithm uses a "cascading" scheme that accounts for possible wrong moves. This accounting is missing from previous analysis of tree distance approximation algorithms. Further, we show how all algorithms of this type can be implemented in linear time and give experimental results.
Shu, Yu-Chen; Chern, I-Liang; Chang, Chien C.
2014-10-15
Most elliptic interface solvers become complicated for complex interface problems at those “exceptional points” where there are not enough neighboring interior points for high order interpolation. Such complication increases especially in three dimensions. Usually, the solvers are thus reduced to low order accuracy. In this paper, we classify these exceptional points and propose two recipes to maintain order of accuracy there, aiming at improving the previous coupling interface method [26]. Yet the idea is also applicable to other interface solvers. The main idea is to have at least first order approximations for second order derivatives at those exceptional points. Recipe 1 is to use the finite difference approximation for the second order derivatives at a nearby interior grid point, whenever this is possible. Recipe 2 is to flip domain signatures and introduce a ghost state so that a second-order method can be applied. This ghost state is a smooth extension of the solution at the exceptional point from the other side of the interface. The original state is recovered by a post-processing using nearby states and jump conditions. The choice of recipes is determined by a classification scheme of the exceptional points. The method renders the solution and its gradient uniformly second-order accurate in the entire computed domain. Numerical examples are provided to illustrate the second order accuracy of the presently proposed method in approximating the gradients of the original states for some complex interfaces which we had tested previously in two and three dimensions, and a real molecule (1D63) which has a double-helix shape and is composed of hundreds of atoms.
Rytov approximation in electron scattering
NASA Astrophysics Data System (ADS)
Krehl, Jonas; Lubk, Axel
2017-06-01
In this work we introduce the Rytov approximation in the scope of high-energy electron scattering with the motivation of developing better linear models for electron scattering. Such linear models play an important role in tomography and similar reconstruction techniques. Conventional linear models, such as the phase grating approximation, have reached their limits in current and foreseeable applications, most importantly in achieving three-dimensional atomic resolution using electron holographic tomography. The Rytov approximation incorporates propagation effects which are the most pressing limitation of conventional models. While predominately used in the weak-scattering regime of light microscopy, we show that the Rytov approximation can give reasonable results in the inherently strong-scattering regime of transmission electron microscopy.
Dual approximations in optimal control
NASA Technical Reports Server (NTRS)
Hager, W. W.; Ianculescu, G. D.
1984-01-01
A dual approximation for the solution to an optimal control problem is analyzed. The differential equation is handled with a Lagrange multiplier while other constraints are treated explicitly. An algorithm for solving the dual problem is presented.
Exponential approximations in optimal design
NASA Technical Reports Server (NTRS)
Belegundu, A. D.; Rajan, S. D.; Rajgopal, J.
1990-01-01
One-point and two-point exponential functions have been developed and proved to be very effective approximations of structural response. The exponential has been compared to the linear, reciprocal and quadratic fit methods. Four test problems in structural analysis have been selected. The use of such approximations is attractive in structural optimization to reduce the numbers of exact analyses which involve computationally expensive finite element analysis.
Mathematical algorithms for approximate reasoning
NASA Technical Reports Server (NTRS)
Murphy, John H.; Chay, Seung C.; Downs, Mary M.
1988-01-01
Most state of the art expert system environments contain a single and often ad hoc strategy for approximate reasoning. Some environments provide facilities to program the approximate reasoning algorithms. However, the next generation of expert systems should have an environment which contains a choice of several mathematical algorithms for approximate reasoning. To meet the need for validatable and verifiable coding, the expert system environment must no longer depend upon ad hoc reasoning techniques but instead must include mathematically rigorous techniques for approximate reasoning. Popular approximate reasoning techniques are reviewed, including: certainty factors, belief measures, Bayesian probabilities, fuzzy logic, and Shafer-Dempster techniques for reasoning. The focus is on a group of mathematically rigorous algorithms for approximate reasoning that could form the basis of a next generation expert system environment. These algorithms are based upon the axioms of set theory and probability theory. To separate these algorithms for approximate reasoning various conditions of mutual exclusivity and independence are imposed upon the assertions. Approximate reasoning algorithms presented include: reasoning with statistically independent assertions, reasoning with mutually exclusive assertions, reasoning with assertions that exhibit minimum overlay within the state space, reasoning with assertions that exhibit maximum overlay within the state space (i.e. fuzzy logic), pessimistic reasoning (i.e. worst case analysis), optimistic reasoning (i.e. best case analysis), and reasoning with assertions with absolutely no knowledge of the possible dependency among the assertions. A robust environment for expert system construction should include the two modes of inference: modus ponens and modus tollens. Modus ponens inference is based upon reasoning towards the conclusion in a statement of logical implication, whereas modus tollens inference is based upon reasoning away
Approximation techniques for neuromimetic calculus.
Vigneron, V; Barret, C
1999-06-01
Approximation Theory plays a central part in modern statistical methods, in particular in Neural Network modeling. These models are able to approximate a large amount of metric data structures in their entire range of definition or at least piecewise. We survey most of the known results for networks of neurone-like units. The connections to classical statistical ideas such as ordinary least squares (LS) are emphasized.
Nonadiabatic charged spherical evolution in the postquasistatic approximation
Rosales, L.; Barreto, W.; Peralta, C.; Rodriguez-Mueller, B.
2010-10-15
We apply the postquasistatic approximation, an iterative method for the evolution of self-gravitating spheres of matter, to study the evolution of dissipative and electrically charged distributions in general relativity. The numerical implementation of our approach leads to a solver which is globally second-order convergent. We evolve nonadiabatic distributions assuming an equation of state that accounts for the anisotropy induced by the electric charge. Dissipation is described by streaming-out or diffusion approximations. We match the interior solution, in noncomoving coordinates, with the Vaidya-Reissner-Nordstroem exterior solution. Two models are considered: (i) a Schwarzschild-like shell in the diffusion limit; and (ii) a Schwarzschild-like interior in the free-streaming limit. These toy models tell us something about the nature of the dissipative and electrically charged collapse. Diffusion stabilizes the gravitational collapse producing a spherical shell whose contraction is halted in a short characteristic hydrodynamic time. The streaming-out radiation provides a more efficient mechanism for emission of energy, redistributing the electric charge on the whole sphere, while the distribution collapses indefinitely with a longer hydrodynamic time scale.
Fast Edge-Aware Processing via First Order Proximal Approximation.
Badri, Hicham; Yahia, Hussein; Aboutajdine, Driss
2015-06-01
We present a new framework for fast edge-aware processing of images and videos. The proposed smoothing method is based on an optimization formulation with a non-convex sparse regularization for a better smoothing behavior near strong edges. We develop mathematical tools based on first order approximation of proximal operators to accelerate the proposed method while maintaining high-quality smoothing. The first order approximation is used to estimate a solution of the proximal form in a half-quadratic solver, and also to derive a warm-start solution that can be calculated quickly when the image is loaded by the user. We extend the method to large-scale processing by estimating the smoothing operation with independent 1D convolution operations. This approach linearly scales to the size of the image and can fully take advantage of parallel processing. The method supports full color filtering and turns out to be temporally coherent for fast video processing. We demonstrate the performance of the proposed method on various applications including image smoothing, detail manipulation, HDR tone-mapping, fast edge simplification and video edge-aware processing.
A multiscale two-point flux-approximation method
Møyner, Olav; Lie, Knut-Andreas
2014-10-15
A large number of multiscale finite-volume methods have been developed over the past decade to compute conservative approximations to multiphase flow problems in heterogeneous porous media. In particular, several iterative and algebraic multiscale frameworks that seek to reduce the fine-scale residual towards machine precision have been presented. Common to all such methods is that they rely on a compatible primal–dual coarse partition, which makes it challenging to extend them to stratigraphic and unstructured grids. Herein, we propose a general idea for how one can formulate multiscale finite-volume methods using only a primal coarse partition. To this end, we use two key ingredients that are computed numerically: (i) elementary functions that correspond to flow solutions used in transmissibility upscaling, and (ii) partition-of-unity functions used to combine elementary functions into basis functions. We exemplify the idea by deriving a multiscale two-point flux-approximation (MsTPFA) method, which is robust with regard to strong heterogeneities in the permeability field and can easily handle general grids with unstructured fine- and coarse-scale connections. The method can easily be adapted to arbitrary levels of coarsening, and can be used both as a standalone solver and as a preconditioner. Several numerical experiments are presented to demonstrate that the MsTPFA method can be used to solve elliptic pressure problems on a wide variety of geological models in a robust and efficient manner.
Kastanya, Doddy Yozef Febrian; Turinsky, Paul J.
2005-05-15
A Newton-Krylov iterative solver has been developed to reduce the CPU execution time of boiling water reactor (BWR) core simulators implemented in the core simulator part of the Fuel Optimization for Reloads Multiple Objectives by Simulated Annealing for BWR (FORMOSA-B) code, which is an in-core fuel management optimization code for BWRs. This new solver utilizes Newton's method to explicitly treat strong nonlinearities in the problem, replacing the traditionally used nested iterative approach. Newton's method provides the solver with a higher-than-linear convergence rate, assuming that good initial estimates of the unknowns are provided. Within each Newton iteration, an appropriately preconditioned Krylov solver is utilized for solving the linearized system of equations. Taking advantage of the higher convergence rate provided by Newton's method and utilizing an efficient preconditioned Krylov solver, we have developed a Newton-Krylov solver to evaluate the three-dimensional, two-group neutron diffusion equations coupled with a two-phase flow model within a BWR core simulator. Numerical tests on the new solver have shown that speedups ranging from 1.6 to 2.1, with reference to the traditional approach of employing nested iterations to treat the nonlinear feedbacks, can be achieved. However, if a preconditioned Krylov solver is employed to complete the inner iterations of the traditional approach, negligible CPU time differences are noted between the Newton-Krylov and traditional (Krylov) approaches.
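The outer-Newton, inner-Krylov structure described in this abstract can be illustrated on a much smaller problem. The sketch below is not the FORMOSA-B solver: it assumes a 1D model equation, -u'' + u^3 = f with zero boundary values, chosen only because its Jacobian is symmetric positive definite, so that plain conjugate gradients (a Krylov method) can serve as the inner solver. The abstract's solver uses a preconditioned Krylov method; preconditioning is omitted here for brevity. All function names and parameter values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residual(u, f, h):
    """Nonlinear residual F(u) of the assumed model problem -u'' + u^3 = f."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / h ** 2 + u[i] ** 3 - f[i])
    return out

def jac_vec(u, v, h):
    """Jacobian-vector product J(u) v; J is the discrete Laplacian plus diag(3 u^2)."""
    n = len(u)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * v[i] - left - right) / h ** 2 + 3.0 * u[i] ** 2 * v[i])
    return out

def cg(apply_a, b, tol=1e-12, max_iter=500):
    """Unpreconditioned conjugate gradients for a symmetric positive definite operator."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(b)
    rr = dot(r, r)
    if rr == 0.0:
        return x
    b_norm = rr ** 0.5
    for _ in range(max_iter):
        ap = apply_a(p)
        alpha = rr / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        if rr_new ** 0.5 <= tol * b_norm:
            break
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x

def newton_krylov(f, h, tol=1e-8, max_newton=20):
    """Outer Newton loop; each linearized system J du = -F is solved by CG."""
    u = [0.0] * len(f)
    history = []
    for _ in range(max_newton):
        res = residual(u, f, h)
        norm = dot(res, res) ** 0.5
        history.append(norm)
        if norm <= tol:
            break
        du = cg(lambda v: jac_vec(u, v, h), [-ri for ri in res])
        u = [ui + di for ui, di in zip(u, du)]
    return u, history

n = 20
h = 1.0 / (n + 1)
f = [10.0] * n
u_sol, history = newton_krylov(f, h)
print("nonlinear residual norm per Newton step:", history)
```

The printed residual history is where the higher-than-linear convergence mentioned in the abstract would show up, provided the initial guess is good enough for this toy problem.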
Approximating random quantum optimization problems
NASA Astrophysics Data System (ADS)
Hsu, B.; Laumann, C. R.; Läuchli, A. M.; Moessner, R.; Sondhi, S. L.
2013-06-01
We report a cluster of results regarding the difficulty of finding approximate ground states to typical instances of the quantum satisfiability problem k-body quantum satisfiability (k-QSAT) on large random graphs. As an approximation strategy, we optimize the solution space over “classical” product states, which in turn introduces a novel autonomous classical optimization problem, PSAT, over a space of continuous degrees of freedom rather than discrete bits. Our central results are (i) the derivation of a set of bounds and approximations in various limits of the problem, several of which we believe may be amenable to a rigorous treatment; (ii) a demonstration that an approximation based on a greedy algorithm borrowed from the study of frustrated magnetism performs well over a wide range in parameter space, and its performance reflects the structure of the solution space of random k-QSAT. Simulated annealing exhibits metastability in similar “hard” regions of parameter space; and (iii) a generalization of belief propagation algorithms introduced for classical problems to the case of continuous spins. This yields both approximate solutions, as well as insights into the free energy “landscape” of the approximation problem, including a so-called dynamical transition near the satisfiability threshold. Taken together, these results allow us to elucidate the phase diagram of random k-QSAT in a two-dimensional energy-density-clause-density space.
NASA Technical Reports Server (NTRS)
Grantz, A. C.; Dejarnette, F. R.; Thompson, R. A.
1989-01-01
The approximate axisymmetric method presented for accurately calculating the surface and flowfield properties of fully viscous hypersonic flow over blunt-nosed bodies incorporates the turbulence model of Cebeci-Smith (1970) and the equilibrium air tables of Hansen (1959). The method is faster than the parabolized Navier-Stokes or viscous shock layer solvers that it could replace for preliminary design determinations. Surface heat transfer and pressure predictions for the present method are comparable with the more accurate viscous shock layer method as well as flight test and wind tunnel data. A starting solution is not required.
General relativistic corrections to N-body simulations and the Zel'dovich approximation
NASA Astrophysics Data System (ADS)
Fidler, Christian; Rampf, Cornelius; Tram, Thomas; Crittenden, Robert; Koyama, Kazuya; Wands, David
2015-12-01
The initial conditions for Newtonian N-body simulations are usually generated by applying the Zel'dovich approximation to the initial displacements of the particles using an initial power spectrum of density fluctuations generated by an Einstein-Boltzmann solver. We show that in most gauges the initial displacements generated in this way receive a first-order relativistic correction. We define a new gauge, the N-body gauge, in which this relativistic correction vanishes and show that a conventional Newtonian N-body simulation includes all first-order relativistic contributions (in the absence of radiation) if we identify the coordinates in Newtonian simulations with those in the relativistic N-body gauge.
The Laguerre finite difference one-way equation solver
NASA Astrophysics Data System (ADS)
Terekhov, Andrew V.
2017-05-01
This paper presents a new finite difference algorithm for solving the 2D one-way wave equation with a preliminary approximation of a pseudo-differential operator by a system of partial differential equations. As opposed to the existing approaches, the integral Laguerre transform instead of the Fourier transform is used. After carrying out the approximation of spatial variables it is possible to obtain systems of linear algebraic equations with better computing properties and to reduce computer costs for their solution. High accuracy of calculations is attained at the expense of employing finite difference approximations of higher accuracy order that are based on the dispersion-relationship-preserving method and the Richardson extrapolation in the downward continuation direction. The numerical experiments have verified that, as compared to the spectral difference method based on the Fourier transform, the new algorithm allows one to calculate wave fields with a higher degree of accuracy and a lower level of numerical noise and artifacts, including those for non-smooth velocity models. In the context of solving the geophysical problem, the post-stack migration for velocity models of the types Syncline and Sigsbee2A has been carried out. It is shown that the images obtained contain less noise and are considerably better focused as compared to those obtained by the known Fourier Finite Difference and Phase-Shift Plus Interpolation methods. There is an opinion that purely finite difference approaches do not allow carrying out the seismic migration procedure with sufficient accuracy; however, the results obtained disprove this statement. For the supercomputer implementation it is proposed to use the parallel dichotomy algorithm when solving systems of linear algebraic equations with block-tridiagonal matrices.
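Richardson extrapolation, mentioned in this abstract as one ingredient of the higher-order scheme, combines two approximations computed with different step sizes so that the leading error term cancels. The sketch below shows the idea on a deliberately simple stand-in, a central-difference derivative, and not on the downward-continuation stepping of the paper. The function names and the test function are assumptions for illustration only.

```python
import math

def central_diff(f, x, h):
    """Second-order central difference; its error behaves like c*h^2 + O(h^4)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def richardson(f, x, h):
    """Combine steps h and h/2 so that the h^2 error term cancels (fourth-order result)."""
    coarse = central_diff(f, x, h)
    fine = central_diff(f, x, h / 2.0)
    return (4.0 * fine - coarse) / 3.0

x0, h = 1.0, 0.1
exact = math.cos(x0)  # derivative of sin at x0
err_plain = abs(central_diff(math.sin, x0, h / 2.0) - exact)
err_rich = abs(richardson(math.sin, x0, h) - exact)
print("plain error:", err_plain, "extrapolated error:", err_rich)
```

The extrapolated value reuses the same two difference evaluations, so the accuracy gain costs only one extra linear combination.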
Novick, Laura R; Sherman, Steven J
2008-07-01
The two experiments reported here tested two predictions concerning the sensitivity of good and poor problem solvers to superficial and structural information during online problem solving: (a) Superficial features have a greater effect on solution difficulty for poor problem solvers, whereas (b) structural features have a greater effect on solution difficulty for good problem solvers. The tests were conducted in the domain of anagram solution by manipulating or measuring several superficial and structural characteristics in this domain. The results supported both predictions. They also indicated that better problem solvers have access to structural information from the earliest stages of processing (within the first 2 s). The authors discuss the implications of their results for the types of solution strategies used by more and less competent anagram solvers.
Systematically improvable multiscale solver for correlated electron systems
NASA Astrophysics Data System (ADS)
Kananenka, Alexei A.; Gull, Emanuel; Zgid, Dominika
2015-03-01
The development of numerical methods capable of simulating realistic materials with strongly correlated electrons, with controllable errors, is a central challenge in quantum many-body physics. Here we describe a framework for a general multiscale method based on embedding a self-energy of a strongly correlated subsystem into a self-energy generated by a method able to treat large weakly correlated systems approximately. As an example, we present the embedding of an exact diagonalization self-energy into a self-energy generated from self-consistent second-order perturbation theory. Using a quantum impurity model, generated from a cluster dynamical mean field approximation to the two-dimensional Hubbard model, as a benchmark, we illustrate that our method allows us to obtain accurate results at a fraction of the cost of typical Monte Carlo calculations. We test the method in multiple regimes of interaction strengths and dopings of the model. The general embedding framework we present avoids difficulties such as double counting corrections, frequency-dependent interactions, or vertex functions. As it is solely formulated at the level of the single-particle Green's function, it provides a promising route for the simulation of realistic materials that are currently difficult to study with other methods.
Mathematical Tasks without Words and Word Problems: Perceptions of Reluctant Problem Solvers
ERIC Educational Resources Information Center
Holbert, Sydney Margaret
2013-01-01
This qualitative research study used a multiple, holistic case study approach (Yin, 2009) to explore the perceptions of reluctant problem solvers related to mathematical tasks without words and word problems. Participants were given a choice of working a mathematical task without words or a word problem during four problem-solving sessions. Data…
A Comparison of the Intellectual Abilities of Good and Poor Problem Solvers: An Exploratory Study.
ERIC Educational Resources Information Center
Meyer, Ruth Ann
This study examined a selected sample of fourth-grade students who had been previously identified as good or poor problem solvers. The pupils were compared on variables considered as "reference tests" for Verbal, Induction, Numerical, Word Fluency, Memory, Spatial Visualization, and Perceptual Speed abilities. The data were compiled to…
A fast parallel solver for the forward problem in electrical impedance tomography.
Jehl, Markus; Dedner, Andreas; Betcke, Timo; Aristovich, Kirill; Klöfkorn, Robert; Holder, David
2015-01-01
Electrical impedance tomography (EIT) is a noninvasive imaging modality, where imperceptible currents are applied to the skin and the resulting surface voltages are measured. It has the potential to distinguish between ischaemic and haemorrhagic stroke with a portable and inexpensive device. The image reconstruction relies on an accurate forward model of the experimental setup. Because of the relatively small signal in stroke EIT, the finite-element modeling requires meshes of more than 10 million elements. To study the requirements in the forward modeling in EIT and also to reduce the time for experimental image acquisition, it is necessary to reduce the run time of the forward computation. We show the implementation of a parallel forward solver for EIT using the Dune-Fem C++ library and demonstrate its performance on many CPUs of a computer cluster. For a typical EIT application a direct solver was significantly slower and not an alternative to iterative solvers with multigrid preconditioning. With this new solver, we can compute the forward solutions and the Jacobian matrix of a typical EIT application with 30 electrodes on a 15-million element mesh in less than 15 min. This makes it a valuable tool for simulation studies and EIT applications with high precision requirements. It is freely available for download.
Parallel FFT-based Poisson Solver for Isolated Three-dimensional Systems
Budiardja, Reuben D; Cardall, Christian Y
2011-01-01
We describe an implementation to solve Poisson's equation for an isolated system on a unigrid mesh using FFTs. The method solves the equation globally on mesh blocks distributed across multiple processes on a distributed-memory parallel computer. Test results to demonstrate the convergence and scaling properties of the implementation are presented. The solver is offered to interested users as the library PSPFFT.
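One common way to impose isolated (free-space) boundary conditions with FFTs is to zero-pad the source onto a doubled domain and convolve it with the free-space Green's function there, so that periodic images never reach the physical region. The abstract does not say whether PSPFFT does exactly this, so the sketch below should be read as an illustration of the general technique and not of the library. It assumes a 1D analogue, phi'' = rho with Green's function |x|/2, a power-of-two grid size, and a small recursive radix-2 FFT written out to stay dependency-free. All names are hypothetical.

```python
import cmath

def fft(a, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def isolated_potential(rho, h):
    """Aperiodic convolution of rho with G(x) = |x|/2 via zero-padding to 2n points."""
    n = len(rho)
    m = 2 * n
    green = [0.0] * m
    for k in range(n):
        green[k] = 0.5 * k * h          # separations 0 .. n-1
    for k in range(1, n):
        green[m - k] = 0.5 * k * h      # wrapped entries stand for negative separations
    rho_pad = list(rho) + [0.0] * n     # source is zero on the padded half
    g_hat = fft(green)
    r_hat = fft(rho_pad)
    conv = fft([g * r for g, r in zip(g_hat, r_hat)], inverse=True)
    return [h * conv[i].real / m for i in range(n)]

def direct_potential(rho, h):
    """Reference O(n^2) summation with the same Green's function."""
    n = len(rho)
    return [h * sum(0.5 * abs(i - j) * h * rho[j] for j in range(n)) for i in range(n)]

n, h = 16, 0.1
rho = [1.0 if 6 <= i < 10 else 0.0 for i in range(n)]
phi_fft = isolated_potential(rho, h)
phi_ref = direct_potential(rho, h)
max_diff = max(abs(a - b) for a, b in zip(phi_fft, phi_ref))
print("max difference between FFT and direct sums:", max_diff)
```

The comparison against direct summation checks that the padding removes wrap-around contributions; a periodic convolution on the original n points would not agree with it.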
Lipnikov, Konstantin; Moulton, David; Svyatskiy, Daniil
2016-04-29
We develop a new approach for solving the nonlinear Richards’ equation arising in variably saturated flow modeling. The growing complexity of geometric models for simulation of subsurface flows leads to the necessity of using unstructured meshes and advanced discretization methods. Typically, a numerical solution is obtained by first discretizing PDEs and then solving the resulting system of nonlinear discrete equations with a Newton-Raphson-type method. Efficiency and robustness of the existing solvers rely on many factors, including an empiric quality control of intermediate iterates, complexity of the employed discretization method and a customized preconditioner. We propose and analyze a new preconditioning strategy that is based on a stable discretization of the continuum Jacobian. We will show with numerical experiments for challenging problems in subsurface hydrology that this new preconditioner improves convergence of the existing Jacobian-free solvers 3-20 times. Furthermore, we show that the Picard method with this preconditioner becomes a more efficient nonlinear solver than a few widely used Jacobian-free solvers.
NASA Astrophysics Data System (ADS)
Lipnikov, Konstantin; Moulton, David; Svyatskiy, Daniil
2016-08-01
We develop a new approach for solving the nonlinear Richards' equation arising in variably saturated flow modeling. The growing complexity of geometric models for simulation of subsurface flows leads to the necessity of using unstructured meshes and advanced discretization methods. Typically, a numerical solution is obtained by first discretizing PDEs and then solving the resulting system of nonlinear discrete equations with a Newton-Raphson-type method. Efficiency and robustness of the existing solvers rely on many factors, including an empiric quality control of intermediate iterates, complexity of the employed discretization method and a customized preconditioner. We propose and analyze a new preconditioning strategy that is based on a stable discretization of the continuum Jacobian. We will show with numerical experiments for challenging problems in subsurface hydrology that this new preconditioner improves convergence of the existing Jacobian-free solvers 3-20 times. We also show that the Picard method with this preconditioner becomes a more efficient nonlinear solver than a few widely used Jacobian-free solvers.
Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.
1994-01-01
The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain-decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to compare identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine, for a problem size of 1024 K equations (256 K grid points).
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.
1992-01-01
An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.
Flowfield Comparisons from Three Navier-Stokes Solvers for an Axisymmetric Separate Flow Jet
NASA Technical Reports Server (NTRS)
Koch, L. Danielle; Bridges, James; Khavaran, Abbas
2002-01-01
To meet new noise reduction goals, many concepts to enhance mixing in the exhaust jets of turbofan engines are being studied. Accurate steady state flowfield predictions from state-of-the-art computational fluid dynamics (CFD) solvers are needed as input to the latest noise prediction codes. The main intent of this paper was to ascertain that similar Navier-Stokes solvers run at different sites would yield comparable results for an axisymmetric two-stream nozzle case. Predictions from the WIND and the NPARC codes are compared to previously reported experimental data and results from the CRAFT Navier-Stokes solver. Similar k-epsilon turbulence models were employed in each solver, and identical computational grids were used. Agreement between experimental data and predictions from each code was generally good for mean values. All three codes underpredict the maximum value of turbulent kinetic energy. The predicted locations of the maximum turbulent kinetic energy were farther downstream than seen in the data. A grid study was conducted using the WIND code, and comments about convergence criteria and grid requirements for CFD solutions to be used as input for noise prediction computations are given. Additionally, noise predictions from the MGBK code, using the CFD results from the CRAFT code, NPARC, and WIND as input are compared to data.
NASA Technical Reports Server (NTRS)
Biedron, Robert T.; Vatsa, Veer N.; Atkins, Harold L.
2005-01-01
We apply an unsteady Reynolds-averaged Navier-Stokes (URANS) solver for unstructured grids to unsteady flows on moving and stationary grids. Example problems considered are relevant to active flow control and stability and control. Computational results are presented using the Spalart-Allmaras turbulence model and are compared to experimental data. The effects of grid and time-step refinement are examined.
VDJSeq-Solver: In Silico V(D)J Recombination Detection Tool
Paciello, Giulia; Acquaviva, Andrea; Pighi, Chiara; Ferrarini, Alberto; Macii, Enrico; Zamo’, Alberto; Ficarra, Elisa
2015-01-01
In this paper we present VDJSeq-Solver, a methodology and tool to identify clonal lymphocyte populations from paired-end RNA Sequencing reads derived from the sequencing of mRNA neoplastic cells. The tool detects the main clone that characterises the tissue of interest by recognizing the most abundant V(D)J rearrangement among the existing ones in the sample under study. The exact sequence of the clone identified is capable of accounting for the modifications introduced by the enzymatic processes. The proposed tool overcomes limitations of currently available lymphocyte rearrangements recognition methods, working on a single sequence at a time, that are not applicable to high-throughput sequencing data. In this work, VDJSeq-Solver has been applied to correctly detect the main clone and identify its sequence on five Mantle Cell Lymphoma samples; then the tool has been tested on twelve Diffuse Large B-Cell Lymphoma samples. In order to comply with the privacy, ethics and intellectual property policies of the University Hospital and the University of Verona, data is available upon request to supporto.utenti@ateneo.univr.it after signing a mandatory Materials Transfer Agreement. VDJSeq-Solver JAVA/Perl/Bash software implementation is free and available at http://eda.polito.it/VDJSeq-Solver/. PMID:25799103
LEOPARD: A grid-based dispersion relation solver for arbitrary gyrotropic distributions
NASA Astrophysics Data System (ADS)
Astfalk, Patrick; Jenko, Frank
2017-01-01
Particle velocity distributions measured in collisionless space plasmas often show strong deviations from idealized model distributions. Despite this observational evidence, linear wave analysis in space plasma environments such as the solar wind or Earth's magnetosphere is still mainly carried out using dispersion relation solvers based on Maxwellians or other parametric models. To enable a more realistic analysis, we present the new grid-based kinetic dispersion relation solver LEOPARD (Linear Electromagnetic Oscillations in Plasmas with Arbitrary Rotationally-symmetric Distributions) which no longer requires prescribed model distributions but allows for arbitrary gyrotropic distribution functions. In this work, we discuss the underlying numerical scheme of the code and we show a few exemplary benchmarks. Furthermore, we demonstrate a first application of LEOPARD to ion distribution data obtained from hybrid simulations. In particular, we show that in the saturation stage of the parallel fire hose instability, the deformation of the initial bi-Maxwellian distribution invalidates the use of standard dispersion relation solvers. A linear solver based on bi-Maxwellians predicts further growth even after saturation, while LEOPARD correctly indicates vanishing growth rates. We also discuss how this complies with former studies on the validity of quasilinear theory for the resonant fire hose. In the end, we briefly comment on the role of LEOPARD in directly analyzing spacecraft data, and we refer to an upcoming paper which demonstrates a first application of that kind.
A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments
Fisicaro, G.; Goedecker, S.; Genovese, L.; Andreussi, O.; Marzari, N.
2016-01-07
The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann equation, allowing the minimization problem to be solved iteratively with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.
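The preconditioned conjugate gradient iteration named in this abstract can be sketched on a small model problem. The example assumes a 1D generalized Poisson operator, -d/dx(eps(x) d(phi)/dx) = rho with zero boundary values and a smoothly varying dielectric function eps, discretized by finite differences. The preconditioner here is a simple diagonal (Jacobi) scaling used as a stand-in; the abstract's solver instead applies an ordinary Poisson solver inside each iteration, which is not reproduced. Function names, grid size and the form of eps are all assumptions.

```python
def pcg(apply_a, b, apply_m_inv, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite operator."""
    x = [0.0] * len(b)
    r = list(b)                                   # residual for the zero initial guess
    z = apply_m_inv(r)
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    b_norm = sum(bi * bi for bi in b) ** 0.5 or 1.0
    for it in range(1, max_iter + 1):
        ap = apply_a(p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if sum(ri * ri for ri in r) ** 0.5 <= tol * b_norm:
            return x, it
        z = apply_m_inv(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

n = 30
h = 1.0 / (n + 1)
# Assumed dielectric function sampled at the n + 1 cell interfaces.
eps_half = [1.0 + 10.0 * ((j + 0.5) * h) ** 2 for j in range(n + 1)]

def apply_a(u):
    """Finite-difference form of -d/dx(eps du/dx) with zero Dirichlet boundaries."""
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((eps_half[i] * (u[i] - left) + eps_half[i + 1] * (u[i] - right)) / h ** 2)
    return out

diag = [(eps_half[i] + eps_half[i + 1]) / h ** 2 for i in range(n)]

def apply_m_inv(r):
    """Jacobi preconditioner: divide by the diagonal of the operator."""
    return [ri / di for ri, di in zip(r, diag)]

rho = [1.0] * n
phi, iterations = pcg(apply_a, rho, apply_m_inv)
max_residual = max(abs(a - b) for a, b in zip(apply_a(phi), rho))
print("iterations:", iterations, "max residual:", max_residual)
```

Swapping `apply_m_inv` for a solve with the constant-coefficient operator would bring the sketch closer in spirit to the approach the abstract describes, at the cost of a longer example.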
NASA Technical Reports Server (NTRS)
Jameson, A.
1975-01-01
The use of a fast elliptic solver in combination with relaxation is presented as an effective way to accelerate the convergence of transonic flow calculations, particularly when a marching scheme can be used to treat the supersonic zone in the relaxation process.