Sample records for parallelizable approximate solvers

  1. Dilatonic parallelizable NS-NS backgrounds

    NASA Astrophysics Data System (ADS)

    Kawano, Teruhiko; Yamaguchi, Satoshi

    2003-08-01

    We complete the classification of parallelizable NS-NS backgrounds in type II supergravity by adding the dilatonic case to the result of Figueroa-O'Farrill on the non-dilatonic case. We also study the supersymmetry of these parallelizable backgrounds. It is shown that all the dilatonic parallelizable backgrounds have sixteen supersymmetries.

  2. A 3D approximate maximum likelihood localization solver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2016-09-23

    A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with acoustic transmitters and vocalizing marine mammals to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives and support Marine Renewable Energy. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.

  3. An approximate Riemann solver for hypervelocity flows

    NASA Technical Reports Server (NTRS)

    Jacobs, Peter A.

    1991-01-01

    We describe an approximate Riemann solver for the computation of hypervelocity flows in which there are strong shocks and viscous interactions. The scheme has three stages, the first of which computes the intermediate states assuming isentropic waves. A second stage, based on the strong shock relations, may then be invoked if the pressure jump across either wave is large. The third stage interpolates the interface state from the two initial states and the intermediate states. The solver is used as part of a finite-volume code and is demonstrated on two test cases. The first is a high Mach number flow over a sphere while the second is a flow over a slender cone with an adiabatic boundary layer. In both cases the solver performs well.
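
    Illustrative sketch (not taken from the record above): the first stage, computing intermediate states under the assumption of isentropic waves, resembles the textbook two-rarefaction approximation for an ideal gas. The function name, argument order and the ideal-gas closure with constant gamma are assumptions made for this example; the strong-shock correction and the interface interpolation of the later stages are not shown.

```python
import math

def two_rarefaction_star_state(rho_l, u_l, p_l, rho_r, u_r, p_r, gamma=1.4):
    """Intermediate pressure and velocity assuming both waves are isentropic.

    Hypothetical helper for illustration; ideal gas with constant gamma.
    """
    a_l = math.sqrt(gamma * p_l / rho_l)  # left sound speed
    a_r = math.sqrt(gamma * p_r / rho_r)  # right sound speed
    z = (gamma - 1.0) / (2.0 * gamma)

    # Equating the Riemann invariants carried by the two isentropic waves
    # gives a closed-form expression for the intermediate pressure.
    num = a_l + a_r - 0.5 * (gamma - 1.0) * (u_r - u_l)
    den = a_l / p_l**z + a_r / p_r**z
    p_star = (num / den) ** (1.0 / z)

    # Intermediate velocity from the left-running wave relation.
    u_star = u_l - 2.0 * a_l / (gamma - 1.0) * ((p_star / p_l) ** z - 1.0)
    return p_star, u_star


# Usage: a Sod-type shock tube (left state at rest and high pressure).
p_star, u_star = two_rarefaction_star_state(1.0, 0.0, 1.0, 0.125, 0.0, 0.1)
```

    When the pressure jump across either wave is large, this isentropic estimate loses accuracy, which is the situation the second stage in the abstract is meant to handle.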

  4. NONLINEAR MULTIGRID SOLVER EXPLOITING AMGe COARSE SPACES WITH APPROXIMATION PROPERTIES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Christensen, Max La Cour; Villa, Umberto E.; Engsig-Karup, Allan P.

    The paper introduces a nonlinear multigrid solver for mixed finite element discretizations based on the Full Approximation Scheme (FAS) and element-based Algebraic Multigrid (AMGe). The main motivation to use FAS for unstructured problems is the guaranteed approximation property of the AMGe coarse spaces that were developed recently at Lawrence Livermore National Laboratory. These give the ability to derive stable and accurate coarse nonlinear discretization problems. The previous attempts (including ones with the original AMGe method, [5, 11]), were less successful due to lack of such good approximation properties of the coarse spaces. With coarse spaces with approximation properties, our FAS approach on unstructured meshes should be as powerful/successful as FAS on geometrically refined meshes. For comparison, Newton's method and Picard iterations with an inner state-of-the-art linear solver are compared to FAS on a nonlinear saddle point problem with applications to porous media flow. It is demonstrated that FAS is faster than Newton's method and Picard iterations for the experiments considered here. Due to the guaranteed approximation properties of our AMGe, the coarse spaces are very accurate, providing a solver with the potential for mesh-independent convergence on general unstructured meshes.

  5. An approximate Riemann solver for real gas parabolized Navier-Stokes equations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Urbano, Annafederica; Nasuti, Francesco

    2013-01-15

    Under specific assumptions, parabolized Navier-Stokes equations are a suitable means to study channel flows. A special case is that of high pressure flow of real gases in cooling channels where large crosswise gradients of thermophysical properties occur. To solve the parabolized Navier-Stokes equations by a space marching approach, the hyperbolicity of the system of governing equations is obtained, even for very low Mach number flow, by recasting equations such that the streamwise pressure gradient is considered as a source term. For this system of equations an approximate Roe's Riemann solver is developed as the core of a Godunov type finite volume algorithm. The properties of the approximated Riemann solver, which is a modification of Roe's Riemann solver for the parabolized Navier-Stokes equations, are presented and discussed with emphasis given to its original features introduced to handle fluids governed by a generic real gas EoS. Sample solutions are obtained for low Mach number, highly compressible flows of transcritical methane, heated in straight long channels, to prove the solver's ability to describe flows dominated by complex thermodynamic phenomena.

  6. Jacobian-free approximate solvers for hyperbolic systems: Application to relativistic magnetohydrodynamics

    NASA Astrophysics Data System (ADS)

    Castro, Manuel J.; Gallardo, José M.; Marquina, Antonio

    2017-10-01

    We present recent advances in PVM (Polynomial Viscosity Matrix) methods based on internal approximations to the absolute value function, and compare them with Chebyshev-based PVM solvers. These solvers only require a bound on the maximum wave speed, so no spectral decomposition is needed. Another important feature of the proposed methods is that they are suitable to be written in Jacobian-free form, in which only evaluations of the physical flux are used. This is particularly interesting when considering systems for which the Jacobians involve complex expressions, e.g., the relativistic magnetohydrodynamics (RMHD) equations. On the other hand, the proposed Jacobian-free solvers have also been extended to the case of approximate DOT (Dumbser-Osher-Toro) methods, which can be regarded as simple and efficient approximations to the classical Osher-Solomon method, sharing most of its interesting features and being applicable to general hyperbolic systems. To test the properties of our schemes a number of numerical experiments involving the RMHD equations are presented, both in one and two dimensions. The obtained results are in good agreement with those found in the literature and show that our schemes are robust and accurate, running stably under a satisfactory time step restriction. It is worth emphasizing that, although this work focuses on RMHD, the proposed schemes are suitable to be applied to general hyperbolic systems.

  7. Hierarchically Parallelized Constrained Nonlinear Solvers with Automated Substructuring

    NASA Technical Reports Server (NTRS)

    Padovan, Joe; Kwang, Abel

    1994-01-01

    This paper develops a parallelizable multilevel multiple constrained nonlinear equation solver. The substructuring process is automated to yield appropriately balanced partitioning of each succeeding level. Due to the generality of the procedure, sequential, as well as partially and fully parallel environments can be handled. This includes both single and multiprocessor assignment per individual partition. Several benchmark examples are presented. These illustrate the robustness of the procedure as well as its capability to yield significant reductions in memory utilization and calculational effort due both to updating and inversion.

  8. A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

    DOE PAGES

    Li, Xinya; Deng, Z. Daniel; ...

    2014-11-27

    Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.

  9. A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

    NASA Astrophysics Data System (ADS)

    Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.

    2014-11-01

    Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.

  10. A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

    PubMed Central

    Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.

    2014-01-01

    Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature. PMID:25427517

  11. A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters.

    PubMed

    Li, Xinya; Deng, Z Daniel; Sun, Yannan; Martinez, Jayson J; Fu, Tao; McMichael, Geoffrey A; Carlson, Thomas J

    2014-11-27

    Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.

  12. A 3D approximate maximum likelihood solver for localization of fish implanted with acoustic transmitters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Xinya; Deng, Z. Daniel

    Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.

  13. Matrix decomposition graphics processing unit solver for Poisson image editing

    NASA Astrophysics Data System (ADS)

    Lei, Zhao; Wei, Li

    2012-10-01

    In recent years, gradient-domain methods have been widely discussed in the image processing field, including seamless cloning and image stitching. These algorithms are commonly carried out by solving a large sparse linear system: the Poisson equation. However, solving the Poisson equation is a computationally and memory-intensive task, which makes it unsuitable for real-time image editing. A new matrix decomposition graphics processing unit (GPU) solver (MDGS) is proposed to settle the problem. A matrix decomposition method is used to distribute the work among GPU threads, so that MDGS will take full advantage of the computing power of current GPUs. Additionally, MDGS is a hybrid solver (it combines both direct and iterative techniques) and has a two-level architecture. These enable MDGS to generate solutions identical to those of the common Poisson methods and achieve a high convergence rate in most cases. This approach is advantageous in terms of parallelizability, real-time image processing, low memory usage and breadth of applications.

  14. An approximate Riemann solver for magnetohydrodynamics (that works in more than one dimension)

    NASA Technical Reports Server (NTRS)

    Powell, Kenneth G.

    1994-01-01

    An approximate Riemann solver is developed for the governing equations of ideal magnetohydrodynamics (MHD). The Riemann solver has an eight-wave structure, where seven of the waves are those used in previous work on upwind schemes for MHD, and the eighth wave is related to the divergence of the magnetic field. The structure of the eighth wave is not immediately obvious from the governing equations as they are usually written, but arises from a modification of the equations that is presented in this paper. The addition of the eighth wave allows multidimensional MHD problems to be solved without the use of staggered grids or a projection scheme, one or the other of which was necessary in previous work on upwind schemes for MHD. A test problem made up of a shock tube with rotated initial conditions is solved to show that the two-dimensional code yields answers consistent with the one-dimensional methods developed previously.

  15. An HLLC Riemann solver for resistive relativistic magnetohydrodynamics

    NASA Astrophysics Data System (ADS)

    Miranda-Aranguren, S.; Aloy, M. A.; Rembiasz, T.

    2018-05-01

    We present a new approximate Riemann solver for the augmented system of equations of resistive relativistic magnetohydrodynamics that belongs to the family of Harten-Lax-van Leer contact wave (HLLC) solvers. In HLLC solvers, the solution is approximated by two constant states flanked by two shocks separated by a contact wave. The accuracy of the new approximate solver is calibrated through 1D and 2D test problems.
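
    Illustrative sketch (not the solver of the record above): the HLLC wave pattern, two outer waves separated by a contact wave, can be shown on the much simpler 1D ideal-gas Euler equations. The function names, the primitive-variable tuples (rho, u, p) and the simple min/max wave-speed estimates are assumptions made for this example; the resistive relativistic system needs its own state vector and wave-speed bounds.

```python
import math

def euler_flux(rho, u, p, gamma=1.4):
    """Return (physical flux, conserved state) for the 1D Euler equations."""
    energy = p / (gamma - 1.0) + 0.5 * rho * u * u
    flux = (rho * u, rho * u * u + p, u * (energy + p))
    return flux, (rho, rho * u, energy)

def hllc_flux(left, right, gamma=1.4):
    """HLLC interface flux from left/right primitive states (rho, u, p)."""
    rho_l, u_l, p_l = left
    rho_r, u_r, p_r = right
    f_l, q_l = euler_flux(rho_l, u_l, p_l, gamma)
    f_r, q_r = euler_flux(rho_r, u_r, p_r, gamma)

    # Simple (assumed) estimates for the fastest left- and right-going waves.
    a_l = math.sqrt(gamma * p_l / rho_l)
    a_r = math.sqrt(gamma * p_r / rho_r)
    s_l = min(u_l - a_l, u_r - a_r)
    s_r = max(u_l + a_l, u_r + a_r)
    if s_l >= 0.0:
        return f_l  # all waves move right: upwind on the left state
    if s_r <= 0.0:
        return f_r  # all waves move left: upwind on the right state

    # Speed of the contact wave separating the two intermediate states.
    s_m = (p_r - p_l + rho_l * u_l * (s_l - u_l) - rho_r * u_r * (s_r - u_r)) / (
        rho_l * (s_l - u_l) - rho_r * (s_r - u_r))

    def star_flux(f, q, rho, u, p, s):
        # Intermediate state behind the outer wave of speed s, then the
        # jump condition F* = F + s (U* - U) across that wave.
        factor = rho * (s - u) / (s - s_m)
        e_star = q[2] / rho + (s_m - u) * (s_m + p / (rho * (s - u)))
        q_star = (factor, factor * s_m, factor * e_star)
        return tuple(fk + s * (qs - qk) for fk, qs, qk in zip(f, q_star, q))

    if s_m >= 0.0:
        return star_flux(f_l, q_l, rho_l, u_l, p_l, s_l)
    return star_flux(f_r, q_r, rho_r, u_r, p_r, s_r)


# Usage: interface flux for a Sod-type shock tube.
flux = hllc_flux((1.0, 0.0, 1.0), (0.125, 0.0, 0.1))
```

    A basic sanity check for any such flux is consistency: with identical left and right states it should reduce to the physical flux.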

  16. A fast Cauchy-Riemann solver. [differential equation solution for boundary conditions by finite difference approximation]

    NASA Technical Reports Server (NTRS)

    Ghil, M.; Balgovind, R.

    1979-01-01

    The inhomogeneous Cauchy-Riemann equations in a rectangle are discretized by a finite difference approximation. Several different boundary conditions are treated explicitly, leading to algorithms which have overall second-order accuracy. All boundary conditions with either u or v prescribed along a side of the rectangle can be treated by similar methods. The algorithms presented here have nearly minimal time and storage requirements and seem suitable for development into a general-purpose direct Cauchy-Riemann solver for arbitrary boundary conditions.

  17. FELIX-1.0: A finite element solver for the time dependent generator coordinate method with the Gaussian overlap approximation

    NASA Astrophysics Data System (ADS)

    Regnier, D.; Verrière, M.; Dubray, N.; Schunck, N.

    2016-03-01

    We describe the software package FELIX that solves the equations of the time-dependent generator coordinate method (TDGCM) in N dimensions (N ≥ 1) under the Gaussian overlap approximation. The numerical resolution is based on the Galerkin finite element discretization of the collective space and the Crank-Nicolson scheme for time integration. The TDGCM solver is implemented entirely in C++. Several additional tools written in C++, Python or bash scripting language are also included for convenience. In this paper, the solver is tested with a series of benchmark calculations. We also demonstrate the ability of our code to handle a realistic calculation of fission dynamics.
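
    Illustrative sketch (not FELIX code): Crank-Nicolson time stepping can be shown on a 1D Schrödinger-like model problem i dψ/dt = Hψ, with H = -(1/2) d²/dx² discretized by second-order finite differences and zero boundary values. The model Hamiltonian, the finite-difference discretization (the record uses Galerkin finite elements) and all function names are assumptions made for this example.

```python
import cmath

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(diag)
    c = [0j] * n
    d = [0j] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0j] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def crank_nicolson_step(psi, dt, h):
    """One step of (I + i dt/2 H) psi_new = (I - i dt/2 H) psi_old."""
    n = len(psi)
    off = -0.5 / h**2   # off-diagonal entries of H
    main = 1.0 / h**2   # diagonal entries of H
    rhs = []
    for j in range(n):
        h_psi = main * psi[j]
        if j > 0:
            h_psi += off * psi[j - 1]
        if j < n - 1:
            h_psi += off * psi[j + 1]
        rhs.append(psi[j] - 0.5j * dt * h_psi)
    a = 0.5j * dt * off
    b = 1.0 + 0.5j * dt * main
    return thomas([a] * n, [b] * n, [a] * n, rhs)

def evolve(psi, dt, h, steps):
    for _ in range(steps):
        psi = crank_nicolson_step(psi, dt, h)
    return psi

def norm_sq(psi):
    return sum(abs(v) ** 2 for v in psi)

def gaussian_packet(n, h):
    """Moving Gaussian wave packet sampled at the interior grid points."""
    xs = [(j + 1) * h for j in range(n)]
    return [cmath.exp(-((x - 0.5) ** 2) / 0.01 + 20j * x) for x in xs]


# Usage: propagate a wave packet; the discrete norm should be conserved.
n = 64
h = 1.0 / (n + 1)
psi0 = gaussian_packet(n, h)
psi1 = evolve(psi0, 1e-4, h, 50)
```

    For a Hermitian H the update is a Cayley transform, so the discrete norm of ψ is preserved up to round-off regardless of the time step, which is one common reason for choosing this scheme for Schrödinger-type evolution.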

  18. FELIX-1.0: A finite element solver for the time dependent generator coordinate method with the Gaussian overlap approximation

    DOE PAGES

    Regnier, D.; Verriere, M.; Dubray, N.; ...

    2015-11-30

    In this study, we describe the software package FELIX that solves the equations of the time-dependent generator coordinate method (TDGCM) in N dimensions (N ≥ 1) under the Gaussian overlap approximation. The numerical resolution is based on the Galerkin finite element discretization of the collective space and the Crank–Nicolson scheme for time integration. The TDGCM solver is implemented entirely in C++. Several additional tools written in C++, Python or bash scripting language are also included for convenience. In this paper, the solver is tested with a series of benchmark calculations. We also demonstrate the ability of our code to handle a realistic calculation of fission dynamics.

  19. Nonlinear Solver Approaches for the Diffusive Wave Approximation to the Shallow Water Equations

    NASA Astrophysics Data System (ADS)

    Collier, N.; Knepley, M.

    2015-12-01

    The diffusive wave approximation to the shallow water equations (DSW) is a doubly-degenerate, nonlinear, parabolic partial differential equation used to model overland flows. Despite its challenges, the DSW equation has been extensively used to model the overland flow component of various integrated surface/subsurface models. The equation's complications become increasingly problematic when ponding occurs, a feature which becomes pervasive when solving on large domains with realistic terrain. In this talk I discuss the various forms and regularizations of the DSW equation and highlight their effect on the solvability of the nonlinear system. In addition to this analysis, I present results of a numerical study which tests the applicability of a class of composable nonlinear algebraic solvers recently added to the Portable, Extensible Toolkit for Scientific Computation (PETSc).

  20. An accurate, fast, and scalable solver for high-frequency wave propagation

    NASA Astrophysics Data System (ADS)

    Zepeda-Núñez, L.; Taus, M.; Hewett, R.; Demanet, L.

    2017-12-01

    In many science and engineering applications, solving time-harmonic high-frequency wave propagation problems quickly and accurately is of paramount importance. For example, in geophysics, particularly in oil exploration, such problems can be the forward problem in an iterative process for solving the inverse problem of subsurface inversion. It is important to solve these wave propagation problems accurately in order to efficiently obtain meaningful solutions of the inverse problems: low order forward modeling can hinder convergence. Additionally, due to the volume of data and the iterative nature of most optimization algorithms, the forward problem must be solved many times. Therefore, a fast solver is necessary to make solving the inverse problem feasible. For time-harmonic high-frequency wave propagation, obtaining both speed and accuracy is historically challenging. Recently, there have been many advances in the development of fast solvers for such problems, including methods which have linear complexity with respect to the number of degrees of freedom. While most methods scale optimally only in the context of low-order discretizations and smooth wave speed distributions, the method of polarized traces has been shown to retain optimal scaling for high-order discretizations, such as hybridizable discontinuous Galerkin methods and for highly heterogeneous (and even discontinuous) wave speeds. The resulting fast and accurate solver is consequently highly attractive for geophysical applications. To date, this method relies on a layered domain decomposition together with a preconditioner applied in a sweeping fashion, which has limited straightforward parallelization. In this work, we introduce a new version of the method of polarized traces which reveals more parallel structure than previous versions while preserving all of its other advantages. We achieve this by further decomposing each layer and applying the preconditioner to these new components separately and

  1. Development and Characterization of a Parallelizable Perfusion Bioreactor for 3D Cell Culture.

    PubMed

    Egger, Dominik; Fischer, Monica; Clementi, Andreas; Ribitsch, Volker; Hansmann, Jan; Kasper, Cornelia

    2017-05-25

    The three dimensional (3D) cultivation of stem cells in dynamic bioreactor systems is essential in the context of regenerative medicine. Still, there is a lack of bioreactor systems that allow the cultivation of multiple independent samples under different conditions while ensuring comprehensive control over the mechanical environment. Therefore, we developed a miniaturized, parallelizable perfusion bioreactor system with two different bioreactor chambers. Pressure sensors were also implemented to determine the permeability of biomaterials which allows us to approximate the shear stress conditions. To characterize the flow velocity and shear stress profile of a porous scaffold in both bioreactor chambers, a computational fluid dynamics analysis was performed. Furthermore, the mixing behavior was characterized by acquisition of the residence time distributions. Finally, the effects of the different flow and shear stress profiles of the bioreactor chambers on osteogenic differentiation of human mesenchymal stem cells were evaluated in a proof of concept study. In conclusion, the data from computational fluid dynamics and shear stress calculations were found to be predictable for relative comparison of the bioreactor geometries, but not for final determination of the optimal flow rate. However, we suggest that the system is beneficial for parallel dynamic cultivation of multiple samples for 3D cell culture processes.

  2. Development and Characterization of a Parallelizable Perfusion Bioreactor for 3D Cell Culture

    PubMed Central

    Egger, Dominik; Fischer, Monica; Clementi, Andreas; Ribitsch, Volker; Hansmann, Jan; Kasper, Cornelia

    2017-01-01

    The three dimensional (3D) cultivation of stem cells in dynamic bioreactor systems is essential in the context of regenerative medicine. Still, there is a lack of bioreactor systems that allow the cultivation of multiple independent samples under different conditions while ensuring comprehensive control over the mechanical environment. Therefore, we developed a miniaturized, parallelizable perfusion bioreactor system with two different bioreactor chambers. Pressure sensors were also implemented to determine the permeability of biomaterials which allows us to approximate the shear stress conditions. To characterize the flow velocity and shear stress profile of a porous scaffold in both bioreactor chambers, a computational fluid dynamics analysis was performed. Furthermore, the mixing behavior was characterized by acquisition of the residence time distributions. Finally, the effects of the different flow and shear stress profiles of the bioreactor chambers on osteogenic differentiation of human mesenchymal stem cells were evaluated in a proof of concept study. In conclusion, the data from computational fluid dynamics and shear stress calculations were found to be predictable for relative comparison of the bioreactor geometries, but not for final determination of the optimal flow rate. However, we suggest that the system is beneficial for parallel dynamic cultivation of multiple samples for 3D cell culture processes. PMID:28952530

  3. Numerical comparison of Riemann solvers for astrophysical hydrodynamics

    NASA Astrophysics Data System (ADS)

    Klingenberg, Christian; Schmidt, Wolfram; Waagan, Knut

    2007-11-01

    The idea of this work is to compare a new positive and entropy stable approximate Riemann solver by Francois Bouchut with a state-of-the-art algorithm for astrophysical fluid dynamics. We implemented the new Riemann solver into an astrophysical PPM-code, the Prometheus code, and also made a version with a different, more theoretically grounded higher order algorithm than PPM. We present shock tube tests, two-dimensional instability tests and forced turbulence simulations in three dimensions. We find subtle differences between the codes in the shock tube tests, and in the statistics of the turbulence simulations. The new Riemann solver increases the computational speed without significant loss of accuracy.

  4. SU-E-T-22: A Deterministic Solver of the Boltzmann-Fokker-Planck Equation for Dose Calculation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, X; Gao, H; Paganetti, H

    2015-06-15

    Purpose: The Boltzmann-Fokker-Planck equation (BFPE) accurately models the migration of photons/charged particles in tissues. While the Monte Carlo (MC) method is popular for solving BFPE in a statistical manner, we aim to develop a deterministic BFPE solver based on various state-of-the-art numerical acceleration techniques for rapid and accurate dose calculation. Methods: Our BFPE solver is based on the structured grid that is maximally parallelizable, with the discretization in energy, angle and space, and its cross section coefficients are derived or directly imported from the Geant4 database. The physical processes that are taken into account are Compton scattering, photoelectric effect, pair production for photons, and elastic scattering, ionization and bremsstrahlung for charged particles. While the spatial discretization is based on the diamond scheme, the angular discretization synergizes finite element method (FEM) and spherical harmonics (SH). Thus, SH is used to globally expand the scattering kernel and FEM is used to locally discretize the angular sphere. As a result, this hybrid method (FEM-SH) is both accurate in dealing with forward-peaking scattering via FEM, and efficient for multi-energy-group computation via SH. In addition, FEM-SH enables the analytical integration in energy variable of delta scattering kernel for elastic scattering with reduced truncation error from the numerical integration based on the classic SH-based multi-energy-group method. Results: The accuracy of the proposed BFPE solver was benchmarked against Geant4 for photon dose calculation. In particular, FEM-SH had improved accuracy compared to FEM, while both were within 2% of the results obtained with Geant4. Conclusion: A deterministic solver of the Boltzmann-Fokker-Planck equation is developed for dose calculation, and benchmarked against Geant4. Xiang Hong and Hao Gao were partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000) and the Shanghai

  5. A Riemann solver for single-phase and two-phase shallow flow models based on relaxation. Relations with Roe and VFRoe solvers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pelanti, Marica; Bouchut, Francois; Mangeney, Anne

    2011-02-01

    We present a Riemann solver derived by a relaxation technique for classical single-phase shallow flow equations and for a two-phase shallow flow model describing a mixture of solid granular material and fluid. Our primary interest is the numerical approximation of this two-phase solid/fluid model, whose complexity poses numerical difficulties that cannot be efficiently addressed by existing solvers. In particular, we are concerned with ensuring a robust treatment of dry bed states. The relaxation system used by the proposed solver is formulated by introducing auxiliary variables that replace the momenta in the spatial gradients of the original model systems. The resulting relaxation solver is related to the Roe solver in that its Riemann solution for the flow height and relaxation variables is formally computed as Roe's Riemann solution. The relaxation solver has the advantage of a certain degree of freedom in the specification of the wave structure through the choice of the relaxation parameters. This flexibility can be exploited to robustly handle vacuum states, which is a well-known difficulty of standard Roe's method, while maintaining Roe's low diffusivity. For the single-phase model positivity of flow height is rigorously preserved. For the two-phase model positivity of volume fractions in general is not ensured, and a suitable restriction on the CFL number might be needed. Nonetheless, numerical experiments suggest that the proposed two-phase flow solver efficiently models wet/dry fronts and vacuum formation for a large range of flow conditions. As a corollary of our study, we show that for single-phase shallow flow equations the relaxation solver is formally equivalent to the VFRoe solver with conservative variables of Gallouet and Masella [T. Gallouet, J.-M. Masella, Un schema de Godunov approche, C.R. Acad. Sci. Paris, Serie I, 323 (1996) 77-84]. The relaxation interpretation allows establishing positivity conditions for this VFRoe method.

  6. A multigrid solver for the semiconductor equations

    NASA Technical Reports Server (NTRS)

    Bachmann, Bernhard

    1993-01-01

    We present a multigrid solver for the exponential fitting method. The solver is applied to the current continuity equations of semiconductor device simulation in two dimensions. The exponential fitting method is based on a mixed finite element discretization using the lowest-order Raviart-Thomas triangular element. This discretization method yields a good approximation of front layers and guarantees current conservation. The corresponding stiffness matrix is an M-matrix. 'Standard' multigrid solvers, however, cannot be applied to the resulting system, as this is dominated by an unsymmetric part, which is due to the presence of strong convection in part of the domain. To overcome this difficulty, we explore the connection between Raviart-Thomas mixed methods and the nonconforming Crouzeix-Raviart finite element discretization. In this way we can construct nonstandard prolongation and restriction operators using easily computable weighted L^2-projections based on suitable quadrature rules and the upwind effects of the discretization. The resulting multigrid algorithm shows very good results, even for real-world problems and for locally refined grids.

  7. Assessment of Linear Finite-Difference Poisson-Boltzmann Solvers

    PubMed Central

    Wang, Jun; Luo, Ray

    2009-01-01

    CPU time and memory usage are two vital issues that any numerical solver for the Poisson-Boltzmann equation has to face in biomolecular applications. In this study we systematically analyzed the CPU time and memory usage of five commonly used finite-difference solvers with a large and diversified set of biomolecular structures. Our comparative analysis shows that modified incomplete Cholesky conjugate gradient and geometric multigrid are the most efficient in the diversified test set. For these two efficient solvers, our tests show that their CPU times increase approximately linearly with the numbers of grids. Their CPU times also increase almost linearly with the negative logarithm of the convergence criterion at a very similar rate. Our comparison further shows that geometric multigrid performs better in the large set of tested biomolecules. However, modified incomplete Cholesky conjugate gradient is superior to geometric multigrid in molecular dynamics simulations of tested molecules. We also investigated other significant components in numerical solutions of the Poisson-Boltzmann equation. It turns out that the time-limiting step is the free boundary condition setup for the linear systems for the selected proteins if electrostatic focusing is not used. Thus, development of future numerical solvers for the Poisson-Boltzmann equation should balance all aspects of the numerical procedures in realistic biomolecular applications. PMID:20063271
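
    To make the conjugate gradient family mentioned above concrete, here is a minimal sketch of the plain, unpreconditioned conjugate gradient iteration applied to a one-dimensional finite-difference Poisson matrix. It omits the modified incomplete Cholesky preconditioner that the abstract evaluates, and the model problem, function names, and tolerance are assumptions chosen for illustration only.

```python
def apply_poisson_1d(u):
    """Matrix-free product with tridiag(-1, 2, -1), i.e. zero Dirichlet ends."""
    n = len(u)
    return [
        2.0 * u[i]
        - (u[i - 1] if i > 0 else 0.0)
        - (u[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def dot(a, b):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only matvec."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    iterations = 0
    # Stop once the residual 2-norm drops below tol.
    while rs_old ** 0.5 > tol and iterations < max_iter:
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)                      # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # direction update
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
        iterations += 1
    return x, iterations


n = 50
solution, iterations = conjugate_gradient(apply_poisson_1d, [1.0] * n)
```

    The solver only needs a matrix-vector product, which keeps memory usage proportional to the number of unknowns. A preconditioner would enter as one extra solve per iteration applied to the residual.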

  8. Albany/FELIX: A parallel, scalable and robust, finite element, first-order Stokes approximation ice sheet solver built for advanced analysis

    DOE PAGES

    Tezaur, I. K.; Perego, M.; Salinger, A. G.; ...

    2015-04-27

    This paper describes a new parallel, scalable and robust finite element based solver for the first-order Stokes momentum balance equations for ice flow. The solver, known as Albany/FELIX, is constructed using the component-based approach to building application codes, in which mature, modular libraries developed as a part of the Trilinos project are combined using abstract interfaces and template-based generic programming, resulting in a final code with access to dozens of algorithmic and advanced analysis capabilities. Following an overview of the relevant partial differential equations and boundary conditions, the numerical methods chosen to discretize the ice flow equations are described, along with their implementation. The results of several verification studies of the model accuracy are presented using (1) new test cases for simplified two-dimensional (2-D) versions of the governing equations derived using the method of manufactured solutions, and (2) canonical ice sheet modeling benchmarks. Model accuracy and convergence with respect to mesh resolution are then studied on problems involving a realistic Greenland ice sheet geometry discretized using hexahedral and tetrahedral meshes. Also explored as a part of this study is the effect of vertical mesh resolution on the solution accuracy and solver performance. The robustness and scalability of our solver on these problems is demonstrated. Lastly, we show that good scalability can be achieved by preconditioning the iterative linear solver using a new algebraic multilevel preconditioner, constructed based on the idea of semi-coarsening.

  9. Fast Laplace solver approach to pore-scale permeability

    NASA Astrophysics Data System (ADS)

    Arns, C. H.; Adler, P. M.

    2018-02-01

    We introduce a powerful and easily implemented method to calculate the permeability of porous media at the pore scale using an approximation based on the Poiseuille equation to calculate permeability to fluid flow with a Laplace solver. The method consists of calculating the Euclidean distance map of the fluid phase to assign local conductivities and lends itself naturally to the treatment of multiscale problems. We compare with analytical solutions as well as experimental measurements and lattice Boltzmann calculations of permeability for Fontainebleau sandstone. The solver is significantly more stable than the lattice Boltzmann approach, uses less memory, and is significantly faster. Permeabilities are in excellent agreement over a wide range of porosities.

  10. A new fast direct solver for the boundary element method

    NASA Astrophysics Data System (ADS)

    Huang, S.; Liu, Y. J.

    2017-09-01

    A new fast direct linear equation solver for the boundary element method (BEM) is presented in this paper. The idea of the new fast direct solver stems from the concept of the hierarchical off-diagonal low-rank matrix. The hierarchical off-diagonal low-rank matrix can be decomposed into the multiplication of several diagonal block matrices. The inverse of the hierarchical off-diagonal low-rank matrix can be calculated efficiently with the Sherman-Morrison-Woodbury formula. In this paper, a more general and efficient approach to approximate the coefficient matrix of the BEM with the hierarchical off-diagonal low-rank matrix is proposed. Compared to the current fast direct solver based on the hierarchical off-diagonal low-rank matrix, the proposed method is suitable for solving general 3-D boundary element models. Several numerical examples of 3-D potential problems with the total number of unknowns up to above 200,000 are presented. The results show that the new fast direct solver can be applied to solve large 3-D BEM models accurately and with better efficiency compared with the conventional BEM.
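
    For reference, the Sherman-Morrison-Woodbury formula named above can be written as follows. The symbols are assumptions for this note rather than the paper's notation: A is an invertible n-by-n matrix (for example a block-diagonal part), and U, C, V are n-by-k, k-by-k and k-by-n factors of a low-rank correction with k much smaller than n.

```latex
\[
  (A + U C V)^{-1}
  = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}
\]
```

    The identity is useful when A is cheap to invert, because the only remaining dense inverse is of the k-by-k matrix in the middle.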

  11. A dynamic-solver-consistent minimum action method: With an application to 2D Navier-Stokes equations

    NASA Astrophysics Data System (ADS)

    Wan, Xiaoliang; Yu, Haijun

    2017-02-01

    This paper discusses the necessity and strategy to unify the development of a dynamic solver and a minimum action method (MAM) for a spatially extended system when employing the large deviation principle (LDP) to study the effects of small random perturbations. A dynamic solver is used to approximate the unperturbed system, and a minimum action method is used to approximate the LDP, which corresponds to solving an Euler-Lagrange equation related to but more complicated than the unperturbed system. We will clarify possible inconsistencies induced by independent numerical approximations of the unperturbed system and the LDP, based on which we propose to define both the dynamic solver and the MAM on the same approximation space for spatial discretization. The semi-discrete LDP can then be regarded as the exact LDP of the semi-discrete unperturbed system, which is a finite-dimensional ODE system. We achieve this methodology for the two-dimensional Navier-Stokes equations using a divergence-free approximation space. The method developed can be used to study the nonlinear instability of wall-bounded parallel shear flows, and be generalized straightforwardly to three-dimensional cases. Numerical experiments are presented.

  12. Effects of high-frequency damping on iterative convergence of implicit viscous solver

    NASA Astrophysics Data System (ADS)

    Nishikawa, Hiroaki; Nakashima, Yoshitaka; Watanabe, Norihiko

    2017-11-01

    This paper discusses effects of high-frequency damping on iterative convergence of an implicit defect-correction solver for viscous problems. The study targets a finite-volume discretization with a one-parameter family of damped viscous schemes. The parameter α controls high-frequency damping: zero damping with α = 0, and larger damping for larger α (> 0). Convergence rates are predicted for a model diffusion equation by a Fourier analysis over a practical range of α. It is shown that the convergence rate attains its minimum at α = 1 on regular quadrilateral grids, and deteriorates for larger values of α. A similar behavior is observed for regular triangular grids. In both quadrilateral and triangular grids, the solver is predicted to diverge for α smaller than approximately 0.5. Numerical results are shown for the diffusion equation and the Navier-Stokes equations on regular and irregular grids. The study suggests that α = 1 and α = 4/3 are suitable values for robust and efficient computations, and α = 4/3 is recommended for the diffusion equation, which achieves higher-order accuracy on regular quadrilateral grids. Finally, a Jacobian-Free Newton-Krylov solver with the implicit solver (a low-order Jacobian approximately inverted by a multi-color Gauss-Seidel relaxation scheme) used as a variable preconditioner is recommended for practical computations, which provides robust and efficient convergence for a wide range of α.

  13. An approximate block Newton method for coupled iterations of nonlinear solvers: Theory and conjugate heat transfer applications

    NASA Astrophysics Data System (ADS)

    Yeckel, Andrew; Lun, Lisa; Derby, Jeffrey J.

    2009-12-01

    A new, approximate block Newton (ABN) method is derived and tested for the coupled solution of nonlinear models, each of which is treated as a modular, black box. Such an approach is motivated by a desire to maintain software flexibility without sacrificing solution efficiency or robustness. Though block Newton methods of similar type have been proposed and studied, we present a unique derivation and use it to sort out some of the more confusing points in the literature. In particular, we show that our ABN method behaves like a Newton iteration preconditioned by an inexact Newton solver derived from subproblem Jacobians. The method is demonstrated on several conjugate heat transfer problems modeled after melt crystal growth processes. These problems are represented by partitioned spatial regions, each modeled by independent heat transfer codes and linked by temperature and flux matching conditions at the boundaries common to the partitions. Whereas a typical block Gauss-Seidel iteration fails about half the time for the model problem, quadratic convergence is achieved by the ABN method under all conditions studied here. Additional performance advantages over existing methods are demonstrated and discussed.

  14. Modeling hemodynamics in intracranial aneurysms: Comparing accuracy of CFD solvers based on finite element and finite volume schemes.

    PubMed

    Botti, Lorenzo; Paliwal, Nikhil; Conti, Pierangelo; Antiga, Luca; Meng, Hui

    2018-06-01

    Image-based computational fluid dynamics (CFD) has shown potential to aid in the clinical management of intracranial aneurysms (IAs) but its adoption in the clinical practice has been missing, partially due to lack of accuracy assessment and sensitivity analysis. To numerically solve the flow-governing equations CFD solvers generally rely on two spatial discretization schemes: Finite Volume (FV) and Finite Element (FE). Since increasingly accurate numerical solutions are obtained by different means, accuracies and computational costs of FV and FE formulations cannot be compared directly. To this end, in this study we benchmark two representative CFD solvers in simulating flow in a patient-specific IA model: (1) ANSYS Fluent, a commercial FV-based solver and (2) VMTKLab multidGetto, a discontinuous Galerkin (dG) FE-based solver. The FV solver's accuracy is improved by increasing the spatial mesh resolution (134k, 1.1m, 8.6m and 68.5m tetrahedral element meshes). The dGFE solver accuracy is increased by increasing the degree of polynomials (first, second, third and fourth degree) on the base 134k tetrahedral element mesh. Solutions from best FV and dGFE approximations are used as baseline for error quantification. On average, velocity errors for second-best approximations are approximately 1cm/s for a [0,125]cm/s velocity magnitude field. Results show that high-order dGFE provides better accuracy per degree of freedom but worse accuracy per Jacobian non-zero entry as compared to FV. Cross-comparison of velocity errors demonstrates asymptotic convergence of both solvers to the same numerical solution. Nevertheless, the discrepancy between under-resolved velocity fields suggests that mesh independence is reached following different paths.

  15. Approximating the Generalized Voronoi Diagram of Closely Spaced Objects

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Edwards, John; Daniel, Eric; Pascucci, Valerio

    2015-06-22

    We present an algorithm to compute an approximation of the generalized Voronoi diagram (GVD) on arbitrary collections of 2D or 3D geometric objects. In particular, we focus on datasets with closely spaced objects; GVD approximation is expensive and sometimes intractable on these datasets using previous algorithms. With our approach, the GVD can be computed using commodity hardware even on datasets with many, extremely tightly packed objects. Our approach is to subdivide the space with an octree that is represented with an adjacency structure. We then use a novel adaptive distance transform to compute the distance function on octree vertices. The computed distance field is sampled more densely in areas of close object spacing, enabling robust and parallelizable GVD surface generation. We demonstrate our method on a variety of data and show example applications of the GVD in 2D and 3D.

  16. Approximate Bayesian computation for spatial SEIR(S) epidemic models.

    PubMed

    Brown, Grant D; Porter, Aaron T; Oleson, Jacob J; Hinman, Jessica A

    2018-02-01

    Approximate Bayesian Computation (ABC) provides an attractive approach to estimation in complex Bayesian inferential problems for which evaluation of the kernel of the posterior distribution is impossible or computationally expensive. These highly parallelizable techniques have been successfully applied to many fields, particularly in cases where more traditional approaches such as Markov chain Monte Carlo (MCMC) are impractical. In this work, we demonstrate the application of approximate Bayesian inference to spatially heterogeneous Susceptible-Exposed-Infectious-Removed (SEIR) stochastic epidemic models. These models have a tractable posterior distribution; however, MCMC techniques nevertheless become computationally infeasible for moderately sized problems. We discuss the practical implementation of these techniques via the open source ABSEIR package for R. The performance of ABC relative to traditional MCMC methods in a small problem is explored under simulation, as well as in the spatially heterogeneous context of the 2014 epidemic of Chikungunya in the Americas.
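
    The basic idea behind ABC can be illustrated with a minimal rejection-sampling sketch on a toy binomial model. This is not the spatial SEIR model or the ABSEIR package described above; the forward model, uniform prior, tolerance, and all function names are assumptions made for illustration.

```python
import random


def simulate_successes(p, n_trials, rng):
    """Forward model: number of successes in n_trials Bernoulli(p) draws."""
    return sum(1 for _ in range(n_trials) if rng.random() < p)


def abc_rejection(observed, n_trials, n_draws, tolerance, rng):
    """Keep prior draws whose simulated data fall within tolerance of the data."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw a candidate from a Uniform(0, 1) prior
        simulated = simulate_successes(p, n_trials, rng)
        # Accept without ever evaluating a likelihood.
        if abs(simulated - observed) <= tolerance:
            accepted.append(p)
    return accepted


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = abc_rejection(observed=30, n_trials=100, n_draws=20000,
                        tolerance=2, rng=rng)
posterior_mean = sum(samples) / len(samples)
```

    Each candidate is simulated independently of the others, so the loop could be split across workers with no communication until the accepted draws are pooled. That independence is what makes this family of methods straightforward to parallelize.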

  17. A spectral Poisson solver for kinetic plasma simulation

    NASA Astrophysics Data System (ADS)

    Szeremley, Daniel; Obberath, Jens; Brinkmann, Ralf

    2011-10-01

    Plasma resonance spectroscopy is a well established plasma diagnostic method, realized in several designs. One of these designs is the multipole resonance probe (MRP). In its idealized - geometrically simplified - version it consists of two dielectrically shielded, hemispherical electrodes to which an RF signal is applied. A numerical tool is under development which is capable of simulating the dynamics of the plasma surrounding the MRP in electrostatic approximation. In this contribution we concentrate on the specialized Poisson solver for that tool. The plasma is represented by an ensemble of point charges. By expanding both the charge density and the potential into spherical harmonics, a largely analytical solution of the Poisson problem can be employed. For a practical implementation, the expansion must be appropriately truncated. With this spectral solver we are able to efficiently solve the Poisson equation in a kinetic plasma simulation without the need of introducing a spatial discretization.

  18. MILAMIN 2 - Fast MATLAB FEM solver

    NASA Astrophysics Data System (ADS)

    Dabrowski, Marcin; Krotkiewski, Marcin; Schmid, Daniel W.

    2013-04-01

    MILAMIN is a free and efficient MATLAB-based two-dimensional FEM solver utilizing unstructured meshes [Dabrowski et al., G-cubed (2008)]. The code consists of steady-state thermal diffusion and incompressible Stokes flow solvers implemented in approximately 200 lines of native MATLAB code. The brevity makes the code easily customizable. An important quality of MILAMIN is speed - it can handle millions of nodes within minutes on one CPU core of a standard desktop computer, and is faster than many commercial solutions. The new MILAMIN 2 allows three-dimensional modeling. It is designed as a set of functional modules that can be used as building blocks for efficient FEM simulations using MATLAB. The utilities are largely implemented as native MATLAB functions. For performance critical parts we use MUTILS - a suite of compiled MEX functions optimized for shared memory multi-core computers. The most important features of MILAMIN 2 are: (1) modular approach to defining, tracking, and discretizing the geometry of the model; (2) interfaces to external mesh generators (e.g., Triangle, Fade2d, T3D) and mesh utilities (e.g., element type conversion, fast point location, boundary extraction); (3) efficient computation of the stiffness matrix for a wide range of element types, anisotropic materials and three-dimensional problems; (4) fast global matrix assembly using a dedicated MEX function; (5) automatic integration rules; (6) flexible prescription (spatial, temporal, and field functions) and efficient application of Dirichlet, Neumann, and periodic boundary conditions; (7) treatment of transient and non-linear problems; (8) various iterative and multi-level solution strategies; (9) post-processing tools (e.g., numerical integration); (10) visualization primitives using MATLAB, and VTK export functions. We provide a large number of examples that show how to implement a custom FEM solver using the MILAMIN 2 framework. The examples are MATLAB scripts of increasing complexity that address a given

  19. An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling

    DOE PAGES

    Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois-Henry; ...

    2016-10-27

    Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7 fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.

  20. Riemann Solvers in Relativistic Hydrodynamics: Basics and Astrophysical Applications

    NASA Astrophysics Data System (ADS)

    Ibanez, Jose M.

    2001-12-01

    My contribution to these proceedings gives a general overview of High Resolution Shock Capturing (HRSC) methods in the field of relativistic hydrodynamics, with special emphasis on Riemann solvers. HRSC techniques achieve highly accurate numerical approximations (formally second order or better) in smooth regions of the flow, and capture the motion of unresolved steep gradients without creating spurious oscillations. In the first part I will show how these techniques have been extended to relativistic hydrodynamics, making it possible to explore some challenging astrophysical scenarios. I will review recent literature concerning the main properties of different special relativistic Riemann solvers, and discuss several 1D and 2D test problems which are commonly used to evaluate the performance of numerical methods in relativistic hydrodynamics. In the second part I will illustrate the use of HRSC methods in several astrophysical applications where special and general relativistic hydrodynamical processes play a crucial role.

  1. Analysis Tools for CFD Multigrid Solvers

    NASA Technical Reports Server (NTRS)

    Mineck, Raymond E.; Thomas, James L.; Diskin, Boris

    2004-01-01

    Analysis tools are needed to guide the development and evaluate the performance of multigrid solvers for the fluid flow equations. Classical analysis tools, such as local mode analysis, often fail to accurately predict performance. Two-grid analysis tools, herein referred to as Idealized Coarse Grid and Idealized Relaxation iterations, have been developed and evaluated within a pilot multigrid solver. These new tools are applicable to general systems of equations and/or discretizations and point to problem areas within an existing multigrid solver. Idealized Relaxation and Idealized Coarse Grid are applied in developing textbook-efficient multigrid solvers for incompressible stagnation flow problems.

  2. An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

    DOE PAGES

    Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; ...

    2017-10-17

    Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.

  3. An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.

    Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.

  4. An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

    NASA Astrophysics Data System (ADS)

    Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; Esarey, E.; Leemans, W. P.

    2018-01-01

    Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.

  5. Tangent Adjoint Methods In a Higher-Order Space-Time Discontinuous-Galerkin Solver For Turbulent Flows

    NASA Technical Reports Server (NTRS)

    Diosady, Laslo; Murman, Scott; Blonigan, Patrick; Garai, Anirban

    2017-01-01

    Presented a space-time adjoint solver for turbulent compressible flows. Confirmed the failure of traditional sensitivity methods for chaotic flows. Assessed the rate of exponential growth of the adjoint for a practical 3D turbulent simulation. Demonstrated the failure of short-window sensitivity approximations.

  6. WIND Flow Solver Released

    NASA Technical Reports Server (NTRS)

    Towne, Charles E.

    1999-01-01

    The WIND code is a general-purpose, structured, multizone, compressible flow solver that can be used to analyze steady or unsteady flow for a wide range of geometric configurations and over a wide range of flow conditions. WIND is the latest product of the NPARC Alliance, a formal partnership between the NASA Lewis Research Center and the Air Force Arnold Engineering Development Center (AEDC). WIND Version 1.0 was released in February 1998, and Version 2.0 will be released in February 1999. The WIND code represents a merger of the capabilities of three existing computational fluid dynamics codes--NPARC (the original NPARC Alliance flow solver), NXAIR (an Air Force code used primarily for unsteady store separation problems), and NASTD (the primary flow solver at McDonnell Douglas, now part of Boeing).

  7. A mixed fluid-kinetic solver for the Vlasov-Poisson equations

    NASA Astrophysics Data System (ADS)

    Cheng, Yongtao

    Plasmas are ionized gases that appear in a wide range of applications including astrophysics and space physics, as well as in laboratory settings such as in magnetically confined fusion. There are two prevailing types of modeling strategies to describe a plasma system: kinetic models and fluid models. Kinetic models evolve particle probability density distributions (PDFs) in phase space, which are accurate but computationally expensive. Fluid models evolve a small number of moments of the distribution function and reduce the dimension of the solution. However, some approximation is necessary to close the system, and finding an accurate moment closure that correctly captures the dynamics away from thermodynamic equilibrium is a difficult and still open problem. The main contributions of the present work can be divided into two main parts: (1) a new class of moment closures, based on a modification of existing quadrature-based moment-closure methods, is developed using bi-B-spline and bi-bubble representations; and (2) a novel mixed solver that combines a fluid and a kinetic solver is proposed, which uses the new class of moment-closure methods described in the first part. For the newly developed quadrature-based moment-closure based on bi-B-spline and bi-bubble representation, the explicit form of flux terms and the moment-realizability conditions are given. It is shown that while the bi-delta system is weakly hyperbolic, the newly proposed fluid models are strongly hyperbolic. Using a high-order Runge-Kutta discontinuous Galerkin method together with Strang operator splitting, the resulting models are applied to the Vlasov-Poisson-Fokker-Planck system in the high field limit. In the second part of this work, results from kinetic solver are used to provide a corrected closure to the fluid model. This correction keeps the fluid model hyperbolic and gives fluid results that match the moments as computed from the kinetic solution. 
Furthermore, a prolongation operation

  8. Efficient three-dimensional Poisson solvers in open rectangular conducting pipe

    NASA Astrophysics Data System (ADS)

    Qiang, Ji

    2016-06-01

    The three-dimensional (3D) Poisson solver plays an important role in the study of space-charge effects on charged particle beam dynamics in particle accelerators. In this paper, we propose three new 3D Poisson solvers for a charged particle beam in an open rectangular conducting pipe. These three solvers include a spectral integrated Green function (IGF) solver, a 3D spectral solver, and a 3D integrated Green function solver. These solvers effectively handle the longitudinal open boundary condition using a finite computational domain that contains the beam itself. This saves the computational cost of using an extra larger longitudinal domain in order to set up an appropriate finite boundary condition. Using an integrated Green function also avoids the need to resolve rapid variation of the Green function inside the beam. The numerical operational cost of the spectral IGF solver and the 3D IGF solver scales as O(N log N), where N is the number of grid points. The cost of the 3D spectral solver scales as O(N_n N), where N_n is the maximum longitudinal mode number. We compare these three solvers using several numerical examples and discuss the advantageous regime of each solver in the physical application.
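
    The O(N log N) scaling quoted above comes from the fast Fourier transform. The following minimal sketch shows that mechanism on a much simpler problem than the one in the abstract: a one-dimensional Poisson equation -u'' = f with periodic boundaries, not the open rectangular pipe and not an integrated Green function method. The small radix-2 FFT and all names are assumptions written out so the example has no dependencies.

```python
import cmath
import math


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    sign = 1.0 if inverse else -1.0
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def solve_periodic_poisson(f_values, length):
    """Solve -u'' = f on a periodic interval, returning the zero-mean solution."""
    n = len(f_values)
    f_hat = fft(f_values)
    u_hat = [0j] * n  # mode 0 stays zero, which fixes the mean of u
    for j in range(1, n):
        mode = j if j <= n // 2 else j - n      # signed integer mode number
        k = 2.0 * math.pi * mode / length       # physical wavenumber
        u_hat[j] = f_hat[j] / (k * k)           # -u'' = f becomes k^2 u_hat = f_hat
    # Inverse transform; divide by n because fft() is unnormalized.
    return [value.real / n for value in fft(u_hat, inverse=True)]


n = 64
length = 2.0 * math.pi
grid = [length * i / n for i in range(n)]
u = solve_periodic_poisson([math.sin(3.0 * x) for x in grid], length)
# The exact zero-mean solution for f = sin(3x) is sin(3x) / 9.
max_error = max(abs(ui - math.sin(3.0 * x) / 9.0) for ui, x in zip(u, grid))
```

    In Fourier space the differential operator becomes a division per mode, so the two transforms dominate the cost. Open boundary conditions, as treated in the paper, require a different formulation that this periodic sketch does not cover.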

  9. Relaxation approximations to second-order traffic flow models by high-resolution schemes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nikolos, I.K.; Delis, A.I.; Papageorgiou, M.

    2015-03-10

    A relaxation-type approximation of second-order non-equilibrium traffic models, written in conservation or balance law form, is considered. Using the relaxation approximation, the nonlinear equations are transformed to a semi-linear diagonalizable problem with linear characteristic variables and stiff source terms, with the attractive feature that neither Riemann solvers nor characteristic decompositions are needed. In particular, it is only necessary to provide the flux and source term functions and an estimate of the characteristic speeds. To discretize the resulting relaxation system, high-resolution reconstructions in space are considered. Emphasis is placed on a fifth-order WENO scheme and its performance. The computations reported demonstrate the simplicity and versatility of relaxation schemes as numerical solvers.
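
    As a rough illustration of why only the flux function and a bound on the characteristic speeds are required, here is a first-order relaxed scheme for a scalar traffic model. Several assumptions are made that are not in the record: a scalar Greenshields (LWR) flux stands in for the second-order models, the road is periodic, the relaxation parameter is taken to its zero limit, and a first-order upwind reconstruction replaces the fifth-order WENO one. The names flux and relaxed_step are illustrative.

```python
def flux(rho):
    """Greenshields (LWR) traffic flux with unit maximum speed and density."""
    return rho * (1.0 - rho)


def relaxed_step(rho, dt, dx, a=1.0):
    """One first-order relaxed-scheme step on a periodic road.

    Upwinding the characteristic variables v + a*rho and v - a*rho, then
    setting v = flux(rho), gives the interface flux used below. The speed
    bound a must satisfy a >= |flux'(rho)| and a*dt/dx <= 1.
    """
    n = len(rho)
    f = [flux(r) for r in rho]
    # face[i] is the numerical flux between cell i and cell i+1 (periodic).
    face = [
        0.5 * (f[i] + f[(i + 1) % n]) - 0.5 * a * (rho[(i + 1) % n] - rho[i])
        for i in range(n)
    ]
    # face[i - 1] wraps to the last face when i == 0.
    return [rho[i] - dt / dx * (face[i] - face[i - 1]) for i in range(n)]
```

    No Riemann solver or eigen-decomposition appears anywhere: the update needs only flux evaluations and the constant a.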

  10. Exploiting Non-sequence Data in Dynamic Model Learning

    DTIC Science & Technology

    2013-10-01

    For our experiments here and in Section 3.5, we implement the proposed algorithms in MATLAB and use the maximum directed spanning tree solver...embarrassingly parallelizable, whereas PM’s maximum directed spanning tree procedure is harder to parallelize. In this experiment, our MATLAB ...some estimation problems, this approach is able to give unique and consistent estimates while the maximum-likelihood method gets entangled in

  11. Multidimensional Riemann problem with self-similar internal structure - part III - a multidimensional analogue of the HLLI Riemann solver for conservative hyperbolic systems

    NASA Astrophysics Data System (ADS)

    Balsara, Dinshaw S.; Nkonga, Boniface

    2017-10-01

    Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The fastest way of endowing such sub-structure consists of making a multidimensional extension of the HLLI Riemann solver for hyperbolic conservation laws. Presenting such a multidimensional analogue of the HLLI Riemann solver with linear sub-structure for use on structured meshes is the goal of this work. The multidimensional MuSIC Riemann solver documented here is universal in the sense that it can be applied to any hyperbolic conservation law. The multidimensional Riemann solver is made to be consistent with constraints that emerge naturally from the Galerkin projection of the self-similar states within the wave model. When the full eigenstructure in both directions is used in the present Riemann solver, it becomes a complete Riemann solver in a multidimensional sense. I.e., all the intermediate waves are represented in the multidimensional wave model. The work also presents, for the very first time, an important analysis of the dissipation characteristics of multidimensional Riemann solvers. The present Riemann solver results in the most efficient implementation of a multidimensional Riemann solver with sub-structure. Because it preserves stationary linearly degenerate waves, it might also help with well-balancing. Implementation-related details are presented in pointwise fashion for the one-dimensional HLLI Riemann solver as well as the multidimensional MuSIC Riemann solver.

  12. Parallel Solver for H(div) Problems Using Hybridization and AMG

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Chak S.; Vassilevski, Panayot S.

    2016-01-15

    In this paper, a scalable parallel solver is proposed for H(div) problems discretized by arbitrary order finite elements on general unstructured meshes. The solver is based on hybridization and algebraic multigrid (AMG). Unlike some previously studied H(div) solvers, the hybridization solver does not require discrete curl and gradient operators as additional input from the user. Instead, only some element information is needed in the construction of the solver. The hybridization results in an H1-equivalent symmetric positive definite system, which is then rescaled and solved by AMG solvers designed for H1 problems. Weak and strong scaling of the method are examined through several numerical tests. Our numerical results show that the proposed solver provides a promising alternative to ADS, a state-of-the-art solver [12], for H(div) problems. In fact, it outperforms ADS for higher order elements.

  13. NHDS: The New Hampshire Dispersion Relation Solver

    NASA Astrophysics Data System (ADS)

    Verscharen, Daniel; Chandran, Benjamin D. G.

    2018-04-01

    NHDS is the New Hampshire Dispersion Relation Solver. This article describes the numerics of the solver and its capabilities. The code is available for download on https://github.com/danielver02/NHDS.

  14. TH-A-9A-02: BEST IN PHYSICS (THERAPY) - 4D IMRT Planning Using Highly- Parallelizable Particle Swarm Optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Modiri, A; Gu, X; Sawant, A

    2014-06-15

    Purpose: We present a particle swarm optimization (PSO)-based 4D IMRT planning technique designed for dynamic MLC tracking delivery to lung tumors. The key idea is to utilize the temporal dimension as an additional degree of freedom rather than a constraint in order to achieve improved sparing of organs at risk (OARs). Methods: The target and normal structures were manually contoured on each of the ten phases of a 4DCT scan acquired from a lung SBRT patient who exhibited 1.5 cm tumor motion despite the use of abdominal compression. Ten corresponding IMRT plans were generated using the Eclipse treatment planning system. These plans served as initial guess solutions for the PSO algorithm. Fluence weights were optimized over the entire solution space, i.e., 10 phases × 12 beams × 166 control points. The size of the solution space motivated our choice of PSO, which is a highly parallelizable stochastic global optimization technique that is well-suited for such large problems. A summed fluence map was created using an in-house B-spline deformable image registration. Each plan was compared with a corresponding, internal target volume (ITV)-based IMRT plan. Results: The PSO 4D IMRT plan yielded comparable PTV coverage and significantly higher dose sparing for parallel and serial OARs compared to the ITV-based plan. The dose-sparing achieved via PSO-4DIMRT was: lung Dmean = 28%; lung V20 = 90%; spinal cord Dmax = 23%; esophagus Dmax = 31%; heart Dmax = 51%; heart Dmean = 64%. Conclusion: Truly 4D IMRT that uses the temporal dimension as an additional degree of freedom can achieve significant dose sparing of serial and parallel OARs. Given the large solution space, PSO represents an attractive, parallelizable tool to achieve globally optimal solutions for such problems. This work was supported through funding from the National Institutes of Health and Varian Medical Systems. Amit Sawant has research funding from Varian Medical Systems, VisionRT Ltd. and Elekta.
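
    To make the "highly parallelizable" remark concrete, the sketch below shows a generic synchronous PSO loop in which all particles are scored in one independent batch per iteration. It is not the authors' planning code: the objective is a toy function, the inertia and acceleration weights are common textbook values assumed here, and the name pso_minimize is illustrative.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=200, seed=0):
    """Minimize objective over R^dim with a basic synchronous particle swarm."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social weights (assumed)
    pos = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        # Move every particle using its own best and the swarm best.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
        # Scoring is independent per particle, so this batch can be farmed
        # out to parallel workers; it is serial here for simplicity.
        values = [objective(p) for p in pos]
        for i, val in enumerate(values):
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
            if val < g_val:
                g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

    A call such as pso_minimize(lambda p: sum(x * x for x in p), dim=3) drives the swarm toward the origin; in a planning setting the objective would instead score a candidate set of fluence weights.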

  15. MACSYMA's symbolic ordinary differential equation solver

    NASA Technical Reports Server (NTRS)

    Golden, J. P.

    1977-01-01

    MACSYMA's symbolic ordinary differential equation solver, ODE2, is described. The code for this routine is delineated, which is of interest because it is written in top-level MACSYMA language and may serve as a good example of programming in that language. Other symbolic ordinary differential equation solvers are mentioned.

  16. A Comparative Study of Randomized Constraint Solvers for Random-Symbolic Testing

    NASA Technical Reports Server (NTRS)

    Takaki, Mitsuo; Cavalcanti, Diego; Gheyi, Rohit; Iyoda, Juliano; dAmorim, Marcelo; Prudencio, Ricardo

    2009-01-01

    The complexity of constraints is a major obstacle for constraint-based software verification. Automatic constraint solvers are fundamentally incomplete: input constraints often build on some undecidable theory or some theory the solver does not support. This paper proposes and evaluates several randomized solvers to address this issue. We compare the effectiveness of a symbolic solver (CVC3), a random solver, three hybrid solvers (i.e., mix of random and symbolic), and two heuristic search solvers. We evaluate the solvers on two benchmarks: one consisting of manually generated constraints and another generated with a concolic execution of 8 subjects. In addition to fully decidable constraints, the benchmarks include constraints with non-linear integer arithmetic, integer modulo and division, bitwise arithmetic, and floating-point arithmetic. As expected symbolic solving (in particular, CVC3) subsumes the other solvers for the concolic execution of subjects that only generate decidable constraints. For the remaining subjects the solvers are complementary.

  17. Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara

    In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss-Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MG Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactor problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.
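
    The contrast between PI and RQI can be seen on a small dense matrix. The sketch below is a toy under stated assumptions (a symmetric matrix and a direct Gaussian-elimination solve for the shifted system); it is unrelated to the Denovo implementation, where the shifted systems are handled by the MG Krylov solver. All function names are illustrative.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def normalize(x):
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def rayleigh(A, x):
    """Rayleigh quotient and residual norm ||Ax - rho*x|| for a unit vector."""
    Ax = matvec(A, x)
    rho = sum(a * b for a, b in zip(Ax, x))
    res = math.sqrt(sum((a - rho * b) ** 2 for a, b in zip(Ax, x)))
    return rho, res


def solve(M, rhs):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        if aug[col][col] == 0.0:
            aug[col][col] = 1e-14  # nudge an exactly singular shifted pivot
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def power_iteration(A, x, tol=1e-10, max_iters=1000):
    x = normalize(x)
    for it in range(max_iters):
        rho, res = rayleigh(A, x)
        if res < tol:
            return rho, it
        x = normalize(matvec(A, x))
    return rho, max_iters


def rayleigh_quotient_iteration(A, x, tol=1e-10, max_iters=50):
    x = normalize(x)
    n = len(x)
    for it in range(max_iters):
        rho, res = rayleigh(A, x)
        if res < tol:
            return rho, it
        # The shifted matrix A - rho*I becomes ill-conditioned as rho converges.
        shifted = [[A[i][j] - (rho if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        x = normalize(solve(shifted, x))
    return rho, max_iters
```

    On the matrix [[2, 1], [1, 2]] with starting vector [1, 0.5], both routines converge to the eigenvalue 3, with RQI needing markedly fewer iterations; the price is one increasingly ill-conditioned linear solve per step, which is the difficulty the abstract attributes to RQI.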

  18. Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines

    DOE PAGES

    Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara; ...

    2018-02-20

    In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss-Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MG Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactor problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.

  19. Galerkin CFD solvers for use in a multi-disciplinary suite for modeling advanced flight vehicles

    NASA Astrophysics Data System (ADS)

    Moffitt, Nicholas J.

    This work extends existing Galerkin CFD solvers for use in a multi-disciplinary suite. The suite is proposed as a means of modeling advanced flight vehicles, which exhibit strong coupling between aerodynamics, structural dynamics, controls, rigid body motion, propulsion, and heat transfer. Such applications include aeroelastics, aeroacoustics, stability and control, and other highly coupled applications. The suite uses NASA STARS for modeling structural dynamics and heat transfer. Aerodynamics, propulsion, and rigid body dynamics are modeled in one of the five CFD solvers below. Euler2D and Euler3D are Galerkin CFD solvers created at OSU by Cowan (2003). These solvers are capable of modeling compressible inviscid aerodynamics with modal elastics and rigid body motion. This work reorganized these solvers to improve efficiency during editing and at run time. Simple and efficient propulsion models were added, including rocket, turbojet, and scramjet engines. Viscous terms were added to the previous solvers to create NS2D and NS3D. The viscous contributions were demonstrated in the inertial and non-inertial frames. Variable viscosity (Sutherland's equation) and heat transfer boundary conditions were added to both solvers but not verified in this work. Two turbulence models were implemented in NS2D and NS3D: the Spalart-Allmaras (SA) model of Deck, et al. (2002) and Menter's SST model (1994). A rotation correction term (Shur, et al., 2000) was added to the production of turbulence. Local time stepping and artificial dissipation were adapted to each model. CFDsol is a Taylor-Galerkin solver with an SA turbulence model. This work improved the time accuracy, far field stability, viscous terms, Sutherland's equation, and SA model with NS3D as a guideline and added the propulsion models from Euler3D to CFDsol. Simple geometries were demonstrated to utilize current meshing and processing capabilities. Air-breathing hypersonic flight vehicles (AHFVs) represent the ultimate

  20. Shallow-water sloshing in a moving vessel with variable cross-section and wetting-drying using an extension of George's well-balanced finite volume solver

    NASA Astrophysics Data System (ADS)

    Alemi Ardakani, Hamid; Bridges, Thomas J.; Turner, Matthew R.

    2016-06-01

    A class of augmented approximate Riemann solvers due to George (2008) [12] is extended to solve the shallow-water equations in a moving vessel with variable bottom topography and variable cross-section with wetting and drying. A class of Roe-type upwind solvers for the system of balance laws is derived which respects the steady-state solutions. The numerical solutions of the new adapted augmented f-wave solvers are validated against the Roe-type solvers. The theory is extended to solve the shallow-water flows in moving vessels with arbitrary cross-section with influx-efflux boundary conditions motivated by the shallow-water sloshing in the ocean wave energy converter (WEC) proposed by Offshore Wave Energy Ltd. (OWEL) [1]. A fractional step approach is used to handle the time-dependent forcing functions. The numerical solutions are compared to an extended new Roe-type solver for the system of balance laws with a time-dependent source function. The shallow-water sloshing finite volume solver can be coupled to a Runge-Kutta integrator for the vessel motion.

  1. BCYCLIC: A parallel block tridiagonal matrix cyclic solver

    NASA Astrophysics Data System (ADS)

    Hirshman, S. P.; Perumalla, K. S.; Lynch, V. E.; Sanchez, R.

    2010-09-01

    A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved using multithreaded routines (OpenMP, GotoBLAS) for block matrix manipulation. This dual scalability is a noteworthy feature of this new solver, as well as its ability to efficiently handle arbitrary (non-powers-of-2) block row and processor numbers. Comparison with a state-of-the-art parallel sparse solver is presented. It is expected that this new solver will allow many physical applications to optimally use the parallel resources on current supercomputers. Example usage of the solver in magneto-hydrodynamic (MHD), three-dimensional equilibrium solvers for high-temperature fusion plasmas is cited.
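
    The structure of cyclic reduction is easiest to see in the scalar case. The sketch below is a simplified serial illustration under assumptions not taken from BCYCLIC itself: scalar rather than block entries, no stored factorization for reuse across right-hand sides, and a diagonally dominant system so that no pivoting is needed. The name cyclic_reduction is illustrative. It does accept row counts that are not powers of two.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for x.

    a[0] and c[-1] must be zero. Odd-indexed unknowns are eliminated,
    leaving a half-size tridiagonal system in the even-indexed ones.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    ra, rb, rc, rd = [], [], [], []
    # Each even row is reduced independently of the others, which is what
    # makes this stage easy to parallelize.
    for i in range(0, n, 2):
        alpha = a[i] / b[i - 1] if i > 0 else 0.0
        gamma = c[i] / b[i + 1] if i + 1 < n else 0.0
        lo_a, lo_c, lo_d = (a[i - 1], c[i - 1], d[i - 1]) if i > 0 else (0.0, 0.0, 0.0)
        hi_a, hi_c, hi_d = (a[i + 1], c[i + 1], d[i + 1]) if i + 1 < n else (0.0, 0.0, 0.0)
        ra.append(-alpha * lo_a)
        rb.append(b[i] - alpha * lo_c - gamma * hi_a)
        rc.append(-gamma * hi_c)
        rd.append(d[i] - alpha * lo_d - gamma * hi_d)
    x = [0.0] * n
    x[0::2] = cyclic_reduction(ra, rb, rc, rd)
    # Back-substitute the odd unknowns from their already-known neighbours.
    for i in range(1, n, 2):
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - a[i] * x[i - 1] - right) / b[i]
    return x
```

    Each recursion level halves the number of unknowns, so the depth grows only logarithmically with the number of rows, while the work within a level is independent across rows.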

  2. Development of a steady potential solver for use with linearized, unsteady aerodynamic analyses

    NASA Technical Reports Server (NTRS)

    Hoyniak, Daniel; Verdon, Joseph M.

    1991-01-01

    A full potential steady flow solver (SFLOW) developed explicitly for use with an inviscid unsteady aerodynamic analysis (LINFLO) is described. The steady solver uses the nonconservative form of the nonlinear potential flow equations together with an implicit, least squares, finite difference approximation to solve for the steady flow field. The difference equations were developed on a composite mesh which consists of a C grid embedded in a rectilinear (H grid) cascade mesh. The composite mesh is capable of resolving blade to blade and far field phenomena on the H grid, while accurately resolving local phenomena on the C grid. The resulting system of algebraic equations is arranged in matrix form using a sparse matrix package and solved by Newton's method. Steady and unsteady results are presented for two cascade configurations: a high speed compressor and a turbine with high exit Mach number.

  3. Differences in the Processes of Solving Physics Problems between Good Physics Problem Solvers and Poor Physics Problem Solvers.

    ERIC Educational Resources Information Center

    Finegold, M.; Mass, R.

    1985-01-01

    Good problem solvers and poor problem solvers in advanced physics (N=8) were significantly different in their ability in translating, planning, and physical reasoning, as well as in problem solving time; no differences in reliance on algebraic solutions and checking problems were noted. Implications for physics teaching are discussed. (DH)

  4. Shape reanalysis and sensitivities utilizing preconditioned iterative boundary solvers

    NASA Technical Reports Server (NTRS)

    Guru Prasad, K.; Kane, J. H.

    1992-01-01

    The computational advantages associated with the utilization of preconditioned iterative equation solvers are quantified for the reanalysis of perturbed shapes using continuum structural boundary element analysis (BEA). Both single- and multi-zone three-dimensional problems are examined. Significant reductions in computer time are obtained by making use of previously computed solution vectors and preconditioners in subsequent analyses. The effectiveness of this technique is demonstrated for the computation of shape response sensitivities required in shape optimization. Computer times and accuracies achieved using the preconditioned iterative solvers are compared with those obtained via direct solvers and implicit differentiation of the boundary integral equations. It is concluded that this approach employing preconditioned iterative equation solvers in reanalysis and sensitivity analysis can be competitive with, if not superior to, those involving direct solvers.

  5. General purpose nonlinear system solver based on Newton-Krylov method.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2013-12-01

    KINSOL is part of a software family called SUNDIALS: SUite of Nonlinear and Differential/Algebraic equation Solvers [1]. KINSOL is a general-purpose nonlinear system solver based on Newton-Krylov and fixed-point solver technologies [2].

  6. A two-dimensional Riemann solver with self-similar sub-structure - Alternative formulation based on least squares projection

    NASA Astrophysics Data System (ADS)

    Balsara, Dinshaw S.; Vides, Jeaniffer; Gurski, Katharine; Nkonga, Boniface; Dumbser, Michael; Garain, Sudip; Audit, Edouard

    2016-01-01

    Just as the quality of a one-dimensional approximate Riemann solver is improved by the inclusion of internal sub-structure, the quality of a multidimensional Riemann solver is also similarly improved. Such multidimensional Riemann problems arise when multiple states come together at the vertex of a mesh. The interaction of the resulting one-dimensional Riemann problems gives rise to a strongly-interacting state. We wish to endow this strongly-interacting state with physically-motivated sub-structure. The self-similar formulation of Balsara [16] proves especially useful for this purpose. While that work is based on a Galerkin projection, in this paper we present an analogous self-similar formulation that is based on a different interpretation. In the present formulation, we interpret the shock jumps at the boundary of the strongly-interacting state quite literally. The enforcement of the shock jump conditions is done with a least squares projection (Vides, Nkonga and Audit [67]). With that interpretation, we again show that the multidimensional Riemann solver can be endowed with sub-structure. However, we find that the most efficient implementation arises when we use a flux vector splitting and a least squares projection. An alternative formulation that is based on the full characteristic matrices is also presented. The multidimensional Riemann solvers that are demonstrated here use one-dimensional HLLC Riemann solvers as building blocks. Several stringent test problems drawn from hydrodynamics and MHD are presented to show that the method works. Results from structured and unstructured meshes demonstrate the versatility of our method. The reader is also invited to watch a video introduction to multidimensional Riemann solvers on http://www.nd.edu/ dbalsara/Numerical-PDE-Course.

  7. Detailed analysis of the effects of stencil spatial variations with arbitrary high-order finite-difference Maxwell solver

    DOE PAGES

    Vincenti, H.; Vay, J. -L.

    2015-11-22

    Due to discretization effects and truncation to finite domains, many electromagnetic simulations present non-physical modifications of Maxwell's equations in space that may generate spurious signals affecting the overall accuracy of the result. Such modifications occur, for instance, when Perfectly Matched Layers (PMLs) are used at simulation domain boundaries to simulate open media. Another example is the use of an arbitrary-order Maxwell solver with a domain decomposition technique that may, under some conditions, involve stencil truncations at subdomain boundaries, resulting in small spurious errors that do eventually build up. In each case, a careful evaluation of the characteristics and magnitude of the errors resulting from these approximations, and their impact at any frequency and angle, requires detailed analytical and numerical studies. To this end, we present a general analytical approach that enables the evaluation of numerical discretization errors of a fully three-dimensional arbitrary-order finite-difference Maxwell solver, with arbitrary modification of the local stencil in the simulation domain. The analytical model is validated against simulations of the domain decomposition technique and PMLs, when these are used with very high-order Maxwell solvers, as well as in the infinite-order limit of pseudo-spectral solvers. Results confirm that the new analytical approach enables exact predictions in each case. It also confirms that the domain decomposition technique can be used with a very high-order Maxwell solver and a reasonably low number of guard cells with negligible effects on the overall accuracy of the simulation.

  8. On the use of reverse Brownian motion to accelerate hybrid simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bakarji, Joseph; Tartakovsky, Daniel M., E-mail: tartakovsky@stanford.edu

    Multiscale and multiphysics simulations are two rapidly developing fields of scientific computing. Efficient coupling of continuum (deterministic or stochastic) constitutive solvers with their discrete (stochastic, particle-based) counterparts is a common challenge in both kinds of simulations. We focus on interfacial, tightly coupled simulations of diffusion that combine continuum and particle-based solvers. The latter employs the reverse Brownian motion (rBm), a Monte Carlo approach that allows one to enforce inhomogeneous Dirichlet, Neumann, or Robin boundary conditions and is trivially parallelizable. We discuss numerical approaches for improving the accuracy of rBm in the presence of inhomogeneous Neumann boundary conditions and alternative strategies for coupling the rBm solver with its continuum counterpart. Numerical experiments are used to investigate the convergence, stability, and computational efficiency of the proposed hybrid algorithm.

  9. GENASIS Mathematics : Object-oriented manifolds, operations, and solvers for large-scale physics simulations

    NASA Astrophysics Data System (ADS)

    Cardall, Christian Y.; Budiardja, Reuben D.

    2018-01-01

    The large-scale computer simulation of a system of physical fields governed by partial differential equations requires some means of approximating the mathematical limit of continuity. For example, conservation laws are often treated with a 'finite-volume' approach in which space is partitioned into a large number of small 'cells,' with fluxes through cell faces providing an intuitive discretization modeled on the mathematical definition of the divergence operator. Here we describe and make available Fortran 2003 classes furnishing extensible object-oriented implementations of simple meshes and the evolution of generic conserved currents thereon, along with individual 'unit test' programs and larger example problems demonstrating their use. These classes inaugurate the Mathematics division of our developing astrophysics simulation code GENASIS (General Astrophysical Simulation System), which will be expanded over time to include additional meshing options, mathematical operations, solver types, and solver variations appropriate for many multiphysics applications.

  10. Parallel O(N) Stokes’ solver towards scalable Brownian dynamics of hydrodynamically interacting objects in general geometries

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Xujun; Li, Jiyuan; Jiang, Xikai

    An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. We present a scalable parallel computational approach, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the General geometry Ewald-like method. Our approach employs a highly-efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem results in an O(N) parallel algorithm. We also illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an Immersed Boundary approach.

  11. Parallel O(N) Stokes’ solver towards scalable Brownian dynamics of hydrodynamically interacting objects in general geometries

    DOE PAGES

    Zhao, Xujun; Li, Jiyuan; Jiang, Xikai; ...

    2017-06-29

    An efficient parallel Stokes’ solver is developed towards the complete inclusion of hydrodynamic interactions of Brownian particles in any geometry. A Langevin description of the particle dynamics is adopted, where the long-range interactions are included using a Green’s function formalism. We present a scalable parallel computational approach, where the general geometry Stokeslet is calculated following a matrix-free algorithm using the General geometry Ewald-like method. Our approach employs a highly-efficient iterative finite element Stokes’ solver for the accurate treatment of long-range hydrodynamic interactions within arbitrary confined geometries. A combination of mid-point time integration of the Brownian stochastic differential equation, the parallel Stokes’ solver, and a Chebyshev polynomial approximation for the fluctuation-dissipation theorem results in an O(N) parallel algorithm. We also illustrate the new algorithm in the context of the dynamics of confined polymer solutions in equilibrium and non-equilibrium conditions. Our method is extended to treat suspended finite size particles of arbitrary shape in any geometry using an Immersed Boundary approach.

  12. Acceleration of FDTD mode solver by high-performance computing techniques.

    PubMed

    Han, Lin; Xi, Yanping; Huang, Wei-Ping

    2010-06-21

    A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigenmode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigenmode solver, and yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigenvalue mode solvers are no longer applicable due to memory limitation.

  13. Performance Models for the Spike Banded Linear System Solver

    DOE PAGES

    Manguoglu, Murat; Saied, Faisal; Sameh, Ahmed; ...

    2011-01-01

    With the availability of large-scale parallel platforms comprised of tens of thousands of processors and beyond, there is significant impetus for the development of scalable parallel sparse linear system solvers and preconditioners. An integral part of this design process is the development of performance models capable of predicting performance and providing accurate cost models for the solvers and preconditioners. There has been some work in the past on characterizing performance of the iterative solvers themselves. In this paper, we investigate the problem of characterizing performance and scalability of banded preconditioners. Recent work has demonstrated the superior convergence properties and robustness of banded preconditioners, compared to the state-of-the-art ILU family of preconditioners as well as algebraic multigrid preconditioners. Furthermore, when used in conjunction with efficient banded solvers, banded preconditioners are capable of significantly faster time-to-solution. Our banded solver, the Truncated Spike algorithm, is specifically designed for parallel performance and tolerance to deep memory hierarchies. Its regular structure is also highly amenable to accurate performance characterization. Using these characteristics, we derive the following results in this paper: (i) we develop parallel formulations of the Truncated Spike solver, (ii) we develop a highly accurate pseudo-analytical parallel performance model for our solver, (iii) we show excellent prediction capabilities of our model, based on which we argue the high scalability of our solver. Our pseudo-analytical performance model is based on analytical performance characterization of each phase of our solver. These analytical models are then parameterized using actual runtime information on target platforms. An important consequence of our performance models is that they reveal underlying performance bottlenecks in both serial and parallel formulations. All of our results are validated

  14. Code Verification of the HIGRAD Computational Fluid Dynamics Solver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Van Buren, Kendra L.; Canfield, Jesse M.; Hemez, Francois M.

    2012-05-04

    The purpose of this report is to outline code and solution verification activities applied to HIGRAD, a Computational Fluid Dynamics (CFD) solver of the compressible Navier-Stokes equations developed at the Los Alamos National Laboratory, and used to simulate various phenomena such as the propagation of wildfires and atmospheric hydrodynamics. Code verification efforts, as described in this report, are an important first step to establish the credibility of numerical simulations. They provide evidence that the mathematical formulation is properly implemented without significant mistakes that would adversely impact the application of interest. Highly accurate analytical solutions are derived for four code verification test problems that exercise different aspects of the code. These test problems are referred to as: (i) the quiet start, (ii) the passive advection, (iii) the passive diffusion, and (iv) the piston-like problem. These problems are simulated using HIGRAD with different levels of mesh discretization and the numerical solutions are compared to their analytical counterparts. In addition, the rates of convergence are estimated to verify the numerical performance of the solver. The first three test problems produce numerical approximations as expected. The fourth test problem (piston-like) indicates the extent to which the code is able to simulate a 'mild' discontinuity, which is a condition that would typically be better handled by a Lagrangian formulation. The current investigation concludes that the numerical implementation of the solver performs as expected. The quality of solutions is sufficient to provide credible simulations of fluid flows around wind turbines. The main caveat associated to these findings is the low coverage provided by these four problems, and somewhat limited verification activities. A more comprehensive evaluation of HIGRAD may be beneficial for future studies.
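
    The convergence-rate estimate mentioned in this abstract is commonly computed as an observed order of accuracy from errors on successively refined meshes. The following is a minimal sketch of that calculation, not code from HIGRAD: the function names, the refinement ratio of 2, and the stand-in test problem (a central-difference derivative of sin(x)) are all assumptions chosen for illustration.

```python
import math


def observed_order(err_coarse, err_fine, refinement_ratio=2.0):
    """Observed order p from errors on two meshes: p = log(e_c / e_f) / log(r)."""
    return math.log(err_coarse / err_fine) / math.log(refinement_ratio)


def central_difference_error(h):
    """Error of a central difference for d/dx sin(x) at x = 1 (hypothetical test case)."""
    x = 1.0
    approx = (math.sin(x + h) - math.sin(x - h)) / (2.0 * h)
    return abs(approx - math.cos(x))


# Errors on three "meshes", each refined by a factor of 2.
errors = [central_difference_error(h) for h in (0.1, 0.05, 0.025)]
orders = [observed_order(errors[i], errors[i + 1]) for i in range(len(errors) - 1)]
print(orders)  # values close to the formal order of the scheme, which is 2
```

    If the observed order matches the formal order of the discretization, the implementation is behaving as designed on that problem. A persistent mismatch usually points to a coding error or to a solution that is not smooth enough for the scheme.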

  15. A LAGRANGIAN GAUSS-NEWTON-KRYLOV SOLVER FOR MASS- AND INTENSITY-PRESERVING DIFFEOMORPHIC IMAGE REGISTRATION.

    PubMed

    Mang, Andreas; Ruthotto, Lars

    2017-01-01

    We present an efficient solver for diffeomorphic image registration problems in the framework of Large Deformations Diffeomorphic Metric Mappings (LDDMM). We use an optimal control formulation, in which the velocity field of a hyperbolic PDE needs to be found such that the distance between the final state of the system (the transformed/transported template image) and the observation (the reference image) is minimized. Our solver supports both stationary and non-stationary (i.e., transient or time-dependent) velocity fields. As transformation models, we consider both the transport equation (assuming intensities are preserved during the deformation) and the continuity equation (assuming mass-preservation). We consider the reduced form of the optimal control problem and solve the resulting unconstrained optimization problem using a discretize-then-optimize approach. A key contribution is the elimination of the PDE constraint using a Lagrangian hyperbolic PDE solver. Lagrangian methods rely on the concept of characteristic curves. We approximate these curves using a fourth-order Runge-Kutta method. We also present an efficient algorithm for computing the derivatives of the final state of the system with respect to the velocity field. This allows us to use fast Gauss-Newton based methods. We present quickly converging iterative linear solvers using spectral preconditioners that render the overall optimization efficient and scalable. Our method is embedded into the image registration framework FAIR and, thus, supports the most commonly used similarity measures and regularization functionals. We demonstrate the potential of our new approach using several synthetic and real world test problems with up to 14.7 million degrees of freedom.
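
    The characteristic-curve step described in this abstract can be sketched with a classical fourth-order Runge-Kutta integrator. This is an illustrative one-dimensional sketch only, not the FAIR implementation: the function name `rk4_characteristic`, the scalar position, and the test velocity field v(x, t) = x are assumptions.

```python
import math


def rk4_characteristic(velocity, x0, t0, t1, n_steps):
    """Trace dx/dt = velocity(x, t) from t0 to t1 with classical RK4."""
    h = (t1 - t0) / n_steps
    x, t = x0, t0
    for _ in range(n_steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(x + h * k3, t + h)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return x


# Stationary velocity field v(x, t) = x, whose characteristic is x(t) = x0 * exp(t).
x_end = rk4_characteristic(lambda x, t: x, x0=1.0, t0=0.0, t1=1.0, n_steps=20)
print(x_end, math.e)
```

    A non-stationary field is handled by the same routine, since the velocity callback also receives the time argument.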

  16. Comparing direct and iterative equation solvers in a large structural analysis software system

    NASA Technical Reports Server (NTRS)

    Poole, E. L.

    1991-01-01

    Two direct Choleski equation solvers and two iterative preconditioned conjugate gradient (PCG) equation solvers used in a large structural analysis software system are described. The two direct solvers are implementations of the Choleski method for variable-band matrix storage and sparse matrix storage. The two iterative PCG solvers include the Jacobi conjugate gradient method and an incomplete Choleski conjugate gradient method. The performance of the direct and iterative solvers is compared by solving several representative structural analysis problems. Some key factors affecting the performance of the iterative solvers relative to the direct solvers are identified.
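
    Of the iterative methods named in this abstract, the Jacobi conjugate gradient method is the simplest: conjugate gradients preconditioned by the inverse of the matrix diagonal. The following is a minimal dense-matrix sketch, assuming a symmetric positive definite system. The function name `jacobi_pcg`, the small 3x3 test matrix, and the list-of-lists storage are illustrative assumptions and do not reflect the storage schemes used in the structural analysis system.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b (A symmetric positive definite) by Jacobi-preconditioned CG."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner M^-1
    x = [0.0] * n
    r = list(b)                                   # residual for the zero initial guess
    z = [inv_diag[i] * r[i] for i in range(n)]    # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, iteration
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


# Small symmetric positive definite test system (hypothetical data).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = jacobi_pcg(A, b)
print(x, iterations)
```

    An incomplete Choleski preconditioner would replace only the two lines that apply `inv_diag`; the rest of the iteration is unchanged.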

  17. Application of Nearly Linear Solvers to Electric Power System Computation

    NASA Astrophysics Data System (ADS)

    Grant, Lisa L.

    To meet the future needs of the electric power system, improvements need to be made in the areas of power system algorithms, simulation, and modeling, specifically to achieve a time frame that is useful to industry. If power system time-domain simulations could run in real-time, then system operators would have situational awareness to implement online control and avoid cascading failures, significantly improving power system reliability. Several power system applications rely on the solution of a very large linear system. As the demands on power systems continue to grow, there is a greater computational complexity involved in solving these large linear systems within reasonable time. This project expands on the current work in fast linear solvers, developed for solving symmetric and diagonally dominant linear systems, in order to produce power system specific methods that can be solved in nearly-linear run times. The work explores a new theoretical method that is based on ideas in graph theory and combinatorics. The technique builds a chain of progressively smaller approximate systems with preconditioners based on the system's low stretch spanning tree. The method is compared to traditional linear solvers and shown to reduce the time and iterations required for an accurate solution, especially as the system size increases. A simulation validation is performed, comparing the solution capabilities of the chain method to LU factorization, which is the standard linear solver for power flow. The chain method was successfully demonstrated to produce accurate solutions for power flow simulation on a number of IEEE test cases, and a discussion on how to further improve the method's speed and accuracy is included.

  18. A matrix-form GSM-CFD solver for incompressible fluids and its application to hemodynamics

    NASA Astrophysics Data System (ADS)

    Yao, Jianyao; Liu, G. R.

    2014-10-01

    A GSM-CFD solver for incompressible flows is developed based on the gradient smoothing method (GSM). A matrix-form algorithm and corresponding data structure for GSM are devised to efficiently approximate the spatial gradients of field variables using the gradient smoothing operation. The calculated gradient values on various test fields show that the proposed GSM is capable of exactly reproducing linear fields and achieves second-order accuracy on all kinds of meshes. It is found that the GSM is much more robust to mesh deformation and therefore more suitable for problems with complicated geometries. Integrated with the artificial compressibility approach, the GSM is extended to solve incompressible flows. As an example, the flow simulation of a carotid bifurcation is carried out to show the effectiveness of the proposed GSM-CFD solver. The blood is modeled as an incompressible Newtonian fluid and the vessel is treated as a rigid wall in this paper.

  19. Performance of Nonlinear Finite-Difference Poisson-Boltzmann Solvers

    PubMed Central

    Cai, Qin; Hsieh, Meng-Juei; Wang, Jun; Luo, Ray

    2014-01-01

    We implemented and optimized seven finite-difference solvers for the full nonlinear Poisson-Boltzmann equation in biomolecular applications, including four relaxation methods, one conjugate gradient method, and two inexact Newton methods. The performance of the seven solvers was extensively evaluated with a large number of nucleic acids and proteins. Worth noting is the inexact Newton method in our analysis. We investigated the role of linear solvers in its performance by incorporating the incomplete Cholesky conjugate gradient and the geometric multigrid into its inner linear loop. We tailored and optimized both linear solvers for faster convergence rate. In addition, we explored strategies to optimize the successive over-relaxation method to reduce its convergence failures without too much sacrifice in its convergence rate. Specifically we attempted to adaptively change the relaxation parameter and to utilize the damping strategy from the inexact Newton method to improve the successive over-relaxation method. Our analysis shows that the nonlinear methods accompanied with a functional-assisted strategy, such as the conjugate gradient method and the inexact Newton method, can guarantee convergence in the tested molecules. Especially the inexact Newton method exhibits impressive performance when it is combined with highly efficient linear solvers that are tailored for its special requirement. PMID:24723843
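
    Successive over-relaxation, one of the relaxation methods discussed in this abstract, blends each Gauss-Seidel update with the previous iterate through a relaxation parameter. The sketch below applies it to a linear one-dimensional Poisson problem rather than the nonlinear Poisson-Boltzmann equation, and uses a fixed relaxation parameter rather than the adaptive strategy the authors explore. The function name `sor_poisson_1d`, the value omega = 1.5, and the test problem are assumptions made for illustration.

```python
def sor_poisson_1d(f_values, h, omega=1.5, tol=1e-12, max_iter=10000):
    """Solve -u'' = f on a uniform grid with u = 0 at both ends, using SOR."""
    n = len(f_values)
    u = [0.0] * (n + 2)  # interior unknowns plus the two zero boundary values
    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(1, n + 1):
            # Gauss-Seidel value from the three-point finite-difference stencil.
            gauss_seidel = 0.5 * (u[i - 1] + u[i + 1] + h * h * f_values[i - 1])
            new_value = (1.0 - omega) * u[i] + omega * gauss_seidel
            max_change = max(max_change, abs(new_value - u[i]))
            u[i] = new_value
        if max_change < tol:
            return u, iteration
    return u, max_iter


# Test problem: -u'' = 2 on (0, 1), whose exact solution is u(x) = x (1 - x).
n_interior = 9
h = 1.0 / (n_interior + 1)
u, sweeps = sor_poisson_1d([2.0] * n_interior, h)
max_error = max(abs(u[i] - (i * h) * (1.0 - i * h)) for i in range(n_interior + 2))
print(sweeps, max_error)
```

    Setting omega to 1 recovers plain Gauss-Seidel. Values between 1 and 2 over-relax and typically converge faster on this kind of problem, while values at or above 2 diverge.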

  20. Oasis: A high-level/high-performance open source Navier-Stokes solver

    NASA Astrophysics Data System (ADS)

    Mortensen, Mikael; Valen-Sendstad, Kristian

    2015-03-01

    Oasis is a high-level/high-performance finite element Navier-Stokes solver written from scratch in Python using building blocks from the FEniCS project (fenicsproject.org). The solver is unstructured and targets large-scale applications in complex geometries on massively parallel clusters. Oasis utilizes MPI and interfaces, through FEniCS, to the linear algebra backend PETSc. Oasis advocates a high-level, programmable user interface through the creation of highly flexible Python modules for new problems. Through the high-level Python interface the user is placed in complete control of every aspect of the solver. A version of the solver that uses piecewise linear elements for both velocity and pressure is shown to reproduce very well the classical, spectral, turbulent channel simulations of Moser et al. (1999). The computational speed is strongly dominated by the iterative solvers provided by the linear algebra backend, which is arguably the best performance any similar implicit solver using PETSc may hope for. Higher order accuracy is also demonstrated and new solvers may be easily added within the same framework.

  1. Development of axisymmetric lattice Boltzmann flux solver for complex multiphase flows

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Shu, Chang; Yang, Li-Ming; Yuan, Hai-Zhuan

    2018-05-01

    This paper presents an axisymmetric lattice Boltzmann flux solver (LBFS) for simulating axisymmetric multiphase flows. In the solver, the two-dimensional (2D) multiphase LBFS is applied to reconstruct macroscopic fluxes excluding axisymmetric effects. Source terms accounting for axisymmetric effects are introduced directly into the governing equations. As compared to conventional axisymmetric multiphase lattice Boltzmann (LB) method, the present solver has the kinetic feature for flux evaluation and avoids complex derivations of external forcing terms. In addition, the present solver also saves considerable computational efforts in comparison with three-dimensional (3D) computations. The capability of the proposed solver in simulating complex multiphase flows is demonstrated by studying single bubble rising in a circular tube. The obtained results compare well with the published data.

  2. Using SPARK as a Solver for Modelica

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wetter, Michael; Wetter, Michael; Haves, Philip

    Modelica is an object-oriented acausal modeling language that is well positioned to become a de-facto standard for expressing models of complex physical systems. To simulate a model expressed in Modelica, it needs to be translated into executable code. For generating run-time efficient code, such a translation needs to employ algebraic formula manipulations. As the SPARK solver has been shown to be competitive for generating such code but currently cannot be used with the Modelica language, we report in this paper how SPARK's symbolic and numerical algorithms can be implemented in OpenModelica, an open-source implementation of a Modelica modeling and simulation environment. We also report benchmark results that show that for our air flow network simulation benchmark, the SPARK solver is competitive with Dymola, which is believed to provide the best solver for Modelica.

  3. Two-Dimensional Ffowcs Williams/Hawkings Equation Solver

    NASA Technical Reports Server (NTRS)

    Lockard, David P.

    2005-01-01

    FWH2D is a Fortran 90 computer program that solves a two-dimensional (2D) version of the equation, derived by J. E. Ffowcs Williams and D. L. Hawkings, for sound generated by turbulent flow. FWH2D was developed especially for estimating noise generated by airflows around such approximately 2D airframe components as slats. The user provides input data on fluctuations of pressure, density, and velocity on some surface. These data are combined with information about the geometry of the surface to calculate histories of thickness and loading terms. These histories are fast-Fourier-transformed into the frequency domain. For each frequency of interest and each observer position specified by the user, kernel functions are integrated over the surface by use of the trapezoidal rule to calculate a pressure signal. The resulting frequency-domain signals are inverse-fast-Fourier-transformed back into the time domain. The output of the code consists of the time- and frequency-domain representations of the pressure signals at the observer positions. Because of its approximate nature, FWH2D overpredicts the noise from a finite-length (3D) component. The advantage of FWH2D is that it requires a fraction of the computation time of a 3D Ffowcs Williams/Hawkings solver.

  4. An iterative solver for the 3D Helmholtz equation

    NASA Astrophysics Data System (ADS)

    Belonosov, Mikhail; Dmitriev, Maxim; Kostin, Victor; Neklyudov, Dmitry; Tcheverda, Vladimir

    2017-09-01

    We develop a frequency-domain iterative solver for numerical simulation of acoustic waves in 3D heterogeneous media. It is based on the application of a unique preconditioner to the Helmholtz equation that ensures convergence for Krylov subspace iteration methods. Effective inversion of the preconditioner involves the Fast Fourier Transform (FFT) and numerical solution of a series of boundary value problems for ordinary differential equations. Matrix-by-vector multiplication for iterative inversion of the preconditioned matrix involves inversion of the preconditioner and pointwise multiplication of grid functions. Our solver has been verified by benchmarking against exact solutions and a time-domain solver.

  5. High-performance equation solvers and their impact on finite element analysis

    NASA Technical Reports Server (NTRS)

    Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. Dale, Jr.

    1990-01-01

    The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.

  6. High-performance equation solvers and their impact on finite element analysis

    NASA Technical Reports Server (NTRS)

    Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. D., Jr.

    1992-01-01

    The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.

  7. Novel Scalable 3-D MT Inverse Solver

    NASA Astrophysics Data System (ADS)

    Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.

    2016-12-01

    We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine a highly-scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits adjoint sources approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem set up. To parameterize an inverse domain a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out at different platforms ranging from modern laptops to high-performance clusters demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.

  8. A 1D-2D Shallow Water Equations solver for discontinuous porosity field based on a Generalized Riemann Problem

    NASA Astrophysics Data System (ADS)

    Ferrari, Alessia; Vacondio, Renato; Dazzi, Susanna; Mignosa, Paolo

    2017-09-01

    A novel augmented Riemann Solver capable of handling porosity discontinuities in 1D and 2D Shallow Water Equation (SWE) models is presented. With the aim of accurately approximating the porosity source term, a Generalized Riemann Problem is derived by adding an additional fictitious equation to the SWEs system and imposing mass and momentum conservation across the porosity discontinuity. The modified Shallow Water Equations are theoretically investigated, and the implementation of an augmented Roe Solver in a 1D Godunov-type finite volume scheme is presented. Robust treatment of transonic flows is ensured by introducing an entropy fix based on the wave pattern of the Generalized Riemann Problem. An Exact Riemann Solver is also derived in order to validate the numerical model. As an extension of the 1D scheme, an analogous 2D numerical model is also derived and validated through test cases with radial symmetry. The capability of the 1D and 2D numerical models to capture different wave patterns is assessed against several Riemann Problems with different wave patterns.

  9. A robust multilevel simultaneous eigenvalue solver

    NASA Technical Reports Server (NTRS)

    Costiner, Sorin; Taasan, Shlomo

    1993-01-01

    Multilevel (ML) algorithms for eigenvalue problems are often faced with several types of difficulties such as: the mixing of approximated eigenvectors by the solution process, the approximation of incomplete clusters of eigenvectors, the poor representation of solution on coarse levels, and the existence of close or equal eigenvalues. Algorithms that do not treat these difficulties appropriately usually fail, or their performance degrades when facing them. These issues motivated the development of a robust adaptive ML algorithm which treats these difficulties, for the calculation of a few eigenvectors and their corresponding eigenvalues. The main techniques used in the new algorithm include: the adaptive completion and separation of the relevant clusters on different levels, the simultaneous treatment of solutions within each cluster, and the robustness tests which monitor the algorithm's efficiency and convergence. The eigenvectors' separation efficiency is based on a new ML projection technique generalizing the Rayleigh-Ritz projection, combined with a technique, the backrotations. These separation techniques, when combined with an FMG formulation, in many cases lead to algorithms of O(qN) complexity, for q eigenvectors of size N on the finest level. Previously developed ML algorithms are less focused on the mentioned difficulties. Moreover, algorithms which employ fine level separation techniques are of O(q^2 N) complexity and usually do not overcome all these difficulties. Computational examples are presented where Schrödinger type eigenvalue problems in 2-D and 3-D, having equal and closely clustered eigenvalues, are solved with the efficiency of the Poisson multigrid solver. A second order approximation is obtained in O(qN) work, where the total computational work is equivalent to only a few fine level relaxations per eigenvector.

  10. A Newton-Krylov solver for fast spin-up of online ocean tracers

    NASA Astrophysics Data System (ADS)

    Lindsay, Keith

    2017-01-01

    We present a Newton-Krylov based solver to efficiently spin up tracers in an online ocean model. We demonstrate that the solver converges, that tracer simulations initialized with the solution from the solver have small drift, and that the solver takes orders of magnitude less computational time than the brute force spin-up approach. To demonstrate the application of the solver, we use it to efficiently spin up the tracer ideal age with respect to the circulation from different time intervals in a long physics run. We then evaluate how the spun-up ideal age tracer depends on the duration of the physics run, i.e., on how equilibrated the circulation is.

  11. Hypersonic simulations using open-source CFD and DSMC solvers

    NASA Astrophysics Data System (ADS)

    Casseau, V.; Scanlon, T. J.; John, B.; Emerson, D. R.; Brown, R. E.

    2016-11-01

    Hypersonic hybrid hydrodynamic-molecular gas flow solvers are required to satisfy the two essential requirements of any high-speed reacting code, these being physical accuracy and computational efficiency. The James Weir Fluids Laboratory at the University of Strathclyde is currently developing an open-source hybrid code which will eventually reconcile the direct simulation Monte-Carlo method, making use of the OpenFOAM application called dsmcFoam, and the newly coded open-source two-temperature computational fluid dynamics solver named hy2Foam. In conjunction with employing the CVDV chemistry-vibration model in hy2Foam, novel use is made of the QK rates in a CFD solver. In this paper, further testing is performed, in particular with the CFD solver, to ensure its efficacy before considering more advanced test cases. The hy2Foam and dsmcFoam codes have been shown to compare reasonably well, thus providing a useful basis for other codes to compare against.

  12. Solvers for $\mathcal{O}(N)$ Electronic Structure in the Strong Scaling Limit

    DOE PAGES

    Bock, Nicolas; Challacombe, William M.; Kale, Laxmikant

    2016-01-26

    Here we present a hybrid OpenMP/Charm++ framework for solving the $\mathcal{O}(N)$ self-consistent-field eigenvalue problem with parallelism in the strong scaling regime, $P \gg N$, where $P$ is the number of cores, and $N$ is a measure of system size, i.e., the number of matrix rows/columns, basis functions, atoms, molecules, etc. This result is achieved with a nested approach to spectral projection and the sparse approximate matrix multiply [Bock and Challacombe, SIAM J. Sci. Comput., 35 (2013), pp. C72--C98], and involves a recursive, task-parallel algorithm, often employed by generalized $N$-Body solvers, to occlusion and culling of negligible products in the case of matrices with decay. Lastly, employing classic technologies associated with generalized $N$-Body solvers, including overdecomposition, recursive task parallelism, orderings that preserve locality, and persistence-based load balancing, we obtain scaling beyond hundreds of cores per molecule for small water clusters ([H$_2$O]$_N$, $N \in \{30, 90, 150\}$, $P/N \approx \{819, 273, 164\}$) and find support for an increasingly strong scalability with increasing system size $N$.

  13. Multiply scaled constrained nonlinear equation solvers. [for nonlinear heat conduction problems

    NASA Technical Reports Server (NTRS)

    Padovan, Joe; Krishna, Lala

    1986-01-01

    To improve the numerical stability of nonlinear equation solvers, a partitioned multiply scaled constraint scheme is developed. This scheme enables hierarchical levels of control for nonlinear equation solvers. To complement the procedure, partitioned convergence checks are established along with self-adaptive partitioning schemes. Overall, such procedures greatly enhance the numerical stability of the original solvers. To demonstrate and motivate the development of the scheme, the problem of nonlinear heat conduction is considered. In this context the main emphasis is given to successive substitution-type schemes. To verify the improved numerical characteristics associated with partitioned multiply scaled solvers, results are presented for several benchmark examples.

  14. Sherlock Holmes, Master Problem Solver.

    ERIC Educational Resources Information Center

    Ballew, Hunter

    1994-01-01

    Shows the connections between Sherlock Holmes's investigative methods and mathematical problem solving, including observations, characteristics of the problem solver, importance of data, questioning the obvious, learning from experience, learning from errors, and indirect proof. (MKR)

  15. Steady potential solver for unsteady aerodynamic analyses

    NASA Technical Reports Server (NTRS)

    Hoyniak, Dan

    1994-01-01

    Development of a steady flow solver for use with LINFLO was the objective of this report. The solver must be compatible with LINFLO, be composed of composite mesh, and have transonic capability. The approaches used were: (1) steady flow potential equations written in nonconservative form; (2) Newton's Method; (3) implicit, least-squares, interpolation method to obtain finite difference equations; and (4) matrix inversion routines from LINFLO. This report was given during the NASA LeRC Workshop on Forced Response in Turbomachinery in August of 1993.

  16. Experimental validation of a coupled neutron-photon inverse radiation transport solver

    NASA Astrophysics Data System (ADS)

    Mattingly, John; Mitchell, Dean J.; Harding, Lee T.

    2011-10-01

    Sandia National Laboratories has developed an inverse radiation transport solver that applies nonlinear regression to coupled neutron-photon deterministic transport models. The inverse solver uses nonlinear regression to fit a radiation transport model to gamma spectrometry and neutron multiplicity counting measurements. The subject of this paper is the experimental validation of that solver. This paper describes a series of experiments conducted with a 4.5 kg sphere of α-phase, weapons-grade plutonium. The source was measured bare and reflected by high-density polyethylene (HDPE) spherical shells with total thicknesses between 1.27 and 15.24 cm. Neutron and photon emissions from the source were measured using three instruments: a gross neutron counter, a portable neutron multiplicity counter, and a high-resolution gamma spectrometer. These measurements were used as input to the inverse radiation transport solver to evaluate the solver's ability to correctly infer the configuration of the source from its measured radiation signatures.

  17. RELATIVISTIC MAGNETOHYDRODYNAMICS: RENORMALIZED EIGENVECTORS AND FULL WAVE DECOMPOSITION RIEMANN SOLVER

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Antón, Luis; Martí, José M.; Ibáñez, José M.

    2010-05-01

    We obtain renormalized sets of right and left eigenvectors of the flux vector Jacobians of the relativistic MHD equations, which are regular and span a complete basis in any physical state including degenerate ones. The renormalization procedure relies on the characterization of the degeneracy types in terms of the normal and tangential components of the magnetic field to the wave front in the fluid rest frame. Proper expressions of the renormalized eigenvectors in conserved variables are obtained through the corresponding matrix transformations. Our work completes previous analyses that present different sets of right eigenvectors for non-degenerate and degenerate states, and can be seen as a relativistic generalization of earlier work performed in classical MHD. Based on the full wave decomposition (FWD) provided by the renormalized set of eigenvectors in conserved variables, we have also developed a linearized (Roe-type) Riemann solver. Extensive testing against one- and two-dimensional standard numerical problems allows us to conclude that our solver is very robust. When compared with a family of simpler solvers that avoid the knowledge of the full characteristic structure of the equations in the computation of the numerical fluxes, our solver turns out to be less diffusive than HLL and HLLC, and comparable in accuracy to the HLLD solver. The number of operations needed by the FWD solver makes it less efficient computationally than those of the HLL family in one-dimensional problems. However, its relative efficiency increases in multidimensional simulations.

  18. On the use of finite difference matrix-vector products in Newton-Krylov solvers for implicit climate dynamics with spectral elements

    DOE PAGES

    Woodward, Carol S.; Gardner, David J.; Evans, Katherine J.

    2015-01-01

    Efficient solutions of global climate models require effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a step size governed by accuracy of the processes of interest rather than by stability of the fastest time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, Newton's method is applied to solve these systems. Each iteration of Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but this Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference approximation, which may introduce inaccuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite difference approximations of these matrix-vector products for climate dynamics within the spectral element shallow water dynamical core of the Community Atmosphere Model.
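
    As a rough illustration of the finite difference matrix-vector product mentioned in this abstract, the sketch below approximates J(u)v by (F(u + eps*v) - F(u))/eps for a small toy residual. The residual function, the step-size heuristic, and all names are assumptions made for illustration; they are not taken from the paper or from the Community Atmosphere Model.

```python
import math
import sys


def residual(u):
    """Hypothetical nonlinear residual F(u) for a 2-unknown toy system."""
    x, y = u
    return [x * x + y - 3.0, math.sin(x) + y * y - 1.0]


def jacobian_vector_exact(u, v):
    """Analytic J(u) v for the toy residual, used only as a reference."""
    x, y = u
    return [2.0 * x * v[0] + v[1], math.cos(x) * v[0] + 2.0 * y * v[1]]


def jacobian_vector_fd(F, u, v, eps=None):
    """Approximate J(u) v with one extra residual evaluation (forward difference)."""
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    if eps is None:
        # Assumed heuristic: scale sqrt(machine epsilon) by the sizes of u and v.
        norm_u = math.sqrt(sum(c * c for c in u))
        eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    f_base = F(u)
    f_pert = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(f_pert, f_base)]


u = [1.2, -0.7]
v = [0.5, 2.0]
jv_fd = jacobian_vector_fd(residual, u, v)
jv_exact = jacobian_vector_exact(u, v)
# The difference combines truncation error (grows with eps) and round-off (grows as eps shrinks).
fd_error = max(abs(a - b) for a, b in zip(jv_fd, jv_exact))
```

    A Krylov solver only ever needs products of this kind, so the Jacobian never has to be stored. The price is the approximation error captured by `fd_error`, which is the inaccuracy the abstract refers to.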

  19. A High-Order Direct Solver for Helmholtz Equations with Neumann Boundary Conditions

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He; Zhuang, Yu

    1997-01-01

    In this study, a compact finite-difference discretization is first developed for Helmholtz equations on rectangular domains. Special treatments are then introduced for Neumann and Neumann-Dirichlet boundary conditions to achieve accuracy and separability. Finally, a Fast Fourier Transform (FFT) based technique is used to yield a fast direct solver. Analytical and experimental results show this newly proposed solver is comparable to the conventional second-order elliptic solver when accuracy is not a primary concern, and is significantly faster than the conventional solver if a highly accurate solution is required. In addition, this newly proposed fourth-order Helmholtz solver is parallel in nature. It is readily available for parallel and distributed computers. The compact scheme introduced in this study is likely extendible to sixth-order accurate algorithms and to more general elliptic equations.

  20. An adaptive discontinuous Galerkin solver for aerodynamic flows

    NASA Astrophysics Data System (ADS)

    Burgess, Nicholas K.

    This work considers the accuracy, efficiency, and robustness of an unstructured high-order accurate discontinuous Galerkin (DG) solver for computational fluid dynamics (CFD). Recently, there has been a drive to reduce the discretization error of CFD simulations using high-order methods on unstructured grids. However, high-order methods are often criticized for lacking robustness and having high computational cost. The goal of this work is to investigate methods that enhance the robustness of high-order discontinuous Galerkin (DG) methods on unstructured meshes, while maintaining low computational cost and high accuracy of the numerical solutions. This work investigates robustness enhancement of high-order methods by examining effective non-linear solvers, shock capturing methods, turbulence model discretizations and adaptive refinement techniques. The goal is to develop an all-encompassing solver that can simulate a large range of physical phenomena, where all aspects of the solver work together to achieve a robust, efficient and accurate solution strategy. The components and framework for a robust high-order accurate solver that is capable of solving viscous, Reynolds Averaged Navier-Stokes (RANS) and shocked flows are presented. In particular, this work discusses robust discretizations of the turbulence model equation used to close the RANS equations, as well as stable shock capturing strategies that are applicable across a wide range of discretization orders and applicable to very strong shock waves. Furthermore, refinement techniques are considered as both efficiency and robustness enhancement strategies. Additionally, efficient non-linear solvers based on multigrid and Krylov subspace methods are presented. The accuracy, efficiency, and robustness of the solver are demonstrated using a variety of challenging aerodynamic test problems, which include turbulent high-lift and viscous hypersonic flows. Adaptive mesh refinement was found to play a critical role in

  1. On the implicit density based OpenFOAM solver for turbulent compressible flows

    NASA Astrophysics Data System (ADS)

    Fürst, Jiří

    The contribution deals with the development of a coupled implicit density-based solver for compressible flows in the framework of the open source package OpenFOAM. Although the standard distribution of OpenFOAM contains several ready-made segregated solvers for compressible flows, the performance of those solvers is rather weak in the case of transonic flows. Therefore we extend the work of Shen [15] and develop an implicit semi-coupled solver. The main flow field variables are updated using the lower-upper symmetric Gauss-Seidel method (LU-SGS), whereas the turbulence model variables are updated using the implicit Euler method.
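
    To make the LU-SGS idea concrete, here is a minimal sketch of the symmetric Gauss-Seidel sweep pair on a small dense linear system: a forward sweep with (D + L) followed by a backward sweep with (D + U). The matrix, right-hand side, and function names are invented for illustration, and the code is generic linear algebra rather than the OpenFOAM solver described in the abstract.

```python
def lusgs_step(a, x, b):
    """One LU-SGS update: solve (D + L) D^-1 (D + U) dx = b - A x, then return x + dx."""
    n = len(b)
    r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Forward sweep: (D + L) y = r, visiting unknowns in increasing order.
    y = [0.0] * n
    for i in range(n):
        y[i] = (r[i] - sum(a[i][j] * y[j] for j in range(i))) / a[i][i]
    # Backward sweep: (D + U) dx = D y, visiting unknowns in decreasing order.
    dx = [0.0] * n
    for i in range(n - 1, -1, -1):
        upper_sum = sum(a[i][j] * dx[j] for j in range(i + 1, n))
        dx[i] = (a[i][i] * y[i] - upper_sum) / a[i][i]
    return [x[i] + dx[i] for i in range(n)]


# Hypothetical diagonally dominant test system (not from the paper).
a = [
    [4.0, -1.0, 0.0, -1.0],
    [-1.0, 4.0, -1.0, 0.0],
    [0.0, -1.0, 4.0, -1.0],
    [-1.0, 0.0, -1.0, 4.0],
]
b = [1.0, 2.0, 3.0, 4.0]
x = [0.0] * 4
for _ in range(50):
    x = lusgs_step(a, x, b)
res_norm = max(abs(b[i] - sum(a[i][j] * x[j] for j in range(4))) for i in range(4))
```

    Each step needs only two triangular sweeps and no matrix factorization, which is why sweeps of this kind are attractive as the implicit update in a flow solver.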

  2. Basis Function Approximation of Transonic Aerodynamic Influence Coefficient Matrix

    NASA Technical Reports Server (NTRS)

    Li, Wesley Waisang; Pak, Chan-Gi

    2010-01-01

    A technique for approximating the modal aerodynamic influence coefficients [AIC] matrices by using basis functions has been developed and validated. An application of the resulting approximated modal AIC matrix for a flutter analysis in transonic speed regime has been demonstrated. This methodology can be applied to the unsteady subsonic, transonic and supersonic aerodynamics. The method requires the unsteady aerodynamics in frequency-domain. The flutter solution can be found by the classic methods, such as rational function approximation, k, p-k, p, root-locus et cetera. The unsteady aeroelastic analysis for design optimization using unsteady transonic aerodynamic approximation is being demonstrated using the ZAERO™ flutter solver (ZONA Technology Incorporated, Scottsdale, Arizona). The technique presented has been shown to offer consistent flutter speed prediction on an aerostructures test wing [ATW] 2 configuration with negligible loss in precision in transonic speed regime. These results may have practical significance in the analysis of aircraft aeroelastic calculation and could lead to a more efficient design optimization cycle.

  3. Basis Function Approximation of Transonic Aerodynamic Influence Coefficient Matrix

    NASA Technical Reports Server (NTRS)

    Li, Wesley W.; Pak, Chan-gi

    2011-01-01

    A technique for approximating the modal aerodynamic influence coefficients matrices by using basis functions has been developed and validated. An application of the resulting approximated modal aerodynamic influence coefficients matrix for a flutter analysis in transonic speed regime has been demonstrated. This methodology can be applied to the unsteady subsonic, transonic, and supersonic aerodynamics. The method requires the unsteady aerodynamics in frequency-domain. The flutter solution can be found by the classic methods, such as rational function approximation, k, p-k, p, root-locus et cetera. The unsteady aeroelastic analysis for design optimization using unsteady transonic aerodynamic approximation is being demonstrated using the ZAERO flutter solver (ZONA Technology Incorporated, Scottsdale, Arizona). The technique presented has been shown to offer consistent flutter speed prediction on an aerostructures test wing 2 configuration with negligible loss in precision in transonic speed regime. These results may have practical significance in the analysis of aircraft aeroelastic calculation and could lead to a more efficient design optimization cycle.

  4. IRMHD: an implicit radiative and magnetohydrodynamical solver for self-gravitating systems

    NASA Astrophysics Data System (ADS)

    Hujeirat, A.

    1998-07-01

    The 2D implicit hydrodynamical solver developed by Hujeirat & Rannacher is now modified to include the effects of radiation, magnetic fields and self-gravity in different geometries. The underlying numerical concept is based on the operator splitting approach, and the resulting 2D matrices are inverted using different efficient preconditionings such as ADI (alternating direction implicit), the approximate factorization method and Line-Gauss-Seidel or similar iteration procedures. Second-order finite volume with third-order upwinding and second-order time discretization is used. To speed up convergence and enhance efficiency we have incorporated an adaptive time-step control and monotonic multilevel grid distributions as well as vectorizing the code. Test calculations have shown that it requires only 38 per cent more computational effort than its explicit counterpart, whereas its range of application to astrophysical problems is much larger. For example, strongly time-dependent, quasi-stationary and steady-state solutions for the set of Euler and Navier-Stokes equations can now be sought on a non-linearly distributed and strongly stretched mesh. As most of the numerical techniques used to build up this algorithm have been described by Hujeirat & Rannacher in an earlier paper, we focus in this paper on the inclusion of self-gravity, radiation and magnetic fields. Strategies for satisfying the condition ∇·B=0 in the implicit evolution of MHD flows are given. A new discretization strategy for the vector potential which allows alternating use of the direct method is prescribed. We investigate the efficiencies of several 2D solvers for a Poisson-like equation and compare their convergence rates. We provide a splitting approach for the radiative flux within the FLD (flux-limited diffusion) approximation to enhance consistency and accuracy between regions of different optical depths. The results of some test problems are presented to demonstrate the accuracy and

  5. The novel high-performance 3-D MT inverse solver

    NASA Astrophysics Data System (ADS)

    Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey

    2016-04-01

    We present a novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in a multi-language paradigm to make it as efficient, readable and maintainable as possible. Separation of concerns and single responsibility concepts run through the implementation of the solver. As a forward modelling engine, the modern scalable solver extrEMe, based on a contracting integral equation approach, is used. An iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for the (regularized) inverse problem solution, and an adjoint source approach is used to calculate the gradient of the misfit efficiently. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize an inverse domain, the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms ranging from modern laptops to the HPC system Piz Daint (6th supercomputer in the world) demonstrate practically linear scalability of the code up to thousands of nodes.

  6. User's Manual for PCSMS (Parallel Complex Sparse Matrix Solver). Version 1.

    NASA Technical Reports Server (NTRS)

    Reddy, C. J.

    2000-01-01

    PCSMS (Parallel Complex Sparse Matrix Solver) is a computer code written to make use of the existing real sparse direct solvers to solve complex, sparse matrix linear equations. PCSMS converts complex matrices into real matrices and uses real, sparse direct matrix solvers to factor and solve the real matrices. The solution vector is reconverted to complex numbers. Though this utility is written for Silicon Graphics (SGI) real sparse matrix solution routines, it is general in nature and can be easily modified to work with any real sparse matrix solver. The User's Manual is written to acquaint the user with the installation and operation of the code. Driver routines are given to help users integrate PCSMS routines into their own codes.
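
    The abstract does not say which real formulation PCSMS uses. One common choice, assumed here, writes (A + iB)(x + iy) = c + id as the real block system [[A, -B], [B, A]] [x; y] = [c; d]. The sketch below applies that conversion to a tiny dense system, with a plain Gaussian elimination standing in for the real sparse direct solver; every function name is hypothetical.

```python
def solve_real(a, b):
    """Dense Gaussian elimination with partial pivoting (stand-in for a real sparse direct solver)."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (m[k][n] - s) / m[k][k]
    return x


def solve_complex_via_real(z, rhs):
    """Solve Z w = rhs by building the 2n x 2n real block matrix [[A, -B], [B, A]]."""
    n = len(z)
    big = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            re, im = z[i][j].real, z[i][j].imag
            big[i][j] = re
            big[i][j + n] = -im
            big[i + n][j] = im
            big[i + n][j + n] = re
    b = [c.real for c in rhs] + [c.imag for c in rhs]
    xy = solve_real(big, b)
    # Reconvert the stacked real solution [x; y] to complex numbers x + iy.
    return [complex(xy[i], xy[i + n]) for i in range(n)]


z = [[2 + 1j, -1j], [1 + 0j, 3 - 2j]]
rhs = [1 + 2j, -1 + 0.5j]
w = solve_complex_via_real(z, rhs)
max_residual = max(abs(sum(z[i][j] * w[j] for j in range(2)) - rhs[i]) for i in range(2))
```

    The real system has twice the dimension and four times the number of nonzeros, which is the cost of reusing an existing real solver unchanged.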

  7. PBEQ-Solver for online visualization of electrostatic potential of biomolecules.

    PubMed

    Jo, Sunhwan; Vargyas, Miklos; Vasko-Szedlar, Judit; Roux, Benoît; Im, Wonpil

    2008-07-01

    PBEQ-Solver provides a web-based graphical user interface to read biomolecular structures, solve the Poisson-Boltzmann (PB) equations and interactively visualize the electrostatic potential. PBEQ-Solver calculates (i) electrostatic potential and solvation free energy, (ii) protein-protein (DNA or RNA) electrostatic interaction energy and (iii) pKa of a selected titratable residue. All the calculations can be performed in both aqueous solvent and membrane environments (with a cylindrical pore in the case of membrane). PBEQ-Solver uses the PBEQ module in the biomolecular simulation program CHARMM to solve the finite-difference PB equation of molecules specified by users. Users can interactively inspect the calculated electrostatic potential on the solvent-accessible surface as well as iso-electrostatic potential contours using a novel online visualization tool based on MarvinSpace molecular visualization software, a Java applet integrated within CHARMM-GUI (http://www.charmm-gui.org). To reduce the computational time on the server, and to increase the efficiency in visualization, all the PB calculations are performed with coarse grid spacing (1.5 Å before and 1 Å after focusing). PBEQ-Solver suggests various physical parameters for PB calculations and users can modify them if necessary. PBEQ-Solver is available at http://www.charmm-gui.org/input/pbeqsolver.

  8. First and second order approximations to stage numbers in multicomponent enrichment cascades

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scopatz, A.

    2013-07-01

    This paper describes closed form, Taylor series approximations to the number of product stages in a multicomponent enrichment cascade. Such closed form approximations are required when a symbolic, rather than a numeric, algorithm is used to compute the optimal cascade state. Both first and second order approximations were implemented. The first order solution was found to be grossly incorrect, having the wrong functional form over the entire domain. On the other hand, the second order solution shows excellent agreement with the 'true' solution over the domain of interest. An implementation of the symbolic, second order solver is available in the free and open source PyNE library. (authors)

  9. Computing approximate solutions of the protein structure determination problem using global constraints on discrete crystal lattices.

    PubMed

    Dal Palù, Alessandro; Dovier, Agostino; Pontelli, Enrico

    2010-01-01

    Crystal lattices are discrete models of the three-dimensional space that have been effectively employed to facilitate the task of determining proteins' natural conformation. This paper investigates alternative global constraints that can be introduced in a constraint solver over discrete crystal lattices. The objective is to enhance the efficiency of lattice solvers in dealing with the construction of approximate solutions of the protein structure determination problem. Some of them (e.g., self-avoiding-walk) have been explicitly or implicitly already used in previous approaches, while others (e.g., the density constraint) are new. The intrinsic complexities of all of them are studied and preliminary experimental results are discussed.

  10. Parallel performance investigations of an unstructured mesh Navier-Stokes solver

    NASA Technical Reports Server (NTRS)

    Mavriplis, Dimitri J.

    2000-01-01

    A Reynolds-averaged Navier-Stokes solver based on unstructured mesh techniques for analysis of high-lift configurations is described. The method makes use of an agglomeration multigrid solver for convergence acceleration. Implicit line-smoothing is employed to relieve the stiffness associated with highly stretched meshes. A GMRES technique is also implemented to speed convergence at the expense of additional memory usage. The solver is cache efficient and fully vectorizable, and is parallelized using a two-level hybrid MPI-OpenMP implementation suitable for shared and/or distributed memory architectures, as well as clusters of shared memory machines. Convergence and scalability results are illustrated for various high-lift cases.

  11. IGA-ADS: Isogeometric analysis FEM using ADS solver

    NASA Astrophysics Data System (ADS)

    Łoś, Marcin M.; Woźniak, Maciej; Paszyński, Maciej; Lenharth, Andrew; Hassaan, Muhamm Amber; Pingali, Keshav

    2017-08-01

    In this paper we present a fast explicit solver for the solution of non-stationary problems using L2 projections with the isogeometric finite element method. The solver has been implemented within the GALOIS framework. It enables parallel multi-core simulations of different time-dependent problems, in 1D, 2D, or 3D. We have prepared the solver framework in a way that enables direct implementation of the selected PDE and corresponding boundary conditions. In this paper we describe the installation, the implementation of three exemplary PDEs, and the execution of the simulations on multi-core Linux cluster nodes. We consider three case studies, including heat transfer, linear elasticity, as well as non-linear flow in heterogeneous media. The presented package generates output suitable for interfacing with Gnuplot and ParaView visualization software. The exemplary simulations show near perfect scalability on a Gilbert shared-memory node with four Intel® Xeon® CPU E7-4860 processors, each possessing 10 physical cores (for a total of 40 cores).

  12. An efficient spectral crystal plasticity solver for GPU architectures

    NASA Astrophysics Data System (ADS)

    Malahe, Michael

    2018-03-01

    We present a spectral crystal plasticity (CP) solver for graphics processing unit (GPU) architectures that achieves a tenfold increase in efficiency over prior GPU solvers. The approach makes use of a database containing a spectral decomposition of CP simulations performed using a conventional iterative solver over a parameter space of crystal orientations and applied velocity gradients. The key improvements in efficiency come from reducing global memory transactions, exposing more instruction-level parallelism, reducing integer instructions and performing fast range reductions on trigonometric arguments. The scheme also makes more efficient use of memory than prior work, allowing for larger problems to be solved on a single GPU. We illustrate these improvements with a simulation of 390 million crystal grains on a consumer-grade GPU, which executes at a rate of 2.72 s per strain step.

  13. Efficient Iterative Methods Applied to the Solution of Transonic Flows

    NASA Astrophysics Data System (ADS)

    Wissink, Andrew M.; Lyrintzis, Anastasios S.; Chronopoulos, Anthony T.

    1996-02-01

    We investigate the use of an inexact Newton's method to solve the potential equations in the transonic regime. As a test case, we solve the two-dimensional steady transonic small disturbance equation. Approximate factorization/ADI techniques have traditionally been employed for implicit solutions of this nonlinear equation. Instead, we apply Newton's method using an exact analytical determination of the Jacobian with preconditioned conjugate gradient-like iterative solvers for solution of the linear systems in each Newton iteration. Two iterative solvers are tested: a block s-step version of the classical Orthomin(k) algorithm called orthogonal s-step Orthomin (OSOmin) and the well-known GMRES method. The preconditioner is a vectorizable and parallelizable version of incomplete LU (ILU) factorization. Efficiency of the Newton-Iterative method on vector and parallel computer architectures is the main issue addressed. In vectorized tests on a single processor of the Cray C-90, the performance of Newton-OSOmin is superior to Newton-GMRES and a more traditional monotone AF/ADI method (MAF) for a variety of transonic Mach numbers and mesh sizes. Newton-GMRES is superior to MAF for some cases. The parallel performance of the Newton method is also found to be very good on multiple processors of the Cray C-90 and on the massively parallel Thinking Machines CM-5, where very fast execution rates (up to 9 Gflops) are found for large problems.

  14. Modifications of steam condensation model implemented in commercial solver

    NASA Astrophysics Data System (ADS)

    Sova, Libor; Jun, Gukchol; Šťastný, Miroslav

    2017-09-01

    Nucleation theory, droplet growth theory, and the methods by which they are incorporated into numerical solvers are crucial factors for proper wet steam modelling. Unfortunately, they are still covered by a cloud of uncertainty, and therefore some calibration of these models against reliable experimental results is important for practical analyses of steam turbines. This article demonstrates how it is possible to calibrate the wet steam model incorporated into the commercial solver ANSYS CFX.

  15. General Equation Set Solver for Compressible and Incompressible Turbomachinery Flows

    NASA Technical Reports Server (NTRS)

    Sondak, Douglas L.; Dorney, Daniel J.

    2002-01-01

    Turbomachines for propulsion applications operate with many different working fluids and flow conditions. The flow may be incompressible, such as in the liquid hydrogen pump in a rocket engine, or supersonic, such as in the turbine which may drive the hydrogen pump. Separate codes have traditionally been used for incompressible and compressible flow solvers. The General Equation Set (GES) method can be used to solve both incompressible and compressible flows, and it is not restricted to perfect gases, as are many compressible-flow turbomachinery solvers. An unsteady GES turbomachinery flow solver has been developed and applied to both air and water flows through turbines. It has been shown to be an excellent alternative to maintaining two separate codes.

  16. LSPRAY: Lagrangian Spray Solver for Applications With Parallel Computing and Unstructured Gas-Phase Flow Solvers

    NASA Technical Reports Server (NTRS)

    Raju, Manthena S.

    1998-01-01

    Sprays occur in a wide variety of industrial and power applications and in the processing of materials. A liquid spray is a two-phase flow with a gas as the continuous phase and a liquid as the dispersed phase (in the form of droplets or ligaments). Interactions between the two phases, which are coupled through exchanges of mass, momentum, and energy, can occur in different ways at different times and locations involving various thermal, mass, and fluid dynamic factors. An understanding of the flow, combustion, and thermal properties of a rapidly vaporizing spray requires careful modeling of the rate-controlling processes associated with the spray's turbulent transport, mixing, chemical kinetics, evaporation, and spreading rates, as well as other phenomena. In an attempt to advance the state-of-the-art in multidimensional numerical methods, we at the NASA Lewis Research Center extended our previous work on sprays to unstructured grids and parallel computing. LSPRAY, which was developed by M.S. Raju of Nyma, Inc., is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and/or Monte Carlo probability density function (PDF) solver. The LSPRAY solver accommodates the use of an unstructured mesh with mixed triangular, quadrilateral, and/or tetrahedral elements in the gas-phase solvers. It is used specifically for fuel sprays within gas turbine combustors, but it has many other uses. The spray model used in LSPRAY provided favorable results when applied to stratified-charge rotary combustion (Wankel) engines and several other confined and unconfined spray flames. The source code will be available with the National Combustion Code (NCC) as a complete package.

  17. A CFD Heterogeneous Parallel Solver Based on Collaborating CPU and GPU

    NASA Astrophysics Data System (ADS)

    Lai, Jianqi; Tian, Zhengyu; Li, Hua; Pan, Sha

    2018-03-01

    Since the graphics processing unit (GPU) offers strong floating-point performance and memory bandwidth for data parallelism, it has been widely used in general-purpose computing areas such as molecular dynamics (MD), computational fluid dynamics (CFD) and so on. The emergence of the compute unified device architecture (CUDA), which reduces the complexity of compiling programs, brings great opportunities to CFD. There are three different modes for parallel solution of the NS equations: a parallel solver based on CPU, a parallel solver based on GPU, and a heterogeneous parallel solver based on collaborating CPU and GPU. GPUs are relatively rich in compute capacity but poor in memory capacity, while CPUs are the opposite. We need to make full use of both GPUs and CPUs, so a CFD heterogeneous parallel solver based on collaborating CPU and GPU has been established. Three cases are presented to analyse the solver's computational accuracy and heterogeneous parallel efficiency. The numerical results agree well with experimental results, which demonstrates that the heterogeneous parallel solver has high computational precision. The speedup on a single GPU is more than 40 for laminar flow; it decreases for turbulent flow but can still exceed 20. Moreover, the speedup increases as the grid size becomes larger.

  18. QED multi-dimensional vacuum polarization finite-difference solver

    NASA Astrophysics Data System (ADS)

    Carneiro, Pedro; Grismayer, Thomas; Silva, Luís; Fonseca, Ricardo

    2015-11-01

    The Extreme Light Infrastructure (ELI) is expected to deliver peak intensities of 10^23 - 10^24 W/cm^2, allowing nonlinear Quantum Electrodynamics (QED) phenomena to be probed in an unprecedented regime. Within the framework of QED, the second order process of photon-photon scattering leads to a set of extended Maxwell's equations [W. Heisenberg and H. Euler, Z. Physik 98, 714] effectively creating nonlinear polarization and magnetization terms that account for the nonlinear response of the vacuum. To model this in a self-consistent way, we present a multi-dimensional generalized Maxwell equation finite difference solver with significantly enhanced dispersive properties, which was implemented in the OSIRIS particle-in-cell code [R.A. Fonseca et al. LNCS 2331, pp. 342-351, 2002]. We present a detailed numerical analysis of this electromagnetic solver. As an illustration of the properties of the solver, we explore several examples in extreme conditions. We confirm the theoretical prediction of vacuum birefringence of a pulse propagating in the presence of an intense static background field [arXiv:1301.4918 [quant-ph

  19. Fault tolerance in an inner-outer solver: A GVR-enabled case study

    DOE PAGES

    Zhang, Ziming; Chien, Andrew A.; Teranishi, Keita

    2015-04-18

    Resilience is a major challenge for large-scale systems. It is particularly important for iterative linear solvers, since they take much of the time of many scientific applications. We show that single bit flip errors in the Flexible GMRES iterative linear solver can lead to high computational overhead or even failure to converge to the right answer. Informed by these results, we design and evaluate several strategies for fault tolerance in both inner and outer solvers appropriate across a range of error rates. We implement them, extending Trilinos’ solver library with the Global View Resilience (GVR) programming model, which provides multi-stream snapshots, multi-version data structures with portable and rich error checking/recovery. Lastly, experimental results validate correct execution with low performance overhead under varied error conditions.

  20. The U.S. Geological Survey Modular Ground-Water Model - PCGN: A Preconditioned Conjugate Gradient Solver with Improved Nonlinear Control

    USGS Publications Warehouse

    Naff, Richard L.; Banta, Edward R.

    2008-01-01

    The preconditioned conjugate gradient with improved nonlinear control (PCGN) package provides additional means by which the solution of nonlinear ground-water flow problems can be controlled as compared to existing solver packages for MODFLOW. Picard iteration is used to solve nonlinear ground-water flow equations by iteratively solving a linear approximation of the nonlinear equations. The linear solution is provided by means of the preconditioned conjugate gradient algorithm where preconditioning is provided by the modified incomplete Cholesky algorithm. The incomplete Cholesky scheme incorporates two levels of fill, 0 and 1, in which the pivots can be modified so that the row sums of the preconditioning matrix and the original matrix are approximately equal. A relaxation factor is used to implement the modified pivots, which determines the degree of modification allowed. The effects of fill level and degree of pivot modification are briefly explored by means of a synthetic, heterogeneous finite-difference matrix; results are reported in the final section of this report. The preconditioned conjugate gradient method is coupled with Picard iteration so as to efficiently solve the nonlinear equations associated with many ground-water flow problems. The description of this coupling of the linear solver with Picard iteration is a primary concern of this document.
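
    The coupling of Picard iteration with a preconditioned conjugate gradient solve can be sketched on a toy problem: steady 1D flow with no recharge in which the conductance between neighbouring nodes is proportional to their mean head. Everything below is an assumption made for illustration, including the test problem, the tolerances, and above all the simple Jacobi (diagonal) preconditioner, which replaces the modified incomplete Cholesky preconditioner of the report. None of it is MODFLOW or PCGN code.

```python
import math


def tridiag_matvec(lower, diag, upper, x):
    """y = A x for a tridiagonal A stored as three length-n lists."""
    n = len(x)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(1, n):
        y[i] += lower[i] * x[i - 1]
    for i in range(n - 1):
        y[i] += upper[i] * x[i + 1]
    return y


def pcg(lower, diag, upper, b, x0, tol=1e-12, max_iter=500):
    """Jacobi-preconditioned conjugate gradients for an SPD tridiagonal system."""
    x = x0[:]
    r = [bi - ai for bi, ai in zip(b, tridiag_matvec(lower, diag, upper, x))]
    b_norm = math.sqrt(sum(v * v for v in b)) or 1.0
    z = [ri / di for ri, di in zip(r, diag)]
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        if math.sqrt(sum(v * v for v in r)) <= tol * b_norm:
            break
        ap = tridiag_matvec(lower, diag, upper, p)
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


def picard_heads(h_left, h_right, n, length=1.0, k=1.0, tol=1e-8, max_outer=100):
    """Picard iteration: freeze conductances at the previous heads, solve the linear system, repeat."""
    dx = length / (n + 1)
    h = [h_left + (h_right - h_left) * (i + 1) / (n + 1) for i in range(n)]
    for outer in range(1, max_outer + 1):
        full = [h_left] + h + [h_right]
        # c[j] links nodes j and j+1 of `full` and is evaluated at the previous iterate.
        c = [k * 0.5 * (full[j] + full[j + 1]) / dx ** 2 for j in range(n + 1)]
        diag = [c[i] + c[i + 1] for i in range(n)]
        lower = [0.0] + [-c[i] for i in range(1, n)]
        upper = [-c[i + 1] for i in range(n - 1)] + [0.0]
        b = [0.0] * n
        b[0] += c[0] * h_left
        b[-1] += c[n] * h_right
        h_new = pcg(lower, diag, upper, b, h)  # warm start from the previous heads
        change = max(abs(u - v) for u, v in zip(h_new, h))
        h = h_new
        if change < tol:
            return h, outer
    raise RuntimeError("Picard iteration did not converge")


n = 20
heads, outer_iters = picard_heads(10.0, 8.0, n)
# For this toy problem the squared head varies linearly between the two boundaries.
exact = [math.sqrt(100.0 + (64.0 - 100.0) * (i + 1) / (n + 1)) for i in range(n)]
max_err = max(abs(u - v) for u, v in zip(heads, exact))
```

    The outer loop handles the nonlinearity and the inner PCG call handles each linearized system. Warm-starting the inner solve from the previous heads means later outer iterations need progressively less linear-solver work.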

  1. Application of an unstructured grid flow solver to planes, trains and automobiles

    NASA Technical Reports Server (NTRS)

    Spragle, Gregory S.; Smith, Wayne A.; Yadlin, Yoram

    1993-01-01

    Rampant, an unstructured flow solver developed at Fluent Inc., is used to compute three-dimensional, viscous, turbulent, compressible flow fields within complex solution domains. Rampant is an explicit, finite-volume flow solver capable of computing flow fields using either triangular (2d) or tetrahedral (3d) unstructured grids. Local time stepping, implicit residual smoothing, and multigrid techniques are used to accelerate the convergence of the explicit scheme. The paper describes the Rampant flow solver and presents flow field solutions about a plane, train, and automobile.

  2. A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fisicaro, G.; Goedecker, S.; Genovese, L.

    2016-01-07

    The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann, allowing to solve iteratively the minimization problem with some ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.

  3. A generalized Poisson and Poisson-Boltzmann solver for electrostatic environments.

    PubMed

    Fisicaro, G; Genovese, L; Andreussi, O; Marzari, N; Goedecker, S

    2016-01-07

    The computational study of chemical reactions in complex, wet environments is critical for applications in many fields. It is often essential to study chemical reactions in the presence of applied electrochemical potentials, taking into account the non-trivial electrostatic screening coming from the solvent and the electrolytes. As a consequence, the electrostatic potential has to be found by solving the generalized Poisson and the Poisson-Boltzmann equations for neutral and ionic solutions, respectively. In the present work, solvers for both problems have been developed. A preconditioned conjugate gradient method has been implemented for the solution of the generalized Poisson equation and the linear regime of the Poisson-Boltzmann equation, making it possible to solve the minimization problem iteratively with about ten iterations of the ordinary Poisson equation solver. In addition, a self-consistent procedure enables us to solve the non-linear Poisson-Boltzmann problem. Both solvers exhibit very high accuracy and parallel efficiency and allow for the treatment of periodic, free, and slab boundary conditions. The solver has been integrated into the BigDFT and Quantum-ESPRESSO electronic-structure packages and will be released as an independent program, suitable for integration in other codes.
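
    The preconditioned conjugate gradient iteration mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a 1D operator -d/dx(eps du/dx) with zero Dirichlet ends and unit grid spacing, hypothetical function names (`apply_operator`, `pcg`), and a simple diagonal (Jacobi) preconditioner swapped in for brevity. The abstract suggests that the ordinary Poisson solver plays the preconditioning role in the paper itself, which also treats periodic, free, and slab boundary conditions.

```python
def apply_operator(eps_face, u):
    """Apply -d/dx(eps du/dx) on a unit-spaced 1D grid with zero Dirichlet ends.

    eps_face holds len(u) + 1 positive permittivity values on the cell faces
    (an assumed discretization, not taken from the paper).
    """
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(eps_face[i] * (u[i] - left) + eps_face[i + 1] * (u[i] - right))
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(eps_face, rhs, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient with a Jacobi (diagonal) preconditioner."""
    n = len(rhs)
    diag = [eps_face[i] + eps_face[i + 1] for i in range(n)]  # operator diagonal
    u = [0.0] * n
    r = list(rhs)                                  # residual for the zero initial guess
    if dot(r, r) ** 0.5 < tol:
        return u, 0
    z = [ri / di for ri, di in zip(r, diag)]       # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        ap = apply_operator(eps_face, p)
        alpha = rz / dot(p, ap)
        u = [ui + alpha * pi for ui, pi in zip(u, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if dot(r, r) ** 0.5 < tol:
            return u, it
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return u, max_iter


# Usage: a smoothly varying permittivity and a uniform source term.
n = 16
eps_face = [1.0 + 0.1 * i for i in range(n + 1)]
u, iterations = pcg(eps_face, [1.0] * n)
```

    The closer the preconditioner is to the inverse of the operator, the fewer iterations are needed, which is consistent with the roughly ten iterations quoted in the abstract for a much stronger preconditioner than the diagonal one used here.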

  4. Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers

    NASA Astrophysics Data System (ADS)

    Tang, Yu-Hang; Kudo, Shuhei; Bian, Xin; Li, Zhen; Karniadakis, George Em

    2015-09-01

    Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant difficulties in practice. In this paper we present a C++ library, i.e. the Multiscale Universal Interface (MUI), which is capable of facilitating the coupling effort for a wide range of multiscale simulations. The library adopts a header-only form with minimal external dependency and hence can be easily dropped into existing codes. A data sampler concept is introduced, combined with a hybrid dynamic/static typing mechanism, to create an easily customizable framework for solver-independent data interpretation. The library integrates MPI MPMD support and an asynchronous communication protocol to handle inter-solver information exchange irrespective of the solvers' own MPI awareness. Template metaprogramming is heavily employed to simultaneously improve runtime performance and code flexibility. We validated the library by solving three different multiscale problems, which also serve to demonstrate the flexibility of the framework in handling heterogeneous models and solvers. In the first example, a Couette flow was simulated using two concurrently coupled Smoothed Particle Hydrodynamics (SPH) simulations of different spatial resolutions. In the second example, we coupled the deterministic SPH method with the stochastic Dissipative Particle Dynamics (DPD) method to study the effect of surface grafting on the hydrodynamics properties on the surface. In the third example, we consider conjugate heat transfer between a solid domain and a fluid domain by coupling the particle-based energy-conserving DPD (eDPD) method with the Finite Element Method (FEM).

  5. Optimal sparse approximation with integrate and fire neurons.

    PubMed

    Shapero, Samuel; Zhu, Mengchen; Hasler, Jennifer; Rozell, Christopher

    2014-08-01

    Sparse approximation is a hypothesized coding strategy where a population of sensory neurons (e.g. V1) encodes a stimulus using as few active neurons as possible. We present the Spiking LCA (locally competitive algorithm), a rate encoded Spiking Neural Network (SNN) of integrate and fire neurons that calculate sparse approximations. The Spiking LCA is designed to be equivalent to the nonspiking LCA, an analog dynamical system that converges exponentially on an ℓ1-norm sparse approximation. We show that the firing rate of the Spiking LCA converges on the same solution as the analog LCA, with an error inversely proportional to the sampling time. We simulate in NEURON a network of 128 neuron pairs that encode 8 × 8 pixel image patches, demonstrating that the network converges to nearly optimal encodings within 20 ms of biological time. We also show that when using more biophysically realistic parameters in the neurons, the gain function encourages additional ℓ0-norm sparsity in the encoding, relative both to ideal neurons and digital solvers.
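
    For orientation, the nonspiking (analog) LCA that the Spiking LCA is designed to match can be sketched with a forward-Euler integration of its dynamics. This is a minimal illustration, not the authors' NEURON model: the function names, the step size, and the assumption of unit-norm dictionary atoms are all hypothetical choices made here, and the soft threshold is the standard activation for the ℓ1-norm case.

```python
def soft_threshold(u, lam):
    """Soft-threshold activation, the usual choice for an l1-norm penalty."""
    if u > lam:
        return u - lam
    if u < -lam:
        return u + lam
    return 0.0


def lca(atoms, signal, lam=0.1, step=0.1, n_steps=500):
    """Forward-Euler integration of analog LCA dynamics.

    atoms  : list of dictionary atoms (each a list of floats, assumed unit norm)
    signal : the stimulus to encode
    Returns the thresholded coefficients after n_steps updates.
    """
    m = len(atoms)
    # Feed-forward drive: projection of the signal onto each atom.
    drive = [sum(a * s for a, s in zip(atom, signal)) for atom in atoms]
    # Gram matrix: overlap between atoms drives the lateral inhibition.
    gram = [[sum(x * y for x, y in zip(atoms[i], atoms[j])) for j in range(m)]
            for i in range(m)]
    u = [0.0] * m  # internal (membrane-like) states
    for _ in range(n_steps):
        a = [soft_threshold(ui, lam) for ui in u]
        inhibit = [sum(gram[i][j] * a[j] for j in range(m) if j != i)
                   for i in range(m)]
        u = [ui + step * (drive[i] - ui - inhibit[i]) for i, ui in enumerate(u)]
    return [soft_threshold(ui, lam) for ui in u]


# Usage: with an orthonormal dictionary there is no inhibition, so the code
# settles at the soft-thresholded projections of the signal.
identity_atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coefficients = lca(identity_atoms, [1.0, 0.05, -0.5])
```

    Atoms whose drive stays below the threshold never become active, which is how the dynamics produce a sparse code.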

  6. Boosting Stochastic Problem Solvers Through Online Self-Analysis of Performance

    DTIC Science & Technology

    2003-07-21

    Boosting Stochastic Problem Solvers Through Online Self-Analysis of Performance Vincent A. Cicirello CMU-RI-TR-03-27 Submitted in partial fulfillment...lead to the development of a search control framework, called QD-BEACON, that uses online-generated statistical models of search performance to

  7. Application of NASA General-Purpose Solver to Large-Scale Computations in Aeroacoustics

    NASA Technical Reports Server (NTRS)

    Watson, Willie R.; Storaasli, Olaf O.

    2004-01-01

    Of several iterative and direct equation solvers evaluated previously for computations in aeroacoustics, the most promising was the NASA-developed General-Purpose Solver (winner of NASA's 1999 software of the year award). This paper presents detailed, single-processor statistics of the performance of this solver, which has been tailored and optimized for large-scale aeroacoustic computations. The statistics, compiled using an SGI ORIGIN 2000 computer with 12 Gb available memory (RAM) and eight available processors, are the central processing unit time, RAM requirements, and solution error. The equation solver is capable of solving 10 thousand complex unknowns in as little as 0.01 sec using 0.02 Gb RAM, and 8.4 million complex unknowns in slightly less than 3 hours using all 12 Gb. This latter solution is the largest aeroacoustics problem solved to date with this technique. The study was unable to detect any noticeable error in the solution, since noise levels predicted from these solution vectors are in excellent agreement with the noise levels computed from the exact solution. The equation solver provides a means for obtaining numerical solutions to aeroacoustics problems in three dimensions.

  8. Validation of the Chemistry Module for the Euler Solver in Unified Flow Solver

    DTIC Science & Technology

    2012-03-01

    traveling through the atmosphere there are three types of flow regimes that exist; the first is the continuum regime, second is the rarefied regime and...The second method has been used in a program called Unified Flow Solver (UFS). UFS is currently being developed under collaborative efforts the Air...thermal non-equilibrium case and finally to a thermo-chemical non-equilibrium case. The data from the simulations will be compared to a second code

  9. Performance of uncertainty quantification methodologies and linear solvers in cardiovascular simulations

    NASA Astrophysics Data System (ADS)

    Seo, Jongmin; Schiavazzi, Daniele; Marsden, Alison

    2017-11-01

    Cardiovascular simulations are increasingly used in clinical decision making, surgical planning, and disease diagnostics. Patient-specific modeling and simulation typically proceeds through a pipeline from anatomic model construction using medical image data to blood flow simulation and analysis. To provide confidence intervals on simulation predictions, we use an uncertainty quantification (UQ) framework to analyze the effects of numerous uncertainties that stem from clinical data acquisition, modeling, material properties, and boundary condition selection. However, UQ poses a computational challenge requiring multiple evaluations of the Navier-Stokes equations in complex 3-D models. To achieve efficiency in UQ problems with many function evaluations, we implement and compare a range of iterative linear solver and preconditioning techniques in our flow solver. We then discuss applications to patient-specific cardiovascular simulation and how the problem/boundary condition formulation in the solver affects the selection of the most efficient linear solver. Finally, we discuss performance improvements in the context of uncertainty propagation. Support from National Institute of Health (R01 EB018302) is greatly appreciated.

  10. Multiscale Universal Interface: A concurrent framework for coupling heterogeneous solvers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, Yu-Hang, E-mail: yuhang_tang@brown.edu; Kudo, Shuhei, E-mail: shuhei-kudo@outlook.jp; Bian, Xin, E-mail: xin_bian@brown.edu

    2015-09-15

    Concurrently coupled numerical simulations using heterogeneous solvers are powerful tools for modeling multiscale phenomena. However, major modifications to existing codes are often required to enable such simulations, posing significant difficulties in practice. In this paper we present a C++ library, i.e. the Multiscale Universal Interface (MUI), which is capable of facilitating the coupling effort for a wide range of multiscale simulations. The library adopts a header-only form with minimal external dependency and hence can be easily dropped into existing codes. A data sampler concept is introduced, combined with a hybrid dynamic/static typing mechanism, to create an easily customizable framework for solver-independent data interpretation. The library integrates MPI MPMD support and an asynchronous communication protocol to handle inter-solver information exchange irrespective of the solvers' own MPI awareness. Template metaprogramming is heavily employed to simultaneously improve runtime performance and code flexibility. We validated the library by solving three different multiscale problems, which also serve to demonstrate the flexibility of the framework in handling heterogeneous models and solvers. In the first example, a Couette flow was simulated using two concurrently coupled Smoothed Particle Hydrodynamics (SPH) simulations of different spatial resolutions. In the second example, we coupled the deterministic SPH method with the stochastic Dissipative Particle Dynamics (DPD) method to study the effect of surface grafting on the hydrodynamics properties on the surface. In the third example, we consider conjugate heat transfer between a solid domain and a fluid domain by coupling the particle-based energy-conserving DPD (eDPD) method with the Finite Element Method (FEM).

  11. Decision Engines for Software Analysis Using Satisfiability Modulo Theories Solvers

    NASA Technical Reports Server (NTRS)

    Bjorner, Nikolaj

    2010-01-01

    The area of software analysis, testing and verification is now undergoing a revolution thanks to the use of automated and scalable support for logical methods. A well-recognized premise is that at the core of software analysis engines is invariably a component using logical formulas for describing states and transformations between system states. The process of using this information for discovering and checking program properties (including such important properties as safety and security) amounts to automatic theorem proving. In particular, theorem provers that directly support common software constructs offer a compelling basis. Such provers are commonly called satisfiability modulo theories (SMT) solvers. Z3 is a state-of-the-art SMT solver. It is developed at Microsoft Research. It can be used to check the satisfiability of logical formulas over one or more theories such as arithmetic, bit-vectors, lists, records and arrays. The talk describes some of the technology behind modern SMT solvers, including the solver Z3. Z3 is currently mainly targeted at solving problems that arise in software analysis and verification. It has been applied to various contexts, such as systems for dynamic symbolic simulation (Pex, SAGE, Vigilante), for program verification and extended static checking (Spec#/Boogie, VCC, HAVOC), for software model checking (Yogi, SLAM), model-based design (FORMULA), security protocol code (F7), program run-time analysis and invariant generation (VS3). We will describe how it integrates support for a variety of theories that arise naturally in the context of the applications. There are several new promising avenues and the talk will touch on some of these and the challenges related to SMT solvers. Proceedings

  12. AQUASOL: An efficient solver for the dipolar Poisson–Boltzmann–Langevin equation

    PubMed Central

    Koehl, Patrice; Delarue, Marc

    2010-01-01

    The Poisson–Boltzmann (PB) formalism is among the most popular approaches to modeling the solvation of molecules. It assumes a continuum model for water, leading to a dielectric permittivity that only depends on position in space. In contrast, the dipolar Poisson–Boltzmann–Langevin (DPBL) formalism represents the solvent as a collection of orientable dipoles with nonuniform concentration; this leads to a nonlinear permittivity function that depends both on the position and on the local electric field at that position. The differences in the assumptions underlying these two models lead to significant differences in the equations they generate. The PB equation is a second order, elliptic, nonlinear partial differential equation (PDE). Its response coefficients correspond to the dielectric permittivity and are therefore constant within each subdomain of the system considered (i.e., inside and outside of the molecules considered). While the DPBL equation is also a second order, elliptic, nonlinear PDE, its response coefficients are nonlinear functions of the electrostatic potential. Many solvers have been developed for the PB equation; to our knowledge, none of these can be directly applied to the DPBL equation. The methods they use may adapt to the difference; their implementations, however, are PBE specific. We adapted the PBE solver originally developed by Holst and Saied [J. Comput. Chem. 16, 337 (1995)] to the problem of solving the DPBL equation. This solver uses a truncated Newton method with a multigrid preconditioner. Numerical evidence suggests that it converges for the DPBL equation and that the convergence is superlinear. It is found, however, to be slow and greedy in memory requirement for problems commonly encountered in computational biology and computational chemistry. To circumvent these problems, we propose two variants, a quasi-Newton solver based on a simplified, inexact Jacobian and an iterative self-consistent solver that is based directly on

  13. AQUASOL: An efficient solver for the dipolar Poisson-Boltzmann-Langevin equation.

    PubMed

    Koehl, Patrice; Delarue, Marc

    2010-02-14

    The Poisson-Boltzmann (PB) formalism is among the most popular approaches to modeling the solvation of molecules. It assumes a continuum model for water, leading to a dielectric permittivity that only depends on position in space. In contrast, the dipolar Poisson-Boltzmann-Langevin (DPBL) formalism represents the solvent as a collection of orientable dipoles with nonuniform concentration; this leads to a nonlinear permittivity function that depends both on the position and on the local electric field at that position. The differences in the assumptions underlying these two models lead to significant differences in the equations they generate. The PB equation is a second order, elliptic, nonlinear partial differential equation (PDE). Its response coefficients correspond to the dielectric permittivity and are therefore constant within each subdomain of the system considered (i.e., inside and outside of the molecules considered). While the DPBL equation is also a second order, elliptic, nonlinear PDE, its response coefficients are nonlinear functions of the electrostatic potential. Many solvers have been developed for the PB equation; to our knowledge, none of these can be directly applied to the DPBL equation. The methods they use may adapt to the difference; their implementations, however, are PBE specific. We adapted the PBE solver originally developed by Holst and Saied [J. Comput. Chem. 16, 337 (1995)] to the problem of solving the DPBL equation. This solver uses a truncated Newton method with a multigrid preconditioner. Numerical evidence suggests that it converges for the DPBL equation and that the convergence is superlinear. It is found, however, to be slow and greedy in memory requirement for problems commonly encountered in computational biology and computational chemistry. To circumvent these problems, we propose two variants, a quasi-Newton solver based on a simplified, inexact Jacobian and an iterative self-consistent solver that is based directly on the PBE

  14. A high performance linear equation solver on the VPP500 parallel supercomputer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nakanishi, Makoto; Ina, Hiroshi; Miura, Kenichi

    1994-12-31

    This paper describes the implementation of two high performance linear equation solvers developed for the Fujitsu VPP500, a distributed memory parallel supercomputer system. The solvers take advantage of the key architectural features of VPP500: (1) scalability for an arbitrary number of processors up to 222 processors, (2) flexible data transfer among processors provided by a crossbar interconnection network, (3) vector processing capability on each processor, and (4) overlapped computation and transfer. The general linear equation solver based on the blocked LU decomposition method achieves 120.0 GFLOPS performance with 100 processors in the LINPACK Highly Parallel Computing benchmark.

  15. LAPACKrc: Fast linear algebra kernels/solvers for FPGA accelerators

    NASA Astrophysics Data System (ADS)

    Gonzalez, Juan; Núñez, Rafael C.

    2009-07-01

    We present LAPACKrc, a family of FPGA-based linear algebra solvers able to achieve more than 100x speedup per commodity processor on certain problems. LAPACKrc subsumes some of the LAPACK and ScaLAPACK functionalities, and it also incorporates sparse direct and iterative matrix solvers. Current LAPACKrc prototypes demonstrate between 40x-150x speedup compared against top-of-the-line hardware/software systems. A technology roadmap is in place to validate current performance of LAPACKrc in HPC applications, and to increase the computational throughput by factors of hundreds within the next few years.

  16. Implementation of density-based solver for all speeds in the framework of OpenFOAM

    NASA Astrophysics Data System (ADS)

    Shen, Chun; Sun, Fengxian; Xia, Xinlin

    2014-10-01

    In the framework of the open source CFD code OpenFOAM, a density-based solver for flow fields at all speeds is developed. In this solver the preconditioned all speeds AUSM+(P) scheme is adopted and the dual time scheme is implemented to complete the unsteady process. Parallel computation could be implemented to accelerate the solving process. Different interface reconstruction algorithms are implemented, and their accuracy with respect to convection is compared. Three benchmark tests of lid-driven cavity flow, flow crossing over a bump, and flow over a forward-facing step are presented to show the accuracy of the AUSM+(P) solver for low-speed incompressible flow, transonic flow, and supersonic/hypersonic flow. Firstly, for the lid-driven cavity flow, the computational results obtained by different interface reconstruction algorithms are compared. It is indicated that the one-dimensional reconstruction scheme adopted in this solver possesses high accuracy and that the solver developed in this paper can effectively capture the features of low-speed incompressible flow. Then, via the test cases regarding the flow crossing over a bump and over a forward-facing step, the ability to capture characteristics of the transonic and supersonic/hypersonic flows is confirmed. The forward-facing step proves to be the most challenging for the preconditioned solvers with and without the dual time scheme. Nonetheless, the solvers described in this paper reproduce the main features of this flow, including the evolution of the initial transient.

  17. Preconditioned implicit solvers for the Navier-Stokes equations on distributed-memory machines

    NASA Technical Reports Server (NTRS)

    Ajmani, Kumud; Liou, Meng-Sing; Dyson, Rodger W.

    1994-01-01

    The GMRES method is parallelized, and combined with local preconditioning to construct an implicit parallel solver to obtain steady-state solutions for the Navier-Stokes equations of fluid flow on distributed-memory machines. The new implicit parallel solver is designed to preserve the convergence rate of the equivalent 'serial' solver. A static domain-decomposition is used to partition the computational domain amongst the available processing nodes of the parallel machine. The SPMD (Single-Program Multiple-Data) programming model is combined with message-passing tools to develop the parallel code on a 32-node Intel Hypercube and a 512-node Intel Delta machine. The implicit parallel solver is validated for internal and external flow problems, and is found to compare identically with flow solutions obtained on a Cray Y-MP/8. A peak computational speed of 2300 MFlops/sec has been achieved on 512 nodes of the Intel Delta machine, for a problem size of 1024 K equations (256 K grid points).

  18. Efficiency optimization of a fast Poisson solver in beam dynamics simulation

    NASA Astrophysics Data System (ADS)

    Zheng, Dawei; Pöplau, Gisela; van Rienen, Ursula

    2016-01-01

    Calculating the solution of Poisson's equation relating to space charge force is still the major time consumption in beam dynamics simulations and calls for further improvement. In this paper, we summarize a classical fast Poisson solver in beam dynamics simulations: the integrated Green's function method. We introduce three optimization steps of the classical Poisson solver routine: using the reduced integrated Green's function instead of the integrated Green's function; using the discrete cosine transform instead of discrete Fourier transform for the Green's function; using a novel fast convolution routine instead of an explicitly zero-padded convolution. The new Poisson solver routine preserves the advantages of fast computation and high accuracy. This provides a fast routine for high performance calculation of the space charge effect in accelerators.

  19. On unstructured grids and solvers

    NASA Technical Reports Server (NTRS)

    Barth, T. J.

    1990-01-01

    The fundamentals and the state-of-the-art technology for unstructured grids and solvers are highlighted. Algorithms and techniques pertinent to mesh generation are discussed. It is shown that grid generation and grid manipulation schemes rely on fast multidimensional searching. Flow solution techniques for the Euler equations, which can be derived from the integral form of the equations, are discussed. Sample calculations are also provided.

  20. GPU Implementation of a Viscous Flow Solver on Unstructured Grids

    NASA Astrophysics Data System (ADS)

    Xu, Tianhao; Chen, Long

    2016-06-01

    Graphics processing units have gained popularity in scientific computing over the past several years due to their outstanding parallel computing capability. Computational fluid dynamics applications involve large amounts of calculation, so a recent GPU card, whose peak computing performance and memory bandwidth are much better than those of a contemporary high-end CPU, is preferable. We herein focus on the detailed implementation of our GPU-targeted Reynolds-averaged Navier-Stokes equations solver based on the finite-volume method. The solver employs a vertex-centered scheme on unstructured grids so that it can handle complex topologies. Multiple optimizations are carried out to improve the memory access performance and kernel utilization. Both steady and unsteady flow simulation cases are carried out using an explicit Runge-Kutta scheme. The GPU-accelerated solver in this paper is demonstrated to have competitive advantages over its CPU-targeted counterpart.

  1. Computation of three-dimensional multiphase flow dynamics by Fully-Coupled Immersed Flow (FCIF) solver

    NASA Astrophysics Data System (ADS)

    Miao, Sha; Hendrickson, Kelli; Liu, Yuming

    2017-12-01

    This work presents a Fully-Coupled Immersed Flow (FCIF) solver for the three-dimensional simulation of fluid-fluid interaction by coupling two distinct flow solvers using an Immersed Boundary (IB) method. The FCIF solver captures dynamic interactions between two fluids with disparate flow properties, while retaining the desirable simplicity of non-boundary-conforming grids. For illustration, we couple an IB-based unsteady Reynolds Averaged Navier Stokes (uRANS) simulator with a depth-integrated (long-wave) solver for the application of slug development with turbulent gas and laminar liquid. We perform a series of validations including turbulent/laminar flows over prescribed wavy boundaries and freely-evolving viscous fluids. These confirm the effectiveness and accuracy of both one-way and two-way coupling in the FCIF solver. Finally, we present a simulation example of the evolution from a stratified turbulent/laminar flow through the initiation of a slug that nearly bridges the channel. The results show both the interfacial wave dynamics excited by the turbulent gas forcing and the influence of the liquid on the gas turbulence. These results demonstrate that the FCIF solver effectively captures the essential physics of gas-liquid interaction and can serve as a useful tool for the mechanistic study of slug generation in two-phase gas/liquid flows in channels and pipes.

  2. A fast mass spring model solver for high-resolution elastic objects

    NASA Astrophysics Data System (ADS)

    Zheng, Mianlun; Yuan, Zhiyong; Zhu, Weixu; Zhang, Guian

    2017-03-01

    Real-time simulation of elastic objects is of great importance for computer graphics and virtual reality applications. The fast mass spring model solver can achieve visually realistic simulation in an efficient way. Unfortunately, this method suffers from resolution limitations and lack of mechanical realism for a surface geometry model, which greatly restricts its application. To tackle these problems, in this paper we propose a fast mass spring model solver for high-resolution elastic objects. First, we project the complex surface geometry model into a set of uniform grid cells as cages through *cages mean value coordinate method to reflect its internal structure and mechanics properties. Then, we replace the original Cholesky decomposition method in the fast mass spring model solver with a conjugate gradient method, which can make the fast mass spring model solver more efficient for detailed surface geometry models. Finally, we propose a graphics processing unit accelerated parallel algorithm for the conjugate gradient method. Experimental results show that our method can realize efficient deformation simulation of 3D elastic objects with visual reality and physical fidelity, which has a great potential for applications in computer animation.

  3. Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver

    DOE PAGES

    Zhang, Bo; Lu, Benzhuo; Cheng, Xiaolin; ...

    2013-01-01

    This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numerical results.

  4. A GPU accelerated and error-controlled solver for the unbounded Poisson equation in three dimensions

    NASA Astrophysics Data System (ADS)

    Exl, Lukas

    2017-12-01

    An efficient solver for the three dimensional free-space Poisson equation is presented. The underlying numerical method is based on finite Fourier series approximation. While the error of all involved approximations can be fully controlled, the overall computation error is driven by the convergence of the finite Fourier series of the density. For smooth and fast-decaying densities the proposed method will be spectrally accurate. The method scales with O(N log N) operations, where N is the total number of discretization points in the Cartesian grid. The majority of the computational costs come from fast Fourier transforms (FFT), which makes it ideal for GPU computation. Several numerical computations on CPU and GPU validate the method and show efficiency and convergence behavior. Tests are performed using the Vienna Scientific Cluster 3 (VSC3). A free MATLAB implementation for CPU and GPU is provided to the interested community.
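
    The O(N log N) cost quoted above comes from doing the work in Fourier space. The sketch below illustrates that idea in a deliberately simplified setting: a 1D periodic problem -u'' = f with a zero-mean solution, solved with a small recursive radix-2 FFT written in pure Python. It is not the paper's method, which treats the 3D free-space (unbounded) problem with controlled approximation error; the function names and the periodic boundary treatment are assumptions made for illustration.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def solve_poisson_periodic(f, length):
    """Solve -u'' = f on a periodic interval of the given length.

    f holds samples on a uniform grid (power-of-two count). The mean of u is
    set to zero, since the periodic problem fixes u only up to a constant.
    """
    n = len(f)
    f_hat = fft([complex(v) for v in f])
    u_hat = [0j] * n
    for k in range(1, n):
        m = k if k <= n // 2 else k - n          # signed mode number
        kappa = 2.0 * math.pi * m / length       # angular wavenumber
        u_hat[k] = f_hat[k] / (kappa * kappa)    # divide by the symbol of -d^2/dx^2
    u = fft(u_hat, inverse=True)
    return [v.real / n for v in u]               # normalize the inverse transform


# Usage: f = sin(x) + 9 cos(3x) has the exact zero-mean solution sin(x) + cos(3x).
n = 32
length = 2.0 * math.pi
grid = [length * j / n for j in range(n)]
u = solve_poisson_periodic([math.sin(x) + 9.0 * math.cos(3.0 * x) for x in grid], length)
```

    For smooth data resolved by the grid, the error of such a Fourier solve is at the level of rounding, which is the sense in which the abstract calls the method spectrally accurate for smooth, fast-decaying densities.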

  5. An iterative Riemann solver for systems of hyperbolic conservation laws, with application to hyperelastic solid mechanics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Miller, Gregory H.

    2003-08-06

    In this paper we present a general iterative method for the solution of the Riemann problem for hyperbolic systems of PDEs. The method is based on the multiple shooting method for free boundary value problems. We demonstrate the method by solving one-dimensional Riemann problems for hyperelastic solid mechanics. Even for conditions representative of routine laboratory conditions and military ballistics, dramatic differences are seen between the exact and approximate Riemann solution. The greatest discrepancy arises from misallocation of energy between compressional and thermal modes by the approximate solver, resulting in nonphysical entropy and temperature estimates. Several pathological conditions arise in common practice, and modifications to the method to handle these are discussed. These include points where genuine nonlinearity is lost, degeneracies, and eigenvector deficiencies that occur upon melting.

  6. Parallelizing alternating direction implicit solver on GPUs

    USDA-ARS's Scientific Manuscript database

    We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...

  7. A fast direct solver for a class of two-dimensional separable elliptic equations on the sphere

    NASA Technical Reports Server (NTRS)

    Moorthi, Shrinivas; Higgins, R. Wayne

    1992-01-01

    An efficient, direct, second-order solver for the discrete solution of two-dimensional separable elliptic equations on the sphere is presented. The method involves a Fourier transformation in longitude and a direct solution of the resulting coupled second-order finite difference equations in latitude. The solver is made efficient by vectorizing over longitudinal wavenumber and by using a vectorized fast Fourier transform routine. It is evaluated using a prescribed solution method and compared with a multigrid solver and the standard direct solver from FISHPACK.

  8. Efficient iterative methods applied to the solution of transonic flows

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wissink, A.M.; Lyrintzis, A.S.; Chronopoulos, A.T.

    1996-02-01

    We investigate the use of an inexact Newton's method to solve the potential equations in the transonic regime. As a test case, we solve the two-dimensional steady transonic small disturbance equation. Approximate factorization/ADI techniques have traditionally been employed for implicit solutions of this nonlinear equation. Instead, we apply Newton's method using an exact analytical determination of the Jacobian with preconditioned conjugate gradient-like iterative solvers for solution of the linear systems in each Newton iteration. Two iterative solvers are tested: a block s-step version of the classical Orthomin(k) algorithm called orthogonal s-step Orthomin (OSOmin) and the well-known GMRES method. The preconditioner is a vectorizable and parallelizable version of incomplete LU (ILU) factorization. Efficiency of the Newton-Iterative method on vector and parallel computer architectures is the main issue addressed. In vectorized tests on a single processor of the Cray C-90, the performance of Newton-OSOmin is superior to Newton-GMRES and a more traditional monotone AF/ADI method (MAF) for a variety of transonic Mach numbers and mesh sizes. Newton-GMRES is superior to MAF for some cases. The parallel performance of the Newton method is also found to be very good on multiple processors of the Cray C-90 and on the massively parallel Thinking Machines CM-5, where very fast execution rates (up to 9 Gflops) are found for large problems. 38 refs., 14 figs., 7 tabs.
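
    The Newton-iterative idea (an outer Newton loop whose linear systems are solved only approximately by an inner Krylov iteration) can be sketched as below. This is an illustration under stated assumptions, not the method of the paper: the inner solver is plain unpreconditioned conjugate gradients rather than OSOmin or GMRES with ILU, and the model problem is a hypothetical 1D equation with a symmetric positive definite Jacobian, not the transonic small disturbance equation. All function names are invented for this example.

```python
import math


def residual(x, h):
    """F(x)_i = (2 x_i - x_{i-1} - x_{i+1}) / h^2 + x_i^3 - 1, zero boundary values."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * x[i] - left - right) / h**2 + x[i] ** 3 - 1.0)
    return out


def jacobian_times(x, v, h):
    """Apply the analytical Jacobian of F at x to the vector v."""
    n = len(x)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * v[i] - left - right) / h**2 + 3.0 * x[i] ** 2 * v[i])
    return out


def norm(v):
    return math.sqrt(sum(a * a for a in v))


def conjugate_gradient(apply_a, b, rel_tol, max_iter):
    """Approximately solve A s = b, stopping once ||b - A s|| <= rel_tol * ||b||."""
    s = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rr = sum(a * a for a in r)
    target = rel_tol * math.sqrt(rr)
    for _ in range(max_iter):
        if math.sqrt(rr) <= target:
            break
        ap = apply_a(p)
        alpha = rr / sum(pi * api for pi, api in zip(p, ap))
        s = [si + alpha * pi for si, pi in zip(s, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = sum(a * a for a in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return s


def inexact_newton(n=20, forcing=0.1, tol=1e-10, max_newton=50):
    """Outer Newton loop; each step solves J s = -F only to relative accuracy `forcing`."""
    h = 1.0 / (n + 1)
    x = [0.0] * n
    for iteration in range(max_newton):
        f = residual(x, h)
        if norm(f) < tol:
            return x, iteration
        step = conjugate_gradient(
            lambda v: jacobian_times(x, v, h), [-fi for fi in f], forcing, 10 * n
        )
        x = [xi + si for xi, si in zip(x, step)]
    return x, max_newton
```

    The loose inner tolerance (`forcing`) is the design choice that makes the method "inexact": early Newton steps do not pay for a fully converged linear solve.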

  9. Advanced Fast 3-D Electromagnetic Solver for Microwave Tomography Imaging.

    PubMed

    Simonov, Nikolai; Kim, Bo-Ra; Lee, Kwang-Jae; Jeon, Soon-Ik; Son, Seong-Ho

    2017-10-01

    This paper describes a fast-forward electromagnetic solver (FFS) for the image reconstruction algorithm of our microwave tomography system. Our apparatus is a preclinical prototype of a biomedical imaging system, designed for the purpose of early breast cancer detection. It operates in the 3-6-GHz frequency band using a circular array of probe antennas immersed in a matching liquid; it produces image reconstructions of the permittivity and conductivity profiles of the breast under examination. Our reconstruction algorithm solves the electromagnetic (EM) inverse problem and takes into account the real EM properties of the probe antenna array as well as the influence of the patient's body and that of the upper metal screen sheet. This FFS algorithm is much faster than conventional EM simulation solvers. In comparison, in the same PC, the CST solver takes ~45 min, while the FFS takes ~1 s of effective simulation time for the same EM model of a numerical breast phantom.

  10. Study of the adaptive refinement on an open source 2D shallow-water flow solver using quadtree grid for flash flood simulations.

    NASA Astrophysics Data System (ADS)

    Kirstetter, G.; Popinet, S.; Fullana, J. M.; Lagrée, P. Y.; Josserand, C.

    2015-12-01

    The full resolution of the shallow-water equations for modeling flash floods may have a high computational cost, so the majority of flood simulation software used for flood forecasting relies on a simplification of this model: 1D approximations, diffusive or kinematic wave approximations, or exotic models using non-physical free parameters. These kinds of approximations save a lot of computational time by sacrificing, in an unquantified way, the precision of the simulations. To drastically reduce the cost of such 2D simulations while quantifying the loss of precision, we propose a 2D shallow-water flow solver built with the open source code Basilisk [1], which uses adaptive refinement on a quadtree grid. This solver uses a well-balanced central-upwind scheme, which is second order in time and space, and treats the friction and rain terms implicitly in a finite volume approach. We demonstrate the validity of our simulation on the case of the flood of Tewkesbury (UK) that occurred in July 2007, as shown in Fig. 1. For this case, a systematic study of the impact of the chosen criterion for adaptive refinement is performed. The criterion with the best computational time / precision ratio is proposed. Finally, we present the power law giving the computational time with respect to the maximum resolution, and we show that this law for our 2D simulation is close to that of a 1D simulation, thanks to the fractal dimension of the topography. [1] http://basilisk.fr/

  11. Implementation of a parallel unstructured Euler solver on the CM-5

    NASA Technical Reports Server (NTRS)

    Morano, Eric; Mavriplis, D. J.

    1995-01-01

    An efficient unstructured 3D Euler solver is parallelized on a Thinking Machine Corporation Connection Machine 5, distributed memory computer with vectoring capability. In this paper, the single instruction multiple data (SIMD) strategy is employed through the use of the CM Fortran language and the CMSSL scientific library. The performance of the CMSSL mesh partitioner is evaluated and the overall efficiency of the parallel flow solver is discussed.

  12. An Upwind Solver for the National Combustion Code

    NASA Technical Reports Server (NTRS)

    Sockol, Peter M.

    2011-01-01

    An upwind solver is presented for the unstructured grid National Combustion Code (NCC). The compressible Navier-Stokes equations with time-derivative preconditioning and preconditioned flux-difference splitting of the inviscid terms are used. First order derivatives are computed on cell faces and used to evaluate the shear stresses and heat fluxes. A new flux limiter uses these same first order derivatives in the evaluation of left and right states used in the flux-difference splitting. The k-epsilon turbulence equations are solved with the same second-order method. The new solver has been installed in a recent version of NCC and the resulting code has been tested successfully in 2D on two laminar cases with known solutions and one turbulent case with experimental data.

  13. Parallel-vector out-of-core equation solver for computational mechanics

    NASA Technical Reports Server (NTRS)

    Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.

    1993-01-01

    A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/output (I/O) time is reduced by using the asynchronous BUFFER IN and BUFFER OUT, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance the performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.

  14. Calculating qP-wave traveltimes in 2-D TTI media by high-order fast sweeping methods with a numerical quartic equation solver

    NASA Astrophysics Data System (ADS)

    Han, Song; Zhang, Wei; Zhang, Jie

    2017-09-01

    A fast sweeping method (FSM) determines the first arrival traveltimes of seismic waves by sweeping the velocity model in different directions while applying a local solver. It is an efficient way to numerically solve Hamilton-Jacobi equations for traveltime calculations. In this study, we develop an improved FSM to calculate the first arrival traveltimes of quasi-P (qP) waves in 2-D tilted transversely isotropic (TTI) media. A local solver utilizes the coupled slowness surface of qP and quasi-SV (qSV) waves to form a quartic equation, and solves it numerically to obtain possible traveltimes of the qP-wave. The proposed quartic solver utilizes Fermat's principle to limit the range of the possible solution, then uses the bisection procedure to efficiently determine the real roots. With causality enforced during sweepings, our FSM converges fast in a few iterations, with the exact number depending on the complexity of the velocity model. To improve the accuracy, we employ high-order finite difference schemes and derive the second-order formulae. There is no weak anisotropy assumption, and no approximation is made to the complex slowness surface of the qP-wave. The validity and accuracy of our FSM are demonstrated by comparison with the traveltimes calculated by a horizontal slowness shooting method.
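
    The bisection step of such a quartic solver can be sketched as follows. The sketch assumes that a bracketing interval has already been obtained (in the paper this comes from Fermat's principle, which is not reproduced here); the function names and the coefficient ordering are hypothetical.

```python
def quartic(coeffs, t):
    """Evaluate c4*t^4 + c3*t^3 + c2*t^2 + c1*t + c0 with Horner's rule.

    coeffs is ordered from the highest power down: [c4, c3, c2, c1, c0].
    """
    value = 0.0
    for c in coeffs:
        value = value * t + c
    return value


def bisect_root(coeffs, lo, hi, tol=1e-12, max_iter=200):
    """Return a real root in [lo, hi], or None if the endpoints do not bracket one."""
    f_lo = quartic(coeffs, lo)
    f_hi = quartic(coeffs, hi)
    if f_lo == 0.0:
        return lo
    if f_hi == 0.0:
        return hi
    if f_lo * f_hi > 0.0:
        return None  # no sign change: the interval is not a valid bracket
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = quartic(coeffs, mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0.0:
            hi = mid  # the sign change lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # the sign change lies in the upper half
    return 0.5 * (lo + hi)
```

    Bisection converges only linearly, but it cannot leave the bracket, which matters when the bracket encodes a physical constraint on the admissible traveltime.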

  15. Scalable domain decomposition solvers for stochastic PDEs in high performance computing

    DOE PAGES

    Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...

    2017-09-21

    Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.

  16. Scalable domain decomposition solvers for stochastic PDEs in high performance computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Desai, Ajit; Khalil, Mohammad; Pettit, Chris

    Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.

  17. A Survey of Solver-Related Geometry and Meshing Issues

    NASA Technical Reports Server (NTRS)

    Masters, James; Daniel, Derick; Gudenkauf, Jared; Hine, David; Sideroff, Chris

    2016-01-01

    There is a concern in the computational fluid dynamics community that mesh generation is a significant bottleneck in the CFD workflow. This is one of several papers that will help set the stage for a moderated panel discussion addressing this issue. Although certain general "rules of thumb" and a priori mesh metrics can be used to ensure that some base level of mesh quality is achieved, inadequate consideration is often given to the type of solver or particular flow regime on which the mesh will be utilized. This paper explores how an analyst may want to think differently about a mesh based on considerations such as if a flow is compressible vs. incompressible or hypersonic vs. subsonic or if the solver is node-centered vs. cell-centered. This paper is a high-level investigation intended to provide general insight into how considering the nature of the solver or flow when performing mesh generation has the potential to increase the accuracy and/or robustness of the solution and drive the mesh generation process to a state where it is no longer a hindrance to the analysis process.

  18. Computational aeroelasticity using a pressure-based solver

    NASA Astrophysics Data System (ADS)

    Kamakoti, Ramji

    A computational methodology for performing fluid-structure interaction computations for three-dimensional elastic wing geometries is presented. The flow solver used is based on an unsteady Reynolds-Averaged Navier-Stokes (RANS) model. A well validated k-ε turbulence model with wall function treatment for near wall region was used to perform turbulent flow calculations. Relative merits of alternative flow solvers were investigated. The predictor-corrector-based Pressure Implicit Splitting of Operators (PISO) algorithm was found to be computationally economic for unsteady flow computations. Wing structure was modeled using Bernoulli-Euler beam theory. A fully implicit time-marching scheme (using the Newmark integration method) was used to integrate the equations of motion for structure. Bilinear interpolation and linear extrapolation techniques were used to transfer necessary information between fluid and structure solvers. Geometry deformation was accounted for by using a moving boundary module. The moving grid capability was based on a master/slave concept and transfinite interpolation techniques. Since computations were performed on a moving mesh system, the geometric conservation law must be preserved. This is achieved by appropriately evaluating the Jacobian values associated with each cell. Accurate computation of contravariant velocities for unsteady flows using the momentum interpolation method on collocated, curvilinear grids was also addressed. Flutter computations were performed for the AGARD 445.6 wing at subsonic, transonic and supersonic Mach numbers. Unsteady computations were performed at various dynamic pressures to predict the flutter boundary. Results showed favorable agreement of experiment and previous numerical results. The computational methodology exhibited capabilities to predict both qualitative and quantitative features of aeroelasticity.
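
    The implicit Newmark time-marching scheme named above can be illustrated on a single-degree-of-freedom oscillator, m u'' + c u' + k u = f(t). This is a minimal sketch, not the structural solver of the thesis: the beam model is replaced by one scalar equation, the default parameters beta = 1/4 and gamma = 1/2 (the average-acceleration variant) are an assumption, and the name `newmark_sdof` is hypothetical.

```python
def newmark_sdof(m, c, k, force, u0, v0, dt, steps, beta=0.25, gamma=0.5):
    """Integrate m*u'' + c*u' + k*u = force(t) with the implicit Newmark method.

    Returns a list of (time, displacement, velocity) tuples.
    """
    u, v = u0, v0
    # Initial acceleration from the equation of motion at t = 0.
    a = (force(0.0) - c * v - k * u) / m
    # Effective stiffness; constant because the system is linear.
    k_eff = k + gamma * c / (beta * dt) + m / (beta * dt**2)
    history = [(0.0, u, v)]
    for n in range(1, steps + 1):
        t = n * dt
        rhs = (
            force(t)
            + m * (u / (beta * dt**2) + v / (beta * dt) + (0.5 / beta - 1.0) * a)
            + c * (
                gamma * u / (beta * dt)
                + (gamma / beta - 1.0) * v
                + dt * (0.5 * gamma / beta - 1.0) * a
            )
        )
        u_new = rhs / k_eff
        # Recover acceleration and velocity from the Newmark update formulas.
        a_new = (u_new - u) / (beta * dt**2) - v / (beta * dt) - (0.5 / beta - 1.0) * a
        v = v + dt * ((1.0 - gamma) * a + gamma * a_new)
        u, a = u_new, a_new
        history.append((t, u, v))
    return history
```

    For an undamped, unforced oscillator the average-acceleration variant conserves the discrete energy, so the numerical error shows up as a small phase lag rather than as amplitude growth or decay.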

  19. Numerical System Solver Developed for the National Cycle Program

    NASA Technical Reports Server (NTRS)

    Binder, Michael P.

    1999-01-01

    As part of the National Cycle Program (NCP), a powerful new numerical solver has been developed to support the simulation of aeropropulsion systems. This software uses a hierarchical object-oriented design. It can provide steady-state and time-dependent solutions to nonlinear and even discontinuous problems typically encountered when aircraft and spacecraft propulsion systems are simulated. It also can handle constrained solutions, in which one or more factors may limit the behavior of the engine system. Time-dependent simulation capabilities include adaptive time-stepping and synchronization with digital control elements. The NCP solver is playing an important role in making the NCP a flexible, powerful, and reliable simulation package.

  20. A mass-conservative adaptive FAS multigrid solver for cell-centered finite difference methods on block-structured, locally-cartesian grids

    NASA Astrophysics Data System (ADS)

    Feng, Wenqiang; Guo, Zhenlin; Lowengrub, John S.; Wise, Steven M.

    2018-01-01

    We present a mass-conservative full approximation storage (FAS) multigrid solver for cell-centered finite difference methods on block-structured, locally cartesian grids. The algorithm is essentially a standard adaptive FAS (AFAS) scheme, but with a simple modification that comes in the form of a mass-conservative correction to the coarse-level force. This correction is facilitated by the creation of a zombie variable, analogous to a ghost variable, but defined on the coarse grid and lying under the fine grid refinement patch. We show that a number of different types of fine-level ghost cell interpolation strategies could be used in our framework, including low-order linear interpolation. In our approach, the smoother, prolongation, and restriction operations need never be aware of the mass conservation conditions at the coarse-fine interface. To maintain global mass conservation, we need only modify the usual FAS algorithm by correcting the coarse-level force function at points adjacent to the coarse-fine interface. We demonstrate through simulations that the solver converges geometrically, at a rate that is h-independent, and we show the generality of the solver, applying it to several nonlinear, time-dependent, and multi-dimensional problems. In several tests, we show that second-order asymptotic (h → 0) convergence is observed for the discretizations, provided that (1) at least linear interpolation of the ghost variables is employed, and (2) the mass conservation corrections are applied to the coarse-level force term.

  1. A fast direct solver for boundary value problems on locally perturbed geometries

    NASA Astrophysics Data System (ADS)

    Zhang, Yabin; Gillman, Adrianna

    2018-03-01

    Many applications, including optimal design and adaptive discretization techniques, involve solving several boundary value problems on geometries that are local perturbations of an original geometry. This manuscript presents a fast direct solver for boundary value problems that are recast as boundary integral equations. The idea is to write the discretized boundary integral equation on a new geometry as a low rank update to the discretized problem on the original geometry. Using the Sherman-Morrison formula, the inverse can be expressed in terms of the inverse of the original system applied to the low rank factors and the right hand side. Numerical results illustrate that, for problems where the perturbation is localized, the fast direct solver is three times faster than building a new solver from scratch.
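
    The update idea can be sketched for the simplest case, a rank-one perturbation A + u v^T, where the Sherman-Morrison formula gives (A + u v^T)^{-1} b = y - z (v·y) / (1 + v·z) with y = A^{-1} b and z = A^{-1} u. A rank-k update would use the block generalization of the same identity. The code below is an illustration only: `solve_original` stands in for the precomputed solver on the original geometry, and all names are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def sherman_morrison_solve(solve_original, u, v, b):
    """Solve (A + u v^T) x = b using only solves with the original matrix A.

    solve_original(rhs) must return A^{-1} rhs.
    """
    y = solve_original(b)  # A^{-1} b
    z = solve_original(u)  # A^{-1} u
    denom = 1.0 + dot(v, z)
    if abs(denom) < 1e-14:
        raise ValueError("updated matrix is singular or nearly singular")
    scale = dot(v, y) / denom
    return [yi - scale * zi for yi, zi in zip(y, z)]


def make_diagonal_solver(diagonal):
    """Toy stand-in for a precomputed solver: here A is simply a diagonal matrix."""
    return lambda rhs: [r / d for r, d in zip(rhs, diagonal)]
```

    The updated problem costs two applications of the existing solver plus a few inner products, which is why reusing the original solver can beat building a new one from scratch.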

  2. Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver

    NASA Technical Reports Server (NTRS)

    Mavriplis, Dimitri J.

    1998-01-01

    A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.

  3. Parallelization of the preconditioned IDR solver for modern multicore computer systems

    NASA Astrophysics Data System (ADS)

    Bessonov, O. A.; Fedoseyev, A. I.

    2012-10-01

    This paper presents the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK for modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs the iterative IDR algorithm with ILU preconditioning (user-chosen ILU preconditioning order). CNSPACK has been successfully used during the last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, there has been a dramatic change in processor architectures and computer system organization in recent years. Because of this, performance criteria and methods have been revisited, and the solver and preconditioner have been parallelized using the OpenMP environment. Results of the successful implementation for efficient parallelization are presented for the most advanced computer systems (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).

  4. A Parallel Multigrid Solver for Viscous Flows on Anisotropic Structured Grids

    NASA Technical Reports Server (NTRS)

    Prieto, Manuel; Montero, Ruben S.; Llorente, Ignacio M.; Bushnell, Dennis M. (Technical Monitor)

    2001-01-01

    This paper presents an efficient parallel multigrid solver for speeding up the computation of a 3-D model that treats the flow of a viscous fluid over a flat plate. The main interest of this simulation lies in exhibiting some basic difficulties that prevent optimal multigrid efficiencies from being achieved. As the computing platform, we have used Coral, a Beowulf-class system based on Intel Pentium processors and equipped with GigaNet cLAN and switched Fast Ethernet networks. Our study not only examines the scalability of the solver but also includes a performance evaluation of Coral where the investigated solver has been used to compare several of its design choices, namely, the interconnection network (GigaNet versus switched Fast-Ethernet) and the node configuration (dual nodes versus single nodes). As a reference, the performance results have been compared with those obtained with the NAS-MG benchmark.

  5. FoSSI: the family of simplified solver interfaces for the rapid development of parallel numerical atmosphere and ocean models

    NASA Astrophysics Data System (ADS)

    Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike

    The portable software FoSSI is introduced that, in combination with additional free solver software packages, allows for an efficient and scalable parallel solution of large sparse linear equation systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, which is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equation systems and associated preconditioners at the same time, even in parallel on separate MPI-communicators. The second special feature in FoSSI is the task specifier, a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver through one unique subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is a community software, proven in an adaptive 2D-atmosphere model and a 3D-primitive equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP-implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI-variant. Code development in this direction is in an advanced state under

  6. A comparison of viscous-plastic sea ice solvers with and without replacement pressure

    NASA Astrophysics Data System (ADS)

    Kimmritz, Madlen; Losch, Martin; Danilov, Sergey

    2017-07-01

    Recent developments of the explicit elastic-viscous-plastic (EVP) solvers call for a new comparison with implicit solvers for the equations of viscous-plastic sea ice dynamics. In Arctic sea ice simulations, the modified and the adaptive EVP solvers, and the implicit Jacobian-free Newton-Krylov (JFNK) solver are compared against each other. The adaptive EVP method shows convergence rates that are generally similar or even better than those of the modified EVP method, but the convergence of the EVP methods is found to depend dramatically on the use of the replacement pressure (RP). Apparently, using the RP can affect the pseudo-elastic waves in the EVP methods by introducing extra non-physical oscillations so that, in the extreme case, convergence to the VP solution can be lost altogether. The JFNK solver also suffers from higher failure rates with RP implying that with RP the momentum equations are stiffer and more difficult to solve. For practical purposes, both EVP methods can be used efficiently with an unexpectedly low number of sub-cycling steps without compromising the solutions. The differences between the RP solutions and the NoRP solutions (when the RP is not being used) can be reduced with lower thresholds of viscous regularization at the cost of increasing stiffness of the equations, and hence the computational costs of solving them.

  7. Equation solvers for distributed-memory computers

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O.

    1994-01-01

    A large number of scientific and engineering problems require the rapid solution of large systems of simultaneous equations. The performance of parallel computers in this area now dwarfs traditional vector computers by nearly an order of magnitude. This talk describes the major issues involved in parallel equation solvers with particular emphasis on the Intel Paragon, IBM SP-1 and SP-2 processors.

  8. Three-Dimensional Inverse Transport Solver Based on Compressive Sensing Technique

    NASA Astrophysics Data System (ADS)

    Cheng, Yuxiong; Wu, Hongchun; Cao, Liangzhi; Zheng, Youqi

    2013-09-01

    A compressive sensing-based method for the three-dimensional inverse transport problem is presented, based on direct exposure measurements from flash radiographic images. The linear absorption coefficients and interface locations of objects are reconstructed directly at the same time. It is always very expensive to obtain enough measurements. With limited measurements, the compressive sensing sparse reconstruction technique orthogonal matching pursuit is applied to obtain the sparse coefficients by solving an optimization problem. A three-dimensional inverse transport solver is developed based on this compressive sensing technique. There are three features in this solver: (1) AutoCAD is employed as a geometry preprocessor because of its powerful graphics capabilities. (2) The forward projection matrix rather than a Gauss matrix is constructed by the visualization tool generator. (3) Fourier transform and Daubechies wavelet transform are adopted to convert an underdetermined system to a well-posed system in the algorithm. Simulations are performed, and the numerical results of the compressive sensing-based solver for the pseudo-sine absorption problem, the two-cube problem and the two-cylinder problem agree well with the reference values.

  9. Finite difference method accelerated with sparse solvers for structural analysis of the metal-organic complexes

    NASA Astrophysics Data System (ADS)

    Guda, A. A.; Guda, S. A.; Soldatov, M. A.; Lomachenko, K. A.; Bugaev, A. L.; Lamberti, C.; Gawelda, W.; Bressler, C.; Smolentsev, G.; Soldatov, A. V.; Joly, Y.

    2016-05-01

    The finite difference method (FDM) implemented in the FDMNES software [Phys. Rev. B, 2001, 63, 125120] was revised. A thorough analysis shows that the calculated diagonal in the FDM matrix consists of about 96% zero elements. Thus a sparse solver is more suitable for the problem than traditional Gaussian elimination for the diagonal neighbourhood. We have tried several iterative sparse solvers, and the direct MUMPS solver with METIS ordering turned out to be the best. Compared to the Gaussian solver, the present method is up to 40 times faster and allows XANES simulations for complex systems already on personal computers. We show the applicability of the software to the metal-organic [Fe(bpy)3]2+ complex, both for low spin and high spin states populated after laser excitation.
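
    Why a mostly-zero matrix favors a sparse representation can be illustrated with compressed sparse row (CSR) storage, in which only nonzero entries are stored and touched. This is a generic sketch; it is an assumption for illustration and does not describe the internal data structures of FDMNES or MUMPS. The function names are hypothetical.

```python
def dense_to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays.

    Returns (values, col_index, row_ptr); row i occupies the slice
    values[row_ptr[i]:row_ptr[i + 1]].
    """
    values, col_index, row_ptr = [], [], [0]
    for row in dense:
        for j, entry in enumerate(row):
            if entry != 0.0:
                values.append(entry)
                col_index.append(j)
        row_ptr.append(len(values))
    return values, col_index, row_ptr


def csr_matvec(values, col_index, row_ptr, x):
    """Compute y = A x, visiting only the stored nonzero entries."""
    y = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        y.append(sum(values[k] * x[col_index[k]] for k in range(start, end)))
    return y
```

    With roughly 96% zeros, a product in this format performs about 4% of the multiplications of its dense counterpart, and the storage shrinks in a similar proportion (plus the index arrays).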

  10. A Flexible CUDA LU-based Solver for Small, Batched Linear Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tumeo, Antonino; Gawande, Nitin A.; Villa, Oreste

    This chapter presents the implementation of a batched CUDA solver based on LU factorization for small linear systems. This solver may be used in applications such as reactive flow transport models, which apply the Newton-Raphson technique to linearize and iteratively solve the sets of nonlinear equations that represent the reactions for tens of thousands to millions of physical locations. The implementation exploits somewhat counterintuitive GPGPU programming techniques: it assigns the solution of a matrix (representing a system) to a single CUDA thread, does not exploit shared memory and employs dynamic memory allocation on the GPUs. These techniques enable our implementation to simultaneously solve sets of systems with over 100 equations and to employ LU decomposition with complete pivoting, providing the higher numerical accuracy required by certain applications. Other currently available solutions for batched linear solvers are limited by size and only support partial pivoting, although they may be faster in certain conditions. We discuss the code of our implementation and present a comparison with the other implementations, discussing the various tradeoffs in terms of performance and flexibility. This work will enable developers that need batched linear solvers to choose whichever implementation is more appropriate to the features and the requirements of their applications, and even to implement dynamic switching approaches that can choose the best implementation depending on the input data.
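
    Complete pivoting, as opposed to partial pivoting, searches the whole trailing submatrix for the largest entry and therefore permutes columns as well as rows. The sketch below shows this as Gaussian elimination applied directly to the right-hand side (the elimination that underlies an LU factorization), in sequential Python rather than CUDA. It is not the chapter's implementation; the function names are hypothetical, and the batch loop only mimics the one-system-per-thread assignment described above.

```python
def solve_complete_pivoting(a, b):
    """Solve a x = b by Gaussian elimination with complete pivoting."""
    n = len(a)
    m = [row[:] for row in a]  # work on copies so the inputs stay untouched
    rhs = b[:]
    col_perm = list(range(n))  # col_perm[k]: original index of the unknown in column k
    for k in range(n):
        # Pick the largest-magnitude entry of the trailing submatrix as pivot.
        p, q = max(
            ((i, j) for i in range(k, n) for j in range(k, n)),
            key=lambda ij: abs(m[ij[0]][ij[1]]),
        )
        if m[p][q] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]  # row swap
        rhs[k], rhs[p] = rhs[p], rhs[k]
        for row in m:  # column swap
            row[k], row[q] = row[q], row[k]
        col_perm[k], col_perm[q] = col_perm[q], col_perm[k]
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n):
                m[i][j] -= factor * m[k][j]
            rhs[i] -= factor * rhs[k]
    # Back substitution in the permuted ordering of the unknowns.
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = rhs[i] - sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = s / m[i][i]
    # Undo the column permutation.
    x = [0.0] * n
    for k in range(n):
        x[col_perm[k]] = y[k]
    return x


def solve_batch(systems):
    """Solve many small independent systems; each (a, b) pair is one unit of work."""
    return [solve_complete_pivoting(a, b) for a, b in systems]
```

    Each system in the batch is independent, which is what makes the one-system-per-thread mapping possible; the pivot search over the full trailing submatrix is the extra cost paid for the improved numerical robustness.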

  11. LEOPARD: A grid-based dispersion relation solver for arbitrary gyrotropic distributions

    NASA Astrophysics Data System (ADS)

    Astfalk, Patrick; Jenko, Frank

    2017-01-01

    Particle velocity distributions measured in collisionless space plasmas often show strong deviations from idealized model distributions. Despite this observational evidence, linear wave analysis in space plasma environments such as the solar wind or Earth's magnetosphere is still mainly carried out using dispersion relation solvers based on Maxwellians or other parametric models. To enable a more realistic analysis, we present the new grid-based kinetic dispersion relation solver LEOPARD (Linear Electromagnetic Oscillations in Plasmas with Arbitrary Rotationally-symmetric Distributions) which no longer requires prescribed model distributions but allows for arbitrary gyrotropic distribution functions. In this work, we discuss the underlying numerical scheme of the code and we show a few exemplary benchmarks. Furthermore, we demonstrate a first application of LEOPARD to ion distribution data obtained from hybrid simulations. In particular, we show that in the saturation stage of the parallel fire hose instability, the deformation of the initial bi-Maxwellian distribution invalidates the use of standard dispersion relation solvers. A linear solver based on bi-Maxwellians predicts further growth even after saturation, while LEOPARD correctly indicates vanishing growth rates. We also discuss how this complies with former studies on the validity of quasilinear theory for the resonant fire hose. In the end, we briefly comment on the role of LEOPARD in directly analyzing spacecraft data, and we refer to an upcoming paper which demonstrates a first application of that kind.

  12. Matlab Geochemistry: An open source geochemistry solver based on MRST

    NASA Astrophysics Data System (ADS)

    McNeece, C. J.; Raynaud, X.; Nilsen, H.; Hesse, M. A.

    2017-12-01

    The study of geological systems often requires the solution of complex geochemical relations. To address this need we present an open source geochemical solver based on the Matlab Reservoir Simulation Toolbox (MRST) developed by SINTEF. The implementation supports non-isothermal multicomponent aqueous complexation, surface complexation, ion exchange, and dissolution/precipitation reactions. The suite of tools available in MRST allows for rapid model development, in particular the incorporation of geochemical calculations into transport simulations of multiple phases, complex domain geometry and geomechanics. Different numerical schemes and additional physics can be easily incorporated into the existing tools through the object-oriented framework employed by MRST. The solver leverages the automatic differentiation tools available in MRST to solve arbitrarily complex geochemical systems with any choice of species or element concentration as input. Four mathematical approaches enable the solver to be quite robust: 1) the choice of chemical elements as the basis components makes all entries in the composition matrix positive thus preserving convexity, 2) a log variable transformation is used which transfers the nonlinearity to the convex composition matrix, 3) a priori bounds on variables are calculated from the structure of the problem, constraining Newton's path and 4) an initial guess is calculated implicitly by sequentially adding model complexity. As a benchmark we compare the model to experimental and semi-analytic solutions of the coupled salinity-acidity transport system. Together with the reservoir simulation capabilities of MRST the solver offers a promising tool for geochemical simulations in reservoir domains for applications in a diversity of fields from enhanced oil recovery to radionuclide storage.

  13. Using a multifrontal sparse solver in a high performance, finite element code

    NASA Technical Reports Server (NTRS)

    King, Scott D.; Lucas, Robert; Raefsky, Arthur

    1990-01-01

    We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the individual element forms and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.

  14. On Riemann solvers and kinetic relations for isothermal two-phase flows with surface tension

    NASA Astrophysics Data System (ADS)

    Rohde, Christian; Zeiler, Christoph

    2018-06-01

    We consider a sharp interface approach for the inviscid isothermal dynamics of compressible two-phase flow that accounts for phase transition and surface tension effects. Kinetic relations are frequently used to fix the mass exchange and entropy dissipation rate across the interface. The complete unidirectional dynamics can then be understood by solving generalized two-phase Riemann problems. We present new well-posedness theorems for the Riemann problem and corresponding computable Riemann solvers that cover quite general equations of state, metastable input data and curvature effects. The new Riemann solver is used to validate different kinetic relations on physically relevant problems including a comparison with experimental data. Riemann solvers are building blocks for many numerical schemes that are used to track interfaces in two-phase flow. It is shown that the new Riemann solver enables reliable and efficient computations for physical situations that could not be treated before.

  15. An Implicit Solver on A Parallel Block-Structured Adaptive Mesh Grid for FLASH

    NASA Astrophysics Data System (ADS)

    Lee, D.; Gopal, S.; Mohapatra, P.

    2012-07-01

    We introduce a fully implicit solver for FLASH based on a Jacobian-Free Newton-Krylov (JFNK) approach with an appropriate preconditioner. The main goal of developing this JFNK-type implicit solver is to provide efficient high-order numerical algorithms and methodology for simulating stiff systems of differential equations on large-scale parallel computer architectures. A large number of natural problems in nonlinear physics involve a wide range of spatial and time scales of interest. A system that encompasses such a wide magnitude of scales is described as "stiff." A stiff system can arise in many different fields of physics, including fluid dynamics/aerodynamics, laboratory/space plasma physics, low Mach number flows, reactive flows, radiation hydrodynamics, and geophysical flows. One of the big challenges in solving such a stiff system using current-day computational resources lies in resolving time and length scales varying by several orders of magnitude. We introduce FLASH's preliminary implementation of a time-accurate JFNK-based implicit solver in the framework of FLASH's unsplit hydro solver.
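
    The "Jacobian-free" part of a JFNK method rests on one approximation: a Krylov solver only needs Jacobian-vector products, and these can be estimated from two residual evaluations, J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below shows that single step in plain Python. It is an illustration under stated assumptions, not FLASH code; the function name jacobian_vector_product, the fixed step eps and the toy residual are all hypothetical.

```python
def jacobian_vector_product(residual, u, v, eps=1e-7):
    """Approximate J(u) @ v without ever forming the Jacobian J.

    residual maps a list of floats to a list of floats (the nonlinear F).
    A forward difference along direction v stands in for the product.
    """
    f_u = residual(u)
    u_shifted = [ui + eps * vi for ui, vi in zip(u, v)]
    f_shifted = residual(u_shifted)
    return [(fs - f0) / eps for fs, f0 in zip(f_shifted, f_u)]


def toy_residual(u):
    """Toy nonlinear system, used only to exercise the sketch.

    Its exact Jacobian is [[2*u0, 1], [u1, u0]].
    """
    return [u[0] ** 2 + u[1], u[0] * u[1]]


# At u = (1, 2) and v = (1, 0) the exact product is (2, 2).
approx = jacobian_vector_product(toy_residual, [1.0, 2.0], [1.0, 0.0])
```

    In a full solver this product would be handed to a Krylov method such as GMRES inside each Newton step, and the step size would normally be scaled to the magnitudes of u and v rather than fixed.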

  16. Algebraic multigrid preconditioning within parallel finite-element solvers for 3-D electromagnetic modelling problems in geophysics

    NASA Astrophysics Data System (ADS)

    Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María

    2014-06-01

    We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed a series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with the biconjugate gradient stabilized method. The results have shown that the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, in cases in which other preconditioners succeed in converging to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code, by up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the

  17. Transonic Drag Prediction Using an Unstructured Multigrid Solver

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.; Levy, David W.

    2001-01-01

    This paper summarizes the results obtained with the NSU-3D unstructured multigrid solver for the AIAA Drag Prediction Workshop held in Anaheim, CA, June 2001. The test case for the workshop consists of a wing-body configuration at transonic flow conditions. Flow analyses for a complete test matrix of lift coefficient values and Mach numbers at a constant Reynolds number are performed, thus producing a set of drag polars and drag rise curves which are compared with experimental data. Results were obtained independently by both authors using an identical baseline grid and different refined grids. Most cases were run in parallel on commodity cluster-type machines while the largest cases were run on an SGI Origin machine using 128 processors. The objective of this paper is to study the accuracy of the subject unstructured grid solver for predicting drag in the transonic cruise regime, to assess the efficiency of the method in terms of convergence, cpu time, and memory, and to determine the effects of grid resolution on this predictive ability and its computational efficiency. A good predictive ability is demonstrated over a wide range of conditions, although accuracy was found to degrade for cases at higher Mach numbers and lift values where increasing amounts of flow separation occur. The ability to rapidly compute large numbers of cases at varying flow conditions using an unstructured solver on inexpensive clusters of commodity computers is also demonstrated.

  18. A comparison of SuperLU solvers on the intel MIC architecture

    NASA Astrophysics Data System (ADS)

    Tuncel, Mehmet; Duran, Ahmet; Celebi, M. Serdar; Akaydin, Bora; Topkaya, Figen O.

    2016-10-01

    In many science and engineering applications, problems may result in solving a sparse linear system AX=B. For example, SuperLU_MCDT, a linear solver, was used for the large penta-diagonal matrices for 2D problems and hepta-diagonal matrices for 3D problems, coming from the incompressible blood flow simulation (see [1]). It is important to test the status and potential improvements of state-of-the-art solvers on new technologies. In this work, sequential, multithreaded and distributed versions of SuperLU solvers (see [2]) are examined on the Intel Xeon Phi coprocessors using the offload programming model at the EURORA cluster of CINECA in Italy. We consider a portfolio of test matrices containing patterned matrices from UFMM ([3]) and randomly located matrices. This architecture can benefit from high parallelism and large vectors. We find that the sequential SuperLU gained up to a 45% performance improvement from offload programming, depending on the sparse matrix type and the size of the transferred and processed data.

  19. A Hybrid MPI/OpenMP Approach for Parallel Groundwater Model Calibration on Multicore Computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, Guoping; D'Azevedo, Ed F; Zhang, Fan

    2010-01-01

    Groundwater model calibration is becoming increasingly computationally time intensive. We describe a hybrid MPI/OpenMP approach to exploit two levels of parallelism in software and hardware to reduce calibration time on multicore computers with minimal parallelization effort. First, HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for a uranium transport model with over a hundred species involving nearly a hundred reactions, and a field scale coupled flow and transport model. In the first application, a single parallelizable loop is identified to consume over 97% of the total computational time. With a few lines of OpenMP compiler directives inserted into the code, the computational time is reduced about tenfold on a compute node with 16 cores. The performance is further improved by selectively parallelizing a few more loops. For the field scale application, parallelizable loops in 15 of the 174 subroutines in HGC5 are identified to take more than 99% of the execution time. By adding the preconditioned conjugate gradient solver and BICGSTAB, and using a coloring scheme to separate the elements, nodes, and boundary sides, the subroutines for finite element assembly, soil property update, and boundary condition application are parallelized, resulting in a speedup of about 10 on a 16-core compute node. The Levenberg-Marquardt (LM) algorithm is added into HGC5 with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, as many compute nodes as there are adjustable parameters (when the forward difference is used for Jacobian approximation), or twice that number (if the center difference is used), are used to reduce the calibration time from days and weeks to a few hours for the two applications. This approach can be extended to global optimization schemes and Monte Carlo analysis, where thousands of compute nodes can be efficiently utilized.

  20. Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Edwards, Richard E; Zhang, Hao; Parker, Lynne Edwards

    2013-01-01

    Kernel methods have difficulties scaling to large modern data sets. The scalability issues are based on computational and memory requirements for working with a large matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the solvers' scalability. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity for solving a single model, and the overall computational complexity associated with tuning hyperparameters are still major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately 1 million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.

  1. A Tensor-Train accelerated solver for integral equations in complex geometries

    NASA Astrophysics Data System (ADS)

    Corona, Eduardo; Rahimian, Abtin; Zorin, Denis

    2017-04-01

    We present a framework using the Quantized Tensor Train (QTT) decomposition to accurately and efficiently solve volume and boundary integral equations in three dimensions. We describe how the QTT decomposition can be used as a hierarchical compression and inversion scheme for matrices arising from the discretization of integral equations. For a broad range of problems, computational and storage costs of the inversion scheme are extremely modest, O(log N), and once the inverse is computed, it can be applied in O(N log N). We analyze the QTT ranks for hierarchically low rank matrices and discuss its relationship to commonly used hierarchical compression techniques such as FMM and HSS. We prove that the QTT ranks are bounded for translation-invariant systems and argue that this behavior extends to non-translation invariant volume and boundary integrals. For volume integrals, the QTT decomposition provides an efficient direct solver requiring significantly less memory compared to other fast direct solvers. We present results demonstrating the remarkable performance of the QTT-based solver when applied to both translation and non-translation invariant volume integrals in 3D. For boundary integral equations, we demonstrate that using a QTT decomposition to construct preconditioners for a Krylov subspace method leads to an efficient and robust solver with a small memory footprint. We test the QTT preconditioners in the iterative solution of an exterior elliptic boundary value problem (Laplace) formulated as a boundary integral equation in complex, multiply connected geometries.

  2. EUPDF: An Eulerian-Based Monte Carlo Probability Density Function (PDF) Solver. User's Manual

    NASA Technical Reports Server (NTRS)

    Raju, M. S.

    1998-01-01

    EUPDF is an Eulerian-based Monte Carlo PDF solver developed for application with sprays, combustion, parallel computing and unstructured grids. It is designed to be massively parallel and could easily be coupled with any existing gas-phase flow and spray solvers. The solver accommodates the use of an unstructured mesh with mixed elements of either triangular, quadrilateral, and/or tetrahedral type. The manual provides the user with the coding required to couple the PDF code to any given flow code and a basic understanding of the EUPDF code structure as well as the models involved in the PDF formulation. The source code of EUPDF will be available with the release of the National Combustion Code (NCC) as a complete package.

  3. Fast Euler solver for transonic airfoils. I - Theory. II - Applications

    NASA Technical Reports Server (NTRS)

    Dadone, Andrea; Moretti, Gino

    1988-01-01

    Equations written in terms of generalized Riemann variables are presently integrated by inverting six bidiagonal matrices and two tridiagonal matrices, using an implicit Euler solver that is based on the lambda-formulation. The solution is found on a C-grid whose boundaries are very close to the airfoil. The fast solver is then applied to the computation of several flowfields on a NACA 0012 airfoil at various Mach number and alpha values, yielding results that are primarily concerned with transonic flows. The effects of grid fineness and boundary distances are analyzed; the code is found to be robust and accurate, as well as fast.
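
    Implicit schemes of this kind reduce to solving banded systems along grid lines. For the tridiagonal case, the standard tool is the Thomas algorithm, which solves the system in O(n) operations by one forward elimination sweep and one back substitution. The sketch below is a generic textbook version in Python, not the authors' code; the function name and the storage convention (three equal-length lists, with lower[0] and upper[-1] unused) are assumptions made for illustration, and no pivoting is performed, so it presumes a well-conditioned (for example diagonally dominant) matrix.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    Row i of the matrix reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1].
    All four lists have length n; lower[0] and upper[n-1] are ignored.
    """
    n = len(diag)
    c_prime = [0.0] * n  # modified super-diagonal
    d_prime = [0.0] * n  # modified right-hand side

    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

    A bidiagonal system is the special case in which one of the off-diagonals is zero, so only one of the two sweeps does any work.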

  4. Execution of a parallel edge-based Navier-Stokes solver on commodity graphics processor units

    NASA Astrophysics Data System (ADS)

    Corral, Roque; Gisbert, Fernando; Pueblas, Jesus

    2017-02-01

    The implementation of an edge-based three-dimensional Reynolds-Averaged Navier-Stokes solver for unstructured grids able to run on multiple graphics processing units (GPUs) is presented. Loops over edges, which are the most time-consuming part of the solver, have been written to exploit the massively parallel capabilities of GPUs. Non-blocking communications between parallel processes and between the GPU and the central processing unit (CPU) have been used to enhance code scalability. The code is written using a mixture of C++ and OpenCL, to allow the execution of the source code on GPUs. The Message Passing Interface (MPI) library is used to allow the parallel execution of the solver on multiple GPUs. A comparative study of the solver parallel performance is carried out using a cluster of CPUs and another of GPUs. It is shown that a single GPU is up to 64 times faster than a single CPU core. The parallel scalability of the solver is mainly degraded due to the loss of computing efficiency of the GPU when the size of the case decreases. However, for large enough grid sizes, the scalability is strongly improved. A cluster featuring commodity GPUs and a high bandwidth network is ten times less costly and consumes 33% less energy than a CPU-based cluster with an equivalent computational power.

  5. A systematic approach to numerical dispersion in Maxwell solvers

    NASA Astrophysics Data System (ADS)

    Blinne, Alexander; Schinkel, David; Kuschel, Stephan; Elkina, Nina; Rykovanov, Sergey G.; Zepf, Matt

    2018-03-01

    The finite-difference time-domain (FDTD) method is a well established method for solving the time evolution of Maxwell's equations. Unfortunately the scheme introduces numerical dispersion and therefore phase and group velocities which deviate from the correct values. The solution to Maxwell's equations in more than one dimension results in non-physical predictions such as numerical dispersion or numerical Cherenkov radiation emitted by a relativistic electron beam propagating in vacuum. Improved solvers, which keep the staggered Yee-type grid for electric and magnetic fields, generally modify the spatial derivative operator in the Maxwell-Faraday equation by increasing the computational stencil. These modified solvers can be characterized by different sets of coefficients, leading to different dispersion properties. In this work we introduce a norm function to rewrite the choice of coefficients into a minimization problem. We solve this problem numerically and show that the minimization procedure leads to phase and group velocities that are considerably closer to c as compared to schemes with manually set coefficients available in the literature. Depending on a specific problem at hand (e.g. electron beam propagation in plasma, high-order harmonic generation from plasma surfaces, etc.), the norm function can be chosen accordingly, for example, to minimize the numerical dispersion in a certain given propagation direction. Particle-in-cell simulations of an electron beam propagating in vacuum using our solver are provided.

  6. MODFLOW-2000, The U.S. Geological Survey Modular Ground-Water Model -- GMG Linear Equation Solver Package Documentation

    USGS Publications Warehouse

    Wilson, John D.; Naff, Richard L.

    2004-01-01

    A geometric multigrid solver (GMG), based on the preconditioned conjugate gradient algorithm, has been developed for solving systems of equations resulting from applying the cell-centered finite difference algorithm to flow in porous media. This solver has been adapted to the U.S. Geological Survey ground-water flow model MODFLOW-2000. The documentation herein is a description of the solver and the adaptation to MODFLOW-2000.
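
    The preconditioned conjugate gradient iteration that such a solver builds on can be sketched in a few lines. In the version below the preconditioner is passed in as a function, so a multigrid cycle could be plugged in at that point; for the sake of a short, runnable example, a simple Jacobi (diagonal) preconditioner is used instead, which is a stand-in and not what the GMG package does. All names (pcg, matvec, apply_preconditioner) are hypothetical, and the matrix is assumed symmetric positive definite.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def pcg(matvec, b, apply_preconditioner, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for a symmetric positive definite system.

    matvec(p) returns A @ p; apply_preconditioner(r) returns M^{-1} @ r.
    Iteration stops when the residual 2-norm drops below tol.
    """
    x = [0.0] * len(b)
    r = b[:]                       # residual for the zero initial guess
    z = apply_preconditioner(r)
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = apply_preconditioner(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Small symmetric positive definite example with a Jacobi stand-in.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
matvec = lambda p: [dot(row, p) for row in A]
jacobi = lambda r: [ri / A[i][i] for i, ri in enumerate(r)]
solution = pcg(matvec, b, jacobi)
```

    The quality of the preconditioner determines how many iterations are needed; swapping the Jacobi function for one multigrid cycle changes nothing else in the loop.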

  7. Algorithms for parallel flow solvers on message passing architectures

    NASA Technical Reports Server (NTRS)

    Vanderwijngaart, Rob F.

    1995-01-01

    The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique that has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for the concentration on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these it can be determined what is the optimal first-processor retardation that leads to the shortest total completion time for the pipeline process. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of uncareful grid partitioning in flow solvers that employ pipeline algorithms. If grid blocks at boundaries are not at least as large in the wall-normal direction as those

  8. New preconditioning strategy for Jacobian-free solvers for variably saturated flows with Richards’ equation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lipnikov, Konstantin; Moulton, David; Svyatskiy, Daniil

    2016-04-29

    We develop a new approach for solving the nonlinear Richards’ equation arising in variably saturated flow modeling. The growing complexity of geometric models for simulation of subsurface flows leads to the necessity of using unstructured meshes and advanced discretization methods. Typically, a numerical solution is obtained by first discretizing PDEs and then solving the resulting system of nonlinear discrete equations with a Newton-Raphson-type method. Efficiency and robustness of the existing solvers rely on many factors, including an empirical quality control of intermediate iterates, complexity of the employed discretization method and a customized preconditioner. We propose and analyze a new preconditioning strategy that is based on a stable discretization of the continuum Jacobian. We will show with numerical experiments for challenging problems in subsurface hydrology that this new preconditioner improves convergence of the existing Jacobian-free solvers 3-20 times. Furthermore, we show that the Picard method with this preconditioner becomes a more efficient nonlinear solver than a few widely used Jacobian-free solvers.

  9. Convergence Acceleration of a Navier-Stokes Solver for Efficient Static Aeroelastic Computations

    NASA Technical Reports Server (NTRS)

    Obayashi, Shigeru; Guruswamy, Guru P.

    1995-01-01

    New capabilities have been developed for a Navier-Stokes solver to perform steady-state simulations more efficiently. The flow solver for solving the Navier-Stokes equations is based on a combination of the lower-upper factored symmetric Gauss-Seidel implicit method and the modified Harten-Lax-van Leer-Einfeldt upwind scheme. A numerically stable and efficient pseudo-time-marching method is also developed for computing steady flows over flexible wings. Results are demonstrated for transonic flows over rigid and flexible wings.

  10. Transonic Drag Prediction on a DLR-F6 Transport Configuration Using Unstructured Grid Solvers

    NASA Technical Reports Server (NTRS)

    Lee-Rausch, E. M.; Frink, N. T.; Mavriplis, D. J.; Rausch, R. D.; Milholen, W. E.

    2004-01-01

    A second international AIAA Drag Prediction Workshop (DPW-II) was organized and held in Orlando, Florida, on June 21-22, 2003. The primary purpose was to investigate the code-to-code uncertainty, address the sensitivity of the drag prediction to grid size and quantify the uncertainty in predicting nacelle/pylon drag increments at a transonic cruise condition. This paper presents an in-depth analysis of the DPW-II computational results from three state-of-the-art unstructured grid Navier-Stokes flow solvers exercised on similar families of tetrahedral grids. The flow solvers are USM3D - a tetrahedral cell-centered upwind solver, FUN3D - a tetrahedral node-centered upwind solver, and NSU3D - a general element node-centered central-differenced solver. For the wingbody, the total drag predicted for a constant-lift transonic cruise condition showed a decrease in code-to-code variation with grid refinement as expected. For the same flight condition, the wing/body/nacelle/pylon total drag and the nacelle/pylon drag increment predicted showed an increase in code-to-code variation with grid refinement. Although the range in total drag for the wingbody fine grids was only 5 counts, a code-to-code comparison of surface pressures and surface restricted streamlines indicated that the three solvers were not all converging to the same flow solutions: different shock locations and separation patterns were evident. Similarly, the wing/body/nacelle/pylon solutions did not appear to be converging to the same flow solutions. Overall, grid refinement did not consistently improve the correlation with experimental data for either the wingbody or the wing/body/nacelle/pylon configuration. Although the absolute values of total drag predicted by two of the solvers for the medium and fine grids did not compare well with the experiment, the incremental drag predictions were within plus or minus 3 counts of the experimental data. The correlation with experimental incremental drag was not

  11. Avoiding Communication in the Lanczos Bidiagonalization Routine and Associated Least Squares QR Solver

    DTIC Science & Technology

    2015-04-12

    Avoiding communication in the Lanczos bidiagonalization routine and associated Least Squares QR solver Erin Carson Electrical Engineering and...Bidiagonalization Routine and Associated Least Squares QR Solver 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d...throughout scientific codes, are often the bottlenecks in application performance due to a low computation/communication ratio. In this paper we develop

  12. A Fast and Robust Poisson-Boltzmann Solver Based on Adaptive Cartesian Grids

    PubMed Central

    Boschitsch, Alexander H.; Fenley, Marcia O.

    2011-01-01

    An adaptive Cartesian grid (ACG) concept is presented for the fast and robust numerical solution of the 3D Poisson-Boltzmann Equation (PBE) governing the electrostatic interactions of large-scale biomolecules and highly charged multi-biomolecular assemblies such as ribosomes and viruses. The ACG offers numerous advantages over competing grid topologies such as regular 3D lattices and unstructured grids. For very large biological molecules and multi-biomolecule assemblies, the total number of grid-points is several orders of magnitude less than that required in a conventional lattice grid used in the current PBE solvers thus allowing the end user to obtain accurate and stable nonlinear PBE solutions on a desktop computer. Compared to tetrahedral-based unstructured grids, ACG offers a simpler hierarchical grid structure, which is naturally suited to multigrid, relieves indirect addressing requirements and uses fewer neighboring nodes in the finite difference stencils. Construction of the ACG and determination of the dielectric/ionic maps are straightforward, fast and require minimal user intervention. Charge singularities are eliminated by reformulating the problem to produce the reaction field potential in the molecular interior and the total electrostatic potential in the exterior ionic solvent region. This approach minimizes grid-dependency and alleviates the need for fine grid spacing near atomic charge sites. The technical portion of this paper contains three parts. First, the ACG and its construction for general biomolecular geometries are described. Next, a discrete approximation to the PBE upon this mesh is derived. Finally, the overall solution procedure and multigrid implementation are summarized. Results obtained with the ACG-based PBE solver are presented for: (i) a low dielectric spherical cavity, containing interior point charges, embedded in a high dielectric ionic solvent – analytical solutions are available for this case, thus allowing rigorous

  13. Final Report: Subcontract B623868 Algebraic Multigrid solvers for coupled PDE systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Brannick, J.

    The Pennsylvania State University (“Subcontractor”) continued to work on the design of algebraic multigrid solvers for coupled systems of partial differential equations (PDEs) arising in numerical modeling of various applications, with a main focus on solving the Dirac equation arising in Quantum Chromodynamics (QCD). The goal of the proposed work was to develop combined geometric and algebraic multilevel solvers that are robust and lend themselves to efficient implementation on massively parallel heterogeneous computers for these QCD systems. The research in these areas built on previous works, focusing on the following three topics: (1) the development of parallel full-multigrid (PFMG) and non-Galerkin coarsening techniques in this framework for solving the Wilson Dirac system; (2) the use of these same Wilson MG solvers for preconditioning the Overlap and Domain Wall formulations of the Dirac equation; and (3) the design and analysis of algebraic coarsening algorithms for coupled PDE systems including Stokes equation, Maxwell equation and linear elasticity.

  14. A new solver for granular avalanche simulation: Indoor experiment verification and field scale case study

    NASA Astrophysics Data System (ADS)

    Wang, XiaoLiang; Li, JiaChun

    2017-12-01

    A new solver based on the high-resolution scheme with novel treatments of source terms and interface capture for the Savage-Hutter model is developed to simulate granular avalanche flows. The capability to simulate flow spread and deposit processes is verified through indoor experiments of a two-dimensional granular avalanche. Parameter studies show that reduction in bed friction enhances runout efficiency, and that lower earth pressure restraints enlarge the deposit spread. The April 9, 2000, Yigong avalanche in Tibet, China, is simulated as a case study by this new solver. The predicted results, including evolution process, deposit spread, and hazard impacts, generally agree with site observations. It is concluded that the new solver for the Savage-Hutter equation provides a comprehensive software platform for granular avalanche simulation at both experimental and field scales. In particular, the solver can be a valuable tool for providing necessary information for hazard forecasts, disaster mitigation, and countermeasure decisions in mountainous areas.

  15. A robust and contact resolving Riemann solver on unstructured mesh, Part I, Euler method

    NASA Astrophysics Data System (ADS)

    Shen, Zhijun; Yan, Wei; Yuan, Guangwei

    2014-07-01

    This article presents a new cell-centered numerical method for compressible flows on arbitrary unstructured meshes. A multi-dimensional Riemann solver based on the HLLC method (denoted the HLLC-2D solver) is established. The work is an extension from the cell-centered Lagrangian scheme of Maire et al. [27] to the Eulerian framework. Similarly to the work in [27], a two-dimensional contact velocity defined on a grid node is introduced, and the motivation is to keep an edge flux consistency with the node velocity connected to the edge intrinsically. The main new feature of the algorithm is to relax the condition that the contact pressures must be the same in the traditional HLLC solver. The discontinuous fluxes are constructed across each wave sampling direction rather than only along the contact wave direction. The two-dimensional contact velocity of the grid node is determined via enforcing conservation of mass, momentum and total energy, and thus the new method satisfies these conservation properties at nodes rather than on grid edges. Other good properties of the HLLC-2D solver, such as positivity and contact preservation, are described, and the two-dimensional high-order extension is constructed employing a MUSCL-type reconstruction procedure. Numerical results based on both quadrilateral and triangular grids are presented to demonstrate the robustness and the accuracy of this new solver, which shows it has better performance than the existing HLLC method.

  16. High Energy Boundary Conditions for a Cartesian Mesh Euler Solver

    NASA Technical Reports Server (NTRS)

    Pandya, Shishir; Murman, Scott; Aftosmis, Michael

    2003-01-01

    Inlets and exhaust nozzles are commonplace in the world of flight. Yet, many aerodynamic simulation packages do not provide a method of modelling such high energy boundaries in the flow field. For the purposes of aerodynamic simulation, inlets and exhausts are often faired over and it is assumed that the flow differences resulting from this assumption are minimal. While this is an adequate assumption for the prediction of lift, the lack of a plume behind the aircraft creates an evacuated base region, thus affecting both drag and pitching moment values. In addition, the flow in the base region is often mis-predicted, resulting in incorrect base drag. In order to accurately predict these quantities, a method for specifying inlet and exhaust conditions needs to be available in aerodynamic simulation packages. A method for a first approximation of a plume without accounting for chemical reactions is added to the Cartesian mesh based aerodynamic simulation package CART3D. The method consists of three steps. In the first step, a components approach where each triangle is assigned a component number is used. Here, a method for marking the inlet or exhaust plane triangles as separate components is discussed. In step two, the flow solver is modified to accept a reference state for the components marked inlet or exhaust. In the third step, the flow solver uses these separated components and the reference state to compute the correct flow condition at that triangle. The present method is implemented in the CART3D package, which consists of a set of tools for generating a Cartesian volume mesh from a set of component triangulations. The Euler equations are solved on the resulting unstructured Cartesian mesh. The usefulness of the method is demonstrated with two validation cases. A generic missile body is also presented to show the usefulness of the method on a real world geometry.

  17. Methods for Solving Gas Damping Problems in Perforated Microstructures Using a 2D Finite-Element Solver

    PubMed Central

    Veijola, Timo; Råback, Peter

    2007-01-01

    We present a straightforward method to solve gas damping problems for perforated structures in two dimensions (2D) utilising a Perforation Profile Reynolds (PPR) solver. The PPR equation is an extended Reynolds equation that includes additional terms modelling the leakage flow through the perforations, and variable diffusivity and compressibility profiles. The solution method consists of two phases: 1) determination of the specific admittance profile and relative diffusivity (and relative compressibility) profiles due to the perforation, and 2) solution of the PPR equation with a FEM solver in 2D. Rarefied gas corrections in the slip-flow region are also included. Analytic profiles for circular and square holes with slip conditions are presented in the paper. To verify the method, square perforated dampers with 16–64 holes were simulated with a three-dimensional (3D) Navier-Stokes solver, a homogenised extended Reynolds solver, and a 2D PPR solver. Cases for both translational (normal to the surfaces) and torsional motion were simulated. The presented method extends the region of accurate simulation of perforated structures to cases where the homogenisation method is inaccurate and the full 3D Navier-Stokes simulation is too time-consuming.

  18. Flowfield Comparisons from Three Navier-Stokes Solvers for an Axisymmetric Separate Flow Jet

    NASA Technical Reports Server (NTRS)

    Koch, L. Danielle; Bridges, James; Khavaran, Abbas

    2002-01-01

    To meet new noise reduction goals, many concepts to enhance mixing in the exhaust jets of turbofan engines are being studied. Accurate steady state flowfield predictions from state-of-the-art computational fluid dynamics (CFD) solvers are needed as input to the latest noise prediction codes. The main intent of this paper was to ascertain that similar Navier-Stokes solvers run at different sites would yield comparable results for an axisymmetric two-stream nozzle case. Predictions from the WIND and the NPARC codes are compared to previously reported experimental data and results from the CRAFT Navier-Stokes solver. Similar k-epsilon turbulence models were employed in each solver, and identical computational grids were used. Agreement between experimental data and predictions from each code was generally good for mean values. All three codes underpredict the maximum value of turbulent kinetic energy. The predicted locations of the maximum turbulent kinetic energy were farther downstream than seen in the data. A grid study was conducted using the WIND code, and comments about convergence criteria and grid requirements for CFD solutions to be used as input for noise prediction computations are given. Additionally, noise predictions from the MGBK code, using the CFD results from the CRAFT code, NPARC, and WIND as input are compared to data.

  19. Comparison of Einstein-Boltzmann solvers for testing general relativity

    NASA Astrophysics Data System (ADS)

    Bellini, E.; Barreira, A.; Frusciante, N.; Hu, B.; Peirone, S.; Raveri, M.; Zumalacárregui, M.; Avilez-Lopez, A.; Ballardini, M.; Battye, R. A.; Bolliet, B.; Calabrese, E.; Dirian, Y.; Ferreira, P. G.; Finelli, F.; Huang, Z.; Ivanov, M. M.; Lesgourgues, J.; Li, B.; Lima, N. A.; Pace, F.; Paoletti, D.; Sawicki, I.; Silvestri, A.; Skordis, C.; Umiltà, C.; Vernizzi, F.

    2018-01-01

    We compare Einstein-Boltzmann solvers that include modifications to general relativity and find that, for a wide range of models and parameters, they agree to a high level of precision. We look at three general purpose codes that primarily model general scalar-tensor theories, three codes that model Jordan-Brans-Dicke (JBD) gravity, a code that models f(R) gravity, a code that models covariant Galileons, a code that models Hořava-Lifschitz gravity, and two codes that model nonlocal models of gravity. Comparing predictions of the angular power spectrum of the cosmic microwave background and the power spectrum of dark matter for a suite of different models, we find agreement at the subpercent level. This means that this suite of Einstein-Boltzmann solvers is now sufficiently accurate for precision constraints on cosmological and gravitational parameters.

  20. Are Your Students Problem Performers or Problem Solvers?

    ERIC Educational Resources Information Center

    Barlow, Angela T.; Duncan, Matthew; Lischka, Alyson E.; Hartland, Kristin S.; Willingham, J. Christopher

    2017-01-01

    When presented with a problem in mathematics class, students often function as problem performers rather than problem solvers (Rigelman 2007). That is, rather than understanding the problem, students focus on using an operation to complete it. Students' tendencies to act as problem performers can prevent them from suggesting problem-solving…

  1. Hierarchically partitioned nonlinear equation solvers

    NASA Technical Reports Server (NTRS)

    Padovan, Joseph

    1987-01-01

    By partitioning the solution space into a number of subspaces, a new multiply constrained partitioned Newton-Raphson nonlinear equation solver is developed. Specifically, for a given iteration, each of the separate partitions is individually and simultaneously controlled. Due to the generality of the scheme, a hierarchy of partition levels can be employed. For finite-element-type applications, this includes the possibility of degree-of-freedom, nodal, elemental, geometric substructural, material and kinematically nonlinear group controls. It is noted that such partitioning can be continuously updated, depending on solution conditioning. In this context, convergence is ascertained at the individual partition level.

  2. A survey of SAT solver

    NASA Astrophysics Data System (ADS)

    Gong, Weiwei; Zhou, Xu

    2017-06-01

    In computer science, the Boolean Satisfiability Problem (SAT) is the problem of determining whether there exists an interpretation that satisfies a given Boolean formula. SAT was one of the first problems proven to be NP-complete, and it is fundamental to artificial intelligence, algorithm design and hardware design. This paper reviews the main algorithms of SAT solvers in recent years, including serial SAT algorithms, parallel SAT algorithms, SAT algorithms based on GPU, and SAT algorithms based on FPGA. The development of SAT is analyzed comprehensively in this paper. Finally, several possible directions for the development of the SAT problem are proposed.
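
    To make the SAT problem statement in the record above concrete, here is a minimal sketch of a serial DPLL-style search (unit propagation plus branching). It is an illustration only, not an algorithm taken from the survey; the names `simplify` and `dpll` and the DIMACS-style integer encoding of literals are assumptions chosen for this example.

```python
from typing import Dict, List, Optional

Clause = List[int]  # assumed encoding: 3 means x3 is true, -3 means x3 is false


def simplify(clauses: List[Clause], lit: int) -> Optional[List[Clause]]:
    """Assume `lit` is true: drop satisfied clauses, delete the opposite literal.

    Returns None when some clause becomes empty (a conflict).
    """
    out: List[Clause] = []
    for clause in clauses:
        if lit in clause:
            continue  # clause already satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None  # every literal in this clause is false
        out.append(reduced)
    return out


def dpll(clauses: List[Clause],
         assignment: Optional[Dict[int, bool]] = None) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(len(c) == 0 for c in clauses):
        return None
    # Unit propagation: a one-literal clause forces that literal.
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
        clauses = simplify(clauses, unit)
        if clauses is None:
            return None
    if not clauses:
        return assignment  # every clause satisfied
    # Branch on the first literal of the first remaining clause.
    lit = clauses[0][0]
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if result is not None:
                return result
    return None
```

    Calling `dpll([[1, 2], [-1, 3], [-2, -3]])` returns a dictionary mapping variable numbers to truth values, while `dpll([[1], [-1]])` returns `None`. Variables missing from the returned dictionary can take either value.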

  3. Treating convection in sequential solvers

    NASA Technical Reports Server (NTRS)

    Shyy, Wei; Thakur, Siddharth

    1992-01-01

    The treatment of the convection terms in the sequential solver, a standard procedure found in virtually all pressure based algorithms, to compute the flow problems with sharp gradients and source terms is investigated. Both scalar model problems and one-dimensional gas dynamics equations have been used to study the various issues involved. Different approaches including the use of nonlinear filtering techniques and adoption of TVD type schemes have been investigated. Special treatments of the source terms such as pressure gradients and heat release have also been devised, yielding insight and improved accuracy of the numerical procedure adopted.

  4. Tree-based solvers for adaptive mesh refinement code FLASH - I: gravity and optical depths

    NASA Astrophysics Data System (ADS)

    Wünsch, R.; Walch, S.; Dinnbier, F.; Whitworth, A.

    2018-04-01

    We describe an OctTree algorithm for the MPI parallel, adaptive mesh refinement code FLASH, which can be used to calculate the gas self-gravity, and also the angle-averaged local optical depth, for treating ambient diffuse radiation. The algorithm communicates to the different processors only those parts of the tree that are needed to perform the tree-walk locally. The advantage of this approach is a relatively low memory requirement, important in particular for the optical depth calculation, which needs to process information from many different directions. This feature also enables a general tree-based radiation transport algorithm that will be described in a subsequent paper, and delivers excellent scaling up to at least 1500 cores. Boundary conditions for gravity can be either isolated or periodic, and they can be specified in each direction independently, using a newly developed generalization of the Ewald method. The gravity calculation can be accelerated with the adaptive block update technique by partially re-using the solution from the previous time-step. Comparison with the FLASH internal multigrid gravity solver shows that tree-based methods provide a competitive alternative, particularly for problems with isolated or mixed boundary conditions. We evaluate several multipole acceptance criteria (MACs) and identify a relatively simple approximate partial error MAC which provides high accuracy at low computational cost. The optical depth estimates are found to agree very well with those of the RADMC-3D radiation transport code, with the tree-solver being much faster. Our algorithm is available in the standard release of the FLASH code in version 4.0 and later.

  5. ELSI: A unified software interface for Kohn–Sham electronic structure solvers

    DOE PAGES

    Yu, Victor Wen-zhe; Corsetti, Fabiano; Garcia, Alberto; ...

    2017-09-15

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. As a result, comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.

  6. ELSI: A unified software interface for Kohn-Sham electronic structure solvers

    NASA Astrophysics Data System (ADS)

    Yu, Victor Wen-zhe; Corsetti, Fabiano; García, Alberto; Huhn, William P.; Jacquelin, Mathias; Jia, Weile; Lange, Björn; Lin, Lin; Lu, Jianfeng; Mi, Wenhui; Seifitokaldani, Ali; Vázquez-Mayagoitia, Álvaro; Yang, Chao; Yang, Haizhao; Blum, Volker

    2018-01-01

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.

  7. ELSI: A unified software interface for Kohn–Sham electronic structure solvers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Victor Wen-zhe; Corsetti, Fabiano; Garcia, Alberto

    Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. As a result, comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.

  8. CASTRO: A NEW COMPRESSIBLE ASTROPHYSICAL SOLVER. II. GRAY RADIATION HYDRODYNAMICS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, W.; Almgren, A.; Bell, J.

    We describe the development of a flux-limited gray radiation solver for the compressible astrophysics code, CASTRO. CASTRO uses an Eulerian grid with block-structured adaptive mesh refinement based on a nested hierarchy of logically rectangular variable-sized grids with simultaneous refinement in both space and time. The gray radiation solver is based on a mixed-frame formulation of radiation hydrodynamics. In our approach, the system is split into two parts, one part that couples the radiation and fluid in a hyperbolic subsystem, and another parabolic part that evolves radiation diffusion and source-sink terms. The hyperbolic subsystem is solved explicitly with a high-order Godunov scheme, whereas the parabolic part is solved implicitly with a first-order backward Euler method.

  9. Courant Number and Mach Number Insensitive CE/SE Euler Solvers

    NASA Technical Reports Server (NTRS)

    Chang, Sin-Chung

    2005-01-01

    It has been known that the space-time CE/SE method can be used to obtain 1D, 2D, and 3D steady and unsteady flow solutions with Mach numbers ranging from 0.0028 to 10. However, it is also known that a CE/SE solution may become overly dissipative when the Mach number is very small. As an initial attempt to remedy this weakness, new 1D Courant number and Mach number insensitive CE/SE Euler solvers are developed using several key concepts underlying the recent successful development of Courant number insensitive CE/SE schemes. Numerical results indicate that the new solvers are capable of crisply resolving a contact discontinuity embedded in a flow with a maximum Mach number of 0.01.

  10. Efficient Implementation of Multigrid Solvers on Message-Passing Parallel Systems

    NASA Technical Reports Server (NTRS)

    Lou, John

    1994-01-01

    We discuss our implementation strategies for finite difference multigrid partial differential equation (PDE) solvers on message-passing systems. Our target parallel architecture is Intel parallel computers: the Delta and Paragon system.

  11. Robust large-scale parallel nonlinear solvers for simulations.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson

    2005-11-01

    This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their use in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple
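
    The idea in the record above, that Broyden's method replaces the Jacobian with an approximation, can be sketched in a few lines. The code below is a small dense variant that updates an approximate inverse Jacobian with a rank-one (Sherman-Morrison) correction. It is not the limited-memory implementation from the report; the function names, the identity starting matrix and the two-equation test system are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def broyden_solve(f: Callable[[Vector], Vector], x0: Vector,
                  tol: float = 1e-10, max_iter: int = 50) -> Tuple[Vector, int]:
    """Solve f(x) = 0 with Broyden's 'good' update applied to the inverse Jacobian.

    H approximates the inverse Jacobian and starts as the identity (an assumption
    that only suits problems whose Jacobian is close to the identity).
    """
    n = len(x0)
    x = list(x0)
    fx = f(x)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(max_iter):
        if max(abs(v) for v in fx) < tol:
            return x, k
        # Quasi-Newton step: dx = -H f(x); no Jacobian evaluation is needed.
        dx = [-sum(H[i][j] * fx[j] for j in range(n)) for i in range(n)]
        x_new = [x[i] + dx[i] for i in range(n)]
        f_new = f(x_new)
        df = [f_new[i] - fx[i] for i in range(n)]
        H_df = [sum(H[i][j] * df[j] for j in range(n)) for i in range(n)]
        dx_H = [sum(dx[i] * H[i][j] for i in range(n)) for j in range(n)]
        denom = sum(dx[i] * H_df[i] for i in range(n))
        if denom != 0.0:
            # Rank-one correction so that the updated H maps df to dx.
            for i in range(n):
                for j in range(n):
                    H[i][j] += (dx[i] - H_df[i]) * dx_H[j] / denom
        x, fx = x_new, f_new
    raise RuntimeError("Broyden iteration did not converge")


def example_system(v: Vector) -> Vector:
    """Hypothetical mildly nonlinear test system with a root near (0.615, 1.962)."""
    x, y = v
    return [x + 0.1 * y * y - 1.0, y + 0.1 * x * x - 2.0]


root, iterations = broyden_solve(example_system, [0.0, 0.0])
```

    Only function values are used, which is the property the report highlights for codes that cannot supply an accurate Jacobian. A dense `H` costs O(n^2) storage, which is why large-scale versions keep a limited history of update vectors instead.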

  12. Development of a grid-independent approximate Riemann solver. Ph.D. Thesis - Michigan Univ.

    NASA Technical Reports Server (NTRS)

    Rumsey, Christopher Lockwood

    1991-01-01

    A grid-independent approximate Riemann solver for use with the Euler and Navier-Stokes equations was introduced and explored. The two-dimensional Euler and Navier-Stokes equations are described in Cartesian and generalized coordinates, as well as the traveling wave form of the Euler equations. The spatial and temporal discretization are described for both explicit and implicit time-marching schemes. The grid-aligned flux function of Roe is outlined, while the 5-wave grid-independent flux function is derived. The stability and monotonicity analysis of the 5-wave model are presented. Two-dimensional results are provided and extended to three dimensions. The corresponding results are presented.

  13. A Nonlinear Modal Aeroelastic Solver for FUN3D

    NASA Technical Reports Server (NTRS)

    Goldman, Benjamin D.; Bartels, Robert E.; Biedron, Robert T.; Scott, Robert C.

    2016-01-01

    A nonlinear structural solver has been implemented internally within the NASA FUN3D computational fluid dynamics code, allowing for some new aeroelastic capabilities. Using a modal representation of the structure, a set of differential or differential-algebraic equations are derived for general thin structures with geometric nonlinearities. ODEPACK and LAPACK routines are linked with FUN3D, and the nonlinear equations are solved at each CFD time step. The existing predictor-corrector method is retained, whereby the structural solution is updated after mesh deformation. The nonlinear solver is validated using a test case for a flexible aeroshell at transonic, supersonic, and hypersonic flow conditions. Agreement with linear theory is seen for the static aeroelastic solutions at relatively low dynamic pressures, but structural nonlinearities limit deformation amplitudes at high dynamic pressures. No flutter was found at any of the tested trajectory points, though LCO may be possible in the transonic regime.

  14. Verification and Validation Studies for the LAVA CFD Solver

    NASA Technical Reports Server (NTRS)

    Moini-Yekta, Shayan; Barad, Michael F; Sozer, Emre; Brehm, Christoph; Housman, Jeffrey A.; Kiris, Cetin C.

    2013-01-01

    The verification and validation of the Launch Ascent and Vehicle Aerodynamics (LAVA) computational fluid dynamics (CFD) solver is presented. A modern strategy for verification and validation is described incorporating verification tests, validation benchmarks, continuous integration and version control methods for automated testing in a collaborative development environment. The purpose of the approach is to integrate the verification and validation process into the development of the solver and improve productivity. This paper uses the Method of Manufactured Solutions (MMS) for the verification of 2D Euler equations, 3D Navier-Stokes equations as well as turbulence models. A method for systematic refinement of unstructured grids is also presented. Verification using inviscid vortex propagation and flow over a flat plate is highlighted. Simulation results using laminar and turbulent flow past a NACA 0012 airfoil and ONERA M6 wing are validated against experimental and numerical data.
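
    As a rough illustration of how a Method of Manufactured Solutions check works, the sketch below verifies a second-order finite-difference solver on a 1D Poisson problem. This is a stand-in chosen for brevity and is not the LAVA procedure, which targets the 2D Euler and 3D Navier-Stokes equations; the manufactured solution u(x) = sin(pi x) and the function names are assumptions.

```python
import math


def max_error(n: int) -> float:
    """Solve -u'' = f on [0, 1] with u(0) = u(1) = 0 on n intervals.

    The source term f = pi^2 sin(pi x) is manufactured so that the exact
    solution is u = sin(pi x). Returns the maximum nodal error.
    """
    h = 1.0 / n
    xs = [i * h for i in range(1, n)]  # interior nodes
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    m = n - 1
    # Thomas algorithm for the tridiagonal system with rows (-1, 2, -1).
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = -1.0 / 2.0
    d_prime[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c_prime[i - 1]
        c_prime[i] = -1.0 / denom
        d_prime[i] = (rhs[i] + d_prime[i - 1]) / denom
    u = [0.0] * m
    u[-1] = d_prime[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d_prime[i] - c_prime[i] * u[i + 1]
    return max(abs(u[i] - math.sin(math.pi * xs[i])) for i in range(m))


def observed_order(n_coarse: int) -> float:
    """Estimate the convergence order from errors on grids with n and 2n intervals."""
    return math.log(max_error(n_coarse) / max_error(2 * n_coarse), 2)


order = observed_order(16)
```

    For a correctly implemented second-order scheme the observed order should be close to 2. A value well below that usually points to a coding error, which is what makes this kind of test suitable for automated regression suites.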

  15. Solvers' Making of Drawings in Mathematical Problem Solving and Their Understanding of the Problem Situations

    ERIC Educational Resources Information Center

    Nunokawa, Kazuhiko

    2004-01-01

    The purpose of this paper was to investigate how it becomes possible for solvers to make drawings to advance their problem solving processes, in order to understand the use of drawings in mathematical problem solving more deeply. For this purpose, three examples in which drawings made by the solver played a critical role in the solutions have been…

  16. A perspective on unstructured grid flow solvers

    NASA Technical Reports Server (NTRS)

    Venkatakrishnan, V.

    1995-01-01

    This survey paper assesses the status of compressible Euler and Navier-Stokes solvers on unstructured grids. Different spatial and temporal discretization options for steady and unsteady flows are discussed. The integration of these components into an overall framework to solve practical problems is addressed. Issues such as grid adaptation, higher order methods, hybrid discretizations and parallel computing are briefly discussed. Finally, some outstanding issues and future research directions are presented.

  17. Extending substructure based iterative solvers to multiple load and repeated analyses

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1993-01-01

    Direct solvers currently dominate commercial finite element structural software, but do not scale well in the fine granularity regime targeted by emerging parallel processors. Substructure based iterative solvers--often called also domain decomposition algorithms--lend themselves better to parallel processing, but must overcome several obstacles before earning their place in general purpose structural analysis programs. One such obstacle is the solution of systems with many or repeated right hand sides. Such systems arise, for example, in multiple load static analyses and in implicit linear dynamics computations. Direct solvers are well-suited for these problems because after the system matrix has been factored, the multiple or repeated solutions can be obtained through relatively inexpensive forward and backward substitutions. On the other hand, iterative solvers in general are ill-suited for these problems because they often must restart from scratch for every different right hand side. In this paper, we present a methodology for extending the range of applications of domain decomposition methods to problems with multiple or repeated right hand sides. Basically, we formulate the overall problem as a series of minimization problems over K-orthogonal and supplementary subspaces, and tailor the preconditioned conjugate gradient algorithm to solve them efficiently. The resulting solution method is scalable, whereas direct factorization schemes and forward and backward substitution algorithms are not. We illustrate the proposed methodology with the solution of static and dynamic structural problems, and highlight its potential to outperform forward and backward substitutions on parallel computers. As an example, we show that for a linear structural dynamics problem with 11640 degrees of freedom, every time-step beyond time-step 15 is solved in a single iteration and consumes 1.0 second on a 32 processor iPSC-860 system; for the same problem and the same parallel

  18. Ramses-GPU: Second order MUSCL-Hancock finite volume fluid solver

    NASA Astrophysics Data System (ADS)

    Kestener, Pierre

    2017-10-01

    RamsesGPU is a reimplementation of RAMSES (ascl:1011.007) which drops the adaptive mesh refinement (AMR) features to optimize 3D uniform grid algorithms for modern graphics processor units (GPU) to provide an efficient software package for astrophysics applications that do not need AMR features but do require a very large number of integration time steps. RamsesGPU provides a very efficient C++/CUDA/MPI software implementation of a second order MUSCL-Hancock finite volume fluid solver for compressible hydrodynamics as well as a magnetohydrodynamics solver based on the constraint transport technique. Other useful modules include static gravity, dissipative terms (viscosity, resistivity), and a forcing source term for turbulence studies; special care was taken to enhance parallel input/output performance by using state-of-the-art libraries such as HDF5 and parallel-netcdf.

  19. PUFoam : A novel open-source CFD solver for the simulation of polyurethane foams

    NASA Astrophysics Data System (ADS)

    Karimi, M.; Droghetti, H.; Marchisio, D. L.

    2017-08-01

    In this work a transient three-dimensional mathematical model is formulated and validated for the simulation of polyurethane (PU) foams. The model is based on computational fluid dynamics (CFD) and is coupled with a population balance equation (PBE) to describe the evolution of the gas bubbles/cells within the PU foam. The front face of the expanding foam is monitored on the basis of the volume-of-fluid (VOF) method using a compressible solver available in OpenFOAM version 3.0.1. The solver is additionally supplemented to include the PBE, solved with the quadrature method of moments (QMOM), the polymerization kinetics, an adequate rheological model and a simple model for the foam thermal conductivity. The new solver is labelled as PUFoam and is, for the first time in this work, validated for 12 different mixing-cup experiments. Comparison of the time evolution of the predicted and experimentally measured density and temperature of the PU foam shows the potentials and limitations of the approach.

  20. Multitasking domain decomposition fast Poisson solvers on the Cray Y-MP

    NASA Technical Reports Server (NTRS)

    Chan, Tony F.; Fatoohi, Rod A.

    1990-01-01

    The results of multitasking implementation of a domain decomposition fast Poisson solver on eight processors of the Cray Y-MP are presented. The object of this research is to study the performance of domain decomposition methods on a Cray supercomputer and to analyze the performance of different multitasking techniques using highly parallel algorithms. Two implementations of multitasking are considered: macrotasking (parallelism at the subroutine level) and microtasking (parallelism at the do-loop level). A conventional FFT-based fast Poisson solver is also multitasked. The results of different implementations are compared and analyzed. A speedup of over 7.4 on the Cray Y-MP running in a dedicated environment is achieved for all cases.
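    To put the reported speedup in context, the following minimal sketch converts it into a parallel efficiency and into the serial fraction that Amdahl's law would imply. The value 7.4 is taken as a lower bound from the abstract ("over 7.4" on eight processors); the function names are illustrative, and applying Amdahl's law here is an assumption of this sketch, not an analysis from the report.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


def amdahl_serial_fraction(speedup, processors):
    """Serial fraction f implied by Amdahl's law S = 1 / (f + (1 - f) / p)."""
    return (processors / speedup - 1.0) / (processors - 1.0)


# Speedup of 7.4 on 8 processors, as quoted in the abstract (lower bound).
efficiency = parallel_efficiency(7.4, 8)
serial_fraction = amdahl_serial_fraction(7.4, 8)
print(f"efficiency = {efficiency:.3f}, implied serial fraction = {serial_fraction:.4f}")
```

    Under these assumptions, an efficiency above 90% on eight processors corresponds to a serial fraction of roughly one percent.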

  1. An AMR capable finite element diffusion solver for ALE hydrocodes [An AMR capable diffusion solver for ALE-AMR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fisher, A. C.; Bailey, D. S.; Kaiser, T. B.

    2015-02-01

    Here, we present a novel method for the solution of the diffusion equation on a composite AMR mesh. This approach is suitable for including diffusion based physics modules to hydrocodes that support ALE and AMR capabilities. To illustrate, we proffer our implementations of diffusion based radiation transport and heat conduction in a hydrocode called ALE-AMR. Numerical experiments conducted with the diffusion solver and associated physics packages yield 2nd order convergence in the L2 norm.

  2. Modularization and Validation of FUN3D as a CREATE-AV Helios Near-Body Solver

    NASA Technical Reports Server (NTRS)

    Jain, Rohit; Biedron, Robert T.; Jones, William T.; Lee-Rausch, Elizabeth M.

    2016-01-01

    Under a recent collaborative effort between the US Army Aeroflightdynamics Directorate (AFDD) and NASA Langley, NASA's general unstructured CFD solver, FUN3D, was modularized as a CREATE-AV Helios near-body unstructured grid solver. The strategies adopted in Helios/FUN3D integration effort are described. A validation study of the new capability is performed for rotorcraft cases spanning hover prediction, airloads prediction, coupling with computational structural dynamics, counter-rotating dual-rotor configurations, and free-flight trim. The integration of FUN3D, along with the previously integrated NASA OVERFLOW solver, lays the ground for future interaction opportunities where capabilities of one component could be leveraged with those of others in a relatively seamless fashion within CREATE-AV Helios.

  3. Intellectual Abilities That Discriminate Good and Poor Problem Solvers.

    ERIC Educational Resources Information Center

    Meyer, Ruth Ann

    1981-01-01

    This study compared good and poor fourth-grade problem solvers on a battery of 19 "reference" tests for verbal, induction, numerical, word fluency, memory, perceptual speed, and simple visualization abilities. Results suggest verbal, numerical, and especially induction abilities are important to successful mathematical problem solving.…

  4. A GPU-based incompressible Navier-Stokes solver on moving overset grids

    NASA Astrophysics Data System (ADS)

    Chandar, Dominic D. J.; Sitaraman, Jayanarayanan; Mavriplis, Dimitri J.

    2013-07-01

    In pursuit of obtaining high fidelity solutions to the fluid flow equations in a short span of time, graphics processing units (GPUs) which were originally intended for gaming applications are currently being used to accelerate computational fluid dynamics (CFD) codes. With a high peak throughput of about 1 TFLOPS on a PC, GPUs seem to be favourable for many high-resolution computations. One such computation that involves a lot of number crunching is computing time accurate flow solutions past moving bodies. The aim of the present paper is thus to discuss the development of a flow solver on unstructured and overset grids and its implementation on GPUs. In its present form, the flow solver solves the incompressible fluid flow equations on unstructured/hybrid/overset grids using a fully implicit projection method. The resulting discretised equations are solved using a matrix-free Krylov solver using several GPU kernels such as gradient, Laplacian and reduction. Some of the simple arithmetic vector calculations are implemented using the CU++ approach (CU++: An Object Oriented Framework for Computational Fluid Dynamics Applications using Graphics Processing Units, Journal of Supercomputing, 2013, doi:10.1007/s11227-013-0985-9), where GPU kernels are automatically generated at compile time. Results are presented for two- and three-dimensional computations on static and moving grids.

  5. Anisotropic resonator analysis using the Fourier-Bessel mode solver

    NASA Astrophysics Data System (ADS)

    Gauthier, Robert C.

    2018-03-01

    A numerical mode solver for optical structures that conform to cylindrical symmetry using Faraday's and Ampere's laws as starting expressions is developed when electric or magnetic anisotropy is present. The technique builds on the existing Fourier-Bessel mode solver which allows resonator states to be computed exploiting the symmetry properties of the resonator and states to reduce the matrix system. The introduction of anisotropy into the theoretical framework facilitates the inclusion of PML borders permitting the computation of open ended structures and a better estimation of the resonator state quality factor. Matrix populating expressions are provided that can accommodate any material anisotropy with arbitrary orientation in the computation domain. Several examples of electrical anisotropic computations are provided for rotationally symmetric structures such as standard optical fibers, axial Bragg-ring fibers and bottle resonators. The anisotropy present in the materials introduces off diagonal matrix elements in the permittivity tensor when expressed in cylindrical coordinates. The effects of the anisotropy on computed states are presented and discussed.

  6. Towards a Coupled Vortex Particle and Acoustic Boundary Element Solver to Predict the Noise Production of Bio-Inspired Propulsion

    NASA Astrophysics Data System (ADS)

    Wagenhoffer, Nathan; Moored, Keith; Jaworski, Justin

    2016-11-01

    The design of quiet and efficient bio-inspired propulsive concepts requires a rapid, unified computational framework that integrates the coupled fluid dynamics with the noise generation. Such a framework is developed where the fluid motion is modeled with a two-dimensional unsteady boundary element method that includes a vortex-particle wake. The unsteady surface forces from the potential flow solver are then passed to an acoustic boundary element solver to predict the radiated sound in low-Mach-number flows. The use of the boundary element method for both the hydrodynamic and acoustic solvers permits dramatic computational acceleration by application of the fast multipole method. The reduced order of calculations due to the fast multipole method allows for greater spatial resolution of the vortical wake per unit of computational time. The coupled flow-acoustic solver is validated against canonical vortex-sound problems. The capability of the coupled solver is demonstrated by analyzing the performance and noise production of an isolated bio-inspired swimmer and of tandem swimmers.

  7. Evaluating the performance of the two-phase flow solver interFoam

    NASA Astrophysics Data System (ADS)

    Deshpande, Suraj S.; Anumolu, Lakshman; Trujillo, Mario F.

    2012-01-01

    The performance of the open source multiphase flow solver, interFoam, is evaluated in this work. The solver is based on a modified volume of fluid (VoF) approach, which incorporates an interfacial compression flux term to mitigate the effects of numerical smearing of the interface. It forms a part of the C++ libraries and utilities of OpenFOAM and is gaining popularity in the multiphase flow research community. However, to the best of our knowledge, the evaluation of this solver is confined to the validation tests of specific interest to the users of the code and the extent of its applicability to a wide range of multiphase flow situations remains to be explored. In this work, we have performed a thorough investigation of the solver performance using a variety of verification and validation test cases, which include (i) verification tests for pure advection (kinematics), (ii) dynamics in the high Weber number limit and (iii) dynamics of surface tension-dominated flows. With respect to (i), the kinematics tests show that the performance of interFoam is generally comparable with the recent algebraic VoF algorithms; however, it is noticeably worse than the geometric reconstruction schemes. For (ii), the simulations of inertia-dominated flows with large density ratios ~O(10^3) yielded excellent agreement with analytical and experimental results. In regime (iii), where surface tension is important, consistency of pressure-surface tension formulation and accuracy of curvature are important, as established by Francois et al (2006 J. Comput. Phys. 213 141-73). Several verification tests were performed along these lines and the main findings are: (a) the algorithm of interFoam ensures a consistent formulation of pressure and surface tension; (b) the curvatures computed by the solver converge to a value slightly (10%) different from the analytical value and a scope for improvement exists in this respect. To reduce the disruptive effects of spurious

  8. Menu-Driven Solver Of Linear-Programming Problems

    NASA Technical Reports Server (NTRS)

    Viterna, L. A.; Ferencz, D.

    1992-01-01

    Program assists inexperienced user in formulating linear-programming problems. A Linear Program Solver (ALPS) computer program is full-featured LP analysis program. Solves plain linear-programming problems as well as more-complicated mixed-integer and pure-integer programs. Also contains efficient technique for solution of purely binary linear-programming problems. Written entirely in IBM's APL2/PC software, Version 1.01. Packed program contains licensed material, property of IBM (copyright 1988, all rights reserved).

  9. An Adaptive Flow Solver for Air-Borne Vehicles Undergoing Time-Dependent Motions/Deformations

    NASA Technical Reports Server (NTRS)

    Singh, Jatinder; Taylor, Stephen

    1997-01-01

    This report describes a concurrent Euler flow solver for flows around complex 3-D bodies. The solver is based on a cell-centered finite volume methodology on 3-D unstructured tetrahedral grids. In this algorithm, spatial discretization for the inviscid convective term is accomplished using an upwind scheme. A localized, second order accurate reconstruction is done for the flow variables. Evolution in time is accomplished using an explicit three-stage Runge-Kutta method which has second order temporal accuracy. This is adapted for concurrent execution using another proven methodology based on concurrent graph abstraction. This solver operates on heterogeneous network architectures. These architectures may include a broad variety of UNIX workstations and PCs running Windows NT, symmetric multiprocessors and distributed-memory multi-computers. The unstructured grid is generated using commercial grid generation tools. The grid is automatically partitioned using a concurrent algorithm based on heat diffusion. This results in memory requirements that are inversely proportional to the number of processors. The solver uses automatic granularity control and resource management techniques both to balance load and communication requirements, and to deal with differing memory constraints. These ideas are again based on heat diffusion. Results are subsequently combined for visualization and analysis using commercial CFD tools. Flow simulation results are demonstrated for a constant-section wing in subsonic, transonic, and supersonic cases. These results are compared with experimental data and numerical results of other researchers. Performance measurements are under way for a variety of network topologies.
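    The abstract does not spell out its heat-diffusion-based balancing algorithm, so the following is only a minimal sketch of the textbook first-order diffusion scheme, in which each processor repeatedly exchanges a fraction of its load difference with each neighbor. The ring topology, the coefficient `alpha`, and the function name are assumptions made for illustration.

```python
def diffuse_load(load, neighbors, alpha=0.25, sweeps=200):
    """First-order diffusion: node i gains alpha * (load[j] - load[i]) from each neighbor j."""
    load = list(load)
    for _ in range(sweeps):
        new_load = list(load)
        for i, nbrs in enumerate(neighbors):
            for j in nbrs:
                new_load[i] += alpha * (load[j] - load[i])
        load = new_load
    return load


# Hypothetical ring of 4 processors, with all work initially on processor 0.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]
balanced = diffuse_load([100.0, 0.0, 0.0, 0.0], ring)
print(balanced)
```

    Because every exchange is symmetric, the total load is conserved while the distribution relaxes toward uniform, much as temperature equalizes in heat conduction.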

  10. SediFoam: A general-purpose, open-source CFD-DEM solver for particle-laden flow with emphasis on sediment transport

    NASA Astrophysics Data System (ADS)

    Sun, Rui; Xiao, Heng

    2016-04-01

    With the growth of available computational resources, CFD-DEM (computational fluid dynamics-discrete element method) becomes an increasingly promising and feasible approach for the study of sediment transport. Several existing CFD-DEM solvers are applied in chemical engineering and mining industry. However, a robust CFD-DEM solver for the simulation of sediment transport is still desirable. In this work, the development of a three-dimensional, massively parallel, and open-source CFD-DEM solver SediFoam is detailed. This solver is built based on open-source solvers OpenFOAM and LAMMPS. OpenFOAM is a CFD toolbox that can perform three-dimensional fluid flow simulations on unstructured meshes; LAMMPS is a massively parallel DEM solver for molecular dynamics. Several validation tests of SediFoam are performed using cases of a wide range of complexities. The results obtained in the present simulations are consistent with those in the literature, which demonstrates the capability of SediFoam for sediment transport applications. In addition to the validation test, the parallel efficiency of SediFoam is studied to test the performance of the code for large-scale and complex simulations. The parallel efficiency tests show that the scalability of SediFoam is satisfactory in the simulations using up to O(10^7) particles.

  11. Bounded fractional diffusion in geological media: Definition and Lagrangian approximation

    NASA Astrophysics Data System (ADS)

    Zhang, Yong; Green, Christopher T.; LaBolle, Eric M.; Neupauer, Roseanna M.; Sun, HongGuang

    2016-11-01

    Spatiotemporal fractional-derivative models (FDMs) have been increasingly used to simulate non-Fickian diffusion, but methods have not been available to define boundary conditions for FDMs in bounded domains. This study defines boundary conditions and then develops a Lagrangian solver to approximate bounded, one-dimensional fractional diffusion. Both the zero-value and nonzero-value Dirichlet, Neumann, and mixed Robin boundary conditions are defined, where the sign of Riemann-Liouville fractional derivative (capturing nonzero-value spatial-nonlocal boundary conditions with directional superdiffusion) remains consistent with the sign of the fractional-diffusive flux term in the FDMs. New Lagrangian schemes are then proposed to track solute particles moving in bounded domains, where the solutions are checked against analytical or Eulerian solutions available for simplified FDMs. Numerical experiments show that the particle-tracking algorithm for non-Fickian diffusion differs from Fickian diffusion in relocating the particle position around the reflective boundary, likely due to the nonlocal and nonsymmetric fractional diffusion. For a nonzero-value Neumann or Robin boundary, a source cell with a reflective face can be applied to define the release rate of random-walking particles at the specified flux boundary. Mathematical definitions of physically meaningful nonlocal boundaries combined with bounded Lagrangian solvers in this study may provide the only viable techniques at present to quantify the impact of boundaries on anomalous diffusion, expanding the applicability of FDMs from infinite domains to those with any size and boundary conditions.
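    As a point of reference for the particle-tracking discussion, here is a minimal sketch of the classical Fickian case only: Gaussian random-walk steps with mirror reflection at the walls of a bounded one-dimensional domain. The abstract notes that the non-Fickian algorithm relocates particles differently near a reflective boundary, and that scheme is not reproduced here. All names and parameter values are illustrative assumptions.

```python
import random


def reflect(x, lower, upper):
    """Fold a position back into [lower, upper] by mirror reflection at the walls."""
    while x < lower or x > upper:
        if x < lower:
            x = 2.0 * lower - x
        else:
            x = 2.0 * upper - x
    return x


def random_walk(n_particles, n_steps, diffusivity, dt, lower=0.0, upper=1.0, seed=0):
    """Track particles undergoing Fickian diffusion between two reflective walls."""
    rng = random.Random(seed)
    sigma = (2.0 * diffusivity * dt) ** 0.5  # standard deviation of one Fickian step
    positions = [0.5 * (lower + upper)] * n_particles
    for _ in range(n_steps):
        positions = [reflect(x + rng.gauss(0.0, sigma), lower, upper) for x in positions]
    return positions


positions = random_walk(n_particles=1000, n_steps=200, diffusivity=0.01, dt=0.1)
print(min(positions), max(positions))
```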

  12. Bounded fractional diffusion in geological media: Definition and Lagrangian approximation

    USGS Publications Warehouse

    Zhang, Yong; Green, Christopher T.; LaBolle, Eric M.; Neupauer, Roseanna M.; Sun, HongGuang

    2016-01-01

    Spatiotemporal Fractional-Derivative Models (FDMs) have been increasingly used to simulate non-Fickian diffusion, but methods have not been available to define boundary conditions for FDMs in bounded domains. This study defines boundary conditions and then develops a Lagrangian solver to approximate bounded, one-dimensional fractional diffusion. Both the zero-value and non-zero-value Dirichlet, Neumann, and mixed Robin boundary conditions are defined, where the sign of Riemann-Liouville fractional derivative (capturing non-zero-value spatial-nonlocal boundary conditions with directional super-diffusion) remains consistent with the sign of the fractional-diffusive flux term in the FDMs. New Lagrangian schemes are then proposed to track solute particles moving in bounded domains, where the solutions are checked against analytical or Eulerian solutions available for simplified FDMs. Numerical experiments show that the particle-tracking algorithm for non-Fickian diffusion differs from Fickian diffusion in relocating the particle position around the reflective boundary, likely due to the non-local and non-symmetric fractional diffusion. For a non-zero-value Neumann or Robin boundary, a source cell with a reflective face can be applied to define the release rate of random-walking particles at the specified flux boundary. Mathematical definitions of physically meaningful nonlocal boundaries combined with bounded Lagrangian solvers in this study may provide the only viable techniques at present to quantify the impact of boundaries on anomalous diffusion, expanding the applicability of FDMs from infinite domains to those with any size and boundary conditions.

  13. Using the Gurobi Solvers on the Peregrine System | High-Performance

    Science.gov Websites

    Peregrine System Gurobi Optimizer is a suite of solvers for mathematical programming. It is licensed for ('GRB_MATLAB_PATH') >> path(path,grb) Gurobi and GAMS GAMS is a high-level modeling system for mathematical

  14. Nearly Interactive Parabolized Navier-Stokes Solver for High Speed Forebody and Inlet Flows

    NASA Technical Reports Server (NTRS)

    Benson, Thomas J.; Liou, May-Fun; Jones, William H.; Trefny, Charles J.

    2009-01-01

    A system of computer programs is being developed for the preliminary design of high speed inlets and forebodies. The system comprises four functions: geometry definition, flow grid generation, flow solver, and graphics post-processor. The system runs on a dedicated personal computer using the Windows operating system and is controlled by graphical user interfaces written in MATLAB (The Mathworks, Inc.). The flow solver uses the Parabolized Navier-Stokes equations to compute millions of mesh points in several minutes. Sample two-dimensional and three-dimensional calculations are demonstrated in the paper.

  15. On improving linear solver performance: a block variant of GMRES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baker, A H; Dennis, J M; Jessup, E R

    2004-05-10

    The increasing gap between processor performance and memory access time warrants the re-examination of data movement in iterative linear solver algorithms. For this reason, we explore and establish the feasibility of modifying a standard iterative linear solver algorithm in a manner that reduces the movement of data through memory. In particular, we present an alternative to the restarted GMRES algorithm for solving a single right-hand side linear system Ax = b based on solving the block linear system AX = B. Algorithm performance, i.e. time to solution, is improved by using the matrix A in operations on groups of vectors. Experimental results demonstrate the importance of implementation choices on data movement as well as the effectiveness of the new method on a variety of problems from different application areas.
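    The data-movement argument can be illustrated with a small sketch that counts how often entries of A are read when it is applied to several vectors one at a time versus to the whole group at once. This shows only the matrix-times-block kernel, not block GMRES itself; the dense list-of-lists storage, the counters, and the function names are assumptions of this sketch.

```python
def matvec(A, x, counter):
    """y = A x; counter[0] tallies how many entries of A are read."""
    y = []
    for row in A:
        total = 0.0
        for a, xj in zip(row, x):
            counter[0] += 1
            total += a * xj
        y.append(total)
    return y


def matmat(A, X, counter):
    """Y = A X for a block X stored as a list of vectors; each entry of A is read once."""
    k = len(X)
    Y = [[0.0] * len(A) for _ in range(k)]
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            counter[0] += 1
            for c in range(k):  # reuse the loaded entry for every vector in the block
                Y[c][i] += a * X[c][j]
    return Y


A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # a group of two vectors

one_by_one, blocked = [0], [0]
Y_single = [matvec(A, x, one_by_one) for x in X]
Y_block = matmat(A, X, blocked)
print(one_by_one[0], blocked[0])
```

    Both variants produce the same products, but the blocked version reads each matrix entry once regardless of how many vectors are in the group.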

  16. Working towards a numerical solver for seismic wave propagation in unsaturated porous media

    NASA Astrophysics Data System (ADS)

    Boxberg, Marc S.; Friederich, Wolfgang

    2017-04-01

    Modeling the propagation of seismic waves in porous media is becoming increasingly popular in the seismological community. However, it is still a challenging task in the field of computational seismology. Nevertheless, it is important to account for the fluid content of, e.g., reservoir rocks or soils, and the interaction between the fluid and the rock or between different immiscible fluids to accurately describe seismic wave propagation through such porous media. Often, numerical models are based on the elastic wave equation and some might include artificially introduced attenuation. This simplifies the computation, because it only approximates the physics behind the problem. However, the results are also simplified and could miss phenomena and lack accuracy in some applications. We present a numerical solver for wave propagation in porous media saturated by two immiscible fluids. It is based on Biot's theory of poroelasticity and accounts for macroscopic flow that occurs on the same scale as the wavelength of the seismic waves. Fluid flow is described by a Darcy type flow law and interactions between the fluids by means of capillary pressure curve models. In addition, consistent boundary conditions on interfaces between poroelastic media and elastic or acoustic media are derived from this poroelastic theory itself. The poroelastic solver is integrated into the larger software package NEXD that uses the nodal discontinuous Galerkin method to solve wave equations in 1D, 2D, and 3D on a mesh of linear (1D), triangular (2D), or tetrahedral (3D) elements. Triangular and tetrahedral elements have great advantages as soon as the model has a complex structure, as is often the case for geologic models. We illustrate the capabilities of the codes by numerical examples. This work can be applied to various scientific questions in, e.g., exploration and monitoring of hydrocarbon or geothermal reservoirs as well as CO2 storage sites.

  17. The development of an intelligent interface to a computational fluid dynamics flow-solver code

    NASA Technical Reports Server (NTRS)

    Williams, Anthony D.

    1988-01-01

    Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, 3-D, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.

  18. The development of an intelligent interface to a computational fluid dynamics flow-solver code

    NASA Technical Reports Server (NTRS)

    Williams, Anthony D.

    1988-01-01

    Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, three-dimensional, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.

  19. Notes on the ExactPack Implementation of the DSD Rate Stick Solver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaul, Ann

    It has been shown above that the discretization scheme implemented in the ExactPack solver for the DSD Rate Stick equation is consistent with the Rate Stick PDE. In addition, a stability analysis has provided a CFL condition for a stable time step. Together, consistency and stability imply convergence of the scheme, which is expected to be close to first-order in time and second-order in space. It is understood that the nonlinearity of the underlying PDE will affect this rate somewhat. In the solver I implemented in ExactPack, I used the one-sided boundary condition described above at the outer boundary. In addition, I used 80% of the time step calculated in the stability analysis above. By making these two changes, I was able to implement a solver that calculates the solution without any arbitrary limits placed on the values of the curvature at the boundary. Thus, the calculation is driven directly by the conditions at the boundary as formulated in the DSD theory. The chosen scheme is completely coherent and defensible from a mathematical standpoint.
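    The practice of running at 80% of the stability limit can be sketched on a stand-in problem. The abstract does not give the Rate Stick CFL condition, so the example below substitutes the linear heat equation with a forward-time, centered-space (FTCS) update, whose well-known limit is dt <= dx^2 / (2 D). The function names, grid, and diffusivity are hypothetical and are not taken from ExactPack.

```python
def stable_time_step(dx, diffusivity, safety=0.8):
    """Time step for explicit FTCS diffusion: safety * dx^2 / (2 D)."""
    return safety * dx * dx / (2.0 * diffusivity)


def ftcs_step(u, dt, dx, diffusivity):
    """One explicit update of u_t = D u_xx with fixed end values."""
    r = diffusivity * dt / (dx * dx)
    interior = [u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1]) for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


dx, D = 0.1, 1.0
dt = stable_time_step(dx, D)  # 80% of the stability limit

u = [0.0] * 5 + [1.0] + [0.0] * 5  # initial spike in the middle of the grid
for _ in range(100):
    u = ftcs_step(u, dt, dx, D)
print(dt, max(u))
```

    With the safety factor applied, the update remains a convex combination of neighboring values, so the spike decays smoothly without oscillation.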

  20. Flutter and Forced Response Analyses of Cascades using a Two-Dimensional Linearized Euler Solver

    NASA Technical Reports Server (NTRS)

    Reddy, T. S. R.; Srivastava, R.; Mehmed, O.

    1999-01-01

    Flutter and forced response analyses for a cascade of blades in subsonic and transonic flow are presented. The structural model for each blade is a typical section with bending and torsion degrees of freedom. The unsteady aerodynamic forces due to bending and torsion motions and due to a vortical gust disturbance are obtained by solving unsteady linearized Euler equations. The unsteady linearized equations are obtained by linearizing the unsteady nonlinear equations about the steady flow. The predicted unsteady aerodynamic forces include the effect of steady aerodynamic loading due to airfoil shape, thickness and angle of attack. The aeroelastic equations are solved in the frequency domain by coupling the unsteady aerodynamic forces to the aeroelastic solver MISER. The present unsteady aerodynamic solver showed good correlation with published results for both flutter and forced response predictions. Further improvements are required to use the unsteady aerodynamic solver in a design cycle.

  1. Large-scale 3-D EM modelling with a Block Low-Rank multifrontal direct solver

    NASA Astrophysics Data System (ADS)

    Shantsev, Daniil V.; Jaysaval, Piyoosh; de la Kethulle de Ryhove, Sébastien; Amestoy, Patrick R.; Buttari, Alfredo; L'Excellent, Jean-Yves; Mary, Theo

    2017-06-01

    We put forward the idea of using a Block Low-Rank (BLR) multifrontal direct solver to efficiently solve the linear systems of equations arising from a finite-difference discretization of the frequency-domain Maxwell equations for 3-D electromagnetic (EM) problems. The solver uses a low-rank representation for the off-diagonal blocks of the intermediate dense matrices arising in the multifrontal method to reduce the computational load. A numerical threshold, the so-called BLR threshold, controlling the accuracy of low-rank representations was optimized by balancing errors in the computed EM fields against savings in floating point operations (flops). Simulations were carried out over large-scale 3-D resistivity models representing typical scenarios for marine controlled-source EM surveys, and in particular the SEG SEAM model which contains an irregular salt body. The flop count, size of factor matrices and elapsed run time for matrix factorization are reduced dramatically by using BLR representations and can go down to, respectively, 10, 30 and 40 per cent of their full-rank values for our largest system with N = 20.6 million unknowns. The reductions are almost independent of the number of MPI tasks and threads at least up to 90 × 10 = 900 cores. The BLR savings increase for larger systems, which reduces the factorization flop complexity from O(N^2) for the full-rank solver to O(N^m) with m = 1.4-1.6. The BLR savings are significantly larger for deep-water environments that exclude the highly resistive air layer from the computational domain. A study in a scenario where simulations are required at multiple source locations shows that the BLR solver can become competitive in comparison to iterative solvers as an engine for 3-D controlled-source electromagnetic Gauss-Newton inversion that requires forward modelling for a few thousand right-hand sides.

  2. Extending Clause Learning of SAT Solvers with Boolean Gröbner Bases

    NASA Astrophysics Data System (ADS)

    Zengler, Christoph; Küchlin, Wolfgang

    We extend clause learning as performed by most modern SAT Solvers by integrating the computation of Boolean Gröbner bases into the conflict learning process. Instead of learning only one clause per conflict, we compute and learn additional binary clauses from a Gröbner basis of the current conflict. We used the Gröbner basis engine of the logic package Redlog contained in the computer algebra system Reduce to extend the SAT solver MiniSAT with Gröbner basis learning. Our approach shows a significant reduction of conflicts and a reduction of restarts and computation time on many hard problems from the SAT 2009 competition.

  3. Agglomeration Multigrid for an Unstructured-Grid Flow Solver

    NASA Technical Reports Server (NTRS)

    Frink, Neal; Pandya, Mohagna J.

    2004-01-01

    An agglomeration multigrid scheme has been implemented into the sequential version of the NASA code USM3Dns, a tetrahedral cell-centered finite volume Euler/Navier-Stokes flow solver. Efficiency and robustness of the multigrid-enhanced flow solver have been assessed for three configurations assuming an inviscid flow and one configuration assuming a viscous fully turbulent flow. The inviscid studies include a transonic flow over the ONERA M6 wing and a generic business jet with flow-through nacelles and a low subsonic flow over a high-lift trapezoidal wing. The viscous case includes a fully turbulent flow over the RAE 2822 rectangular wing. The multigrid solutions converged with 12%-33% of the Central Processing Unit (CPU) time required by the solutions obtained without multigrid. For all of the inviscid cases, multigrid in conjunction with an explicit time-stepping scheme performed the best with regard to the run time memory and CPU time requirements. However, for the viscous case multigrid had to be used with an implicit backward Euler time-stepping scheme that increased the run time memory requirement by 22% as compared to the run made without multigrid.

  4. Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems

    DOE PAGES

    Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; ...

    2012-01-01

    Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples the algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.

  5. On multigrid solution of the implicit equations of hydrodynamics. Experiments for the compressible Euler equations in general coordinates

    NASA Astrophysics Data System (ADS)

    Kifonidis, K.; Müller, E.

    2012-08-01

    Aims: We describe and study a family of new multigrid iterative solvers for the multidimensional, implicitly discretized equations of hydrodynamics. Schemes of this class are free of the Courant-Friedrichs-Lewy condition. They are intended for simulations in which widely differing wave propagation timescales are present. A preferred solver in this class is identified. Applications to some simple stiff test problems that are governed by the compressible Euler equations, are presented to evaluate the convergence behavior, and the stability properties of this solver. Algorithmic areas are determined where further work is required to make the method sufficiently efficient and robust for future application to difficult astrophysical flow problems. Methods: The basic equations are formulated and discretized on non-orthogonal, structured curvilinear meshes. Roe's approximate Riemann solver and a second-order accurate reconstruction scheme are used for spatial discretization. Implicit Runge-Kutta (ESDIRK) schemes are employed for temporal discretization. The resulting discrete equations are solved with a full-coarsening, non-linear multigrid method. Smoothing is performed with multistage-implicit smoothers. These are applied here to the time-dependent equations by means of dual time stepping. Results: For steady-state problems, our results show that the efficiency of the present approach is comparable to the best implicit solvers for conservative discretizations of the compressible Euler equations that can be found in the literature. The use of red-black as opposed to symmetric Gauss-Seidel iteration in the multistage-smoother is found to have only a minor impact on multigrid convergence. This should enable scalable parallelization without having to seriously compromise the method's algorithmic efficiency. For time-dependent test problems, our results reveal that the multigrid convergence rate degrades with increasing Courant numbers (i.e. time step sizes). Beyond a
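    The remark that red-black ordering enables scalable parallelization can be illustrated on a much simpler model problem than the one in the paper. The sketch below applies red-black Gauss-Seidel to the one-dimensional Poisson equation -u'' = f with zero boundary values; it is not the multistage-implicit smoother or the Euler discretization described in the abstract, and the function name and grid size are assumptions.

```python
def red_black_gauss_seidel(f, h, sweeps):
    """Solve -u'' = f on a uniform grid with u = 0 at both ends."""
    n = len(f)
    u = [0.0] * n
    for _ in range(sweeps):
        for start in (1, 2):  # odd-indexed ("red") points first, then even ("black")
            # Each point of one color depends only on points of the other color,
            # so all updates inside this inner loop are independent of one another.
            for i in range(start, n - 1, 2):
                u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u


n = 17
h = 1.0 / (n - 1)
u = red_black_gauss_seidel([2.0] * n, h, sweeps=2000)

# For f = 2 the exact solution is u(x) = x (1 - x).
exact = [i * h * (1.0 - i * h) for i in range(n)]
error = max(abs(a - b) for a, b in zip(u, exact))
print(error)
```

    Within each half-sweep the updates could be distributed across processors without changing the result, which is the property that a lexicographic (or symmetric) Gauss-Seidel ordering lacks.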

  6. A Hermite WENO reconstruction for fourth order temporal accurate schemes based on the GRP solver for hyperbolic conservation laws

    NASA Astrophysics Data System (ADS)

    Du, Zhifang; Li, Jiequan

    2018-02-01

    This paper develops a new fifth order accurate Hermite WENO (HWENO) reconstruction method for hyperbolic conservation schemes in the framework of the two-stage fourth order accurate temporal discretization in Li and Du (2016) [13]. Instead of computing the first moment of the solution additionally in the conventional HWENO or DG approach, we can directly take the interface values, which are already available in the numerical flux construction using the generalized Riemann problem (GRP) solver, to approximate the first moment. The resulting scheme is fourth order temporal accurate by only invoking the HWENO reconstruction twice so that it becomes more compact. Numerical experiments show that such compactness makes significant impact on the resolution of nonlinear waves.

  7. A contribution to the great Riemann solver debate

    NASA Technical Reports Server (NTRS)

    Quirk, James J.

    1992-01-01

    The aims of this paper are threefold: to increase the level of awareness within the shock capturing community to the fact that many Godunov-type methods contain subtle flaws that can cause spurious solutions to be computed; to identify one mechanism that might thwart attempts to produce very high resolution simulations; and to proffer a simple strategy for overcoming the specific failings of individual Riemann solvers.

  8. A Comparison of Monte Carlo and Deterministic Solvers for keff and Sensitivity Calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haeck, Wim; Parsons, Donald Kent; White, Morgan Curtis

    Verification and validation of our solutions for calculating the neutron reactivity for nuclear materials is a key issue to address for many applications, including criticality safety, research reactors, power reactors, and nuclear security. Neutronics codes solve variations of the Boltzmann transport equation. The two main variants are Monte Carlo versus deterministic solutions, e.g. the MCNP [1] versus PARTISN [2] codes, respectively. There have been many studies over the decades that examined the accuracy of such solvers and the general conclusion is that when the problems are well-posed, either solver can produce accurate results. However, the devil is always in the details. The current study examines the issue of self-shielding and the stress it puts on deterministic solvers. Most Monte Carlo neutronics codes use continuous-energy descriptions of the neutron interaction data that are not subject to this effect. The issue of self-shielding occurs because of the discretisation of data used by the deterministic solutions. Multigroup data used in these solvers are the average cross section and scattering parameters over an energy range. Resonances in cross sections can occur that change the likelihood of interaction by one to three orders of magnitude over a small energy range. Self-shielding is the numerical effect that the average cross section in groups with strong resonances can be strongly affected as neutrons within that material are preferentially absorbed or scattered out of the resonance energies. This affects both the average cross section and the scattering matrix.

  9. Application of a fast Newton-Krylov solver for equilibrium simulations of phosphorus and oxygen

    NASA Astrophysics Data System (ADS)

    Fu, Weiwei; Primeau, François

    2017-11-01

    Model drift due to inadequate spinup is a serious problem that complicates the interpretation of climate change simulations. Even after a 300 year spinup we show that solutions are not only still drifting but often drifting away from their eventual equilibrium over large parts of the ocean. Here we present a Newton-Krylov solver for computing cyclostationary equilibrium solutions of a biogeochemical model for the cycling of phosphorus and oxygen. In addition to using previously developed preconditioning strategies - time-averaging and coarse-graining the Jacobian matrix - we also introduce a new strategy: the adiabatic elimination of a fast variable (particulate organic phosphorus) by slaving it to a slow variable (dissolved inorganic phosphorus). We use transport matrices derived from the Community Earth System Model (CESM) with a nominal horizontal resolution of 1° × 1° and 60 vertical levels to implement and test the solver. We find that the new solver obtains seasonally-varying equilibrium solutions with no visible drift using no more than 80 simulation years.

  10. Numerical Investigation of Vertical Plunging Jet Using a Hybrid Multifluid–VOF Multiphase CFD Solver

    DOE PAGES

    Shonibare, Olabanji Y.; Wardle, Kent E.

    2015-06-28

    A novel hybrid multiphase flow solver has been used to conduct simulations of a vertical plunging liquid jet. This solver combines a multifluid methodology with selective interface sharpening to enable simulation of both the initial jet impingement and the long-time entrained bubble plume phenomena. Models are implemented for variable bubble size capturing and dynamic switching of interface sharpened regions to capture transitions between the initially fully segregated flow types into the dispersed bubbly flow regime. It was found that the solver was able to capture the salient features of the flow phenomena under study and areas for quantitative improvement have been explored and identified. In particular, a population balance approach is employed and detailed calibration of the underlying models with experimental data is required to enable quantitative prediction of bubble size and distribution to capture the transition between segregated and dispersed flow types with greater fidelity.

  11. Pushing Memory Bandwidth Limitations Through Efficient Implementations of Block-Krylov Space Solvers on GPUs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clark, M. A.; Strelchenko, Alexei; Vaquero, Alejandro

    Lattice quantum chromodynamics simulations in nuclear physics have benefited from a tremendous number of algorithmic advances such as multigrid and eigenvector deflation. These improve the time to solution but do not alleviate the intrinsic memory-bandwidth constraints of the matrix-vector operation dominating iterative solvers. Batching this operation for multiple vectors and exploiting cache and register blocking can yield a super-linear speed up. Block-Krylov solvers can naturally take advantage of such batched matrix-vector operations, further reducing the iterations to solution by sharing the Krylov space between solves. However, practical implementations typically suffer from the quadratic scaling in the number of vector-vector operations. Using the QUDA library, we present an implementation of a block-CG solver on NVIDIA GPUs which reduces the memory-bandwidth complexity of vector-vector operations from quadratic to linear. We present results for the HISQ discretization, showing a 5x speedup compared to highly-optimized independent Krylov solves on NVIDIA's SaturnV cluster.
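
    As an illustration of the block-Krylov idea described in this record, the following is a minimal sketch of a textbook block conjugate gradient iteration in NumPy. It is not the QUDA implementation and does not include the paper's bandwidth optimizations; the function name block_cg and its parameters are assumptions made for this example. It shows how a single batched product A @ P serves all right-hand sides, and how the small s-by-s products (P.T @ Q, R.T @ R) account for the quadratic number of vector-vector operations mentioned in the abstract.

```python
import numpy as np

def block_cg(A, B, tol=1e-8, max_iter=200):
    """Textbook block CG for A X = B with A symmetric positive definite.

    A: (n, n) array, B: (n, s) array holding s right-hand sides.
    Hypothetical helper for illustration only (no breakdown handling).
    """
    X = np.zeros_like(B, dtype=float)
    R = B - A @ X                      # block residual
    P = R.copy()                       # block search directions
    RtR = R.T @ R                      # s x s Gram matrix
    b_norm = np.linalg.norm(B)
    for _ in range(max_iter):
        if np.linalg.norm(R) <= tol * b_norm:
            break
        Q = A @ P                      # one batched mat-vec for all s columns
        alpha = np.linalg.solve(P.T @ Q, RtR)
        X += P @ alpha
        R -= Q @ alpha
        RtR_new = R.T @ R
        beta = np.linalg.solve(RtR, RtR_new)
        P = R + P @ beta
        RtR = RtR_new
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((40, 40))
    A = M @ M.T / 40 + np.eye(40)      # well-conditioned SPD test matrix
    B = rng.standard_normal((40, 3))
    X = block_cg(A, B)
    print(np.linalg.norm(A @ X - B))
```

    A production solver would also need to guard against rank deficiency in the residual block, which this sketch ignores.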

  12. Input-output-controlled nonlinear equation solvers

    NASA Technical Reports Server (NTRS)

    Padovan, Joseph

    1988-01-01

    To upgrade the efficiency and stability of the successive substitution (SS) and Newton-Raphson (NR) schemes, the concept of input-output-controlled solvers (IOCS) is introduced. By employing the formal properties of the constrained version of the SS and NR schemes, the IOCS algorithm can handle indefiniteness of the system Jacobian, can maintain iterate monotonicity, and provide for separate control of load incrementation and iterate excursions, as well as having other features. To illustrate the algorithmic properties, the results for several benchmark examples are presented. These define the associated numerical efficiency and stability of the IOCS.

  13. The Use of Sparse Direct Solver in Vector Finite Element Modeling for Calculating Two Dimensional (2-D) Magnetotelluric Responses in Transverse Electric (TE) Mode

    NASA Astrophysics Data System (ADS)

    Yihaa Roodhiyah, Lisa’; Tjong, Tiffany; Nurhasan; Sutarno, D.

    2018-04-01

    In earlier research, the linear systems arising in vector finite element modeling of two-dimensional (2-D) magnetotelluric (MT) responses in TE mode were solved with a non-sparse direct solver. That approach has weaknesses that still need to be addressed, in particular insufficient accuracy at low frequencies (10^-3 Hz to 10^-5 Hz) and high computational cost on dense meshes. In this work, a sparse direct solver is used instead of the non-sparse direct solver to overcome these weaknesses. A sparse direct solver is advantageous for the linear systems of the vector finite element method because the matrix is symmetric and sparse. The sparse direct solver has been validated against analytical solutions for a homogeneous half-space model and a vertical contact model. The validation shows that the sparse direct solver is more stable than the non-sparse direct solver for the linear problems of the vector finite element method, especially at low frequencies. As a result, accurate 2-D MT response modeling at low frequencies (10^-3 Hz to 10^-5 Hz) has been achieved with efficient array memory allocation and less computation time.

  14. Solving Upwind-Biased Discretizations. 2; Multigrid Solver Using Semicoarsening

    NASA Technical Reports Server (NTRS)

    Diskin, Boris

    1999-01-01

    This paper studies a novel multigrid approach to the solution for a second order upwind biased discretization of the convection equation in two dimensions. This approach is based on semi-coarsening and well balanced explicit correction terms added to coarse-grid operators to maintain on coarse grids the same cross-characteristic interaction as on the target (fine) grid. Colored relaxation schemes are used on all the levels allowing a very efficient parallel implementation. The results of the numerical tests can be summarized as follows: 1) The residual asymptotic convergence rate of the proposed V(0, 2) multigrid cycle is about 3 per cycle. This convergence rate far surpasses the theoretical limit (4/3) predicted for standard multigrid algorithms using full coarsening. The reported efficiency does not deteriorate with increasing the cycle depth (number of levels) and/or refining the target-grid mesh spacing. 2) The full multi-grid algorithm (FMG) with two V(0, 2) cycles on the target grid and just one V(0, 2) cycle on all the coarse grids always provides an approximate solution with the algebraic error less than the discretization error. Estimates of the total work in the FMG algorithm range between 18 and 30 minimal work units (depending on the target discretization). Thus, the overall efficiency of the FMG solver closely approaches (if it does not achieve) the goal of the textbook multigrid efficiency. 3) A novel approach to deriving a discrete solution approximating the true continuous solution with a relative accuracy given in advance is developed. An adaptive multigrid algorithm (AMA) using comparison of the solutions on two successive target grids to estimate the accuracy of the current target-grid solution is defined. A desired relative accuracy is accepted as an input parameter. The final target grid on which this accuracy can be achieved is chosen automatically in the solution process. The actual relative accuracy of the discrete solution approximation

  15. Application of Aeroelastic Solvers Based on Navier Stokes Equations

    NASA Technical Reports Server (NTRS)

    Keith, Theo G., Jr.; Srivastava, Rakesh

    2001-01-01

    The propulsion element of the NASA Advanced Subsonic Technology (AST) initiative is directed towards increasing the overall efficiency of current aircraft engines. This effort requires an increase in the efficiency of various components, such as fans, compressors, turbines etc. Improvement in engine efficiency can be accomplished through the use of lighter materials, larger diameter fans and/or higher-pressure ratio compressors. However, each of these has the potential to result in aeroelastic problems such as flutter or forced response. To address the aeroelastic problems, the Structural Dynamics Branch of NASA Glenn has been involved in the development of numerical capabilities for analyzing the aeroelastic stability characteristics and forced response of wide chord fans, multi-stage compressors and turbines. In order to design an engine to safely perform a set of desired tasks, accurate information of the stresses on the blade during the entire cycle of blade motion is required. This requirement in turn demands that accurate knowledge of steady and unsteady blade loading is available. To obtain the steady and unsteady aerodynamic forces for the complex flows around the engine components, for the flow regimes encountered by the rotor, an advanced compressible Navier-Stokes solver is required. A finite volume based Navier-Stokes solver has been developed at Mississippi State University (MSU) for solving the flow field around multistage rotors. The focus of the current research effort, under NASA Cooperative Agreement NCC3-596, was on developing an aeroelastic analysis code (entitled TURBO-AE) based on the Navier-Stokes solver developed by MSU. The TURBO-AE code has been developed for flutter analysis of turbomachine components and delivered to NASA and its industry partners. The code has been verified, validated and is being applied by NASA Glenn and by aircraft engine manufacturers to analyze the aeroelastic stability characteristics of modern fans, compressors

  16. A New Approximate Chimera Donor Cell Search Algorithm

    NASA Technical Reports Server (NTRS)

    Holst, Terry L.; Nixon, David (Technical Monitor)

    1998-01-01

    The objectives of this study were to develop a chimera-based full potential methodology which is compatible with the OVERFLOW (Euler/Navier-Stokes) chimera flow solver and to develop a fast donor cell search algorithm that is compatible with the chimera full potential approach. Results of this work included presenting a new donor cell search algorithm suitable for use with a chimera-based full potential solver. This algorithm was found to be extremely fast and simple, producing donor cells as fast as 60,000 per second.

  17. Advanced Computational Methods for Security Constrained Financial Transmission Rights

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kalsi, Karanjit; Elbert, Stephen T.; Vlachopoulou, Maria

    Financial Transmission Rights (FTRs) are financial insurance tools to help power market participants reduce price risks associated with transmission congestion. FTRs are issued based on a process of solving a constrained optimization problem with the objective to maximize the FTR social welfare under power flow security constraints. Security constraints for different FTR categories (monthly, seasonal or annual) are usually coupled and the number of constraints increases exponentially with the number of categories. Commercial software for FTR calculation can only provide limited categories of FTRs due to the inherent computational challenges mentioned above. In this paper, first an innovative mathematical reformulation of the FTR problem is presented which dramatically improves the computational efficiency of the optimization problem. After having re-formulated the problem, a novel non-linear dynamic system (NDS) approach is proposed to solve the optimization problem. The new formulation and the performance of the NDS solver are benchmarked against widely used linear programming (LP) solvers like CPLEX™ and tested on both standard IEEE test systems and large-scale systems using data from the Western Electricity Coordinating Council (WECC). The performance of the NDS is demonstrated to be comparable and in some cases is shown to outperform the widely used CPLEX algorithms. The proposed formulation and NDS based solver are also easily parallelizable, enabling further computational improvement.

  18. Nonlinear model-order reduction for compressible flow solvers using the Discrete Empirical Interpolation Method

    NASA Astrophysics Data System (ADS)

    Fosas de Pando, Miguel; Schmid, Peter J.; Sipp, Denis

    2016-11-01

    Nonlinear model reduction for large-scale flows is an essential component in many fluid applications such as flow control, optimization, parameter space exploration and statistical analysis. In this article, we generalize the POD-DEIM method, introduced by Chaturantabut & Sorensen [1], to address nonlocal nonlinearities in the equations without loss of performance or efficiency. The nonlinear terms are represented by nested DEIM-approximations using multiple expansion bases based on the Proper Orthogonal Decomposition. These extensions are imperative, for example, for applications of the POD-DEIM method to large-scale compressible flows. The efficient implementation of the presented model-reduction technique follows our earlier work [2] on linearized and adjoint analyses and takes advantage of the modular structure of our compressible flow solver. The efficacy of the nonlinear model-reduction technique is demonstrated on the flow around an airfoil and its acoustic footprint. We could obtain an accurate and robust low-dimensional model that captures the main features of the full flow.

  19. A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures

    PubMed Central

    Colli Franzone, Piero; Pavarino, Luca F.; Scacchi, Simone

    2018-01-01

    We introduce and study some scalable domain decomposition preconditioners for cardiac electro-mechanical 3D simulations on parallel HPC (High Performance Computing) architectures. The electro-mechanical model of the cardiac tissue is composed of four coupled sub-models: (1) the static finite elasticity equations for the transversely isotropic deformation of the cardiac tissue; (2) the active tension model describing the dynamics of the intracellular calcium, cross-bridge binding and myofilament tension; (3) the anisotropic Bidomain model describing the evolution of the intra- and extra-cellular potentials in the deforming cardiac tissue; and (4) the ionic membrane model describing the dynamics of ionic currents, gating variables, ionic concentrations and stretch-activated channels. This strongly coupled electro-mechanical model is discretized in time with a splitting semi-implicit technique and in space with isoparametric finite elements. The resulting scalable parallel solver is based on Multilevel Additive Schwarz preconditioners for the solution of the Bidomain system and on BDDC preconditioned Newton-Krylov solvers for the non-linear finite elasticity system. The results of several 3D parallel simulations show the scalability of both linear and non-linear solvers and their application to the study of both physiological excitation-contraction cardiac dynamics and re-entrant waves in the presence of different mechano-electrical feedbacks. PMID:29674971

  20. The Method of Space-time Conservation Element and Solution Element: Development of a New Implicit Solver

    NASA Technical Reports Server (NTRS)

    Chang, S. C.; Wang, X. Y.; Chow, C. Y.; Himansu, A.

    1995-01-01

    The method of space-time conservation element and solution element is a nontraditional numerical method designed from a physicist's perspective, i.e., its development is based more on physics than numerics. It uses only the simplest approximation techniques and yet is capable of generating nearly perfect solutions for a 2-D shock reflection problem used by Helen Yee and others. In addition to providing an overall view of the new method, we introduce a new concept in the design of implicit schemes, and use it to construct a highly accurate solver for a convection-diffusion equation. It is shown that, in the inviscid case, this new scheme becomes explicit and its amplification factors are identical to those of the Leapfrog scheme. On the other hand, in the pure diffusion case, its principal amplification factor becomes the amplification factor of the Crank-Nicolson scheme.
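
    For reference, the two classical amplification factors named in this record have the following standard forms from von Neumann analysis on a uniform grid. This is a textbook sketch: the symbols (Courant number \nu, diffusion number r, phase angle \theta) are notation assumed here, and the paper's own staggered space-time mesh may define its parameters differently.

```latex
% Leapfrog for u_t + a u_x = 0:
%   u_j^{n+1} = u_j^{n-1} - \nu \,(u_{j+1}^n - u_{j-1}^n), \qquad \nu = a\,\Delta t/\Delta x
G_{\pm}(\theta) = -\,i\,\nu \sin\theta \;\pm\; \sqrt{1 - \nu^{2}\sin^{2}\theta},
\qquad |G_{\pm}| = 1 \ \text{for}\ |\nu| \le 1 .

% Crank-Nicolson for u_t = \mu u_{xx}, with r = \mu\,\Delta t/\Delta x^{2}:
G(\theta) = \frac{1 - 2r\sin^{2}(\theta/2)}{1 + 2r\sin^{2}(\theta/2)},
\qquad |G| \le 1 \ \text{for all}\ r \ge 0 .
```

    The leapfrog factors have unit modulus (no numerical dissipation) within the stability limit, while the Crank-Nicolson factor is bounded by one for any time step.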

  1. Parallel Nonnegative Least Squares Solvers for Model Order Reduction

    DTIC Science & Technology

    2016-03-01

    NNLS problems that arise when the Energy Conserving Sampling and Weighting hyper-reduction procedure is used when constructing a reduced-order model...ScaLAPACK and performance results are presented. nonnegative least squares, model order reduction, hyper-reduction, Energy Conserving Sampling and...optimal solution. ........................................ 20 Table 6 Reduced mesh sizes produced for each solver in the ECSW hyper-reduction step

  2. Robust parallel iterative solvers for linear and least-squares problems, Final Technical Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saad, Yousef

    2014-01-16

    The primary goal of this project is to study and develop robust iterative methods for solving linear systems of equations and least squares systems. The focus of the Minnesota team is on algorithms development, robustness issues, and on tests and validation of the methods on realistic problems. 1. The project began with an investigation on how to practically update a preconditioner obtained from an ILU-type factorization, when the coefficient matrix changes. 2. We investigated strategies to improve robustness in parallel preconditioners in a specific case of a PDE with discontinuous coefficients. 3. We explored ways to adapt standard preconditioners for solving linear systems arising from the Helmholtz equation. These are often difficult linear systems to solve by iterative methods. 4. We have also worked on purely theoretical issues related to the analysis of Krylov subspace methods for linear systems. 5. We developed an effective strategy for performing ILU factorizations for the case when the matrix is highly indefinite. The strategy uses shifting in some optimal way. The method was extended to the solution of Helmholtz equations by using complex shifts, yielding very good results in many cases. 6. We addressed the difficult problem of preconditioning sparse systems of equations on GPUs. 7. A by-product of the above work is a software package consisting of an iterative solver library for GPUs based on CUDA. This was made publicly available. It was the first such library that offers complete iterative solvers for GPUs. 8. We considered another form of ILU which blends coarsening techniques from Multigrid with algebraic multilevel methods. 9. We have released a new version of our parallel solver - called pARMS [new version is version 3]. As part of this we have tested the code in complex settings - including the solution of Maxwell and Helmholtz equations and for a problem of crystal growth. 10. As an application of polynomial preconditioning we considered

  3. A fast Poisson solver for unsteady incompressible Navier-Stokes equations on the half-staggered grid

    NASA Technical Reports Server (NTRS)

    Golub, G. H.; Huang, L. C.; Simon, H.; Tang, W. -P.

    1995-01-01

    In this paper, a fast Poisson solver for unsteady, incompressible Navier-Stokes equations with finite difference methods on the non-uniform, half-staggered grid is presented. To achieve this, new algorithms for diagonalizing a semi-definite pair are developed. Our fast solver can also be extended to the three dimensional case. The motivation and related issues in using this second kind of staggered grid are also discussed. Numerical testing has indicated the effectiveness of this algorithm.

  4. Calm water resistance prediction of a bulk carrier using Reynolds averaged Navier-Stokes based solver

    NASA Astrophysics Data System (ADS)

    Rahaman, Md. Mashiur; Islam, Hafizul; Islam, Md. Tariqul; Khondoker, Md. Reaz Hasan

    2017-12-01

    Maneuverability and resistance prediction with suitable accuracy is essential for optimum ship design and propulsion power prediction. This paper aims at providing some of the maneuverability characteristics of a Japanese bulk carrier model, JBC, in calm water using two computational fluid dynamics solvers, SHIP Motion and OpenFOAM. The solvers are based on the Reynolds-averaged Navier-Stokes (RaNS) method and solve on structured grids using the Finite Volume Method (FVM). This paper presents the numerical results of the calm water test for the JBC model together with available experimental results. The calm water test results include the total drag coefficient, average sinkage, and trim data. Visualization data for pressure distribution on the hull surface and free water surface have also been included. The paper concludes that the presented solvers predict the resistance and maneuverability characteristics of the bulk carrier with reasonable accuracy utilizing minimum computational resources.

  5. Validity of the Born approximation for beyond Gaussian weak lensing observables

    DOE PAGES

    Petri, Andrea; Haiman, Zoltan; May, Morgan

    2017-06-06

    Accurate forward modeling of weak lensing (WL) observables from cosmological parameters is necessary for upcoming galaxy surveys. Because WL probes structures in the nonlinear regime, analytical forward modeling is very challenging, if not impossible. Numerical simulations of WL features rely on ray tracing through the outputs of N-body simulations, which requires knowledge of the gravitational potential and accurate solvers for light ray trajectories. A less accurate procedure, based on the Born approximation, only requires knowledge of the density field, and can be implemented more efficiently and at a lower computational cost. In this work, we use simulations to show that deviations of the Born-approximated convergence power spectrum, skewness and kurtosis from their fully ray-traced counterparts are consistent with the smallest nontrivial O(Φ^3) post-Born corrections (so-called geodesic and lens-lens terms). Our results imply a cancellation among the larger O(Φ^4) (and higher order) terms, consistent with previous analytic work. We also find that cosmological parameter bias induced by the Born-approximated power spectrum is negligible even for a LSST-like survey, once galaxy shape noise is considered. When considering higher order statistics such as the κ skewness and kurtosis, however, we find significant bias of up to 2.5σ. Using the LensTools software suite, we show that the Born approximation saves a factor of 4 in computing time with respect to the full ray tracing in reconstructing the convergence.

  6. Validity of the Born approximation for beyond Gaussian weak lensing observables

    NASA Astrophysics Data System (ADS)

    Petri, Andrea; Haiman, Zoltán; May, Morgan

    2017-06-01

    Accurate forward modeling of weak lensing (WL) observables from cosmological parameters is necessary for upcoming galaxy surveys. Because WL probes structures in the nonlinear regime, analytical forward modeling is very challenging, if not impossible. Numerical simulations of WL features rely on ray tracing through the outputs of N-body simulations, which requires knowledge of the gravitational potential and accurate solvers for light ray trajectories. A less accurate procedure, based on the Born approximation, only requires knowledge of the density field, and can be implemented more efficiently and at a lower computational cost. In this work, we use simulations to show that deviations of the Born-approximated convergence power spectrum, skewness and kurtosis from their fully ray-traced counterparts are consistent with the smallest nontrivial O(Φ^3) post-Born corrections (so-called geodesic and lens-lens terms). Our results imply a cancellation among the larger O(Φ^4) (and higher order) terms, consistent with previous analytic work. We also find that cosmological parameter bias induced by the Born-approximated power spectrum is negligible even for a LSST-like survey, once galaxy shape noise is considered. When considering higher order statistics such as the κ skewness and kurtosis, however, we find significant bias of up to 2.5σ. Using the LensTools software suite, we show that the Born approximation saves a factor of 4 in computing time with respect to the full ray tracing in reconstructing the convergence.

  7. ASIS v1.0: an adaptive solver for the simulation of atmospheric chemistry

    NASA Astrophysics Data System (ADS)

    Cariolle, Daniel; Moinat, Philippe; Teyssèdre, Hubert; Giraud, Luc; Josse, Béatrice; Lefèvre, Franck

    2017-04-01

    This article reports on the development and tests of the adaptive semi-implicit scheme (ASIS) solver for the simulation of atmospheric chemistry. To solve the ordinary differential equation systems associated with the time evolution of the species concentrations, ASIS adopts a one-step linearized implicit scheme with specific treatments of the Jacobian of the chemical fluxes. It conserves mass and has a time-stepping module to control the accuracy of the numerical solution. In idealized box-model simulations, ASIS gives results similar to the higher-order implicit schemes derived from the Rosenbrock's and Gear's methods and requires less computation and run time at the moderate precision required for atmospheric applications. When implemented in the MOCAGE chemical transport model and the Laboratoire de Météorologie Dynamique Mars general circulation model, the ASIS solver performs well and reveals weaknesses and limitations of the original semi-implicit solvers used by these two models. ASIS can be easily adapted to various chemical schemes and further developments are foreseen to increase its computational efficiency, and to include the computation of the concentrations of the species in aqueous-phase in addition to gas-phase chemistry.
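
    To make the phrase "one-step linearized implicit scheme" concrete, here is a minimal sketch of a generic linearized implicit Euler step, (I - dt*J) dc = dt*f(c), applied to a stiff two-species reversible reaction. It is not the ASIS algorithm: the specific Jacobian treatments and the adaptive time stepping described in the abstract are omitted, and the function name, rate constants, and test system are assumptions chosen for illustration. It does show why such a step can conserve mass exactly when the columns of the Jacobian sum to zero.

```python
import numpy as np

def linearized_implicit_step(f, jac, c, dt):
    """One linearized implicit Euler step: solve (I - dt*J) dc = dt*f(c)."""
    J = jac(c)
    dc = np.linalg.solve(np.eye(len(c)) - dt * J, dt * f(c))
    return c + dc

# Hypothetical stiff test system A <-> B with a fast forward rate.
K_FWD, K_BWD = 1000.0, 1.0

def rates(c):
    flux = K_FWD * c[0] - K_BWD * c[1]   # net A -> B chemical flux
    return np.array([-flux, flux])

def rates_jacobian(c):
    # Each column sums to zero, which is what preserves total mass.
    return np.array([[-K_FWD, K_BWD],
                     [K_FWD, -K_BWD]])

def integrate(c0, dt, n_steps):
    c = np.asarray(c0, dtype=float)
    for _ in range(n_steps):
        c = linearized_implicit_step(rates, rates_jacobian, c, dt)
    return c

if __name__ == "__main__":
    c_end = integrate([1.0, 0.0], dt=0.1, n_steps=50)
    print(c_end, c_end.sum())
```

    The time step used here is far larger than the fast reaction timescale (1/1000), which an explicit Euler step could not tolerate.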

  8. A User's Manual for ROTTILT Solver: Tiltrotor Fountain Flow Field Prediction

    NASA Technical Reports Server (NTRS)

    Tadghighi, Hormoz; Rajagopalan, R. Ganesh

    1999-01-01

    A CFD solver has been developed to provide the time averaged details of the fountain flow typical for tiltrotor aircraft in hover. This Navier-Stokes solver, designated as ROTTILT, assumes the 3-D fountain flowfield to be steady and incompressible. The theoretical background is described in this manual. In order to enable the rotor trim solution in the presence of tiltrotor aircraft components such as wing, nacelle, and fuselage, the solver is coupled with a set of trim routines which are highly efficient in CPU and suitable for CFD analysis. The Cartesian grid technique utilized provides the user with a unique capability for insertion or elimination of any components of the bodies considered for a given tiltrotor aircraft configuration. The flowfield associated with either a semi or full-span configuration can be computed through user options in the ROTTILT input file. Full details associated with the numerical solution implemented in ROTTILT and assumptions are presented. A description of input surface mesh topology is provided in the appendices along with a listing of all preprocessor programs. Input variable definitions and default values are provided for the V22 aircraft. Limited predicted results using the coupled ROTTILT/WOPWOP program for the V22 in hover are made and compared with measurement. To visualize the V22 aircraft and predictions, a preprocessor graphics program GNU-PLOT3D was used. This program is described and example graphic results presented.

  9. Fast immersed interface Poisson solver for 3D unbounded problems around arbitrary geometries

    NASA Astrophysics Data System (ADS)

    Gillis, T.; Winckelmans, G.; Chatelain, P.

    2018-02-01

    We present a fast and efficient Fourier-based solver for the Poisson problem around an arbitrary geometry in an unbounded 3D domain. This solver merges two rewarding approaches, the lattice Green's function method and the immersed interface method, using the Sherman-Morrison-Woodbury decomposition formula. The method is intended to be second order up to the boundary. This is verified on two potential flow benchmarks. We also further analyse the iterative process and the convergence behavior of the proposed algorithm. The method is applicable to a wide range of problems involving a Poisson equation around inner bodies, which goes well beyond the present validation on potential flows.
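
    The role of the Sherman-Morrison-Woodbury formula can be sketched on a small example. The code below shows only the rank-one special case (Sherman-Morrison) with a diagonal matrix standing in for the operator that is cheap to invert; the names and the test matrices are assumptions for illustration, and nothing here reproduces the lattice Green's function or immersed interface machinery of the paper.

```python
def solve_diag(d, b):
    """Solve diag(d) x = b; stands in for a cheap 'fast' solve."""
    return [bi / di for di, bi in zip(d, b)]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def sherman_morrison_solve(d, u, v, b):
    """Solve (diag(d) + u v^T) x = b using only solves with diag(d).

    Uses x = A^{-1} b - A^{-1} u (v^T A^{-1} b) / (1 + v^T A^{-1} u).
    """
    y = solve_diag(d, b)  # A^{-1} b
    z = solve_diag(d, u)  # A^{-1} u
    denom = 1.0 + dot(v, z)
    if denom == 0.0:
        raise ZeroDivisionError("rank-one update makes the matrix singular")
    scale = dot(v, y) / denom
    return [yi - scale * zi for yi, zi in zip(y, z)]


def apply_matrix(d, u, v, x):
    """Compute (diag(d) + u v^T) x, used to check the solution."""
    vx = dot(v, x)
    return [di * xi + ui * vx for di, ui, xi in zip(d, u, x)]
```

    The point of the identity is that a low-rank correction to an easily inverted operator costs only a few extra easy solves plus a small dense system (here a single scalar division).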

  10. An installed nacelle design code using a multiblock Euler solver. Volume 1: Theory document

    NASA Technical Reports Server (NTRS)

    Chen, H. C.

    1992-01-01

    An efficient multiblock Euler design code was developed for designing a nacelle installed on geometrically complex airplane configurations. This approach employed a design driver based on a direct iterative surface curvature method developed at LaRC. A general multiblock Euler flow solver was used for computing flow around complex geometries. The flow solver used a finite-volume formulation with explicit time-stepping to solve the Euler equations. It used a multiblock version of the multigrid method to accelerate the convergence of the calculations. The design driver successively updated the surface geometry to reduce the difference between the computed and target pressure distributions. In the flow solver, the change in surface geometry was simulated by applying surface transpiration boundary conditions to avoid repeated grid generation during design iterations. Smoothness of the designed surface was ensured by alternate application of streamwise and circumferential smoothings. The capability and efficiency of the code were demonstrated through the design of both an isolated nacelle and an installed nacelle at various flow conditions. Information on the execution of the computer program is provided in volume 2.

  11. Extension of the Time-Spectral Approach to Overset Solvers for Arbitrary Motion

    NASA Technical Reports Server (NTRS)

    Leffell, Joshua Isaac; Murman, Scott M.; Pulliam, Thomas H.

    2012-01-01

    Forced periodic flows arise in a broad range of aerodynamic applications such as rotorcraft, turbomachinery, and flapping wing configurations. Standard practice involves solving the unsteady flow equations forward in time until the initial transient exits the domain and a statistically stationary flow is achieved. It is often required to simulate through several periods to remove the initial transient making unsteady design optimization prohibitively expensive for most realistic problems. An effort to reduce the computational cost of these calculations led to the development of the Harmonic Balance method [1, 2] which capitalizes on the periodic nature of the solution. The approach exploits the fact that forced temporally periodic flow, while varying in the time domain, is invariant in the frequency domain. Expanding the temporal variation at each spatial node into a Fourier series transforms the unsteady governing equations into a steady set of equations in integer harmonics that can be tackled with the acceleration techniques afforded to steady-state flow solvers. Other similar approaches, such as the Nonlinear Frequency Domain [3,4,5], Reduced Frequency [6] and Time-Spectral [7, 8, 9] methods, were developed shortly thereafter. Additionally, adjoint-based optimization techniques can be applied [10, 11] as well as frequency-adaptive methods [12, 13, 14] to provide even more flexibility to the method. The Fourier temporal basis functions imply spectral convergence as the number of harmonic modes, and correspondingly number of time samples, N, is increased. Some elect to solve the equations in the frequency domain directly, while others choose to transform the equations back into the time domain to simplify the process of adding this capability to existing solvers, but each harnesses the underlying steady solution in the frequency domain. These temporal projection methods will herein be collectively referred to as Time-Spectral methods. 
Time-Spectral methods have
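
    The Fourier-series step described in this abstract can be illustrated with a small sketch: for a periodic signal sampled at N equally spaced instants, the time derivative is obtained by multiplying each Fourier coefficient by i times its angular frequency. The function name and the plain O(N^2) transform are assumptions chosen for readability; a flow solver would apply the same operator at every spatial node.

```python
import cmath
import math


def spectral_time_derivative(samples, period):
    """Differentiate equally spaced samples of a periodic signal.

    The samples cover one period; the derivative is evaluated at the
    same instants by differentiating the interpolating Fourier series.
    """
    n = len(samples)
    # Discrete Fourier coefficients (plain O(N^2) transform for clarity)
    coeffs = [
        sum(samples[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) / n
        for k in range(n)
    ]
    derivative = []
    for m in range(n):
        total = 0j
        for k in range(n):
            # Map the DFT index k to a signed harmonic number h
            if 2 * k < n:
                h = k
            elif 2 * k == n:
                h = 0  # drop the unmatched Nyquist mode when N is even
            else:
                h = k - n
            omega = 2.0 * math.pi * h / period
            total += 1j * omega * coeffs[k] * cmath.exp(2j * math.pi * k * m / n)
        derivative.append(total.real)
    return derivative
```

    For a signal containing only a few harmonics the result is exact to rounding error once N exceeds twice the highest harmonic, which is the spectral convergence the abstract refers to.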

  12. Optical solver for a system of ordinary differential equations based on an external feedback assisted microring resonator.

    PubMed

    Hou, Jie; Dong, Jianji; Zhang, Xinliang

    2017-06-15

    Systems of ordinary differential equations (SODEs) are crucial for describing the dynamic behaviors in various systems such as modern control systems which require observability and controllability. In this Letter, we propose and experimentally demonstrate an all-optical SODE solver based on the silicon-on-insulator platform. We use an add/drop microring resonator to construct two different ordinary differential equations (ODEs) and then introduce two external feedback waveguides to realize the coupling between these ODEs, thus forming the SODE solver. A temporal coupled mode theory is used to deduce the expression of the SODE. A system experiment is carried out for further demonstration. For the input 10 GHz NRZ-like pulses, the measured output waveforms of the SODE solver agree well with the calculated results.

  13. Object-Oriented Design for Sparse Direct Solvers

    NASA Technical Reports Server (NTRS)

    Dobrian, Florin; Kumfert, Gary; Pothen, Alex

    1999-01-01

    We discuss the object-oriented design of a software package for solving sparse, symmetric systems of equations (positive definite and indefinite) by direct methods. At the highest layers, we decouple data structure classes from algorithmic classes for flexibility. We describe the important structural and algorithmic classes in our design, and discuss the trade-offs we made for high performance. The kernels at the lower layers were optimized by hand. Our results show no performance loss from our object-oriented design, while providing flexibility, ease of use, and extensibility over solvers using procedural design.

  14. ALPS - A LINEAR PROGRAM SOLVER

    NASA Technical Reports Server (NTRS)

    Viterna, L. A.

    1994-01-01

    Linear programming is a widely-used engineering and management tool. Scheduling, resource allocation, and production planning are all well-known applications of linear programs (LP's). Most LP's are too large to be solved by hand, so over the decades many computer codes for solving LP's have been developed. ALPS, A Linear Program Solver, is a full-featured LP analysis program. ALPS can solve plain linear programs as well as more complicated mixed integer and pure integer programs. ALPS also contains an efficient solution technique for pure binary (0-1 integer) programs. One of the many weaknesses of LP solvers is the lack of interaction with the user. ALPS is a menu-driven program with no special commands or keywords to learn. In addition, ALPS contains a full-screen editor to enter and maintain the LP formulation. These formulations can be written to and read from plain ASCII files for portability. For those less experienced in LP formulation, ALPS contains a problem "parser" which checks the formulation for errors. ALPS creates fully formatted, readable reports that can be sent to a printer or output file. ALPS is written entirely in IBM's APL2/PC product, Version 1.01. The APL2 workspace containing all the ALPS code can be run on any APL2/PC system (AT or 386). On a 32-bit system, this configuration can take advantage of all extended memory. The user can also examine and modify the ALPS code. The APL2 workspace has also been "packed" to be run on any DOS system (without APL2) as a stand-alone "EXE" file, but has limited memory capacity on a 640K system. A numeric coprocessor (80X87) is optional but recommended. The standard distribution medium for ALPS is a 5.25 inch 360K MS-DOS format diskette. IBM, IBM PC and IBM APL2 are registered trademarks of International Business Machines Corporation. MS-DOS is a registered trademark of Microsoft Corporation.

  15. Parallel Solver for Diffuse Optical Tomography on Realistic Head Models With Scattering and Clear Regions.

    PubMed

    Placati, Silvio; Guermandi, Marco; Samore, Andrea; Scarselli, Eleonora Franchi; Guerrieri, Roberto

    2016-09-01

    Diffuse optical tomography is an imaging technique that evaluates how light propagates within the human head to obtain functional information about the brain. Precision in reconstructing such an optical properties map is highly affected by the accuracy of the light propagation model implemented, which needs to take into account the presence of clear and scattering tissues. We present a numerical solver based on the radiosity-diffusion model, integrating the anatomical information provided by a structural MRI. The solver is designed to run on parallel heterogeneous platforms based on multiple GPUs and CPUs. We demonstrate how the solver provides a 7 times speed-up over an isotropic-scattered parallel Monte Carlo engine based on a radiative transport equation for a domain composed of 2 million voxels, along with a significant improvement in accuracy. The speed-up greatly increases for larger domains, allowing us to compute the light distribution of a full human head (≈ 3 million voxels) in 116 s for the platform used.

  16. Using computer algebra and SMT solvers in algebraic biology

    NASA Astrophysics Data System (ADS)

    Pineda Osorio, Mateo

    2014-05-01

    Biological processes are represented as Boolean networks in discrete time. The dynamics within these networks are approached with the help of SMT solvers and computer algebra. Software such as Maple and Z3 was used in this case. The number of stationary states for each network was calculated. The network studied here corresponds to the immune system under the effects of drastic mood changes. Mood is considered as a Boolean variable that affects the entire dynamics of the immune system, changing the Boolean satisfiability and the number of stationary states of the immune network. Results obtained show Z3's great potential as an SMT solver. Some of these results were verified in Maple, even though it proved less suitable for this problem. The solving code was constructed using Z3-Python and Z3-SMT-LiB. Results obtained are important in biological systems and are expected to help in the design of immune therapies. As a future line of research, more complex Boolean network representations of the immune system as well as the whole psychological apparatus are suggested.
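
    Counting the stationary states of a Boolean network can be sketched by brute-force enumeration. The three-node network below, its update rules, and the way the `mood` input enters them are invented for illustration and are not the immune network of the study; the study itself used Z3 and Maple rather than enumeration, which only scales to small networks.

```python
from itertools import product


def step(state, mood):
    """Synchronous update of a hypothetical three-node Boolean network."""
    a, b, c = state
    return (
        b and not mood,  # a is activated by b unless mood is on
        a or c,          # b is activated by a or c
        c and not mood,  # c sustains itself unless mood is on
    )


def stationary_states(mood):
    """Return every state that the update rule maps to itself."""
    return [
        s for s in product([False, True], repeat=3)
        if step(s, mood) == s
    ]
```

    In this toy network, switching `mood` on removes all but the all-False stationary state, which mirrors the kind of change in the number of stationary states that the abstract describes.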

  17. An oscillation-free flow solver based on flux reconstruction

    NASA Astrophysics Data System (ADS)

    Aguerre, Horacio J.; Pairetti, Cesar I.; Venier, Cesar M.; Márquez Damián, Santiago; Nigro, Norberto M.

    2018-07-01

    In this paper, a segregated algorithm is proposed to suppress high-frequency oscillations in the velocity field for incompressible flows. In this context, a new velocity formula based on a reconstruction of face fluxes is defined, eliminating high-frequency errors. In analogy to the Rhie-Chow interpolation, this approach is equivalent to including a flux-based pressure gradient with a velocity diffusion in the momentum equation. In order to guarantee second-order accuracy of the numerical solver, a set of conditions is defined for the reconstruction operator. To arrive at the final formulation, an outlook over the state of the art regarding velocity reconstruction procedures is presented, comparing them through an error analysis. A new operator is then obtained by means of a flux difference minimization satisfying the required spatial accuracy. The accuracy of the new algorithm is analyzed by performing mesh convergence studies for unsteady Navier-Stokes problems with analytical solutions. The stabilization properties of the solver are then tested in a problem where spurious numerical oscillations arise for the velocity field. The results show a remarkable performance of the proposed technique, eliminating high-frequency errors without losing accuracy.

  18. Globalized Newton-Krylov-Schwarz Algorithms and Software for Parallel Implicit CFD

    NASA Technical Reports Server (NTRS)

    Gropp, W. D.; Keyes, D. E.; McInnes, L. C.; Tidriri, M. D.

    1998-01-01

    Implicit solution methods are important in applications modeled by PDEs with disparate temporal and spatial scales. Because such applications require high resolution with reasonable turnaround, "routine" parallelization is essential. The pseudo-transient matrix-free Newton-Krylov-Schwarz (Psi-NKS) algorithmic framework is presented as an answer. We show that, for the classical problem of three-dimensional transonic Euler flow about an M6 wing, Psi-NKS can simultaneously deliver: globalized, asymptotically rapid convergence through adaptive pseudo-transient continuation and Newton's method; reasonable parallelizability for an implicit method through deferred synchronization and favorable communication-to-computation scaling in the Krylov linear solver; and high per-processor performance through attention to distributed memory and cache locality, especially through the Schwarz preconditioner. Two discouraging features of Psi-NKS methods are their sensitivity to the coding of the underlying PDE discretization and the large number of parameters that must be selected to govern convergence. We therefore distill several recommendations from our experience and from our reading of the literature on various algorithmic components of Psi-NKS, and we describe a freely available, MPI-based portable parallel software implementation of the solver employed here.
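
    The "matrix-free" part of a Newton-Krylov method rests on approximating Jacobian-vector products by finite differences of the residual, so the Jacobian is never assembled. The sketch below shows only that building block; the two-equation residual, the step-size rule, and the function names are assumptions for illustration and are unrelated to the software described in the abstract.

```python
import math


def residual(u):
    """Toy nonlinear residual F(u) with two unknowns (hypothetical)."""
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]


def jacobian_vector_product(F, u, v, eps=1.0e-7):
    """Approximate J(u) v by a forward difference of F.

    J(u) v ~ (F(u + h v) - F(u)) / h, with h scaled by the sizes of
    u and v. Only residual evaluations are needed.
    """
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    h = eps * (1.0 + norm_u) / norm_v
    f0 = F(u)
    f1 = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(b - a) / h for a, b in zip(f0, f1)]
```

    A Krylov solver such as GMRES needs the Jacobian only through products of this kind, which is why the inner linear solve can run without storing a matrix.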

  19. Thinking Process of Naive Problem Solvers to Solve Mathematical Problems

    ERIC Educational Resources Information Center

    Mairing, Jackson Pasini

    2017-01-01

    Solving problems is not only a goal of mathematical learning. Students acquire ways of thinking, habits of persistence and curiosity, and confidence in unfamiliar situations by learning to solve problems. In fact, there were students who had difficulty in solving problems. The students were naive problem solvers. This research aimed to describe…

  20. Continuous-time quantum Monte Carlo impurity solvers

    NASA Astrophysics Data System (ADS)

    Gull, Emanuel; Werner, Philipp; Fuchs, Sebastian; Surer, Brigitte; Pruschke, Thomas; Troyer, Matthias

    2011-04-01

    Continuous-time quantum Monte Carlo impurity solvers are algorithms that sample the partition function of an impurity model using diagrammatic Monte Carlo techniques. The present paper describes codes that implement the interaction expansion algorithm originally developed by Rubtsov, Savkin, and Lichtenstein, as well as the hybridization expansion method developed by Werner, Millis, Troyer, et al. These impurity solvers are part of the ALPS-DMFT application package and are accompanied by an implementation of dynamical mean-field self-consistency equations for (single orbital single site) dynamical mean-field problems with arbitrary densities of states.

    Program summary
    Program title: dmft
    Catalogue identifier: AEIL_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEIL_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: ALPS LIBRARY LICENSE version 1.1
    No. of lines in distributed program, including test data, etc.: 899 806
    No. of bytes in distributed program, including test data, etc.: 32 153 916
    Distribution format: tar.gz
    Programming language: C++
    Operating system: The ALPS libraries have been tested on the following platforms and compilers: Linux with GNU Compiler Collection (g++ version 3.1 and higher) and Intel C++ Compiler (icc version 7.0 and higher); MacOS X with GNU Compiler (g++ Apple-version 3.1, 3.3 and 4.0); IBM AIX with Visual Age C++ (xlC version 6.0) and GNU (g++ version 3.1 and higher) compilers; Compaq Tru64 UNIX with Compaq C++ Compiler (cxx); SGI IRIX with MIPSpro C++ Compiler (CC); HP-UX with HP C++ Compiler (aCC); Windows with Cygwin or coLinux platforms and GNU Compiler Collection (g++ version 3.1 and higher)
    RAM: 10 MB-1 GB
    Classification: 7.3
    External routines: ALPS [1], BLAS/LAPACK, HDF5
    Nature of problem: (See [2].) Quantum impurity models describe an atom or molecule embedded in a host material with which it can exchange electrons. They are basic to nanoscience as

  1. Conducting Automated Test Assembly Using the Premium Solver Platform Version 7.0 with Microsoft Excel and the Large-Scale LP/QP Solver Engine Add-In

    ERIC Educational Resources Information Center

    Cor, Ken; Alves, Cecilia; Gierl, Mark J.

    2008-01-01

    This review describes and evaluates a software add-in created by Frontline Systems, Inc., that can be used with Microsoft Excel 2007 to solve large, complex test assembly problems. The combination of Microsoft Excel 2007 with the Frontline Systems Premium Solver Platform is significant because Microsoft Excel is the most commonly used spreadsheet…

  2. Microwave beam broadening due to turbulent plasma density fluctuations within the limit of the Born approximation and beyond

    NASA Astrophysics Data System (ADS)

    Köhn, A.; Guidi, L.; Holzhauer, E.; Maj, O.; Poli, E.; Snicker, A.; Weber, H.

    2018-07-01

    Plasma turbulence, and edge density fluctuations in particular, can under certain conditions broaden the cross-section of injected microwave beams significantly. This can be a severe problem for applications relying on well-localized deposition of the microwave power, like the control of MHD instabilities. Here we investigate this broadening mechanism as a function of fluctuation level, background density and propagation length in a fusion-relevant scenario using two numerical codes, the full-wave code IPF-FDMC and the novel wave kinetic equation solver WKBeam. The latter treats the effects of fluctuations using a statistical approach, based on an iterative solution of the scattering problem (Born approximation). The full-wave simulations are used to benchmark this approach. The Born approximation is shown to be valid over a large parameter range, including ITER-relevant scenarios.

  3. Aircraft High-Lift Aerodynamic Analysis Using a Surface-Vorticity Solver

    NASA Technical Reports Server (NTRS)

    Olson, Erik D.; Albertson, Cindy W.

    2016-01-01

    This study extends an existing semi-empirical approach to high-lift analysis by examining its effectiveness for use with a three-dimensional aerodynamic analysis method. The aircraft high-lift geometry is modeled in Vehicle Sketch Pad (OpenVSP) using a newly-developed set of techniques for building a three-dimensional model of the high-lift geometry, and for controlling flap deflections using scripted parameter linking. Analysis of the low-speed aerodynamics is performed in FlightStream, a novel surface-vorticity solver that is expected to be substantially more robust and stable compared to pressure-based potential-flow solvers and less sensitive to surface perturbations. The calculated lift curve and drag polar are modified by an empirical lift-effectiveness factor that takes into account the effects of viscosity that are not captured in the potential-flow solution. Analysis results are validated against wind-tunnel data for the Energy-Efficient Transport AR12 low-speed wind-tunnel model, a 12-foot, full-span aircraft configuration with a supercritical wing, full-span slats, and part-span double-slotted flaps.

  4. An Optimized Multicolor Point-Implicit Solver for Unstructured Grid Applications on Graphics Processing Units

    NASA Technical Reports Server (NTRS)

    Zubair, Mohammad; Nielsen, Eric; Luitjens, Justin; Hammond, Dana

    2016-01-01

    In the field of computational fluid dynamics, the Navier-Stokes equations are often solved using an unstructured-grid approach to accommodate geometric complexity. Implicit solution methodologies for such spatial discretizations generally require frequent solution of large tightly-coupled systems of block-sparse linear equations. The multicolor point-implicit solver used in the current work typically requires a significant fraction of the overall application run time. In this work, an efficient implementation of the solver for graphics processing units is proposed. Several factors present unique challenges to achieving an efficient implementation in this environment. These include the variable amount of parallelism available in different kernel calls, indirect memory access patterns, low arithmetic intensity, and the requirement to support variable block sizes. In this work, the solver is reformulated to use standard sparse and dense Basic Linear Algebra Subprograms (BLAS) functions. However, numerical experiments show that the performance of the BLAS functions available in existing CUDA libraries is suboptimal for matrices representative of those encountered in actual simulations. Instead, optimized versions of these functions are developed. Depending on block size, the new implementations show performance gains of up to 7x over the existing CUDA library functions.

  5. Applying EXCEL Solver to a watershed management goal-programming problem

    Treesearch

    J. E. de Steiguer

    2000-01-01

    This article demonstrates the application of EXCEL® spreadsheet linear programming (LP) solver to a watershed management multiple use goal programming (GP) problem. The data used to demonstrate the application are from a published study for a watershed in northern Colorado. GP has been used by natural resource managers for many years. However, the GP solution by means...

  6. Modeling of frequency-domain scalar wave equation with the average-derivative optimal scheme based on a multigrid-preconditioned iterative solver

    NASA Astrophysics Data System (ADS)

    Cao, Jian; Chen, Jing-Bo; Dai, Meng-Xue

    2018-01-01

    Efficient finite-difference frequency-domain modeling of seismic wave propagation relies on the discrete schemes and appropriate solving methods. The average-derivative optimal scheme for scalar wave modeling is advantageous in terms of the storage saving for the system of linear equations and the flexibility for arbitrary directional sampling intervals. However, using an LU-decomposition-based direct solver to solve its resulting system of linear equations is very costly in both memory and computation. To address this issue, we consider establishing a multigrid-preconditioned BI-CGSTAB iterative solver fit for the average-derivative optimal scheme. The choice of preconditioning matrix and its corresponding multigrid components is made with the help of Fourier spectral analysis and local mode analysis, respectively, which is important for the convergence. Furthermore, we find that for computations with unequal directional sampling intervals, the anisotropic smoothing in the multigrid preconditioner may affect the convergence rate of this iterative solver. Successful numerical applications of this iterative solver to homogeneous and heterogeneous models in 2D and 3D are presented, where the significant reduction of computer memory and the improvement of computational efficiency are demonstrated by comparison with the direct solver. In the numerical experiments, we also show that unequal directional sampling intervals will weaken the advantage of this multigrid-preconditioned iterative solver in computing speed or, even worse, could reduce its accuracy in some cases, which implies the need for reasonable control of the directional sampling intervals in the discretization.

  7. Progress Toward Overset-Grid Moving Body Capability for USM3D Unstructured Flow Solver

    NASA Technical Reports Server (NTRS)

    Pandya, Mohagna J.; Frink, Neal T.; Noack, Ralph W.

    2005-01-01

    A static and dynamic Chimera overset-grid capability is added to an established NASA tetrahedral unstructured parallel Navier-Stokes flow solver, USM3D. Modifications to the solver primarily consist of a few strategic calls to the Donor interpolation Receptor Transaction library (DiRTlib) to facilitate communication of solution information between various grids. The assembly of multiple overlapping grids into a single-zone composite grid is performed by the Structured, Unstructured and Generalized Grid AssembleR (SUGGAR) code. Several test cases are presented to verify the implementation, assess overset-grid solution accuracy and convergence relative to single-grid solutions, and demonstrate the prescribed relative grid motion capability.

  8. TemperSAT: A new efficient fair-sampling random k-SAT solver

    NASA Astrophysics Data System (ADS)

    Fang, Chao; Zhu, Zheng; Katzgraber, Helmut G.

    The set membership problem is of great importance to many applications and, in particular, database searches for target groups. Recently, an approach to speed up set membership searches based on the NP-hard constraint-satisfaction problem (random k-SAT) has been developed. However, the bottleneck of the approach lies in finding the solution to a large SAT formula efficiently and, in particular, a large number of independent solutions is needed to reduce the probability of false positives. Unfortunately, traditional random k-SAT solvers such as WalkSAT are biased when seeking solutions to the Boolean formulas. By porting parallel tempering Monte Carlo to the sampling of binary optimization problems, we introduce a new algorithm (TemperSAT) whose performance is comparable to current state-of-the-art SAT solvers for large k with the added benefit that theoretically it can find many independent solutions quickly. We illustrate our results by comparing to the currently fastest implementation of WalkSAT, WalkSATlm.
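
    For orientation, the baseline family named in this abstract, WalkSAT, can be sketched in a few lines. This is a simplified stochastic local search, not TemperSAT and not WalkSATlm: the greedy move here picks the variable that leaves the fewest unsatisfied clauses overall, and the parameter names (`p_noise`, `max_flips`, `seed`) are assumptions for illustration. Clauses are tuples of nonzero integers, where a negative literal denotes negation.

```python
import random


def satisfied(clause, assignment):
    """A clause holds if at least one of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def walksat(clauses, n_vars, max_flips=10000, p_noise=0.5, seed=0):
    """Simplified WalkSAT-style search; returns a model or None."""
    rng = random.Random(seed)
    assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def unsat_count_after_flip(var):
        # Tentatively flip var, count unsatisfied clauses, then undo
        assignment[var] = not assignment[var]
        count = sum(not satisfied(c, assignment) for c in clauses)
        assignment[var] = not assignment[var]
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c, assignment)]
        if not unsat:
            return assignment
        clause = rng.choice(unsat)
        if rng.random() < p_noise:
            var = abs(rng.choice(clause))  # random-walk move
        else:
            var = min((abs(lit) for lit in clause), key=unsat_count_after_flip)
        assignment[var] = not assignment[var]
    # Check once more in case the last flip produced a model
    if all(satisfied(c, assignment) for c in clauses):
        return assignment
    return None
```

    Running the search from different seeds yields different models when several exist, but, as the abstract notes, such local search gives no guarantee that the solutions are sampled without bias.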

  9. Parallel satellite orbital situational problems solver for space missions design and control

    NASA Astrophysics Data System (ADS)

    Atanassov, Atanas Marinov

    2016-11-01

    Many scientific problems in space applications require observations, measurements, or active experiments to be carried out during time intervals in which specific geometric and physical conditions are fulfilled. Solving situational problems to determine the time intervals in which the satellite instruments work optimally is therefore an important part of every stage of preparing and carrying out space missions. A universal, flexible, and robust approach to situation analysis that is easily portable to new satellite missions is significant for reducing mission preparation time and cost. Every situational problem can be based on one or more situational conditions. Simultaneously solving different kinds of situational problems, based on different numbers and types of situational conditions, each satisfied on different segments of the satellite orbit, requires irregular calculations. Three formal approaches are presented. The first concerns the description of situational problems and allows flexibility in assembling them and representing them in computer memory. The second concerns the development of a situational problem solver organized as a processor that executes specific code for each particular situational condition. The third concerns parallelization of the solver using threads and dynamic scheduling based on the "pool of threads" abstraction, which ensures good load balance. The developed situational problem solver is intended for incorporation into multi-physics, multi-satellite space mission design and simulation tools.
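
    The "pool of threads" idea with dynamic scheduling can be sketched with Python's standard `concurrent.futures` module. Everything problem-specific below is hypothetical: a problem is modeled as a name, a list of condition predicates, and a sequence of time samples, which is only a stand-in for the problem descriptions used by the actual solver.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def solve_situation(problem):
    """Return the time samples at which every condition holds."""
    name, conditions, samples = problem
    hits = [t for t in samples if all(cond(t) for cond in conditions)]
    return name, hits


def solve_all(problems, n_workers=4):
    """Solve independent problems on a pool of worker threads.

    Each idle worker takes the next pending problem, so problems of
    uneven cost are balanced dynamically across the pool.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        futures = [pool.submit(solve_situation, p) for p in problems]
        for future in as_completed(futures):
            name, hits = future.result()
            results[name] = hits
    return results
```

    In CPython, threads running pure-Python code share the interpreter lock, so this sketch shows the scheduling pattern rather than a real speed-up; a compiled implementation with native threads would not have that limitation.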

  10. An assessment of the adaptive unstructured tetrahedral grid, Euler Flow Solver Code FELISA

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Erickson, Larry L.

    1994-01-01

    A three-dimensional solution-adaptive Euler flow solver for unstructured tetrahedral meshes is assessed, and the accuracy and efficiency of the method for predicting sonic boom pressure signatures about simple generic models are demonstrated. Comparison of computational and wind tunnel data and enhancement of numerical solutions by means of grid adaptivity are discussed. The mesh generation is based on the advancing front technique. The FELISA code consists of two solvers, the Taylor-Galerkin and the Runge-Kutta-Galerkin schemes, both of which are spatially discretized by the usual Galerkin weighted residual finite-element methods but with different explicit time-marching schemes to steady state. The solution-adaptive grid procedure is based on either remeshing or mesh refinement techniques. An alternative geometry adaptive procedure is also incorporated.

  11. LSRN: A PARALLEL ITERATIVE SOLVER FOR STRONGLY OVER- OR UNDERDETERMINED SYSTEMS*

    PubMed Central

    Meng, Xiangrui; Saunders, Michael A.; Mahoney, Michael W.

    2014-01-01

    We describe a parallel iterative least squares solver named LSRN that is based on random normal projection. LSRN computes the min-length solution to $\min_{x \in \mathbb{R}^n} \|Ax - b\|_2$, where $A \in \mathbb{R}^{m \times n}$ with $m \gg n$ or $m \ll n$, and where $A$ may be rank-deficient. Tikhonov regularization may also be included. Since $A$ is involved only in matrix-matrix and matrix-vector multiplications, it can be a dense or sparse matrix or a linear operator, and LSRN automatically speeds up when $A$ is sparse or a fast linear operator. The preconditioning phase consists of a random normal projection, which is embarrassingly parallel, and a singular value decomposition of size $\lceil \gamma \min(m, n) \rceil \times \min(m, n)$, where $\gamma$ is moderately larger than 1, e.g., $\gamma = 2$. We prove that the preconditioned system is well-conditioned, with a strong concentration result on the extreme singular values, and hence that the number of iterations is fully predictable when we apply LSQR or the Chebyshev semi-iterative method. As we demonstrate, the Chebyshev method is particularly efficient for solving large problems on clusters with high communication cost. Numerical results show that on a shared-memory machine, LSRN is very competitive with LAPACK's DGELSD and a fast randomized least squares solver called Blendenpik on large dense problems, and it outperforms the least squares solver from SuiteSparseQR on sparse problems without sparsity patterns that can be exploited to reduce fill-in. Further experiments show that LSRN scales well on an Amazon Elastic Compute Cloud cluster. PMID:25419094

  12. The a(4) Scheme-A High Order Neutrally Stable CESE Solver

    NASA Technical Reports Server (NTRS)

    Chang, Sin-Chung

    2009-01-01

    The CESE development is driven by a belief that a solver should (i) enforce conservation laws in both space and time, and (ii) be built from a nondissipative (i.e., neutrally stable) core scheme so that the numerical dissipation can be controlled effectively. To provide a solid foundation for a systematic CESE development of high order schemes, in this paper we describe a new high order (4-5th order) and neutrally stable CESE solver of a 1D advection equation with a constant advection speed a. The space-time stencil of this two-level explicit scheme is formed by one point at the upper time level and two points at the lower time level. Because it is associated with four independent mesh variables (the numerical analogues of the dependent variable and its first, second, and third-order spatial derivatives) and four equations per mesh point, the new scheme is referred to as the a(4) scheme. As in the case of other similar CESE neutrally stable solvers, the a(4) scheme enforces conservation laws in space-time locally and globally, and it has the basic, forward marching, and backward marching forms. Except for a singular case, these forms are equivalent and satisfy a space-time inversion (STI) invariant property which is shared by the advection equation. Based on the concept of STI invariance, a set of algebraic relations is developed and used to prove the a(4) scheme must be neutrally stable when it is stable. Numerically, it has been established that the scheme is stable if the value of the Courant number is less than 1/3

  13. Riemann solvers and Alfven waves in black hole magnetospheres

    NASA Astrophysics Data System (ADS)

    Punsly, Brian; Balsara, Dinshaw; Kim, Jinho; Garain, Sudip

    2016-09-01

    In the magnetosphere of a rotating black hole, an inner Alfven critical surface (IACS) must be crossed by inflowing plasma. Inside the IACS, Alfven waves are inward directed toward the black hole. The majority of the proper volume of the active region of spacetime (the ergosphere) is inside of the IACS. The charge and the totally transverse momentum flux (the momentum flux transverse to both the wave normal and the unperturbed magnetic field) are both determined exclusively by the Alfven polarization. Thus, it is important for numerical simulations of black hole magnetospheres to minimize the dissipation of Alfven waves. Elements of the dissipated wave emerge in adjacent cells regardless of the IACS; there is no mechanism to prevent Alfvenic information from crossing outward. Thus, numerical dissipation can affect how simulated magnetospheres attain the substantial Goldreich-Julian charge density associated with the rotating magnetic field. In order to help minimize dissipation of Alfven waves in relativistic numerical simulations we have formulated a one-dimensional Riemann solver, called HLLI, which incorporates the Alfven discontinuity and the contact discontinuity. We have also formulated a multidimensional Riemann solver, called MuSIC, that enables low dissipation propagation of Alfven waves in multiple dimensions. The importance of higher order schemes in lowering the numerical dissipation of Alfven waves is also catalogued.

  14. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem

    PubMed Central

    Schmidhuber, Jürgen

    2013-01-01

    Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. Consider the infinite set of all computable descriptions of tasks with possibly computable solutions. Given a general problem-solving architecture, at any given time, the novel algorithmic framework PowerPlay (Schmidhuber, 2011) searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. Newly invented tasks may require to achieve a wow-effect by making previously learned skills more efficient such that they require less time and space. New skills may (partially) re-use previously learned skills. The greedy search of typical PowerPlay variants uses time-optimal program search to order candidate pairs of tasks and solver modifications by their conditional computational (time and space) complexity, given the stored experience so far. The new task and its corresponding task-solving skill are those first found and validated. This biases the search toward pairs that can be described compactly and validated quickly. The computational costs of validating new tasks need not grow with task repertoire size. Standard problem solver architectures of personal computers or neural networks tend to generalize by solving numerous tasks outside the self-invented training set; PowerPlay’s ongoing search for novelty keeps breaking the generalization abilities of its present solver. This is related to Gödel’s sequence of increasingly powerful formal theories based on adding formerly unprovable statements to the axioms without affecting previously provable theorems. The continually increasing

  15. Final Report for "Implimentation and Evaluation of Multigrid Linear Solvers into Extended Magnetohydrodynamic Codes for Petascale Computing"

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Srinath Vadlamani; Scott Kruger; Travis Austin

    Extended magnetohydrodynamic (MHD) codes are used to model the large, slow-growing instabilities that are projected to limit the performance of International Thermonuclear Experimental Reactor (ITER). The multiscale nature of the extended MHD equations requires an implicit approach. The current linear solvers needed for the implicit algorithm scale poorly because the resultant matrices are so ill-conditioned. A new solver is needed, especially one that scales to the petascale. The most successful scalable parallel processor solvers to date are multigrid solvers. Applying multigrid techniques to a set of equations whose fundamental modes are dispersive waves is a promising solution to CEMM problems. For the Phase 1, we implemented multigrid preconditioners from the HYPRE project of the Center for Applied Scientific Computing at LLNL via PETSc of the DOE SciDAC TOPS for the real matrix systems of the extended MHD code NIMROD, which is one of the primary modeling codes of the OFES-funded Center for Extended Magnetohydrodynamic Modeling (CEMM) SciDAC. We implemented the multigrid solvers on the fusion test problem that allows for real matrix systems with success, and in the process learned about the details of NIMROD data structures and the difficulties of inverting NIMROD operators. The further success of this project will allow for efficient usage of future petascale computers at the National Leadership Facilities: Oak Ridge National Laboratory, Argonne National Laboratory, and National Energy Research Scientific Computing Center. The project will be a collaborative effort between computational plasma physicists and applied mathematicians at Tech-X Corporation, applied mathematicians at Front Range Scientific Computations, Inc. (who are collaborators on the HYPRE project), and other computational plasma physicists involved with the CEMM project.

  16. Accelerated iteration schemes for transonic flow calculations using fast Poisson solvers. [aerodynamics]

    NASA Technical Reports Server (NTRS)

    Jameson, A.

    1975-01-01

    The use of a fast elliptic solver in combination with relaxation is presented as an effective way to accelerate the convergence of transonic flow calculations, particularly when a marching scheme can be used to treat the supersonic zone in the relaxation process.

  17. Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB

    NASA Technical Reports Server (NTRS)

    Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.

    2017-01-01

    Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit potential for improvement of serial code even for the so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated while only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
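    The embarrassingly parallel structure of the Jacobian computation can be sketched as follows. The study itself used MATLAB's Parallel Computing Toolbox (spmd); this Python sketch is only an illustration of the column-wise decomposition, with hypothetical names (`jacobian_column`, `parallel_jacobian`, `residual`). It uses forward differences and a thread pool, so for pure-Python residual functions it shows the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def jacobian_column(f, x, fx, j, h=1e-7):
    """Forward-difference approximation of column j of the Jacobian of f at x."""
    xp = list(x)
    xp[j] += h                                # perturb only the j-th variable
    fp = f(xp)
    return [(fp_i - fx_i) / h for fp_i, fx_i in zip(fp, fx)]

def parallel_jacobian(f, x, workers=4, h=1e-7):
    """Each column needs one independent evaluation of f, so columns are
    handed to a pool of workers and assembled afterwards."""
    fx = f(x)                                 # baseline residual, shared by all columns
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cols = list(pool.map(lambda j: jacobian_column(f, x, fx, j, h),
                             range(len(x))))
    # cols[j][i] holds dF_i/dx_j; transpose so that J[i][j] = dF_i/dx_j.
    return [list(row) for row in zip(*cols)]

def residual(x):
    """Hypothetical two-equation nonlinear system used only for illustration."""
    return [x[0] ** 2 + x[1] - 3.0, x[0] * x[1] - 1.0]

J = parallel_jacobian(residual, [1.0, 2.0])
```

    Since every column depends only on the shared baseline evaluation, no communication between workers is needed until the final assembly, which is why the abstract calls the task embarrassingly parallel.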

  18. Linear solver performance in elastoplastic problem solution on GPU cluster

    NASA Astrophysics Data System (ADS)

    Khalevitsky, Yu. V.; Konovalov, A. V.; Burmasheva, N. V.; Partin, A. S.

    2017-12-01

    Applying the finite element method to severe plastic deformation problems involves solving linear equation systems. While the solution procedure is relatively hard to parallelize and computationally intensive by itself, a long series of large scale systems needs to be solved for each problem. When dealing with fine computational meshes, such as in the simulations of three-dimensional metal matrix composite microvolume deformation, tens and hundreds of hours may be needed to complete the whole solution procedure, even using modern supercomputers. In general, one of the preconditioned Krylov subspace methods is used in a linear solver for such problems. The method convergence highly depends on the operator spectrum of a problem stiffness matrix. In order to choose the appropriate method, a series of computational experiments is used. Different methods may be preferable for different computational systems for the same problem. In this paper we present experimental data obtained by solving linear equation systems from an elastoplastic problem on a GPU cluster. The data can be used to substantiate the choice of the appropriate method for a linear solver to use in severe plastic deformation simulations.

  19. libmpdata++ 1.0: a library of parallel MPDATA solvers for systems of generalised transport equations

    NASA Astrophysics Data System (ADS)

    Jaruga, A.; Arabas, S.; Jarecka, D.; Pawlowska, H.; Smolarkiewicz, P. K.; Waruszewski, M.

    2015-04-01

    This paper accompanies the first release of libmpdata++, a C++ library implementing the multi-dimensional positive-definite advection transport algorithm (MPDATA) on regular structured grid. The library offers basic numerical solvers for systems of generalised transport equations. The solvers are forward-in-time, conservative and non-linearly stable. The libmpdata++ library covers the basic second-order-accurate formulation of MPDATA, its third-order variant, the infinite-gauge option for variable-sign fields and a flux-corrected transport extension to guarantee non-oscillatory solutions. The library is equipped with a non-symmetric variational elliptic solver for implicit evaluation of pressure gradient terms. All solvers offer parallelisation through domain decomposition using shared-memory parallelisation. The paper describes the library programming interface, and serves as a user guide. Supported options are illustrated with benchmarks discussed in the MPDATA literature. Benchmark descriptions include code snippets as well as quantitative representations of simulation results. Examples of applications include homogeneous transport in one, two and three dimensions in Cartesian and spherical domains; a shallow-water system compared with analytical solution (originally derived for a 2-D case); and a buoyant convection problem in an incompressible Boussinesq fluid with interfacial instability. All the examples are implemented out of the library tree. Regardless of the differences in the problem dimensionality, right-hand-side terms, boundary conditions and parallelisation approach, all the examples use the same unmodified library, which is a key goal of libmpdata++ design. The design, based on the principle of separation of concerns, prioritises the user and developer productivity. The libmpdata++ library is implemented in C++, making use of the Blitz++ multi-dimensional array containers, and is released as free/libre and open-source software.

  20. libmpdata++ 0.1: a library of parallel MPDATA solvers for systems of generalised transport equations

    NASA Astrophysics Data System (ADS)

    Jaruga, A.; Arabas, S.; Jarecka, D.; Pawlowska, H.; Smolarkiewicz, P. K.; Waruszewski, M.

    2014-11-01

    This paper accompanies the first release of libmpdata++, a C++ library implementing the Multidimensional Positive-Definite Advection Transport Algorithm (MPDATA). The library offers basic numerical solvers for systems of generalised transport equations. The solvers are forward-in-time, conservative and non-linearly stable. The libmpdata++ library covers the basic second-order-accurate formulation of MPDATA, its third-order variant, the infinite-gauge option for variable-sign fields and a flux-corrected transport extension to guarantee non-oscillatory solutions. The library is equipped with a non-symmetric variational elliptic solver for implicit evaluation of pressure gradient terms. All solvers offer parallelisation through domain decomposition using shared-memory parallelisation. The paper describes the library programming interface, and serves as a user guide. Supported options are illustrated with benchmarks discussed in the MPDATA literature. Benchmark descriptions include code snippets as well as quantitative representations of simulation results. Examples of applications include: homogeneous transport in one, two and three dimensions in Cartesian and spherical domains; shallow-water system compared with analytical solution (originally derived for a 2-D case); and a buoyant convection problem in an incompressible Boussinesq fluid with interfacial instability. All the examples are implemented out of the library tree. Regardless of the differences in the problem dimensionality, right-hand-side terms, boundary conditions and parallelisation approach, all the examples use the same unmodified library, which is a key goal of libmpdata++ design. The design, based on the principle of separation of concerns, prioritises the user and developer productivity. The libmpdata++ library is implemented in C++, making use of the Blitz++ multi-dimensional array containers, and is released as free/libre and open-source software.

  1. A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garrett, C. Kristopher; Hauck, Cory D.

    In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solved efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.

  2. A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework

    DOE PAGES

    Garrett, C. Kristopher; Hauck, Cory D.

    2018-04-05

    In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solved efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.

  3. Three-dimensional forward solver and its performance analysis for magnetic resonance electrical impedance tomography (MREIT) using recessed electrodes.

    PubMed

    Lee, Byung Il; Oh, Suk Hoon; Woo, Eung Je; Lee, Soo Yeol; Cho, Min Hyoung; Kwon, Ohin; Seo, Jin Keun; Lee, June-Yub; Baek, Woon Sik

    2003-07-07

    In magnetic resonance electrical impedance tomography (MREIT), we try to reconstruct a cross-sectional resistivity (or conductivity) image of a subject. When we inject a current through surface electrodes, it generates a magnetic field. Using a magnetic resonance imaging (MRI) scanner, we can obtain the induced magnetic flux density from MR phase images of the subject. We use recessed electrodes to avoid undesirable artefacts near electrodes in measuring magnetic flux densities. An MREIT image reconstruction algorithm produces cross-sectional resistivity images utilizing the measured internal magnetic flux density in addition to boundary voltage data. In order to develop such an image reconstruction algorithm, we need a three-dimensional forward solver. Given injection currents as boundary conditions, the forward solver described in this paper computes voltage and current density distributions using the finite element method (FEM). Then, it calculates the magnetic flux density within the subject using the Biot-Savart law and FEM. The performance of the forward solver is analysed and found to be enough for use in MREIT for resistivity image reconstructions and also experimental designs and validations. The forward solver may find other applications where one needs to compute voltage, current density and magnetic flux density distributions all within a volume conductor.

  4. A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers

    NASA Technical Reports Server (NTRS)

    Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)

    1997-01-01

    The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from the robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages implemented in ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages is identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on

  5. A 2-D/1-D transverse leakage approximation based on azimuthal, Fourier moments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stimpson, Shane G.; Collins, Benjamin S.; Downar, Thomas

    Here, the MPACT code being developed collaboratively by Oak Ridge National Laboratory and the University of Michigan is the primary deterministic neutron transport solver within the Virtual Environment for Reactor Applications Core Simulator (VERA-CS). In MPACT, the two-dimensional (2-D)/one-dimensional (1-D) scheme is the most commonly used method for solving neutron transport-based three-dimensional nuclear reactor core physics problems. Several axial solvers in this scheme assume isotropic transverse leakages, but work with the axial SN solver has extended these leakages to include both polar and azimuthal dependence. However, explicit angular representation can be burdensome for run-time and memory requirements. The work here alleviates this burden by assuming that the azimuthal dependence of the angular flux and transverse leakages are represented by a Fourier series expansion. At the heart of this is a new axial SN solver that takes in a Fourier expanded radial transverse leakage and generates the angular fluxes used to construct the axial transverse leakages used in the 2-D-Method of Characteristics calculations.

  6. A 2-D/1-D transverse leakage approximation based on azimuthal, Fourier moments

    DOE PAGES

    Stimpson, Shane G.; Collins, Benjamin S.; Downar, Thomas

    2017-01-12

    Here, the MPACT code being developed collaboratively by Oak Ridge National Laboratory and the University of Michigan is the primary deterministic neutron transport solver within the Virtual Environment for Reactor Applications Core Simulator (VERA-CS). In MPACT, the two-dimensional (2-D)/one-dimensional (1-D) scheme is the most commonly used method for solving neutron transport-based three-dimensional nuclear reactor core physics problems. Several axial solvers in this scheme assume isotropic transverse leakages, but work with the axial SN solver has extended these leakages to include both polar and azimuthal dependence. However, explicit angular representation can be burdensome for run-time and memory requirements. The work here alleviates this burden by assuming that the azimuthal dependence of the angular flux and transverse leakages are represented by a Fourier series expansion. At the heart of this is a new axial SN solver that takes in a Fourier expanded radial transverse leakage and generates the angular fluxes used to construct the axial transverse leakages used in the 2-D-Method of Characteristics calculations.

  7. A distributed-memory approximation algorithm for maximum weight perfect bipartite matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azad, Ariful; Buluc, Aydin; Li, Xiaoye S.

    We design and implement an efficient parallel approximation algorithm for the problem of maximum weight perfect matching in bipartite graphs, i.e. the problem of finding a set of non-adjacent edges that covers all vertices and has maximum weight. This problem differs from the maximum weight matching problem, for which scalable approximation algorithms are known. It is primarily motivated by finding good pivots in scalable sparse direct solvers before factorization where sequential implementations of maximum weight perfect matching algorithms, such as those available in MC64, are widely used due to the lack of scalable alternatives. To overcome this limitation, we propose a fully parallel distributed memory algorithm that first generates a perfect matching and then searches for weight-augmenting cycles of length four in parallel and iteratively augments the matching with a vertex disjoint set of such cycles. For most practical problems the weights of the perfect matchings generated by our algorithm are very close to the optimum. An efficient implementation of the algorithm scales up to 256 nodes (17,408 cores) on a Cray XC40 supercomputer and can solve instances that are too large to be handled by a single node using the sequential algorithm.

  8. The value of continuity: Refined isogeometric analysis and fast direct solvers

    DOE PAGES

    Garcia, Daniel; Pardo, David; Dalcin, Lisandro; ...

    2016-08-24

    Here, we propose the use of highly continuous finite element spaces interconnected with low continuity hyperplanes to maximize the performance of direct solvers. Starting from a highly continuous Isogeometric Analysis (IGA) discretization, we introduce C0-separators to reduce the interconnection between degrees of freedom in the mesh. By doing so, both the solution time and best approximation errors are simultaneously improved. We call the resulting method “refined Isogeometric Analysis (rIGA)”. To illustrate the impact of the continuity reduction, we analyze the number of Floating Point Operations (FLOPs), computational times, and memory required to solve the linear system obtained by discretizing the Laplace problem with structured meshes and uniform polynomial orders. Theoretical estimates demonstrate that an optimal continuity reduction may decrease the total computational time by a factor between p^2 and p^3, with p being the polynomial order of the discretization. Numerical results indicate that our proposed refined isogeometric analysis delivers a speed-up factor proportional to p^2. In a 2D mesh with four million elements and p=5, the linear system resulting from rIGA is solved 22 times faster than the one from highly continuous IGA. In a 3D mesh with one million elements and p=3, the linear system is solved 15 times faster for the refined than the maximum continuity isogeometric analysis.

  9. Simulation of vortex-induced vibrations of a cylinder using ANSYS CFX rigid body solver

    NASA Astrophysics Data System (ADS)

    Izhar, Abubakar; Qureshi, Arshad Hussain; Khushnood, Shahab

    2017-03-01

    This article simulates the vortex-induced oscillations of a rigid circular cylinder with elastic support using the new ANSYS CFX rigid body solver. This solver requires no solid mesh to set up an FSI (Fluid Structure Interaction) simulation. The two-way case was set up in CFX only. Specific mass of the cylinder and flow conditions were similar to previous experimental data with mass damping parameter equal to 0.04, specific mass of 1 and Reynolds number of 3800. Two dimensional simulations were set up. Both one-degree-of-freedom and two-degree-of-freedom cases were run and results were obtained for both cases with reasonable accuracy as compared with experimental results. Eight-figure XY trajectory and lock-in behavior were clearly captured. The obtained results were satisfactory.

  10. Solving lattice QCD systems of equations using mixed precision solvers on GPUs

    NASA Astrophysics Data System (ADS)

    Clark, M. A.; Babich, R.; Barros, K.; Brower, R. C.; Rebbi, C.

    2010-09-01

    Modern graphics hardware is designed for highly parallel numerical tasks and promises significant cost and performance benefits for many scientific applications. One such application is lattice quantum chromodynamics (lattice QCD), where the main computational challenge is to efficiently solve the discretized Dirac equation in the presence of an SU(3) gauge field. Using NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector product that performs at up to 40, 135 and 212 Gflops for double, single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have developed a new mixed precision approach for Krylov solvers using reliable updates which allows for full double precision accuracy while using only single or half precision arithmetic for the bulk of the computation. The resulting BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations until convergence, perform better than the usual defect-correction approach for mixed precision.
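    For context, the "usual defect-correction approach" that this abstract compares against can be sketched in a few lines of NumPy. This is not the reliable-updates scheme of the paper and does not use a Krylov inner solver; a single-precision direct solve stands in for the low-precision inner solve, and the function name `mixed_precision_solve` is hypothetical.

```python
import numpy as np

def mixed_precision_solve(A, b, outer_iters=10, tol=1e-12):
    """Defect correction (iterative refinement) in mixed precision.

    Assumption: the inner solve is a float32 direct solve used only as a
    stand-in for a low-precision Krylov solver.
    """
    A32 = A.astype(np.float32)                # low-precision copy of the operator
    x = np.zeros(b.shape, dtype=np.float64)   # accumulated solution in double
    b_norm = np.linalg.norm(b)
    for _ in range(outer_iters):
        r = b - A @ x                         # residual (defect) in double precision
        if np.linalg.norm(r) <= tol * b_norm:
            break
        d = np.linalg.solve(A32, r.astype(np.float32))  # correction in single precision
        x += d.astype(np.float64)             # update kept in double precision
    return x
```

    Each outer pass reduces the error by roughly the single-precision accuracy of the inner solve, so a few passes recover double-precision accuracy for well-conditioned systems while most arithmetic is done in the cheaper format.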

  11. Compact tunable silicon photonic differential-equation solver for general linear time-invariant systems.

    PubMed

    Wu, Jiayang; Cao, Pan; Hu, Xiaofeng; Jiang, Xinhong; Pan, Ting; Yang, Yuxing; Qiu, Ciyuan; Tremblay, Christine; Su, Yikai

    2014-10-20

    We propose and experimentally demonstrate an all-optical temporal differential-equation solver that can be used to solve ordinary differential equations (ODEs) characterizing general linear time-invariant (LTI) systems. The photonic device implemented by an add-drop microring resonator (MRR) with two tunable interferometric couplers is monolithically integrated on a silicon-on-insulator (SOI) wafer with a compact footprint of ~60 μm × 120 μm. By thermally tuning the phase shifts along the bus arms of the two interferometric couplers, the proposed device is capable of solving first-order ODEs with two variable coefficients. The operation principle is theoretically analyzed, and system testing of solving ODE with tunable coefficients is carried out for 10-Gb/s optical Gaussian-like pulses. The experimental results verify the effectiveness of the fabricated device as a tunable photonic ODE solver.

  12. Parareal in time 3D numerical solver for the LWR Benchmark neutron diffusion transient model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baudron, Anne-Marie, E-mail: anne-marie.baudron@cea.fr; CEA-DRN/DMT/SERMA, CEN-Saclay, 91191 Gif sur Yvette Cedex; Lautard, Jean-Jacques, E-mail: jean-jacques.lautard@cea.fr

    2014-12-15

    In this paper we present a time-parallel algorithm for the 3D neutrons calculation of a transient model in a nuclear reactor core. The neutrons calculation consists in numerically solving the time dependent diffusion approximation equation, which is a simplified transport equation. The numerical resolution is done with finite elements method based on a tetrahedral meshing of the computational domain, representing the reactor core, and time discretization is achieved using a θ-scheme. The transient model presents moving control rods during the time of the reaction. Therefore, cross-sections (piecewise constants) are taken into account by interpolations with respect to the velocity of the control rods. The parallelism across the time is achieved by an adequate use of the parareal in time algorithm to the handled problem. This parallel method is a predictor corrector scheme that iteratively combines the use of two kinds of numerical propagators, one coarse and one fine. Our method is made efficient by means of a coarse solver defined with large time step and fixed position control rods model, while the fine propagator is assumed to be a high order numerical approximation of the full model. The parallel implementation of our method provides a good scalability of the algorithm. Numerical results show the efficiency of the parareal method on large light water reactor transient model corresponding to the Langenbuch–Maurer–Werner benchmark.
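    The predictor-corrector structure of parareal described above can be sketched on a scalar ODE. This is a toy illustration, not the reactor model: both propagators are explicit Euler (one step per slice for the coarse one, many steps for the fine one), and all names (`euler`, `parareal`, `fine_steps`) are hypothetical.

```python
import math

def euler(f, y, t0, t1, steps):
    """Explicit Euler from t0 to t1 with the given number of steps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y

def parareal(f, y0, t0, t1, slices, iterations, fine_steps=100):
    """Parareal update: U[n+1] <- G(U_new[n]) + F(U_old[n]) - G(U_old[n])."""
    ts = [t0 + (t1 - t0) * n / slices for n in range(slices + 1)]

    def coarse(y, n):                         # cheap propagator G: one Euler step
        return euler(f, y, ts[n], ts[n + 1], 1)

    def fine(y, n):                           # accurate propagator F: many Euler steps
        return euler(f, y, ts[n], ts[n + 1], fine_steps)

    # Predictor: sequential coarse sweep gives the initial guess.
    U = [y0]
    for n in range(slices):
        U.append(coarse(U[n], n))

    for _ in range(iterations):
        # These propagations use only the old iterate, so they are
        # independent across slices and could run in parallel.
        F_old = [fine(U[n], n) for n in range(slices)]
        G_old = [coarse(U[n], n) for n in range(slices)]
        # Corrector: cheap sequential sweep with the coarse propagator.
        U_new = [y0]
        for n in range(slices):
            U_new.append(coarse(U_new[n], n) + F_old[n] - G_old[n])
        U = U_new
    return U

# Usage: y' = -y on [0, 2] with 10 time slices.
U = parareal(lambda t, y: -y, 1.0, 0.0, 2.0, slices=10, iterations=10)
```

    After as many iterations as there are slices, the result coincides (up to rounding) with the sequential fine solution; the method pays off when far fewer iterations are enough.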

  13. An Unsplit Monte-Carlo solver for the resolution of the linear Boltzmann equation coupled to (stiff) Bateman equations

    NASA Astrophysics Data System (ADS)

    Bernede, Adrien; Poëtte, Gaël

    2018-02-01

    In this paper, we are interested in the resolution of the time-dependent problem of particle transport in a medium whose composition evolves with time due to interactions. As a constraint, we want to use a Monte-Carlo (MC) scheme for the transport phase. A common resolution strategy consists in a splitting between the MC/transport phase and the time discretization scheme/medium evolution phase. After going over and illustrating the main drawbacks of split solvers in a simplified configuration (monokinetic, scalar Bateman problem), we build a new Unsplit MC (UMC) solver improving the accuracy of the solutions, avoiding numerical instabilities, and less sensitive to time discretization. The new solver is essentially based on a Monte Carlo scheme with time dependent cross sections implying the on-the-fly resolution of a reduced model for each MC particle describing the time evolution of the matter along their flight path.

  14. Proactive monitoring of an onshore wind farm through lidar measurements, SCADA data and a data-driven RANS solver

    NASA Astrophysics Data System (ADS)

    Iungo, Giacomo Valerio; Camarri, Simone; Ciri, Umberto; El-Asha, Said; Leonardi, Stefano; Rotea, Mario A.; Santhanagopalan, Vignesh; Viola, Francesco; Zhan, Lu

    2016-11-01

    Site conditions, such as topography and local climate, as well as wind farm layout strongly affect performance of a wind power plant. Therefore, predictions of wake interactions and their effects on power production still remain a great challenge in wind energy. For this study, an onshore wind turbine array was monitored through lidar measurements, SCADA and met-tower data. Power losses due to wake interactions were estimated to be approximately 4% and 2% of the total power production under stable and convective conditions, respectively. This dataset was then leveraged for the calibration of a data driven RANS (DDRANS) solver, which is a compelling tool for prediction of wind turbine wakes and power production. DDRANS is characterized by a computational cost as low as that for engineering wake models, and adequate accuracy achieved through data-driven tuning of the turbulence closure model. DDRANS is based on a parabolic formulation, axisymmetry and boundary layer approximations, which allow achieving low computational costs. The turbulence closure model consists in a mixing length model, which is optimally calibrated with the experimental dataset. Assessment of DDRANS is then performed through lidar and SCADA data for different atmospheric conditions. This material is based upon work supported by the National Science Foundation under the I/UCRC WindSTAR, NSF Award IIP 1362033.

  15. Performance issues for iterative solvers in device simulation

    NASA Technical Reports Server (NTRS)

    Fan, Qing; Forsyth, P. A.; Mcmacken, J. R. F.; Tang, Wei-Pai

    1994-01-01

    Due to memory limitations, iterative methods have become the method of choice for large scale semiconductor device simulation. However, it is well known that these methods still suffer from reliability problems. The linear systems which appear in numerical simulation of semiconductor devices are notoriously ill-conditioned. In order to produce robust algorithms for practical problems, careful attention must be given to many implementation issues. This paper concentrates on strategies for developing robust preconditioners. In addition, effective data structures and convergence check issues are also discussed. These algorithms are compared with a standard direct sparse matrix solver on a variety of problems.

  16. Pulsed plane wave analytic solutions for generic shapes and the validation of Maxwell's equations solvers

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Vastano, John A.; Lomax, Harvard

    1992-01-01

    Generic shapes are subjected to pulsed plane waves of arbitrary shape. The resulting scattered electromagnetic fields are determined analytically. These fields are then computed efficiently at field locations for which numerically determined EM fields are required. Of particular interest are the pulsed waveform shapes typically utilized by radar systems. The results can be used to validate the accuracy of finite difference time domain Maxwell's equations solvers. A two-dimensional solver which is second- and fourth-order accurate in space and fourth-order accurate in time is examined. Dielectric media properties are modeled by a ramping technique which simplifies the associated gridding of body shapes. The attributes of the ramping technique are evaluated by comparison with the analytic solutions.

  17. Knowledge-based design of generate-and-patch problem solvers that solve global resource assignment problems

    NASA Technical Reports Server (NTRS)

    Voigt, Kerstin

    1992-01-01

    We present MENDER, a knowledge-based system that implements software design techniques that are specialized to automatically compile generate-and-patch problem solvers that satisfy global resource assignment problems. We provide empirical evidence of the superior performance of generate-and-patch over generate-and-test, even with constrained generation, for a global constraint in the domain of '2D-floorplanning'. For a second constraint in '2D-floorplanning' we show that even when it is possible to incorporate the constraint into a constrained generator, a generate-and-patch problem solver may satisfy the constraint more rapidly. We also briefly summarize how an extended version of our system applies to a constraint in the domain of 'multiprocessor scheduling'.

  18. Adaptive Discontinuous Evolution Galerkin Method for Dry Atmospheric Flow

    DTIC Science & Technology

    2013-04-02

    standard one-dimensional approximate Riemann solver used for the flux integration demonstrate better stability, accuracy as well as reliability of the...discontinuous evolution Galerkin method for dry atmospheric convection. Comparisons with the standard one-dimensional approximate Riemann solver used...instead of a standard one-dimensional approximate Riemann solver, the flux integration within the discontinuous Galerkin method is now realized by

  19. Use of direct and iterative solvers for estimation of SNP effects in genome-wide selection

    PubMed Central

    2010-01-01

    The aim of this study was to compare iterative and direct solvers for estimation of marker effects in genomic selection. One iterative and two direct methods were used: Gauss-Seidel with Residual Update, Cholesky Decomposition and Gentleman-Givens rotations. For resembling different scenarios with respect to number of markers and of genotyped animals, a simulated data set divided into 25 subsets was used. Number of markers ranged from 1,200 to 5,925 and number of animals ranged from 1,200 to 5,865. Methods were also applied to real data comprising 3081 individuals genotyped for 45181 SNPs. Results from simulated data showed that the iterative solver was substantially faster than direct methods for larger numbers of markers. Use of a direct solver may allow for computing (co)variances of SNP effects. When applied to real data, performance of the iterative method varied substantially, depending on the level of ill-conditioning of the coefficient matrix. From results with real data, Gentleman-Givens rotations would be the method of choice in this particular application as it provided an exact solution within a fairly reasonable time frame (less than two hours). It would indeed be the preferred method whenever computer resources allow its use. PMID:21637627
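
    The iterative method named in this abstract, Gauss-Seidel with Residual Update, can be sketched in a few lines. The abstract does not give the statistical model, so the sketch assumes a ridge-type marker model y = Xg + e with a single shrinkage parameter `lam`; that parameter, the function name and the toy data layout are hypothetical.

```python
def gsru(X, y, lam, n_sweeps=1000):
    """Gauss-Seidel with residual update for (X'X + lam*I) g = X'y.

    X is a list of rows (animals) of marker codes, y the phenotypes.
    The normal equations are never formed; only the residual e = y - X g
    is kept and corrected after every single-marker update.
    """
    n, m = len(X), len(X[0])
    g = [0.0] * m
    e = list(y)                                   # residual for g = 0
    xtx = [sum(X[i][j] ** 2 for i in range(n)) for j in range(m)]
    for _ in range(n_sweeps):
        for j in range(m):
            # Right-hand side for marker j with its own effect added back in.
            rhs = sum(X[i][j] * e[i] for i in range(n)) + xtx[j] * g[j]
            new = rhs / (xtx[j] + lam)
            delta = new - g[j]
            for i in range(n):                    # residual update
                e[i] -= X[i][j] * delta
            g[j] = new
    return g, e
```

    Each marker update costs one pass over the animals, which is one reason an iterative solver of this kind can scale better than a factorization as the number of markers grows. A fixed sweep count is used here for brevity; a convergence check on the change in g would be the usual stopping rule.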

  20. Toward an optimal solver for time-spectral fluid-dynamic and aeroelastic solutions on unstructured meshes

    NASA Astrophysics Data System (ADS)

    Mundis, Nathan L.; Mavriplis, Dimitri J.

    2017-09-01

    The time-spectral method applied to the Euler and coupled aeroelastic equations theoretically offers significant computational savings for purely periodic problems when compared to standard time-implicit methods. However, attaining superior efficiency with time-spectral methods over traditional time-implicit methods hinges on the ability rapidly to solve the large non-linear system resulting from time-spectral discretizations which become larger and stiffer as more time instances are employed or the period of the flow becomes especially short (i.e. the maximum resolvable wave-number increases). In order to increase the efficiency of these solvers, and to improve robustness, particularly for large numbers of time instances, the Generalized Minimal Residual Method (GMRES) is used to solve the implicit linear system over all coupled time instances. The use of GMRES as the linear solver makes time-spectral methods more robust, allows them to be applied to a far greater subset of time-accurate problems, including those with a broad range of harmonic content, and vastly improves the efficiency of time-spectral methods. In previous work, a wave-number independent preconditioner that mitigates the increased stiffness of the time-spectral method when applied to problems with large resolvable wave numbers has been developed. This preconditioner, however, directly inverts a large matrix whose size increases in proportion to the number of time instances. As a result, the computational time of this method scales as the cube of the number of time instances. In the present work, this preconditioner has been reworked to take advantage of an approximate-factorization approach that effectively decouples the spatial and temporal systems. Once decoupled, the time-spectral matrix can be inverted in frequency space, where it has entries only on the main diagonal and therefore can be inverted quite efficiently. This new GMRES/preconditioner combination is shown to be over an order of

  1. AFMPB: An adaptive fast multipole Poisson-Boltzmann solver for calculating electrostatics in biomolecular systems

    NASA Astrophysics Data System (ADS)

    Lu, Benzhuo; Cheng, Xiaolin; Huang, Jingfang; McCammon, J. Andrew

    2010-06-01

    A Fortran program package is introduced for rapid evaluation of the electrostatic potentials and forces in biomolecular systems modeled by the linearized Poisson-Boltzmann equation. The numerical solver utilizes a well-conditioned boundary integral equation (BIE) formulation, a node-patch discretization scheme, a Krylov subspace iterative solver package with reverse communication protocols, and an adaptive new version of fast multipole method in which the exponential expansions are used to diagonalize the multipole-to-local translations. The program and its full description, as well as several closely related libraries and utility tools are available at http://lsec.cc.ac.cn/~lubz/afmpb.html and a mirror site at http://mccammon.ucsd.edu/. This paper is a brief summary of the program: the algorithms, the implementation and the usage. Program summary. Program title: AFMPB: Adaptive fast multipole Poisson-Boltzmann solver. Catalogue identifier: AEGB_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGB_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GPL 2.0. No. of lines in distributed program, including test data, etc.: 453 649. No. of bytes in distributed program, including test data, etc.: 8 764 754. Distribution format: tar.gz. Programming language: Fortran. Computer: Any. Operating system: Any. RAM: Depends on the size of the discretized biomolecular system. Classification: 3. External routines: Pre- and post-processing tools are required for generating the boundary elements and for visualization. Users can use MSMS (http://www.scripps.edu/~sanner/html/msms_home.html) for pre-processing, and VMD (http://www.ks.uiuc.edu/Research/vmd/) for visualization. Sub-programs included: An iterative Krylov subspace solvers package from SPARSKIT by Yousef Saad (http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html), and the fast multipole methods subroutines from FMMSuite ( http

  2. Analysis, tuning and comparison of two general sparse solvers for distributed memory computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Amestoy, P.R.; Duff, I.S.; L'Excellent, J.-Y.

    2000-06-30

    We describe the work performed in the context of a Franco-Berkeley funded project between NERSC-LBNL located in Berkeley (USA) and CERFACS-ENSEEIHT located in Toulouse (France). We discuss both the tuning and performance analysis of two distributed memory sparse solvers (superlu from Berkeley and mumps from Toulouse) on the 512 processor Cray T3E from NERSC (Lawrence Berkeley National Laboratory). This project gave us the opportunity to improve the algorithms and add new features to the codes. We then quite extensively analyze and compare the two approaches on a set of large problems from real applications. We further explain the main differences in the behavior of the approaches on artificial regular grid problems. As a conclusion to this activity report, we mention a set of parallel sparse solvers on which this type of study should be extended.

  3. Implementation of a 3D version of ponderomotive guiding center solver in particle-in-cell code OSIRIS

    NASA Astrophysics Data System (ADS)

    Helm, Anton; Vieira, Jorge; Silva, Luis; Fonseca, Ricardo

    2016-10-01

    Laser-driven accelerators have gained increased attention over the past decades. Typical modeling techniques for laser wakefield acceleration (LWFA) are based on particle-in-cell (PIC) simulations. PIC simulations, however, are very computationally expensive due to the disparity of the relevant scales ranging from the laser wavelength, in the micrometer range, to the acceleration length, currently beyond the ten centimeter range. To bridge the gap between these disparate scales, the ponderomotive guiding center (PGC) algorithm is a promising approach. By describing the evolution of the laser pulse envelope separately, only the scales larger than the plasma wavelength are required to be resolved in the PGC algorithm, leading to speedups of several orders of magnitude. Previous work was limited to two dimensions. Here we present the implementation of the 3D version of a PGC solver into the massively parallel, fully relativistic PIC code OSIRIS. We extended the solver to include periodic boundary conditions and parallelization in all spatial dimensions. We present benchmarks for distributed and shared memory parallelization. We also discuss the stability of the PGC solver.

  4. Advancing parabolic operators in thermodynamic MHD models: Explicit super time-stepping versus implicit schemes with Krylov solvers

    NASA Astrophysics Data System (ADS)

    Caplan, R. M.; Mikić, Z.; Linker, J. A.; Lionello, R.

    2017-05-01

    We explore the performance and advantages/disadvantages of using unconditionally stable explicit super time-stepping (STS) algorithms versus implicit schemes with Krylov solvers for integrating parabolic operators in thermodynamic MHD models of the solar corona. Specifically, we compare the second-order Runge-Kutta Legendre (RKL2) STS method with the implicit backward Euler scheme computed using the preconditioned conjugate gradient (PCG) solver with both a point-Jacobi and a non-overlapping domain decomposition ILU0 preconditioner. The algorithms are used to integrate anisotropic Spitzer thermal conduction and artificial kinematic viscosity at time-steps much larger than classic explicit stability criteria allow. A key component of the comparison is the use of an established MHD model (MAS) to compute a real-world simulation on a large HPC cluster. Special attention is placed on the parallel scaling of the algorithms. It is shown that, for a specific problem and model, the RKL2 method is comparable to or surpasses the implicit method with PCG solvers in performance and scaling, but suffers from some accuracy limitations. These limitations, and the applicability of RKL methods, are briefly discussed.
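
    The implicit side of this comparison, backward Euler solved by PCG with a point-Jacobi preconditioner, can be sketched on a much simpler operator. The example below is an illustration only: it assumes isotropic 1D diffusion with zero Dirichlet ends rather than the anisotropic conduction of the MHD model, and `alpha = kappa * dt / dx**2` as well as all function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def backward_euler_operator(alpha):
    """Return v -> (I - alpha * L) v for the 1D Laplacian stencil L.

    alpha = kappa * dt / dx**2; values far above the explicit limit of 0.5
    are allowed because the scheme is implicit.
    """
    def apply(v):
        n = len(v)
        out = []
        for i in range(n):
            left = v[i - 1] if i > 0 else 0.0
            right = v[i + 1] if i < n - 1 else 0.0
            out.append((1.0 + 2.0 * alpha) * v[i] - alpha * (left + right))
        return out
    return apply


def pcg_jacobi(apply_a, diag, b, tol=1e-10, max_iter=500):
    """Conjugate gradients preconditioned by the inverse matrix diagonal."""
    x = [0.0] * len(b)
    r = list(b)                                   # residual for x = 0
    z = [ri / di for ri, di in zip(r, diag)]      # point-Jacobi application
    p = list(z)
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, it
        ap = apply_a(p)
        step = rz / dot(p, ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

    The operator is applied matrix-free, and the only global operations are the dot products, which is where parallel scaling of Krylov solvers is typically limited.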

  5. Extension of the ADjoint Approach to a Laminar Navier-Stokes Solver

    NASA Astrophysics Data System (ADS)

    Paige, Cody

    The use of adjoint methods is common in computational fluid dynamics to reduce the cost of the sensitivity analysis in an optimization cycle. The forward mode ADjoint is a combination of an adjoint sensitivity analysis method with a forward mode automatic differentiation (AD) and is a modification of the reverse mode ADjoint method proposed by Mader et al.[1]. A colouring acceleration technique is presented to reduce the computational cost increase associated with forward mode AD. The forward mode AD facilitates the implementation of the laminar Navier-Stokes (NS) equations. The forward mode ADjoint method is applied to a three-dimensional computational fluid dynamics solver. The resulting Euler and viscous ADjoint sensitivities are compared to the reverse mode Euler ADjoint derivatives and a complex-step method to demonstrate the reduced computational cost and accuracy. Both comparisons demonstrate the benefits of the colouring method and the practicality of using a forward mode AD. [1] Mader, C.A., Martins, J.R.R.A., Alonso, J.J., and van der Weide, E. (2008) ADjoint: An approach for the rapid development of discrete adjoint solvers. AIAA Journal, 46(4):863-873. doi:10.2514/1.29123.
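
    The two derivative techniques compared in this abstract, forward-mode automatic differentiation and the complex-step method, can be shown side by side on a toy function. The sketch below is not the ADjoint implementation: the `Dual` class, the polynomial `residual` and the step size `h` are illustrative assumptions.

```python
class Dual:
    """A value together with its derivative along one seeded input direction."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule carries the derivative through the multiplication.
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def residual(x):
    """Toy stand-in for a solver residual; accepts float, complex or Dual."""
    return x * x * x + 2.0 * x


def forward_mode_derivative(f, x):
    """Seed dx/dx = 1 and read the propagated derivative."""
    return f(Dual(x, 1.0)).dot


def complex_step_derivative(f, x, h=1e-30):
    """Im f(x + ih) / h; no subtraction, so h can be made extremely small."""
    return f(complex(x, h)).imag / h


if __name__ == "__main__":
    print(forward_mode_derivative(residual, 2.0))
    print(complex_step_derivative(residual, 2.0))
```

    One forward-mode pass yields the derivative with respect to a single seeded input, which is why a technique such as colouring, which groups inputs that can share a pass, matters for cost.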

  6. A high-order relativistic two-fluid electrodynamic scheme with consistent reconstruction of electromagnetic fields and a multidimensional Riemann solver for electromagnetism

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Balsara, Dinshaw S., E-mail: dbalsara@nd.edu; Amano, Takanobu, E-mail: amano@eps.s.u-tokyo.ac.jp; Garain, Sudip, E-mail: sgarain@nd.edu

    In various astrophysics settings it is common to have a two-fluid relativistic plasma that interacts with the electromagnetic field. While it is common to ignore the displacement current in the ideal, classical magnetohydrodynamic limit, when the flows become relativistic this approximation is less than absolutely well-justified. In such a situation, it is more natural to consider a positively charged fluid made up of positrons or protons interacting with a negatively charged fluid made up of electrons. The two fluids interact collectively with the full set of Maxwell's equations. As a result, a solution strategy for that coupled system of equations is sought and found here. Our strategy extends to higher orders, providing increasing accuracy. The primary variables in the Maxwell solver are taken to be the facially-collocated components of the electric and magnetic fields. Consistent with such a collocation, three important innovations are reported here. The first two pertain to the Maxwell solver. In our first innovation, the magnetic field within each zone is reconstructed in a divergence-free fashion while the electric field within each zone is reconstructed in a form that is consistent with Gauss' law. In our second innovation, a multidimensionally upwinded strategy is presented which ensures that the magnetic field can be updated via a discrete interpretation of Faraday's law and the electric field can be updated via a discrete interpretation of the generalized Ampere's law. This multidimensional upwinding is achieved via a multidimensional Riemann solver. The multidimensional Riemann solver automatically provides edge-centered electric field components for the Stokes law-based update of the magnetic field. It also provides edge-centered magnetic field components for the Stokes law-based update of the electric field. The update strategy ensures that the electric field is always consistent with Gauss' law and the magnetic field is always divergence

  7. A high-order relativistic two-fluid electrodynamic scheme with consistent reconstruction of electromagnetic fields and a multidimensional Riemann solver for electromagnetism

    NASA Astrophysics Data System (ADS)

    Balsara, Dinshaw S.; Amano, Takanobu; Garain, Sudip; Kim, Jinho

    2016-08-01

    In various astrophysics settings it is common to have a two-fluid relativistic plasma that interacts with the electromagnetic field. While it is common to ignore the displacement current in the ideal, classical magnetohydrodynamic limit, when the flows become relativistic this approximation is less than absolutely well-justified. In such a situation, it is more natural to consider a positively charged fluid made up of positrons or protons interacting with a negatively charged fluid made up of electrons. The two fluids interact collectively with the full set of Maxwell's equations. As a result, a solution strategy for that coupled system of equations is sought and found here. Our strategy extends to higher orders, providing increasing accuracy. The primary variables in the Maxwell solver are taken to be the facially-collocated components of the electric and magnetic fields. Consistent with such a collocation, three important innovations are reported here. The first two pertain to the Maxwell solver. In our first innovation, the magnetic field within each zone is reconstructed in a divergence-free fashion while the electric field within each zone is reconstructed in a form that is consistent with Gauss' law. In our second innovation, a multidimensionally upwinded strategy is presented which ensures that the magnetic field can be updated via a discrete interpretation of Faraday's law and the electric field can be updated via a discrete interpretation of the generalized Ampere's law. This multidimensional upwinding is achieved via a multidimensional Riemann solver. The multidimensional Riemann solver automatically provides edge-centered electric field components for the Stokes law-based update of the magnetic field. It also provides edge-centered magnetic field components for the Stokes law-based update of the electric field. The update strategy ensures that the electric field is always consistent with Gauss' law and the magnetic field is always divergence-free. 

  8. Determining the Optimal Values of Exponential Smoothing Constants--Does Solver Really Work?

    ERIC Educational Resources Information Center

    Ravinder, Handanhal V.

    2013-01-01

    A key issue in exponential smoothing is the choice of the values of the smoothing constants used. One approach that is becoming increasingly popular in introductory management science and operations management textbooks is the use of Solver, an Excel-based non-linear optimizer, to identify values of the smoothing constants that minimize a measure…
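
    The optimization described here, choosing a smoothing constant that minimizes a forecast error measure, can be sketched without a spreadsheet. The example below substitutes a plain grid search for Excel's Solver and assumes simple (single) exponential smoothing, mean squared error as the measure, and the first observation as the initial forecast; these choices and the function names are illustrative, not taken from the article.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing."""
    forecasts = [series[0]]          # assumed initialisation: first observation
    for y in series[:-1]:
        forecasts.append(alpha * y + (1.0 - alpha) * forecasts[-1])
    return forecasts


def mse(series, alpha):
    """Mean squared one-step-ahead error, skipping the initial forecast."""
    f = ses_forecasts(series, alpha)
    errors = [(y - fc) ** 2 for y, fc in zip(series[1:], f[1:])]
    return sum(errors) / len(errors)


def best_alpha(series, grid_size=100):
    """Exhaustive search over alpha in [0, 1]; cannot stall in a local minimum."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return min(grid, key=lambda a: mse(series, a))
```

    For a steadily trending series the search lands on the boundary value alpha = 1, which is a useful cross-check on whatever a gradient-based optimizer reports for the same data.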

  9. Multi-GPU three dimensional Stokes solver for simulating glacier flow

    NASA Astrophysics Data System (ADS)

    Licul, Aleksandar; Herman, Frédéric; Podladchikov, Yuri; Räss, Ludovic; Omlin, Samuel

    2016-04-01

    Here we present a three-dimensional Stokes solver that we have recently developed for GPUs and apply it to glacier flow. We numerically solve the Stokes momentum balance equations together with the incompressibility equation, while also taking into account strong nonlinearities for ice rheology. We have developed a fully three-dimensional numerical MATLAB application based on an iterative finite difference scheme with preconditioning of residuals. Differential equations are discretized on a regular staggered grid. We have ported it to C-CUDA to run it on GPUs in parallel, using MPI. We demonstrate the accuracy and efficiency of our developed model by a manufactured analytical solution test for three-dimensional Stokes ice sheet models (Leng et al., 2013) and by comparison with other well-established ice sheet models on diagnostic ISMIP-HOM benchmark experiments (Pattyn et al., 2008). The results show that our developed model is capable of accurately and efficiently solving the Stokes system of equations in a variety of different test scenarios, while preserving good parallel efficiency on up to 80 GPUs. For example, in 3D test scenarios with 250000 grid points our solver converges in around 3 minutes for single precision computations and around 10 minutes for double precision computations. We have also optimized the developed code to efficiently run on our newly acquired state-of-the-art GPU cluster octopus. This allows us to solve our problem on more than 20 million grid points, by just increasing the number of GPUs used, while keeping the computation time the same. In future work we will apply our solver to real world applications and implement the free surface evolution capabilities. REFERENCES Leng, W., Ju, L., Gunzburger, M. & Price, S., 2013. Manufactured solutions and the verification of three-dimensional Stokes ice-sheet models. Cryosphere 7, 19-29. Pattyn, F., Perichon, L., Aschwanden, A., Breuer, B., de Smedt, B., Gagliardini, O., Gudmundsson, G.H., Hindmarsh, R

  10. Improved Solver Settings for 3D Exploding Wire Simulations in ALEGRA

    DTIC Science & Technology

    2016-08-01

    expanding plasma and shock wave resulting from the wire burst can extend to tens of centimeters. The elliptic nature of the magnetic diffusion...such simulations were prohibitively slow due in part to unoptimized (matrix) solver settings. In this report, we address that by varying 6 parameters...distribution is unlimited. simulation code developed by SNL for modeling high-deformation solid dynamics, shock-hydrodynamics, magnetohydrodynamics

  11. Implementation of a fully-balanced periodic tridiagonal solver on a parallel distributed memory architecture

    NASA Technical Reports Server (NTRS)

    Eidson, T. M.; Erlebacher, G.

    1994-01-01

    While parallel computers offer significant computational performance, it is generally necessary to evaluate several programming strategies. Two programming strategies for a fairly common problem - a periodic tridiagonal solver - are developed and evaluated. Simple model calculations as well as timing results are presented to evaluate the various strategies. The particular tridiagonal solver evaluated is used in many computational fluid dynamic simulation codes. The feature that makes this algorithm unique is that these simulation codes usually require simultaneous solutions for multiple right-hand-sides (RHS) of the system of equations. Each RHS solution is independent and thus can be computed in parallel. Thus a Gaussian elimination type algorithm can be used in a parallel computation and the more complicated approaches such as cyclic reduction are not required. The two strategies are a transpose strategy and a distributed solver strategy. For the transpose strategy, the data is moved so that a subset of all the RHS problems is solved on each of the several processors. This usually requires significant data movement between processor memories across a network. The second strategy attempts to have the algorithm allow the data across processor boundaries in a chained manner. This usually requires significantly less data movement. An approach to accomplish this second strategy in a near-perfect load-balanced manner is developed. In addition, an algorithm will be shown to directly transform a sequential Gaussian elimination type algorithm into the parallel chained, load-balanced algorithm.
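
    The sequential building block behind both strategies, Gaussian elimination for a periodic tridiagonal system applied independently to each right-hand side, can be sketched as follows. This is a serial illustration, not the paper's parallel algorithm: it uses the Thomas algorithm with a Sherman-Morrison correction for the periodic corner entries, and the array layout and function names are assumptions.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system; a is the sub-, b the main, c the super-diagonal.

    a[0] and c[-1] are ignored here (they are the periodic corner entries).
    """
    n = len(b)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                       # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def periodic_thomas(a, b, c, d):
    """Periodic case: a[0] couples x[0] to x[-1], c[-1] couples x[-1] to x[0]."""
    n = len(b)
    gamma = -b[0]                               # any nonzero value works
    bb = list(b)
    bb[0] = b[0] - gamma
    bb[-1] = b[-1] - a[0] * c[-1] / gamma
    y = thomas(a, bb, c, d)
    u = [0.0] * n
    u[0] = gamma
    u[-1] = c[-1]
    z = thomas(a, bb, c, u)
    fact = (y[0] + a[0] * y[-1] / gamma) / (1.0 + z[0] + a[0] * z[-1] / gamma)
    return [yi - fact * zi for yi, zi in zip(y, z)]


def solve_many(a, b, c, rhs_list):
    """Each right-hand side is independent, so this loop is the parallel axis."""
    return [periodic_thomas(a, b, c, d) for d in rhs_list]
```

    In a transpose-style layout each processor would run `solve_many` on its own subset of right-hand sides; the chained strategy instead splits the elimination sweep itself across processors.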

  12. Nonequilibrium radiative heating prediction method for aeroassist flowfields with coupling to flowfield solvers. Ph.D. Thesis

    NASA Technical Reports Server (NTRS)

    Hartung, Lin C.

    1991-01-01

    A method for predicting radiation absorption and emission coefficients in thermochemical nonequilibrium flows is developed. The method is called the Langley optimized radiative nonequilibrium code (LORAN). It applies the smeared band approximation for molecular radiation to produce moderately detailed results and is intended to fill the gap between detailed but costly prediction methods and very fast but highly approximate methods. The optimization of the method to provide efficient solutions allowing coupling to flowfield solvers is discussed. Representative results are obtained and compared to previous nonequilibrium radiation methods, as well as to ground- and flight-measured data. Reasonable agreement is found in all cases. A multidimensional radiative transport method is also developed for axisymmetric flows. Its predictions for wall radiative flux are 20 to 25 percent lower than those of the tangent slab transport method, as expected, though additional investigation of the symmetry and outflow boundary conditions is indicated. The method was applied to the peak heating condition of the aeroassist flight experiment (AFE) trajectory, with results comparable to predictions from other methods. The LORAN method was also applied in conjunction with the computational fluid dynamics (CFD) code LAURA to study the sensitivity of the radiative heating prediction to various models used in nonequilibrium CFD. This study suggests that radiation measurements can provide diagnostic information about the detailed processes occurring in a nonequilibrium flowfield because radiation phenomena are very sensitive to these processes.

  13. A Matlab-based finite-difference solver for the Poisson problem with mixed Dirichlet-Neumann boundary conditions

    NASA Astrophysics Data System (ADS)

    Reimer, Ashton S.; Cheviakov, Alexei F.

    2013-03-01

    A Matlab-based finite-difference numerical solver for the Poisson equation for a rectangle and a disk in two dimensions, and a spherical domain in three dimensions, is presented. The solver is optimized for handling an arbitrary combination of Dirichlet and Neumann boundary conditions, and allows for full user control of mesh refinement. The solver routines utilize effective and parallelized sparse vector and matrix operations. Computations exhibit high speeds, numerical stability with respect to mesh size and mesh refinement, and acceptable error values even on desktop computers. Program summary. Catalogue identifier: AENQ_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AENQ_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: GNU General Public License v3.0. No. of lines in distributed program, including test data, etc.: 102793. No. of bytes in distributed program, including test data, etc.: 369378. Distribution format: tar.gz. Programming language: Matlab 2010a. Computer: PC, Macintosh. Operating system: Windows, OSX, Linux. RAM: 8 GB (8,589,934,592 bytes). Classification: 4.3. Nature of problem: To solve the Poisson problem in a standard domain with “patchy surface”-type (strongly heterogeneous) Neumann/Dirichlet boundary conditions. Solution method: Finite difference with mesh refinement. Restrictions: Spherical domain in 3D; rectangular domain or a disk in 2D. Unusual features: Choice between mldivide/iterative solver for the solution of the large system of linear algebraic equations that arises. Full user control of Neumann/Dirichlet boundary conditions and mesh refinement. Running time: Depending on the number of points taken and the geometry of the domain, the routine may take from less than a second to several hours to execute.
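
    A finite-difference Poisson solve with mixed Dirichlet-Neumann boundary conditions can be sketched on a uniform grid. The example below is not the published Matlab package: it assumes the unit square, fixes u = 0 on the left and right edges, imposes a zero normal derivative on the bottom and top edges through ghost-point reflection, and uses plain Gauss-Seidel sweeps in place of a sparse direct or Krylov solver. The function name and parameters are hypothetical.

```python
def solve_poisson_mixed(n, f, sweeps=2000, tol=1e-12):
    """Solve -Laplace(u) = f on the unit square with an n-by-n grid.

    Dirichlet: u = 0 on x = 0 and x = 1.
    Neumann:   du/dy = 0 on y = 0 and y = 1.
    Returns u as a list of rows, u[j][i] ~ u(i*h, j*h).
    """
    h = 1.0 / (n - 1)
    u = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        change = 0.0
        for j in range(n):
            # Ghost-point reflection: the missing neighbour across a Neumann
            # edge is replaced by the interior neighbour on the other side.
            jm = j - 1 if j > 0 else 1
            jp = j + 1 if j < n - 1 else n - 2
            for i in range(1, n - 1):           # Dirichlet columns stay at 0
                new = 0.25 * (u[j][i - 1] + u[j][i + 1]
                              + u[jm][i] + u[jp][i]
                              + h * h * f(i * h, j * h))
                change = max(change, abs(new - u[j][i]))
                u[j][i] = new
        if change < tol:
            break
    return u
```

    With f = 1 the exact solution is u = x(1 - x)/2, independent of y, which the five-point stencil reproduces up to the iteration tolerance; that makes it a convenient check before moving to patchy boundary data.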

  14. CPU/GPU Computing for an Implicit Multi-Block Compressible Navier-Stokes Solver on Heterogeneous Platform

    NASA Astrophysics Data System (ADS)

    Deng, Liang; Bai, Hanli; Wang, Fang; Xu, Qingxin

    2016-06-01

    CPU/GPU computing allows scientists to tremendously accelerate their numerical codes. In this paper, we port and optimize a double precision alternating direction implicit (ADI) solver for three-dimensional compressible Navier-Stokes equations from our in-house Computational Fluid Dynamics (CFD) software on a heterogeneous platform. First, we implement a full GPU version of the ADI solver to remove many redundant data transfers between CPU and GPU, and then design two fine-grain schemes, namely “one-thread-one-point” and “one-thread-one-line”, to maximize the performance. Second, we present a dual-level parallelization scheme using the CPU/GPU collaborative model to exploit the computational resources of both multi-core CPUs and many-core GPUs within the heterogeneous platform. Finally, considering the fact that memory on a single node becomes inadequate when the simulation size grows, we present a tri-level hybrid programming pattern MPI-OpenMP-CUDA that merges fine-grain parallelism using OpenMP and CUDA threads with coarse-grain parallelism using MPI for inter-node communication. We also propose a strategy to overlap the computation with communication using the advanced features of CUDA and MPI programming. We obtain speedups of 6.0 for the ADI solver on one Tesla M2050 GPU in contrast to two Xeon X5670 CPUs. Scalability tests show that our implementation can offer significant performance improvement on a heterogeneous platform.

  15. An Extension of the Time-Spectral Method to Overset Solvers

    NASA Technical Reports Server (NTRS)

    Leffell, Joshua Isaac; Murman, Scott M.; Pulliam, Thomas

    2013-01-01

    Relative motion in the Cartesian or overset framework causes certain spatial nodes to move in and out of the physical domain as they are dynamically blanked by moving solid bodies. This poses a problem for the conventional Time-Spectral approach, which expands the solution at every spatial node into a Fourier series spanning the period of motion. The proposed extension to the Time-Spectral method treats unblanked nodes in the conventional manner but expands the solution at dynamically blanked nodes in a basis of barycentric rational polynomials spanning partitions of contiguously defined temporal intervals. Rational polynomials avoid Runge's phenomenon on the equidistant time samples of these sub-periodic intervals. Fourier- and rational polynomial-based differentiation operators are used in tandem to provide a consistent hybrid Time-Spectral overset scheme capable of handling relative motion. The hybrid scheme is tested with a linear model problem and implemented within NASA's OVERFLOW Reynolds-averaged Navier-Stokes (RANS) solver. The hybrid Time-Spectral solver is then applied to inviscid and turbulent RANS cases of plunging and pitching airfoils and compared to time-accurate and experimental data. A limiter was applied in the turbulent case to avoid undershoots in the undamped turbulent eddy viscosity while maintaining accuracy. The hybrid scheme matches the performance of the conventional Time-Spectral method and converges to the time-accurate results with increased temporal resolution.
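
    The conventional Time-Spectral approach mentioned here differentiates a periodic signal through its Fourier series. The sketch below builds the corresponding differentiation matrix for equispaced samples over one period. It is a textbook construction, not code from OVERFLOW: it assumes an odd number of samples, a period of 2*pi, and the hypothetical name `fourier_diff_matrix`. The barycentric rational operators used for blanked nodes are not shown.

```python
import math


def fourier_diff_matrix(n):
    """Fourier spectral differentiation matrix for n equispaced samples on [0, 2*pi).

    This sketch assumes n is odd; the even case uses a different closed form.
    """
    if n % 2 == 0:
        raise ValueError("this sketch assumes an odd number of samples")
    h = 2.0 * math.pi / n
    D = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j != k:
                sign = -1.0 if (j - k) % 2 else 1.0
                D[j][k] = 0.5 * sign / math.sin(0.5 * (j - k) * h)
    return D


n = 9
t = [2.0 * math.pi * j / n for j in range(n)]
u = [math.sin(tj) + 0.5 * math.cos(2.0 * tj) for tj in t]
D = fourier_diff_matrix(n)
du = [sum(D[j][k] * u[k] for k in range(n)) for j in range(n)]
exact = [math.cos(tj) - math.sin(2.0 * tj) for tj in t]
max_err = max(abs(a - b) for a, b in zip(du, exact))
```

    For a signal containing only harmonics that the sample count can resolve, as in this usage example, the matrix reproduces the time derivative to rounding error.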

  16. BRAIN initiative: fast and parallel solver for real-time monitoring of the eddy current in the brain for TMS applications.

    PubMed

    Sabouni, Abas; Pouliot, Philippe; Shmuel, Amir; Lesage, Frederic

    2014-01-01

    This paper introduces a fast and efficient solver for simulating the induced (eddy) current distribution in the brain during a transcranial magnetic stimulation procedure. This solver has been integrated with MRI and neuronavigation software to accurately model the electromagnetic field and show the eddy current in the head almost in real time. To examine the performance of the proposed technique, we used a 3D anatomically accurate MRI model of a 25-year-old female subject.

  17. An interior penalty stabilised incompressible discontinuous Galerkin-Fourier solver for implicit large eddy simulations

    NASA Astrophysics Data System (ADS)

    Ferrer, Esteban

    2017-11-01

    We present an implicit Large Eddy Simulation (iLES) h / p high order (≥2) unstructured Discontinuous Galerkin-Fourier solver with sliding meshes. The solver extends the laminar version of Ferrer and Willden, 2012 [34], to enable the simulation of turbulent flows at moderately high Reynolds numbers in the incompressible regime. This solver allows accurate flow solutions of the laminar and turbulent 3D incompressible Navier-Stokes equations on moving and static regions coupled through a high order sliding interface. The spatial discretisation is provided by the Symmetric Interior Penalty Discontinuous Galerkin (IP-DG) method in the x-y plane coupled with a purely spectral method that uses Fourier series and allows efficient computation of spanwise periodic three-dimensional flows. Since high order methods (e.g. discontinuous Galerkin and Fourier) are unable to provide enough numerical dissipation to enable under-resolved high Reynolds computations (i.e. as necessary in the iLES approach), we adapt the laminar version of the solver to increase (controllably) the dissipation and enhance the stability in under-resolved simulations. The novel stabilisation relies on increasing the penalty parameter included in the DG interior penalty (IP) formulation. The latter penalty term is included when discretising the linear viscous terms in the incompressible Navier-Stokes equations. These viscous penalty fluxes substitute the stabilising effect of non-linear fluxes, which has been the main trend in implicit LES discontinuous Galerkin approaches. The IP-DG penalty term provides energy dissipation, which is controlled by the numerical jumps at element interfaces (e.g. large in under-resolved regions) such as to stabilise under-resolved high Reynolds number flows. This dissipative term has minimal impact in well resolved regions and its implicit treatment does not restrict the use of large time steps, thus providing an efficient stabilization mechanism for iLES. The IP

  18. A Note on Substructuring Preconditioning for Nonconforming Finite Element Approximations of Second Order Elliptic Problems

    NASA Technical Reports Server (NTRS)

    Maliassov, Serguei

    1996-01-01

    In this paper an algebraic substructuring preconditioner is considered for nonconforming finite element approximations of second order elliptic problems in 3D domains with a piecewise constant diffusion coefficient. Using a substructuring idea and a block Gauss elimination, part of the unknowns is eliminated and the Schur complement obtained is preconditioned by a spectrally equivalent very sparse matrix. In the case of quasiuniform tetrahedral mesh an appropriate algebraic multigrid solver can be used to solve the problem with this matrix. Explicit estimates of condition numbers and implementation algorithms are established for the constructed preconditioner. It is shown that the condition number of the preconditioned matrix does not depend on either the mesh step size or the jump of the coefficient. Finally, numerical experiments are presented to illustrate the theory being developed.

  19. On the implementation of an accurate and efficient solver for convection-diffusion equations

    NASA Astrophysics Data System (ADS)

    Wu, Chin-Tien

    In this dissertation, we examine several different aspects of computing the numerical solution of the convection-diffusion equation. The solution of this equation often exhibits sharp gradients due to Dirichlet outflow boundaries or discontinuities in boundary conditions. Because of the singularly perturbed nature of the equation, numerical solutions often have severe oscillations when grid sizes are not small enough to resolve sharp gradients. To overcome such difficulties, the streamline diffusion discretization method can be used to obtain an accurate approximate solution in regions where the solution is smooth. To increase accuracy of the solution in the regions containing layers, adaptive mesh refinement and mesh movement based on a posteriori error estimations can be employed. An error-adapted mesh refinement strategy based on a posteriori error estimations is also proposed to resolve layers. For solving the sparse linear systems that arise from discretization, geometric multigrid (MG) and algebraic multigrid (AMG) are compared. In addition, both methods are also used as preconditioners for Krylov subspace methods. We derive some convergence results for MG with line Gauss-Seidel smoothers and bilinear interpolation. Finally, while considering adaptive mesh refinement as an integral part of the solution process, it is natural to set a stopping tolerance for the iterative linear solvers on each mesh stage so that the difference between the approximate solution obtained from iterative methods and the finite element solution is bounded by an a posteriori error bound. Here, we present two stopping criteria. The first is based on a residual-type a posteriori error estimator developed by Verfurth. The second is based on an a posteriori error estimator, using local solutions, developed by Kay and Silvester. Our numerical results show the refined mesh obtained from the iterative solution which satisfies the second criterion is similar to the refined mesh obtained from

  20. A fast solver for the Helmholtz equation based on the generalized multiscale finite-element method

    NASA Astrophysics Data System (ADS)

    Fu, Shubin; Gao, Kai

    2017-11-01

    Conventional finite-element methods for solving the acoustic-wave Helmholtz equation in highly heterogeneous media usually require a finely discretized mesh to represent the medium property variations with sufficient accuracy. Computational costs for solving the Helmholtz equation can therefore be considerable for complicated and large geological models. Based on the generalized multiscale finite-element theory, we develop a novel continuous Galerkin method to solve the Helmholtz equation in acoustic media with spatially variable velocity and mass density. Instead of using conventional polynomial basis functions, we use multiscale basis functions to form the approximation space on the coarse mesh. The multiscale basis functions are obtained from multiplying the eigenfunctions of a carefully designed local spectral problem with an appropriate multiscale partition of unity. These multiscale basis functions can effectively incorporate the characteristics of heterogeneous media's fine-scale variations, thus enabling us to obtain an accurate solution to the Helmholtz equation without directly solving the large discrete system formed on the fine mesh. Numerical results show that our new solver can significantly reduce the dimension of the discrete Helmholtz equation system, and can also clearly reduce the computational time.

  1. SMPBS: Web server for computing biomolecular electrostatics using finite element solvers of size modified Poisson-Boltzmann equation.

    PubMed

    Xie, Yang; Ying, Jinyong; Xie, Dexuan

    2017-03-30

    SMPBS (Size Modified Poisson-Boltzmann Solvers) is a web server for computing biomolecular electrostatics using finite element solvers of the size modified Poisson-Boltzmann equation (SMPBE). SMPBE not only reflects ionic size effects but also includes the classic Poisson-Boltzmann equation (PBE) as a special case. Thus, its web server is expected to have a broader range of applications than a PBE web server. SMPBS is designed with a dynamic, mobile-friendly user interface, and features easily accessible help text, asynchronous data submission, and an interactive, hardware-accelerated molecular visualization viewer based on the 3Dmol.js library. In particular, the viewer allows computed electrostatics to be directly mapped onto an irregular triangular mesh of a molecular surface. Due to this functionality and the fast SMPBE finite element solvers, the web server is very efficient in the calculation and visualization of electrostatics. In addition, SMPBE is reconstructed using a new objective electrostatic free energy, clearly showing that the electrostatics and ionic concentrations predicted by SMPBE are optimal in the sense of minimizing the objective electrostatic free energy. SMPBS is available at the URL: smpbs.math.uwm.edu © 2017 Wiley Periodicals, Inc.

  2. Adaptation of a Multi-Block Structured Solver for Effective Use in a Hybrid CPU/GPU Massively Parallel Environment

    NASA Astrophysics Data System (ADS)

    Gutzwiller, David; Gontier, Mathieu; Demeulenaere, Alain

    2014-11-01

    Multi-Block structured solvers hold many advantages over their unstructured counterparts, such as a smaller memory footprint and efficient serial performance. Historically, multi-block structured solvers have not been easily adapted for use in a High Performance Computing (HPC) environment, and the recent trend towards hybrid GPU/CPU architectures has further complicated the situation. This paper will elaborate on developments and innovations applied to the NUMECA FINE/Turbo solver that have allowed near-linear scalability with real-world problems on over 250 hybrid GPU/CPU cluster nodes. Discussion will focus on the implementation of virtual partitioning and load balancing algorithms using a novel meta-block concept. This implementation is transparent to the user, allowing all pre- and post-processing steps to be performed using a simple, unpartitioned grid topology. Additional discussion will elaborate on developments that have improved parallel performance, including fully parallel I/O with the ADIOS API and the GPU porting of the computationally heavy CPUBooster convergence acceleration module. Head of HPC and Release Management, Numeca International.

  3. StagBL : A Scalable, Portable, High-Performance Discretization and Solver Layer for Geodynamic Simulation

    NASA Astrophysics Data System (ADS)

    Sanan, P.; Tackley, P. J.; Gerya, T.; Kaus, B. J. P.; May, D.

    2017-12-01

    StagBL is an open-source parallel solver and discretization library for geodynamic simulation, encapsulating and optimizing operations essential to staggered-grid finite volume Stokes flow solvers. It provides a parallel staggered-grid abstraction with a high-level interface in C and Fortran. On top of this abstraction, tools are available to define boundary conditions and interact with particle systems. Tools and examples to efficiently solve Stokes systems defined on the grid are provided in small (direct solver), medium (simple preconditioners), and large (block factorization and multigrid) model regimes. By working directly with leading application codes (StagYY, I3ELVIS, and LaMEM) and providing an API and examples to integrate with others, StagBL aims to become a community tool supplying scalable, portable, reproducible performance toward novel science in regional- and planet-scale geodynamics and planetary science. By implementing kernels used by many research groups beneath a uniform abstraction layer, the library will enable optimization for modern hardware, thus reducing community barriers to large- or extreme-scale parallel simulation on modern architectures. In particular, the library will include CPU-, Manycore-, and GPU-optimized variants of matrix-free operators and multigrid components. The common layer provides a framework upon which to introduce innovative new tools. StagBL will leverage p4est to provide distributed adaptive meshes, and incorporate a multigrid convergence analysis tool. These options, in addition to a wealth of solver options provided by an interface to PETSc, will make the most modern solution techniques available from a common interface. StagBL in turn provides a PETSc interface, DMStag, to its central staggered grid abstraction. We present public version 0.5 of StagBL, including preliminary integration with application codes and demonstrations with its own demonstration application, StagBLDemo. Central to StagBL is the notion of an

  4. Application of the FUN3D Unstructured-Grid Navier-Stokes Solver to the 4th AIAA Drag Prediction Workshop Cases

    NASA Technical Reports Server (NTRS)

    Lee-Rausch, Elizabeth M.; Hammond, Dana P.; Nielsen, Eric J.; Pirzadeh, S. Z.; Rumsey, Christopher L.

    2010-01-01

    FUN3D Navier-Stokes solutions were computed for the 4th AIAA Drag Prediction Workshop grid convergence study, downwash study, and Reynolds number study on a set of node-based mixed-element grids. All of the baseline tetrahedral grids were generated with the VGRID (developmental) advancing-layer and advancing-front grid generation software package following the gridding guidelines developed for the workshop. With maximum grid sizes exceeding 100 million nodes, the grid convergence study was particularly challenging for the node-based unstructured grid generators and flow solvers. At the time of the workshop, the super-fine grid with 105 million nodes and 600 million elements was the largest grid known to have been generated using VGRID. FUN3D Version 11.0 has a completely new pre- and post-processing paradigm that has been incorporated directly into the solver and functions entirely in a parallel, distributed memory environment. This feature allowed for practical pre-processing and solution times on the largest unstructured-grid size requested for the workshop. For the constant-lift grid convergence case, the convergence of total drag is approximately second-order on the finest three grids. The variation in total drag between the finest two grids is only 2 counts. At the finest grid levels, only small variations in wing and tail pressure distributions are seen with grid refinement. Similarly, a small wing side-of-body separation also shows little variation at the finest grid levels. Overall, the FUN3D results compare well with the structured-grid code CFL3D. The FUN3D downwash study and Reynolds number study results compare well with the range of results shown in the workshop presentations.

  5. A Comparison of Three Navier-Stokes Solvers for Exhaust Nozzle Flowfields

    NASA Technical Reports Server (NTRS)

    Georgiadis, Nicholas J.; Yoder, Dennis A.; Debonis, James R.

    1999-01-01

    A comparison of the NPARC, PAB, and WIND (previously known as NASTD) Navier-Stokes solvers is made for two flow cases with turbulent mixing as the dominant flow characteristic, a two-dimensional ejector nozzle and a Mach 1.5 elliptic jet. The objective of the work is to determine if comparable predictions of nozzle flows can be obtained from different Navier-Stokes codes employed in a multiple site research program. A single computational grid was constructed for each of the two flows and used for all of the Navier-Stokes solvers. In addition, similar k-ε based turbulence models were employed in each code, and boundary conditions were specified as similarly as possible across the codes. Comparisons of mass flow rates, velocity profiles, and turbulence model quantities are made between the computations and experimental data. The computational cost of obtaining converged solutions with each of the codes is also documented. Results indicate that all of the codes provided similar predictions for the two nozzle flows. Agreement of the Navier-Stokes calculations with experimental data was good for the ejector nozzle. However, for the Mach 1.5 elliptic jet, the calculations were unable to accurately capture the development of the three-dimensional elliptic mixing layer.

  6. Design of a Modular Monolithic Implicit Solver for Multi-Physics Applications

    NASA Technical Reports Server (NTRS)

    Carton De Wiart, Corentin; Diosady, Laslo T.; Garai, Anirban; Burgess, Nicholas; Blonigan, Patrick; Ekelschot, Dirk; Murman, Scott M.

    2018-01-01

    The design of a modular multi-physics high-order space-time finite-element framework is presented together with its extension to allow monolithic coupling of different physics. One of the main objectives of the framework is to perform efficient high-fidelity simulations of capsule/parachute systems. This problem requires simulating multiple physics including, but not limited to, the compressible Navier-Stokes equations, the dynamics of a moving body with mesh deformations and adaptation, the linear shell equations, non-reflective boundary conditions and wall modeling. The solver is based on high-order space-time finite-element methods. Continuous, discontinuous and C1-discontinuous Galerkin methods are implemented, allowing one to discretize various physical models. Tangent and adjoint sensitivity analysis are also targeted in order to conduct gradient-based optimization, error estimation, mesh adaptation, and flow control, adding another layer of complexity to the framework. The decisions made to tackle these challenges are presented. The discussion focuses first on the "single-physics" solver and later on its extension to the monolithic coupling of different physics. The implementation of different physics modules, relevant to the capsule/parachute system, is also presented. Finally, examples of coupled computations are presented, paving the way to the simulation of the full capsule/parachute system.

  7. Validation Process for LEWICE by Use of a Navier-Stokes Solver

    NASA Technical Reports Server (NTRS)

    Wright, William B.; Porter, Christopher E.

    2017-01-01

    A research project is underway at NASA Glenn to produce computer software that can accurately predict ice growth under any meteorological conditions for any aircraft surface. This report will present results from the latest LEWICE release, version 3.5. This program differs from previous releases in its ability to model mixed phase and ice crystal conditions such as those encountered inside an engine. It also has expanded capability to use structured grids and a new capability to use results from unstructured grid flow solvers. A quantitative comparison of the results against a database of ice shapes that have been generated in the NASA Glenn Icing Research Tunnel (IRT) has also been performed. This paper will extend the comparison of ice shapes between LEWICE 3.5 and experimental data from a previous paper. Comparisons of lift and drag are made between experimentally collected data from experimentally obtained ice shapes and simulated (CFD) data on simulated (LEWICE) ice shapes. Comparisons are also made between experimentally collected and simulated performance data on select experimental ice shapes to ensure the CFD solver, FUN3D, is valid within the flight regime. The results show that the predicted results are within the accuracy limits of the experimental data for the majority of cases.

  8. Advanced Multigrid Solvers for Fluid Dynamics

    NASA Technical Reports Server (NTRS)

    Brandt, Achi

    1999-01-01

    The main objective of this project has been to support the development of multigrid techniques in computational fluid dynamics that can achieve "textbook multigrid efficiency" (TME), which is several orders of magnitude faster than current industrial CFD solvers. Toward that goal we have assembled a detailed table which lists every foreseen kind of computational difficulty for achieving it, together with the possible ways for resolving the difficulty, their current state of development, and references. We have developed several codes to test and demonstrate, in the framework of simple model problems, several approaches for overcoming the most important of the listed difficulties that had not been resolved before. In particular, TME has been demonstrated for incompressible flows on one hand, and for near-sonic flows on the other hand. General approaches were advanced for the relaxation of stagnation points and boundary conditions under various situations. Also, new algebraic multigrid techniques were formed for treating unstructured grid formulations. More details on all these are given below.

  9. Development of high-accuracy convection schemes for sequential solvers

    NASA Technical Reports Server (NTRS)

    Thakur, Siddharth; Shyy, Wei

    1993-01-01

    An exploration is conducted of the applicability of such high resolution schemes as TVD to the resolving of sharp flow gradients using a sequential solution approach borrowed from pressure-based algorithms. It is shown that by extending these high-resolution shock-capturing schemes to a sequential solver that treats the equations as a collection of scalar conservation equations, the speed of signal propagation in the solution has to be coordinated by assigning the local convection speed as the characteristic speed for the entire system. A higher amount of dissipation is therefore needed to eliminate oscillations near discontinuities.
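
    For readers unfamiliar with TVD schemes, the sketch below applies a minmod-limited second-order upwind update to the scalar advection equation u_t + a u_x = 0 and tracks the total variation. It is a generic textbook scheme, not the sequential pressure-based solver studied in this record: it assumes a > 0, periodic boundaries and a Courant number between 0 and 1, and the names `minmod`, `tvd_advect_step` and `total_variation` are hypothetical.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if both share a sign, else zero."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def tvd_advect_step(u, nu):
    """Advance u_t + a u_x = 0 (a > 0, periodic) by one step.

    nu is the Courant number a*dt/dx, assumed to satisfy 0 <= nu <= 1.
    """
    n = len(u)
    face = []
    for j in range(n):
        # Limited slope built from the backward and forward differences.
        slope = minmod(u[j] - u[j - 1], u[(j + 1) % n] - u[j])
        # Upwind-biased value at the cell face j + 1/2.
        face.append(u[j] + 0.5 * (1.0 - nu) * slope)
    # Conservative update; index -1 wraps around for the periodic boundary.
    return [u[j] - nu * (face[j] - face[j - 1]) for j in range(n)]


def total_variation(u):
    """Sum of absolute jumps between neighbouring cells (periodic)."""
    return sum(abs(u[j] - u[j - 1]) for j in range(len(u)))


u = [1.0 if 10 <= j < 20 else 0.0 for j in range(40)]
mass0 = sum(u)
history = [total_variation(u)]
for _ in range(30):
    u = tvd_advect_step(u, 0.5)
    history.append(total_variation(u))
```

    In the usage example, `history` records the total variation of a square pulse at each step, which is the quantity a TVD scheme is designed not to increase.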

  10. Implementation of a parallel unstructured Euler solver on shared and distributed memory architectures

    NASA Technical Reports Server (NTRS)

    Mavriplis, D. J.; Das, Raja; Saltz, Joel; Vermeland, R. E.

    1992-01-01

    An efficient three dimensional unstructured Euler solver is parallelized on a Cray Y-MP C90 shared memory computer and on an Intel Touchstone Delta distributed memory computer. This paper relates the experiences gained and describes the software tools and hardware used in this study. Performance comparisons between two differing architectures are made.

  11. Simulation of an Isolated Tiltrotor in Hover with an Unstructured Overset-Grid RANS Solver

    NASA Technical Reports Server (NTRS)

    Lee-Rausch, Elizabeth M.; Biedron, Robert T.

    2009-01-01

    An unstructured overset-grid Reynolds Averaged Navier-Stokes (RANS) solver, FUN3D, is used to simulate an isolated tiltrotor in hover. An overview of the computational method is presented as well as the details of the overset-grid systems. Steady-state computations within a noninertial reference frame define the performance trends of the rotor across a range of the experimental collective settings. Results are presented to show the effects of off-body grid refinement and blade grid refinement. The computed performance and blade loading trends show good agreement with experimental results and previously published structured overset-grid computations. Off-body flow features indicate a significant improvement in the resolution of the first perpendicular blade vortex interaction with background grid refinement across the collective range. Considering experimental data uncertainty and effects of transition, the prediction of figure of merit on the baseline and refined grid is reasonable at the higher collective range, within 3 percent of the measured values. At the lower collective settings, the computed figure of merit is approximately 6 percent lower than the experimental data. A comparison of steady and unsteady results shows that with temporal refinement, the dynamic results closely match the steady-state noninertial results, which gives confidence in the accuracy of the dynamic overset-grid approach.

  12. Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics

    NASA Technical Reports Server (NTRS)

    Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.

    2001-01-01

    An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward/backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low values of source frequencies, but at higher values of source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than that required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring for domain decomposition formulation to achieve parallel computation, where different substructures are handled by different parallel processors.

  13. Application of an Unstructured Grid Navier-Stokes Solver to a Generic Helicopter Body: Comparison of Unstructured Grid Results with Structured Grid Results and Experimental Results

    NASA Technical Reports Server (NTRS)

    Mineck, Raymond E.

    1999-01-01

    An unstructured-grid Navier-Stokes solver was used to predict the surface pressure distribution, the off-body flow field, the surface flow pattern, and integrated lift and drag coefficients on the ROBIN configuration (a generic helicopter) without a rotor at four angles of attack. The results are compared to those predicted by two structured-grid Navier-Stokes solvers and to experimental surface pressure distributions. The surface pressure distributions from the unstructured-grid Navier-Stokes solver are in good agreement with the results from the structured-grid Navier-Stokes solvers. Agreement with the experimental pressure coefficients is good over the forward portion of the body. However, agreement is poor on the lower portion of the mid-section of the body. Comparison of the predicted surface flow patterns showed similar regions of separated flow. Predicted lift and drag coefficients were in fair agreement with each other.

  14. Iterative methods for 3D implicit finite-difference migration using the complex Padé approximation

    NASA Astrophysics Data System (ADS)

    Costa, Carlos A. N.; Campos, Itamara S.; Costa, Jessé C.; Neto, Francisco A.; Schleicher, Jörg; Novais, Amélia

    2013-08-01

    Conventional implementations of 3D finite-difference (FD) migration use splitting techniques to accelerate performance and save computational cost. However, such techniques are plagued with numerical anisotropy that jeopardises the correct positioning of dipping reflectors in the directions not used for the operator splitting. We implement 3D downward continuation FD migration without splitting using a complex Padé approximation. In this way, the numerical anisotropy is eliminated at the expense of a computationally more intensive solution of a large-band linear system. We compare the performance of the iterative stabilized biconjugate gradient (BICGSTAB) and that of the multifrontal massively parallel direct solver (MUMPS). It turns out that the use of the complex Padé approximation not only stabilizes the solution, but also acts as an effective preconditioner for the BICGSTAB algorithm, reducing the number of iterations as compared to the implementation using the real Padé expansion. As a consequence, the iterative BICGSTAB method is more efficient than the direct MUMPS method when solving a single term in the Padé expansion. The results of both algorithms, here evaluated by computing the migration impulse response in the SEG/EAGE salt model, are of comparable quality.
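
    The iterative method compared in this record, BICGSTAB, can be sketched in a few lines. The version below is a minimal unpreconditioned implementation for a small real dense matrix. It is illustrative only: the migration problem involves large complex banded systems, and the helper names `matvec`, `dot` and `bicgstab` and the parameters `tol` and `max_iter` are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def bicgstab(A, b, tol=1e-10, max_iter=200):
    """Unpreconditioned BiCGSTAB for A x = b, starting from x = 0.

    Stops when the residual norm falls below tol times the norm of b.
    Breakdown cases (zero denominators) are not handled in this sketch.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)        # residual for the zero initial guess
    r_hat = list(r)    # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = dot(b, b) ** 0.5 or 1.0
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            break
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = matvec(A, p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 <= tol * b_norm:
            # The half step already converged.
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            break
        t = matvec(A, s)
        omega = dot(t, s) / dot(t, t)  # stabilizing minimal-residual step
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        rho = rho_new
    return x


A = [[4.0, 1.0, 0.0], [2.0, 5.0, 1.0], [0.0, 1.0, 3.0]]
b = [1.0, 2.0, 3.0]
x = bicgstab(A, b)
residual = max(abs(bi - yi) for bi, yi in zip(b, matvec(A, x)))
```

    A preconditioner, such as the effect the record attributes to the complex Padé approximation, would change how many iterations the loop needs rather than its structure.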

  15. Acceleration of GPU-based Krylov solvers via data transfer reduction

    DOE PAGES

    Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...

    2015-04-08

    Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphics processing units, and in particular show for the Biconjugate Gradient Stabilized solver that significant improvement can be achieved by reformulating the method to reduce data-communications through application-specific kernels instead of using the generic BLAS kernels, e.g. as provided by NVIDIA’s cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit’s computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as sparse matrix-vector, are crucial for the subsequent development of high-performance graphics processing units accelerated Krylov subspace iterative methods.

  16. Progress in developing Poisson-Boltzmann equation solvers

    PubMed Central

    Li, Chuan; Li, Lin; Petukh, Marharyta; Alexov, Emil

    2013-01-01

    This review outlines the recent progress made in developing more accurate and efficient solutions to model electrostatics in systems comprised of bio-macromolecules and nano-objects, the last one referring to objects that do not have biological function themselves but nowadays are frequently used in biophysical and medical approaches in conjunction with bio-macromolecules. The problem of modeling macromolecular electrostatics is reviewed from two different angles: as a mathematical task provided the specific definition of the system to be modeled and as a physical problem aiming to better capture the phenomena occurring in the real experiments. In addition, specific attention is paid to methods to extend the capabilities of the existing solvers to model large systems toward applications of calculations of the electrostatic potential and energies in molecular motors, mitochondria complex, photosynthetic machinery and systems involving large nano-objects. PMID:24199185

  17. The alpha(3) Scheme - A Fourth-Order Neutrally Stable CESE Solver

    NASA Technical Reports Server (NTRS)

    Chang, Sin-Chung

    2007-01-01

    The conservation element and solution element (CESE) development is driven by a belief that a solver should (i) enforce conservation laws in both space and time, and (ii) be built from a non-dissipative (i.e., neutrally stable) core scheme so that the numerical dissipation can be controlled effectively. To provide a solid foundation for a systematic CESE development of high order schemes, in this paper we describe a new 4th-order neutrally stable CESE solver of the advection equation $\partial u/\partial t + \alpha\,\partial u/\partial x = 0$. The space-time stencil of this two-level explicit scheme is formed by one point at the upper time level and three points at the lower time level. Because it is associated with three independent mesh variables $u^n_j$, $(u_x)^n_j$, and $(u_{xx})^n_j$ (the numerical analogues of $u$, $\partial u/\partial x$, and $\partial^2 u/\partial x^2$, respectively) and four equations per mesh point, the new scheme is referred to as the alpha(3) scheme. As in the case of other similar CESE neutrally stable solvers, the alpha(3) scheme enforces conservation laws in space-time locally and globally, and it has the basic, forward marching, and backward marching forms. These forms are equivalent and satisfy a space-time inversion (STI) invariant property which is shared by the advection equation. Based on the concept of STI invariance, a set of algebraic relations is developed and used to prove that the alpha(3) scheme must be neutrally stable when it is stable. Moreover, it is proved rigorously that all three amplification factors of the alpha(3) scheme are of unit magnitude for all phase angles if $|\nu| \le 1/2$ ($\nu = \alpha\,\Delta t/\Delta x$). This theoretical result is consistent with the numerical stability condition $|\nu| \le 1/2$. Through numerical experiments, it is established that the alpha(3) scheme generally is (i) 4th-order accurate for the mesh variables $u^n_j$ and $(u_x)^n_j$; and 2nd-order accurate for $(u_{xx})^n_j$…

  18. On the parallel solution of parabolic equations

    NASA Technical Reports Server (NTRS)

    Gallopoulos, E.; Saad, Youcef

    1989-01-01

    Parallel algorithms for the solution of linear parabolic problems are proposed. The first of these methods is based on using polynomial approximation to the exponential. It does not require solving any linear systems and is highly parallelizable. The two other methods proposed are based on Padé and Chebyshev approximations to the matrix exponential. The parallelization of these methods is achieved by using partial fraction decomposition techniques to solve the resulting systems and thus offers the potential for increased time parallelism in time dependent problems. Experimental results from the Alliant FX/8 and the Cray Y-MP/832 vector multiprocessors are also presented.
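
    As an illustration of the first idea, the sketch below advances u' = -A u over a time t by applying a truncated Taylor polynomial of exp(-tA) to a vector, so that only matrix-vector products are needed and no linear system is solved. The Taylor polynomial is an assumption chosen for brevity; the abstract does not say which polynomial approximation the authors use, and the names `matvec` and `exp_taylor_apply` are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product (each row is independent, hence parallel)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def exp_taylor_apply(A, v, t, degree=20):
    """Approximate exp(-t*A) @ v by its Taylor polynomial of the given degree.

    Only matrix-vector products are used. This is reasonable when t*||A|| is
    moderate; a stiff problem would need a better polynomial or smaller steps.
    """
    result = list(v)
    term = list(v)                      # holds (-t*A)^k v / k!
    for k in range(1, degree + 1):
        term = [(-t / k) * y for y in matvec(A, term)]
        result = [r + s for r, s in zip(result, term)]
    return result
```

    For a diagonal matrix the result can be checked directly against the scalar exponential of each diagonal entry.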

  19. Applying Reduced Generator Models in the Coarse Solver of Parareal in Time Parallel Power System Simulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duan, Nan; Dimitrovski, Aleksandar D; Simunovic, Srdjan

    2016-01-01

    The development of high-performance computing techniques and platforms has provided many opportunities for real-time or even faster-than-real-time implementation of power system simulations. One approach uses the Parareal in time framework. The Parareal algorithm has shown promising theoretical simulation speedups by temporally decomposing a simulation run into a coarse simulation on the entire simulation interval and fine simulations on sequential sub-intervals linked through the coarse simulation. However, it has been found that the time cost of the coarse solver needs to be reduced to fully exploit the potential of the Parareal algorithm. This paper studies a Parareal implementation using reduced generator models for the coarse solver and reports the testing results on the IEEE 39-bus system and a 327-generator 2383-bus Polish system model.
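
    The coarse/fine structure described above can be sketched for a scalar ODE as follows. This is a generic Parareal iteration, not the power-system implementation of the paper: forward Euler is assumed for both propagators, and the names `euler` and `parareal` and all parameters are hypothetical.

```python
def euler(f, y, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with `steps` forward Euler steps."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t0, t1, n_intervals, n_iter, coarse_steps=1, fine_steps=100):
    """Return the interval boundaries and the Parareal solution at each of them."""
    ts = [t0 + (t1 - t0) * i / n_intervals for i in range(n_intervals + 1)]

    def coarse(y, i):
        return euler(f, y, ts[i], ts[i + 1], coarse_steps)

    def fine(y, i):
        return euler(f, y, ts[i], ts[i + 1], fine_steps)

    # Initial guess: one sequential sweep of the cheap coarse solver.
    U = [y0]
    for i in range(n_intervals):
        U.append(coarse(U[i], i))

    for _ in range(n_iter):
        # Fine solves are independent per sub-interval (parallel in practice).
        fine_vals = [fine(U[i], i) for i in range(n_intervals)]
        coarse_old = [coarse(U[i], i) for i in range(n_intervals)]
        # Sequential coarse correction sweep linking the sub-intervals.
        U_new = [y0]
        for i in range(n_intervals):
            U_new.append(coarse(U_new[i], i) + fine_vals[i] - coarse_old[i])
        U = U_new
    return ts, U
```

    The coarse sweep is the only sequential part of each iteration, which is why reducing its cost matters for the achievable speedup. After as many iterations as there are sub-intervals, the result coincides with the sequential fine solution up to rounding.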

  20. A biomolecular electrostatics solver using Python, GPUs and boundary elements that can handle solvent-filled cavities and Stern layers.

    PubMed

    Cooper, Christopher D; Bardhan, Jaydeep P; Barba, L A

    2014-03-01

    The continuum theory applied to biomolecular electrostatics leads to an implicit-solvent model governed by the Poisson-Boltzmann equation. Solvers relying on a boundary integral representation typically do not consider features like solvent-filled cavities or ion-exclusion (Stern) layers, due to the added difficulty of treating multiple boundary surfaces. This has hindered meaningful comparisons with volume-based methods, and the effects on accuracy of including these features have remained unknown. This work presents a solver called PyGBe that uses a boundary-element formulation and can handle multiple interacting surfaces. It was used to study the effects of solvent-filled cavities and Stern layers on the accuracy of calculating solvation energy and binding energy of proteins, using the well-known APBS finite-difference code for comparison. The results suggest that if required accuracy for an application allows errors larger than about 2% in solvation energy, then the simpler, single-surface model can be used. When calculating binding energies, the need for a multi-surface model is problem-dependent, becoming more critical when ligand and receptor are of comparable size. Comparing with the APBS solver, the boundary-element solver is faster when the accuracy requirements are higher. The cross-over point for the PyGBe code is on the order of 1-2% error, when running on one GPU card (NVIDIA Tesla C2075), compared with APBS running on six Intel Xeon CPU cores. PyGBe achieves algorithmic acceleration of the boundary element method using a treecode, and hardware acceleration using GPUs via PyCuda from a user-visible code that is all Python. The code is open-source under MIT license.

  1. Memory transfer optimization for a lattice Boltzmann solver on Kepler architecture nVidia GPUs

    NASA Astrophysics Data System (ADS)

    Mawson, Mark J.; Revell, Alistair J.

    2014-10-01

    The Lattice Boltzmann method (LBM) for solving fluid flow is naturally well suited to an efficient implementation for massively parallel computing, due to the prevalence of local operations in the algorithm. This paper presents and analyses the performance of a 3D lattice Boltzmann solver, optimized for third generation nVidia GPU hardware, also known as 'Kepler'. We provide a review of previous optimization strategies and analyse data read/write times for different memory types. In LBM, the time propagation step (known as streaming) involves shifting data to adjacent locations and is central to parallel performance; here we examine three approaches which make use of different hardware options. Two of these make use of 'performance enhancing' features of the GPU: shared memory and the new shuffle instruction found in Kepler based GPUs. These are compared to a standard transfer of data which relies instead on optimized storage to increase coalesced access. It is shown that the simpler approach is most efficient, since the need for large numbers of registers per thread in LBM limits the block size and thus reduces the efficiency of these special features. Detailed results are obtained for a D3Q19 LBM solver, which is benchmarked on nVidia K5000M and K20C GPUs. In the latter case the use of a read-only data cache is explored, and peak performance of over 1036 Million Lattice Updates Per Second (MLUPS) is achieved. The appearance of a periodic bottleneck in the solver performance is also reported, believed to be hardware related; spikes in iteration-time occur with a frequency of around 11 Hz for both GPUs, independent of the size of the problem.
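
    To make the streaming step concrete, the sketch below shifts each distribution function to the neighbouring node along its lattice velocity on a periodic grid. It assumes a 2D D2Q9 lattice as a compact stand-in for the D3Q19 lattice of the paper, runs on the CPU in plain Python, and uses the hypothetical names `VELOCITIES` and `stream`; it says nothing about the GPU memory strategies the authors compare.

```python
# D2Q9 lattice velocities (assumption: 2D stand-in for the paper's D3Q19 set).
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]


def stream(f, nx, ny):
    """Push streaming: f[q][x][y] moves to node (x + cx, y + cy), periodic wrap.

    Every (q, x, y) entry is written exactly once and reads exactly one value,
    so on a GPU each entry could be handled by an independent thread.
    """
    out = [[[0.0] * ny for _ in range(nx)] for _ in VELOCITIES]
    for q, (cx, cy) in enumerate(VELOCITIES):
        for x in range(nx):
            for y in range(ny):
                out[q][(x + cx) % nx][(y + cy) % ny] = f[q][x][y]
    return out
```

    Streaming only moves data, so the total of all distribution values is unchanged; the cost is dominated by memory access rather than arithmetic.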

  2. Comparative Performance Analysis of Coarse Solvers for Algebraic Multigrid on Multicore and Manycore Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Druinsky, Alex; Ghysels, Pieter; Li, Xiaoye S.

    In this paper, we study the performance of a two-level algebraic-multigrid algorithm, with a focus on the impact of the coarse-grid solver on performance. We consider two algorithms for solving the coarse-space systems: the preconditioned conjugate gradient method and a new robust HSS-embedded low-rank sparse-factorization algorithm. Our test data comes from the SPE Comparative Solution Project for oil-reservoir simulations. We contrast the performance of our code on one 12-core socket of a Cray XC30 machine with performance on a 60-core Intel Xeon Phi coprocessor. To obtain top performance, we optimized the code to take full advantage of fine-grained parallelism and made it thread-friendly for high thread count. We also developed a bounds-and-bottlenecks performance model of the solver which we used to guide us through the optimization effort, and also carried out performance tuning in the solver’s large parameter space. Finally, as a result, significant speedups were obtained on both machines.

  3. LINFLUX-AE: A Turbomachinery Aeroelastic Code Based on a 3-D Linearized Euler Solver

    NASA Technical Reports Server (NTRS)

    Reddy, T. S. R.; Bakhle, M. A.; Trudell, J. J.; Mehmed, O.; Stefko, G. L.

    2004-01-01

    This report describes the development and validation of LINFLUX-AE, a turbomachinery aeroelastic code based on the linearized unsteady 3-D Euler solver, LINFLUX. A helical fan with flat plate geometry is selected as the test case for numerical validation. The steady solution required by LINFLUX is obtained from the nonlinear Euler/Navier Stokes solver TURBO-AE. The report briefly describes the salient features of LINFLUX and the details of the aeroelastic extension. The aeroelastic formulation is based on a modal approach. An eigenvalue formulation is used for flutter analysis. The unsteady aerodynamic forces required for flutter are obtained by running LINFLUX for each mode, interblade phase angle and frequency of interest. The unsteady aerodynamic forces for forced response analysis are obtained from LINFLUX for the prescribed excitation, interblade phase angle, and frequency. The forced response amplitude is calculated from the modal summation of the generalized displacements. The unsteady pressures, work done per cycle, eigenvalues and forced response amplitudes obtained from LINFLUX are compared with those obtained from LINSUB, TURBO-AE, ASTROP2, and ANSYS.

  4. Progress report on PIXIE3D, a fully implicit 3D extended MHD solver

    NASA Astrophysics Data System (ADS)

    Chacon, Luis

    2008-11-01

    Recently (invited talk at DPP07), an optimal, massively parallel implicit algorithm for 3D resistive magnetohydrodynamics (PIXIE3D) was demonstrated. Excellent algorithmic and parallel results were obtained with up to 4096 processors and 138 million unknowns. While this is a remarkable result, further developments are still needed for PIXIE3D to become a 3D extended MHD production code in general geometries. In this poster, we present an update on the status of PIXIE3D on several fronts. On the physics side, we will describe our progress towards the full Braginskii model, including: electron Hall terms, anisotropic heat conduction, and gyroviscous corrections. Algorithmically, we will discuss progress towards a robust, optimal, nonlinear solver for arbitrary geometries, including preconditioning for the new physical effects described, the implementation of a coarse processor-grid solver (to maintain optimal algorithmic performance for an arbitrarily large number of processors in massively parallel computations), and of a multiblock capability to deal with complicated geometries. L. Chacón, Phys. Plasmas 15, 056103 (2008).

  5. Controlling the numerical Cerenkov instability in PIC simulations using a customized finite difference Maxwell solver and a local FFT based current correction

    DOE PAGES

    Li, Fei; Yu, Peicheng; Xu, Xinlu; ...

    2017-01-12

    In this study we present a customized finite-difference-time-domain (FDTD) Maxwell solver for the particle-in-cell (PIC) algorithm. The solver is customized to effectively eliminate the numerical Cerenkov instability (NCI) which arises when a plasma (neutral or non-neutral) relativistically drifts on a grid when using the PIC algorithm. We control the EM dispersion curve in the direction of the plasma drift of a FDTD Maxwell solver by using a customized higher order finite difference operator for the spatial derivative along the direction of the drift ($\hat{1}$ direction). We show that this eliminates the main NCI modes with moderate $|k_1|$, while keeping additional main NCI modes well outside the range of physical interest with higher $|k_1|$. These main NCI modes can be easily filtered out along with first spatial aliasing NCI modes which are also at the edge of the fundamental Brillouin zone. The customized solver has the possible advantage of improved parallel scalability because it can be easily partitioned along $\hat{1}$, which typically has many more cells than other directions for the problems of interest. We show that FFTs can be performed locally to current on each partition to filter out the main and first spatial aliasing NCI modes, and to correct the current so that it satisfies the continuity equation for the customized spatial derivative. This ensures that Gauss’ Law is satisfied. Lastly, we present simulation examples of one relativistically drifting plasma, of two colliding relativistically drifting plasmas, and of nonlinear laser wakefield acceleration (LWFA) in a Lorentz boosted frame that show no evidence of the NCI can be observed when using this customized Maxwell solver together with its NCI elimination scheme.

  6. Controlling the numerical Cerenkov instability in PIC simulations using a customized finite difference Maxwell solver and a local FFT based current correction

    NASA Astrophysics Data System (ADS)

    Li, Fei; Yu, Peicheng; Xu, Xinlu; Fiuza, Frederico; Decyk, Viktor K.; Dalichaouch, Thamine; Davidson, Asher; Tableman, Adam; An, Weiming; Tsung, Frank S.; Fonseca, Ricardo A.; Lu, Wei; Mori, Warren B.

    2017-05-01

    In this paper we present a customized finite-difference-time-domain (FDTD) Maxwell solver for the particle-in-cell (PIC) algorithm. The solver is customized to effectively eliminate the numerical Cerenkov instability (NCI) which arises when a plasma (neutral or non-neutral) relativistically drifts on a grid when using the PIC algorithm. We control the EM dispersion curve in the direction of the plasma drift of a FDTD Maxwell solver by using a customized higher order finite difference operator for the spatial derivative along the direction of the drift ($\hat{1}$ direction). We show that this eliminates the main NCI modes with moderate $|k_1|$, while keeping additional main NCI modes well outside the range of physical interest with higher $|k_1|$. These main NCI modes can be easily filtered out along with first spatial aliasing NCI modes which are also at the edge of the fundamental Brillouin zone. The customized solver has the possible advantage of improved parallel scalability because it can be easily partitioned along $\hat{1}$, which typically has many more cells than other directions for the problems of interest. We show that FFTs can be performed locally to current on each partition to filter out the main and first spatial aliasing NCI modes, and to correct the current so that it satisfies the continuity equation for the customized spatial derivative. This ensures that Gauss' Law is satisfied. We present simulation examples of one relativistically drifting plasma, of two colliding relativistically drifting plasmas, and of nonlinear laser wakefield acceleration (LWFA) in a Lorentz boosted frame that show no evidence of the NCI can be observed when using this customized Maxwell solver together with its NCI elimination scheme.

  7. Controlling the numerical Cerenkov instability in PIC simulations using a customized finite difference Maxwell solver and a local FFT based current correction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Fei; Yu, Peicheng; Xu, Xinlu

    In this study we present a customized finite-difference-time-domain (FDTD) Maxwell solver for the particle-in-cell (PIC) algorithm. The solver is customized to effectively eliminate the numerical Cerenkov instability (NCI) which arises when a plasma (neutral or non-neutral) relativistically drifts on a grid when using the PIC algorithm. We control the EM dispersion curve in the direction of the plasma drift of a FDTD Maxwell solver by using a customized higher order finite difference operator for the spatial derivative along the direction of the drift ($\hat{1}$ direction). We show that this eliminates the main NCI modes with moderate $|k_1|$, while keeping additional main NCI modes well outside the range of physical interest with higher $|k_1|$. These main NCI modes can be easily filtered out along with first spatial aliasing NCI modes which are also at the edge of the fundamental Brillouin zone. The customized solver has the possible advantage of improved parallel scalability because it can be easily partitioned along $\hat{1}$, which typically has many more cells than other directions for the problems of interest. We show that FFTs can be performed locally to current on each partition to filter out the main and first spatial aliasing NCI modes, and to correct the current so that it satisfies the continuity equation for the customized spatial derivative. This ensures that Gauss’ Law is satisfied. Lastly, we present simulation examples of one relativistically drifting plasma, of two colliding relativistically drifting plasmas, and of nonlinear laser wakefield acceleration (LWFA) in a Lorentz boosted frame that show no evidence of the NCI can be observed when using this customized Maxwell solver together with its NCI elimination scheme.

  8. PyOperators: Operators and solvers for high-performance computing

    NASA Astrophysics Data System (ADS)

    Chanial, P.; Barbey, N.

    2012-12-01

    PyOperators is a publicly available library that provides basic operators and solvers for small-to-very large inverse problems (http://pchanial.github.com/pyoperators). It forms the backbone of the package PySimulators, which implements specific operators to construct an instrument model and means to conveniently represent a map, a timeline or a time-dependent observation (http://pchanial.github.com/pysimulators). Both are part of the Tamasis (Tools for Advanced Map-making, Analysis and SImulations of Submillimeter surveys) toolbox, aiming at providing versatile, reliable, easy-to-use, and optimal map-making tools for Herschel and future generations of sub-mm instruments. The project is a collaboration between 4 institutes (ESO Garching, IAS Orsay, CEA Saclay, Univ. Leiden).

  9. Performance of a cavity-method-based algorithm for the prize-collecting Steiner tree problem on graphs

    NASA Astrophysics Data System (ADS)

    Biazzo, Indaco; Braunstein, Alfredo; Zecchina, Riccardo

    2012-08-01

    We study the behavior of an algorithm derived from the cavity method for the prize-collecting Steiner tree (PCST) problem on graphs. The algorithm is based on the zero temperature limit of the cavity equations and as such is formally simple (a fixed point equation resolved by iteration) and distributed (parallelizable). We provide a detailed comparison with state-of-the-art algorithms on a wide range of existing benchmarks, networks, and random graphs. Specifically, we consider an enhanced derivative of the Goemans-Williamson heuristics and the dhea solver, a branch and cut integer linear programming based approach. The comparison shows that the cavity algorithm outperforms the two algorithms in most large instances both in running time and quality of the solution. Finally, we prove a few optimality properties of the solutions provided by our algorithm, including optimality under the two postprocessing procedures defined in the Goemans-Williamson derivative and global optimality in some limit cases.

  10. Hybrid MPI+OpenMP Programming of an Overset CFD Solver and Performance Investigations

    NASA Technical Reports Server (NTRS)

    Djomehri, M. Jahed; Jin, Haoqiang H.; Biegel, Bryan (Technical Monitor)

    2002-01-01

    This report describes a two-level parallelization of a Computational Fluid Dynamics (CFD) solver with multi-zone overset structured grids. The approach is based on a hybrid MPI+OpenMP programming model suitable for shared memory and clusters of shared memory machines. The performance investigations of the hybrid application on an SGI Origin2000 (O2K) machine are reported using medium and large scale test problems.

  11. A Comparison of the Intellectual Abilities of Good and Poor Problem Solvers: An Exploratory Study.

    ERIC Educational Resources Information Center

    Meyer, Ruth Ann

    This study examined a selected sample of fourth-grade students who had been previously identified as good or poor problem solvers. The pupils were compared on variables considered as "reference tests" for Verbal, Induction, Numerical, Word Fluency, Memory, Spatial Visualization, and Perceptual Speed abilities. The data were compiled to…

  12. An evaluation of solution algorithms and numerical approximation methods for modeling an ion exchange process

    NASA Astrophysics Data System (ADS)

    Bu, Sunyoung; Huang, Jingfang; Boyer, Treavor H.; Miller, Cass T.

    2010-07-01

    The focus of this work is on the modeling of an ion exchange process that occurs in drinking water treatment applications. The model formulation consists of a two-scale model in which a set of microscale diffusion equations representing ion exchange resin particles that vary in size and age are coupled through a boundary condition with a macroscopic ordinary differential equation (ODE), which represents the concentration of a species in a well-mixed reactor. We introduce a new age-averaged model (AAM) that averages all ion exchange particle ages for a given size particle to avoid the expensive Monte-Carlo simulation associated with previous modeling applications. We discuss two different numerical schemes to approximate both the original Monte-Carlo algorithm and the new AAM for this two-scale problem. The first scheme is based on the finite element formulation in space coupled with an existing backward difference formula-based ODE solver in time. The second scheme uses an integral equation based Krylov deferred correction (KDC) method and a fast elliptic solver (FES) for the resulting elliptic equations. Numerical results are presented to validate the new AAM algorithm, which is also shown to be more computationally efficient than the original Monte-Carlo algorithm. We also demonstrate that the higher order KDC scheme is more efficient than the traditional finite element solution approach and this advantage becomes increasingly important as the desired accuracy of the solution increases. We also discuss issues of smoothness, which affect the efficiency of the KDC-FES approach, and outline additional algorithmic changes that would further improve the efficiency of these developing methods for a wide range of applications.

  13. Incompressible Navier-Stokes Solvers in Primitive Variables and their Applications to Steady and Unsteady Flow Simulations

    NASA Technical Reports Server (NTRS)

    Kiris, Cetin C.; Kwak, Dochan; Rogers, Stuart E.

    2002-01-01

    This paper reviews recent progress made in incompressible Navier-Stokes simulation procedures and their application to problems of engineering interest. Discussions are focused on the methods designed for complex geometry applications in three dimensions, and thus are limited to primitive variable formulation. A summary of efforts in flow solver development is given, followed by numerical studies of a few example problems of current interest. Both steady and unsteady solution algorithms and their salient features are discussed. Solvers discussed here are based on a structured-grid approach using either a finite-difference or a finite-volume framework. As a grand-challenge application of these solvers, an unsteady turbopump flow simulation procedure has been developed which utilizes high performance computing platforms. In the paper, the progress toward the complete simulation capability of the turbo-pump for a liquid rocket engine is reported. The Space Shuttle Main Engine (SSME) turbo-pump is used as a test case for evaluation of two parallel computing algorithms that have been implemented in the INS3D code. The relative motion of the grid systems for the rotor-stator interaction was obtained using overset grid techniques. Unsteady computations for the SSME turbo-pump, which contains 114 zones with 34.5 million grid points, are carried out on SGI Origin 3000 systems at NASA Ames Research Center. The same procedure has been extended to the development of the NASA-DeBakey Ventricular Assist Device (VAD), which is based on an axial blood pump. Computational and clinical analyses of this device are presented.

  14. Incompressible SPH (ISPH) with fast Poisson solver on a GPU

    NASA Astrophysics Data System (ADS)

    Chow, Alex D.; Rogers, Benedict D.; Lind, Steven J.; Stansby, Peter K.

    2018-05-01

    This paper presents a fast incompressible SPH (ISPH) solver implemented to run entirely on a graphics processing unit (GPU) capable of simulating several millions of particles in three dimensions on a single GPU. The ISPH algorithm is implemented by converting the highly optimised open-source weakly-compressible SPH (WCSPH) code DualSPHysics to run ISPH on the GPU, combining it with the open-source linear algebra library ViennaCL for fast solutions of the pressure Poisson equation (PPE). Several challenges are addressed with this research: constructing a PPE matrix every timestep on the GPU for moving particles, optimising the limited GPU memory, and exploiting fast matrix solvers. The ISPH pressure projection algorithm is implemented as 4 separate stages, each with a particle sweep, including an algorithm for the population of the PPE matrix suitable for the GPU, and mixed precision storage methods. An accurate and robust ISPH boundary condition ideal for parallel processing is also established by adapting an existing WCSPH boundary condition for ISPH. A variety of validation cases are presented: an impulsively started plate, incompressible flow around a moving square in a box, and dambreaks (2-D and 3-D) which demonstrate the accuracy, flexibility, and speed of the methodology. Fragmentation of the free surface is shown to influence the performance of matrix preconditioners and therefore the PPE matrix solution time. The Jacobi preconditioner demonstrates robustness and reliability in the presence of fragmented flows. For a dambreak simulation, the GPU achieves speedups of up to 10-18 times and 1.1-4.5 times compared to single-threaded and 16-threaded CPU run times, respectively.

  15. A biomolecular electrostatics solver using Python, GPUs and boundary elements that can handle solvent-filled cavities and Stern layers

    NASA Astrophysics Data System (ADS)

    Cooper, Christopher D.; Bardhan, Jaydeep P.; Barba, L. A.

    2014-03-01

    The continuum theory applied to biomolecular electrostatics leads to an implicit-solvent model governed by the Poisson-Boltzmann equation. Solvers relying on a boundary integral representation typically do not consider features like solvent-filled cavities or ion-exclusion (Stern) layers, due to the added difficulty of treating multiple boundary surfaces. This has hindered meaningful comparisons with volume-based methods, and the effects on accuracy of including these features have remained unknown. This work presents a solver called PyGBe that uses a boundary-element formulation and can handle multiple interacting surfaces. It was used to study the effects of solvent-filled cavities and Stern layers on the accuracy of calculating solvation energy and binding energy of proteins, using the well-known APBS finite-difference code for comparison. The results suggest that if required accuracy for an application allows errors larger than about 2% in solvation energy, then the simpler, single-surface model can be used. When calculating binding energies, the need for a multi-surface model is problem-dependent, becoming more critical when ligand and receptor are of comparable size. Comparing with the APBS solver, the boundary-element solver is faster when the accuracy requirements are higher. The cross-over point for the PyGBe code is on the order of 1-2% error, when running on one GPU card (NVIDIA Tesla C2075), compared with APBS running on six Intel Xeon CPU cores. PyGBe achieves algorithmic acceleration of the boundary element method using a treecode, and hardware acceleration using GPUs via PyCuda from a user-visible code that is all Python. The code is open-source under MIT license.

  16. Performance Comparison of a Set of Periodic and Non-Periodic Tridiagonal Solvers on SP2 and Paragon Parallel Computers

    NASA Technical Reports Server (NTRS)

    Sun, Xian-He; Moitra, Stuti

    1996-01-01

    Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely, the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on Intel Paragon and IBM SP2 machines. Measured results are reported in terms of execution time and speedup. Analytical studies are conducted for different communication topologies and for different tridiagonal systems. The measured results match the analytical results closely. In addition to addressing implementation issues, performance considerations such as problem sizes and models of speedup are also discussed.

  17. Some fast elliptic solvers on parallel architectures and their complexities

    NASA Technical Reports Server (NTRS)

    Gallopoulos, E.; Saad, Y.

    1989-01-01

    The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. In this paper, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.
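
    The idea behind cyclic reduction can be sketched on a scalar tridiagonal system, where each unknown is a number rather than a block. The code below is that scalar analogue only, not the BCR variant or the mapping discussed in the paper; the function name `cyclic_reduction` and the convention `a[0] == c[-1] == 0` are assumptions, and no pivoting is done, so a well-conditioned (e.g. diagonally dominant) system is assumed.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[-1] == 0.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Eliminate the even-indexed unknowns from every odd-indexed row.
    # Each odd row is processed independently, which is the parallel part.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = -a[i] / b[i - 1]
        gamma = -c[i] / b[i + 1] if has_next else 0.0
        a_next = a[i + 1] if has_next else 0.0
        c_next = c[i + 1] if has_next else 0.0
        d_next = d[i + 1] if has_next else 0.0
        ra.append(alpha * a[i - 1])
        rb.append(b[i] + alpha * c[i - 1] + gamma * a_next)
        rc.append(gamma * c_next)
        rd.append(d[i] + alpha * d[i - 1] + gamma * d_next)

    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)   # half-size tridiagonal system

    # Back-substitute for the even-indexed unknowns (also independent).
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

    Each level halves the number of unknowns, so the recursion depth grows logarithmically with the system size while the work inside a level can be shared among processors.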

  18. Some fast elliptic solvers on parallel architectures and their complexities

    NASA Technical Reports Server (NTRS)

    Gallopoulos, E.; Saad, Youcef

    1989-01-01

    The discretization of separable elliptic partial differential equations leads to linear systems with special block tridiagonal matrices. Several methods are known to solve these systems, the most general of which is the Block Cyclic Reduction (BCR) algorithm which handles equations with nonconstant coefficients. A method was recently proposed to parallelize and vectorize BCR. Here, the mapping of BCR on distributed memory architectures is discussed, and its complexity is compared with that of other approaches, including the Alternating-Direction method. A fast parallel solver is also described, based on an explicit formula for the solution, which has parallel computational complexity lower than that of parallel BCR.

  19. ALPS: A Linear Program Solver

    NASA Technical Reports Server (NTRS)

    Ferencz, Donald C.; Viterna, Larry A.

    1991-01-01

    ALPS is a computer program which can be used to solve general linear program (optimization) problems. ALPS was designed for those who have minimal linear programming (LP) knowledge and features a menu-driven scheme to guide the user through the process of creating and solving LP formulations. Once created, the problems can be edited and stored in standard DOS ASCII files to provide portability to various word processors or even other linear programming packages. Unlike many math-oriented LP solvers, ALPS contains an LP parser that reads through the LP formulation and reports several types of errors to the user. ALPS provides a large amount of solution data which is often useful in problem solving. In addition to pure linear programs, ALPS can solve for integer, mixed integer, and binary type problems. Pure linear programs are solved with the revised simplex method. Integer or mixed integer programs are solved initially with the revised simplex method, and then completed using the branch-and-bound technique. Binary programs are solved with the method of implicit enumeration. This manual describes how to use ALPS to create, edit, and solve linear programming problems. Instructions for installing ALPS on a PC compatible computer are included in the appendices along with a general introduction to linear programming. A programmer's guide is also included for assistance in modifying and maintaining the program.
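
    To make the binary-program case concrete, the sketch below solves a tiny 0/1 problem by exhaustive enumeration. This is a deliberately simpler stand-in for the implicit enumeration mentioned in the record above (implicit enumeration prunes partial assignments instead of visiting all of them), and it has nothing to do with the ALPS code itself; the function name and the input format are hypothetical.

```python
from itertools import product


def solve_binary_program(costs, constraints):
    """Maximize sum(costs[j] * x[j]) over x in {0, 1}^n.

    `constraints` is a list of (row, bound) pairs meaning
    sum(row[j] * x[j]) <= bound. Every one of the 2^n assignments is
    visited, so this is only sensible for very small n.
    """
    best_value, best_x = None, None
    for x in product((0, 1), repeat=len(costs)):
        feasible = all(
            sum(r * xi for r, xi in zip(row, x)) <= bound
            for row, bound in constraints
        )
        if not feasible:
            continue
        value = sum(cj * xi for cj, xi in zip(costs, x))
        if best_value is None or value > best_value:
            best_value, best_x = value, x
    return best_value, best_x


# Illustrative knapsack-style instance (made-up numbers):
# item values, item weights, and a weight capacity of 7.
values = [10, 13, 7, 8]
weights = [3, 4, 2, 3]
best_value, best_x = solve_binary_program(values, [(weights, 7)])
```

    An implicit-enumeration solver would reach the same optimum while skipping branches whose partial assignment is already infeasible or cannot beat the incumbent.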

  20. GPU accelerated FDTD solver and its application in MRI.

    PubMed

    Chi, J; Liu, F; Jin, J; Mason, D G; Crozier, S

    2010-01-01

    The finite difference time domain (FDTD) method is a popular technique for computational electromagnetics (CEM). The large computational power often required, however, has been a limiting factor for its applications. In this paper, we will present a graphics processing unit (GPU)-based parallel FDTD solver and its successful application to the investigation of a novel B1 shimming scheme for high-field magnetic resonance imaging (MRI). The optimized shimming scheme exhibits considerably improved transmit B(1) profiles. The GPU implementation dramatically shortened the runtime of FDTD simulation of electromagnetic field compared with its CPU counterpart. The acceleration in runtime has made such investigation possible, and will pave the way for other studies of large-scale computational electromagnetic problems in modern MRI which were previously impractical.
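
    The core of any FDTD code is a leapfrog update of staggered electric and magnetic fields. A minimal one-dimensional sketch in normalized units is given below; it is not the GPU solver described in the record above, and the grid size, source shape, and Courant number are illustrative assumptions. Each cell's update depends only on its immediate neighbours, which is what makes the method map well onto GPU threads.

```python
import math


def fdtd_1d(n_cells, n_steps, source_index, courant=0.5):
    """Run a 1D FDTD (Yee) leapfrog update in normalized units.

    ez lives on integer grid points, hy on the half points between them.
    The two end points of ez are never updated, so they act as perfectly
    conducting walls. Stability requires courant <= 1.
    """
    ez = [0.0] * n_cells
    hy = [0.0] * (n_cells - 1)
    for t in range(n_steps):
        # Update H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Update interior E from the spatial difference of H.
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Soft Gaussian source added at one grid point (assumed shape).
        ez[source_index] += math.exp(-(((t - 30.0) / 10.0) ** 2))
    return ez, hy


ez, hy = fdtd_1d(n_cells=200, n_steps=50, source_index=100)
```

    On a GPU the two inner loops become kernels with one thread per cell; the time loop remains sequential.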

  1. Notes on the ExactPack Implementation of the DSD Explosive Arc Solver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaul, Ann; Doebling, Scott William

    It has been shown above that the discretization scheme implemented in the ExactPack solver for the DSD Explosive Arc equation is consistent with the Explosive Arc PDE. In addition, a stability analysis has provided a CFL condition for a stable time step. Together, consistency and stability imply convergence of the scheme, which is expected to be close to first-order in time and second-order in space. It is understood that the nonlinearity of the underlying PDE will affect this rate somewhat.
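
    A convergence rate such as the one expected above is commonly checked by comparing errors on two grids. The helper below is a generic sketch, not part of ExactPack; the function name is hypothetical and the error values in the usage lines are made-up numbers chosen only to illustrate the arithmetic.

```python
import math


def observed_order(err_coarse, err_fine, refinement_ratio=2.0):
    """Estimate the observed order of accuracy p from two error norms.

    Assumes err ~ C * h**p, so that
    p = log(err_coarse / err_fine) / log(refinement_ratio).
    """
    return math.log(err_coarse / err_fine) / math.log(refinement_ratio)


# Illustrative, made-up error values (not taken from the report):
p_space = observed_order(4.0e-3, 1.0e-3)  # error drops 4x when h is halved
p_time = observed_order(2.0e-3, 1.0e-3)   # error drops 2x when dt is halved
```

    For a nonlinear PDE the measured values would be expected to deviate somewhat from the integers 2 and 1, as the abstract notes.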

  2. An approximate Riemann solver for thermal and chemical nonequilibrium flows

    NASA Technical Reports Server (NTRS)

    Prabhu, Ramadas K.

    1994-01-01

    Among the many methods available for the determination of inviscid fluxes across a surface of discontinuity, the flux-difference-splitting technique that employs Roe-averaged variables has been used extensively by the CFD community because of its simplicity and its ability to capture shocks exactly. This method, originally developed for perfect gas flows, has since been extended to equilibrium as well as nonequilibrium flows. Determination of the Roe-averaged variables for the case of a perfect gas flow is a simple task; however, for thermal and chemical nonequilibrium flows, some of the variables are not uniquely defined. Methods available in the literature to determine these variables seem to lack sound bases. The present paper describes a simple, yet accurate, method to determine all the variables for nonequilibrium flows in the Roe-average state. The basis for this method is the requirement that the Roe-averaged variables form a consistent set of thermodynamic variables. The present method satisfies the requirement that the square of the speed of sound be positive.
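
    For the perfect-gas case that the abstract calls a simple task, the Roe-averaged state follows from square-root-of-density weighting. The sketch below covers only that case in one dimension; it does not attempt the nonequilibrium extension proposed in the paper, and the function name and argument order are assumptions.

```python
import math


def roe_average(rho_l, u_l, p_l, rho_r, u_r, p_r, gamma=1.4):
    """Roe-averaged density, velocity, total enthalpy and sound speed squared
    for a 1D calorically perfect gas with ratio of specific heats gamma."""
    w_l, w_r = math.sqrt(rho_l), math.sqrt(rho_r)

    # Total specific enthalpy H = gamma/(gamma-1) * p/rho + u^2/2 on each side.
    h_l = gamma / (gamma - 1.0) * p_l / rho_l + 0.5 * u_l * u_l
    h_r = gamma / (gamma - 1.0) * p_r / rho_r + 0.5 * u_r * u_r

    rho = w_l * w_r
    u = (w_l * u_l + w_r * u_r) / (w_l + w_r)
    h = (w_l * h_l + w_r * h_r) / (w_l + w_r)

    # Sound speed squared evaluated consistently from the averaged H and u.
    a2 = (gamma - 1.0) * (h - 0.5 * u * u)
    return rho, u, h, a2
```

    Here the averaged sound speed is defined by the averaged enthalpy and velocity; in the nonequilibrium case additional averaged quantities are needed and are not uniquely determined, which is the difficulty the record addresses.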

  3. A Polyhedral Outer-approximation, Dynamic-discretization optimization solver, 1.x

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bent, Russell; Nagarajan, Harsha; Sundar, Kaarthik

    2017-09-25

    In this software, we implement an adaptive, multivariate partitioning algorithm for solving mixed-integer nonlinear programs (MINLP) to global optimality. The algorithm combines ideas that exploit the structure of convex relaxations to MINLPs and bound tightening procedures

  4. Energy consumption optimization of the total-FETI solver by changing the CPU frequency

    NASA Astrophysics Data System (ADS)

    Horak, David; Riha, Lubomir; Sojka, Radim; Kruzik, Jakub; Beseda, Martin; Cermak, Martin; Schuchart, Joseph

    2017-07-01

    The energy consumption of supercomputers is one of the critical problems for the upcoming Exascale supercomputing era. The awareness of power and energy consumption is required on both the software and hardware sides. This paper deals with the energy consumption evaluation of the Finite Element Tearing and Interconnect (FETI) based solvers of linear systems, which is an established method for solving real-world engineering problems. We have evaluated the effect of the CPU frequency on the energy consumption of the FETI solver using a linear elasticity 3D cube synthetic benchmark. In this problem, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the FETI method. The paper provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning experiments, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during the program execution to adapt the system to the actual needs of the application. The paper shows that static tuning brings up to 12% energy savings when compared to default CPU settings (the highest clock rate). The dynamic tuning improves this further by up to 3%.

  5. ASTROP2-LE: A Mistuned Aeroelastic Analysis System Based on a Two Dimensional Linearized Euler Solver

    NASA Technical Reports Server (NTRS)

    Reddy, T. S. R.; Srivastava, R.; Mehmed, Oral

    2002-01-01

    An aeroelastic analysis system for flutter and forced response analysis of turbomachines based on a two-dimensional linearized unsteady Euler solver has been developed. The ASTROP2 code, an aeroelastic stability analysis program for turbomachinery, was used as a basis for this development. The ASTROP2 code uses strip theory to couple a two-dimensional aerodynamic model with a three-dimensional structural model. The code was modified to include forced response capability. The formulation was also modified to include aeroelastic analysis with mistuning. A linearized unsteady Euler solver, LINFLX2D, is added to model the unsteady aerodynamics in ASTROP2. By calculating the unsteady aerodynamic loads using LINFLX2D, it is possible to include the effects of transonic flow on flutter and forced response in the analysis. The stability is inferred from an eigenvalue analysis. The revised code, ASTROP2-LE (ASTROP2 code using Linearized Euler aerodynamics), is validated by comparing the predictions with those obtained using linear unsteady aerodynamic solutions.

  6. An approximate viscous shock layer technique for calculating chemically reacting hypersonic flows about blunt-nosed bodies

    NASA Technical Reports Server (NTRS)

    Cheatwood, F. Mcneil; Dejarnette, Fred R.

    1991-01-01

    An approximate axisymmetric method was developed which can reliably calculate fully viscous hypersonic flows over blunt nosed bodies. By substituting Maslen's second order pressure expression for the normal momentum equation, a simplified form of the viscous shock layer (VSL) equations is obtained. This approach can solve both the subsonic and supersonic regions of the shock layer without a starting solution for the shock shape. The approach is applicable to perfect gas, equilibrium, and nonequilibrium flowfields. Since the method is fully viscous, the problems associated with a boundary layer solution with an inviscid layer solution are avoided. This procedure is significantly faster than the parabolized Navier-Stokes (PNS) or VSL solvers and would be useful in a preliminary design environment. Problems associated with a previously developed approximate VSL technique are addressed before extending the method to nonequilibrium calculations. Perfect gas (laminar and turbulent), equilibrium, and nonequilibrium solutions were generated for airflows over several analytic body shapes. Surface heat transfer, skin friction, and pressure predictions are comparable to VSL results. In addition, computed heating rates are in good agreement with experimental data. The present technique generates its own shock shape as part of its solution, and therefore could be used to provide more accurate initial shock shapes for higher order procedures which require starting solutions.

  7. A 3D Unstructured Mesh Euler Solver Based on the Fourth-Order CESE Method

    DTIC Science & Technology

    2013-06-01

    David L. Bilyeu ... Similarly, the fluxes, $f^{x,y,z}_i$, and their derivatives inside a SE are also discretized by the Taylor series expansion: $\partial^C f^{x,y,z}_i / (\partial x^I \partial y^J \partial z^K \partial t^L)$ = A

  8. ROMI 3.1 Least-cost lumber grade mix solver using open source statistical software

    Treesearch

    Rebecca A. Buck; Urs Buehlmann; R. Edward Thomas

    2010-01-01

    The least-cost lumber grade mix solution has been a topic of interest to both industry and academia for many years due to its potential to help wood processing operations reduce costs. A least-cost lumber grade mix solver is a rough mill decision support system that describes the lumber grade or grade mix needed to minimize raw material or total production cost (raw...

  9. Multiscale solvers and systematic upscaling in computational physics

    NASA Astrophysics Data System (ADS)

    Brandt, A.

    2005-07-01

    Multiscale algorithms can overcome the scale-born bottlenecks that plague most computations in physics. These algorithms employ separate processing at each scale of the physical space, combined with interscale iterative interactions, in ways which use finer scales very sparingly. Having been developed first and well known as multigrid solvers for partial differential equations, highly efficient multiscale techniques have more recently been developed for many other types of computational tasks, including: inverse PDE problems; highly indefinite (e.g., standing wave) equations; Dirac equations in disordered gauge fields; fast computation and updating of large determinants (as needed in QCD); fast integral transforms; integral equations; astrophysics; molecular dynamics of macromolecules and fluids; many-atom electronic structures; global and discrete-state optimization; practical graph problems; image segmentation and recognition; tomography (medical imaging); fast Monte-Carlo sampling in statistical physics; and general, systematic methods of upscaling (accurate numerical derivation of large-scale equations from microscopic laws).

  10. Application of a Scalable, Parallel, Unstructured-Grid-Based Navier-Stokes Solver

    NASA Technical Reports Server (NTRS)

    Parikh, Paresh

    2001-01-01

    A parallel version of an unstructured-grid based Navier-Stokes solver, USM3Dns, previously developed for efficient operation on a variety of parallel computers, has been enhanced to incorporate upgrades made to the serial version. The resultant parallel code has been extensively tested on a variety of problems of aerospace interest and on two sets of parallel computers to understand and document its characteristics. An innovative grid renumbering construct and use of non-blocking communication are shown to produce superlinear computing performance. Preliminary results from parallelization of a recently introduced "porous surface" boundary condition are also presented.

  11. On the Performance of an Algebraic Multigrid Solver on Multicore Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Baker, A H; Schulz, M; Yang, U M

    2010-04-29

    Algebraic multigrid (AMG) solvers have proven to be extremely efficient on distributed-memory architectures. However, when executed on modern multicore cluster architectures, we face new challenges that can significantly harm AMG's performance. We discuss our experiences on such an architecture and present a set of techniques that help users to overcome the associated problems, including thread and process pinning and correct memory associations. We have implemented most of the techniques in a MultiCore SUPport library (MCSup), which helps to map OpenMP applications to multicore machines. We present results using both an MPI-only and a hybrid MPI/OpenMP model.

  12. High-resolution numerical approximation of traffic flow problems with variable lanes and free-flow velocities.

    PubMed

    Zhang, Peng; Liu, Ru-Xun; Wong, S C

    2005-05-01

    This paper develops macroscopic traffic flow models for a highway section with variable lanes and free-flow velocities, that involve spatially varying flux functions. To address this complex physical property, we develop a Riemann solver that derives the exact flux values at the interface of the Riemann problem. Based on this solver, we formulate Godunov-type numerical schemes to solve the traffic flow models. Numerical examples that simulate the traffic flow around a bottleneck that arises from a drop in traffic capacity on the highway section are given to illustrate the efficiency of these schemes.
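
    A Godunov-type interface flux for a first-order traffic model can be written in demand-supply form. The sketch below assumes a Greenshields flux f(rho) = v_free * rho * (1 - rho / rho_max) and lets the two cells on either side of an interface carry different (v_free, rho_max) pairs to mimic a lane drop; this is a commonly used simplification, not necessarily the exact Riemann solver of the paper, and all names and parameter values are illustrative.

```python
def greenshields_flux(rho, v_free, rho_max):
    """Concave flux with maximum (capacity) at rho_max / 2."""
    return v_free * rho * (1.0 - rho / rho_max)


def demand(rho, v_free, rho_max):
    """Largest flow the upstream cell can send."""
    return greenshields_flux(min(rho, 0.5 * rho_max), v_free, rho_max)


def supply(rho, v_free, rho_max):
    """Largest flow the downstream cell can receive."""
    return greenshields_flux(max(rho, 0.5 * rho_max), v_free, rho_max)


def godunov_flux(rho_l, rho_r, left=(1.0, 1.0), right=(1.0, 1.0)):
    """Interface flux as min(upstream demand, downstream supply).

    `left` and `right` are (v_free, rho_max) pairs for the two cells.
    """
    return min(demand(rho_l, *left), supply(rho_r, *right))


# Bottleneck example: the downstream cell has half the jam density, so its
# capacity limits the flow even though the downstream road is empty.
bottleneck_flux = godunov_flux(0.5, 0.0, left=(1.0, 1.0), right=(1.0, 0.5))
```

    A conservative update then reads rho_i_new = rho_i - (dt / dx) * (F_right - F_left) with these interface fluxes, subject to the usual CFL restriction.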

  13. A resistive magnetohydrodynamics solver using modern C++ and the Boost library

    NASA Astrophysics Data System (ADS)

    Einkemmer, Lukas

    2016-09-01

    In this paper we describe the implementation of our C++ resistive magnetohydrodynamics solver. The framework developed facilitates the separation of the code implementing the specific numerical method and the physical model from the handling of boundary conditions and the management of the computational domain. In particular, this will allow us to use finite difference stencils which are only defined in the interior of the domain (the boundary conditions are handled automatically). We will discuss this and other design considerations and their impact on performance in some detail. In addition, we provide documentation of the code developed and demonstrate that a performance comparable to Fortran can be achieved, while still maintaining a maximum of code readability and extensibility. Catalogue identifier: AFAH_v1_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AFAH_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 592774 No. of bytes in distributed program, including test data, etc.: 43771395 Distribution format: tar.gz Programming language: C++03. Computer: PC, HPC systems. Operating system: POSIX compatible (extensively tested on various Linux systems). In fact only the timing class requires POSIX routines; all other parts of the program can be run on any system where a C++ compiler, Boost, CVODE, and an implementation of BLAS are available. RAM: Hundreds of Kilobytes to Gigabytes (depending on the problem size) Classification: 19.10, 4.3. External routines: Boost, CVODE, either a BLAS library or Intel MKL Nature of problem: An approximate solution to the equations of resistive magnetohydrodynamics for a given initial value and given boundary conditions is computed.
Solution method: The discretization is performed using a finite difference approximation in

  14. Adaptive multi-resolution 3D Hartree-Fock-Bogoliubov solver for nuclear structure

    NASA Astrophysics Data System (ADS)

    Pei, J. C.; Fann, G. I.; Harrison, R. J.; Nazarewicz, W.; Shi, Yue; Thornton, S.

    2014-08-01

    Background: Complex many-body systems, such as triaxial and reflection-asymmetric nuclei, weakly bound halo states, cluster configurations, nuclear fragments produced in heavy-ion fusion reactions, cold Fermi gases, and pasta phases in neutron star crust, are all characterized by large sizes and complex topologies in which many geometrical symmetries characteristic of ground-state configurations are broken. A tool of choice to study such complex forms of matter is an adaptive multi-resolution wavelet analysis. This method has generated much excitement since it provides a common framework linking many diversified methodologies across different fields, including signal processing, data compression, harmonic analysis and operator theory, fractals, and quantum field theory. Purpose: To describe complex superfluid many-fermion systems, we introduce an adaptive pseudospectral method for solving self-consistent equations of nuclear density functional theory in three dimensions, without symmetry restrictions. Methods: The numerical method is based on the multi-resolution and computational harmonic analysis techniques with a multi-wavelet basis. The application of state-of-the-art parallel programming techniques includes sophisticated object-oriented templates which parse the high-level code into distributed parallel tasks with a multi-thread task queue scheduler for each multi-core node. The internode communications are asynchronous. The algorithm is variational and is capable of solving coupled complex-geometric systems of equations adaptively, with functional and boundary constraints, in a finite spatial domain of very large size, limited by existing parallel computer memory. For smooth functions, user-defined finite precision is guaranteed.
Results: The new adaptive multi-resolution Hartree-Fock-Bogoliubov (HFB) solver madness-hfb is benchmarked against a two-dimensional coordinate-space solver hfb-ax that is based on the B-spline technique and a three-dimensional solver

  15. New algorithms for field-theoretic block copolymer simulations: Progress on using adaptive-mesh refinement and sparse matrix solvers in SCFT calculations

    NASA Astrophysics Data System (ADS)

    Sides, Scott; Jamroz, Ben; Crockett, Robert; Pletzer, Alexander

    2012-02-01

    Self-consistent field theory (SCFT) for dense polymer melts has been highly successful in describing complex morphologies in block copolymers. Field-theoretic simulations such as these are able to access large length and time scales that are difficult or impossible for particle-based simulations such as molecular dynamics. The modified diffusion equations that arise as a consequence of the coarse-graining procedure in the SCF theory can be efficiently solved with a pseudo-spectral (PS) method that uses fast Fourier transforms on uniform Cartesian grids. However, PS methods can be difficult to apply in many block copolymer SCFT simulations (e.g., confinement, interface adsorption) in which small spatial regions might require finer resolution than most of the simulation grid. Progress on using new solver algorithms to address these problems will be presented. The Tech-X Chompst project aims at marrying the best of adaptive mesh refinement with linear matrix solver algorithms. The Tech-X code PolySwift++ is an SCFT simulation platform that leverages ongoing development in coupling Chombo, a package for solving PDEs via block-structured AMR calculations and embedded boundaries, with PETSc, a toolkit that includes a large assortment of sparse linear solvers.

  16. Modeling of photon migration in the human lung using a finite volume solver

    NASA Astrophysics Data System (ADS)

    Sikorski, Zbigniew; Furmanczyk, Michal; Przekwas, Andrzej J.

    2006-02-01

    The application of the frequency domain and steady-state diffusive optical spectroscopy (DOS) and steady-state near infrared spectroscopy (NIRS) to diagnosis of the human lung injury challenges many elements of these techniques. These include the DOS/NIRS instrument performance and accurate models of light transport in heterogeneous thorax tissue. The thorax tissue not only consists of different media (e.g. chest wall with ribs, lungs) but its optical properties also vary with time due to respiration and changes in thorax geometry with contusion (e.g. pneumothorax or hemothorax). This paper presents a finite volume solver developed to model photon migration in the diffusion approximation in heterogeneous complex 3D tissues. The code applies boundary conditions that account for Fresnel reflections. We propose an effective diffusion coefficient for the void volumes (pneumothorax) based on the assumption of the Lambertian diffusion of photons entering the pleural cavity and accounting for the local pleural cavity thickness. The code has been validated using the MCML Monte Carlo code as a benchmark. The code environment enables a semi-automatic preparation of 3D computational geometry from medical images and its rapid automatic meshing. We present the application of the code to analysis/optimization of the hybrid DOS/NIRS/ultrasound technique in which ultrasound provides data on the localization of thorax tissue boundaries. The code effectiveness (3D complex case computation takes 1 second) enables its use to quantitatively relate detected light signal to absorption and reduced scattering coefficients that are indicators of the pulmonary physiologic state (hemoglobin concentration and oxygenation).

  17. Multigrid Techniques for Highly Indefinite Equations

    NASA Technical Reports Server (NTRS)

    Shapira, Yair

    1996-01-01

    A multigrid method for the solution of finite difference approximations of elliptic PDEs is introduced. A parallelizable version of it, suitable for two-level and multilevel analysis, is also defined, and serves as a theoretical tool for deriving a suitable implementation for the main version. For indefinite Helmholtz equations, this analysis provides a suitable mesh size for the coarsest grid used. Numerical experiments show that the method is applicable to diffusion equations with discontinuous coefficients and highly indefinite Helmholtz equations.
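
    As background for the multigrid records in this listing, the sketch below shows a textbook geometric V-cycle for the 1D Poisson problem -u'' = f with zero Dirichlet boundaries. It is not the method of the record above (which targets discontinuous coefficients and indefinite Helmholtz problems); the smoother, transfer operators, and function names are standard choices assumed here for illustration. The grid must have 2^k + 1 points including the two boundary points.

```python
import math


def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps for (2u_i - u_{i-1} - u_{i+1}) / h^2 = f_i."""
    for _ in range(sweeps):
        new = u[:]
        for i in range(1, len(u) - 1):
            jacobi = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
            new[i] = (1.0 - omega) * u[i] + omega * jacobi
        u = new
    return u


def residual(u, f, h):
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r


def restrict(r):
    """Full-weighting restriction to the grid with twice the spacing."""
    nc = (len(r) - 1) // 2 + 1
    rc = [0.0] * nc
    for j in range(1, nc - 1):
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    return rc


def prolong(ec, n_fine):
    """Linear interpolation of a coarse-grid correction to the fine grid."""
    e = [0.0] * n_fine
    for j in range(len(ec)):
        e[2 * j] = ec[j]
    for j in range(len(ec) - 1):
        e[2 * j + 1] = 0.5 * (ec[j] + ec[j + 1])
    return e


def v_cycle(u, f, h):
    if len(u) <= 3:
        # Coarsest grid: a single interior unknown, solved exactly.
        u = u[:]
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])
        return u
    u = smooth(u, f, h, sweeps=2)                    # pre-smoothing
    rc = restrict(residual(u, f, h))                 # coarse-grid residual
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h)       # coarse-grid correction
    u = [ui + ei for ui, ei in zip(u, prolong(ec, len(u)))]
    return smooth(u, f, h, sweeps=2)                 # post-smoothing


# Usage: f = pi^2 sin(pi x) has the exact solution u = sin(pi x).
n = 64
h = 1.0 / n
xs = [i * h for i in range(n + 1)]
f = [math.pi ** 2 * math.sin(math.pi * x) for x in xs]
u = [0.0] * (n + 1)
r0 = max(abs(v) for v in residual(u, f, h))
for _ in range(10):
    u = v_cycle(u, f, h)
r_final = max(abs(v) for v in residual(u, f, h))
max_error = max(abs(ui - math.sin(math.pi * x)) for ui, x in zip(u, xs))
```

    For indefinite problems such as Helmholtz equations, this plain scheme generally fails unless the coarsest grid still resolves the wavelength, which is the kind of restriction the abstract's coarsest-grid analysis refers to.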

  18. A high-order semi-explicit discontinuous Galerkin solver for 3D incompressible flow with application to DNS and LES of turbulent channel flow

    NASA Astrophysics Data System (ADS)

    Krank, Benjamin; Fehn, Niklas; Wall, Wolfgang A.; Kronbichler, Martin

    2017-11-01

    We present an efficient discontinuous Galerkin scheme for simulation of the incompressible Navier-Stokes equations including laminar and turbulent flow. We consider a semi-explicit high-order velocity-correction method for time integration as well as nodal equal-order discretizations for velocity and pressure. The non-linear convective term is treated explicitly while a linear system is solved for the pressure Poisson equation and the viscous term. The key feature of our solver is a consistent penalty term reducing the local divergence error in order to overcome recently reported instabilities in spatially under-resolved high-Reynolds-number flows as well as small time steps. This penalty method is similar to the grad-div stabilization widely used in continuous finite elements. We further review and compare our method to several other techniques recently proposed in literature to stabilize the method for such flow configurations. The solver is specifically designed for large-scale computations through matrix-free linear solvers including efficient preconditioning strategies and tensor-product elements, which have allowed us to scale this code up to 34.4 billion degrees of freedom and 147,456 CPU cores. We validate our code and demonstrate optimal convergence rates with laminar flows present in a vortex problem and flow past a cylinder and show applicability of our solver to direct numerical simulation as well as implicit large-eddy simulation of turbulent channel flow at Reτ = 180 as well as 590.

  19. Anatomically accurate high resolution modeling of human whole heart electromechanics: A strongly scalable algebraic multigrid solver method for nonlinear deformation

    NASA Astrophysics Data System (ADS)

    Augustin, Christoph M.; Neic, Aurel; Liebmann, Manfred; Prassl, Anton J.; Niederer, Steven A.; Haase, Gundolf; Plank, Gernot

    2016-01-01

    Electromechanical (EM) models of the heart have been used successfully to study fundamental mechanisms underlying a heart beat in health and disease. However, in all modeling studies reported so far numerous simplifications were made in terms of representing biophysical details of cellular function and its heterogeneity, gross anatomy and tissue microstructure, as well as the bidirectional coupling between electrophysiology (EP) and tissue distension. One limiting factor is the employed spatial discretization methods which are not sufficiently flexible to accommodate complex geometries or resolve heterogeneities, but, even more importantly, the limited efficiency of the prevailing solver techniques which is not sufficiently scalable to deal with the incurring increase in degrees of freedom (DOF) when modeling cardiac electromechanics at high spatio-temporal resolution. This study reports on the development of a novel methodology for solving the nonlinear equation of finite elasticity using human whole organ models of cardiac electromechanics, discretized at a high para-cellular resolution. Three patient-specific, anatomically accurate, whole heart EM models were reconstructed from magnetic resonance (MR) scans at resolutions of 220 μm, 440 μm and 880 μm, yielding meshes of approximately 184.6, 24.4 and 3.7 million tetrahedral elements and 95.9, 13.2 and 2.1 million displacement DOF, respectively. The same mesh was used for discretizing the governing equations of both electrophysiology (EP) and nonlinear elasticity. A novel algebraic multigrid (AMG) preconditioner for an iterative Krylov solver was developed to deal with the resulting computational load. The AMG preconditioner was designed under the primary objective of achieving favorable strong scaling characteristics for both setup and solution runtimes, as this is key for exploiting current high performance computing hardware. Benchmark results using the 220 μm, 440 μm and 880 μm meshes demonstrate

  20. Anatomically accurate high resolution modeling of human whole heart electromechanics: A strongly scalable algebraic multigrid solver method for nonlinear deformation

    PubMed Central

    Augustin, Christoph M.; Neic, Aurel; Liebmann, Manfred; Prassl, Anton J.; Niederer, Steven A.; Haase, Gundolf; Plank, Gernot

    2016-01-01

    Electromechanical (EM) models of the heart have been used successfully to study fundamental mechanisms underlying a heart beat in health and disease. However, in all modeling studies reported so far numerous simplifications were made in terms of representing biophysical details of cellular function and its heterogeneity, gross anatomy and tissue microstructure, as well as the bidirectional coupling between electrophysiology (EP) and tissue distension. One limiting factor is the employed spatial discretization methods which are not sufficiently flexible to accommodate complex geometries or resolve heterogeneities, but, even more importantly, the limited efficiency of the prevailing solver techniques which are not sufficiently scalable to deal with the incurring increase in degrees of freedom (DOF) when modeling cardiac electromechanics at high spatio-temporal resolution. This study reports on the development of a novel methodology for solving the nonlinear equation of finite elasticity using human whole organ models of cardiac electromechanics, discretized at a high para-cellular resolution. Three patient-specific, anatomically accurate, whole heart EM models were reconstructed from magnetic resonance (MR) scans at resolutions of 220 μm, 440 μm and 880 μm, yielding meshes of approximately 184.6, 24.4 and 3.7 million tetrahedral elements and 95.9, 13.2 and 2.1 million displacement DOF, respectively. The same mesh was used for discretizing the governing equations of both electrophysiology (EP) and nonlinear elasticity. A novel algebraic multigrid (AMG) preconditioner for an iterative Krylov solver was developed to deal with the resulting computational load. The AMG preconditioner was designed under the primary objective of achieving favorable strong scaling characteristics for both setup and solution runtimes, as this is key for exploiting current high performance computing hardware. Benchmark results using the 220 μm, 440 μm and 880 μm meshes demonstrate

  1. Cascade Optimization for Aircraft Engines With Regression and Neural Network Analysis - Approximators

    NASA Technical Reports Server (NTRS)

    Patnaik, Surya N.; Guptill, James D.; Hopkins, Dale A.; Lavelle, Thomas M.

    2000-01-01

    The NASA Engine Performance Program (NEPP) can configure and analyze almost any type of gas turbine engine that can be generated through the interconnection of a set of standard physical components. In addition, the code can optimize engine performance by changing adjustable variables under a set of constraints. However, for engine cycle problems at certain operating points, the NEPP code can encounter difficulties: nonconvergence in the currently implemented Powell's optimization algorithm and deficiencies in the Newton-Raphson solver during engine balancing. A project was undertaken to correct these deficiencies. Nonconvergence was avoided through a cascade optimization strategy, and deficiencies associated with engine balancing were eliminated through neural network and linear regression methods. An approximation-interspersed cascade strategy was used to optimize the engine's operation over its flight envelope. Replacement of Powell's algorithm by the cascade strategy improved the optimization segment of the NEPP code. The performance of the linear regression and neural network methods as alternative engine analyzers was found to be satisfactory. This report considers two examples, a supersonic mixed-flow turbofan engine and a subsonic waverotor-topped engine, to illustrate the results, and it discusses insights gained from the improved version of the NEPP code.

  2. Predictions of a Supersonic Jet-in-Crossflow: Comparisons Among CFD Solvers and with Experiment

    DTIC Science & Technology

    2014-09-01

    The transverse supersonic jet was produced using a converging-diverging nozzle with a design Mach number of 3.73, a conical expansion section half...J. F., and Erven, R. J., “Flow Separation Inside a Supersonic Nozzle Exhausting into a Subsonic Compressible Crossflow,” Journal of Propulsion and...Predictions of a Supersonic Jet-in-Crossflow: Comparisons Among CFD Solvers and with Experiment by James DeSpirito, Kevin D Kennedy, Clark

  3. Fast and accurate implementation of Fourier spectral approximations of nonlocal diffusion operators and its applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Du, Qiang; Yang, Jiang

    This work is concerned with the Fourier spectral approximation of various integral differential equations associated with some linear nonlocal diffusion and peridynamic operators under periodic boundary conditions. For radially symmetric kernels, the nonlocal operators under consideration are diagonalizable in the Fourier space so that the main computational challenge is on the accurate and fast evaluation of their eigenvalues or Fourier symbols consisting of possibly singular and highly oscillatory integrals. For a large class of fractional power-like kernels, we propose a new approach based on reformulating the Fourier symbols both as coefficients of a series expansion and solutions of some simple ODE models. We then propose a hybrid algorithm that utilizes both truncated series expansions and high order Runge–Kutta ODE solvers to provide fast evaluation of Fourier symbols in both one and higher dimensional spaces. It is shown that this hybrid algorithm is robust, efficient and accurate. As applications, we combine this hybrid spectral discretization in the spatial variables and the fourth-order exponential time differencing Runge–Kutta for temporal discretization to offer high order approximations of some nonlocal gradient dynamics including nonlocal Allen–Cahn equations, nonlocal Cahn–Hilliard equations, and nonlocal phase-field crystal models. Numerical results show the accuracy and effectiveness of the fully discrete scheme and illustrate some interesting phenomena associated with the nonlocal models.

  4. An Approximate Axisymmetric Viscous Shock Layer Aeroheating Method for Three-Dimensional Bodies

    NASA Technical Reports Server (NTRS)

    Brykina, Irina G.; Scott, Carl D.

    1998-01-01

    A technique is implemented for computing hypersonic aeroheating, shear stress, and other flow properties on the windward side of a three-dimensional (3D) blunt body. The technique uses a 2D/axisymmetric flow solver modified by scale factors for a corresponding equivalent axisymmetric body. Examples are given in which a 2D solver is used to calculate the flow at selected meridional planes on elliptic paraboloids in reentry flight. The report describes the equations and the codes used to convert the body surface parameters into input used to scale the 2D viscous shock layer equations in the axisymmetric viscous shock layer code. Very good agreement is obtained with solutions to finite rate chemistry 3D thin viscous shock layer equations for a finite rate catalytic body.

  5. Simulation of Unsteady Flows Using an Unstructured Navier-Stokes Solver on Moving and Stationary Grids

    NASA Technical Reports Server (NTRS)

    Biedron, Robert T.; Vatsa, Veer N.; Atkins, Harold L.

    2005-01-01

    We apply an unsteady Reynolds-averaged Navier-Stokes (URANS) solver for unstructured grids to unsteady flows on moving and stationary grids. Example problems considered are relevant to active flow control and stability and control. Computational results are presented using the Spalart-Allmaras turbulence model and are compared to experimental data. The effects of grid and time-step refinement are examined.

  6. SPIREs: A Finite-Difference Frequency-Domain electromagnetic solver for inhomogeneous magnetized plasma cylinders

    NASA Astrophysics Data System (ADS)

    Melazzi, D.; Curreli, D.; Manente, M.; Carlsson, J.; Pavarin, D.

    2012-06-01

    We present SPIREs (plaSma Padova Inhomogeneous Radial Electromagnetic solver), a Finite-Difference Frequency-Domain (FDFD) electromagnetic solver in one dimension for the rapid calculation of the electromagnetic fields and the deposited power of a large variety of cylindrical plasma problems. The two Maxwell wave equations have been discretized using a staggered Yee mesh along the radial direction of the cylinder, and Fourier transformed along the other two dimensions and in time. By means of this kind of discretization, we have found that mode-coupling of fast and slow branches can be fully resolved without singularity issues that flawed other well-established methods in the past. Fields are forced by an antenna placed at a given distance from the plasma. The plasma can be inhomogeneous, finite-temperature, collisional, magnetized and multi-species. Finite-temperature Maxwellian effects, comprising Landau and cyclotron damping, have been included by means of the plasma Z dispersion function. Finite Larmor radius effects have been neglected. Radial variations of the plasma parameters are taken into account, thus extending the range of applications to a large variety of inhomogeneous plasma systems. The method proved to be fast and reliable, with accuracy depending on the spatial grid size. Two physical examples are reported: fields in a forced vacuum waveguide with the antenna inside, and forced plasma oscillations in the helicon radiofrequency range.

  7. How inverse solver technologies can support die face development and process planning in the automotive industry

    NASA Astrophysics Data System (ADS)

    Huhn, Stefan; Peeling, Derek; Burkart, Maximilian

    2017-10-01

    With the availability of die face design tools and incremental solver technologies to provide detailed forming feasibility results in a timely fashion, the use of inverse solver technologies and resulting process improvements during the product development process of stamped parts often is underestimated. This paper presents some applications of inverse technologies that are currently used in the automotive industry to streamline the product development process and greatly increase the quality of a developed process and the resulting product. The first focus is on the so-called target strain technology. Application examples will show how inverse forming analysis can be applied to support the process engineer during the development of a die face geometry for Class `A' panels. The drawing process is greatly affected by the die face design and the process designer has to ensure that the resulting drawn panel will meet specific requirements regarding surface quality and a minimum strain distribution to ensure dent resistance. The target strain technology provides almost immediate feedback to the process engineer during the die face design process if a specific change of the die face design will help to achieve these specific requirements or will be counterproductive. The paper will further show how an optimization of the material flow can be achieved through the use of a newly developed technology called Sculptured Die Face (SDF). The die face generation in SDF is more suited to be used in optimization loops than any other conventional die face design technology based on cross section design. A second focus in this paper is on the use of inverse solver technologies for secondary forming operations. The paper will show how the application of inverse technology can be used to accurately and quickly develop trim lines on simple as well as on complex support geometries.

  8. An unstructured shock-fitting solver for hypersonic plasma flows in chemical non-equilibrium

    NASA Astrophysics Data System (ADS)

    Pepe, R.; Bonfiglioli, A.; D'Angola, A.; Colonna, G.; Paciorri, R.

    2015-11-01

    A CFD solver, using Residual Distribution Schemes on unstructured grids, has been extended to deal with inviscid chemical non-equilibrium flows. The conservative equations have been coupled with a kinetic model for argon plasma which includes the argon metastable state as an independent species, taking into account electron-atom and atom-atom processes. Results in the case of a hypersonic flow around an infinite cylinder, obtained by using both shock-capturing and shock-fitting approaches, show higher accuracy of the shock-fitting approach.

  9. A high-order 3D spectral difference solver for simulating flows about rotating geometries

    NASA Astrophysics Data System (ADS)

    Zhang, Bin; Liang, Chunlei

    2017-11-01

    Fluid flows around rotating geometries are ubiquitous. For example, a spinning ping pong ball can quickly change its trajectory in an air flow; a marine propeller can provide an enormous amount of thrust to a ship. It has been a long-time challenge to accurately simulate these flows. In this work, we present a high-order and efficient 3D flow solver based on the unstructured spectral difference (SD) method and a novel sliding-mesh method. In the SD method, solution and fluxes are reconstructed using tensor products of 1D polynomials and the equations are solved in differential form, which leads to high-order accuracy and high efficiency. In the sliding-mesh method, a computational domain is decomposed into non-overlapping subdomains. Each subdomain can enclose a geometry and can rotate relative to its neighbor, resulting in nonconforming sliding interfaces. A curved dynamic mortar approach is designed for communication on these interfaces. In this approach, solutions and fluxes are projected from cell faces to mortars to compute common values, which are then projected back to ensure continuity and conservation. Through theoretical analysis and numerical tests, it is shown that this solver is conservative, free-stream preserving, and high-order accurate in both space and time.

  10. A SEMI-LAGRANGIAN TWO-LEVEL PRECONDITIONED NEWTON-KRYLOV SOLVER FOR CONSTRAINED DIFFEOMORPHIC IMAGE REGISTRATION.

    PubMed

    Mang, Andreas; Biros, George

    2017-01-01

    We propose an efficient numerical algorithm for the solution of diffeomorphic image registration problems. We use a variational formulation constrained by a partial differential equation (PDE), where the constraints are a scalar transport equation. We use a pseudospectral discretization in space and a second-order accurate semi-Lagrangian time stepping scheme for the transport equations. We solve for a stationary velocity field using a preconditioned, globalized, matrix-free Newton-Krylov scheme. We propose and test a two-level Hessian preconditioner. We consider two strategies for inverting the preconditioner on the coarse grid: a nested preconditioned conjugate gradient method (exact solve) and a nested Chebyshev iterative method (inexact solve) with a fixed number of iterations. We test the performance of our solver in different synthetic and real-world two-dimensional application scenarios. We study grid convergence and computational efficiency of our new scheme. We compare the performance of our solver against our initial implementation that uses the same spatial discretization but a standard, explicit, second-order Runge-Kutta scheme for the numerical time integration of the transport equations and a single-level preconditioner. Our improved scheme delivers significant speedups over our original implementation. As a highlight, we observe a 20× speedup for a two-dimensional, real-world multi-subject medical image registration problem.

  11. Aerodynamics Simulations for the D8 "Double Bubble" Aircraft Using the LAVA Unstructured Solver

    NASA Astrophysics Data System (ADS)

    Ballinger, Sean

    2013-11-01

    The D8 "double bubble" is a proposed design for quieter and more efficient domestic passenger aircraft of the Boeing 737 class. It features boundary layer-ingesting engines located under a non-load-bearing π-tail and a lightweight low-sweep wing for flight around Mach 0.7. The D8's wide lifting body is expected to supply 15% of its total lift, while a Boeing 737's fuselage contributes only 8%. The tapering rear of the fuselage is also predicted to experience a negative moment resulting in positive pitch, produce a thicker boundary layer for ingestion by distortion-tolerant engines, and act as a noise shield. To investigate these predictions, unstructured grids generated over a fine surface triangulation using Star-CCM+ are used to model the unpowered D8 with flow conditions mimicking those in the MIT Wright Brothers Wind Tunnel at angles of attack from -2 to 14 degrees. LAVA, the recently developed Launch Ascent and Vehicle Aerodynamics solver, is used to carry out simulations on an unstructured grid. The results are compared to wind tunnel data, and to data from structured grid simulations using the LAVA, Overflow, and Cart3D solvers. Applied Modeling and Simulation Branch, NASA Advanced Supercomputing Division, funded by New York Space Grant.

  12. Parallelization of Unsteady Adaptive Mesh Refinement for Unstructured Navier-Stokes Solvers

    NASA Technical Reports Server (NTRS)

    Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.

    2014-01-01

    This paper explores the implementation of the MPI parallelization in a Navier-Stokes solver using adaptive mesh refinement. Viscous and inviscid test problems are considered for the purpose of benchmarking, as are implicit and explicit time advancement methods. The main test problem for comparison includes effects from boundary layers and other viscous features and requires a large number of grid points for accurate computation. Experimental validation against double cone experiments in hypersonic flow is shown. The adaptive mesh refinement shows promise for a staple test problem in the hypersonic community. Extension to more advanced techniques for more complicated flows is described.

  13. Blade design and analysis using a modified Euler solver

    NASA Technical Reports Server (NTRS)

    Leonard, O.; Vandenbraembussche, R. A.

    1991-01-01

    An iterative method for blade design, based on an Euler solver and described in an earlier paper, is used to design compressor and turbine blades providing shock-free transonic flows. The method shows rapid convergence and indicates how sensitive the flow is to small modifications of the blade geometry, which the classical iterative use of analysis methods might not be able to define. The relationship between the required Mach number distribution and the resulting geometry is discussed. Examples show how geometrical constraints imposed upon the blade shape can be respected by using free geometrical parameters or by relaxing the required Mach number distribution. The same code is used both for the design of the required geometry and for the off-design calculations. Examples illustrate the difficulty of designing blade shapes with optimal performance also outside of the design point.

  14. Approximate tensor-product preconditioners for very high order discontinuous Galerkin methods

    NASA Astrophysics Data System (ADS)

    Pazner, Will; Persson, Per-Olof

    2018-02-01

    In this paper, we develop a new tensor-product based preconditioner for discontinuous Galerkin methods with polynomial degrees higher than those typically employed. This preconditioner uses an automatic, purely algebraic method to approximate the exact block Jacobi preconditioner by Kronecker products of several small, one-dimensional matrices. Traditional matrix-based preconditioners require $O(p^{2d})$ storage and $O(p^{3d})$ computational work, where $p$ is the degree of basis polynomials used, and $d$ is the spatial dimension. Our SVD-based tensor-product preconditioner requires $O(p^{d+1})$ storage, $O(p^{d+1})$ work in two spatial dimensions, and $O(p^{d+2})$ work in three spatial dimensions. Combined with a matrix-free Newton-Krylov solver, these preconditioners allow for the solution of DG systems in linear time in $p$ per degree of freedom in 2D, and reduce the computational complexity from $O(p^9)$ to $O(p^5)$ in 3D. Numerical results are shown in 2D and 3D for the advection, Euler, and Navier-Stokes equations, using polynomials of degree up to $p = 30$. For many test cases, the preconditioner results in similar iteration counts when compared with the exact block Jacobi preconditioner, and performance is significantly improved for high polynomial degrees $p$.
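
    The exponents quoted in this abstract can be checked with a short count. The sketch below assumes a tensor-product basis with $(p+1)^d$ unknowns per element and a dense factorization of each Jacobi block; both are illustrative assumptions, not details taken from the record.

```latex
\begin{aligned}
n_{\mathrm{dof}} &= (p+1)^d = O(p^{d})
  && \text{unknowns per element (assumed tensor-product basis)} \\
\text{dense block storage} &= n_{\mathrm{dof}}^{2} = O(p^{2d}),
  \qquad \text{dense factorization work} = n_{\mathrm{dof}}^{3} = O(p^{3d}) \\
d = 2:&\quad \frac{O(p^{d+1})}{O(p^{d})} = O(p)
  && \text{work per unknown, i.e. linear in } p \\
d = 3:&\quad O(p^{3d}) = O(p^{9}) \;\longrightarrow\; O(p^{d+2}) = O(p^{5})
\end{aligned}
```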

  15. elsA-Hybrid: an all-in-one structured/unstructured solver for the simulation of internal and external flows. Application to turbomachinery

    NASA Astrophysics Data System (ADS)

    de la Llave Plata, M.; Couaillier, V.; Le Pape, M.-C.; Marmignon, C.; Gazaix, M.

    2013-03-01

    This paper reports recent work on the extension of the multiblock structured solver elsA to deal with hybrid grids. The new hybrid-grid solver, called elsA-H (elsA-Hybrid), is based on a new unstructured-grid module built within the original elsA CFD (computational fluid dynamics) system. The implementation benefits from the flexibility of the object-oriented design. The aim of elsA-H is to take advantage of the full potential of structured solvers and unstructured mesh generation by allowing any type of grid to be used within the same simulation process. The main challenge lies in the numerical treatment of the hybrid-grid interfaces where blocks of different type meet. In particular, one must pay attention to the transfer of information across these boundaries, so that the accuracy of the numerical scheme is preserved and flux conservation is guaranteed. In this paper, the numerical approach that achieves this is presented. A comparison between the hybrid and the structured-grid methods is also carried out by considering a fully hexahedral multiblock mesh for which a few blocks have been transformed into unstructured blocks. The performance of elsA-H for the simulation of internal flows will be demonstrated on a number of turbomachinery configurations.

  16. P-CSI v1.0, an accelerated barotropic solver for the high-resolution ocean model component in the Community Earth System Model v2.0

    NASA Astrophysics Data System (ADS)

    Huang, Xiaomeng; Tang, Qiang; Tseng, Yuheng; Hu, Yong; Baker, Allison H.; Bryan, Frank O.; Dennis, John; Fu, Haohuan; Yang, Guangwen

    2016-11-01

    In the Community Earth System Model (CESM), the ocean model is computationally expensive for high-resolution grids and is often the least scalable component for high-resolution production experiments. The major bottleneck is that the barotropic solver scales poorly at high core counts. We design a new barotropic solver to accelerate the high-resolution ocean simulation. The novel solver adopts a Chebyshev-type iterative method to reduce the global communication cost in conjunction with an effective block preconditioner to further reduce the iterations. The algorithm and its computational complexity are theoretically analyzed and compared with other existing methods. We confirm the significant reduction of the global communication time with a competitive convergence rate using a series of idealized tests. Numerical experiments using the CESM 0.1° global ocean model show that the proposed approach results in a factor of 1.7 speed-up over the original method with no loss of accuracy, achieving 10.5 simulated years per wall-clock day on 16 875 cores.
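
    Why a Chebyshev-type iteration reduces global communication can be seen in a minimal sketch. The code below is the textbook unpreconditioned Chebyshev iteration for a symmetric positive definite system, not the P-CSI implementation; the 1D Laplacian test matrix and the function names are assumptions chosen for illustration. Note that the loop contains no inner products, so a distributed version would need no global reductions per iteration, only the neighbor exchange inside the matrix-vector product. The price is that eigenvalue bounds must be supplied up front.

```python
import math


def laplacian_matvec(v):
    """Apply the 1D Laplacian stencil (-1, 2, -1) with zero boundary values.

    Stand-in operator (an assumption); only nearest-neighbor data is needed.
    """
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def chebyshev_solve(matvec, b, lam_min, lam_max, iters):
    """Textbook Chebyshev iteration for A x = b with A symmetric positive definite.

    lam_min and lam_max must bound the spectrum of A. No dot products are
    computed anywhere in the loop.
    """
    theta = 0.5 * (lam_max + lam_min)   # center of the spectrum
    delta = 0.5 * (lam_max - lam_min)   # half-width of the spectrum
    sigma1 = theta / delta
    x = [0.0] * len(b)
    r = list(b)                         # residual for the zero initial guess
    rho = 1.0 / sigma1
    d = [ri / theta for ri in r]        # first search direction
    for _ in range(iters):
        x = [xi + di for xi, di in zip(x, d)]
        Ad = matvec(d)
        r = [ri - adi for ri, adi in zip(r, Ad)]
        rho_next = 1.0 / (2.0 * sigma1 - rho)
        d = [rho_next * rho * di + (2.0 * rho_next / delta) * ri
             for di, ri in zip(d, r)]
        rho = rho_next
    return x, r
```

    For the stencil above with n unknowns the eigenvalues are 2 - 2cos(kπ/(n+1)) for k = 1..n, so exact bounds are available; in a production setting the bounds would have to be estimated.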

  17. Parallelizable 3D statistical reconstruction for C-arm tomosynthesis system

    NASA Astrophysics Data System (ADS)

    Wang, Beilei; Barner, Kenneth; Lee, Denny

    2005-04-01

    Clinical diagnosis and security detection tasks increasingly require 3D information which is difficult or impossible to obtain from 2D (two dimensional) radiographs. As a 3D (three dimensional) radiographic and non-destructive imaging technique, digital tomosynthesis is especially fit for cases where 3D information is required while complete projection data are not available. Nowadays, FBP (filtered back projection) is extensively used in industry for its fast speed and simplicity. However, it is hard to deal with situations where only a limited number of projections from constrained directions are available, or the SNR (signal-to-noise ratio) of the projections is low. In order to deal with noise and take into account a priori information of the object, a statistical image reconstruction method is described based on the acquisition model of X-ray projections. We formulate a ML (maximum likelihood) function for this model and develop an ordered-subsets iterative algorithm to estimate the unknown attenuation of the object. Simulations show that satisfactory results can be obtained after 1 to 2 iterations, and after that there is no significant improvement of the image quality. An adaptive Wiener filter is also applied to the reconstructed image to remove its noise. Some approximations to speed up the reconstruction computation are also considered. Applying this method to computer generated projections of a revised Shepp phantom and true projections from diagnostic radiographs of a patient's hand and mammography images yields reconstructions with impressive quality. Parallel programming is also implemented and tested. The quality of the reconstructed object is conserved, while the computation time is reduced by a factor of almost the number of threads used.

  18. A fast parallel 3D Poisson solver with longitudinal periodic and transverse open boundary conditions for space-charge simulations

    NASA Astrophysics Data System (ADS)

    Qiang, Ji

    2017-10-01

    A three-dimensional (3D) Poisson solver with longitudinal periodic and transverse open boundary conditions can have important applications in beam physics of particle accelerators. In this paper, we present a fast, efficient method to solve the Poisson equation using a spectral finite-difference method. This method uses a computational domain that contains the charged particle beam only and has a computational complexity of $O(N_u \log N_{mode})$, where $N_u$ is the total number of unknowns and $N_{mode}$ is the maximum number of longitudinal or azimuthal modes. This saves both computational time and memory relative to using an artificial boundary condition in a large extended computational domain. The new 3D Poisson solver is parallelized using a message passing interface (MPI) on multi-processor computers and shows a reasonable parallel performance up to hundreds of processor cores.
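
    The role of the periodic direction in a spectral Poisson solve can be illustrated in one dimension. The sketch below solves -u'' = f on a periodic interval by dividing each Fourier coefficient by its squared wavenumber. It is only an illustration of the periodic, spectral ingredient: the transverse open-boundary treatment described in the abstract is not attempted, a naive O(N^2) transform is used in place of an FFT to keep the example dependency-free, and all function names are hypothetical.

```python
import cmath
import math


def dft(values, sign):
    """Naive discrete Fourier transform; sign=-1 is forward, sign=+1 is unnormalized inverse."""
    n = len(values)
    return [
        sum(values[q] * cmath.exp(sign * 2j * math.pi * q * k / n) for q in range(n))
        for k in range(n)
    ]


def solve_periodic_poisson(f, length):
    """Solve -u'' = f on [0, length) with periodic boundary conditions.

    f holds samples on a uniform grid. The mean of u is fixed to zero, which
    assumes f itself has zero mean (otherwise no periodic solution exists).
    """
    n = len(f)
    f_hat = dft(f, -1)
    u_hat = [0j] * n                      # k = 0 mode stays zero (zero-mean solution)
    for k in range(1, n):
        m = k if k <= n // 2 else k - n   # signed mode number
        wavenumber = 2.0 * math.pi * m / length
        u_hat[k] = f_hat[k] / wavenumber ** 2
    u = dft(u_hat, +1)
    return [value.real / n for value in u]
```

    With f(x) = sin(x) on [0, 2π) the returned samples should match sin(x) to rounding error, since the only nonzero modes have wavenumber ±1.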

  19. Adding Complex Terrain and Stable Atmospheric Condition Capability to the OpenFOAM-based Flow Solver of the Simulator for On/Offshore Wind Farm Applications (SOWFA): Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Churchfield, M. J.; Sang, L.; Moriarty, P. J.

    This paper describes changes made to NREL's OpenFOAM-based wind plant aerodynamics solver such that it can compute the stably stratified atmospheric boundary layer and flow over terrain. Background about the flow solver, the Simulator for On/Offshore Wind Farm Applications (SOWFA), is given, followed by details of the stable stratification/complex terrain modifications to SOWFA, along with some preliminary results from calculations of a stable atmospheric boundary layer and flow over a simple set of hills.

  20. Application of Aeroelastic Solvers Based on Navier-Stokes Equations

    NASA Technical Reports Server (NTRS)

    Keith, Theo G., Jr.; Srivastava, Rakesh

    1998-01-01

    A pre-release version of the Navier-Stokes solver (TURBO) was obtained from MSU. Along with Dr. Milind Bakhle of the University of Toledo, subroutines for aeroelastic analysis were developed and added to the TURBO code to develop versions 1 and 2 of the TURBO-AE code. For a specified mode shape, frequency and inter-blade phase angle, the code calculates the work done by the fluid on the rotor for a prescribed sinusoidal motion. Positive work on the rotor indicates instability of the rotor. Version 1 of the code calculates the work for in-phase blade motions only. In version 2 of the code, the capability for analyzing all possible inter-blade phase angles was added. Version 2 of the TURBO-AE code was validated and delivered to NASA and the industry partners of the AST project. The capabilities and the features of the code are summarized in Refs. [1] & [2]. To release version 2 of TURBO-AE, a workshop was organized at NASA Lewis by Dr. Srivastava and Dr. M. A. Bakhle, both of the University of Toledo, in October of 1996 for the industry partners of NASA Lewis. The workshop provided the potential users of TURBO-AE with all the relevant information required for preparing the input data, executing the code, interpreting the results and benchmarking the code on their computer systems. After the code was delivered to the industry partners, user support was also provided. A new version of the Navier-Stokes solver (TURBO) was later released by MSU. This version had significant changes and upgrades over the previous version. This new version was merged with the TURBO-AE code. Also, new boundary conditions for 3-D unsteady non-reflecting boundaries were developed by researchers from UTRC, Ref. [3]. Time was spent on understanding, familiarizing, executing and implementing the new boundary conditions into the TURBO-AE code. Work was started on the phase lagged (time-shifted) boundary condition version (version 4) of the code. This will allow the users to calculate non

  1. Scalable direct Vlasov solver with discontinuous Galerkin method on unstructured mesh.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, J.; Ostroumov, P. N.; Mustapha, B.

    2010-12-01

    This paper presents the development of parallel direct Vlasov solvers with the discontinuous Galerkin (DG) method for beam and plasma simulations in four dimensions. Both physical and velocity spaces are in two dimensions (2P2V) with unstructured mesh. Contrary to the standard particle-in-cell (PIC) approach for kinetic space plasma simulations, i.e., solving Vlasov-Maxwell equations, a direct method has been used in this paper. There are several benefits to solving a Vlasov equation directly, such as avoiding noise associated with a finite number of particles and the capability to capture fine structure in the plasma. The most challenging part of a direct Vlasov solver comes from higher dimensions, as the computational cost increases as $N^{2d}$, where $d$ is the dimension of the physical space. Recently, due to the fast development of supercomputers, the possibility has become more realistic. Many efforts have been made to solve Vlasov equations in low dimensions before; now more interest has focused on higher dimensions. Different numerical methods have been tried so far, such as the finite difference method, Fourier spectral method, finite volume method, and spectral element method. This paper is based on our previous efforts to use the DG method. The DG method has been proven to be very successful in solving Maxwell equations, and this paper is our first effort in applying the DG method to Vlasov equations. DG has shown several advantages, such as local mass matrix, strong stability, and easy parallelization. These are particularly suitable for Vlasov equations. Domain decomposition in high dimensions has been used for parallelization; these include a highly scalable parallel two-dimensional Poisson solver. Benchmark results have been shown and simulation results will be reported.

  2. A frozen Gaussian approximation-based multi-level particle swarm optimization for seismic inversion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Jinglai, E-mail: jinglaili@sjtu.edu.cn; Lin, Guang, E-mail: lin491@purdue.edu; Computational Sciences and Mathematics Division, Pacific Northwest National Laboratory, Richland, WA 99352

    2015-09-01

    In this paper, we propose a frozen Gaussian approximation (FGA)-based multi-level particle swarm optimization (MLPSO) method for seismic inversion of high-frequency wave data. The method addresses two challenges. First, the optimization problem is highly non-convex, which makes it hard for gradient-based methods to reach global minima. This is tackled by MLPSO, which can escape from undesired local minima. Second, the high-frequency character of seismic waves requires a large number of grid points in direct computational methods, and thus renders an extremely high computational demand on the simulation of each sample in MLPSO. We overcome this difficulty in three steps: first, we use FGA to compute high-frequency wave propagation based on asymptotic analysis on phase plane; then we design a constrained full waveform inversion problem to prevent the optimization search from getting into regions of velocity where FGA is not accurate; last, we solve the constrained optimization problem by MLPSO that employs FGA solvers with different fidelity. The performance of the proposed method is demonstrated by a two-dimensional full-waveform inversion example of the smoothed Marmousi model.
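
    The way a particle swarm can escape local minima of a non-convex objective is easiest to see in a plain, single-level version. The sketch below is a generic particle swarm optimizer, not the multi-level FGA-based method of the abstract; the Rastrigin function stands in for the seismic misfit, and the parameter values and names (`inertia`, `c_personal`, `c_global`) are illustrative assumptions.

```python
import math
import random


def rastrigin(x):
    """Standard non-convex test function with many local minima; global minimum 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def pso(objective, dim, bounds, n_particles=30, iters=200, seed=0,
        inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic particle swarm optimization with box constraints.

    Returns (best position, best value, best value of the initial swarm).
    """
    rng = random.Random(seed)           # local generator keeps runs reproducible
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]      # per-particle best positions
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    initial_best = g_val
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Clamp to the search box so particles stay feasible.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val, initial_best
```

    Each objective evaluation corresponds to one forward simulation, which is why the abstract pairs the swarm with cheap approximate solvers of varying fidelity.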

  3. Extending fullwave core ICRF simulation to SOL and antenna regions using FEM solver

    NASA Astrophysics Data System (ADS)

    Shiraiwa, S.; Wright, J. C.

    2016-10-01

    A full wave simulation approach to solve a driven RF wave problem including hot core, SOL plasmas and possibly an antenna is presented. This approach allows for exploiting the advantages of two different ways of representing the wave field, namely treating spatially dispersive hot conductivity in a spectral solver and handling complicated geometry in the SOL/antenna region using an unstructured mesh. Here, we compute a mode set in each region with the RF electric field excitation on the connecting boundary between core and edge regions. A mode corresponding to antenna excitation is also computed. By requiring the continuity of tangential RF electric and magnetic fields, the solution is obtained as a unique superposition of these modes. In this work, the TORIC core spectral solver is modified to allow for mode excitation, and the edge region of diverted Alcator C-Mod plasma is modeled using the COMSOL FEM package. The reconstructed RF field is similar in the core region to the TORIC stand-alone simulation. However, it contains higher poloidal modes near the edge and captures a wave bounced and propagating in the poloidal direction near the vacuum-plasma boundary. These features could play an important role when the single power pass absorption is modest. This new capability will enable antenna coupling calculations with a realistic load plasma, including collisional damping in realistic SOL plasma and other loss mechanisms such as RF sheath rectification. USDoE Awards DE-FC02-99ER54512, DE-FC02-01ER54648.

  4. Development of Tokamak Transport Solvers for Stiff Confinement Systems

    NASA Astrophysics Data System (ADS)

    St. John, H. E.; Lao, L. L.; Murakami, M.; Park, J. M.

    2006-10-01

    Leading transport models such as GLF23 [1] and MM95 [2] describe turbulent plasma energy, momentum and particle flows. In order to accommodate existing transport codes and associated solution methods, effective diffusivities have to be derived from these turbulent flow models. This can cause significant problems in predicting unique solutions. We have developed a parallel transport code solver, GCNMP, that can accommodate both flow based and diffusivity based confinement models by solving the discretized nonlinear equations using modern Newton, trust region, steepest descent and homotopy methods. We present our latest development efforts, including multiple dynamic grids, application of two-level parallel schemes, and operator splitting techniques that allow us to combine flow based and diffusivity based models in tokamak simulations. [1] R.E. Waltz, et al., Phys. Plasmas 4, 7 (1997). [2] G. Bateman, et al., Phys. Plasmas 5, 1793 (1998).

  5. Diffusion of Zonal Variables Using Node-Centered Diffusion Solver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, T B

    2007-08-06

    Tom Kaiser [1] has done some preliminary work to use the node-centered diffusion solver (originally developed by T. Palmer [2]) in Kull for diffusion of zonal variables such as electron temperature. To avoid numerical diffusion, Tom used a scheme developed by Shestakov et al. [3] and found their scheme could, in the vicinity of steep gradients, decouple nearest-neighbor zonal sub-meshes leading to 'alternating-zone' (red-black mode) errors. Tom extended their scheme to couple the sub-meshes with appropriately chosen artificial diffusion and thereby solved the 'alternating-zone' problem. Because the choice of the artificial diffusion coefficient could be very delicate, it is desirable to use a scheme that does not require the artificial diffusion but is still able to avoid both numerical diffusion and the 'alternating-zone' problem. In this document we present such a scheme.

  6. A low-complexity Reed-Solomon decoder using new key equation solver

    NASA Astrophysics Data System (ADS)

    Xie, Jun; Yuan, Songxin; Tu, Xiaodong; Zhang, Chongfu

    2006-09-01

    This paper presents a low-complexity parallel Reed-Solomon (RS) (255,239) decoder architecture using a novel pipelined variable-stage recursive Modified Euclidean (ME) algorithm for optical communication. A pipelined four-parallel syndrome generator is proposed. Time multiplexing and resource sharing schemes are used in the novel recursive ME algorithm to reduce the logic gate count. The new key equation solver can be shared by two decoder macros. A new Chien search cell which does not need initialization is proposed in the paper. The proposed decoder can be used for devices with 2.5 Gb/s data rates. The decoder is implemented in Altera's Stratix II device. The resource utilization is reduced by about 40% compared to the conventional method.
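
    The code parameters in this record fix how many errors the decoder must locate and how many syndromes the syndrome generator must produce. The short calculation below uses only the standard Reed-Solomon relations; the 8-bit symbol size is an assumption, being the usual choice for block length 255.

```latex
\begin{aligned}
n &= 255 = 2^{8} - 1, \qquad k = 239
  && \text{symbols per codeword, data symbols (8-bit symbols assumed)} \\
n - k &= 16
  && \text{parity symbols, hence 16 syndromes per codeword} \\
t &= \left\lfloor \frac{n - k}{2} \right\rfloor = 8
  && \text{correctable symbol errors per codeword} \\
R &= \frac{k}{n} = \frac{239}{255} \approx 0.937
  && \text{code rate}
\end{aligned}
```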

  7. Introduction to COFFE: The Next-Generation HPCMP CREATE-AV CFD Solver

    NASA Technical Reports Server (NTRS)

    Glasby, Ryan S.; Erwin, J. Taylor; Stefanski, Douglas L.; Allmaras, Steven R.; Galbraith, Marshall C.; Anderson, W. Kyle; Nichols, Robert H.

    2016-01-01

    HPCMP CREATE-AV Conservative Field Finite Element (COFFE) is a modular, extensible, robust numerical solver for the Navier-Stokes equations that invokes modularity and extensibility from its first principles. COFFE employs a flexible, class-based hierarchy that provides a modular approach consisting of discretization, physics, parallelization, and linear algebra components. These components are developed with modern software engineering principles to ensure ease of uptake from a user's or developer's perspective. The Streamwise Upwind/Petrov-Galerkin (SU/PG) method is utilized to discretize the compressible Reynolds-Averaged Navier-Stokes (RANS) equations tightly coupled with a variety of turbulence models. The mathematics and the philosophy of the methodology that makes up COFFE are presented.

  8. Use of EPANET solver to manage water distribution in Smart City

    NASA Astrophysics Data System (ADS)

    Antonowicz, A.; Brodziak, R.; Bylka, J.; Mazurkiewicz, J.; Wojtecki, S.; Zakrzewski, P.

    2018-02-01

    This paper presents a method of using the EPANET solver to support the management of a water distribution system in a Smart City. The main task is to develop an application that allows remote access to the simulation model of the water distribution network developed in the EPANET environment. The application can perform both single and cyclic simulations with a specified step for changing the values of selected process variables. The architecture of the application is shown. The application supports the selection of the best device control algorithm using optimization methods. Optimization procedures are possible with the following methods: brute force, SLSQP (Sequential Least SQuares Programming), and the Modified Powell Method. The article is supplemented by an example of using the developed computer tool.

  9. Proteus-MOC: A 3D deterministic solver incorporating 2D method of characteristics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Marin-Lafleche, A.; Smith, M. A.; Lee, C.

    2013-07-01

    A new transport solution methodology was developed by combining the two-dimensional method of characteristics with the discontinuous Galerkin method for the treatment of the axial variable. The method, which can be applied to arbitrary extruded geometries, was implemented in PROTEUS-MOC and includes parallelization in group, angle, plane, and space using a top level GMRES linear algebra solver. Verification tests were performed to show accuracy and stability of the method with the increased number of angular directions and mesh elements. Good scalability with parallelism in angle and axial planes is displayed. (authors)

  10. Implementing High-Performance Geometric Multigrid Solver with Naturally Grained Messages

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shan, Hongzhang; Williams, Samuel; Zheng, Yili

    2015-10-26

    Structured-grid linear solvers often require manual packing and unpacking of communication data to achieve high performance. Orchestrating this process efficiently is challenging, labor-intensive, and potentially error-prone. In this paper, we explore an alternative approach that communicates the data with naturally grained message sizes without manual packing and unpacking. This approach is the distributed analogue of shared-memory programming, taking advantage of the global address space in PGAS languages to provide substantial programming ease. However, its performance may suffer from the large number of small messages. We investigate the runtime support required in the UPC++ library for this naturally grained version to close the performance gap between the two approaches and attain comparable performance at scale using the High-Performance Geometric Multigrid (HPGMG-FV) benchmark as a driver.

  11. Workload Characterization of CFD Applications Using Partial Differential Equation Solvers

    NASA Technical Reports Server (NTRS)

    Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    Workload characterization is used for modeling and evaluating computing systems at different levels of detail. We present workload characterization for a class of Computational Fluid Dynamics (CFD) applications that solve Partial Differential Equations (PDEs). This workload characterization focuses on three high performance computing platforms: SGI Origin2000, IBM SP-2, and a cluster of Intel Pentium Pro based PCs. We execute extensive measurement-based experiments on these platforms to gather statistics of system resource usage, which results in workload characterization. Our workload characterization approach yields a coarse-grain resource utilization behavior that is being applied for performance modeling and evaluation of distributed high performance metacomputing systems. In addition, this study enhances our understanding of interactions between PDE solver workloads and high performance computing platforms and is useful for tuning these applications.

  12. A real-frequency solver for the Anderson impurity model based on bath optimization and cluster perturbation theory

    NASA Astrophysics Data System (ADS)

    Zingl, Manuel; Nuss, Martin; Bauernfeind, Daniel; Aichhorn, Markus

    2018-05-01

    Recently solvers for the Anderson impurity model (AIM) working directly on the real-frequency axis have gained much interest. A simple and yet frequently used impurity solver is exact diagonalization (ED), which is based on a discretization of the AIM bath degrees of freedom. Usually, the bath parameters cannot be obtained directly on the real-frequency axis, but have to be determined by a fit procedure on the Matsubara axis. In this work we present an approach where the bath degrees of freedom are first discretized directly on the real-frequency axis using a large number of bath sites (≈ 50). Then, the bath is optimized by unitary transformations such that it separates into two parts that are weakly coupled. One part contains the impurity site and its interacting Green's functions can be determined with ED. The other (larger) part is a non-interacting system containing all the remaining bath sites. Finally, the Green's function of the full AIM is calculated via coupling these two parts with cluster perturbation theory.

  13. Cutting-edge Kinetic Physics with Parker Solar Probe and Solar Orbiter: The Arbitrary Linear Plasma Solver (ALPS)

    NASA Astrophysics Data System (ADS)

    Verscharen, D.; Klein, K. G.; Chandran, B. D. G.; Stevens, M. L.; Salem, C. S.; Bale, S. D.

    2017-12-01

    The Arbitrary Linear Plasma Solver (ALPS) is a parallelized numerical code that solves the dispersion relation in a hot (even relativistic) magnetized plasma with an arbitrary number of particle species with arbitrary gyrotropic equilibrium distribution functions for any direction of wave propagation with respect to the background field. In this way, ALPS retains generality and overcomes the shortcomings of previous (bi-)Maxwellian solvers for the plasma dispersion relations. The unprecedented high-resolution particle and field data products from Parker Solar Probe (PSP) and Solar Orbiter (SO) will require novel theoretical tools. ALPS is one such tool, and its use will make possible new investigations into the role of non-Maxwellian distributions in the near-Sun solar wind. It can be applied to numerous high-velocity-resolution systems, ranging from current space missions to numerical simulations. We will briefly discuss the ALPS algorithm and demonstrate its functionality based on previous solar-wind measurements. We will then highlight our plans for future applications of ALPS to PSP and SO observations.

  14. A multiblock multigrid three-dimensional Euler equation solver

    NASA Technical Reports Server (NTRS)

    Cannizzaro, Frank E.; Elmiligui, Alaa; Melson, N. Duane; Vonlavante, E.

    1990-01-01

    Current aerodynamic designs are often geometrically quite complex. Flexible computational tools are needed for the analysis of a wide range of configurations with both internal and external flows. In the past, geometrically dissimilar configurations required different analysis codes with different grid topologies in each. This duplication of codes can be avoided with the use of a general multiblock formulation which can handle any grid topology. Rather than hard wiring the grid topology into the program, it is instead dictated by input to the program. In this work, the compressible Euler equations, written in a body-fitted finite-volume formulation, are solved using a pseudo-time-marching approach. Two upwind methods (van Leer's flux-vector-splitting and Roe's flux-differencing) were investigated. Two types of explicit solvers (a two-step predictor-corrector and a modified multistage Runge-Kutta) were used with multigrid acceleration to enhance convergence. A multiblock strategy is used to allow greater geometric flexibility. A report on simple explicit upwind schemes for solving compressible flows is included.

  15. Comparison of an algebraic multigrid algorithm to two iterative solvers used for modeling ground water flow and transport

    USGS Publications Warehouse

    Detwiler, R.L.; Mehl, S.; Rajaram, H.; Cheung, W.W.

    2002-01-01

    Numerical solution of large-scale ground water flow and transport problems is often constrained by the convergence behavior of the iterative solvers used to solve the resulting systems of equations. We demonstrate the ability of an algebraic multigrid algorithm (AMG) to efficiently solve the large, sparse systems of equations that result from computational models of ground water flow and transport in large and complex domains. Unlike geometric multigrid methods, this algorithm is applicable to problems in complex flow geometries, such as those encountered in pore-scale modeling of two-phase flow and transport. We integrated AMG into MODFLOW 2000 to compare two- and three-dimensional flow simulations using AMG to simulations using PCG2, a preconditioned conjugate gradient solver that uses the modified incomplete Cholesky preconditioner and is included with MODFLOW 2000. CPU times required for convergence with AMG were up to 140 times faster than those for PCG2. The cost of this increased speed was up to a nine-fold increase in required random access memory (RAM) for the three-dimensional problems and up to a four-fold increase in required RAM for the two-dimensional problems. We also compared two-dimensional numerical simulations of steady-state transport using AMG and the generalized minimum residual method with an incomplete LU-decomposition preconditioner. For these transport simulations, AMG yielded increased speeds of up to 17 times with only a 20% increase in required RAM. The ability of AMG to solve flow and transport problems in large, complex flow systems and its ready availability make it an ideal solver for use in both field-scale and pore-scale modeling.

  16. DL_MG: A Parallel Multigrid Poisson and Poisson-Boltzmann Solver for Electronic Structure Calculations in Vacuum and Solution.

    PubMed

    Womack, James C; Anton, Lucian; Dziedzic, Jacek; Hasnip, Phil J; Probert, Matt I J; Skylaris, Chris-Kriton

    2018-03-13

    The solution of the Poisson equation is a crucial step in electronic structure calculations, yielding the electrostatic potential, a key component of the quantum mechanical Hamiltonian. In recent decades, theoretical advances and increases in computer performance have made it possible to simulate the electronic structure of extended systems in complex environments. This requires the solution of more complicated variants of the Poisson equation, featuring nonhomogeneous dielectric permittivities, ionic concentrations with nonlinear dependencies, and diverse boundary conditions. The analytic solutions generally used to solve the Poisson equation in vacuum (or with homogeneous permittivity) are not applicable in these circumstances, and numerical methods must be used. In this work, we present DL_MG, a flexible, scalable, and accurate solver library, developed specifically to tackle the challenges of solving the Poisson equation in modern large-scale electronic structure calculations on parallel computers. Our solver is based on the multigrid approach and uses an iterative high-order defect correction method to improve the accuracy of solutions. Using two chemically relevant model systems, we tested the accuracy and computational performance of DL_MG when solving the generalized Poisson and Poisson-Boltzmann equations, demonstrating excellent agreement with analytic solutions and efficient scaling to ∼10^9 unknowns and 100s of CPU cores. We also applied DL_MG in actual large-scale electronic structure calculations, using the ONETEP linear-scaling electronic structure package to study a 2615 atom protein-ligand complex with routinely available computational resources. In these calculations, the overall execution time with DL_MG was not significantly greater than the time required for calculations using a conventional FFT-based solver.

  17. Fast iterative solution of the Bethe-Salpeter eigenvalue problem using low-rank and QTT tensor approximation

    NASA Astrophysics Data System (ADS)

    Benner, Peter; Dolgov, Sergey; Khoromskaia, Venera; Khoromskij, Boris N.

    2017-04-01

    In this paper, we propose and study two approaches to approximate the solution of the Bethe-Salpeter equation (BSE) by using structured iterative eigenvalue solvers. Both approaches are based on the reduced basis method and low-rank factorizations of the generating matrices. We also propose to represent the static screened interaction part in the BSE matrix by a small active sub-block, with a size balancing the storage for rank-structured representations of other matrix blocks. We demonstrate by various numerical tests that the combination of the diagonal plus low-rank plus reduced-block approximation exhibits higher precision with low numerical cost, providing as well a distinct two-sided error estimate for the smallest eigenvalues of the Bethe-Salpeter operator. The complexity is reduced to O(N_b^2) in the size of the atomic orbitals basis set, N_b, instead of the practically intractable O(N_b^6) scaling for the direct diagonalization. In the second approach, we apply the quantized-TT (QTT) tensor representation to both the long eigenvectors and the column vectors in the rank-structured BSE matrix blocks, and combine this with the ALS-type iteration in block QTT format. The QTT-rank of the matrix entities possesses almost the same magnitude as the number of occupied orbitals in the molecular systems, N_o, and the complexity of the QTT approximation is estimated by O(log(N_o) N_o^2). We confirm numerically a considerable decrease in computational time for the presented iterative approaches applied to various compact and chain-type molecules, while supporting sufficient accuracy.

  18. Comparison of Integer Programming (IP) Solvers for Automated Test Assembly (ATA). Research Report. ETS RR-15-05

    ERIC Educational Resources Information Center

    Donoghue, John R.

    2015-01-01

    At the heart of van der Linden's approach to automated test assembly (ATA) is a linear programming/integer programming (LP/IP) problem. A variety of IP solvers are available, ranging in cost from free to hundreds of thousands of dollars. In this paper, I compare several approaches to solving the underlying IP problem. These approaches range from…

  19. ArraySolver: an algorithm for colour-coded graphical display and Wilcoxon signed-rank statistics for comparing microarray gene expression data.

    PubMed

    Khan, Haseeb Ahmad

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann-Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform.
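    For readers unfamiliar with the test named in this abstract, the following is a minimal sketch of the Wilcoxon signed-rank statistic for paired measurements. It is not ArraySolver's implementation: the function name `wilcoxon_signed_rank`, the choice to discard zero differences, and the average-rank treatment of ties are assumptions made for illustration, and no p-value is computed.

```python
def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, n) for paired samples x and y.

    Assumptions (not taken from the abstract): zero differences are
    discarded and tied absolute differences share their average rank.
    """
    # Paired differences, dropping pairs with no change.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)

    # Indices of the differences ordered by absolute size.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))

    # Assign 1-based ranks, averaging over runs of equal absolute values.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # Sum the ranks separately for positive and negative differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, n
```

    The smaller of the two rank sums is usually compared against a critical value for n pairs; for small n (such as the n ≤ 25 case mentioned above) exact tables are typically used rather than the normal approximation.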

  20. ArraySolver: An Algorithm for Colour-Coded Graphical Display and Wilcoxon Signed-Rank Statistics for Comparing Microarray Gene Expression Data

    PubMed Central

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann–Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform. PMID:18629036

  1. Adjoint Sensitivity Analysis for Scale-Resolving Turbulent Flow Solvers

    NASA Astrophysics Data System (ADS)

    Blonigan, Patrick; Garai, Anirban; Diosady, Laslo; Murman, Scott

    2017-11-01

    Adjoint-based sensitivity analysis methods are powerful design tools for engineers who use computational fluid dynamics. In recent years, these engineers have started to use scale-resolving simulations like large-eddy simulations (LES) and direct numerical simulations (DNS), which resolve more scales in complex flows with unsteady separation and jets than the widely-used Reynolds-averaged Navier-Stokes (RANS) methods. However, the conventional adjoint method computes large, unusable sensitivities for scale-resolving simulations, which unlike RANS simulations exhibit the chaotic dynamics inherent in turbulent flows. Sensitivity analysis based on least-squares shadowing (LSS) avoids the issues encountered by conventional adjoint methods, but has a high computational cost even for relatively small simulations. The following talk discusses a more computationally efficient formulation of LSS, "non-intrusive" LSS, and its application to turbulent flows simulated with a discontinuous-Galerkin spectral-element-method LES/DNS solver. Results are presented for the minimal flow unit, a turbulent channel flow with a limited streamwise and spanwise domain.

  2. Seeking Space Aliens and the Strong Approximation Property: A (disjoint) Study in Dust Plumes on Planetary Satellites and Nonsymmetric Algebraic Multigrid

    NASA Astrophysics Data System (ADS)

    Southworth, Benjamin Scott

    linear systems arises often in the modeling of biological and physical phenomena, data analysis through graphs and networks, and other scientific applications. This work focuses primarily on linear systems resulting from the discretization of partial differential equations (PDEs). Because solving linear systems is the bottleneck of many large simulation codes, there is a rich field of research in developing "fast" solvers, with the ultimate goal being a method that solves an n x n linear system in O(n) operations. One of the most effective classes of solvers is algebraic multigrid (AMG), which is a multilevel iterative method based on projecting the problem into progressively smaller spaces, and scales like O(n) or O(n log n) for certain classes of problems. The field of AMG is well-developed for symmetric positive definite matrices, and is typically most effective on linear systems resulting from the discretization of scalar elliptic PDEs, such as the heat equation. Systems of PDEs can add additional difficulties, but the underlying linear algebraic theory is consistent and, in many cases, an elliptic system of PDEs can be handled well by AMG with appropriate modifications of the solver. Solving general, nonsymmetric linear systems remains the wild west of AMG (and other fast solvers), lacking significant results in convergence theory as well as robust methods. Here, we develop new theoretical motivation and practical variations of AMG to solve nonsymmetric linear systems, often resulting from the discretization of hyperbolic PDEs. In particular, multilevel convergence of AMG for nonsymmetric systems is proven for the first time. A new nonsymmetric AMG solver is also developed based on an approximate ideal restriction, referred to as AIR, which is able to solve advection-dominated, hyperbolic-type problems that are outside the scope of existing AMG solvers and other fast iterative methods. AIR demonstrates scalable convergence on unstructured meshes, in multiple

  3. Using parallel banded linear system solvers in generalized eigenvalue problems

    NASA Technical Reports Server (NTRS)

    Zhang, Hong; Moss, William F.

    1993-01-01

    Subspace iteration is a reliable and cost effective method for solving positive definite banded symmetric generalized eigenproblems, especially in the case of large scale problems. This paper discusses an algorithm that makes use of two parallel banded solvers in subspace iteration. A shift is introduced to decompose the banded linear systems into relatively independent subsystems and to accelerate the iterations. With this shift, an eigenproblem is mapped efficiently into the memories of a multiprocessor and a high speed-up is obtained for parallel implementations. An optimal shift is a shift that balances total computation and communication costs. Under certain conditions, we show how to estimate an optimal shift analytically using the decay rate for the inverse of a banded matrix, and how to improve this estimate. Computational results on iPSC/2 and iPSC/860 multiprocessors are presented.

  4. Making Sense of Math: How to Help Every Student become a Mathematical Thinker and Problem Solver (ASCD Arias)

    ERIC Educational Resources Information Center

    Seeley, Cathy L.

    2016-01-01

    In "Making Sense of Math," Cathy L. Seeley, former president of the National Council of Teachers of Mathematics, shares her insight into how to turn your students into flexible mathematical thinkers and problem solvers. This practical volume concentrates on the following areas: (1) Making sense of math by fostering habits of mind that…

  5. The unstaggered extension to GFDL's FV3 dynamical core on the cubed-sphere

    NASA Astrophysics Data System (ADS)

    Chen, X.; Lin, S. J.; Harris, L.

    2017-12-01

    Finite-volume schemes have become popular for atmospheric transport since they provide intrinsic mass conservation to constituent species. Many CFD codes use unstaggered discretizations for finite volume methods with an approximate Riemann solver. However, this approach is inefficient for geophysical flows due to the complexity of the Riemann solver. We introduce a Low Mach number Approximate Riemann Solver (LMARS) simplified using assumptions appropriate for atmospheric flows: the wind speed is much slower than the sound speed, discontinuities are weak, and the sound wave velocity is locally uniform. LMARS makes possible a Riemann-solver-based dynamical core comparable in computational efficiency to many current dynamical cores. We will present a 3D finite-volume dynamical core using LMARS in a cubed-sphere geometry with a vertically Lagrangian discretization. Results from standard idealized test cases will be discussed.
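    To illustrate how the three assumptions listed in this abstract simplify a Riemann solver, the sketch below evaluates the standard linearized (acoustic) interface state in primitive variables. This is a textbook approximation that follows from those assumptions; whether LMARS uses exactly this form is not stated in the abstract, so the function name `acoustic_riemann`, its arguments, and the use of the arithmetic-mean density are assumptions for illustration only.

```python
def acoustic_riemann(rho_l, u_l, p_l, rho_r, u_r, p_r, a):
    """Linearized (acoustic) interface velocity and pressure.

    rho_*, u_*, p_* are density, normal velocity and pressure on the
    left/right of a cell face; `a` is a single, locally uniform sound
    speed (an assumption mirroring the abstract's third condition).
    """
    # Mean density used to linearize the acoustic impedance rho * a.
    rho_bar = 0.5 * (rho_l + rho_r)

    # Interface velocity: average velocity corrected by the pressure jump.
    u_star = 0.5 * (u_l + u_r) - (p_r - p_l) / (2.0 * rho_bar * a)

    # Interface pressure: average pressure corrected by the velocity jump.
    p_star = 0.5 * (p_l + p_r) - 0.5 * rho_bar * a * (u_r - u_l)

    return u_star, p_star
```

    No iteration, square root, or wave-speed estimate is needed, which is the kind of saving that makes a Riemann-solver-based core competitive in cost. A uniform state is returned unchanged, and two streams converging on the face raise the interface pressure.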

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran

    We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.

  7. Impact of the implementation of MPI point-to-point communications on the performance of two general sparse solvers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Amestoy, Patrick R.; Duff, Iain S.; L'Excellent, Jean-Yves

    2001-10-10

    We examine the mechanics of the send and receive mechanism of MPI and in particular how we can implement message passing in a robust way so that our performance is not significantly affected by changes to the MPI system. This leads us to using the Isend/Irecv protocol, which sometimes entails significant algorithmic changes. We discuss this within the context of two different algorithms for sparse Gaussian elimination that we have parallelized. One is a multifrontal solver called MUMPS, the other is a supernodal solver called SuperLU. Both algorithms are difficult to parallelize on distributed memory machines. Our initial strategies were based on simple MPI point-to-point communication primitives. With such approaches, the parallel performance of both codes is very sensitive to the MPI implementation, in particular the way MPI internal buffers are used. We then modified our codes to use more sophisticated nonblocking versions of MPI communication. This significantly improved the performance robustness (independent of the MPI buffering mechanism) and scalability, but at the cost of increased code complexity.

  8. All-optical 1st- and 2nd-order differential equation solvers with large tuning ranges using Fabry-Pérot semiconductor optical amplifiers.

    PubMed

    Chen, Kaisheng; Hou, Jie; Huang, Zhuyang; Cao, Tong; Zhang, Jihua; Yu, Yuan; Zhang, Xinliang

    2015-02-09

    We experimentally demonstrate an all-optical temporal computation scheme for solving 1st- and 2nd-order linear ordinary differential equations (ODEs) with tunable constant coefficients by using Fabry-Pérot semiconductor optical amplifiers (FP-SOAs). By changing the injection currents of the FP-SOAs, the constant coefficients of the differential equations are practically tuned. A quite large constant coefficient tunable range from 0.0026/ps to 0.085/ps is achieved for the 1st-order differential equation. Moreover, the constant coefficient p of the 2nd-order ODE solver can be continuously tuned from 0.0216/ps to 0.158/ps, with the constant coefficient q correspondingly varying from 0.0000494/ps^2 to 0.006205/ps^2. Additionally, a theoretical model that combines the carrier density rate equation of the semiconductor optical amplifier (SOA) with the transfer function of the Fabry-Pérot (FP) cavity is exploited to analyze the solving processes. For both 1st- and 2nd-order solvers, excellent agreement between the numerical simulations and the experimental results is obtained. The FP-SOA-based all-optical differential-equation solvers can be easily integrated with other optical components based on InP/InGaAsP materials, such as lasers, modulators, photodetectors and waveguides, which can motivate the realization of complicated optical computing on a single integrated chip.
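    As a point of reference for what a 1st-order solver of this kind computes, the sketch below integrates dy/dt + k*y = x(t) numerically, where x is the input waveform, y the output, and k the tunable constant coefficient. The equation form is the usual one for such devices but is an assumption here, since the abstract does not write it out; the function name `solve_first_order` and the piecewise-constant treatment of the input are likewise illustrative and say nothing about the optical implementation.

```python
import math


def solve_first_order(x, k, dt, steps, y0=0.0):
    """Integrate dy/dt + k*y = x(t) and return the samples of y.

    The input x(t) is held constant over each step of length dt, for
    which the update below is exact. With k in 1/ps, dt is in ps.
    """
    decay = math.exp(-k * dt)   # homogeneous decay over one step
    gain = (1.0 - decay) / k    # response to a unit input held for dt
    y = [y0]
    for n in range(steps):
        y.append(y[-1] * decay + x(n * dt) * gain)
    return y


# Step response at the upper end of the reported tuning range (0.085/ps):
# the output relaxes towards 1/k with a time constant of 1/k picoseconds.
response = solve_first_order(lambda t: 1.0, k=0.085, dt=0.1, steps=2000)
```

    Tuning k over the reported range of 0.0026/ps to 0.085/ps changes the time constant 1/k from roughly 385 ps down to roughly 12 ps.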

  9. Chromatographic peak resolution using Microsoft Excel Solver. The merit of time shifting input arrays.

    PubMed

    Dasgupta, Purnendu K

    2008-12-05

    Resolution of overlapped chromatographic peaks is generally accomplished by modeling the peaks as Gaussian or modified Gaussian functions. It is possible, even preferable, to use actual single analyte input responses for this purpose and a nonlinear least squares minimization routine such as that provided by Microsoft Excel Solver can then provide the resolution. In practice, the quality of the results obtained varies greatly due to small shifts in retention time. I show here that such deconvolution can be considerably improved if one or more of the response arrays are iteratively shifted in time.
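    The idea in this abstract can be sketched outside a spreadsheet. The example below fits a measured mixture trace as a weighted sum of two single-analyte response arrays and tries a range of integer time shifts for one of them, keeping the shift with the smallest squared residual. It swaps Excel Solver's nonlinear minimization for a plain grid search with a closed-form two-parameter least-squares fit; all function names, the zero padding at the array edges, and the Gaussian test peaks are assumptions made for illustration.

```python
import math


def gaussian_peak(n, center, width):
    """Synthetic single-analyte response sampled at n points (illustrative)."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n)]


def shifted(signal, s):
    """Shift a sampled signal by s points (positive = later), zero padded."""
    n = len(signal)
    return [signal[i - s] if 0 <= i - s < n else 0.0 for i in range(n)]


def fit_two(a, b, m):
    """Least-squares c1, c2 for m ~ c1*a + c2*b, plus the squared residual."""
    saa = sum(x * x for x in a)
    sbb = sum(y * y for y in b)
    sab = sum(x * y for x, y in zip(a, b))
    sam = sum(x * z for x, z in zip(a, m))
    sbm = sum(y * z for y, z in zip(b, m))
    det = saa * sbb - sab * sab  # zero only if a and b are proportional
    c1 = (sam * sbb - sbm * sab) / det
    c2 = (sbm * saa - sam * sab) / det
    res = sum((z - c1 * x - c2 * y) ** 2 for x, y, z in zip(a, b, m))
    return c1, c2, res


def resolve_with_shift(a, b, m, max_shift=5):
    """Try integer shifts of array a; return (shift, c1, c2, residual)."""
    best = None
    for s in range(-max_shift, max_shift + 1):
        c1, c2, res = fit_two(shifted(a, s), b, m)
        if best is None or res < best[3]:
            best = (s, c1, c2, res)
    return best
```

    With no shifting (max_shift=0) a small retention-time drift in one analyte is absorbed into wrong coefficients, which is the failure mode the abstract describes; allowing the shift lets the fit recover both the drift and the amounts.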

  10. FELIX-2.0: New version of the finite element solver for the time dependent generator coordinate method with the Gaussian overlap approximation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Regnier, D.; Dubray, N.; Verriere, M.

    The time-dependent generator coordinate method (TDGCM) is a powerful method to study the large amplitude collective motion of quantum many-body systems such as atomic nuclei. Under the Gaussian Overlap Approximation (GOA), the TDGCM leads to a local, time-dependent Schrödinger equation in a multi-dimensional collective space. In this study, we present version 2.0 of the code FELIX that solves the collective Schrödinger equation in a finite element basis. This new version features: (i) the ability to solve a generalized TDGCM+GOA equation with a metric term in the collective Hamiltonian, (ii) support for new kinds of finite elements and different types of quadrature to compute the discretized Hamiltonian and overlap matrices, (iii) the possibility to leverage the spectral element scheme, (iv) an explicit Krylov approximation of the time propagator for time integration instead of the implicit Crank–Nicolson method implemented in the first version, (v) an entirely redesigned workflow. We benchmark this release on an analytic problem as well as on realistic two-dimensional calculations of the low-energy fission of 240Pu and 256Fm. Low to moderate numerical precision calculations are most efficiently performed with simplex elements with a degree 2 polynomial basis. Higher precision calculations should instead use the spectral element method with a degree 4 polynomial basis. Finally, we emphasize that in a realistic calculation of fission mass distributions of 240Pu, FELIX-2.0 is about 20 times faster than its previous release (within a numerical precision of a few percent).

  11. FELIX-2.0: New version of the finite element solver for the time dependent generator coordinate method with the Gaussian overlap approximation

    NASA Astrophysics Data System (ADS)

    Regnier, D.; Dubray, N.; Verrière, M.; Schunck, N.

    2018-04-01

    The time-dependent generator coordinate method (TDGCM) is a powerful method to study the large amplitude collective motion of quantum many-body systems such as atomic nuclei. Under the Gaussian Overlap Approximation (GOA), the TDGCM leads to a local, time-dependent Schrödinger equation in a multi-dimensional collective space. In this paper, we present version 2.0 of the code FELIX that solves the collective Schrödinger equation in a finite element basis. This new version features: (i) the ability to solve a generalized TDGCM+GOA equation with a metric term in the collective Hamiltonian, (ii) support for new kinds of finite elements and different types of quadrature to compute the discretized Hamiltonian and overlap matrices, (iii) the possibility to leverage the spectral element scheme, (iv) an explicit Krylov approximation of the time propagator for time integration instead of the implicit Crank-Nicolson method implemented in the first version, (v) an entirely redesigned workflow. We benchmark this release on an analytic problem as well as on realistic two-dimensional calculations of the low-energy fission of 240Pu and 256Fm. Low to moderate numerical precision calculations are most efficiently performed with simplex elements with a degree 2 polynomial basis. Higher precision calculations should instead use the spectral element method with a degree 4 polynomial basis. We emphasize that in a realistic calculation of fission mass distributions of 240Pu, FELIX-2.0 is about 20 times faster than its previous release (within a numerical precision of a few percent).

  12. FELIX-2.0: New version of the finite element solver for the time dependent generator coordinate method with the Gaussian overlap approximation

    DOE PAGES

    Regnier, D.; Dubray, N.; Verriere, M.; ...

    2017-12-20

    The time-dependent generator coordinate method (TDGCM) is a powerful method to study the large amplitude collective motion of quantum many-body systems such as atomic nuclei. Under the Gaussian Overlap Approximation (GOA), the TDGCM leads to a local, time-dependent Schrödinger equation in a multi-dimensional collective space. In this study, we present version 2.0 of the code FELIX that solves the collective Schrödinger equation in a finite element basis. This new version features: (i) the ability to solve a generalized TDGCM+GOA equation with a metric term in the collective Hamiltonian, (ii) support for new kinds of finite elements and different types of quadrature to compute the discretized Hamiltonian and overlap matrices, (iii) the possibility to leverage the spectral element scheme, (iv) an explicit Krylov approximation of the time propagator for time integration instead of the implicit Crank–Nicolson method implemented in the first version, (v) an entirely redesigned workflow. We benchmark this release on an analytic problem as well as on realistic two-dimensional calculations of the low-energy fission of 240Pu and 256Fm. Low to moderate numerical precision calculations are most efficiently performed with simplex elements with a degree 2 polynomial basis. Higher precision calculations should instead use the spectral element method with a degree 4 polynomial basis. Finally, we emphasize that in a realistic calculation of fission mass distributions of 240Pu, FELIX-2.0 is about 20 times faster than its previous release (within a numerical precision of a few percent).

  13. An unstructured mesh arbitrary Lagrangian-Eulerian unsteady incompressible flow solver and its application to insect flight aerodynamics

    NASA Astrophysics Data System (ADS)

    Su, Xiaohui; Cao, Yuanwei; Zhao, Yong

    2016-06-01

    In this paper, an unstructured mesh Arbitrary Lagrangian-Eulerian (ALE) incompressible flow solver is developed to investigate the aerodynamics of insect hovering flight. The proposed finite-volume ALE Navier-Stokes solver is based on the artificial compressibility method (ACM) with a high-resolution method of characteristics-based scheme on unstructured grids. The present ALE model is validated and assessed through flow past an oscillating cylinder. Good agreement with experimental results and other numerical solutions is obtained, which demonstrates the accuracy and the capability of the present model. The lift generation mechanisms of a 2D wing in hovering motion, including wake capture, delayed stall, rapid pitch, as well as clap and fling, are then studied and illustrated using the current ALE model. Moreover, the optimized angular amplitude in the symmetric model, 45°, is reported in detail for the first time using averaged lift and the energy power method. In addition, the lift generation of the complete cyclic clap and fling motion, which few researchers have simulated with the ALE method because of the large deformation involved, is studied and clarified for the first time. The present ALE model is found to be a useful tool for investigating lift generation mechanisms in insect flight.

  14. A new optimization method using a compressed sensing inspired solver for real-time LDR-brachytherapy treatment planning

    NASA Astrophysics Data System (ADS)

    Guthier, C.; Aschenbrenner, K. P.; Buergy, D.; Ehmann, M.; Wenz, F.; Hesser, J. W.

    2015-03-01

    This work discusses a novel strategy for inverse planning in low dose rate brachytherapy. It applies the idea of compressed sensing to the problem of inverse treatment planning and a new solver for this formulation is developed. An inverse planning algorithm was developed incorporating brachytherapy dose calculation methods as recommended by AAPM TG-43. For optimization of the functional a new variant of a matching pursuit type solver is presented. The results are compared with current state-of-the-art inverse treatment planning algorithms by means of real prostate cancer patient data. The novel strategy outperforms the best state-of-the-art methods in speed, while achieving comparable quality. It is able to find solutions with comparable values for the objective function and it achieves these results within a few microseconds, being up to 542 times faster than competing state-of-the-art strategies, allowing real-time treatment planning. The sparse solution of inverse brachytherapy planning achieved with methods from compressed sensing is a new paradigm for optimization in medical physics. Through the sparsity of required needles and seeds identified by this method, the cost of intervention may be reduced.

  15. A new optimization method using a compressed sensing inspired solver for real-time LDR-brachytherapy treatment planning.

    PubMed

    Guthier, C; Aschenbrenner, K P; Buergy, D; Ehmann, M; Wenz, F; Hesser, J W

    2015-03-21

    This work discusses a novel strategy for inverse planning in low dose rate brachytherapy. It applies the idea of compressed sensing to the problem of inverse treatment planning and a new solver for this formulation is developed. An inverse planning algorithm was developed incorporating brachytherapy dose calculation methods as recommended by AAPM TG-43. For optimization of the functional a new variant of a matching pursuit type solver is presented. The results are compared with current state-of-the-art inverse treatment planning algorithms by means of real prostate cancer patient data. The novel strategy outperforms the best state-of-the-art methods in speed, while achieving comparable quality. It is able to find solutions with comparable values for the objective function and it achieves these results within a few microseconds, being up to 542 times faster than competing state-of-the-art strategies, allowing real-time treatment planning. The sparse solution of inverse brachytherapy planning achieved with methods from compressed sensing is a new paradigm for optimization in medical physics. Through the sparsity of required needles and seeds identified by this method, the cost of intervention may be reduced.

  16. The a(3) Scheme--A Fourth-Order Space-Time Flux-Conserving and Neutrally Stable CESE Solver

    NASA Technical Reports Server (NTRS)

    Chang, Sin-Chung

    2008-01-01

    The CESE development is driven by a belief that a solver should (i) enforce conservation laws in both space and time, and (ii) be built from a non-dissipative (i.e., neutrally stable) core scheme so that the numerical dissipation can be controlled effectively. To initiate a systematic CESE development of high order schemes, in this paper we provide a thorough discussion on the structure, consistency, stability, phase error, and accuracy of a new 4th-order space-time flux-conserving and neutrally stable CESE solver of a 1D scalar advection equation. The space-time stencil of this two-level explicit scheme is formed by one point at the upper time level and three points at the lower time level. Because it is associated with three independent mesh variables (the numerical analogues of the dependent variable and its 1st-order and 2nd-order spatial derivatives, respectively) and three equations per mesh point, the new scheme is referred to as the a(3) scheme. Through the von Neumann analysis, it is shown that the a(3) scheme is stable if and only if the Courant number is less than 0.5. Moreover, it is established numerically that the a(3) scheme is 4th-order accurate.

  17. A speciation solver for cement paste modeling and the semismooth Newton method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Georget, Fabien, E-mail: fabieng@princeton.edu; Prévost, Jean H., E-mail: prevost@princeton.edu; Vanderbei, Robert J., E-mail: rvdb@princeton.edu

    2015-02-15

    The mineral assemblage of a cement paste may vary considerably with its environment. In addition, the water content of a cement paste is relatively low and the ionic strength of the interstitial solution is often high. These conditions are extreme with respect to the common assumptions made in speciation problems. Furthermore, the common trial and error algorithm to find the phase assemblage does not provide any guarantee of convergence. We propose a speciation solver based on a semismooth Newton method adapted to the thermodynamic modeling of cement paste. The strong theoretical properties associated with these methods offer practical advantages. Results of numerical experiments indicate that the algorithm is reliable, robust, and efficient.

  18. A Riemann solver for RANS

    NASA Astrophysics Data System (ADS)

    Chuvakhov, P. V.

    2014-01-01

    An exact expression for a system of both eigenvalues and right/left eigenvectors of a Jacobian matrix for a convective two-equation differential closure RANS operator split along a curvilinear coordinate is derived. It is shown by examples of numerical modeling of supersonic flows over a flat plate and a compression corner with separation that application of the exact system of eigenvalues and eigenvectors to the Roe approach for approximate solution of the Riemann problem gives rise to an increase in the convergence rate, better stability and higher accuracy of a steady-state solution in comparison with those in the case of an approximate system.

  19. Fast solver for large scale eddy current non-destructive evaluation problems

    NASA Astrophysics Data System (ADS)

    Lei, Naiguang

    Eddy current testing plays a very important role in non-destructive evaluations of conducting test samples. Based on Faraday's law, an alternating magnetic field source generates induced currents, called eddy currents, in an electrically conducting test specimen. The eddy currents generate induced magnetic fields that oppose the direction of the inducing magnetic field in accordance with Lenz's law. In the presence of discontinuities in material property or defects in the test specimen, the induced eddy current paths are perturbed and the associated magnetic fields can be detected by coils or magnetic field sensors, such as Hall elements or magneto-resistance sensors. Due to the complexity of the test specimen and the inspection environments, the availability of theoretical simulation models is extremely valuable for studying the basic field/flaw interactions in order to obtain a fuller understanding of non-destructive testing phenomena. Theoretical models of the forward problem are also useful for training and validation of automated defect detection systems. Theoretical models generate defect signatures that are expensive to replicate experimentally. In general, modelling methods can be classified into two categories: analytical and numerical. Although analytical approaches offer closed-form solutions, such solutions are generally not obtainable, largely because of the complex sample and defect geometries, especially in three-dimensional space. Numerical modelling has become popular with advances in computer technology and computational methods. However, because large-scale problems are very time-consuming to solve, acceleration techniques and fast solvers are needed to enhance numerical models. This dissertation describes a numerical simulation model for eddy current problems using finite element analysis. Validation of the accuracy of this model is demonstrated via comparison with experimental measurements of steam generator tube wall defects. These simulations generating two

  20. Approximate Dynamic Programming: Combining Regional and Local State Following Approximations.

    PubMed

    Deptula, Patryk; Rosenfeld, Joel A; Kamalapurkar, Rushikesh; Dixon, Warren E

    2018-06-01

    An infinite-horizon optimal regulation problem for a control-affine deterministic system is solved online using a local state following (StaF) kernel and a regional model-based reinforcement learning (R-MBRL) method to approximate the value function. Unlike traditional methods such as R-MBRL that aim to approximate the value function over a large compact set, the StaF kernel approach aims to approximate the value function in a local neighborhood of the state that travels within a compact set. In this paper, the value function is approximated using a state-dependent convex combination of the StaF-based and the R-MBRL-based approximations. As the state enters a neighborhood containing the origin, the value function transitions from being approximated by the StaF approach to the R-MBRL approach. Semiglobal uniformly ultimately bounded (SGUUB) convergence of the system states to the origin is established using a Lyapunov-based analysis. Simulation results are provided for two, three, six, and ten-state dynamical systems to demonstrate the scalability and performance of the developed method.

  1. An effective lattice Boltzmann flux solver on arbitrarily unstructured meshes

    NASA Astrophysics Data System (ADS)

    Wu, Qi-Feng; Shu, Chang; Wang, Yan; Yang, Li-Ming

    2018-05-01

    The recently proposed lattice Boltzmann flux solver (LBFS) is a new approach for the simulation of incompressible flow problems. It applies the finite volume method (FVM) to discretize the governing equations, and the flux at the cell interface is evaluated by local reconstruction of lattice Boltzmann solution from macroscopic flow variables at cell centers. In the previous application of the LBFS, the structured meshes have been commonly employed, which may cause inconvenience for problems with complex geometries. In this paper, the LBFS is extended to arbitrarily unstructured meshes for effective simulation of incompressible flows. Two test cases, the lid-driven flow in a triangular cavity and flow around a circular cylinder, are carried out for validation. The obtained results are compared with the data available in the literature. Good agreement has been achieved, which demonstrates the effectiveness and reliability of the LBFS in simulating flows on arbitrarily unstructured meshes.

  2. Directional Agglomeration Multigrid Techniques for High Reynolds Number Viscous Flow Solvers

    NASA Technical Reports Server (NTRS)

    1998-01-01

    A preconditioned directional-implicit agglomeration algorithm is developed for solving two- and three-dimensional viscous flows on highly anisotropic unstructured meshes of mixed-element types. The multigrid smoother consists of a pre-conditioned point- or line-implicit solver which operates on lines constructed in the unstructured mesh using a weighted graph algorithm. Directional coarsening or agglomeration is achieved using a similar weighted graph algorithm. A tight coupling of the line construction and directional agglomeration algorithms enables the use of aggressive coarsening ratios in the multigrid algorithm, which in turn reduces the cost of a multigrid cycle. Convergence rates which are independent of the degree of grid stretching are demonstrated in both two and three dimensions. Further improvement of the three-dimensional convergence rates through a GMRES technique is also demonstrated.

  3. LES of Swirling Reacting Flows via the Unstructured scalar-FDF Solver

    NASA Astrophysics Data System (ADS)

    Ansari, Naseem; Pisciuneri, Patrick; Strakey, Peter; Givi, Peyman

    2011-11-01

    Swirling flames pose a significant challenge for computational modeling due to the presence of recirculation regions and vortex shedding. In this work, results are presented of LES of two swirl stabilized non-premixed flames (SM1 and SM2) via the FDF methodology. These flames are part of the database for validation of turbulent-combustion models. The scalar-FDF is simulated on a domain discretized by unstructured meshes, and is coupled with a finite volume flow solver. In the SM1 flame (with a low swirl number) chemistry is described by the flamelet model based on the full GRI 2.11 mechanism. The SM2 flame (with a high swirl number) is simulated via a 46-step 17-species mechanism. The simulated results are assessed via comparison with experimental data.

  4. Algorithm 971: An Implementation of a Randomized Algorithm for Principal Component Analysis

    PubMed Central

    LI, HUAMIN; LINDERMAN, GEORGE C.; SZLAM, ARTHUR; STANTON, KELLY P.; KLUGER, YUVAL; TYGERT, MARK

    2017-01-01

    Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis and the calculation of truncated singular value decompositions. The present article presents an essentially black-box, foolproof implementation for Mathworks’ MATLAB, a popular software platform for numerical computation. As illustrated via several tests, the randomized algorithms for low-rank approximation outperform or at least match the classical deterministic techniques (such as Lanczos iterations run to convergence) in basically all respects: accuracy, computational efficiency (both speed and memory usage), ease-of-use, parallelizability, and reliability. However, the classical procedures remain the methods of choice for estimating spectral norms and are far superior for calculating the least singular values and corresponding singular vectors (or singular subspaces). PMID:28983138

  5. Divergence-free MHD on unstructured meshes using high order finite volume schemes based on multidimensional Riemann solvers

    NASA Astrophysics Data System (ADS)

    Balsara, Dinshaw S.; Dumbser, Michael

    2015-10-01

    Several advances have been reported in the recent literature on divergence-free finite volume schemes for Magnetohydrodynamics (MHD). Almost all of these advances are restricted to structured meshes. To retain full geometric versatility, however, it is also very important to make analogous advances in divergence-free schemes for MHD on unstructured meshes. Such schemes utilize a staggered Yee-type mesh, where all hydrodynamic quantities (mass, momentum and energy density) are cell-centered, while the magnetic fields are face-centered and the electric fields, which are so useful for the time update of the magnetic field, are centered at the edges. Three important advances are brought together in this paper in order to make it possible to have high order accurate finite volume schemes for the MHD equations on unstructured meshes. First, it is shown that a divergence-free WENO reconstruction of the magnetic field can be developed for unstructured meshes in two and three space dimensions using a classical cell-centered WENO algorithm, without the need to do a WENO reconstruction for the magnetic field on the faces. This is achieved via a novel constrained L2-projection operator that is used in each time step as a postprocessor of the cell-centered WENO reconstruction so that the magnetic field becomes locally and globally divergence free. Second, it is shown that recently-developed genuinely multidimensional Riemann solvers (called MuSIC Riemann solvers) can be used on unstructured meshes to obtain a multidimensionally upwinded representation of the electric field at each edge. Third, the above two innovations work well together with a high order accurate one-step ADER time stepping strategy, which requires the divergence-free nonlinear WENO reconstruction procedure to be carried out only once per time step. The resulting divergence-free ADER-WENO schemes with MuSIC Riemann solvers give us an efficient and easily-implemented strategy for divergence-free MHD on

  6. Countably QC-Approximating Posets

    PubMed Central

    Mao, Xuxin; Xu, Luoshan

    2014-01-01

    As a generalization of countably C-approximating posets, the concept of countably QC-approximating posets is introduced. With the countably QC-approximating property, some characterizations of generalized completely distributive lattices and generalized countably approximating posets are given. The main results are as follows: (1) a complete lattice is generalized completely distributive if and only if it is countably QC-approximating and weakly generalized countably approximating; (2) a poset L having countably directed joins is generalized countably approximating if and only if the lattice σ c(L)op of all σ-Scott-closed subsets of L is weakly generalized countably approximating. PMID:25165730

  7. A simplified analysis of the multigrid V-cycle as a fast elliptic solver

    NASA Technical Reports Server (NTRS)

    Decker, Naomi H.; Taasan, Shlomo

    1988-01-01

    For special model problems, Fourier analysis gives exact convergence rates for the two-grid multigrid cycle and, for more general problems, provides estimates of the two-grid convergence rates via local mode analysis. A method is presented for obtaining multigrid convergence rate estimates for cycles involving more than two grids (using essentially the same analysis as for the two-grid cycle). For the simple case of the V-cycle used as a fast Laplace solver on the unit square, the k-grid convergence rate bounds obtained by this method are sharper than the bounds predicted by the variational theory. Both theoretical justification and experimental evidence are presented.
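
    To make the V-cycle concrete, here is a minimal sketch under simplifying assumptions that are not taken from the paper: a 1-D model problem -u'' = f on (0, 1) with zero boundary values (the abstract treats the unit square), weighted-Jacobi smoothing with two pre- and two post-sweeps, full-weighting restriction, and linear interpolation. All function names are illustrative.

```python
import math


def residual(u, f, h):
    """r = f - A u for the 1-D model problem -u'' = f with u(0) = u(1) = 0."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r


def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted-Jacobi sweeps: u <- u + omega * D^{-1} * (f - A u), D = 2 / h^2."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * 0.5 * h * h * ri for ui, ri in zip(u, r)]
    return u


def v_cycle(u, f, h):
    """One V-cycle on a grid with len(u) = 2**k - 1 interior points."""
    n = len(u)
    if n == 1:
        return [f[0] * h * h / 2.0]  # exact solve on the coarsest grid
    u = smooth(u, f, h)
    r = residual(u, f, h)
    # Full-weighting restriction to the (n - 1) // 2 coarse interior points.
    nc = (n - 1) // 2
    rc = [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
          for j in range(nc)]
    ec = v_cycle([0.0] * nc, rc, 2.0 * h)
    # Linear interpolation of the coarse-grid correction back to the fine grid.
    e = [0.0] * n
    for j in range(nc):
        e[2 * j + 1] += ec[j]
        e[2 * j] += 0.5 * ec[j]
        e[2 * j + 2] += 0.5 * ec[j]
    u = [ui + ei for ui, ei in zip(u, e)]
    return smooth(u, f, h)


# Demo: f = pi^2 sin(pi x), whose exact solution is u = sin(pi x).
n = 63
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
u = [0.0] * n
r0 = max(abs(v) for v in residual(u, f, h))
for _ in range(10):
    u = v_cycle(u, f, h)
r10 = max(abs(v) for v in residual(u, f, h))
```

    The ratio of successive residual norms gives an observed per-cycle convergence factor, which is the kind of quantity the Fourier and local mode analyses aim to predict.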

  8. Stochastic Partial Differential Equation Solver for Hydroacoustic Modeling: Improvements to Paracousti Sound Propagation Solver

    NASA Astrophysics Data System (ADS)

    Preston, L. A.

    2017-12-01

    Marine hydrokinetic (MHK) devices offer a clean, renewable alternative energy source for the future. Responsible utilization of MHK devices, however, requires that the effects of acoustic noise produced by these devices on marine life and marine-related human activities be well understood. Paracousti is a 3-D full waveform acoustic modeling suite that can accurately propagate MHK noise signals in the complex bathymetry found in the near-shore to open ocean environment and considers real properties of the seabed, water column, and air-surface interface. However, this is a deterministic simulation that assumes the environment and source are exactly known. In reality, environmental and source characteristics are often only known in a statistical sense. Thus, to fully characterize the expected noise levels within the marine environment, this uncertainty in environmental and source factors should be incorporated into the acoustic simulations. One method is to use Monte Carlo (MC) techniques where simulation results from a large number of deterministic solutions are aggregated to provide statistical properties of the output signal. However, MC methods can be computationally prohibitive since they can require tens of thousands or more simulations to build up an accurate representation of those statistical properties. An alternative method, using the technique of stochastic partial differential equations (SPDE), allows computation of the statistical properties of output signals at a small fraction of the computational cost of MC. We are developing a SPDE solver for the 3-D acoustic wave propagation problem called Paracousti-UQ to help regulators and operators assess the statistical properties of environmental noise produced by MHK devices. In this presentation, we present the SPDE method and compare statistical distributions of simulated acoustic signals in simple models to MC simulations to show the accuracy and efficiency of the SPDE method. 
Sandia National Laboratories

  9. A Depth-Averaged 2-D Simulation for Coastal Barrier Breaching Processes

    DTIC Science & Technology

    2011-05-01

    including bed change and variable flow density in the flow continuity and momentum equations. The model adopts the HLL approximate Riemann solver to handle the mixed-regime flows near...Keulegan equation or the Bernoulli equation, and the breach morphological change is determined using simplified sediment transport models
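
    For readers unfamiliar with the HLL approximate Riemann solver named in this snippet, the following is a minimal sketch for the plain 1-D shallow-water equations. It assumes a fixed bed, constant density, wet states, and simple characteristic-speed bounds, so it does not reproduce the report's model, which includes bed change and variable flow density. Function names are illustrative.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def physical_flux(h, hu):
    """Shallow-water flux F(U) for U = (h, hu), fixed bed, constant density."""
    u = hu / h
    return (hu, hu * u + 0.5 * G * h * h)


def hll_flux(left, right):
    """HLL numerical flux between two wet states (h > 0), each given as (h, hu)."""
    hl, hul = left
    hr, hur = right
    ul, ur = hul / hl, hur / hr
    cl, cr = math.sqrt(G * hl), math.sqrt(G * hr)
    # Simple wave-speed bounds from the left and right characteristic speeds.
    s_left = min(ul - cl, ur - cr)
    s_right = max(ul + cl, ur + cr)
    fl = physical_flux(hl, hul)
    fr = physical_flux(hr, hur)
    if s_left >= 0.0:   # all waves move right: take the left flux
        return fl
    if s_right <= 0.0:  # all waves move left: take the right flux
        return fr
    # Subcritical or mixed case: flux through the single averaged middle state.
    return tuple(
        (s_right * a - s_left * b + s_left * s_right * (q_r - q_l))
        / (s_right - s_left)
        for a, b, q_l, q_r in zip(fl, fr, left, right)
    )


# Dam-break style interface: deeper still water on the left, shallower on the right.
dam_break_flux = hll_flux((2.0, 0.0), (1.0, 0.0))
```

    Because the same formula covers subcritical and supercritical states through the two wave-speed bounds, no special switching logic is needed at transitions between flow regimes.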

  10. Incremental planning to control a blackboard-based problem solver

    NASA Technical Reports Server (NTRS)

    Durfee, E. H.; Lesser, V. R.

    1987-01-01

    To control problem solving activity, a planner must resolve uncertainty about which specific long-term goals (solutions) to pursue and about which sequences of actions will best achieve those goals. A planner is described that abstracts the problem solving state to recognize possible competing and compatible solutions and to roughly predict the importance and expense of developing these solutions. With this information, the planner plans sequences of problem solving activities that most efficiently resolve its uncertainty about which of the possible solutions to work toward. The planner only details actions for the near future because the results of these actions will influence how (and whether) a plan should be pursued. As problem solving proceeds, the planner adds new details to the plan incrementally, and monitors and repairs the plan to insure it achieves its goals whenever possible. Through experiments, researchers illustrate how these new mechanisms significantly improve problem solving decisions and reduce overall computation. They briefly discuss current research directions, including how these mechanisms can improve a problem solver's real-time response and can enhance cooperation in a distributed problem solving network.

  11. Solute solver 'what if' module for modeling urea kinetics.

    PubMed

    Daugirdas, John T

    2016-11-01

    The publicly available Solute Solver module allows calculation of a variety of two-pool urea kinetic measures of dialysis adequacy using pre- and postdialysis plasma urea and estimated dialyzer clearance or estimated urea distribution volumes as inputs. However, the existing program does not have a 'what if' module, which would estimate the plasma urea values as well as commonly used measures of hemodialysis adequacy for a patient with a given urea distribution volume and urea nitrogen generation rate dialyzed according to a particular dialysis schedule. Conventional variable extracellular volume 2-pool urea kinetic equations were used. A javascript-HTML Web form was created that can be used on any personal computer equipped with internet browsing software, to compute commonly used Kt/V-based measures of hemodialysis adequacy for patients with differing amounts of residual kidney function and following a variety of treatment schedules. The completed Web form calculator may be particularly useful in computing equivalent continuous clearances for incremental hemodialysis strategies. © The Author 2016. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.

  12. A generalized Poisson solver for first-principles device simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bani-Hashemian, Mohammad Hossein; VandeVondele, Joost, E-mail: joost.vandevondele@mat.ethz.ch; Brück, Sascha

    2016-01-28

    Electronic structure calculations of atomistic systems based on density functional theory involve solving the Poisson equation. In this paper, we present a plane-wave based algorithm for solving the generalized Poisson equation subject to periodic or homogeneous Neumann conditions on the boundaries of the simulation cell and Dirichlet type conditions imposed at arbitrary subdomains. In this way, source, drain, and gate voltages can be imposed across atomistic models of electronic devices. Dirichlet conditions are enforced as constraints in a variational framework giving rise to a saddle point problem. The resulting system of equations is then solved using a stationary iterative method in which the generalized Poisson operator is preconditioned with the standard Laplace operator. The solver can make use of any sufficiently smooth function modelling the dielectric constant, including density dependent dielectric continuum models. For all the boundary conditions, consistent derivatives are available and molecular dynamics simulations can be performed. The convergence behaviour of the scheme is investigated and its capabilities are demonstrated.

  13. A Conformal, Fully-Conservative Approach for Predicting Blast Effects on Ground Vehicles

    DTIC Science & Technology

    2014-04-01

    time integration; Approximate Riemann Fluxes (HLLE, HLLC); Robust mixture model for multi-material flows; Multiple Equations of State; Perfect Gas...Loci/CHEM: Chemically reacting compressible flow solver; Currently in production use by NASA for the simulation of rocket motors, plumes, and...vehicles; Loci/DROPLET: Eulerian and Lagrangian multiphase solvers; Loci/STREAM: pressure-based solver; Developed by Streamline Numerics and

  14. Application of Biot-Savart Solver to Predict Axis Switching Phenomena in Finite-Span Vortices Expelled from a Synthetic Jet

    NASA Astrophysics Data System (ADS)

    Straccia, Joseph; Farnsworth, John

    2016-11-01

    The Biot-Savart law is a simple yet powerful inviscid and incompressible relationship between the velocity induced at a point and the circulation, orientation and distance of separation of a vortex line. The authors have developed an algorithm for obtaining numerical solutions of the Biot-Savart relationship to predict the self-induced velocity on a vortex line of arbitrary shape. In this work the Biot-Savart solver was used to predict the self-induced propagation of non-circular, finite-span vortex rings expelled from synthetic jets with rectangular orifices of varying aspect ratios. The solver's prediction of the time varying shape of the vortex ring and frequency of axis switching was then compared with Particle Image Velocimetry (PIV) data from a synthetic jet expelled into a quiescent flow i.e. zero cross flow condition. Conclusions about the effectiveness and limitations of this simple, inviscid relationship are drawn from this experimental data. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE 1144083.
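
    As an illustration of the Biot-Savart relationship described above, the sketch below sums the velocity induced by straight vortex segments around a closed polygonal loop. It is not the authors' algorithm: the segment formula is the standard closed-form result for a straight filament, the helper names are hypothetical, and singular contributions are simply skipped, whereas a self-induced-velocity solver would need a core or cutoff model.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def segment_velocity(p, a, b, gamma, eps=1e-12):
    """Velocity induced at p by a straight vortex segment a -> b of circulation gamma."""
    r1, r2, r0 = sub(p, a), sub(p, b), sub(b, a)
    c = cross(r1, r2)
    c2 = dot(c, c)
    n1, n2 = math.sqrt(dot(r1, r1)), math.sqrt(dot(r2, r2))
    if c2 < eps or n1 < eps or n2 < eps:
        # p lies on the segment's axis: skip the singular term (assumed cutoff).
        return (0.0, 0.0, 0.0)
    k = gamma / (4.0 * math.pi * c2) * (dot(r0, r1) / n1 - dot(r0, r2) / n2)
    return (k * c[0], k * c[1], k * c[2])


def loop_velocity(p, points, gamma):
    """Sum segment contributions around a closed polygonal vortex loop."""
    v = [0.0, 0.0, 0.0]
    for i, a in enumerate(points):
        b = points[(i + 1) % len(points)]
        dv = segment_velocity(p, a, b, gamma)
        v = [v[j] + dv[j] for j in range(3)]
    return tuple(v)


# Check against the known result at the centre of a circular ring: |v| = gamma / (2 R).
R, GAMMA, N = 1.0, 1.0, 200
ring = [(R * math.cos(2 * math.pi * i / N), R * math.sin(2 * math.pi * i / N), 0.0)
        for i in range(N)]
v_centre = loop_velocity((0.0, 0.0, 0.0), ring, GAMMA)
```

    Replacing the circular ring with a rectangular or elliptical loop and advecting its points with the induced velocity is the general kind of calculation that exhibits axis switching, subject to the choice of core model.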

  15. Fast Multilevel Solvers for a Class of Discrete Fourth Order Parabolic Problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zheng, Bin; Chen, Luoping; Hu, Xiaozhe

    2016-03-05

    In this paper, we study fast iterative solvers for the solution of fourth order parabolic equations discretized by mixed finite element methods. We propose to use consistent mass matrix in the discretization and use lumped mass matrix to construct efficient preconditioners. We provide eigenvalue analysis for the preconditioned system and estimate the convergence rate of the preconditioned GMRes method. Furthermore, we show that these preconditioners only need to be solved inexactly by optimal multigrid algorithms. Our numerical examples indicate that the proposed preconditioners are very efficient and robust with respect to both discretization parameters and diffusion coefficients. We also investigate the performance of multigrid algorithms with either collective smoothers or distributive smoothers when solving the preconditioner systems.

  16. A Comparison of Solver Performance for Complex Gastric Electrophysiology Models

    PubMed Central

    Sathar, Shameer; Cheng, Leo K.; Trew, Mark L.

    2016-01-01

    Efficient computational techniques for solving the systems of equations arising in gastric electrophysiology have not been studied. We present a computationally challenging problem of simulating gastric electrophysiology in anatomically realistic stomach geometries with multiple intracellular and extracellular domains. The multiscale nature of the problem and the mesh resolution required to capture geometric and functional features necessitate efficient solution methods if the problem is to be tractable. In this study, we investigated and compared several parallel preconditioners for the linear systems arising from tetrahedral discretisation of electrically isotropic and anisotropic problems, with and without stimuli. The results showed that the isotropic problem was computationally less challenging than the anisotropic problem and that the application of extracellular stimuli increased workload considerably. Preconditioners based on block Jacobi and algebraic multigrid solvers were found to have the best overall solution times and the lowest iteration counts, respectively. The algebraic multigrid preconditioner would be expected to perform better on large problems. PMID:26736543

  17. A Block Preconditioned Conjugate Gradient-type Iterative Solver for Linear Systems in Thermal Reservoir Simulation

    NASA Astrophysics Data System (ADS)

    Betté, Srinivas; Diaz, Julio C.; Jines, William R.; Steihaug, Trond

    1986-11-01

    A preconditioned residual-norm-reducing iterative solver is described. Based on a truncated form of the generalized-conjugate-gradient method for nonsymmetric systems of linear equations, the iterative scheme is very effective for linear systems generated in reservoir simulation of thermal oil recovery processes. As a consequence of employing an adaptive implicit finite-difference scheme to solve the model equations, the number of variables per cell-block varies dynamically over the grid. The data structure allows for 5- and 9-point operators in the areal model, 5-point in the cross-sectional model, and 7- and 11-point operators in the three-dimensional model. Block-diagonal-scaling of the linear system, done prior to iteration, is found to have a significant effect on the rate of convergence. Block-incomplete-LU-decomposition (BILU) and block-symmetric-Gauss-Seidel (BSGS) methods, which result in no fill-in, are used as preconditioning procedures. A full factorization is done on the well terms, and the cells are ordered in a manner which minimizes the fill-in in the well-column due to this factorization. The convergence criterion for the linear (inner) iteration is linked to that of the nonlinear (Newton) iteration, thereby enhancing the efficiency of the computation. The algorithm, with both BILU and BSGS preconditioners, is evaluated in the context of a variety of thermal simulation problems. The solver is robust and can be used with little or no user intervention.

  18. A scalable geometric multigrid solver for nonsymmetric elliptic systems with application to variable-density flows

    NASA Astrophysics Data System (ADS)

    Esmaily, M.; Jofre, L.; Mani, A.; Iaccarino, G.

    2018-03-01

    A geometric multigrid algorithm is introduced for solving nonsymmetric linear systems resulting from the discretization of the variable density Navier-Stokes equations on nonuniform structured rectilinear grids and high-Reynolds number flows. The restriction operation is defined such that the resulting system on the coarser grids is symmetric, thereby allowing for the use of efficient smoother algorithms. To achieve an optimal rate of convergence, the sequence of interpolation and restriction operations are determined through a dynamic procedure. A parallel partitioning strategy is introduced to minimize communication while maintaining the load balance between all processors. To test the proposed algorithm, we consider two cases: 1) homogeneous isotropic turbulence discretized on uniform grids and 2) turbulent duct flow discretized on stretched grids. Testing the algorithm on systems with up to a billion unknowns shows that the cost varies linearly with the number of unknowns. This O(N) behavior confirms the robustness of the proposed multigrid method regarding ill-conditioning of large systems characteristic of multiscale high-Reynolds number turbulent flows. The robustness of our method to density variations is established by considering cases where density varies sharply in space by a factor of up to 10^4, showing its applicability to two-phase flow problems. Strong and weak scalability studies are carried out, employing up to 30,000 processors, to examine the parallel performance of our implementation. Excellent scalability of our solver is shown for a granularity as low as 10^4 to 10^5 unknowns per processor. At its tested peak throughput, it solves approximately 4 billion unknowns per second employing over 16,000 processors with a parallel efficiency higher than 50%.

  19. Reacting Multi-Species Gas Capability for USM3D Flow Solver

    NASA Technical Reports Server (NTRS)

    Frink, Neal T.; Schuster, David M.

    2012-01-01

    The USM3D Navier-Stokes flow solver contributed heavily to the NASA Constellation Project (CxP) as a highly productive computational tool for generating the aerodynamic databases for the Ares I and V launch vehicles and Orion launch abort vehicle (LAV). USM3D is currently limited to ideal-gas flows, which are not adequate for modeling the chemistry or temperature effects of hot-gas jet flows. This task was initiated to create an efficient implementation of multi-species gas and equilibrium chemistry into the USM3D code to improve its predictive capabilities for hot jet impingement effects. The goal of this NASA Engineering and Safety Center (NESC) assessment was to implement and validate a simulation capability to handle real-gas effects in the USM3D code. This document contains the outcome of the NESC assessment.

  20. PB-AM: An open-source, fully analytical linear poisson-boltzmann solver

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Felberg, Lisa E.; Brookes, David H.; Yap, Eng-Hui

    2016-11-02

    We present the open source distributed software package Poisson-Boltzmann Analytical Method (PB-AM), a fully analytical solution to the linearized Poisson Boltzmann equation. The PB-AM software package includes the generation of output files appropriate for visualization using VMD, a Brownian dynamics scheme that uses periodic boundary conditions to simulate dynamics, the ability to specify docking criteria, and offers two different kinetics schemes to evaluate biomolecular association rate constants. Given that PB-AM defines mutual polarization completely and accurately, it can be refactored as a many-body expansion to explore 2- and 3-body polarization. Additionally, the software has been integrated into the Adaptive Poisson-Boltzmann Solver (APBS) software package to make it more accessible to a larger group of scientists, educators and students that are more familiar with the APBS framework.

  1. An Immersed Boundary - Adaptive Mesh Refinement solver (IB-AMR) for high fidelity fully resolved wind turbine simulations

    NASA Astrophysics Data System (ADS)

    Angelidis, Dionysios; Sotiropoulos, Fotis

    2015-11-01

    The geometrical details of wind turbines determine the structure of the turbulence in the near and far wake and should be taken into account when performing high fidelity calculations. Multi-resolution simulations coupled with an immersed boundary method constitute a powerful framework for high-fidelity calculations past wind farms located over complex terrains. We develop a 3D Immersed-Boundary Adaptive Mesh Refinement flow solver (IB-AMR) which enables turbine-resolving LES of wind turbines. The idea of using a hybrid staggered/non-staggered grid layout adopted in the Curvilinear Immersed Boundary Method (CURVIB) has been successfully incorporated on unstructured meshes and the fractional step method has been employed. The overall performance and robustness of the second order accurate, parallel, unstructured solver is evaluated by comparing the numerical simulations against conforming grid calculations and experimental measurements of laminar and turbulent flows over complex geometries. We also present turbine-resolving multi-scale LES considering all the details affecting the induced flow field, including the geometry of the tower, the nacelle and especially the rotor blades of a wind tunnel scale turbine. This material is based upon work supported by the Department of Energy under Award Number DE-EE0005482 and the Sandia National Laboratories.

  2. Adaptive time stepping for fluid-structure interaction solvers

    DOE PAGES

    Mayr, M.; Wall, W. A.; Gee, M. W.

    2017-12-22

    In this work, a novel adaptive time stepping scheme for fluid-structure interaction (FSI) problems is proposed that allows for controlling the accuracy of the time-discrete solution. Furthermore, it eases practical computations by providing an efficient and very robust time step size selection. This has proven to be very useful, especially when addressing new physical problems, where no educated guess for an appropriate time step size is available. The fluid and the structure field, but also the fluid-structure interface are taken into account for the purpose of a posteriori error estimation, rendering it easy to implement and only adding negligible additional cost. The adaptive time stepping scheme is incorporated into a monolithic solution framework, but can straightforwardly be applied to partitioned solvers as well. The basic idea can be extended to the coupling of an arbitrary number of physical models. Accuracy and efficiency of the proposed method are studied in a variety of numerical examples ranging from academic benchmark tests to complex biomedical applications like the pulsatile blood flow through an abdominal aortic aneurysm. Finally, the demonstrated accuracy of the time-discrete solution in combination with reduced computational cost make this algorithm very appealing in all kinds of FSI applications.

  3. Adaptive time stepping for fluid-structure interaction solvers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mayr, M.; Wall, W. A.; Gee, M. W.

    In this work, a novel adaptive time stepping scheme for fluid-structure interaction (FSI) problems is proposed that allows for controlling the accuracy of the time-discrete solution. Furthermore, it eases practical computations by providing an efficient and very robust time step size selection. This has proven to be very useful, especially when addressing new physical problems, where no educated guess for an appropriate time step size is available. The fluid and the structure field, but also the fluid-structure interface are taken into account for the purpose of a posteriori error estimation, rendering it easy to implement and only adding negligible additional cost. The adaptive time stepping scheme is incorporated into a monolithic solution framework, but can straightforwardly be applied to partitioned solvers as well. The basic idea can be extended to the coupling of an arbitrary number of physical models. Accuracy and efficiency of the proposed method are studied in a variety of numerical examples ranging from academic benchmark tests to complex biomedical applications like the pulsatile blood flow through an abdominal aortic aneurysm. Finally, the demonstrated accuracy of the time-discrete solution in combination with reduced computational cost make this algorithm very appealing in all kinds of FSI applications.

  4. Localized Artificial Viscosity Stabilization of Discontinuous Galerkin Methods for Nonhydrostatic Mesoscale Atmospheric Modeling

    DTIC Science & Technology

    2014-01-01

    with a Riemann flux !"#! (!! ,!!!,), where !!! denotes the solution outside the current element !. Various (approximate) Riemann solvers ...can be used to calculate the Riemann flux, and the Rusanov Riemann solver is adopted in this paper. Then Eq. (7) can be rewritten as !
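
    The Rusanov (local Lax-Friedrichs) flux named in the excerpt above can be sketched for a scalar conservation law. The excerpt applies it to atmospheric flow equations inside a discontinuous Galerkin method; the scalar Burgers equation and the helper names below are stand-ins chosen only for illustration.

```python
def rusanov_flux(u_left, u_right, flux, max_speed):
    """Rusanov numerical flux at an element interface.

    Averages the physical fluxes of the two states and adds dissipation
    proportional to the largest local wave speed.
    """
    lam = max(max_speed(u_left), max_speed(u_right))
    central = 0.5 * (flux(u_left) + flux(u_right))
    return central - 0.5 * lam * (u_right - u_left)

def burgers_flux(u):
    """Physical flux of the inviscid Burgers equation, f(u) = u^2 / 2."""
    return 0.5 * u * u

def burgers_speed(u):
    """Magnitude of the characteristic speed, |f'(u)| = |u|."""
    return abs(u)

# Interface between an interior state 1.0 and an exterior state 0.0.
interface_flux = rusanov_flux(1.0, 0.0, burgers_flux, burgers_speed)
```

    When both states are equal the dissipation term vanishes and the numerical flux reduces to the physical flux, which is the consistency property expected of any Riemann flux.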

  5. GASPACHO: a generic automatic solver using proximal algorithms for convex huge optimization problems

    NASA Astrophysics Data System (ADS)

    Goossens, Bart; Luong, Hiêp; Philips, Wilfried

    2017-08-01

    Many inverse problems (e.g., demosaicking, deblurring, denoising, image fusion, HDR synthesis) share various similarities: degradation operators are often modeled by a specific data fitting function while image prior knowledge (e.g., sparsity) is incorporated by additional regularization terms. In this paper, we investigate automatic algorithmic techniques for evaluating proximal operators. These algorithmic techniques also enable efficient calculation of adjoints from linear operators in a general matrix-free setting. In particular, we study the simultaneous-direction method of multipliers (SDMM) and the parallel proximal algorithm (PPXA) solvers and show that the automatically derived implementations are well suited for both single-GPU and multi-GPU processing. We demonstrate this approach for an Electron Microscopy (EM) deconvolution problem.
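
    As a small illustration of what a proximal operator is, the sketch below writes out two closed-form cases by hand: the L1 sparsity prior and a quadratic data-fitting term. The paper derives such operators automatically; these hand-written versions and the names `prox_l1` and `prox_quadratic` are assumptions made for illustration and are not taken from GASPACHO.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t * ||x||_1, i.e. soft thresholding.

    Solves argmin_x  t * ||x||_1 + 0.5 * ||x - v||^2  element-wise.
    """
    v = np.asarray(v, dtype=float)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_quadratic(v, b, t):
    """Proximal operator of (t / 2) * ||x - b||^2.

    Solves argmin_x  (t / 2) * ||x - b||^2 + 0.5 * ||x - v||^2,
    whose minimizer is a weighted average of v and b.
    """
    v = np.asarray(v, dtype=float)
    b = np.asarray(b, dtype=float)
    return (v + t * b) / (1.0 + t)

shrunk = prox_l1([3.0, -0.5, 1.0], t=1.0)
blended = prox_quadratic([0.0, 2.0], b=[2.0, 2.0], t=1.0)
```

    Splitting solvers such as SDMM and PPXA evaluate one such operator per term of the objective, and those evaluations are independent of each other, which is what makes them suitable for parallel hardware.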

  6. PCTDSE: A parallel Cartesian-grid-based TDSE solver for modeling laser-atom interactions

    NASA Astrophysics Data System (ADS)

    Fu, Yongsheng; Zeng, Jiaolong; Yuan, Jianmin

    2017-01-01

    We present a parallel Cartesian-grid-based time-dependent Schrödinger equation (TDSE) solver for modeling laser-atom interactions. It can simulate the single-electron dynamics of atoms in arbitrary time-dependent vector potentials. We use a split-operator method combined with fast Fourier transforms (FFT), on a three-dimensional (3D) Cartesian grid. Parallelization is realized using a 2D decomposition strategy based on the Message Passing Interface (MPI) library, which results in a good parallel scaling on modern supercomputers. We give simple applications for the hydrogen atom using the benchmark problems coming from the references and obtain repeatable results. The extensions to other laser-atom systems are straightforward with minimal modifications of the source code.
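
    The split-operator FFT method mentioned above can be sketched in one dimension. This is a serial toy under stated assumptions, not the PCTDSE code: it uses a static harmonic potential in atomic units instead of a 3D grid with a time-dependent vector potential and MPI decomposition, and all names are hypothetical.

```python
import numpy as np

def split_operator_step(psi, potential, k, dt):
    """One Strang-split step of i dpsi/dt = (-1/2 d^2/dx^2 + V) psi.

    Half a potential step in real space, a full kinetic step in Fourier
    space, then another half potential step (atomic units).
    """
    half_v = np.exp(-0.5j * dt * potential)
    kinetic = np.exp(-0.5j * dt * k**2)
    psi = half_v * psi
    psi = np.fft.ifft(kinetic * np.fft.fft(psi))
    return half_v * psi

# Periodic grid and matching angular wavenumbers.
n, length = 256, 20.0
dx = length / n
x = (np.arange(n) - n // 2) * dx
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)

# Harmonic oscillator and its normalized ground state.
potential = 0.5 * x**2
psi0 = np.pi ** -0.25 * np.exp(-0.5 * x**2)

psi = psi0.astype(complex)
for _ in range(200):
    psi = split_operator_step(psi, potential, k, dt=0.01)

# Each factor is a pure phase, so the norm should be conserved, and the
# ground state should only pick up a global phase.
norm = np.sum(np.abs(psi) ** 2) * dx
overlap = abs(np.sum(np.conj(psi0) * psi) * dx)
```

    In three dimensions the same step uses `np.fft.fftn` and `np.fft.ifftn`; the FFTs are the part that a 2D domain decomposition has to distribute across processes.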

  7. Approximate symmetries of Hamiltonians

    NASA Astrophysics Data System (ADS)

    Chubb, Christopher T.; Flammia, Steven T.

    2017-08-01

    We explore the relationship between approximate symmetries of a gapped Hamiltonian and the structure of its ground space. We start by considering approximate symmetry operators, defined as unitary operators whose commutators with the Hamiltonian have norms that are sufficiently small. We show that approximate symmetry operators can be restricted to the ground space while approximately preserving certain mutual commutation relations. We generalize the Stone-von Neumann theorem to matrices that approximately satisfy the canonical (Heisenberg-Weyl-type) commutation relations and use this to show that approximate symmetry operators can certify the degeneracy of the ground space even though they only approximately form a group. Importantly, the notions of "approximate" and "small" are all independent of the dimension of the ambient Hilbert space and depend only on the degeneracy in the ground space. Our analysis additionally holds for any gapped band of sufficiently small width in the excited spectrum of the Hamiltonian, and we discuss applications of these ideas to topological quantum phases of matter and topological quantum error correcting codes. Finally, in our analysis, we also provide an exponential improvement upon bounds concerning the existence of shared approximate eigenvectors of approximately commuting operators under an added normality constraint, which may be of independent interest.

  8. Development and Verification of the Charring Ablating Thermal Protection Implicit System Solver

    NASA Technical Reports Server (NTRS)

    Amar, Adam J.; Calvert, Nathan D.; Kirk, Benjamin S.

    2010-01-01

    The development and verification of the Charring Ablating Thermal Protection Implicit System Solver is presented. This work concentrates on the derivation and verification of the stationary grid terms in the equations that govern three-dimensional heat and mass transfer for charring thermal protection systems including pyrolysis gas flow through the porous char layer. The governing equations are discretized according to the Galerkin finite element method with first and second order implicit time integrators. The governing equations are fully coupled and are solved in parallel via Newton's method, while the fully implicit linear system is solved with the Generalized Minimal Residual method. Verification results from exact solutions and the Method of Manufactured Solutions are presented to show spatial and temporal orders of accuracy as well as nonlinear convergence rates.
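
    Verification of spatial order of accuracy with the Method of Manufactured Solutions can be sketched on a much simpler problem than the one in the abstract. The example below assumes a 1D Poisson problem discretized with second-order finite differences rather than a 3D Galerkin finite element ablation solver; the manufactured solution sin(pi x) and all function names are chosen only for illustration.

```python
import numpy as np

def solve_poisson_fd(n, source):
    """Solve -u'' = source on (0, 1) with u(0) = u(1) = 0 on n intervals."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    main = 2.0 * np.ones(n - 1)
    off = -np.ones(n - 2)
    A = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(A, source(x[1:-1]))
    return x, u

def manufactured_solution(x):
    """Chosen exact solution; it satisfies the boundary conditions."""
    return np.sin(np.pi * x)

def manufactured_source(x):
    """Source obtained by inserting the chosen solution into -u''."""
    return np.pi**2 * np.sin(np.pi * x)

def max_error(n):
    """Maximum nodal error against the manufactured solution."""
    x, u = solve_poisson_fd(n, manufactured_source)
    return np.max(np.abs(u - manufactured_solution(x)))

def observed_order(n_coarse, n_fine):
    """Observed order p from errors on two grids: e ~ C * h^p."""
    ratio = max_error(n_coarse) / max_error(n_fine)
    return np.log(ratio) / np.log(n_fine / n_coarse)

order = observed_order(32, 64)
```

    An observed order close to the formal order of the scheme (2 here) is the evidence such a verification study looks for; a lower value usually points to a coding or boundary-condition error.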

  9. A Simple GPU-Accelerated Two-Dimensional MUSCL-Hancock Solver for Ideal Magnetohydrodynamics

    NASA Technical Reports Server (NTRS)

    Bard, Christopher; Dorelli, John C.

    2013-01-01

    We describe our experience using NVIDIA's CUDA (Compute Unified Device Architecture) C programming environment to implement a two-dimensional second-order MUSCL-Hancock ideal magnetohydrodynamics (MHD) solver on a GTX 480 Graphics Processing Unit (GPU). Taking a simple approach in which the MHD variables are stored exclusively in the global memory of the GTX 480 and accessed in a cache-friendly manner (without further optimizing memory access by, for example, staging data in the GPU's faster shared memory), we achieved a maximum speed-up of approximately 126 for a 1024^2 grid relative to the sequential C code running on a single Intel Nehalem (2.8 GHz) core. This speedup is consistent with simple estimates based on the known floating point performance, memory throughput and parallel processing capacity of the GTX 480.

  10. A simple GPU-accelerated two-dimensional MUSCL-Hancock solver for ideal magnetohydrodynamics

    NASA Astrophysics Data System (ADS)

    Bard, Christopher M.; Dorelli, John C.

    2014-02-01

    We describe our experience using NVIDIA's CUDA (Compute Unified Device Architecture) C programming environment to implement a two-dimensional second-order MUSCL-Hancock ideal magnetohydrodynamics (MHD) solver on a GTX 480 Graphics Processing Unit (GPU). Taking a simple approach in which the MHD variables are stored exclusively in the global memory of the GTX 480 and accessed in a cache-friendly manner (without further optimizing memory access by, for example, staging data in the GPU's faster shared memory), we achieved a maximum speed-up of ≈126 for a 1024^2 grid relative to the sequential C code running on a single Intel Nehalem (2.8 GHz) core. This speedup is consistent with simple estimates based on the known floating point performance, memory throughput and parallel processing capacity of the GTX 480.

  11. A Fast MoM Solver (GIFFT) for Large Arrays of Microstrip and Cavity-Backed Antennas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fasenfest, B J; Capolino, F; Wilton, D

    2005-02-02

    A straightforward numerical analysis of large arrays of arbitrary contour (and possibly missing elements) requires large memory storage and long computation times. Several techniques are currently under development to reduce this cost. One such technique is the GIFFT (Green's function interpolation and FFT) method discussed here that belongs to the class of fast solvers for large structures. This method uses a modification of the standard AIM approach [1] that takes into account the reusability properties of matrices that arise from identical array elements. If the array consists of planar conducting bodies, the array elements are meshed using standard subdomain basis functions, such as the RWG basis. The Green's function is then projected onto a sparse regular grid of separable interpolating polynomials. This grid can then be used in a 2D or 3D FFT to accelerate the matrix-vector product used in an iterative solver [2]. The method has been proven to greatly reduce solve time by speeding up the matrix-vector product computation. The GIFFT approach also reduces fill time and memory requirements, since only the near element interactions need to be calculated exactly. The present work extends GIFFT to layered material Green's functions and multiregion interactions via slots in ground planes. In addition, a preconditioner is implemented to greatly reduce the number of iterations required for a solution. The general scheme of the GIFFT method is reported in [2]; this contribution is limited to presenting new results for array antennas made of slot-excited patches and cavity-backed patch antennas.

  12. TSPmap, a tool making use of traveling salesperson problem solvers in the efficient and accurate construction of high-density genetic linkage maps.

    PubMed

    Monroe, J Grey; Allen, Zachariah A; Tanger, Paul; Mullen, Jack L; Lovell, John T; Moyers, Brook T; Whitley, Darrell; McKay, John K

    2017-01-01

    Recent advances in nucleic acid sequencing technologies have led to a dramatic increase in the number of markers available to generate genetic linkage maps. This increased marker density can be used to improve genome assemblies as well as add much needed resolution for loci controlling variation in ecologically and agriculturally important traits. However, traditional genetic map construction methods from these large marker datasets can be computationally prohibitive and highly error prone. We present TSPmap, a method which implements both approximate and exact Traveling Salesperson Problem solvers to generate linkage maps. We demonstrate that for datasets with large numbers of genomic markers (e.g. 10,000) and in multiple population types generated from inbred parents, TSPmap can rapidly produce high quality linkage maps with low sensitivity to missing and erroneous genotyping data compared to two other benchmark methods, JoinMap and MSTmap. TSPmap is open source and freely available as an R package. With the advancement of low cost sequencing technologies, the number of markers used in the generation of genetic maps is expected to continue to rise. TSPmap will be a useful tool to handle such large datasets into the future, quickly producing high quality maps using a large number of genomic markers.

  13. Padé Approximant and Minimax Rational Approximation in Standard Cosmology

    NASA Astrophysics Data System (ADS)

    Zaninetti, Lorenzo

    2016-02-01

    The luminosity distance in the standard cosmology as given by ΛCDM and consequently the distance modulus for supernovae can be defined by the Padé approximant. A comparison with a known analytical solution shows that the Padé approximant for the luminosity distance has an error of 4% at redshift = 10. A similar procedure for the Taylor expansion of the luminosity distance gives an error of 4% at redshift = 0.7; this means that for the luminosity distance, the Padé approximation is superior to the Taylor series. The availability of an analytical expression for the distance modulus allows applying the Levenberg-Marquardt method to derive the fundamental parameters from the available compilations for supernovae. A new luminosity function for galaxies derived from the truncated gamma probability density function models the observed luminosity function for galaxies when the observed range in absolute magnitude is modeled by the Padé approximant. A comparison of ΛCDM with other cosmologies is done adopting a statistical point of view.
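
    The comparison between a Padé approximant and a Taylor series built from the same coefficients can be reproduced on a toy function. The sketch below uses exp(x) as a stand-in, not the luminosity distance of the abstract, and the helper names `pade` and `evaluate` are hypothetical. The coefficients follow from the standard linear conditions that match the Taylor series through order m + n.

```python
import numpy as np
from math import exp, factorial

def pade(coeffs, m, n):
    """Coefficients (ascending powers) of the [m/n] Pade approximant.

    `coeffs` holds Taylor coefficients c_0 .. c_{m+n}; requires n >= 1.
    The denominator is normalized so that q[0] == 1.
    """
    c = np.asarray(coeffs, dtype=float)
    if n < 1 or len(c) < m + n + 1:
        raise ValueError("need n >= 1 and at least m + n + 1 coefficients")
    # Denominator: sum_{j=1..n} q_j * c_{k-j} = -c_k for k = m+1 .. m+n.
    A = np.zeros((n, n))
    for row, k in enumerate(range(m + 1, m + n + 1)):
        for j in range(1, n + 1):
            if k - j >= 0:
                A[row, j - 1] = c[k - j]
    q = np.concatenate([[1.0], np.linalg.solve(A, -c[m + 1 : m + n + 1])])
    # Numerator: p_k = sum_{j=0..min(k, n)} q_j * c_{k-j} for k = 0 .. m.
    p = np.array(
        [sum(q[j] * c[k - j] for j in range(min(k, n) + 1)) for k in range(m + 1)]
    )
    return p, q

def evaluate(p, q, x):
    """Evaluate the rational function p(x) / q(x)."""
    return np.polyval(p[::-1], x) / np.polyval(q[::-1], x)

# Taylor coefficients of exp(x) through x^4, then the [2/2] approximant.
taylor = [1.0 / factorial(k) for k in range(5)]
p, q = pade(taylor, 2, 2)

pade_error = abs(evaluate(p, q, 1.0) - exp(1.0))
taylor_error = abs(np.polyval(taylor[::-1], 1.0) - exp(1.0))
```

    At x = 1 the rational form is closer to exp(1) than the degree-4 Taylor polynomial built from the same five coefficients, which is the kind of advantage the abstract reports for the luminosity distance.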

  14. Gust Acoustics Computation with a Space-Time CE/SE Parallel 3D Solver

    NASA Technical Reports Server (NTRS)

    Wang, X. Y.; Himansu, A.; Chang, S. C.; Jorgenson, P. C. E.; Reddy, D. R. (Technical Monitor)

    2002-01-01

    The benchmark Problem 2 in Category 3 of the Third Computational Aero-Acoustics (CAA) Workshop is solved using the space-time conservation element and solution element (CE/SE) method. This problem concerns the unsteady response of an isolated finite-span swept flat-plate airfoil bounded by two parallel walls to an incident gust. The acoustic field generated by the interaction of the gust with the flat-plate airfoil is computed by solving the 3D (three-dimensional) Euler equations in the time domain using a parallel version of a 3D CE/SE solver. The effect of the gust orientation on the far-field directivity is studied. Numerical solutions are presented and compared with analytical solutions, showing a reasonable agreement.

  15. Implementation of Implicit Adaptive Mesh Refinement in an Unstructured Finite-Volume Flow Solver

    NASA Technical Reports Server (NTRS)

    Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.

    2013-01-01

    This paper explores the implementation of adaptive mesh refinement in an unstructured, finite-volume solver. Unsteady and steady problems are considered. The effect on the recovery of high-order numerics is explored and the results are favorable. Important to this work is the ability to provide a path for efficient, implicit time advancement. A method using a simple refinement sensor based on undivided differences is discussed and applied to a practical problem: a shock-shock interaction on a hypersonic, inviscid double-wedge. Cases are compared to uniform grids without the use of adapted meshes in order to assess error and computational expense. Discussion of difficulties, advances, and future work prepare this method for additional research. The potential for this method in more complicated flows is described.

  16. Bypass Transitional Flow Calculations Using a Navier-Stokes Solver and Two-Equation Models

    NASA Technical Reports Server (NTRS)

    Liuo, William W.; Shih, Tsan-Hsing; Povinelli, L. A. (Technical Monitor)

    2000-01-01

    Bypass transitional flows over a flat plate were simulated using a Navier-Stokes solver and two equation models. A new model for the bypass transition, which occurs in cases with high free stream turbulence intensity (TI), is described. The new transition model is developed by including an intermittency correction function to an existing two-equation turbulence model. The advantages of using Navier-Stokes equations, as opposed to boundary-layer equations, in bypass transition simulations are also illustrated. The results for two test flows over a flat plate with different levels of free stream turbulence intensity are reported. Comparisons with the experimental measurements show that the new model can capture very well both the onset and the length of bypass transition.

  17. Intrusive Method for Uncertainty Quantification in a Multiphase Flow Solver

    NASA Astrophysics Data System (ADS)

    Turnquist, Brian; Owkes, Mark

    2016-11-01

    Uncertainty quantification (UQ) is a necessary, interesting, and often neglected aspect of fluid flow simulations. To determine the significance of uncertain initial and boundary conditions, a multiphase flow solver is being created which extends a single phase, intrusive, polynomial chaos scheme into multiphase flows. Reliably estimating the impact of input uncertainty on design criteria can help identify and minimize unwanted variability in critical areas, and has the potential to help advance knowledge in atomizing jets, jet engines, pharmaceuticals, and food processing. Use of an intrusive polynomial chaos method has been shown to significantly reduce computational cost over non-intrusive collocation methods such as Monte-Carlo. This method requires transforming the model equations into a weak form through substitution of stochastic (random) variables. Ultimately, the model deploys a stochastic Navier Stokes equation, a stochastic conservative level set approach including reinitialization, as well as stochastic normals and curvature. By implementing these approaches together in one framework, basic problems may be investigated which shed light on model expansion, uncertainty theory, and fluid flow in general. NSF Grant Number 1511325.

  18. Cooperative solutions coupling a geometry engine and adaptive solver codes

    NASA Technical Reports Server (NTRS)

    Dickens, Thomas P.

    1995-01-01

    Follow-on work has progressed in using Aero Grid and Paneling System (AGPS), a geometry and visualization system, as a dynamic real time geometry monitor, manipulator, and interrogator for other codes. In particular, AGPS has been successfully coupled with adaptive flow solvers which iterate, refining the grid in areas of interest, and continuing on to a solution. With the coupling to the geometry engine, the new grids represent the actual geometry much more accurately since they are derived directly from the geometry and do not use refits to the first-cut grids. Additional work has been done with design runs where the geometric shape is modified to achieve a desired result. Various constraints are used to point the solution in a reasonable direction which also more closely satisfies the desired results. Concepts and techniques are presented, as well as examples of sample case studies. Issues such as distributed operation of the cooperative codes versus running all codes locally and pre-calculation for performance are discussed. Future directions are considered which will build on these techniques in light of changing computer environments.

  19. Parallelization of elliptic solver for solving 1D Boussinesq model

    NASA Astrophysics Data System (ADS)

    Tarwidi, D.; Adytia, D.

    2018-03-01

    In this paper, a parallel implementation of an elliptic solver for the 1D Boussinesq model is presented. The numerical solution of the Boussinesq model is obtained by applying a staggered grid scheme to the continuity, momentum, and elliptic equations of the model. The tridiagonal system emerging from the numerical scheme for the elliptic equation is solved by the cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grid points is varied from 2^8 to 2^14. Two numerical test cases, i.e. propagation of a solitary wave and of a standing wave, are used to evaluate the parallel program. The numerical results are verified against the analytical solutions for the solitary and standing waves. The best speedup for the solitary and standing wave test cases is about 2.07 with 2^14 grid points and 1.86 with 2^13 grid points, respectively, both obtained using 8 threads. Moreover, the best efficiency of the parallel program is 76.2% and 73.5% for the solitary and standing wave test cases, respectively.
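
    Cyclic reduction for a tridiagonal system can be sketched as follows. This is a serial NumPy illustration, not the authors' OpenMP code: the vectorized updates at each level stand in for the loop iterations that a shared-memory implementation would distribute over threads. The function name and the array layout (three diagonals of equal length) are assumptions.

```python
import numpy as np

def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    a, b, c are the sub-, main and super-diagonals, all of length n
    (a[0] and c[-1] are ignored), and d is the right-hand side. Nonzero
    pivots are assumed, e.g. a diagonally dominant matrix.
    """
    a, b, c, d = (np.array(v, dtype=float) for v in (a, b, c, d))
    n = len(b)
    a[0] = 0.0
    c[-1] = 0.0
    if n == 1:
        return d / b

    # Append one identity row so every odd row has a right neighbour.
    ap, bp = np.append(a, 0.0), np.append(b, 1.0)
    cp, dp = np.append(c, 0.0), np.append(d, 0.0)

    # Eliminate the even-indexed unknowns from the odd-indexed equations.
    odd = np.arange(1, n, 2)
    alpha = ap[odd] / bp[odd - 1]
    gamma = cp[odd] / bp[odd + 1]
    new_a = -alpha * ap[odd - 1]
    new_b = bp[odd] - alpha * cp[odd - 1] - gamma * ap[odd + 1]
    new_c = -gamma * cp[odd + 1]
    new_d = dp[odd] - alpha * dp[odd - 1] - gamma * dp[odd + 1]

    # x has one extra trailing zero; index -1 and index n both hit it,
    # which is harmless because a[0] and c[n-1] are zero.
    x = np.zeros(n + 1)
    x[odd] = cyclic_reduction(new_a, new_b, new_c, new_d)

    # Back-substitute the even-indexed unknowns from their neighbours.
    even = np.arange(0, n, 2)
    x[even] = (d[even] - a[even] * x[even - 1] - c[even] * x[even + 1]) / b[even]
    return x[:n]
```

    Every entry of `new_a`, `new_b`, `new_c` and `new_d` depends only on rows of the previous level, so each level can be computed in parallel; the number of levels grows like log2 of the system size.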

  20. A vectorized Poisson solver over a spherical shell and its application to the quasi-geostrophic omega-equation

    NASA Technical Reports Server (NTRS)

    Mullenmeister, Paul

    1988-01-01

    The quasi-geostrophic omega-equation in flux form is developed as an example of a Poisson problem over a spherical shell. Solutions of this equation are obtained by applying a two-parameter Chebyshev solver in vector layout for CDC 200 series computers. The performance of this vectorized algorithm greatly exceeds the performance of its scalar analog. The algorithm generates solutions of the omega-equation which are compared with the omega fields calculated with the aid of the mass continuity equation.