NASA Astrophysics Data System (ADS)
An, Hyunuk; Ichikawa, Yutaka; Tachikawa, Yasuto; Shiiba, Michiharu
2012-11-01
SummaryThree different iteration methods for a three-dimensional coordinate-transformed saturated-unsaturated flow model are compared in this study. The Picard and Newton iteration methods are the common approaches for solving Richards' equation. The Picard method is simple to implement and cost-efficient (on an individual iteration basis). However it converges slower than the Newton method. On the other hand, although the Newton method converges faster, it is more complex to implement and consumes more CPU resources per iteration than the Picard method. The comparison of the two methods in finite-element model (FEM) for saturated-unsaturated flow has been well evaluated in previous studies. However, two iteration methods might exhibit different behavior in the coordinate-transformed finite-difference model (FDM). In addition, the Newton-Krylov method could be a suitable alternative for the coordinate-transformed FDM because it requires the evaluation of a 19-point stencil matrix. The formation of a 19-point stencil is quite a complex and laborious procedure. Instead, the Newton-Krylov method calculates the matrix-vector product, which can be easily approximated by calculating the differences of the original nonlinear function. In this respect, the Newton-Krylov method might be the most appropriate iteration method for coordinate-transformed FDM. However, this method involves the additional cost of taking an approximation at each Krylov iteration in the Newton-Krylov method. In this paper, we evaluated the efficiency and robustness of three iteration methods—the Picard, Newton, and Newton-Krylov methods—for simulating saturated-unsaturated flow through porous media using a three-dimensional coordinate-transformed FDM.
On iterative processes in the Krylov-Sonneveld subspaces
NASA Astrophysics Data System (ADS)
Ilin, Valery P.
2016-10-01
The iterative Induced Dimension Reduction (IDR) methods are considered for solving large systems of linear algebraic equations (SLAEs) with nonsingular nonsymmetric matrices. These approaches are investigated by many authors and are charachterized sometimes as the alternative to the classical processes of Krylov type. The key moments of the IDR algorithms consist in the construction of the embedded Sonneveld subspaces, which have the decreasing dimensions and use the orthogonalization to some fixed subspace. Other independent approaches for research and optimization of the iterations are based on the augmented and modified Krylov subspaces by using the aggregation and deflation procedures with present various low rank approximations of the original matrices. The goal of this paper is to show, that IDR method in Sonneveld subspaces present an original interpretation of the modified algorithms in the Krylov subspaces. In particular, such description is given for the multi-preconditioned semi-conjugate direction methods which are actual for the parallel algebraic domain decomposition approaches.
Krylov subspace iterative methods for boundary element method based near-field acoustic holography.
Valdivia, Nicolas; Williams, Earl G
2005-02-01
The reconstruction of the acoustic field for general surfaces is obtained from the solution of a matrix system that results from a boundary integral equation discretized using boundary element methods. The solution to the resultant matrix system is obtained using iterative regularization methods that counteract the effect of noise on the measurements. These methods will not require the calculation of the singular value decomposition, which can be expensive when the matrix system is considerably large. Krylov subspace methods are iterative methods that have the phenomena known as "semi-convergence," i.e., the optimal regularization solution is obtained after a few iterations. If the iteration is not stopped, the method converges to a solution that generally is totally corrupted by errors on the measurements. For these methods the number of iterations play the role of the regularization parameter. We will focus our attention to the study of the regularizing properties from the Krylov subspace methods like conjugate gradients, least squares QR and the recently proposed Hybrid method. A discussion and comparison of the available stopping rules will be included. A vibrating plate is considered as an example to validate our results.
Globally convergent techniques in nonlinear Newton-Krylov
NASA Technical Reports Server (NTRS)
Brown, Peter N.; Saad, Youcef
1989-01-01
Some convergence theory is presented for nonlinear Krylov subspace methods. The basic idea of these methods is to use variants of Newton's iteration in conjunction with a Krylov subspace method for solving the Jacobian linear systems. These methods are variants of inexact Newton methods where the approximate Newton direction is taken from a subspace of small dimensions. The main focus is to analyze these methods when they are combined with global strategies such as linesearch techniques and model trust region algorithms. Most of the convergence results are formulated for projection onto general subspaces rather than just Krylov subspaces.
Newton-Krylov-Schwarz: An implicit solver for CFD
NASA Technical Reports Server (NTRS)
Cai, Xiao-Chuan; Keyes, David E.; Venkatakrishnan, V.
1995-01-01
Newton-Krylov methods and Krylov-Schwarz (domain decomposition) methods have begun to become established in computational fluid dynamics (CFD) over the past decade. The former employ a Krylov method inside of Newton's method in a Jacobian-free manner, through directional differencing. The latter employ an overlapping Schwarz domain decomposition to derive a preconditioner for the Krylov accelerator that relies primarily on local information, for data-parallel concurrency. They may be composed as Newton-Krylov-Schwarz (NKS) methods, which seem particularly well suited for solving nonlinear elliptic systems in high-latency, distributed-memory environments. We give a brief description of this family of algorithms, with an emphasis on domain decomposition iterative aspects. We then describe numerical simulations with Newton-Krylov-Schwarz methods on aerodynamics applications emphasizing comparisons with a standard defect-correction approach, subdomain preconditioner consistency, subdomain preconditioner quality, and the effect of a coarse grid.
Parallel Dynamics Simulation Using a Krylov-Schwarz Linear Solution Scheme
Abhyankar, Shrirang; Constantinescu, Emil M.; Smith, Barry F.; ...
2016-11-07
Fast dynamics simulation of large-scale power systems is a computational challenge because of the need to solve a large set of stiff, nonlinear differential-algebraic equations at every time step. The main bottleneck in dynamic simulations is the solution of a linear system during each nonlinear iteration of Newton’s method. In this paper, we present a parallel Krylov- Schwarz linear solution scheme that uses the Krylov subspacebased iterative linear solver GMRES with an overlapping restricted additive Schwarz preconditioner. As a result, performance tests of the proposed Krylov-Schwarz scheme for several large test cases ranging from 2,000 to 20,000 buses, including amore » real utility network, show good scalability on different computing architectures.« less
Parallel Dynamics Simulation Using a Krylov-Schwarz Linear Solution Scheme
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abhyankar, Shrirang; Constantinescu, Emil M.; Smith, Barry F.
Fast dynamics simulation of large-scale power systems is a computational challenge because of the need to solve a large set of stiff, nonlinear differential-algebraic equations at every time step. The main bottleneck in dynamic simulations is the solution of a linear system during each nonlinear iteration of Newton’s method. In this paper, we present a parallel Krylov- Schwarz linear solution scheme that uses the Krylov subspacebased iterative linear solver GMRES with an overlapping restricted additive Schwarz preconditioner. As a result, performance tests of the proposed Krylov-Schwarz scheme for several large test cases ranging from 2,000 to 20,000 buses, including amore » real utility network, show good scalability on different computing architectures.« less
Till, Andrew T.; Warsa, James S.; Morel, Jim E.
2018-06-15
The thermal radiative transfer (TRT) equations comprise a radiation equation coupled to the material internal energy equation. Linearization of these equations produces effective, thermally-redistributed scattering through absorption-reemission. In this paper, we investigate the effectiveness and efficiency of Linear-Multi-Frequency-Grey (LMFG) acceleration that has been reformulated for use as a preconditioner to Krylov iterative solution methods. We introduce two general frameworks, the scalar flux formulation (SFF) and the absorption rate formulation (ARF), and investigate their iterative properties in the absence and presence of true scattering. SFF has a group-dependent state size but may be formulated without inner iterations in the presence ofmore » scattering, while ARF has a group-independent state size but requires inner iterations when scattering is present. We compare and evaluate the computational cost and efficiency of LMFG applied to these two formulations using a direct solver for the preconditioners. Finally, this work is novel because the use of LMFG for the radiation transport equation, in conjunction with Krylov methods, involves special considerations not required for radiation diffusion.« less
Multilevel Iterative Methods in Nonlinear Computational Plasma Physics
NASA Astrophysics Data System (ADS)
Knoll, D. A.; Finn, J. M.
1997-11-01
Many applications in computational plasma physics involve the implicit numerical solution of coupled systems of nonlinear partial differential equations or integro-differential equations. Such problems arise in MHD, systems of Vlasov-Fokker-Planck equations, edge plasma fluid equations. We have been developing matrix-free Newton-Krylov algorithms for such problems and have applied these algorithms to the edge plasma fluid equations [1,2] and to the Vlasov-Fokker-Planck equation [3]. Recently we have found that with increasing grid refinement, the number of Krylov iterations required per Newton iteration has grown unmanageable [4]. This has led us to the study of multigrid methods as a means of preconditioning matrix-free Newton-Krylov methods. In this poster we will give details of the general multigrid preconditioned Newton-Krylov algorithm, as well as algorithm performance details on problems of interest in the areas of magnetohydrodynamics and edge plasma physics. Work supported by US DoE 1. Knoll and McHugh, J. Comput. Phys., 116, pg. 281 (1995) 2. Knoll and McHugh, Comput. Phys. Comm., 88, pg. 141 (1995) 3. Mousseau and Knoll, J. Comput. Phys. (1997) (to appear) 4. Knoll and McHugh, SIAM J. Sci. Comput. 19, (1998) (to appear)
Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines
Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara; ...
2018-02-20
In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MGmore » Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactors problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.« less
Eigenvalue Solvers for Modeling Nuclear Reactors on Leadership Class Machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Slaybaugh, R. N.; Ramirez-Zweiger, M.; Pandya, Tara
In this paper, three complementary methods have been implemented in the code Denovo that accelerate neutral particle transport calculations with methods that use leadership-class computers fully and effectively: a multigroup block (MG) Krylov solver, a Rayleigh quotient iteration (RQI) eigenvalue solver, and a multigrid in energy (MGE) preconditioner. The MG Krylov solver converges more quickly than Gauss Seidel and enables energy decomposition such that Denovo can scale to hundreds of thousands of cores. RQI should converge in fewer iterations than power iteration (PI) for large and challenging problems. RQI creates shifted systems that would not be tractable without the MGmore » Krylov solver. It also creates ill-conditioned matrices. The MGE preconditioner reduces iteration count significantly when used with RQI and takes advantage of the new energy decomposition such that it can scale efficiently. Each individual method has been described before, but this is the first time they have been demonstrated to work together effectively. The combination of solvers enables the RQI eigenvalue solver to work better than the other available solvers for large reactors problems on leadership-class machines. Using these methods together, RQI converged in fewer iterations and in less time than PI for a full pressurized water reactor core. These solvers also performed better than an Arnoldi eigenvalue solver for a reactor benchmark problem when energy decomposition is needed. The MG Krylov, MGE preconditioner, and RQI solver combination also scales well in energy. Finally, this solver set is a strong choice for very large and challenging problems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Paul T.; Shadid, John N.; Tsuji, Paul H.
Here, this study explores the performance and scaling of a GMRES Krylov method employed as a smoother for an algebraic multigrid (AMG) preconditioned Newton- Krylov solution approach applied to a fully-implicit variational multiscale (VMS) nite element (FE) resistive magnetohydrodynamics (MHD) formulation. In this context a Newton iteration is used for the nonlinear system and a Krylov (GMRES) method is employed for the linear subsystems. The efficiency of this approach is critically dependent on the scalability and performance of the AMG preconditioner for the linear solutions and the performance of the smoothers play a critical role. Krylov smoothers are considered inmore » an attempt to reduce the time and memory requirements of existing robust smoothers based on additive Schwarz domain decomposition (DD) with incomplete LU factorization solves on each subdomain. Three time dependent resistive MHD test cases are considered to evaluate the method. The results demonstrate that the GMRES smoother can be faster due to a decrease in the preconditioner setup time and a reduction in outer GMRESR solver iterations, and requires less memory (typically 35% less memory for global GMRES smoother) than the DD ILU smoother.« less
Acceleration of GPU-based Krylov solvers via data transfer reduction
Anzt, Hartwig; Tomov, Stanimire; Luszczek, Piotr; ...
2015-04-08
Krylov subspace iterative solvers are often the method of choice when solving large sparse linear systems. At the same time, hardware accelerators such as graphics processing units continue to offer significant floating point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well optimized but limited set of linear algebra operations, applications that use them often fail to reduce certain data communications, and hence fail to leverage the full potential of the accelerator. In this study, we target the acceleration of Krylov subspace iterative methods for graphicsmore » processing units, and in particular the Biconjugate Gradient Stabilized solver that significant improvement can be achieved by reformulating the method to reduce data-communications through application-specific kernels instead of using the generic BLAS kernels, e.g. as provided by NVIDIA’s cuBLAS library, and by designing a graphics processing unit specific sparse matrix-vector product kernel that is able to more efficiently use the graphics processing unit’s computing power. Furthermore, we derive a model estimating the performance improvement, and use experimental data to validate the expected runtime savings. Finally, considering that the derived implementation achieves significantly higher performance, we assert that similar optimizations addressing algorithm structure, as well as sparse matrix-vector, are crucial for the subsequent development of high-performance graphics processing units accelerated Krylov subspace iterative methods.« less
Krylov Deferred Correction Accelerated Method of Lines Transpose for Parabolic Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jia, Jun; Jingfang, Huang
2008-01-01
In this paper, a new class of numerical methods for the accurate and efficient solutions of parabolic partial differential equations is presented. Unlike traditional method of lines (MoL), the new {\\bf \\it Krylov deferred correction (KDC) accelerated method of lines transpose (MoL^T)} first discretizes the temporal direction using Gaussian type nodes and spectral integration, and symbolically applies low-order time marching schemes to form a preconditioned elliptic system, which is then solved iteratively using Newton-Krylov techniques such as Newton-GMRES or Newton-BiCGStab method. Each function evaluation in the Newton-Krylov method is simply one low-order time-stepping approximation of the error by solving amore » decoupled system using available fast elliptic equation solvers. Preliminary numerical experiments show that the KDC accelerated MoL^T technique is unconditionally stable, can be spectrally accurate in both temporal and spatial directions, and allows optimal time-step sizes in long-time simulations.« less
Solution of nonlinear time-dependent PDEs through componentwise approximation of matrix functions
NASA Astrophysics Data System (ADS)
Cibotarica, Alexandru; Lambers, James V.; Palchak, Elisabeth M.
2016-09-01
Exponential propagation iterative (EPI) methods provide an efficient approach to the solution of large stiff systems of ODEs, compared to standard integrators. However, the bulk of the computational effort in these methods is due to products of matrix functions and vectors, which can become very costly at high resolution due to an increase in the number of Krylov projection steps needed to maintain accuracy. In this paper, it is proposed to modify EPI methods by using Krylov subspace spectral (KSS) methods, instead of standard Krylov projection methods, to compute products of matrix functions and vectors. Numerical experiments demonstrate that this modification causes the number of Krylov projection steps to become bounded independently of the grid size, thus dramatically improving efficiency and scalability. As a result, for each test problem featured, as the total number of grid points increases, the growth in computation time is just below linear, while other methods achieved this only on selected test problems or not at all.
Parallel iterative methods for sparse linear and nonlinear equations
NASA Technical Reports Server (NTRS)
Saad, Youcef
1989-01-01
As three-dimensional models are gaining importance, iterative methods will become almost mandatory. Among these, preconditioned Krylov subspace methods have been viewed as the most efficient and reliable, when solving linear as well as nonlinear systems of equations. There has been several different approaches taken to adapt iterative methods for supercomputers. Some of these approaches are discussed and the methods that deal more specifically with general unstructured sparse matrices, such as those arising from finite element methods, are emphasized.
Krylov Subspace Methods for Complex Non-Hermitian Linear Systems. Thesis
NASA Technical Reports Server (NTRS)
Freund, Roland W.
1991-01-01
We consider Krylov subspace methods for the solution of large sparse linear systems Ax = b with complex non-Hermitian coefficient matrices. Such linear systems arise in important applications, such as inverse scattering, numerical solution of time-dependent Schrodinger equations, underwater acoustics, eddy current computations, numerical computations in quantum chromodynamics, and numerical conformal mapping. Typically, the resulting coefficient matrices A exhibit special structures, such as complex symmetry, or they are shifted Hermitian matrices. In this paper, we first describe a Krylov subspace approach with iterates defined by a quasi-minimal residual property, the QMR method, for solving general complex non-Hermitian linear systems. Then, we study special Krylov subspace methods designed for the two families of complex symmetric respectively shifted Hermitian linear systems. We also include some results concerning the obvious approach to general complex linear systems by solving equivalent real linear systems for the real and imaginary parts of x. Finally, numerical experiments for linear systems arising from the complex Helmholtz equation are reported.
NASA Technical Reports Server (NTRS)
Sidi, Avram
1992-01-01
Let F(z) be a vectored-valued function F: C approaches C sup N, which is analytic at z=0 and meromorphic in a neighborhood of z=0, and let its Maclaurin series be given. We use vector-valued rational approximation procedures for F(z) that are based on its Maclaurin series in conjunction with power iterations to develop bona fide generalizations of the power method for an arbitrary N X N matrix that may be diagonalizable or not. These generalizations can be used to obtain simultaneously several of the largest distinct eigenvalues and the corresponding invariant subspaces, and present a detailed convergence theory for them. In addition, it is shown that the generalized power methods of this work are equivalent to some Krylov subspace methods, among them the methods of Arnoldi and Lanczos. Thus, the theory provides a set of completely new results and constructions for these Krylov subspace methods. This theory suggests at the same time a new mode of usage for these Krylov subspace methods that were observed to possess computational advantages over their common mode of usage.
Krylov-Subspace Recycling via the POD-Augmented Conjugate-Gradient Method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carlberg, Kevin; Forstall, Virginia; Tuminaro, Ray
This paper presents a new Krylov-subspace-recycling method for efficiently solving sequences of linear systems of equations characterized by varying right-hand sides and symmetric-positive-definite matrices. As opposed to typical truncation strategies used in recycling such as deflation, we propose a truncation method inspired by goal-oriented proper orthogonal decomposition (POD) from model reduction. This idea is based on the observation that model reduction aims to compute a low-dimensional subspace that contains an accurate solution; as such, we expect the proposed method to generate a low-dimensional subspace that is well suited for computing solutions that can satisfy inexact tolerances. In particular, we proposemore » specific goal-oriented POD `ingredients' that align the optimality properties of POD with the objective of Krylov-subspace recycling. To compute solutions in the resulting 'augmented' POD subspace, we propose a hybrid direct/iterative three-stage method that leverages 1) the optimal ordering of POD basis vectors, and 2) well-conditioned reduced matrices. Numerical experiments performed on solid-mechanics problems highlight the benefits of the proposed method over existing approaches for Krylov-subspace recycling.« less
Krylov-Subspace Recycling via the POD-Augmented Conjugate-Gradient Method
Carlberg, Kevin; Forstall, Virginia; Tuminaro, Ray
2016-01-01
This paper presents a new Krylov-subspace-recycling method for efficiently solving sequences of linear systems of equations characterized by varying right-hand sides and symmetric-positive-definite matrices. As opposed to typical truncation strategies used in recycling such as deflation, we propose a truncation method inspired by goal-oriented proper orthogonal decomposition (POD) from model reduction. This idea is based on the observation that model reduction aims to compute a low-dimensional subspace that contains an accurate solution; as such, we expect the proposed method to generate a low-dimensional subspace that is well suited for computing solutions that can satisfy inexact tolerances. In particular, we proposemore » specific goal-oriented POD `ingredients' that align the optimality properties of POD with the objective of Krylov-subspace recycling. To compute solutions in the resulting 'augmented' POD subspace, we propose a hybrid direct/iterative three-stage method that leverages 1) the optimal ordering of POD basis vectors, and 2) well-conditioned reduced matrices. Numerical experiments performed on solid-mechanics problems highlight the benefits of the proposed method over existing approaches for Krylov-subspace recycling.« less
Construction, classification and parametrization of complex Hadamard matrices
NASA Astrophysics Data System (ADS)
Szöllősi, Ferenc
To improve the design of nuclear systems, high-fidelity neutron fluxes are required. Leadership-class machines provide platforms on which very large problems can be solved. Computing such fluxes efficiently requires numerical methods with good convergence properties and algorithms that can scale to hundreds of thousands of cores. Many 3-D deterministic transport codes are decomposable in space and angle only, limiting them to tens of thousands of cores. Most codes rely on methods such as Gauss Seidel for fixed source problems and power iteration for eigenvalue problems, which can be slow to converge for challenging problems like those with highly scattering materials or high dominance ratios. Three methods have been added to the 3-D SN transport code Denovo that are designed to improve convergence and enable the full use of cutting-edge computers. The first is a multigroup Krylov solver that converges more quickly than Gauss Seidel and parallelizes the code in energy such that Denovo can use hundreds of thousand of cores effectively. The second is Rayleigh quotient iteration (RQI), an old method applied in a new context. This eigenvalue solver finds the dominant eigenvalue in a mathematically optimal way and should converge in fewer iterations than power iteration. RQI creates energy-block-dense equations that the new Krylov solver treats efficiently. However, RQI can have convergence problems because it creates poorly conditioned systems. This can be overcome with preconditioning. The third method is a multigrid-in-energy preconditioner. The preconditioner takes advantage of the new energy decomposition because the grids are in energy rather than space or angle. The preconditioner greatly reduces iteration count for many problem types and scales well in energy. It also allows RQI to be successful for problems it could not solve otherwise. The methods added to Denovo accomplish the goals of this work. They converge in fewer iterations than traditional methods and enable the use of hundreds of thousands of cores. Each method can be used individually, with the multigroup Krylov solver and multigrid-in-energy preconditioner being particularly successful on their own. The largest benefit, though, comes from using these methods in concert.
Parallel Newton-Krylov-Schwarz algorithms for the transonic full potential equation
NASA Technical Reports Server (NTRS)
Cai, Xiao-Chuan; Gropp, William D.; Keyes, David E.; Melvin, Robin G.; Young, David P.
1996-01-01
We study parallel two-level overlapping Schwarz algorithms for solving nonlinear finite element problems, in particular, for the full potential equation of aerodynamics discretized in two dimensions with bilinear elements. The overall algorithm, Newton-Krylov-Schwarz (NKS), employs an inexact finite-difference Newton method and a Krylov space iterative method, with a two-level overlapping Schwarz method as a preconditioner. We demonstrate that NKS, combined with a density upwinding continuation strategy for problems with weak shocks, is robust and, economical for this class of mixed elliptic-hyperbolic nonlinear partial differential equations, with proper specification of several parameters. We study upwinding parameters, inner convergence tolerance, coarse grid density, subdomain overlap, and the level of fill-in in the incomplete factorization, and report their effect on numerical convergence rate, overall execution time, and parallel efficiency on a distributed-memory parallel computer.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aliaga, José I., E-mail: aliaga@uji.es; Alonso, Pedro; Badía, José M.
We introduce a new iterative Krylov subspace-based eigensolver for the simulation of macromolecular motions on desktop multithreaded platforms equipped with multicore processors and, possibly, a graphics accelerator (GPU). The method consists of two stages, with the original problem first reduced into a simpler band-structured form by means of a high-performance compute-intensive procedure. This is followed by a memory-intensive but low-cost Krylov iteration, which is off-loaded to be computed on the GPU by means of an efficient data-parallel kernel. The experimental results reveal the performance of the new eigensolver. Concretely, when applied to the simulation of macromolecules with a few thousandsmore » degrees of freedom and the number of eigenpairs to be computed is small to moderate, the new solver outperforms other methods implemented as part of high-performance numerical linear algebra packages for multithreaded architectures.« less
Numerical simulations of microwave heating of liquids: enhancements using Krylov subspace methods
NASA Astrophysics Data System (ADS)
Lollchund, M. R.; Dookhitram, K.; Sunhaloo, M. S.; Boojhawon, R.
2013-04-01
In this paper, we compare the performances of three iterative solvers for large sparse linear systems arising in the numerical computations of incompressible Navier-Stokes (NS) equations. These equations are employed mainly in the simulation of microwave heating of liquids. The emphasis of this work is on the application of Krylov projection techniques such as Generalized Minimal Residual (GMRES) to solve the Pressure Poisson Equations that result from discretisation of the NS equations. The performance of the GMRES method is compared with the traditional Gauss-Seidel (GS) and point successive over relaxation (PSOR) techniques through their application to simulate the dynamics of water housed inside a vertical cylindrical vessel which is subjected to microwave radiation. It is found that as the mesh size increases, GMRES gives the fastest convergence rate in terms of computational times and number of iterations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, M. A.; Strelchenko, Alexei; Vaquero, Alejandro
Lattice quantum chromodynamics simulations in nuclear physics have benefited from a tremendous number of algorithmic advances such as multigrid and eigenvector deflation. These improve the time to solution but do not alleviate the intrinsic memory-bandwidth constraints of the matrix-vector operation dominating iterative solvers. Batching this operation for multiple vectors and exploiting cache and register blocking can yield a super-linear speed up. Block-Krylov solvers can naturally take advantage of such batched matrix-vector operations, further reducing the iterations to solution by sharing the Krylov space between solves. However, practical implementations typically suffer from the quadratic scaling in the number of vector-vector operations.more » Using the QUDA library, we present an implementation of a block-CG solver on NVIDIA GPUs which reduces the memory-bandwidth complexity of vector-vector operations from quadratic to linear. We present results for the HISQ discretization, showing a 5x speedup compared to highly-optimized independent Krylov solves on NVIDIA's SaturnV cluster.« less
The Krylov accelerated SIMPLE(R) method for flow problems in industrial furnaces
NASA Astrophysics Data System (ADS)
Vuik, C.; Saghir, A.; Boerstoel, G. P.
2000-08-01
Numerical modeling of the melting and combustion process is an important tool in gaining understanding of the physical and chemical phenomena that occur in a gas- or oil-fired glass-melting furnace. The incompressible Navier-Stokes equations are used to model the gas flow in the furnace. The discrete Navier-Stokes equations are solved by the SIMPLE(R) pressure-correction method. In these applications, many SIMPLE(R) iterations are necessary to obtain an accurate solution. In this paper, Krylov accelerated versions are proposed: GCR-SIMPLE(R). The properties of these methods are investigated for a simple two-dimensional flow. Thereafter, the efficiencies of the methods are compared for three-dimensional flows in industrial glass-melting furnaces. Copyright
An implementation of the QMR method based on coupled two-term recurrences
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noeel M.
1992-01-01
The authors have proposed a new Krylov subspace iteration, the quasi-minimal residual algorithm (QMR), for solving non-Hermitian linear systems. In the original implementation of the QMR method, the Lanczos process with look-ahead is used to generate basis vectors for the underlying Krylov subspaces. In the Lanczos algorithm, these basis vectors are computed by means of three-term recurrences. It has been observed that, in finite precision arithmetic, vector iterations based on three-term recursions are usually less robust than mathematically equivalent coupled two-term vector recurrences. This paper presents a look-ahead algorithm that constructs the Lanczos basis vectors by means of coupled two-term recursions. Implementation details are given, and the look-ahead strategy is described. A new implementation of the QMR method, based on this coupled two-term algorithm, is described. A simplified version of the QMR algorithm without look-ahead is also presented, and the special case of QMR for complex symmetric linear systems is considered. Results of numerical experiments comparing the original and the new implementations of the QMR method are reported.
An iterative solver for the 3D Helmholtz equation
NASA Astrophysics Data System (ADS)
Belonosov, Mikhail; Dmitriev, Maxim; Kostin, Victor; Neklyudov, Dmitry; Tcheverda, Vladimir
2017-09-01
We develop a frequency-domain iterative solver for numerical simulation of acoustic waves in 3D heterogeneous media. It is based on the application of a unique preconditioner to the Helmholtz equation that ensures convergence for Krylov subspace iteration methods. Effective inversion of the preconditioner involves the Fast Fourier Transform (FFT) and numerical solution of a series of boundary value problems for ordinary differential equations. Matrix-by-vector multiplication for iterative inversion of the preconditioned matrix involves inversion of the preconditioner and pointwise multiplication of grid functions. Our solver has been verified by benchmarking against exact solutions and a time-domain solver.
A new implementation of the CMRH method for solving dense linear systems
NASA Astrophysics Data System (ADS)
Heyouni, M.; Sadok, H.
2008-04-01
The CMRH method [H. Sadok, Methodes de projections pour les systemes lineaires et non lineaires, Habilitation thesis, University of Lille1, Lille, France, 1994; H. Sadok, CMRH: A new method for solving nonsymmetric linear systems based on the Hessenberg reduction algorithm, Numer. Algorithms 20 (1999) 303-321] is an algorithm for solving nonsymmetric linear systems in which the Arnoldi component of GMRES is replaced by the Hessenberg process, which generates Krylov basis vectors which are orthogonal to standard unit basis vectors rather than mutually orthogonal. The iterate is formed from these vectors by solving a small least squares problem involving a Hessenberg matrix. Like GMRES, this method requires one matrix-vector product per iteration. However, it can be implemented to require half as much arithmetic work and less storage. Moreover, numerical experiments show that this method performs accurately and reduces the residual about as fast as GMRES. With this new implementation, we show that the CMRH method is the only method with long-term recurrence which requires not storing at the same time the entire Krylov vectors basis and the original matrix as in the GMRES algorithmE A comparison with Gaussian elimination is provided.
Nonlinear Krylov and moving nodes in the method of lines
NASA Astrophysics Data System (ADS)
Miller, Keith
2005-11-01
We report on some successes and problem areas in the Method of Lines from our work with moving node finite element methods. First, we report on our "nonlinear Krylov accelerator" for the modified Newton's method on the nonlinear equations of our stiff ODE solver. Since 1990 it has been robust, simple, cheap, and automatic on all our moving node computations. We publicize further trials with it here because it should be of great general usefulness to all those solving evolutionary equations. Second, we discuss the need for reliable automatic choice of spatially variable time steps. Third, we discuss the need for robust and efficient iterative solvers for the difficult linearized equations (Jx=b) of our stiff ODE solver. Here, the 1997 thesis of Zulu Xaba has made significant progress.
A stopping criterion for the iterative solution of partial differential equations
NASA Astrophysics Data System (ADS)
Rao, Kaustubh; Malan, Paul; Perot, J. Blair
2018-01-01
A stopping criterion for iterative solution methods is presented that accurately estimates the solution error using low computational overhead. The proposed criterion uses information from prior solution changes to estimate the error. When the solution changes are noisy or stagnating it reverts to a less accurate but more robust, low-cost singular value estimate to approximate the error given the residual. This estimator can also be applied to iterative linear matrix solvers such as Krylov subspace or multigrid methods. Examples of the stopping criterion's ability to accurately estimate the non-linear and linear solution error are provided for a number of different test cases in incompressible fluid dynamics.
PRIFIRA: General regularization using prior-conditioning for fast radio interferometric imaging†
NASA Astrophysics Data System (ADS)
Naghibzadeh, Shahrzad; van der Veen, Alle-Jan
2018-06-01
Image formation in radio astronomy is a large-scale inverse problem that is inherently ill-posed. We present a general algorithmic framework based on a Bayesian-inspired regularized maximum likelihood formulation of the radio astronomical imaging problem with a focus on diffuse emission recovery from limited noisy correlation data. The algorithm is dubbed PRIor-conditioned Fast Iterative Radio Astronomy (PRIFIRA) and is based on a direct embodiment of the regularization operator into the system by right preconditioning. The resulting system is then solved using an iterative method based on projections onto Krylov subspaces. We motivate the use of a beamformed image (which includes the classical "dirty image") as an efficient prior-conditioner. Iterative reweighting schemes generalize the algorithmic framework and can account for different regularization operators that encourage sparsity of the solution. The performance of the proposed method is evaluated based on simulated one- and two-dimensional array arrangements as well as actual data from the core stations of the Low Frequency Array radio telescope antenna configuration, and compared to state-of-the-art imaging techniques. We show the generality of the proposed method in terms of regularization schemes while maintaining a competitive reconstruction quality with the current reconstruction techniques. Furthermore, we show that exploiting Krylov subspace methods together with the proper noise-based stopping criteria results in a great improvement in imaging efficiency.
NASA Astrophysics Data System (ADS)
Turcksin, Bruno; Ragusa, Jean C.; Morel, Jim E.
2012-01-01
It is well known that the diffusion synthetic acceleration (DSA) methods for the Sn equations become ineffective in the Fokker-Planck forward-peaked scattering limit. In response to this deficiency, Morel and Manteuffel (1991) developed an angular multigrid method for the 1-D Sn equations. This method is very effective, costing roughly twice as much as DSA per source iteration, and yielding a maximum spectral radius of approximately 0.6 in the Fokker-Planck limit. Pautz, Adams, and Morel (PAM) (1999) later generalized the angular multigrid to 2-D, but it was found that the method was unstable with sufficiently forward-peaked mappings between the angular grids. The method was stabilized via a filtering technique based on diffusion operators, but this filtering also degraded the effectiveness of the overall scheme. The spectral radius was not bounded away from unity in the Fokker-Planck limit, although the method remained more effective than DSA. The purpose of this article is to recast the multidimensional PAM angular multigrid method without the filtering as an Sn preconditioner and use it in conjunction with the Generalized Minimal RESidual (GMRES) Krylov method. The approach ensures stability and our computational results demonstrate that it is also significantly more efficient than an analogous DSA-preconditioned Krylov method.
NASA Astrophysics Data System (ADS)
Marx, Alain; Lütjens, Hinrich
2017-03-01
A hybrid MPI/OpenMP parallel version of the XTOR-2F code [Lütjens and Luciani, J. Comput. Phys. 229 (2010) 8130] solving the two-fluid MHD equations in full tokamak geometry by means of an iterative Newton-Krylov matrix-free method has been developed. The present work shows that the code has been parallelized significantly despite the numerical profile of the problem solved by XTOR-2F, i.e. a discretization with pseudo-spectral representations in all angular directions, the stiffness of the two-fluid stability problem in tokamaks, and the use of a direct LU decomposition to invert the physical pre-conditioner at every Krylov iteration of the solver. The execution time of the parallelized version is an order of magnitude smaller than the sequential one for low resolution cases, with an increasing speedup when the discretization mesh is refined. Moreover, it allows to perform simulations with higher resolutions, previously forbidden because of memory limitations.
Krylov subspace methods on supercomputers
NASA Technical Reports Server (NTRS)
Saad, Youcef
1988-01-01
A short survey of recent research on Krylov subspace methods with emphasis on implementation on vector and parallel computers is presented. Conjugate gradient methods have proven very useful on traditional scalar computers, and their popularity is likely to increase as three-dimensional models gain importance. A conservative approach to derive effective iterative techniques for supercomputers has been to find efficient parallel/vector implementations of the standard algorithms. The main source of difficulty in the incomplete factorization preconditionings is in the solution of the triangular systems at each step. A few approaches consisting of implementing efficient forward and backward triangular solutions are described in detail. Polynomial preconditioning as an alternative to standard incomplete factorization techniques is also discussed. Another efficient approach is to reorder the equations so as to improve the structure of the matrix to achieve better parallelism or vectorization. An overview of these and other ideas and their effectiveness or potential for different types of architectures is given.
NASA Astrophysics Data System (ADS)
Chen, Hao; Lv, Wen; Zhang, Tongtong
2018-05-01
We study preconditioned iterative methods for the linear system arising in the numerical discretization of a two-dimensional space-fractional diffusion equation. Our approach is based on a formulation of the discrete problem that is shown to be the sum of two Kronecker products. By making use of an alternating Kronecker product splitting iteration technique we establish a class of fixed-point iteration methods. Theoretical analysis shows that the new method converges to the unique solution of the linear system. Moreover, the optimal choice of the involved iteration parameters and the corresponding asymptotic convergence rate are computed exactly when the eigenvalues of the system matrix are all real. The basic iteration is accelerated by a Krylov subspace method like GMRES. The corresponding preconditioner is in a form of a Kronecker product structure and requires at each iteration the solution of a set of discrete one-dimensional fractional diffusion equations. We use structure preserving approximations to the discrete one-dimensional fractional diffusion operators in the action of the preconditioning matrix. Numerical examples are presented to illustrate the effectiveness of this approach.
Efficient Coupling of Fluid-Plasma and Monte-Carlo-Neutrals Models for Edge Plasma Transport
NASA Astrophysics Data System (ADS)
Dimits, A. M.; Cohen, B. I.; Friedman, A.; Joseph, I.; Lodestro, L. L.; Rensink, M. E.; Rognlien, T. D.; Sjogreen, B.; Stotler, D. P.; Umansky, M. V.
2017-10-01
UEDGE has been valuable for modeling transport in the tokamak edge and scrape-off layer due in part to its efficient fully implicit solution of coupled fluid neutrals and plasma models. We are developing an implicit coupling of the kinetic Monte-Carlo (MC) code DEGAS-2, as the neutrals model component, to the UEDGE plasma component, based on an extension of the Jacobian-free Newton-Krylov (JFNK) method to MC residuals. The coupling components build on the methods and coding already present in UEDGE. For the linear Krylov iterations, a procedure has been developed to ``extract'' a good preconditioner from that of UEDGE. This preconditioner may also be used to greatly accelerate the convergence rate of a relaxed fixed-point iteration, which may provide a useful ``intermediate'' algorithm. The JFNK method also requires calculation of Jacobian-vector products, for which any finite-difference procedure is inaccurate when a MC component is present. A semi-analytical procedure that retains the standard MC accuracy and fully kinetic neutrals physics is therefore being developed. Prepared for US DOE by LLNL under Contract DE-AC52-07NA27344 and LDRD project 15-ERD-059, by PPPL under Contract DE-AC02-09CH11466, and supported in part by the U.S. DOE, OFES.
Multilevel acceleration of scattering-source iterations with application to electron transport
Drumm, Clif; Fan, Wesley
2017-08-18
Acceleration/preconditioning strategies available in the SCEPTRE radiation transport code are described. A flexible transport synthetic acceleration (TSA) algorithm that uses a low-order discrete-ordinates (S N) or spherical-harmonics (P N) solve to accelerate convergence of a high-order S N source-iteration (SI) solve is described. Convergence of the low-order solves can be further accelerated by applying off-the-shelf incomplete-factorization or algebraic-multigrid methods. Also available is an algorithm that uses a generalized minimum residual (GMRES) iterative method rather than SI for convergence, using a parallel sweep-based solver to build up a Krylov subspace. TSA has been applied as a preconditioner to accelerate the convergencemore » of the GMRES iterations. The methods are applied to several problems involving electron transport and problems with artificial cross sections with large scattering ratios. These methods were compared and evaluated by considering material discontinuities and scattering anisotropy. Observed accelerations obtained are highly problem dependent, but speedup factors around 10 have been observed in typical applications.« less
Projection methods for line radiative transfer in spherical media.
NASA Astrophysics Data System (ADS)
Anusha, L. S.; Nagendra, K. N.
An efficient numerical method called the Preconditioned Bi-Conjugate Gradient (Pre-BiCG) method is presented for the solution of radiative transfer equation in spherical geometry. A variant of this method called Stabilized Preconditioned Bi-Conjugate Gradient (Pre-BiCG-STAB) is also presented. These methods are based on projections on the subspaces of the n dimensional Euclidean space mathbb {R}n called Krylov subspaces. The methods are shown to be faster in terms of convergence rate compared to the contemporary iterative methods such as Jacobi, Gauss-Seidel and Successive Over Relaxation (SOR).
Pratapa, Phanisri P.; Suryanarayana, Phanish; Pask, John E.
2015-12-01
We employ Anderson extrapolation to accelerate the classical Jacobi iterative method for large, sparse linear systems. Specifically, we utilize extrapolation at periodic intervals within the Jacobi iteration to develop the Alternating Anderson–Jacobi (AAJ) method. We verify the accuracy and efficacy of AAJ in a range of test cases, including nonsymmetric systems of equations. We demonstrate that AAJ possesses a favorable scaling with system size that is accompanied by a small prefactor, even in the absence of a preconditioner. In particular, we show that AAJ is able to accelerate the classical Jacobi iteration by over four orders of magnitude, with speed-upsmore » that increase as the system gets larger. Moreover, we find that AAJ significantly outperforms the Generalized Minimal Residual (GMRES) method in the range of problems considered here, with the relative performance again improving with size of the system. As a result, the proposed method represents a simple yet efficient technique that is particularly attractive for large-scale parallel solutions of linear systems of equations.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pratapa, Phanisri P.; Suryanarayana, Phanish; Pask, John E.
We employ Anderson extrapolation to accelerate the classical Jacobi iterative method for large, sparse linear systems. Specifically, we utilize extrapolation at periodic intervals within the Jacobi iteration to develop the Alternating Anderson–Jacobi (AAJ) method. We verify the accuracy and efficacy of AAJ in a range of test cases, including nonsymmetric systems of equations. We demonstrate that AAJ possesses a favorable scaling with system size that is accompanied by a small prefactor, even in the absence of a preconditioner. In particular, we show that AAJ is able to accelerate the classical Jacobi iteration by over four orders of magnitude, with speed-upsmore » that increase as the system gets larger. Moreover, we find that AAJ significantly outperforms the Generalized Minimal Residual (GMRES) method in the range of problems considered here, with the relative performance again improving with size of the system. As a result, the proposed method represents a simple yet efficient technique that is particularly attractive for large-scale parallel solutions of linear systems of equations.« less
Mang, Andreas; Biros, George
2017-01-01
We propose an efficient numerical algorithm for the solution of diffeomorphic image registration problems. We use a variational formulation constrained by a partial differential equation (PDE), where the constraints are a scalar transport equation. We use a pseudospectral discretization in space and second-order accurate semi-Lagrangian time stepping scheme for the transport equations. We solve for a stationary velocity field using a preconditioned, globalized, matrix-free Newton-Krylov scheme. We propose and test a two-level Hessian preconditioner. We consider two strategies for inverting the preconditioner on the coarse grid: a nested preconditioned conjugate gradient method (exact solve) and a nested Chebyshev iterative method (inexact solve) with a fixed number of iterations. We test the performance of our solver in different synthetic and real-world two-dimensional application scenarios. We study grid convergence and computational efficiency of our new scheme. We compare the performance of our solver against our initial implementation that uses the same spatial discretization but a standard, explicit, second-order Runge-Kutta scheme for the numerical time integration of the transport equations and a single-level preconditioner. Our improved scheme delivers significant speedups over our original implementation. As a highlight, we observe a 20 × speedup for a two dimensional, real world multi-subject medical image registration problem.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, S.A.; Shadid, J.N.; Tuminaro, R.S.
1995-10-01
Aztec is an iterative library that greatly simplifies the parallelization process when solving the linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. Aztec is intended as a software tool for users who want to avoid cumbersome parallel programming details but who have large sparse linear systems which require an efficiently utilized parallel processing system. A collection of data transformation tools are provided that allow for easy creation of distributed sparsemore » unstructured matrices for parallel solution. Once the distributed matrix is created, computation can be performed on any of the parallel machines running Aztec: nCUBE 2, IBM SP2 and Intel Paragon, MPI platforms as well as standard serial and vector platforms. Aztec includes a number of Krylov iterative methods such as conjugate gradient (CG), generalized minimum residual (GMRES) and stabilized biconjugate gradient (BICGSTAB) to solve systems of equations. These Krylov methods are used in conjunction with various preconditioners such as polynomial or domain decomposition methods using LU or incomplete LU factorizations within subdomains. Although the matrix A can be general, the package has been designed for matrices arising from the approximation of partial differential equations (PDEs). In particular, the Aztec package is oriented toward systems arising from PDE applications.« less
Asgharzadeh, Hafez; Borazjani, Iman
2017-02-15
The explicit and semi-implicit schemes in flow simulations involving complex geometries and moving boundaries suffer from time-step size restriction and low convergence rates. Implicit schemes can be used to overcome these restrictions, but implementing them to solve the Navier-Stokes equations is not straightforward due to their non-linearity. Among the implicit schemes for nonlinear equations, Newton-based techniques are preferred over fixed-point techniques because of their high convergence rate but each Newton iteration is more expensive than a fixed-point iteration. Krylov subspace methods are one of the most advanced iterative methods that can be combined with Newton methods, i.e., Newton-Krylov Methods (NKMs) to solve non-linear systems of equations. The success of NKMs vastly depends on the scheme for forming the Jacobian, e.g., automatic differentiation is very expensive, and matrix-free methods without a preconditioner slow down as the mesh is refined. A novel, computationally inexpensive analytical Jacobian for NKM is developed to solve unsteady incompressible Navier-Stokes momentum equations on staggered overset-curvilinear grids with immersed boundaries. Moreover, the analytical Jacobian is used to form preconditioner for matrix-free method in order to improve its performance. The NKM with the analytical Jacobian was validated and verified against Taylor-Green vortex, inline oscillations of a cylinder in a fluid initially at rest, and pulsatile flow in a 90 degree bend. The capability of the method in handling complex geometries with multiple overset grids and immersed boundaries is shown by simulating an intracranial aneurysm. It was shown that the NKM with an analytical Jacobian is 1.17 to 14.77 times faster than the fixed-point Runge-Kutta method, and 1.74 to 152.3 times (excluding an intensively stretched grid) faster than automatic differentiation depending on the grid (size) and the flow problem. In addition, it was shown that using only the diagonal of the Jacobian further improves the performance by 42 - 74% compared to the full Jacobian. The NKM with an analytical Jacobian showed better performance than the fixed point Runge-Kutta because it converged with higher time steps and in approximately 30% less iterations even when the grid was stretched and the Reynold number was increased. In fact, stretching the grid decreased the performance of all methods, but the fixed-point Runge-Kutta performance decreased 4.57 and 2.26 times more than NKM with a diagonal Jacobian when the stretching factor was increased, respectively. The NKM with a diagonal analytical Jacobian and matrix-free method with an analytical preconditioner are the fastest methods and the superiority of one to another depends on the flow problem. Furthermore, the implemented methods are fully parallelized with parallel efficiency of 80-90% on the problems tested. The NKM with the analytical Jacobian can guide building preconditioners for other techniques to improve their performance in the future.
Asgharzadeh, Hafez; Borazjani, Iman
2016-01-01
The explicit and semi-implicit schemes in flow simulations involving complex geometries and moving boundaries suffer from time-step size restriction and low convergence rates. Implicit schemes can be used to overcome these restrictions, but implementing them to solve the Navier-Stokes equations is not straightforward due to their non-linearity. Among the implicit schemes for nonlinear equations, Newton-based techniques are preferred over fixed-point techniques because of their high convergence rate but each Newton iteration is more expensive than a fixed-point iteration. Krylov subspace methods are one of the most advanced iterative methods that can be combined with Newton methods, i.e., Newton-Krylov Methods (NKMs) to solve non-linear systems of equations. The success of NKMs vastly depends on the scheme for forming the Jacobian, e.g., automatic differentiation is very expensive, and matrix-free methods without a preconditioner slow down as the mesh is refined. A novel, computationally inexpensive analytical Jacobian for NKM is developed to solve unsteady incompressible Navier-Stokes momentum equations on staggered overset-curvilinear grids with immersed boundaries. Moreover, the analytical Jacobian is used to form preconditioner for matrix-free method in order to improve its performance. The NKM with the analytical Jacobian was validated and verified against Taylor-Green vortex, inline oscillations of a cylinder in a fluid initially at rest, and pulsatile flow in a 90 degree bend. The capability of the method in handling complex geometries with multiple overset grids and immersed boundaries is shown by simulating an intracranial aneurysm. It was shown that the NKM with an analytical Jacobian is 1.17 to 14.77 times faster than the fixed-point Runge-Kutta method, and 1.74 to 152.3 times (excluding an intensively stretched grid) faster than automatic differentiation depending on the grid (size) and the flow problem. In addition, it was shown that using only the diagonal of the Jacobian further improves the performance by 42 – 74% compared to the full Jacobian. The NKM with an analytical Jacobian showed better performance than the fixed point Runge-Kutta because it converged with higher time steps and in approximately 30% less iterations even when the grid was stretched and the Reynold number was increased. In fact, stretching the grid decreased the performance of all methods, but the fixed-point Runge-Kutta performance decreased 4.57 and 2.26 times more than NKM with a diagonal Jacobian when the stretching factor was increased, respectively. The NKM with a diagonal analytical Jacobian and matrix-free method with an analytical preconditioner are the fastest methods and the superiority of one to another depends on the flow problem. Furthermore, the implemented methods are fully parallelized with parallel efficiency of 80–90% on the problems tested. The NKM with the analytical Jacobian can guide building preconditioners for other techniques to improve their performance in the future. PMID:28042172
NASA Astrophysics Data System (ADS)
Asgharzadeh, Hafez; Borazjani, Iman
2017-02-01
The explicit and semi-implicit schemes in flow simulations involving complex geometries and moving boundaries suffer from time-step size restriction and low convergence rates. Implicit schemes can be used to overcome these restrictions, but implementing them to solve the Navier-Stokes equations is not straightforward due to their non-linearity. Among the implicit schemes for non-linear equations, Newton-based techniques are preferred over fixed-point techniques because of their high convergence rate but each Newton iteration is more expensive than a fixed-point iteration. Krylov subspace methods are one of the most advanced iterative methods that can be combined with Newton methods, i.e., Newton-Krylov Methods (NKMs) to solve non-linear systems of equations. The success of NKMs vastly depends on the scheme for forming the Jacobian, e.g., automatic differentiation is very expensive, and matrix-free methods without a preconditioner slow down as the mesh is refined. A novel, computationally inexpensive analytical Jacobian for NKM is developed to solve unsteady incompressible Navier-Stokes momentum equations on staggered overset-curvilinear grids with immersed boundaries. Moreover, the analytical Jacobian is used to form a preconditioner for matrix-free method in order to improve its performance. The NKM with the analytical Jacobian was validated and verified against Taylor-Green vortex, inline oscillations of a cylinder in a fluid initially at rest, and pulsatile flow in a 90 degree bend. The capability of the method in handling complex geometries with multiple overset grids and immersed boundaries is shown by simulating an intracranial aneurysm. It was shown that the NKM with an analytical Jacobian is 1.17 to 14.77 times faster than the fixed-point Runge-Kutta method, and 1.74 to 152.3 times (excluding an intensively stretched grid) faster than automatic differentiation depending on the grid (size) and the flow problem. In addition, it was shown that using only the diagonal of the Jacobian further improves the performance by 42-74% compared to the full Jacobian. The NKM with an analytical Jacobian showed better performance than the fixed point Runge-Kutta because it converged with higher time steps and in approximately 30% less iterations even when the grid was stretched and the Reynold number was increased. In fact, stretching the grid decreased the performance of all methods, but the fixed-point Runge-Kutta performance decreased 4.57 and 2.26 times more than NKM with a diagonal and full Jacobian, respectivley, when the stretching factor was increased. The NKM with a diagonal analytical Jacobian and matrix-free method with an analytical preconditioner are the fastest methods and the superiority of one to another depends on the flow problem. Furthermore, the implemented methods are fully parallelized with parallel efficiency of 80-90% on the problems tested. The NKM with the analytical Jacobian can guide building preconditioners for other techniques to improve their performance in the future.
A model reduction approach to numerical inversion for a parabolic partial differential equation
NASA Astrophysics Data System (ADS)
Borcea, Liliana; Druskin, Vladimir; Mamonov, Alexander V.; Zaslavsky, Mikhail
2014-12-01
We propose a novel numerical inversion algorithm for the coefficients of parabolic partial differential equations, based on model reduction. The study is motivated by the application of controlled source electromagnetic exploration, where the unknown is the subsurface electrical resistivity and the data are time resolved surface measurements of the magnetic field. The algorithm presented in this paper considers inversion in one and two dimensions. The reduced model is obtained with rational interpolation in the frequency (Laplace) domain and a rational Krylov subspace projection method. It amounts to a nonlinear mapping from the function space of the unknown resistivity to the small dimensional space of the parameters of the reduced model. We use this mapping as a nonlinear preconditioner for the Gauss-Newton iterative solution of the inverse problem. The advantage of the inversion algorithm is twofold. First, the nonlinear preconditioner resolves most of the nonlinearity of the problem. Thus the iterations are less likely to get stuck in local minima and the convergence is fast. Second, the inversion is computationally efficient because it avoids repeated accurate simulations of the time-domain response. We study the stability of the inversion algorithm for various rational Krylov subspaces, and assess its performance with numerical experiments.
Parallel computing techniques for rotorcraft aerodynamics
NASA Astrophysics Data System (ADS)
Ekici, Kivanc
The modification of unsteady three-dimensional Navier-Stokes codes for application on massively parallel and distributed computing environments is investigated. The Euler/Navier-Stokes code TURNS (Transonic Unsteady Rotor Navier-Stokes) was chosen as a test bed because of its wide use by universities and industry. For the efficient implementation of TURNS on parallel computing systems, two algorithmic changes are developed. First, main modifications to the implicit operator, Lower-Upper Symmetric Gauss Seidel (LU-SGS) originally used in TURNS, is performed. Second, application of an inexact Newton method, coupled with a Krylov subspace iterative method (Newton-Krylov method) is carried out. Both techniques have been tried previously for the Euler equations mode of the code. In this work, we have extended the methods to the Navier-Stokes mode. Several new implicit operators were tried because of convergence problems of traditional operators with the high cell aspect ratio (CAR) grids needed for viscous calculations on structured grids. Promising results for both Euler and Navier-Stokes cases are presented for these operators. For the efficient implementation of Newton-Krylov methods to the Navier-Stokes mode of TURNS, efficient preconditioners must be used. The parallel implicit operators used in the previous step are employed as preconditioners and the results are compared. The Message Passing Interface (MPI) protocol has been used because of its portability to various parallel architectures. It should be noted that the proposed methodology is general and can be applied to several other CFD codes (e.g. OVERFLOW).
Nested Krylov methods and preserving the orthogonality
NASA Technical Reports Server (NTRS)
Desturler, Eric; Fokkema, Diederik R.
1993-01-01
Recently the GMRESR inner-outer iteraction scheme for the solution of linear systems of equations was proposed by Van der Vorst and Vuik. Similar methods have been proposed by Axelsson and Vassilevski and Saad (FGMRES). The outer iteration is GCR, which minimizes the residual over a given set of direction vectors. The inner iteration is GMRES, which at each step computes a new direction vector by approximately solving the residual equation. However, the optimality of the approximation over the space of outer search directions is ignored in the inner GMRES iteration. This leads to suboptimal corrections to the solution in the outer iteration, as components of the outer iteration directions may reenter in the inner iteration process. Therefore we propose to preserve the orthogonality relations of GCR in the inner GMRES iteration. This gives optimal corrections; however, it involves working with a singular, non-symmetric operator. We will discuss some important properties, and we will show by experiments that, in terms of matrix vector products, this modification (almost) always leads to better convergence. However, because we do more orthogonalizations, it does not always give an improved performance in CPU-time. Furthermore, we will discuss efficient implementations as well as the truncation possibilities of the outer GCR process. The experimental results indicate that for such methods it is advantageous to preserve the orthogonality in the inner iteration. Of course we can also use iteration schemes other than GMRES as the inner method; methods with short recurrences like GICGSTAB are of interest.
Anomaly General Circulation Models.
NASA Astrophysics Data System (ADS)
Navarra, Antonio
The feasibility of the anomaly model is assessed using barotropic and baroclinic models. In the barotropic case, both a stationary and a time-dependent model has been formulated and constructed, whereas only the stationary, linear case is considered in the baroclinic case. Results from the barotropic model indicate that a relation between the stationary solution and the time-averaged non-linear solution exists. The stationary linear baroclinic solution can therefore be considered with some confidence. The linear baroclinic anomaly model poses a formidable mathematical problem because it is necessary to solve a gigantic linear system to obtain the solution. A new method to find solution of large linear system, based on a projection on the Krylov subspace is shown to be successful when applied to the linearized baroclinic anomaly model. The scheme consists of projecting the original linear system on the Krylov subspace, thereby reducing the dimensionality of the matrix to be inverted to obtain the solution. With an appropriate setting of the damping parameters, the iterative Krylov method reaches a solution even using a Krylov subspace ten times smaller than the original space of the problem. This generality allows the treatment of the important problem of linear waves in the atmosphere. A larger class (nonzonally symmetric) of basic states can now be treated for the baroclinic primitive equations. These problem leads to large unsymmetrical linear systems of order 10000 and more which can now be successfully tackled by the Krylov method. The (R7) linear anomaly model is used to investigate extensively the linear response to equatorial and mid-latitude prescribed heating. The results indicate that the solution is deeply affected by the presence of the stationary waves in the basic state. The instability of the asymmetric flows, first pointed out by Simmons et al. (1983), is active also in the baroclinic case. However, the presence of baroclinic processes modifies the dominant response. The most sensitive areas are identified; they correspond to north Japan, the Pole and Greenland regions. A limited set of higher resolution (R15) experiments indicate that this situation is still present and enhanced at higher resolution. The linear anomaly model is also applied to a realistic case. (Abstract shortened with permission of author.).
A General Algorithm for Reusing Krylov Subspace Information. I. Unsteady Navier-Stokes
NASA Technical Reports Server (NTRS)
Carpenter, Mark H.; Vuik, C.; Lucas, Peter; vanGijzen, Martin; Bijl, Hester
2010-01-01
A general algorithm is developed that reuses available information to accelerate the iterative convergence of linear systems with multiple right-hand sides A x = b (sup i), which are commonly encountered in steady or unsteady simulations of nonlinear equations. The algorithm is based on the classical GMRES algorithm with eigenvector enrichment but also includes a Galerkin projection preprocessing step and several novel Krylov subspace reuse strategies. The new approach is applied to a set of test problems, including an unsteady turbulent airfoil, and is shown in some cases to provide significant improvement in computational efficiency relative to baseline approaches.
Fast solution of elliptic partial differential equations using linear combinations of plane waves.
Pérez-Jordá, José M
2016-02-01
Given an arbitrary elliptic partial differential equation (PDE), a procedure for obtaining its solution is proposed based on the method of Ritz: the solution is written as a linear combination of plane waves and the coefficients are obtained by variational minimization. The PDE to be solved is cast as a system of linear equations Ax=b, where the matrix A is not sparse, which prevents the straightforward application of standard iterative methods in order to solve it. This sparseness problem can be circumvented by means of a recursive bisection approach based on the fast Fourier transform, which makes it possible to implement fast versions of some stationary iterative methods (such as Gauss-Seidel) consuming O(NlogN) memory and executing an iteration in O(Nlog(2)N) time, N being the number of plane waves used. In a similar way, fast versions of Krylov subspace methods and multigrid methods can also be implemented. These procedures are tested on Poisson's equation expressed in adaptive coordinates. It is found that the best results are obtained with the GMRES method using a multigrid preconditioner with Gauss-Seidel relaxation steps.
NASA Technical Reports Server (NTRS)
Turc, Catalin; Anand, Akash; Bruno, Oscar; Chaubell, Julian
2011-01-01
We present a computational methodology (a novel Nystrom approach based on use of a non-overlapping patch technique and Chebyshev discretizations) for efficient solution of problems of acoustic and electromagnetic scattering by open surfaces. Our integral equation formulations (1) Incorporate, as ansatz, the singular nature of open-surface integral-equation solutions, and (2) For the Electric Field Integral Equation (EFIE), use analytical regularizes that effectively reduce the number of iterations required by iterative linear-algebra solution based on Krylov-subspace iterative solvers.
Overview of Krylov subspace methods with applications to control problems
NASA Technical Reports Server (NTRS)
Saad, Youcef
1989-01-01
An overview of projection methods based on Krylov subspaces are given with emphasis on their application to solving matrix equations that arise in control problems. The main idea of Krylov subspace methods is to generate a basis of the Krylov subspace Span and seek an approximate solution the the original problem from this subspace. Thus, the original matrix problem of size N is approximated by one of dimension m typically much smaller than N. Krylov subspace methods have been very successful in solving linear systems and eigenvalue problems and are now just becoming popular for solving nonlinear equations. It is shown how they can be used to solve partial pole placement problems, Sylvester's equation, and Lyapunov's equation.
Krylov subspace methods - Theory, algorithms, and applications
NASA Technical Reports Server (NTRS)
Sad, Youcef
1990-01-01
Projection methods based on Krylov subspaces for solving various types of scientific problems are reviewed. The main idea of this class of methods when applied to a linear system Ax = b, is to generate in some manner an approximate solution to the original problem from the so-called Krylov subspace span. Thus, the original problem of size N is approximated by one of dimension m, typically much smaller than N. Krylov subspace methods have been very successful in solving linear systems and eigenvalue problems and are now becoming popular for solving nonlinear equations. The main ideas in Krylov subspace methods are shown and their use in solving linear systems, eigenvalue problems, parabolic partial differential equations, Liapunov matrix equations, and nonlinear system of equations are discussed.
Anderson acceleration and application to the three-temperature energy equations
NASA Astrophysics Data System (ADS)
An, Hengbin; Jia, Xiaowei; Walker, Homer F.
2017-10-01
The Anderson acceleration method is an algorithm for accelerating the convergence of fixed-point iterations, including the Picard method. Anderson acceleration was first proposed in 1965 and, for some years, has been used successfully to accelerate the convergence of self-consistent field iterations in electronic-structure computations. Recently, the method has attracted growing attention in other application areas and among numerical analysts. Compared with a Newton-like method, an advantage of Anderson acceleration is that there is no need to form the Jacobian matrix. Thus the method is easy to implement. In this paper, an Anderson-accelerated Picard method is employed to solve the three-temperature energy equations, which are a type of strong nonlinear radiation-diffusion equations. Two strategies are used to improve the robustness of the Anderson acceleration method. One strategy is to adjust the iterates when necessary to satisfy the physical constraint. Another strategy is to monitor and, if necessary, reduce the matrix condition number of the least-squares problem in the Anderson-acceleration implementation so that numerical stability can be guaranteed. Numerical results show that the Anderson-accelerated Picard method can solve the three-temperature energy equations efficiently. Compared with the Picard method without acceleration, Anderson acceleration can reduce the number of iterations by at least half. A comparison between a Jacobian-free Newton-Krylov method, the Picard method, and the Anderson-accelerated Picard method is conducted in this paper.
NASA Astrophysics Data System (ADS)
Heinkenschloss, Matthias
2005-01-01
We study a class of time-domain decomposition-based methods for the numerical solution of large-scale linear quadratic optimal control problems. Our methods are based on a multiple shooting reformulation of the linear quadratic optimal control problem as a discrete-time optimal control (DTOC) problem. The optimality conditions for this DTOC problem lead to a linear block tridiagonal system. The diagonal blocks are invertible and are related to the original linear quadratic optimal control problem restricted to smaller time-subintervals. This motivates the application of block Gauss-Seidel (GS)-type methods for the solution of the block tridiagonal systems. Numerical experiments show that the spectral radii of the block GS iteration matrices are larger than one for typical applications, but that the eigenvalues of the iteration matrices decay to zero fast. Hence, while the GS method is not expected to convergence for typical applications, it can be effective as a preconditioner for Krylov-subspace methods. This is confirmed by our numerical tests.A byproduct of this research is the insight that certain instantaneous control techniques can be viewed as the application of one step of the forward block GS method applied to the DTOC optimality system.
NASA Technical Reports Server (NTRS)
Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.; Bushnell, Dennis M. (Technical Monitor)
2002-01-01
The efficiency gains obtained using higher-order implicit Runge-Kutta schemes as compared with the second-order accurate backward difference schemes for the unsteady Navier-Stokes equations are investigated. Three different algorithms for solving the nonlinear system of equations arising at each timestep are presented. The first algorithm (NMG) is a pseudo-time-stepping scheme which employs a non-linear full approximation storage (FAS) agglomeration multigrid method to accelerate convergence. The other two algorithms are based on Inexact Newton's methods. The linear system arising at each Newton step is solved using iterative/Krylov techniques and left preconditioning is used to accelerate convergence of the linear solvers. One of the methods (LMG) uses Richardson's iterative scheme for solving the linear system at each Newton step while the other (PGMRES) uses the Generalized Minimal Residual method. Results demonstrating the relative superiority of these Newton's methods based schemes are presented. Efficiency gains as high as 10 are obtained by combining the higher-order time integration schemes with the more efficient nonlinear solvers.
NASA Astrophysics Data System (ADS)
Borazjani, Iman; Asgharzadeh, Hafez
2015-11-01
Flow simulations involving complex geometries and moving boundaries suffer from time-step size restriction and low convergence rates with explicit and semi-implicit schemes. Implicit schemes can be used to overcome these restrictions. However, implementing implicit solver for nonlinear equations including Navier-Stokes is not straightforward. Newton-Krylov subspace methods (NKMs) are one of the most advanced iterative methods to solve non-linear equations such as implicit descritization of the Navier-Stokes equation. The efficiency of NKMs massively depends on the Jacobian formation method, e.g., automatic differentiation is very expensive, and matrix-free methods slow down as the mesh is refined. Analytical Jacobian is inexpensive method, but derivation of analytical Jacobian for Navier-Stokes equation on staggered grid is challenging. The NKM with a novel analytical Jacobian was developed and validated against Taylor-Green vortex and pulsatile flow in a 90 degree bend. The developed method successfully handled the complex geometries such as an intracranial aneurysm with multiple overset grids, and immersed boundaries. It is shown that the NKM with an analytical Jacobian is 3 to 25 times faster than the fixed-point implicit Runge-Kutta method, and more than 100 times faster than automatic differentiation depending on the grid (size) and the flow problem. The developed methods are fully parallelized with parallel efficiency of 80-90% on the problems tested.
Krylov subspace methods for computing hydrodynamic interactions in Brownian dynamics simulations
Ando, Tadashi; Chow, Edmond; Saad, Yousef; Skolnick, Jeffrey
2012-01-01
Hydrodynamic interactions play an important role in the dynamics of macromolecules. The most common way to take into account hydrodynamic effects in molecular simulations is in the context of a Brownian dynamics simulation. However, the calculation of correlated Brownian noise vectors in these simulations is computationally very demanding and alternative methods are desirable. This paper studies methods based on Krylov subspaces for computing Brownian noise vectors. These methods are related to Chebyshev polynomial approximations, but do not require eigenvalue estimates. We show that only low accuracy is required in the Brownian noise vectors to accurately compute values of dynamic and static properties of polymer and monodisperse suspension models. With this level of accuracy, the computational time of Krylov subspace methods scales very nearly as O(N2) for the number of particles N up to 10 000, which was the limit tested. The performance of the Krylov subspace methods, especially the “block” version, is slightly better than that of the Chebyshev method, even without taking into account the additional cost of eigenvalue estimates required by the latter. Furthermore, at N = 10 000, the Krylov subspace method is 13 times faster than the exact Cholesky method. Thus, Krylov subspace methods are recommended for performing large-scale Brownian dynamics simulations with hydrodynamic interactions. PMID:22897254
Multigrid and Krylov Subspace Methods for the Discrete Stokes Equations
NASA Technical Reports Server (NTRS)
Elman, Howard C.
1996-01-01
Discretization of the Stokes equations produces a symmetric indefinite system of linear equations. For stable discretizations, a variety of numerical methods have been proposed that have rates of convergence independent of the mesh size used in the discretization. In this paper, we compare the performance of four such methods: variants of the Uzawa, preconditioned conjugate gradient, preconditioned conjugate residual, and multigrid methods, for solving several two-dimensional model problems. The results indicate that where it is applicable, multigrid with smoothing based on incomplete factorization is more efficient than the other methods, but typically by no more than a factor of two. The conjugate residual method has the advantage of being both independent of iteration parameters and widely applicable.
Projection methods for the numerical solution of Markov chain models
NASA Technical Reports Server (NTRS)
Saad, Youcef
1989-01-01
Projection methods for computing stationary probability distributions for Markov chain models are presented. A general projection method is a method which seeks an approximation from a subspace of small dimension to the original problem. Thus, the original matrix problem of size N is approximated by one of dimension m, typically much smaller than N. A particularly successful class of methods based on this principle is that of Krylov subspace methods which utilize subspaces of the form span(v,av,...,A(exp m-1)v). These methods are effective in solving linear systems and eigenvalue problems (Lanczos, Arnoldi,...) as well as nonlinear equations. They can be combined with more traditional iterative methods such as successive overrelaxation, symmetric successive overrelaxation, or with incomplete factorization methods to enhance convergence.
Preserving Symmetry in Preconditioned Krylov Subspace Methods
NASA Technical Reports Server (NTRS)
Chan, Tony F.; Chow, E.; Saad, Y.; Yeung, M. C.
1996-01-01
We consider the problem of solving a linear system Ax = b when A is nearly symmetric and when the system is preconditioned by a symmetric positive definite matrix M. In the symmetric case, one can recover symmetry by using M-inner products in the conjugate gradient (CG) algorithm. This idea can also be used in the nonsymmetric case, and near symmetry can be preserved similarly. Like CG, the new algorithms are mathematically equivalent to split preconditioning, but do not require M to be factored. Better robustness in a specific sense can also be observed. When combined with truncated versions of iterative methods, tests show that this is more effective than the common practice of forfeiting near-symmetry altogether.
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver
Zhang, Bo; Lu, Benzhuo; Cheng, Xiaolin; ...
2013-01-01
This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numericalmore » results.« less
On optimal improvements of classical iterative schemes for Z-matrices
NASA Astrophysics Data System (ADS)
Noutsos, D.; Tzoumas, M.
2006-04-01
Many researchers have considered preconditioners, applied to linear systems, whose matrix coefficient is a Z- or an M-matrix, that make the associated Jacobi and Gauss-Seidel methods converge asymptotically faster than the unpreconditioned ones. Such preconditioners are chosen so that they eliminate the off-diagonal elements of the same column or the elements of the first upper diagonal [Milaszewicz, LAA 93 (1987) 161-170], Gunawardena et al. [LAA 154-156 (1991) 123-143]. In this work we generalize the previous preconditioners to obtain optimal methods. "Good" Jacobi and Gauss-Seidel algorithms are given and preconditioners, that eliminate more than one entry per row, are also proposed and analyzed. Moreover, the behavior of the above preconditioners to the Krylov subspace methods is studied.
NASA Astrophysics Data System (ADS)
Lu, Benzhuo; Cheng, Xiaolin; Huang, Jingfang; McCammon, J. Andrew
2010-06-01
A Fortran program package is introduced for rapid evaluation of the electrostatic potentials and forces in biomolecular systems modeled by the linearized Poisson-Boltzmann equation. The numerical solver utilizes a well-conditioned boundary integral equation (BIE) formulation, a node-patch discretization scheme, a Krylov subspace iterative solver package with reverse communication protocols, and an adaptive new version of fast multipole method in which the exponential expansions are used to diagonalize the multipole-to-local translations. The program and its full description, as well as several closely related libraries and utility tools are available at http://lsec.cc.ac.cn/~lubz/afmpb.html and a mirror site at http://mccammon.ucsd.edu/. This paper is a brief summary of the program: the algorithms, the implementation and the usage. Program summaryProgram title: AFMPB: Adaptive fast multipole Poisson-Boltzmann solver Catalogue identifier: AEGB_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEGB_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: GPL 2.0 No. of lines in distributed program, including test data, etc.: 453 649 No. of bytes in distributed program, including test data, etc.: 8 764 754 Distribution format: tar.gz Programming language: Fortran Computer: Any Operating system: Any RAM: Depends on the size of the discretized biomolecular system Classification: 3 External routines: Pre- and post-processing tools are required for generating the boundary elements and for visualization. Users can use MSMS ( http://www.scripps.edu/~sanner/html/msms_home.html) for pre-processing, and VMD ( http://www.ks.uiuc.edu/Research/vmd/) for visualization. Sub-programs included: An iterative Krylov subspace solvers package from SPARSKIT by Yousef Saad ( http://www-users.cs.umn.edu/~saad/software/SPARSKIT/sparskit.html), and the fast multipole methods subroutines from FMMSuite ( http://www.fastmultipole.org/). Nature of problem: Numerical solution of the linearized Poisson-Boltzmann equation that describes electrostatic interactions of molecular systems in ionic solutions. Solution method: A novel node-patch scheme is used to discretize the well-conditioned boundary integral equation formulation of the linearized Poisson-Boltzmann equation. Various Krylov subspace solvers can be subsequently applied to solve the resulting linear system, with a bounded number of iterations independent of the number of discretized unknowns. The matrix-vector multiplication at each iteration is accelerated by the adaptive new versions of fast multipole methods. The AFMPB solver requires other stand-alone pre-processing tools for boundary mesh generation, post-processing tools for data analysis and visualization, and can be conveniently coupled with different time stepping methods for dynamics simulation. Restrictions: Only three or six significant digits options are provided in this version. Unusual features: Most of the codes are in Fortran77 style. Memory allocation functions from Fortran90 and above are used in a few subroutines. Additional comments: The current version of the codes is designed and written for single core/processor desktop machines. Check http://lsec.cc.ac.cn/~lubz/afmpb.html and http://mccammon.ucsd.edu/ for updates and changes. Running time: The running time varies with the number of discretized elements ( N) in the system and their distributions. In most cases, it scales linearly as a function of N.
An assessment of coupling algorithms for nuclear reactor core physics simulations
Hamilton, Steven; Berrill, Mark; Clarno, Kevin; ...
2016-04-01
This paper evaluates the performance of multiphysics coupling algorithms applied to a light water nuclear reactor core simulation. The simulation couples the k-eigenvalue form of the neutron transport equation with heat conduction and subchannel flow equations. We compare Picard iteration (block Gauss–Seidel) to Anderson acceleration and multiple variants of preconditioned Jacobian-free Newton–Krylov (JFNK). The performance of the methods are evaluated over a range of energy group structures and core power levels. A novel physics-based approximation to a Jacobian-vector product has been developed to mitigate the impact of expensive on-line cross section processing steps. Furthermore, numerical simulations demonstrating the efficiency ofmore » JFNK and Anderson acceleration relative to standard Picard iteration are performed on a 3D model of a nuclear fuel assembly. Both criticality (k-eigenvalue) and critical boron search problems are considered.« less
An assessment of coupling algorithms for nuclear reactor core physics simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamilton, Steven; Berrill, Mark; Clarno, Kevin
This paper evaluates the performance of multiphysics coupling algorithms applied to a light water nuclear reactor core simulation. The simulation couples the k-eigenvalue form of the neutron transport equation with heat conduction and subchannel flow equations. We compare Picard iteration (block Gauss–Seidel) to Anderson acceleration and multiple variants of preconditioned Jacobian-free Newton–Krylov (JFNK). The performance of the methods are evaluated over a range of energy group structures and core power levels. A novel physics-based approximation to a Jacobian-vector product has been developed to mitigate the impact of expensive on-line cross section processing steps. Furthermore, numerical simulations demonstrating the efficiency ofmore » JFNK and Anderson acceleration relative to standard Picard iteration are performed on a 3D model of a nuclear fuel assembly. Both criticality (k-eigenvalue) and critical boron search problems are considered.« less
An assessment of coupling algorithms for nuclear reactor core physics simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hamilton, Steven, E-mail: hamiltonsp@ornl.gov; Berrill, Mark, E-mail: berrillma@ornl.gov; Clarno, Kevin, E-mail: clarnokt@ornl.gov
This paper evaluates the performance of multiphysics coupling algorithms applied to a light water nuclear reactor core simulation. The simulation couples the k-eigenvalue form of the neutron transport equation with heat conduction and subchannel flow equations. We compare Picard iteration (block Gauss–Seidel) to Anderson acceleration and multiple variants of preconditioned Jacobian-free Newton–Krylov (JFNK). The performance of the methods are evaluated over a range of energy group structures and core power levels. A novel physics-based approximation to a Jacobian-vector product has been developed to mitigate the impact of expensive on-line cross section processing steps. Numerical simulations demonstrating the efficiency of JFNKmore » and Anderson acceleration relative to standard Picard iteration are performed on a 3D model of a nuclear fuel assembly. Both criticality (k-eigenvalue) and critical boron search problems are considered.« less
Improvements in Block-Krylov Ritz Vectors and the Boundary Flexibility Method of Component Synthesis
NASA Technical Reports Server (NTRS)
Carney, Kelly Scott
1997-01-01
A method of dynamic substructuring is presented which utilizes a set of static Ritz vectors as a replacement for normal eigenvectors in component mode synthesis. This set of Ritz vectors is generated in a recurrence relationship, proposed by Wilson, which has the form of a block-Krylov subspace. The initial seed to the recurrence algorithm is based upon the boundary flexibility vectors of the component. Improvements have been made in the formulation of the initial seed to the Krylov sequence, through the use of block-filtering. A method to shift the Krylov sequence to create Ritz vectors that will represent the dynamic behavior of the component at target frequencies, the target frequency being determined by the applied forcing functions, has been developed. A method to terminate the Krylov sequence has also been developed. Various orthonormalization schemes have been developed and evaluated, including the Cholesky/QR method. Several auxiliary theorems and proofs which illustrate issues in component mode synthesis and loss of orthogonality in the Krylov sequence have also been presented. The resulting methodology is applicable to both fixed and free- interface boundary components, and results in a general component model appropriate for any type of dynamic analysis. The accuracy is found to be comparable to that of component synthesis based upon normal modes, using fewer generalized coordinates. In addition, the block-Krylov recurrence algorithm is a series of static solutions and so requires significantly less computation than solving the normal eigenspace problem. The requirement for less vectors to form the component, coupled with the lower computational expense of calculating these Ritz vectors, combine to create a method more efficient than traditional component mode synthesis.
Performance evaluation of OpenFOAM on many-core architectures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brzobohatý, Tomáš; Říha, Lubomír; Karásek, Tomáš, E-mail: tomas.karasek@vsb.cz
In this article application of Open Source Field Operation and Manipulation (OpenFOAM) C++ libraries for solving engineering problems on many-core architectures is presented. Objective of this article is to present scalability of OpenFOAM on parallel platforms solving real engineering problems of fluid dynamics. Scalability test of OpenFOAM is performed using various hardware and different implementation of standard PCG and PBiCG Krylov iterative methods. Speed up of various implementations of linear solvers using GPU and MIC accelerators are presented in this paper. Numerical experiments of 3D lid-driven cavity flow for several cases with various number of cells are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
None, None
Frequency-dependent correlations, such as the spectral function and the dynamical structure factor, help illustrate condensed matter experiments. Within the density matrix renormalization group (DMRG) framework, an accurate method for calculating spectral functions directly in frequency is the correction-vector method. The correction vector can be computed by solving a linear equation or by minimizing a functional. Our paper proposes an alternative to calculate the correction vector: to use the Krylov-space approach. This paper also studies the accuracy and performance of the Krylov-space approach, when applied to the Heisenberg, the t-J, and the Hubbard models. The cases we studied indicate that themore » Krylov-space approach can be more accurate and efficient than the conjugate gradient, and that the error of the former integrates best when a Krylov-space decomposition is also used for ground state DMRG.« less
None, None
2016-11-21
Frequency-dependent correlations, such as the spectral function and the dynamical structure factor, help illustrate condensed matter experiments. Within the density matrix renormalization group (DMRG) framework, an accurate method for calculating spectral functions directly in frequency is the correction-vector method. The correction vector can be computed by solving a linear equation or by minimizing a functional. Our paper proposes an alternative to calculate the correction vector: to use the Krylov-space approach. This paper also studies the accuracy and performance of the Krylov-space approach, when applied to the Heisenberg, the t-J, and the Hubbard models. The cases we studied indicate that themore » Krylov-space approach can be more accurate and efficient than the conjugate gradient, and that the error of the former integrates best when a Krylov-space decomposition is also used for ground state DMRG.« less
Woodward, Carol S.; Gardner, David J.; Evans, Katherine J.
2015-01-01
Efficient solutions of global climate models require effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a step size governed by accuracy of the processes of interest rather than by stability of the fastest time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, a Newton's method is applied to solve these systems. Each iteration of the Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but thismore » Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference approximation which may introduce inaccuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite difference approximations of these matrix-vector products for climate dynamics within the spectral element shallow water dynamical core of the Community Atmosphere Model.« less
Solving coupled groundwater flow systems using a Jacobian Free Newton Krylov method
NASA Astrophysics Data System (ADS)
Mehl, S.
2012-12-01
Jacobian Free Newton Kyrlov (JFNK) methods can have several advantages for simulating coupled groundwater flow processes versus conventional methods. Conventional methods are defined here as those based on an iterative coupling (rather than a direct coupling) and/or that use Picard iteration rather than Newton iteration. In an iterative coupling, the systems are solved separately, coupling information is updated and exchanged between the systems, and the systems are re-solved, etc., until convergence is achieved. Trusted simulators, such as Modflow, are based on these conventional methods of coupling and work well in many cases. An advantage of the JFNK method is that it only requires calculation of the residual vector of the system of equations and thus can make use of existing simulators regardless of how the equations are formulated. This opens the possibility of coupling different process models via augmentation of a residual vector by each separate process, which often requires substantially fewer changes to the existing source code than if the processes were directly coupled. However, appropriate perturbation sizes need to be determined for accurate approximations of the Frechet derivative, which is not always straightforward. Furthermore, preconditioning is necessary for reasonable convergence of the linear solution required at each Kyrlov iteration. Existing preconditioners can be used and applied separately to each process which maximizes use of existing code and robust preconditioners. In this work, iteratively coupled parent-child local grid refinement models of groundwater flow and groundwater flow models with nonlinear exchanges to streams are used to demonstrate the utility of the JFNK approach for Modflow models. Use of incomplete Cholesky preconditioners with various levels of fill are examined on a suite of nonlinear and linear models to analyze the effect of the preconditioner. Comparisons of convergence and computer simulation time are made using conventional iteratively coupled methods and those based on Picard iteration to those formulated with JFNK to gain insights on the types of nonlinearities and system features that make one approach advantageous. Results indicate that nonlinearities associated with stream/aquifer exchanges are more problematic than those resulting from unconfined flow.
Stability analysis of a deterministic dose calculation for MRI-guided radiotherapy.
Zelyak, O; Fallone, B G; St-Aubin, J
2017-12-14
Modern effort in radiotherapy to address the challenges of tumor localization and motion has led to the development of MRI guided radiotherapy technologies. Accurate dose calculations must properly account for the effects of the MRI magnetic fields. Previous work has investigated the accuracy of a deterministic linear Boltzmann transport equation (LBTE) solver that includes magnetic field, but not the stability of the iterative solution method. In this work, we perform a stability analysis of this deterministic algorithm including an investigation of the convergence rate dependencies on the magnetic field, material density, energy, and anisotropy expansion. The iterative convergence rate of the continuous and discretized LBTE including magnetic fields is determined by analyzing the spectral radius using Fourier analysis for the stationary source iteration (SI) scheme. The spectral radius is calculated when the magnetic field is included (1) as a part of the iteration source, and (2) inside the streaming-collision operator. The non-stationary Krylov subspace solver GMRES is also investigated as a potential method to accelerate the iterative convergence, and an angular parallel computing methodology is investigated as a method to enhance the efficiency of the calculation. SI is found to be unstable when the magnetic field is part of the iteration source, but unconditionally stable when the magnetic field is included in the streaming-collision operator. The discretized LBTE with magnetic fields using a space-angle upwind stabilized discontinuous finite element method (DFEM) was also found to be unconditionally stable, but the spectral radius rapidly reaches unity for very low-density media and increasing magnetic field strengths indicating arbitrarily slow convergence rates. However, GMRES is shown to significantly accelerate the DFEM convergence rate showing only a weak dependence on the magnetic field. In addition, the use of an angular parallel computing strategy is shown to potentially increase the efficiency of the dose calculation.
Stability analysis of a deterministic dose calculation for MRI-guided radiotherapy
NASA Astrophysics Data System (ADS)
Zelyak, O.; Fallone, B. G.; St-Aubin, J.
2018-01-01
Modern effort in radiotherapy to address the challenges of tumor localization and motion has led to the development of MRI guided radiotherapy technologies. Accurate dose calculations must properly account for the effects of the MRI magnetic fields. Previous work has investigated the accuracy of a deterministic linear Boltzmann transport equation (LBTE) solver that includes magnetic field, but not the stability of the iterative solution method. In this work, we perform a stability analysis of this deterministic algorithm including an investigation of the convergence rate dependencies on the magnetic field, material density, energy, and anisotropy expansion. The iterative convergence rate of the continuous and discretized LBTE including magnetic fields is determined by analyzing the spectral radius using Fourier analysis for the stationary source iteration (SI) scheme. The spectral radius is calculated when the magnetic field is included (1) as a part of the iteration source, and (2) inside the streaming-collision operator. The non-stationary Krylov subspace solver GMRES is also investigated as a potential method to accelerate the iterative convergence, and an angular parallel computing methodology is investigated as a method to enhance the efficiency of the calculation. SI is found to be unstable when the magnetic field is part of the iteration source, but unconditionally stable when the magnetic field is included in the streaming-collision operator. The discretized LBTE with magnetic fields using a space-angle upwind stabilized discontinuous finite element method (DFEM) was also found to be unconditionally stable, but the spectral radius rapidly reaches unity for very low-density media and increasing magnetic field strengths indicating arbitrarily slow convergence rates. However, GMRES is shown to significantly accelerate the DFEM convergence rate showing only a weak dependence on the magnetic field. In addition, the use of an angular parallel computing strategy is shown to potentially increase the efficiency of the dose calculation.
Corrigendum to "Stability analysis of a deterministic dose calculation for MRI-guided radiotherapy".
Zelyak, Oleksandr; Fallone, B Gino; St-Aubin, Joel
2018-03-12
Modern effort in radiotherapy to address the challenges of tumor localization and motion has led to the development of MRI guided radiotherapy technologies. Accurate dose calculations must properly account for the effects of the MRI magnetic fields. Previous work has investigated the accuracy of a deterministic linear Boltzmann transport equation (LBTE) solver that includes magnetic field, but not the stability of the iterative solution method. In this work, we perform a stability analysis of this deterministic algorithm including an investigation of the convergence rate dependencies on the magnetic field, material density, energy, and anisotropy expansion. The iterative convergence rate of the continuous and discretized LBTE including magnetic fields is determined by analyzing the spectral radius using Fourier analysis for the stationary source iteration (SI) scheme. The spectral radius is calculated when the magnetic field is included (1) as a part of the iteration source, and (2) inside the streaming-collision operator. The non-stationary Krylov subspace solver GMRES is also investigated as a potential method to accelerate the iterative convergence, and an angular parallel computing methodology is investigated as a method to enhance the efficiency of the calculation. SI is found to be unstable when the magnetic field is part of the iteration source, but unconditionally stable when the magnetic field is included in the streaming-collision operator. The discretized LBTE with magnetic fields using a space-angle upwind stabilized discontinuous finite element method (DFEM) was also found to be unconditionally stable, but the spectral radius rapidly reaches unity for very low density media and increasing magnetic field strengths indicating arbitrarily slow convergence rates. However, GMRES is shown to significantly accelerate the DFEM convergence rate showing only a weak dependence on the magnetic field. In addition, the use of an angular parallel computing strategy is shown to potentially increase the efficiency of the dose calculation. © 2018 Institute of Physics and Engineering in Medicine.
ML 3.0 smoothed aggregation user's guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen
2004-05-01
ML is a multigrid preconditioning package intended to solve linear systems of equations Az = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package ormore » to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the AZTEC 2.1 and AZTECOO iterative package [15]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and non-symmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.« less
ML 3.1 smoothed aggregation user's guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sala, Marzio; Hu, Jonathan Joseph; Tuminaro, Raymond Stephen
2004-10-01
ML is a multigrid preconditioning package intended to solve linear systems of equations Ax = b where A is a user supplied n x n sparse matrix, b is a user supplied vector of length n and x is a vector of length n to be computed. ML should be used on large sparse linear systems arising from partial differential equation (PDE) discretizations. While technically any linear system can be considered, ML should be used on linear systems that correspond to things that work well with multigrid methods (e.g. elliptic PDEs). ML can be used as a stand-alone package ormore » to generate preconditioners for a traditional iterative solver package (e.g. Krylov methods). We have supplied support for working with the Aztec 2.1 and AztecOO iterative package [16]. However, other solvers can be used by supplying a few functions. This document describes one specific algebraic multigrid approach: smoothed aggregation. This approach is used within several specialized multigrid methods: one for the eddy current formulation for Maxwell's equations, and a multilevel and domain decomposition method for symmetric and nonsymmetric systems of equations (like elliptic equations, or compressible and incompressible fluid dynamics problems). Other methods exist within ML but are not described in this document. Examples are given illustrating the problem definition and exercising multigrid options.« less
General purpose nonlinear system solver based on Newton-Krylov method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2013-12-01
KINSOL is part of a software family called SUNDIALS: SUite of Nonlinear and Differential/Algebraic equation Solvers [1]. KINSOL is a general-purpose nonlinear system solver based on Newton-Krylov and fixed-point solver technologies [2].
NASA Astrophysics Data System (ADS)
Chacon, L.; Finn, J. M.; Knoll, D. A.
2000-10-01
Recently, a new parallel velocity instability has been found.(J. M. Finn, Phys. Plasmas), 2, 12 (1995) This mode is a tearing mode driven unstable by curvature effects and sound wave coupling in the presence of parallel velocity shear. Under such conditions, linear theory predicts that tearing instabilities will grow even in situations in which the classical tearing mode is stable. This could then be a viable seed mechanism for the neoclassical tearing mode, and hence a non-linear study is of interest. Here, the linear and non-linear stages of this instability are explored using a fully implicit, fully nonlinear 2D reduced resistive MHD code,(L. Chacon et al), ``Implicit, Jacobian-free Newton-Krylov 2D reduced resistive MHD nonlinear solver,'' submitted to J. Comput. Phys. (2000) including viscosity and particle transport effects. The nonlinear implicit time integration is performed using the Newton-Raphson iterative algorithm. Krylov iterative techniques are employed for the required algebraic matrix inversions, implemented Jacobian-free (i.e., without ever forming and storing the Jacobian matrix), and preconditioned with a ``physics-based'' preconditioner. Nonlinear results indicate that, for large total plasma beta and large parallel velocity shear, the instability results in the generation of large poloidal shear flows and large magnetic islands even in regimes when the classical tearing mode is absolutely stable. For small viscosity, the time asymptotic state can be turbulent.
Studies of numerical algorithms for gyrokinetics and the effects of shaping on plasma turbulence
NASA Astrophysics Data System (ADS)
Belli, Emily Ann
Advanced numerical algorithms for gyrokinetic simulations are explored for more effective studies of plasma turbulent transport. The gyrokinetic equations describe the dynamics of particles in 5-dimensional phase space, averaging over the fast gyromotion, and provide a foundation for studying plasma microturbulence in fusion devices and in astrophysical plasmas. Several algorithms for Eulerian/continuum gyrokinetic solvers are compared. An iterative implicit scheme based on numerical approximations of the plasma response is developed. This method reduces the long time needed to set-up implicit arrays, yet still has larger time step advantages similar to a fully implicit method. Various model preconditioners and iteration schemes, including Krylov-based solvers, are explored. An Alternating Direction Implicit algorithm is also studied and is surprisingly found to yield a severe stability restriction on the time step. Overall, an iterative Krylov algorithm might be the best approach for extensions of core tokamak gyrokinetic simulations to edge kinetic formulations and may be particularly useful for studies of large-scale ExB shear effects. The effects of flux surface shape on the gyrokinetic stability and transport of tokamak plasmas are studied using the nonlinear GS2 gyrokinetic code with analytic equilibria based on interpolations of representative JET-like shapes. High shaping is found to be a stabilizing influence on both the linear ITG instability and nonlinear ITG turbulence. A scaling of the heat flux with elongation of chi ˜ kappa-1.5 or kappa-2 (depending on the triangularity) is observed, which is consistent with previous gyrofluid simulations. Thus, the GS2 turbulence simulations are explaining a significant fraction, but not all, of the empirical elongation scaling. The remainder of the scaling may come from (1) the edge boundary conditions for core turbulence, and (2) the larger Dimits nonlinear critical temperature gradient shift due to the enhancement of zonal flows with shaping, which is observed with the GS2 simulations. Finally, a local linear trial function-based gyrokinetic code is developed to aid in fast scoping studies of gyrokinetic linear stability. This code is successfully benchmarked with the full GS2 code in the collisionless, electrostatic limit, as well as in the more general electromagnetic description with higher-order Hermite basis functions.
The aggregated unfitted finite element method for elliptic problems
NASA Astrophysics Data System (ADS)
Badia, Santiago; Verdugo, Francesc; Martín, Alberto F.
2018-07-01
Unfitted finite element techniques are valuable tools in different applications where the generation of body-fitted meshes is difficult. However, these techniques are prone to severe ill conditioning problems that obstruct the efficient use of iterative Krylov methods and, in consequence, hinders the practical usage of unfitted methods for realistic large scale applications. In this work, we present a technique that addresses such conditioning problems by constructing enhanced finite element spaces based on a cell aggregation technique. The presented method, called aggregated unfitted finite element method, is easy to implement, and can be used, in contrast to previous works, in Galerkin approximations of coercive problems with conforming Lagrangian finite element spaces. The mathematical analysis of the new method states that the condition number of the resulting linear system matrix scales as in standard finite elements for body-fitted meshes, without being affected by small cut cells, and that the method leads to the optimal finite element convergence order. These theoretical results are confirmed with 2D and 3D numerical experiments.
TH-AB-BRA-09: Stability Analysis of a Novel Dose Calculation Algorithm for MRI Guided Radiotherapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zelyak, O; Fallone, B; Cross Cancer Institute, Edmonton, AB
2016-06-15
Purpose: To determine the iterative deterministic solution stability of the Linear Boltzmann Transport Equation (LBTE) in the presence of magnetic fields. Methods: The LBTE with magnetic fields under investigation is derived using a discrete ordinates approach. The stability analysis is performed using analytical and numerical methods. Analytically, the spectral Fourier analysis is used to obtain the convergence rate of the source iteration procedures based on finding the largest eigenvalue of the iterative operator. This eigenvalue is a function of relevant physical parameters, such as magnetic field strength and material properties, and provides essential information about the domain of applicability requiredmore » for clinically optimal parameter selection and maximum speed of convergence. The analytical results are reinforced by numerical simulations performed using the same discrete ordinates method in angle, and a discontinuous finite element spatial approach. Results: The spectral radius for the source iteration technique of the time independent transport equation with isotropic and anisotropic scattering centers inside infinite 3D medium is equal to the ratio of differential and total cross sections. The result is confirmed numerically by solving LBTE and is in full agreement with previously published results. The addition of magnetic field reveals that the convergence becomes dependent on the strength of magnetic field, the energy group discretization, and the order of anisotropic expansion. Conclusion: The source iteration technique for solving the LBTE with magnetic fields with the discrete ordinates method leads to divergent solutions in the limiting cases of small energy discretizations and high magnetic field strengths. Future investigations into non-stationary Krylov subspace techniques as an iterative solver will be performed as this has been shown to produce greater stability than source iteration. Furthermore, a stability analysis of a discontinuous finite element space-angle approach (which has been shown to provide the greatest stability) will also be investigated. Dr. B Gino Fallone is a co-founder and CEO of MagnetTx Oncology Solutions (under discussions to license Alberta bi-planar linac MR for commercialization)« less
Density-matrix-based algorithm for solving eigenvalue problems
NASA Astrophysics Data System (ADS)
Polizzi, Eric
2009-03-01
A fast and stable numerical algorithm for solving the symmetric eigenvalue problem is presented. The technique deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms) or other Davidson-Jacobi techniques and takes its inspiration from the contour integration and density-matrix representation in quantum mechanics. It will be shown that this algorithm—named FEAST—exhibits high efficiency, robustness, accuracy, and scalability on parallel architectures. Examples from electronic structure calculations of carbon nanotubes are presented, and numerical performances and capabilities are discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Druskin, V.; Lee, Ping; Knizhnerman, L.
There is now a growing interest in the area of using Krylov subspace approximations to compute the actions of matrix functions. The main application of this approach is the solution of ODE systems, obtained after discretization of partial differential equations by method of lines. In the event that the cost of computing the matrix inverse is relatively inexpensive, it is sometimes attractive to solve the ODE using the extended Krylov subspaces, originated by actions of both positive and negative matrix powers. Examples of such problems can be found frequently in computational electromagnetics.
Aviat, Félix; Levitt, Antoine; Stamm, Benjamin; Maday, Yvon; Ren, Pengyu; Ponder, Jay W; Lagardère, Louis; Piquemal, Jean-Philip
2017-01-10
We introduce a new class of methods, denoted as Truncated Conjugate Gradient(TCG), to solve the many-body polarization energy and its associated forces in molecular simulations (i.e. molecular dynamics (MD) and Monte Carlo). The method consists in a fixed number of Conjugate Gradient (CG) iterations. TCG approaches provide a scalable solution to the polarization problem at a user-chosen cost and a corresponding optimal accuracy. The optimality of the CG-method guarantees that the number of the required matrix-vector products are reduced to a minimum compared to other iterative methods. This family of methods is non-empirical, fully adaptive, and provides analytical gradients, avoiding therefore any energy drift in MD as compared to popular iterative solvers. Besides speed, one great advantage of this class of approximate methods is that their accuracy is systematically improvable. Indeed, as the CG-method is a Krylov subspace method, the associated error is monotonically reduced at each iteration. On top of that, two improvements can be proposed at virtually no cost: (i) the use of preconditioners can be employed, which leads to the Truncated Preconditioned Conjugate Gradient (TPCG); (ii) since the residual of the final step of the CG-method is available, one additional Picard fixed point iteration ("peek"), equivalent to one step of Jacobi Over Relaxation (JOR) with relaxation parameter ω, can be made at almost no cost. This method is denoted by TCG-n(ω). Black-box adaptive methods to find good choices of ω are provided and discussed. Results show that TPCG-3(ω) is converged to high accuracy (a few kcal/mol) for various types of systems including proteins and highly charged systems at the fixed cost of four matrix-vector products: three CG iterations plus the initial CG descent direction. Alternatively, T(P)CG-2(ω) provides robust results at a reduced cost (three matrix-vector products) and offers new perspectives for long polarizable MD as a production algorithm. The T(P)CG-1(ω) level provides less accurate solutions for inhomogeneous systems, but its applicability to well-conditioned problems such as water is remarkable, with only two matrix-vector product evaluations.
2016-01-01
We introduce a new class of methods, denoted as Truncated Conjugate Gradient(TCG), to solve the many-body polarization energy and its associated forces in molecular simulations (i.e. molecular dynamics (MD) and Monte Carlo). The method consists in a fixed number of Conjugate Gradient (CG) iterations. TCG approaches provide a scalable solution to the polarization problem at a user-chosen cost and a corresponding optimal accuracy. The optimality of the CG-method guarantees that the number of the required matrix-vector products are reduced to a minimum compared to other iterative methods. This family of methods is non-empirical, fully adaptive, and provides analytical gradients, avoiding therefore any energy drift in MD as compared to popular iterative solvers. Besides speed, one great advantage of this class of approximate methods is that their accuracy is systematically improvable. Indeed, as the CG-method is a Krylov subspace method, the associated error is monotonically reduced at each iteration. On top of that, two improvements can be proposed at virtually no cost: (i) the use of preconditioners can be employed, which leads to the Truncated Preconditioned Conjugate Gradient (TPCG); (ii) since the residual of the final step of the CG-method is available, one additional Picard fixed point iteration (“peek”), equivalent to one step of Jacobi Over Relaxation (JOR) with relaxation parameter ω, can be made at almost no cost. This method is denoted by TCG-n(ω). Black-box adaptive methods to find good choices of ω are provided and discussed. Results show that TPCG-3(ω) is converged to high accuracy (a few kcal/mol) for various types of systems including proteins and highly charged systems at the fixed cost of four matrix-vector products: three CG iterations plus the initial CG descent direction. Alternatively, T(P)CG-2(ω) provides robust results at a reduced cost (three matrix-vector products) and offers new perspectives for long polarizable MD as a production algorithm. The T(P)CG-1(ω) level provides less accurate solutions for inhomogeneous systems, but its applicability to well-conditioned problems such as water is remarkable, with only two matrix-vector product evaluations. PMID:28068773
NASA Astrophysics Data System (ADS)
Jerez-Hanckes, Carlos; Pérez-Arancibia, Carlos; Turc, Catalin
2017-12-01
We present Nyström discretizations of multitrace/singletrace formulations and non-overlapping Domain Decomposition Methods (DDM) for the solution of Helmholtz transmission problems for bounded composite scatterers with piecewise constant material properties. We investigate the performance of DDM with both classical Robin and optimized transmission boundary conditions. The optimized transmission boundary conditions incorporate square root Fourier multiplier approximations of Dirichlet to Neumann operators. While the multitrace/singletrace formulations as well as the DDM that use classical Robin transmission conditions are not particularly well suited for Krylov subspace iterative solutions of high-contrast high-frequency Helmholtz transmission problems, we provide ample numerical evidence that DDM with optimized transmission conditions constitute efficient computational alternatives for these type of applications. In the case of large numbers of subdomains with different material properties, we show that the associated DDM linear system can be efficiently solved via hierarchical Schur complements elimination.
Accelerating molecular property calculations with nonorthonormal Krylov space methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Furche, Filipp; Krull, Brandon T.; Nguyen, Brian D.
Here, we formulate Krylov space methods for large eigenvalue problems and linear equation systems that take advantage of decreasing residual norms to reduce the cost of matrix-vector multiplication. The residuals are used as subspace basis without prior orthonormalization, which leads to generalized eigenvalue problems or linear equation systems on the Krylov space. These nonorthonormal Krylov space (nKs) algorithms are favorable for large matrices with irregular sparsity patterns whose elements are computed on the fly, because fewer operations are necessary as the residual norm decreases as compared to the conventional method, while errors in the desired eigenpairs and solution vectors remainmore » small. We consider real symmetric and symplectic eigenvalue problems as well as linear equation systems and Sylvester equations as they appear in configuration interaction and response theory. The nKs method can be implemented in existing electronic structure codes with minor modifications and yields speed-ups of 1.2-1.8 in typical time-dependent Hartree-Fock and density functional applications without accuracy loss. The algorithm can compute entire linear subspaces simultaneously which benefits electronic spectra and force constant calculations requiring many eigenpairs or solution vectors. The nKs approach is related to difference density methods in electronic ground state calculations, and particularly efficient for integral direct computations of exchange-type contractions. By combination with resolution-of-the-identity methods for Coulomb contractions, three- to fivefold speed-ups of hybrid time-dependent density functional excited state and response calculations are achieved.« less
Accelerating molecular property calculations with nonorthonormal Krylov space methods
Furche, Filipp; Krull, Brandon T.; Nguyen, Brian D.; ...
2016-05-03
Here, we formulate Krylov space methods for large eigenvalue problems and linear equation systems that take advantage of decreasing residual norms to reduce the cost of matrix-vector multiplication. The residuals are used as subspace basis without prior orthonormalization, which leads to generalized eigenvalue problems or linear equation systems on the Krylov space. These nonorthonormal Krylov space (nKs) algorithms are favorable for large matrices with irregular sparsity patterns whose elements are computed on the fly, because fewer operations are necessary as the residual norm decreases as compared to the conventional method, while errors in the desired eigenpairs and solution vectors remainmore » small. We consider real symmetric and symplectic eigenvalue problems as well as linear equation systems and Sylvester equations as they appear in configuration interaction and response theory. The nKs method can be implemented in existing electronic structure codes with minor modifications and yields speed-ups of 1.2-1.8 in typical time-dependent Hartree-Fock and density functional applications without accuracy loss. The algorithm can compute entire linear subspaces simultaneously which benefits electronic spectra and force constant calculations requiring many eigenpairs or solution vectors. The nKs approach is related to difference density methods in electronic ground state calculations, and particularly efficient for integral direct computations of exchange-type contractions. By combination with resolution-of-the-identity methods for Coulomb contractions, three- to fivefold speed-ups of hybrid time-dependent density functional excited state and response calculations are achieved.« less
NASA Astrophysics Data System (ADS)
Citro, V.; Luchini, P.; Giannetti, F.; Auteri, F.
2017-09-01
The study of the stability of a dynamical system described by a set of partial differential equations (PDEs) requires the computation of unstable states as the control parameter exceeds its critical threshold. Unfortunately, the discretization of the governing equations, especially for fluid dynamic applications, often leads to very large discrete systems. As a consequence, matrix based methods, like for example the Newton-Raphson algorithm coupled with a direct inversion of the Jacobian matrix, lead to computational costs too large in terms of both memory and execution time. We present a novel iterative algorithm, inspired by Krylov-subspace methods, which is able to compute unstable steady states and/or accelerate the convergence to stable configurations. Our new algorithm is based on the minimization of the residual norm at each iteration step with a projection basis updated at each iteration rather than at periodic restarts like in the classical GMRES method. The algorithm is able to stabilize any dynamical system without increasing the computational time of the original numerical procedure used to solve the governing equations. Moreover, it can be easily inserted into a pre-existing relaxation (integration) procedure with a call to a single black-box subroutine. The procedure is discussed for problems of different sizes, ranging from a small two-dimensional system to a large three-dimensional problem involving the Navier-Stokes equations. We show that the proposed algorithm is able to improve the convergence of existing iterative schemes. In particular, the procedure is applied to the subcritical flow inside a lid-driven cavity. We also discuss the application of Boostconv to compute the unstable steady flow past a fixed circular cylinder (2D) and boundary-layer flow over a hemispherical roughness element (3D) for supercritical values of the Reynolds number. We show that Boostconv can be used effectively with any spatial discretization, be it a finite-difference, finite-volume, finite-element or spectral method.
NASA Astrophysics Data System (ADS)
Han, B.; Li, Y.
2016-12-01
We present a three-dimensional (3D) forward and inverse modeling code for marine controlled-source electromagnetic (CSEM) surveys in anisotropic media. The forward solution is based on a primary/secondary field approach, in which secondary fields are solved using a staggered finite-volume (FV) method and primary fields are solved for 1D isotropic background models analytically. It is shown that it is rather straightforward to extend the isotopic 3D FV algorithm to a triaxial anisotropic one, while additional coefficients are required to account for full tensor conductivity. To solve the linear system resulting from FV discretization of Maxwell' s equations, both iterative Krylov solvers (e.g. BiCGSTAB) and direct solvers (e.g. MUMPS) have been implemented, makes the code flexible for different computing platforms and different problems. For iterative soloutions, the linear system in terms of electromagnetic potentials (A-Phi) is used to precondition the original linear system, transforming the discretized Curl-Curl equations to discretized Laplace-like equations, thus much more favorable numerical properties can be obtained. Numerical experiments suggest that this A-Phi preconditioner can dramatically improve the convergence rate of an iterative solver and high accuracy can be achieved without divergence correction even for low frequencies. To efficiently calculate the sensitivities, i.e. the derivatives of CSEM data with respect to tensor conductivity, the adjoint method is employed. For inverse modeling, triaxial anisotropy is taken into account. Since the number of model parameters to be resolved of triaxial anisotropic medias is twice or thrice that of isotropic medias, the data-space version of the Gauss-Newton (GN) minimization method is preferred due to its lower computational cost compared with the traditional model-space GN method. We demonstrate the effectiveness of the code with synthetic examples.
NASA Astrophysics Data System (ADS)
Li, Meng; Gu, Xian-Ming; Huang, Chengming; Fei, Mingfa; Zhang, Guoyu
2018-04-01
In this paper, a fast linearized conservative finite element method is studied for solving the strongly coupled nonlinear fractional Schrödinger equations. We prove that the scheme preserves both the mass and energy, which are defined by virtue of some recursion relationships. Using the Sobolev inequalities and then employing the mathematical induction, the discrete scheme is proved to be unconditionally convergent in the sense of L2-norm and H α / 2-norm, which means that there are no any constraints on the grid ratios. Then, the prior bound of the discrete solution in L2-norm and L∞-norm are also obtained. Moreover, we propose an iterative algorithm, by which the coefficient matrix is independent of the time level, and thus it leads to Toeplitz-like linear systems that can be efficiently solved by Krylov subspace solvers with circulant preconditioners. This method can reduce the memory requirement of the proposed linearized finite element scheme from O (M2) to O (M) and the computational complexity from O (M3) to O (Mlog M) in each iterative step, where M is the number of grid nodes. Finally, numerical results are carried out to verify the correction of the theoretical analysis, simulate the collision of two solitary waves, and show the utility of the fast numerical solution techniques.
Implementation of the block-Krylov boundary flexibility method of component synthesis
NASA Technical Reports Server (NTRS)
Carney, Kelly S.; Abdallah, Ayman A.; Hucklebridge, Arthur A.
1993-01-01
A method of dynamic substructuring is presented which utilizes a set of static Ritz vectors as a replacement for normal eigenvectors in component mode synthesis. This set of Ritz vectors is generated in a recurrence relationship, which has the form of a block-Krylov subspace. The initial seed to the recurrence algorithm is based on the boundary flexibility vectors of the component. This algorithm is not load-dependent, is applicable to both fixed and free-interface boundary components, and results in a general component model appropriate for any type of dynamic analysis. This methodology was implemented in the MSC/NASTRAN normal modes solution sequence using DMAP. The accuracy is found to be comparable to that of component synthesis based upon normal modes. The block-Krylov recurrence algorithm is a series of static solutions and so requires significantly less computation than solving the normal eigenspace problem.
FAST SIMULATION OF SOLID TUMORS THERMAL ABLATION TREATMENTS WITH A 3D REACTION DIFFUSION MODEL *
BERTACCINI, DANIELE; CALVETTI, DANIELA
2007-01-01
An efficient computational method for near real-time simulation of thermal ablation of tumors via radio frequencies is proposed. Model simulations of the temperature field in a 3D portion of tissue containing the tumoral mass for different patterns of source heating can be used to design the ablation procedure. The availability of a very efficient computational scheme makes it possible update the predicted outcome of the procedure in real time. In the algorithms proposed here a discretization in space of the governing equations is followed by an adaptive time integration based on implicit multistep formulas. A modification of the ode15s MATLAB function which uses Krylov space iterative methods for the solution of for the linear systems arising at each integration step makes it possible to perform the simulations on standard desktop for much finer grids than using the built-in ode15s. The proposed algorithm can be applied to a wide class of nonlinear parabolic differential equations. PMID:17173888
A method for exponential propagation of large systems of stiff nonlinear differential equations
NASA Technical Reports Server (NTRS)
Friesner, Richard A.; Tuckerman, Laurette S.; Dornblaser, Bright C.; Russo, Thomas V.
1989-01-01
A new time integrator for large, stiff systems of linear and nonlinear coupled differential equations is described. For linear systems, the method consists of forming a small (5-15-term) Krylov space using the Jacobian of the system and carrying out exact exponential propagation within this space. Nonlinear corrections are incorporated via a convolution integral formalism; the integral is evaluated via approximate Krylov methods as well. Gains in efficiency ranging from factors of 2 to 30 are demonstrated for several test problems as compared to a forward Euler scheme and to the integration package LSODE.
Tensor-GMRES method for large sparse systems of nonlinear equations
NASA Technical Reports Server (NTRS)
Feng, Dan; Pulliam, Thomas H.
1994-01-01
This paper introduces a tensor-Krylov method, the tensor-GMRES method, for large sparse systems of nonlinear equations. This method is a coupling of tensor model formation and solution techniques for nonlinear equations with Krylov subspace projection techniques for unsymmetric systems of linear equations. Traditional tensor methods for nonlinear equations are based on a quadratic model of the nonlinear function, a standard linear model augmented by a simple second order term. These methods are shown to be significantly more efficient than standard methods both on nonsingular problems and on problems where the Jacobian matrix at the solution is singular. A major disadvantage of the traditional tensor methods is that the solution of the tensor model requires the factorization of the Jacobian matrix, which may not be suitable for problems where the Jacobian matrix is large and has a 'bad' sparsity structure for an efficient factorization. We overcome this difficulty by forming and solving the tensor model using an extension of a Newton-GMRES scheme. Like traditional tensor methods, we show that the new tensor method has significant computational advantages over the analogous Newton counterpart. Consistent with Krylov subspace based methods, the new tensor method does not depend on the factorization of the Jacobian matrix. As a matter of fact, the Jacobian matrix is never needed explicitly.
Some Remarks on GMRES for Transport Theory
NASA Technical Reports Server (NTRS)
Patton, Bruce W.; Holloway, James Paul
2003-01-01
We review some work on the application of GMRES to the solution of the discrete ordinates transport equation in one-dimension. We note that GMRES can be applied directly to the angular flux vector, or it can be applied to only a vector of flux moments as needed to compute the scattering operator of the transport equation. In the former case we illustrate both the delights and defects of ILU right-preconditioners for problems with anisotropic scatter and for problems with upscatter. When working with flux moments we note that GMRES can be used as an accelerator for any existing transport code whose solver is based on a stationary fixed-point iteration, including transport sweeps and DSA transport sweeps. We also provide some numerical illustrations of this idea. We finally show how space can be traded for speed by taking multiple transport sweeps per GMRES iteration. Key Words: transport equation, GMRES, Krylov subspace
NASA Astrophysics Data System (ADS)
Raburn, Daniel Louis
We have developed a preconditioned, globalized Jacobian-free Newton-Krylov (JFNK) solver for calculating equilibria with magnetic islands. The solver has been developed in conjunction with the Princeton Iterative Equilibrium Solver (PIES) and includes two notable enhancements over a traditional JFNK scheme: (1) globalization of the algorithm by a sophisticated backtracking scheme, which optimizes between the Newton and steepest-descent directions; and, (2) adaptive preconditioning, wherein information regarding the system Jacobian is reused between Newton iterations to form a preconditioner for our GMRES-like linear solver. We have developed a formulation for calculating saturated neoclassical tearing modes (NTMs) which accounts for the incomplete loss of a bootstrap current due to gradients of multiple physical quantities. We have applied the coupled PIES-JFNK solver to calculate saturated island widths on several shots from the Tokamak Fusion Test Reactor (TFTR) and have found reasonable agreement with experimental measurement.
Bardhan, Jaydeep P
2008-10-14
The importance of molecular electrostatic interactions in aqueous solution has motivated extensive research into physical models and numerical methods for their estimation. The computational costs associated with simulations that include many explicit water molecules have driven the development of implicit-solvent models, with generalized-Born (GB) models among the most popular of these. In this paper, we analyze a boundary-integral equation interpretation for the Coulomb-field approximation (CFA), which plays a central role in most GB models. This interpretation offers new insights into the nature of the CFA, which traditionally has been assessed using only a single point charge in the solute. The boundary-integral interpretation of the CFA allows the use of multiple point charges, or even continuous charge distributions, leading naturally to methods that eliminate the interpolation inaccuracies associated with the Still equation. This approach, which we call boundary-integral-based electrostatic estimation by the CFA (BIBEE/CFA), is most accurate when the molecular charge distribution generates a smooth normal displacement field at the solute-solvent boundary, and CFA-based GB methods perform similarly. Conversely, both methods are least accurate for charge distributions that give rise to rapidly varying or highly localized normal displacement fields. Supporting this analysis are comparisons of the reaction-potential matrices calculated using GB methods and boundary-element-method (BEM) simulations. An approximation similar to BIBEE/CFA exhibits complementary behavior, with superior accuracy for charge distributions that generate rapidly varying normal fields and poorer accuracy for distributions that produce smooth fields. This approximation, BIBEE by preconditioning (BIBEE/P), essentially generates initial guesses for preconditioned Krylov-subspace iterative BEMs. Thus, iterative refinement of the BIBEE/P results recovers the BEM solution; excellent agreement is obtained in only a few iterations. The boundary-integral-equation framework may also provide a means to derive rigorous results explaining how the empirical correction terms in many modern GB models significantly improve accuracy despite their simple analytical forms.
Globalized Newton-Krylov-Schwarz Algorithms and Software for Parallel Implicit CFD
NASA Technical Reports Server (NTRS)
Gropp, W. D.; Keyes, D. E.; McInnes, L. C.; Tidriri, M. D.
1998-01-01
Implicit solution methods are important in applications modeled by PDEs with disparate temporal and spatial scales. Because such applications require high resolution with reasonable turnaround, "routine" parallelization is essential. The pseudo-transient matrix-free Newton-Krylov-Schwarz (Psi-NKS) algorithmic framework is presented as an answer. We show that, for the classical problem of three-dimensional transonic Euler flow about an M6 wing, Psi-NKS can simultaneously deliver: globalized, asymptotically rapid convergence through adaptive pseudo- transient continuation and Newton's method-, reasonable parallelizability for an implicit method through deferred synchronization and favorable communication-to-computation scaling in the Krylov linear solver; and high per- processor performance through attention to distributed memory and cache locality, especially through the Schwarz preconditioner. Two discouraging features of Psi-NKS methods are their sensitivity to the coding of the underlying PDE discretization and the large number of parameters that must be selected to govern convergence. We therefore distill several recommendations from our experience and from our reading of the literature on various algorithmic components of Psi-NKS, and we describe a freely available, MPI-based portable parallel software implementation of the solver employed here.
Robust parallel iterative solvers for linear and least-squares problems, Final Technical Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saad, Yousef
2014-01-16
The primary goal of this project is to study and develop robust iterative methods for solving linear systems of equations and least squares systems. The focus of the Minnesota team is on algorithms development, robustness issues, and on tests and validation of the methods on realistic problems. 1. The project begun with an investigation on how to practically update a preconditioner obtained from an ILU-type factorization, when the coefficient matrix changes. 2. We investigated strategies to improve robustness in parallel preconditioners in a specific case of a PDE with discontinuous coefficients. 3. We explored ways to adapt standard preconditioners formore » solving linear systems arising from the Helmholtz equation. These are often difficult linear systems to solve by iterative methods. 4. We have also worked on purely theoretical issues related to the analysis of Krylov subspace methods for linear systems. 5. We developed an effective strategy for performing ILU factorizations for the case when the matrix is highly indefinite. The strategy uses shifting in some optimal way. The method was extended to the solution of Helmholtz equations by using complex shifts, yielding very good results in many cases. 6. We addressed the difficult problem of preconditioning sparse systems of equations on GPUs. 7. A by-product of the above work is a software package consisting of an iterative solver library for GPUs based on CUDA. This was made publicly available. It was the first such library that offers complete iterative solvers for GPUs. 8. We considered another form of ILU which blends coarsening techniques from Multigrid with algebraic multilevel methods. 9. We have released a new version on our parallel solver - called pARMS [new version is version 3]. As part of this we have tested the code in complex settings - including the solution of Maxwell and Helmholtz equations and for a problem of crystal growth.10. As an application of polynomial preconditioning we considered the problem of evaluating f(A)v which arises in statistical sampling. 11. As an application to the methods we developed, we tackled the problem of computing the diagonal of the inverse of a matrix. This arises in statistical applications as well as in many applications in physics. We explored probing methods as well as domain-decomposition type methods. 12. A collaboration with researchers from Toulouse, France, considered the important problem of computing the Schur complement in a domain-decomposition approach. 13. We explored new ways of preconditioning linear systems, based on low-rank approximations.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kılıç, Emre, E-mail: emre.kilic@tum.de; Eibert, Thomas F.
An approach combining boundary integral and finite element methods is introduced for the solution of three-dimensional inverse electromagnetic medium scattering problems. Based on the equivalence principle, unknown equivalent electric and magnetic surface current densities on a closed surface are utilized to decompose the inverse medium problem into two parts: a linear radiation problem and a nonlinear cavity problem. The first problem is formulated by a boundary integral equation, the computational burden of which is reduced by employing the multilevel fast multipole method (MLFMM). Reconstructed Cauchy data on the surface allows the utilization of the Lorentz reciprocity and the Poynting's theorems.more » Exploiting these theorems, the noise level and an initial guess are estimated for the cavity problem. Moreover, it is possible to determine whether the material is lossy or not. In the second problem, the estimated surface currents form inhomogeneous boundary conditions of the cavity problem. The cavity problem is formulated by the finite element technique and solved iteratively by the Gauss–Newton method to reconstruct the properties of the object. Regularization for both the first and the second problems is achieved by a Krylov subspace method. The proposed method is tested against both synthetic and experimental data and promising reconstruction results are obtained.« less
On the implementation of an accurate and efficient solver for convection-diffusion equations
NASA Astrophysics Data System (ADS)
Wu, Chin-Tien
In this dissertation, we examine several different aspects of computing the numerical solution of the convection-diffusion equation. The solution of this equation often exhibits sharp gradients due to Dirichlet outflow boundaries or discontinuities in boundary conditions. Because of the singular-perturbed nature of the equation, numerical solutions often have severe oscillations when grid sizes are not small enough to resolve sharp gradients. To overcome such difficulties, the streamline diffusion discretization method can be used to obtain an accurate approximate solution in regions where the solution is smooth. To increase accuracy of the solution in the regions containing layers, adaptive mesh refinement and mesh movement based on a posteriori error estimations can be employed. An error-adapted mesh refinement strategy based on a posteriori error estimations is also proposed to resolve layers. For solving the sparse linear systems that arise from discretization, goemetric multigrid (MG) and algebraic multigrid (AMG) are compared. In addition, both methods are also used as preconditioners for Krylov subspace methods. We derive some convergence results for MG with line Gauss-Seidel smoothers and bilinear interpolation. Finally, while considering adaptive mesh refinement as an integral part of the solution process, it is natural to set a stopping tolerance for the iterative linear solvers on each mesh stage so that the difference between the approximate solution obtained from iterative methods and the finite element solution is bounded by an a posteriori error bound. Here, we present two stopping criteria. The first is based on a residual-type a posteriori error estimator developed by Verfurth. The second is based on an a posteriori error estimator, using local solutions, developed by Kay and Silvester. Our numerical results show the refined mesh obtained from the iterative solution which satisfies the second criteria is similar to the refined mesh obtained from the finite element solution.
Zou, Ling; Zhao, Haihua; Zhang, Hongbin
2016-08-24
This study presents a numerical investigation on using the Jacobian-free Newton–Krylov (JFNK) method to solve the two-phase flow four-equation drift flux model with realistic constitutive correlations (‘closure models’). The drift flux model is based on Isshi and his collaborators’ work. Additional constitutive correlations for vertical channel flow, such as two-phase flow pressure drop, flow regime map, wall boiling and interfacial heat transfer models, were taken from the RELAP5-3D Code Manual and included to complete the model. The staggered grid finite volume method and fully implicit backward Euler method was used for the spatial discretization and time integration schemes, respectively. Themore » Jacobian-free Newton–Krylov method shows no difficulty in solving the two-phase flow drift flux model with a discrete flow regime map. In addition to the Jacobian-free approach, the preconditioning matrix is obtained by using the default finite differencing method provided in the PETSc package, and consequently the labor-intensive implementation of complex analytical Jacobian matrix is avoided. Extensive and successful numerical verification and validation have been performed to prove the correct implementation of the models and methods. Code-to-code comparison with RELAP5-3D has further demonstrated the successful implementation of the drift flux model.« less
NASA Astrophysics Data System (ADS)
Arteaga, Santiago Egido
1998-12-01
The steady-state Navier-Stokes equations are of considerable interest because they are used to model numerous common physical phenomena. The applications encountered in practice often involve small viscosities and complicated domain geometries, and they result in challenging problems in spite of the vast attention that has been dedicated to them. In this thesis we examine methods for computing the numerical solution of the primitive variable formulation of the incompressible equations on distributed memory parallel computers. We use the Galerkin method to discretize the differential equations, although most results are stated so that they apply also to stabilized methods. We also reformulate some classical results in a single framework and discuss some issues frequently dismissed in the literature, such as the implementation of pressure space basis and non- homogeneous boundary values. We consider three nonlinear methods: Newton's method, Oseen's (or Picard) iteration, and sequences of Stokes problems. All these iterative nonlinear methods require solving a linear system at every step. Newton's method has quadratic convergence while that of the others is only linear; however, we obtain theoretical bounds showing that Oseen's iteration is more robust, and we confirm it experimentally. In addition, although Oseen's iteration usually requires more iterations than Newton's method, the linear systems it generates tend to be simpler and its overall costs (in CPU time) are lower. The Stokes problems result in linear systems which are easier to solve, but its convergence is much slower, so that it is competitive only for large viscosities. Inexact versions of these methods are studied, and we explain why the best timings are obtained using relatively modest error tolerances in solving the corresponding linear systems. We also present a new damping optimization strategy based on the quadratic nature of the Navier-Stokes equations, which improves the robustness of all the linearization strategies considered and whose computational cost is negligible. The algebraic properties of these systems depend on both the discretization and nonlinear method used. We study in detail the positive definiteness and skewsymmetry of the advection submatrices (essentially, convection-diffusion problems). We propose a discretization based on a new trilinear form for Newton's method. We solve the linear systems using three Krylov subspace methods, GMRES, QMR and TFQMR, and compare the advantages of each. Our emphasis is on parallel algorithms, and so we consider preconditioners suitable for parallel computers such as line variants of the Jacobi and Gauss- Seidel methods, alternating direction implicit methods, and Chebyshev and least squares polynomial preconditioners. These work well for moderate viscosities (moderate Reynolds number). For small viscosities we show that effective parallel solution of the advection subproblem is a critical factor to improve performance. Implementation details on a CM-5 are presented.
An application of the Krylov-FSP-SSA method to parameter fitting with maximum likelihood
NASA Astrophysics Data System (ADS)
Dinh, Khanh N.; Sidje, Roger B.
2017-12-01
Monte Carlo methods such as the stochastic simulation algorithm (SSA) have traditionally been employed in gene regulation problems. However, there has been increasing interest to directly obtain the probability distribution of the molecules involved by solving the chemical master equation (CME). This requires addressing the curse of dimensionality that is inherent in most gene regulation problems. The finite state projection (FSP) seeks to address the challenge and there have been variants that further reduce the size of the projection or that accelerate the resulting matrix exponential. The Krylov-FSP-SSA variant has proved numerically efficient by combining, on one hand, the SSA to adaptively drive the FSP, and on the other hand, adaptive Krylov techniques to evaluate the matrix exponential. Here we apply this Krylov-FSP-SSA to a mutual inhibitory gene network synthetically engineered in Saccharomyces cerevisiae, in which bimodality arises. We show numerically that the approach can efficiently approximate the transient probability distribution, and this has important implications for parameter fitting, where the CME has to be solved for many different parameter sets. The fitting scheme amounts to an optimization problem of finding the parameter set so that the transient probability distributions fit the observations with maximum likelihood. We compare five optimization schemes for this difficult problem, thereby providing further insights into this approach of parameter estimation that is often applied to models in systems biology where there is a need to calibrate free parameters. Work supported by NSF grant DMS-1320849.
Hydrodynamics of suspensions of passive and active rigid particles: a rigid multiblob approach
Usabiaga, Florencio Balboa; Kallemov, Bakytzhan; Delmotte, Blaise; ...
2016-01-12
We develop a rigid multiblob method for numerically solving the mobility problem for suspensions of passive and active rigid particles of complex shape in Stokes flow in unconfined, partially confined, and fully confined geometries. As in a number of existing methods, we discretize rigid bodies using a collection of minimally resolved spherical blobs constrained to move as a rigid body, to arrive at a potentially large linear system of equations for the unknown Lagrange multipliers and rigid-body motions. Here we develop a block-diagonal preconditioner for this linear system and show that a standard Krylov solver converges in a modest numbermore » of iterations that is essentially independent of the number of particles. Key to the efficiency of the method is a technique for fast computation of the product of the blob-blob mobility matrix and a vector. For unbounded suspensions, we rely on existing analytical expressions for the Rotne-Prager-Yamakawa tensor combined with a fast multipole method (FMM) to obtain linear scaling in the number of particles. For suspensions sedimented against a single no-slip boundary, we use a direct summation on a graphical processing unit (GPU), which gives quadratic asymptotic scaling with the number of particles. For fully confined domains, such as periodic suspensions or suspensions confined in slit and square channels, we extend a recently developed rigid-body immersed boundary method by B. Kallemov, A. P. S. Bhalla, B. E. Griffith, and A. Donev (Commun. Appl. Math. Comput. Sci. 11 (2016), no. 1, 79-141) to suspensions of freely moving passive or active rigid particles at zero Reynolds number. We demonstrate that the iterative solver for the coupled fluid and rigid-body equations converges in a bounded number of iterations regardless of the system size. In our approach, each iteration only requires a few cycles of a geometric multigrid solver for the Poisson equation, and an application of the block-diagonal preconditioner, leading to linear scaling with the number of particles. We optimize a number of parameters in the iterative solvers and apply our method to a variety of benchmark problems to carefully assess the accuracy of the rigid multiblob approach as a function of the resolution. We also model the dynamics of colloidal particles studied in recent experiments, such as passive boomerangs in a slit channel, as well as a pair of non-Brownian active nanorods sedimented against a wall.« less
NASA Astrophysics Data System (ADS)
Sanan, Patrick; May, Dave A.; Schenk, Olaf; Bollhöffer, Matthias
2017-04-01
Geodynamics simulations typically involve the repeated solution of saddle-point systems arising from the Stokes equations. These computations often dominate the time to solution. Direct solvers are known for their robustness and ``black box'' properties, yet exhibit superlinear memory requirements and time to solution. More complex multilevel-preconditioned iterative solvers have been very successful for large problems, yet their use can require more effort from the practitioner in terms of setting up a solver and choosing its parameters. We champion an intermediate approach, based on leveraging the power of modern incomplete factorization techniques for indefinite symmetric matrices. These provide an interesting alternative in situations in between the regimes where direct solvers are an obvious choice and those where complex, scalable, iterative solvers are an obvious choice. That is, much like their relatives for definite systems, ILU/ICC-preconditioned Krylov methods and ILU/ICC-smoothed multigrid methods, the approaches demonstrated here provide a useful addition to the solver toolkit. We present results with a simple, PETSc-based, open-source Q2-Q1 (Taylor-Hood) finite element discretization, in 2 and 3 dimensions, with the Stokes and Lamé (linear elasticity) saddle point systems. Attention is paid to cases in which full-operator incomplete factorization gives an improvement in time to solution over direct solution methods (which may not even be feasible due to memory limitations), without the complication of more complex (or at least, less-automatic) preconditioners or smoothers. As an important factor in the relevance of these tools is their availability in portable software, we also describe open-source PETSc interfaces to the factorization routines.
PIXIE3D: A Parallel, Implicit, eXtended MHD 3D Code.
NASA Astrophysics Data System (ADS)
Chacon, L.; Knoll, D. A.
2004-11-01
We report on the development of PIXIE3D, a 3D parallel, fully implicit Newton-Krylov extended primitive-variable MHD code in general curvilinear geometry. PIXIE3D employs a second-order, finite-volume-based spatial discretization that satisfies remarkable properties such as being conservative, solenoidal in the magnetic field, non-dissipative, and stable in the absence of physical dissipation.(L. Chacón , phComput. Phys. Comm.) submitted (2004) PIXIE3D employs fully-implicit Newton-Krylov methods for the time advance. Currently, first and second-order implicit schemes are available, although higher-order temporal implicit schemes can be effortlessly implemented within the Newton-Krylov framework. A successful, scalable, MG physics-based preconditioning strategy, similar in concept to previous 2D MHD efforts,(L. Chacón et al., phJ. Comput. Phys). 178 (1), 15- 36 (2002); phJ. Comput. Phys., 188 (2), 573-592 (2003) has been developed. We are currently in the process of parallelizing the code using the PETSc library, and a Newton-Krylov-Schwarz approach for the parallel treatment of the preconditioner. In this poster, we will report on both the serial and parallel performance of PIXIE3D, focusing primarily on scalability and CPU speedup vs. an explicit approach.
SIERRA Multimechanics Module: Aria User Manual Version 4.44
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sierra Thermal /Fluid Team
2017-04-01
Aria is a Galerkin fnite element based program for solving coupled-physics problems described by systems of PDEs and is capable of solving nonlinear, implicit, transient and direct-to-steady state problems in two and three dimensions on parallel architectures. The suite of physics currently supported by Aria includes thermal energy transport, species transport, and electrostatics as well as generalized scalar, vector and tensor transport equations. Additionally, Aria includes support for manufacturing process fows via the incompressible Navier-Stokes equations specialized to a low Reynolds number ( %3C 1 ) regime. Enhanced modeling support of manufacturing processing is made possible through use of eithermore » arbitrary Lagrangian- Eulerian (ALE) and level set based free and moving boundary tracking in conjunction with quasi-static nonlinear elastic solid mechanics for mesh control. Coupled physics problems are solved in several ways including fully-coupled Newton's method with analytic or numerical sensitivities, fully-coupled Newton- Krylov methods and a loosely-coupled nonlinear iteration about subsets of the system that are solved using combinations of the aforementioned methods. Error estimation, uniform and dynamic h -adaptivity and dynamic load balancing are some of Aria's more advanced capabilities. Aria is based upon the Sierra Framework.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sierra Thermal/Fluid Team
Aria is a Galerkin fnite element based program for solving coupled-physics problems described by systems of PDEs and is capable of solving nonlinear, implicit, transient and direct-to-steady state problems in two and three dimensions on parallel architectures. The suite of physics currently supported by Aria includes thermal energy transport, species transport, and electrostatics as well as generalized scalar, vector and tensor transport equations. Additionally, Aria includes support for manufacturing process fows via the incompressible Navier-Stokes equations specialized to a low Reynolds number ( %3C 1 ) regime. Enhanced modeling support of manufacturing processing is made possible through use of eithermore » arbitrary Lagrangian- Eulerian (ALE) and level set based free and moving boundary tracking in conjunction with quasi-static nonlinear elastic solid mechanics for mesh control. Coupled physics problems are solved in several ways including fully-coupled Newton's method with analytic or numerical sensitivities, fully-coupled Newton- Krylov methods and a loosely-coupled nonlinear iteration about subsets of the system that are solved using combinations of the aforementioned methods. Error estimation, uniform and dynamic h -adaptivity and dynamic load balancing are some of Aria's more advanced capabilities. Aria is based upon the Sierra Framework.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sierra Thermal /Fluid Team
Aria is a Galerkin finite element based program for solving coupled-physics problems described by systems of PDEs and is capable of solving nonlinear, implicit, transient and direct-to-steady state problems in two and three dimensions on parallel architectures. The suite of physics currently supported by Aria includes thermal energy transport, species transport, and electrostatics as well as generalized scalar, vector and tensor transport equations. Additionally, Aria includes support for manufacturing process flows via the incompressible Navier-Stokes equations specialized to a low Reynolds number (Re %3C 1) regime. Enhanced modeling support of manufacturing processing is made possible through use of either arbitrarymore » Lagrangian- Eulerian (ALE) and level set based free and moving boundary tracking in conjunction with quasi-static nonlinear elastic solid mechanics for mesh control. Coupled physics problems are solved in several ways including fully-coupled Newton's method with analytic or numerical sensitivities, fully-coupled Newton- Krylov methods and a loosely-coupled nonlinear iteration about subsets of the system that are solved using combinations of the aforementioned methods. Error estimation, uniform and dynamic h-adaptivity and dynamic load balancing are some of Aria's more advanced capabilities. Aria is based upon the Sierra Framework.« less
Sound transmission through a poroelastic layered panel
NASA Astrophysics Data System (ADS)
Nagler, Loris; Rong, Ping; Schanz, Martin; von Estorff, Otto
2014-04-01
Multi-layered panels are often used to improve the acoustics in cars, airplanes, rooms, etc. For such an application these panels include porous and/or fibrous layers. The proposed numerical method is an approach to simulate the acoustical behavior of such multi-layered panels. The model assumes plate-like structures and, hence, combines plate theories for the different layers. The poroelastic layer is modelled with a recently developed plate theory. This theory uses a series expansion in thickness direction with subsequent analytical integration in this direction to reduce the three dimensions to two. The same idea is used to model either air gaps or fibrous layers. The latter are modeled as equivalent fluid and can be handled like an air gap, i.e., a kind of `air plate' is used. The coupling of the layers is done by using the series expansion to express the continuity conditions on the surfaces of the plates. The final system is solved with finite elements, where domain decomposition techniques in combination with preconditioned iterative solvers are applied to solve the final system of equations. In a large frequency range, the comparison with measurements shows very good agreement. From the numerical solution process it can be concluded that different preconditioners for the different layers are necessary. A reuse of the Krylov subspace of the iterative solvers pays if several excitations have to be computed but not that much in the loop over the frequencies.
Fully Implicit, Nonlinear 3D Extended Magnetohydrodynamics
NASA Astrophysics Data System (ADS)
Chacon, Luis; Knoll, Dana
2003-10-01
Extended magnetohydrodynamics (XMHD) includes nonideal effects such as nonlinear, anisotropic transport and two-fluid (Hall) effects. XMHD supports multiple, separate time scales that make explicit time differencing approaches extremely inefficient. While a fully implicit implementation promises efficiency without sacrificing numerical accuracy,(D. A. Knoll et al., phJ. Comput. Phys.) 185 (2), 583-611 (2003) the nonlinear nature of the XMHD system and the numerical stiffness associated with the fast waves make this endeavor difficult. Newton-Krylov methods are, however, ideally suited for such a task. These synergistically combine Newton's method for nonlinear convergence, and Krylov techniques to solve the associated Jacobian (linear) systems. Krylov methods can be implemented Jacobian-free and can be preconditioned for efficiency. Successful preconditioning strategies have been developed for 2D incompressible resistive(L. Chacón et al., phJ. Comput. Phys). 178 (1), 15- 36 (2002) and Hall(L. Chacón and D. A. Knoll, phJ. Comput. Phys.), 188 (2), 573-592 (2003) MHD models. These are based on ``physics-based'' ideas, in which knowledge of the physics is exploited to derive well-conditioned (diagonally-dominant) approximations to the original system that are amenable to optimal solver technologies (multigrid). In this work, we will describe the status of the extension of the 2D preconditioning ideas for a 3D compressible, single-fluid XMHD model.
Application of high-order numerical schemes and Newton-Krylov method to two-phase drift-flux model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zou, Ling; Zhao, Haihua; Zhang, Hongbin
This study concerns the application and solver robustness of the Newton-Krylov method in solving two-phase flow drift-flux model problems using high-order numerical schemes. In our previous studies, the Newton-Krylov method has been proven as a promising solver for two-phase flow drift-flux model problems. However, these studies were limited to use first-order numerical schemes only. Moreover, the previous approach to treating the drift-flux closure correlations was later revealed to cause deteriorated solver convergence performance, when the mesh was highly refined, and also when higher-order numerical schemes were employed. In this study, a second-order spatial discretization scheme that has been tested withmore » two-fluid two-phase flow model was extended to solve drift-flux model problems. In order to improve solver robustness, and therefore efficiency, a new approach was proposed to treating the mean drift velocity of the gas phase as a primary nonlinear variable to the equation system. With this new approach, significant improvement in solver robustness was achieved. With highly refined mesh, the proposed treatment along with the Newton-Krylov solver were extensively tested with two-phase flow problems that cover a wide range of thermal-hydraulics conditions. Satisfactory convergence performances were observed for all test cases. Numerical verification was then performed in the form of mesh convergence studies, from which expected orders of accuracy were obtained for both the first-order and the second-order spatial discretization schemes. Finally, the drift-flux model, along with numerical methods presented, were validated with three sets of flow boiling experiments that cover different flow channel geometries (round tube, rectangular tube, and rod bundle), and a wide range of test conditions (pressure, mass flux, wall heat flux, inlet subcooling and outlet void fraction).« less
Application of high-order numerical schemes and Newton-Krylov method to two-phase drift-flux model
Zou, Ling; Zhao, Haihua; Zhang, Hongbin
2017-08-07
This study concerns the application and solver robustness of the Newton-Krylov method in solving two-phase flow drift-flux model problems using high-order numerical schemes. In our previous studies, the Newton-Krylov method has been proven as a promising solver for two-phase flow drift-flux model problems. However, these studies were limited to use first-order numerical schemes only. Moreover, the previous approach to treating the drift-flux closure correlations was later revealed to cause deteriorated solver convergence performance, when the mesh was highly refined, and also when higher-order numerical schemes were employed. In this study, a second-order spatial discretization scheme that has been tested withmore » two-fluid two-phase flow model was extended to solve drift-flux model problems. In order to improve solver robustness, and therefore efficiency, a new approach was proposed to treating the mean drift velocity of the gas phase as a primary nonlinear variable to the equation system. With this new approach, significant improvement in solver robustness was achieved. With highly refined mesh, the proposed treatment along with the Newton-Krylov solver were extensively tested with two-phase flow problems that cover a wide range of thermal-hydraulics conditions. Satisfactory convergence performances were observed for all test cases. Numerical verification was then performed in the form of mesh convergence studies, from which expected orders of accuracy were obtained for both the first-order and the second-order spatial discretization schemes. Finally, the drift-flux model, along with numerical methods presented, were validated with three sets of flow boiling experiments that cover different flow channel geometries (round tube, rectangular tube, and rod bundle), and a wide range of test conditions (pressure, mass flux, wall heat flux, inlet subcooling and outlet void fraction).« less
Recovery Discontinuous Galerkin Jacobian-Free Newton-Krylov Method for All-Speed Flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
HyeongKae Park; Robert Nourgaliev; Vincent Mousseau
2008-07-01
A novel numerical algorithm (rDG-JFNK) for all-speed fluid flows with heat conduction and viscosity is introduced. The rDG-JFNK combines the Discontinuous Galerkin spatial discretization with the implicit Runge-Kutta time integration under the Jacobian-free Newton-Krylov framework. We solve fully-compressible Navier-Stokes equations without operator-splitting of hyperbolic, diffusion and reaction terms, which enables fully-coupled high-order temporal discretization. The stability constraint is removed due to the L-stable Explicit, Singly Diagonal Implicit Runge-Kutta (ESDIRK) scheme. The governing equations are solved in the conservative form, which allows one to accurately compute shock dynamics, as well as low-speed flows. For spatial discretization, we develop a “recovery” familymore » of DG, exhibiting nearly-spectral accuracy. To precondition the Krylov-based linear solver (GMRES), we developed an “Operator-Split”-(OS) Physics Based Preconditioner (PBP), in which we transform/simplify the fully-coupled system to a sequence of segregated scalar problems, each can be solved efficiently with Multigrid method. Each scalar problem is designed to target/cluster eigenvalues of the Jacobian matrix associated with a specific physics.« less
Numerical Solution of the Gyrokinetic Poisson Equation in TEMPEST
NASA Astrophysics Data System (ADS)
Dorr, Milo; Cohen, Bruce; Cohen, Ronald; Dimits, Andris; Hittinger, Jeffrey; Kerbel, Gary; Nevins, William; Rognlien, Thomas; Umansky, Maxim; Xiong, Andrew; Xu, Xueqiao
2006-10-01
The gyrokinetic Poisson (GKP) model in the TEMPEST continuum gyrokinetic edge plasma code yields the electrostatic potential due to the charge density of electrons and an arbitrary number of ion species including the effects of gyroaveraging in the limit kρ1. The TEMPEST equations are integrated as a differential algebraic system involving a nonlinear system solve via Newton-Krylov iteration. The GKP preconditioner block is inverted using a multigrid preconditioned conjugate gradient (CG) algorithm. Electrons are treated as kinetic or adiabatic. The Boltzmann relation in the adiabatic option employs flux surface averaging to maintain neutrality within field lines and is solved self-consistently with the GKP equation. A decomposition procedure circumvents the near singularity of the GKP Jacobian block that otherwise degrades CG convergence.
NASA Astrophysics Data System (ADS)
Debreu, Laurent; Neveu, Emilie; Simon, Ehouarn; Le Dimet, Francois Xavier; Vidard, Arthur
2014-05-01
In order to lower the computational cost of the variational data assimilation process, we investigate the use of multigrid methods to solve the associated optimal control system. On a linear advection equation, we study the impact of the regularization term on the optimal control and the impact of discretization errors on the efficiency of the coarse grid correction step. We show that even if the optimal control problem leads to the solution of an elliptic system, numerical errors introduced by the discretization can alter the success of the multigrid methods. The view of the multigrid iteration as a preconditioner for a Krylov optimization method leads to a more robust algorithm. A scale dependent weighting of the multigrid preconditioner and the usual background error covariance matrix based preconditioner is proposed and brings significant improvements. [1] Laurent Debreu, Emilie Neveu, Ehouarn Simon, François-Xavier Le Dimet and Arthur Vidard, 2014: Multigrid solvers and multigrid preconditioners for the solution of variational data assimilation problems, submitted to QJRMS, http://hal.inria.fr/hal-00874643 [2] Emilie Neveu, Laurent Debreu and François-Xavier Le Dimet, 2011: Multigrid methods and data assimilation - Convergence study and first experiments on non-linear equations, ARIMA, 14, 63-80, http://intranet.inria.fr/international/arima/014/014005.html
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schunert, Sebastian; Wang, Congjian; Wang, Yaqi
Rattlesnake and MAMMOTH are the designated TREAT analysis tools currently being developed at the Idaho National Laboratory. Concurrent with development of the multi-physics, multi-scale capabilities, sensitivity analysis and uncertainty quantification (SA/UQ) capabilities are required for predicitive modeling of the TREAT reactor. For steady-state SA/UQ, that is essential for setting initial conditions for the transients, generalized perturbation theory (GPT) will be used. This work describes the implementation of a PETSc based solver for the generalized adjoint equations that constitute a inhomogeneous, rank deficient problem. The standard approach is to use an outer iteration strategy with repeated removal of the fundamental modemore » contamination. The described GPT algorithm directly solves the GPT equations without the need of an outer iteration procedure by using Krylov subspaces that are orthogonal to the operator’s nullspace. Three test problems are solved and provide sufficient verification for the Rattlesnake’s GPT capability. We conclude with a preliminary example evaluating the impact of the Boron distribution in the TREAT reactor using perturbation theory.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Devine, K.D.; Hennigan, G.L.; Hutchinson, S.A.
1999-01-01
The theoretical background for the finite element computer program, MPSalsa Version 1.5, is presented in detail. MPSalsa is designed to solve laminar or turbulent low Mach number, two- or three-dimensional incompressible and variable density reacting fluid flows on massively parallel computers, using a Petrov-Galerkin finite element formulation. The code has the capability to solve coupled fluid flow (with auxiliary turbulence equations), heat transport, multicomponent species transport, and finite-rate chemical reactions, and to solve coupled multiple Poisson or advection-diffusion-reaction equations. The program employs the CHEMKIN library to provide a rigorous treatment of multicomponent ideal gas kinetics and transport. Chemical reactions occurringmore » in the gas phase and on surfaces are treated by calls to CHEMKIN and SURFACE CHEMK3N, respectively. The code employs unstructured meshes, using the EXODUS II finite element database suite of programs for its input and output files. MPSalsa solves both transient and steady flows by using fully implicit time integration, an inexact Newton method and iterative solvers based on preconditioned Krylov methods as implemented in the Aztec. solver library.« less
A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garrett, C. Kristopher; Hauck, Cory D.
In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solvedmore » efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.« less
NASA Astrophysics Data System (ADS)
Kong, Fande; Cai, Xiao-Chuan
2017-07-01
Nonlinear fluid-structure interaction (FSI) problems on unstructured meshes in 3D appear in many applications in science and engineering, such as vibration analysis of aircrafts and patient-specific diagnosis of cardiovascular diseases. In this work, we develop a highly scalable, parallel algorithmic and software framework for FSI problems consisting of a nonlinear fluid system and a nonlinear solid system, that are coupled monolithically. The FSI system is discretized by a stabilized finite element method in space and a fully implicit backward difference scheme in time. To solve the large, sparse system of nonlinear algebraic equations at each time step, we propose an inexact Newton-Krylov method together with a multilevel, smoothed Schwarz preconditioner with isogeometric coarse meshes generated by a geometry preserving coarsening algorithm. Here "geometry" includes the boundary of the computational domain and the wet interface between the fluid and the solid. We show numerically that the proposed algorithm and implementation are highly scalable in terms of the number of linear and nonlinear iterations and the total compute time on a supercomputer with more than 10,000 processor cores for several problems with hundreds of millions of unknowns.
Performance of fully-coupled algebraic multigrid preconditioners for large-scale VMS resistive MHD
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, P. T.; Shadid, J. N.; Hu, J. J.
Here, we explore the current performance and scaling of a fully-implicit stabilized unstructured finite element (FE) variational multiscale (VMS) capability for large-scale simulations of 3D incompressible resistive magnetohydrodynamics (MHD). The large-scale linear systems that are generated by a Newton nonlinear solver approach are iteratively solved by preconditioned Krylov subspace methods. The efficiency of this approach is critically dependent on the scalability and performance of the algebraic multigrid preconditioner. Our study considers the performance of the numerical methods as recently implemented in the second-generation Trilinos implementation that is 64-bit compliant and is not limited by the 32-bit global identifiers of themore » original Epetra-based Trilinos. The study presents representative results for a Poisson problem on 1.6 million cores of an IBM Blue Gene/Q platform to demonstrate very large-scale parallel execution. Additionally, results for a more challenging steady-state MHD generator and a transient solution of a benchmark MHD turbulence calculation for the full resistive MHD system are also presented. These results are obtained on up to 131,000 cores of a Cray XC40 and one million cores of a BG/Q system.« less
Performance of fully-coupled algebraic multigrid preconditioners for large-scale VMS resistive MHD
Lin, P. T.; Shadid, J. N.; Hu, J. J.; ...
2017-11-06
Here, we explore the current performance and scaling of a fully-implicit stabilized unstructured finite element (FE) variational multiscale (VMS) capability for large-scale simulations of 3D incompressible resistive magnetohydrodynamics (MHD). The large-scale linear systems that are generated by a Newton nonlinear solver approach are iteratively solved by preconditioned Krylov subspace methods. The efficiency of this approach is critically dependent on the scalability and performance of the algebraic multigrid preconditioner. Our study considers the performance of the numerical methods as recently implemented in the second-generation Trilinos implementation that is 64-bit compliant and is not limited by the 32-bit global identifiers of themore » original Epetra-based Trilinos. The study presents representative results for a Poisson problem on 1.6 million cores of an IBM Blue Gene/Q platform to demonstrate very large-scale parallel execution. Additionally, results for a more challenging steady-state MHD generator and a transient solution of a benchmark MHD turbulence calculation for the full resistive MHD system are also presented. These results are obtained on up to 131,000 cores of a Cray XC40 and one million cores of a BG/Q system.« less
Kong, Fande; Cai, Xiao-Chuan
2017-03-24
Nonlinear fluid-structure interaction (FSI) problems on unstructured meshes in 3D appear many applications in science and engineering, such as vibration analysis of aircrafts and patient-specific diagnosis of cardiovascular diseases. In this work, we develop a highly scalable, parallel algorithmic and software framework for FSI problems consisting of a nonlinear fluid system and a nonlinear solid system, that are coupled monolithically. The FSI system is discretized by a stabilized finite element method in space and a fully implicit backward difference scheme in time. To solve the large, sparse system of nonlinear algebraic equations at each time step, we propose an inexactmore » Newton-Krylov method together with a multilevel, smoothed Schwarz preconditioner with isogeometric coarse meshes generated by a geometry preserving coarsening algorithm. Here ''geometry'' includes the boundary of the computational domain and the wet interface between the fluid and the solid. We show numerically that the proposed algorithm and implementation are highly scalable in terms of the number of linear and nonlinear iterations and the total compute time on a supercomputer with more than 10,000 processor cores for several problems with hundreds of millions of unknowns.« less
A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework
Garrett, C. Kristopher; Hauck, Cory D.
2018-04-05
In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solvedmore » efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.« less
Low order models for uncertainty quantification in acoustic propagation problems
NASA Astrophysics Data System (ADS)
Millet, Christophe
2016-11-01
Long-range sound propagation problems are characterized by both a large number of length scales and a large number of normal modes. In the atmosphere, these modes are confined within waveguides causing the sound to propagate through multiple paths to the receiver. For uncertain atmospheres, the modes are described as random variables. Concise mathematical models and analysis reveal fundamental limitations in classical projection techniques due to different manifestations of the fact that modes that carry small variance can have important effects on the large variance modes. In the present study, we propose a systematic strategy for obtaining statistically accurate low order models. The normal modes are sorted in decreasing Sobol indices using asymptotic expansions, and the relevant modes are extracted using a modified iterative Krylov-based method. The statistics of acoustic signals are computed by decomposing the original pulse into a truncated sum of modal pulses that can be described by a stationary phase method. As the low-order acoustic model preserves the overall structure of waveforms under perturbations of the atmosphere, it can be applied to uncertainty quantification. The result of this study is a new algorithm which applies on the entire phase space of acoustic fields.
2017-09-27
ARL-TR-8161•SEP 2017 US Army Research Laboratory Excluding Noise from Short Krylov Subspace Approximations to the Truncated Singular Value...originator. ARL-TR-8161•SEP 2017 US Army Research Laboratory Excluding Noise from Short Krylov Subspace Approximations to the Truncated Singular Value...unlimited. October 2015–January 2016 US Army Research Laboratory ATTN: RDRL-CIH-C Aberdeen Proving Ground, MD 21005-5066 primary author’s email
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chacon, Luis; Stanier, Adam John
Here, we demonstrate a scalable fully implicit algorithm for the two-field low-β extended MHD model. This reduced model describes plasma behavior in the presence of strong guide fields, and is of significant practical impact both in nature and in laboratory plasmas. The model displays strong hyperbolic behavior, as manifested by the presence of fast dispersive waves, which make a fully implicit treatment very challenging. In this study, we employ a Jacobian-free Newton–Krylov nonlinear solver, for which we propose a physics-based preconditioner that renders the linearized set of equations suitable for inversion with multigrid methods. As a result, the algorithm ismore » shown to scale both algorithmically (i.e., the iteration count is insensitive to grid refinement and timestep size) and in parallel in a weak-scaling sense, with the wall-clock time scaling weakly with the number of cores for up to 4096 cores. For a 4096 × 4096 mesh, we demonstrate a wall-clock-time speedup of ~6700 with respect to explicit algorithms. The model is validated linearly (against linear theory predictions) and nonlinearly (against fully kinetic simulations), demonstrating excellent agreement.« less
Quantum and electromagnetic propagation with the conjugate symmetric Lanczos method.
Acevedo, Ramiro; Lombardini, Richard; Turner, Matthew A; Kinsey, James L; Johnson, Bruce R
2008-02-14
The conjugate symmetric Lanczos (CSL) method is introduced for the solution of the time-dependent Schrodinger equation. This remarkably simple and efficient time-domain algorithm is a low-order polynomial expansion of the quantum propagator for time-independent Hamiltonians and derives from the time-reversal symmetry of the Schrodinger equation. The CSL algorithm gives forward solutions by simply complex conjugating backward polynomial expansion coefficients. Interestingly, the expansion coefficients are the same for each uniform time step, a fact that is only spoiled by basis incompleteness and finite precision. This is true for the Krylov basis and, with further investigation, is also found to be true for the Lanczos basis, important for efficient orthogonal projection-based algorithms. The CSL method errors roughly track those of the short iterative Lanczos method while requiring fewer matrix-vector products than the Chebyshev method. With the CSL method, only a few vectors need to be stored at a time, there is no need to estimate the Hamiltonian spectral range, and only matrix-vector and vector-vector products are required. Applications using localized wavelet bases are made to harmonic oscillator and anharmonic Morse oscillator systems as well as electrodynamic pulse propagation using the Hamiltonian form of Maxwell's equations. For gold with a Drude dielectric function, the latter is non-Hermitian, requiring consideration of corrections to the CSL algorithm.
Approximate tensor-product preconditioners for very high order discontinuous Galerkin methods
NASA Astrophysics Data System (ADS)
Pazner, Will; Persson, Per-Olof
2018-02-01
In this paper, we develop a new tensor-product based preconditioner for discontinuous Galerkin methods with polynomial degrees higher than those typically employed. This preconditioner uses an automatic, purely algebraic method to approximate the exact block Jacobi preconditioner by Kronecker products of several small, one-dimensional matrices. Traditional matrix-based preconditioners require O (p2d) storage and O (p3d) computational work, where p is the degree of basis polynomials used, and d is the spatial dimension. Our SVD-based tensor-product preconditioner requires O (p d + 1) storage, O (p d + 1) work in two spatial dimensions, and O (p d + 2) work in three spatial dimensions. Combined with a matrix-free Newton-Krylov solver, these preconditioners allow for the solution of DG systems in linear time in p per degree of freedom in 2D, and reduce the computational complexity from O (p9) to O (p5) in 3D. Numerical results are shown in 2D and 3D for the advection, Euler, and Navier-Stokes equations, using polynomials of degree up to p = 30. For many test cases, the preconditioner results in similar iteration counts when compared with the exact block Jacobi preconditioner, and performance is significantly improved for high polynomial degrees p.
A tightly-coupled domain-decomposition approach for highly nonlinear stochastic multiphysics systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taverniers, Søren; Tartakovsky, Daniel M., E-mail: dmt@ucsd.edu
2017-02-01
Multiphysics simulations often involve nonlinear components that are driven by internally generated or externally imposed random fluctuations. When used with a domain-decomposition (DD) algorithm, such components have to be coupled in a way that both accurately propagates the noise between the subdomains and lends itself to a stable and cost-effective temporal integration. We develop a conservative DD approach in which tight coupling is obtained by using a Jacobian-free Newton–Krylov (JfNK) method with a generalized minimum residual iterative linear solver. This strategy is tested on a coupled nonlinear diffusion system forced by a truncated Gaussian noise at the boundary. Enforcement ofmore » path-wise continuity of the state variable and its flux, as opposed to continuity in the mean, at interfaces between subdomains enables the DD algorithm to correctly propagate boundary fluctuations throughout the computational domain. Reliance on a single Newton iteration (explicit coupling), rather than on the fully converged JfNK (implicit) coupling, may increase the solution error by an order of magnitude. Increase in communication frequency between the DD components reduces the explicit coupling's error, but makes it less efficient than the implicit coupling at comparable error levels for all noise strengths considered. Finally, the DD algorithm with the implicit JfNK coupling resolves temporally-correlated fluctuations of the boundary noise when the correlation time of the latter exceeds some multiple of an appropriately defined characteristic diffusion time.« less
NASA Astrophysics Data System (ADS)
Moghaderi, Hamid; Dehghan, Mehdi; Donatelli, Marco; Mazza, Mariarosa
2017-12-01
Fractional diffusion equations (FDEs) are a mathematical tool used for describing some special diffusion phenomena arising in many different applications like porous media and computational finance. In this paper, we focus on a two-dimensional space-FDE problem discretized by means of a second order finite difference scheme obtained as combination of the Crank-Nicolson scheme and the so-called weighted and shifted Grünwald formula. By fully exploiting the Toeplitz-like structure of the resulting linear system, we provide a detailed spectral analysis of the coefficient matrix at each time step, both in the case of constant and variable diffusion coefficients. Such a spectral analysis has a very crucial role, since it can be used for designing fast and robust iterative solvers. In particular, we employ the obtained spectral information to define a Galerkin multigrid method based on the classical linear interpolation as grid transfer operator and damped-Jacobi as smoother, and to prove the linear convergence rate of the corresponding two-grid method. The theoretical analysis suggests that the proposed grid transfer operator is strong enough for working also with the V-cycle method and the geometric multigrid. On this basis, we introduce two computationally favourable variants of the proposed multigrid method and we use them as preconditioners for Krylov methods. Several numerical results confirm that the resulting preconditioning strategies still keep a linear convergence rate.
Efficient solution of parabolic equations by Krylov approximation methods
NASA Technical Reports Server (NTRS)
Gallopoulos, E.; Saad, Y.
1990-01-01
Numerical techniques for solving parabolic equations by the method of lines is addressed. The main motivation for the proposed approach is the possibility of exploiting a high degree of parallelism in a simple manner. The basic idea of the method is to approximate the action of the evolution operator on a given state vector by means of a projection process onto a Krylov subspace. Thus, the resulting approximation consists of applying an evolution operator of a very small dimension to a known vector which is, in turn, computed accurately by exploiting well-known rational approximations to the exponential. Because the rational approximation is only applied to a small matrix, the only operations required with the original large matrix are matrix-by-vector multiplications, and as a result the algorithm can easily be parallelized and vectorized. Some relevant approximation and stability issues are discussed. We present some numerical experiments with the method and compare its performance with a few explicit and implicit algorithms.
Reverse time migration by Krylov subspace reduced order modeling
NASA Astrophysics Data System (ADS)
Basir, Hadi Mahdavi; Javaherian, Abdolrahim; Shomali, Zaher Hossein; Firouz-Abadi, Roohollah Dehghani; Gholamy, Shaban Ali
2018-04-01
Imaging is a key step in seismic data processing. To date, a myriad of advanced pre-stack depth migration approaches have been developed; however, reverse time migration (RTM) is still considered as the high-end imaging algorithm. The main limitations associated with the performance cost of reverse time migration are the intensive computation of the forward and backward simulations, time consumption, and memory allocation related to imaging condition. Based on the reduced order modeling, we proposed an algorithm, which can be adapted to all the aforementioned factors. Our proposed method benefit from Krylov subspaces method to compute certain mode shapes of the velocity model computed by as an orthogonal base of reduced order modeling. Reverse time migration by reduced order modeling is helpful concerning the highly parallel computation and strongly reduces the memory requirement of reverse time migration. The synthetic model results showed that suggested method can decrease the computational costs of reverse time migration by several orders of magnitudes, compared with reverse time migration by finite element method.
NASA Astrophysics Data System (ADS)
Salary, Mohammad Mahdi; Mosallaei, Hossein
2015-06-01
Interactions between the plasmons of noble metal nanoparticles and non-absorbing biomolecules forms the basis of the plasmonic sensors, which have received much attention. Studying these interactions can help to exploit the full potentials of plasmonic sensors in quantification and analysis of biomolecules. Here, a quasi-static continuum model is adopted for this purpose. We present a boundary-element method for computing the optical response of plasmonic particles to the molecular binding events by solving the Poisson equation. The model represents biomolecules with their molecular surfaces, thus accurately accounting for the influence of exact binding conformations as well as structural differences between different proteins on the response of plasmonic nanoparticles. The linear systems arising in the method are solved iteratively with Krylov generalized minimum residual algorithm, and the acceleration is achieved by applying precorrected-Fast Fourier Transformation technique. We apply the developed method to investigate interactions of biotinylated gold nanoparticles (nanosphere and nanorod) with four different types of biotin-binding proteins. The interactions are studied at both ensemble and single-molecule level. Computational results demonstrate the ability of presented model for analyzing realistic nanoparticle-biomolecule configurations. The method can provide comprehensive study for wide variety of applications, including protein structures, monitoring structural and conformational transitions, and quantification of protein concentrations. In addition, it is suitable for design and optimization of the nano-plasmonic sensors.
NASA Astrophysics Data System (ADS)
Koldan, Jelena; Puzyrev, Vladimir; de la Puente, Josep; Houzeaux, Guillaume; Cela, José María
2014-06-01
We present an elaborate preconditioning scheme for Krylov subspace methods which has been developed to improve the performance and reduce the execution time of parallel node-based finite-element (FE) solvers for 3-D electromagnetic (EM) numerical modelling in exploration geophysics. This new preconditioner is based on algebraic multigrid (AMG) that uses different basic relaxation methods, such as Jacobi, symmetric successive over-relaxation (SSOR) and Gauss-Seidel, as smoothers and the wave front algorithm to create groups, which are used for a coarse-level generation. We have implemented and tested this new preconditioner within our parallel nodal FE solver for 3-D forward problems in EM induction geophysics. We have performed series of experiments for several models with different conductivity structures and characteristics to test the performance of our AMG preconditioning technique when combined with biconjugate gradient stabilized method. The results have shown that, the more challenging the problem is in terms of conductivity contrasts, ratio between the sizes of grid elements and/or frequency, the more benefit is obtained by using this preconditioner. Compared to other preconditioning schemes, such as diagonal, SSOR and truncated approximate inverse, the AMG preconditioner greatly improves the convergence of the iterative solver for all tested models. Also, when it comes to cases in which other preconditioners succeed to converge to a desired precision, AMG is able to considerably reduce the total execution time of the forward-problem code-up to an order of magnitude. Furthermore, the tests have confirmed that our AMG scheme ensures grid-independent rate of convergence, as well as improvement in convergence regardless of how big local mesh refinements are. In addition, AMG is designed to be a black-box preconditioner, which makes it easy to use and combine with different iterative methods. Finally, it has proved to be very practical and efficient in the parallel context.
Application of Krylov exponential propagation to fluid dynamics equations
NASA Technical Reports Server (NTRS)
Saad, Youcef; Semeraro, David
1991-01-01
An application of matrix exponentiation via Krylov subspace projection to the solution of fluid dynamics problems is presented. The main idea is to approximate the operation exp(A)v by means of a projection-like process onto a krylov subspace. This results in a computation of an exponential matrix vector product similar to the one above but of a much smaller size. Time integration schemes can then be devised to exploit this basic computational kernel. The motivation of this approach is to provide time-integration schemes that are essentially of an explicit nature but which have good stability properties.
Rapid sampling of stochastic displacements in Brownian dynamics simulations
NASA Astrophysics Data System (ADS)
Fiore, Andrew M.; Balboa Usabiaga, Florencio; Donev, Aleksandar; Swan, James W.
2017-03-01
We present a new method for sampling stochastic displacements in Brownian Dynamics (BD) simulations of colloidal scale particles. The method relies on a new formulation for Ewald summation of the Rotne-Prager-Yamakawa (RPY) tensor, which guarantees that the real-space and wave-space contributions to the tensor are independently symmetric and positive-definite for all possible particle configurations. Brownian displacements are drawn from a superposition of two independent samples: a wave-space (far-field or long-ranged) contribution, computed using techniques from fluctuating hydrodynamics and non-uniform fast Fourier transforms; and a real-space (near-field or short-ranged) correction, computed using a Krylov subspace method. The combined computational complexity of drawing these two independent samples scales linearly with the number of particles. The proposed method circumvents the super-linear scaling exhibited by all known iterative sampling methods applied directly to the RPY tensor that results from the power law growth of the condition number of tensor with the number of particles. For geometrically dense microstructures (fractal dimension equal three), the performance is independent of volume fraction, while for tenuous microstructures (fractal dimension less than three), such as gels and polymer solutions, the performance improves with decreasing volume fraction. This is in stark contrast with other related linear-scaling methods such as the force coupling method and the fluctuating immersed boundary method, for which performance degrades with decreasing volume fraction. Calculations for hard sphere dispersions and colloidal gels are illustrated and used to explore the role of microstructure on performance of the algorithm. In practice, the logarithmic part of the predicted scaling is not observed and the algorithm scales linearly for up to 4 ×106 particles, obtaining speed ups of over an order of magnitude over existing iterative methods, and making the cost of computing Brownian displacements comparable to the cost of computing deterministic displacements in BD simulations. A high-performance implementation employing non-uniform fast Fourier transforms implemented on graphics processing units and integrated with the software package HOOMD-blue is used for benchmarking.
Large scale Brownian dynamics of confined suspensions of rigid particles
NASA Astrophysics Data System (ADS)
Sprinkle, Brennan; Balboa Usabiaga, Florencio; Patankar, Neelesh A.; Donev, Aleksandar
2017-12-01
We introduce methods for large-scale Brownian Dynamics (BD) simulation of many rigid particles of arbitrary shape suspended in a fluctuating fluid. Our method adds Brownian motion to the rigid multiblob method [F. Balboa Usabiaga et al., Commun. Appl. Math. Comput. Sci. 11(2), 217-296 (2016)] at a cost comparable to the cost of deterministic simulations. We demonstrate that we can efficiently generate deterministic and random displacements for many particles using preconditioned Krylov iterative methods, if kernel methods to efficiently compute the action of the Rotne-Prager-Yamakawa (RPY) mobility matrix and its "square" root are available for the given boundary conditions. These kernel operations can be computed with near linear scaling for periodic domains using the positively split Ewald method. Here we study particles partially confined by gravity above a no-slip bottom wall using a graphical processing unit implementation of the mobility matrix-vector product, combined with a preconditioned Lanczos iteration for generating Brownian displacements. We address a major challenge in large-scale BD simulations, capturing the stochastic drift term that arises because of the configuration-dependent mobility. Unlike the widely used Fixman midpoint scheme, our methods utilize random finite differences and do not require the solution of resistance problems or the computation of the action of the inverse square root of the RPY mobility matrix. We construct two temporal schemes which are viable for large-scale simulations, an Euler-Maruyama traction scheme and a trapezoidal slip scheme, which minimize the number of mobility problems to be solved per time step while capturing the required stochastic drift terms. We validate and compare these schemes numerically by modeling suspensions of boomerang-shaped particles sedimented near a bottom wall. Using the trapezoidal scheme, we investigate the steady-state active motion in dense suspensions of confined microrollers, whose height above the wall is set by a combination of thermal noise and active flows. We find the existence of two populations of active particles, slower ones closer to the bottom and faster ones above them, and demonstrate that our method provides quantitative accuracy even with relatively coarse resolutions of the particle geometry.
NASA Astrophysics Data System (ADS)
Godoy, William F.; DesJardin, Paul E.
2010-05-01
The application of flux limiters to the discrete ordinates method (DOM), SN, for radiative transfer calculations is discussed and analyzed for 3D enclosures for cases in which the intensities are strongly coupled to each other such as: radiative equilibrium and scattering media. A Newton-Krylov iterative method (GMRES) solves the final systems of linear equations along with a domain decomposition strategy for parallel computation using message passing libraries in a distributed memory system. Ray effects due to angular discretization and errors due to domain decomposition are minimized until small variations are introduced by these effects in order to focus on the influence of flux limiters on errors due to spatial discretization, known as numerical diffusion, smearing or false scattering. Results are presented for the DOM-integrated quantities such as heat flux, irradiation and emission. A variety of flux limiters are compared to "exact" solutions available in the literature, such as the integral solution of the RTE for pure absorbing-emitting media and isotropic scattering cases and a Monte Carlo solution for a forward scattering case. Additionally, a non-homogeneous 3D enclosure is included to extend the use of flux limiters to more practical cases. The overall balance of convergence, accuracy, speed and stability using flux limiters is shown to be superior compared to step schemes for any test case.
A scalable, fully implicit algorithm for the reduced two-field low-β extended MHD model
Chacon, Luis; Stanier, Adam John
2016-12-01
Here, we demonstrate a scalable fully implicit algorithm for the two-field low-β extended MHD model. This reduced model describes plasma behavior in the presence of strong guide fields, and is of significant practical impact both in nature and in laboratory plasmas. The model displays strong hyperbolic behavior, as manifested by the presence of fast dispersive waves, which make a fully implicit treatment very challenging. In this study, we employ a Jacobian-free Newton–Krylov nonlinear solver, for which we propose a physics-based preconditioner that renders the linearized set of equations suitable for inversion with multigrid methods. As a result, the algorithm ismore » shown to scale both algorithmically (i.e., the iteration count is insensitive to grid refinement and timestep size) and in parallel in a weak-scaling sense, with the wall-clock time scaling weakly with the number of cores for up to 4096 cores. For a 4096 × 4096 mesh, we demonstrate a wall-clock-time speedup of ~6700 with respect to explicit algorithms. The model is validated linearly (against linear theory predictions) and nonlinearly (against fully kinetic simulations), demonstrating excellent agreement.« less
NASA Astrophysics Data System (ADS)
Chen, G.; Chacón, L.; Barnes, D. C.
2012-03-01
A recent proof-of-principle study proposes an energy- and charge-conserving, fully implicit particle-in-cell algorithm in one dimension [1], which is able to use timesteps comparable to the dynamical timescale of interest. Here, we generalize the method to employ non-uniform meshes via a curvilinear map. The key enabling technology is a hybrid particle pusher [2], with particle positions updated in logical space and particle velocities updated in physical space. The self-adaptive, charge-conserving particle mover of Ref. [1] is extended to the non-uniform mesh case. The fully implicit implementation, using a Jacobian-free Newton-Krylov iterative solver, remains exactly charge- and energy-conserving. The extension of the formulation to multiple dimensions will be discussed. We present numerical experiments of 1D electrostatic, long-timescale ion-acoustic wave and ion-acoustic shock wave simulations, demonstrating that charge and energy are conserved to round-off for arbitrary mesh non-uniformity, and that the total momentum remains well conserved.[4pt] [1] Chen, Chac'on, Barnes, J. Comput. Phys. 230 (2011). [0pt] [2] Camporeale and Delzanno, Bull. Am. Phys. Soc. 56(6) (2011); Wang, et al., J. Plasma Physics, 61 (1999).
Fully-Implicit Orthogonal Reconstructed Discontinuous Galerkin for Fluid Dynamics with Phase Change
Nourgaliev, R.; Luo, H.; Weston, B.; ...
2015-11-11
A new reconstructed Discontinuous Galerkin (rDG) method, based on orthogonal basis/test functions, is developed for fluid flows on unstructured meshes. Orthogonality of basis functions is essential for enabling robust and efficient fully-implicit Newton-Krylov based time integration. The method is designed for generic partial differential equations, including transient, hyperbolic, parabolic or elliptic operators, which are attributed to many multiphysics problems. We demonstrate the method’s capabilities for solving compressible fluid-solid systems (in the low Mach number limit), with phase change (melting/solidification), as motivated by applications in Additive Manufacturing (AM). We focus on the method’s accuracy (in both space and time), as wellmore » as robustness and solvability of the system of linear equations involved in the linearization steps of Newton-based methods. The performance of the developed method is investigated for highly-stiff problems with melting/solidification, emphasizing the advantages from tight coupling of mass, momentum and energy conservation equations, as well as orthogonality of basis functions, which leads to better conditioning of the underlying (approximate) Jacobian matrices, and rapid convergence of the Krylov-based linear solver.« less
Implicit Plasma Kinetic Simulation Using The Jacobian-Free Newton-Krylov Method
NASA Astrophysics Data System (ADS)
Taitano, William; Knoll, Dana; Chacon, Luis
2009-11-01
The use of fully implicit time integration methods in kinetic simulation is still area of algorithmic research. A brute-force approach to simultaneously including the field equations and the particle distribution function would result in an intractable linear algebra problem. A number of algorithms have been put forward which rely on an extrapolation in time. They can be thought of as linearly implicit methods or one-step Newton methods. However, issues related to time accuracy of these methods still remain. We are pursuing a route to implicit plasma kinetic simulation which eliminates extrapolation, eliminates phase-space from the linear algebra problem, and converges the entire nonlinear system within a time step. We accomplish all this using the Jacobian-Free Newton-Krylov algorithm. The original research along these lines considered particle methods to advance the distribution function [1]. In the current research we are advancing the Vlasov equations on a grid. Results will be presented which highlight algorithmic details for single species electrostatic problems and coupled ion-electron electrostatic problems. [4pt] [1] H. J. Kim, L. Chac'on, G. Lapenta, ``Fully implicit particle in cell algorithm,'' 47th Annual Meeting of the Division of Plasma Physics, Oct. 24-28, 2005, Denver, CO
Implementation of the SPH Procedure Within the MOOSE Finite Element Framework
NASA Astrophysics Data System (ADS)
Laurier, Alexandre
The goal of this thesis was to implement the SPH homogenization procedure within the MOOSE finite element framework at INL. Before this project, INL relied on DRAGON to do their SPH homogenization which was not flexible enough for their needs. As such, the SPH procedure was implemented for the neutron diffusion equation with the traditional, Selengut and true Selengut normalizations. Another aspect of this research was to derive the SPH corrected neutron transport equations and implement them in the same framework. Following in the footsteps of other articles, this feature was implemented and tested successfully with both the PN and S N transport calculation schemes. Although the results obtained for the power distribution in PWR assemblies show no advantages over the use of the SPH diffusion equation, we believe the inclusion of this transport correction will allow for better results in cases where either P N or SN are required. An additional aspect of this research was the implementation of a novel way of solving the non-linear SPH problem. Traditionally, this was done through a Picard, fixed-point iterative process whereas the new implementation relies on MOOSE's Preconditioned Jacobian-Free Newton Krylov (PJFNK) method to allow for a direct solution to the non-linear problem. This novel implementation showed a decrease in calculation time by a factor reaching 50 and generated SPH factors that correspond to those obtained through a fixed-point iterative process with a very tight convergence criteria: epsilon < 10-8. The use of the PJFNK SPH procedure also allows to reach convergence in problems containing important reflector regions and void boundary conditions, something that the traditional SPH method has never been able to achieve. At times when the PJFNK method cannot reach convergence to the SPH problem, a hybrid method is used where by the traditional SPH iteration forces the initial condition to be within the radius of convergence of the Newton method. This new method was tested on a simplified model of INL's TREAT reactor, a problem that includes very important graphite reflector regions as well as vacuum boundary conditions with great success. To demonstrate the power of PJFNK SPH on a more common case, the correction was applied to a simplified PWR reactor core from the BEAVRS benchmark that included 15 assemblies and the water reflector to obtain very good results. This opens up the possibility to apply the SPH correction to full reactor cores in order to reduce homogenization errors for use in transient or multi-physics calculations.
A comparison of acceleration methods for solving the neutron transport k-eigenvalue problem
NASA Astrophysics Data System (ADS)
Willert, Jeffrey; Park, H.; Knoll, D. A.
2014-10-01
Over the past several years a number of papers have been written describing modern techniques for numerically computing the dominant eigenvalue of the neutron transport criticality problem. These methods fall into two distinct categories. The first category of methods rewrite the multi-group k-eigenvalue problem as a nonlinear system of equations and solve the resulting system using either a Jacobian-Free Newton-Krylov (JFNK) method or Nonlinear Krylov Acceleration (NKA), a variant of Anderson Acceleration. These methods are generally successful in significantly reducing the number of transport sweeps required to compute the dominant eigenvalue. The second category of methods utilize Moment-Based Acceleration (or High-Order/Low-Order (HOLO) Acceleration). These methods solve a sequence of modified diffusion eigenvalue problems whose solutions converge to the solution of the original transport eigenvalue problem. This second class of methods is, in our experience, always superior to the first, as most of the computational work is eliminated by the acceleration from the LO diffusion system. In this paper, we review each of these methods. Our computational results support our claim that the choice of which nonlinear solver to use, JFNK or NKA, should be secondary. The primary computational savings result from the implementation of a HOLO algorithm. We display computational results for a series of challenging multi-dimensional test problems.
MGMRES: A generalization of GMRES for solving large sparse nonsymmetric linear systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, D.M.; Chen, J.Y.
1994-12-31
The authors are concerned with the solution of the linear system (1): Au = b, where A is a real square nonsingular matrix which is large, sparse and non-symmetric. They consider the use of Krylov subspace methods. They first choose an initial approximation u{sup (0)} to the solution {bar u} = A{sup {minus}1}B of (1). They also choose an auxiliary matrix Z which is nonsingular. For n = 1,2,{hor_ellipsis} they determine u{sup (n)} such that u{sup (n)} {minus} u{sup (0)}{epsilon}K{sub n}(r{sup (0)},A) where K{sub n}(r{sup (0)},A) is the (Krylov) subspace spanned by the Krylov vectors r{sup (0)}, Ar{sup (0)}, {hor_ellipsis},more » A{sup n{minus}1}r{sup 0} and where r{sup (0)} = b{minus}Au{sup (0)}. If ZA is SPD they also require that (u{sup (n)}{minus}{bar u}, ZA(u{sup (n)}{minus}{bar u})) be minimized. If, on the other hand, ZA is not SPD, then they require that the Galerkin condition, (Zr{sup n}, v) = 0, be satisfied for all v{epsilon}K{sub n}(r{sup (0)}, A) where r{sup n} = b{minus}Au{sup (n)}. In this paper the authors consider a generalization of GMRES. This generalized method, which they refer to as `MGMRES`, is very similar to GMRES except that they let Z = A{sup T}Y where Y is a nonsingular matrix which is symmetric by not necessarily SPD.« less
NASA Astrophysics Data System (ADS)
Kaus, B.; Popov, A.
2015-12-01
The analytical expression for the Jacobian is a key component to achieve fast and robust convergence of the nonlinear Newton-Raphson iterative solver. Accomplishing this task in practice often requires a significant algebraic effort. Therefore it is quite common to use a cheap alternative instead, for example by approximating the Jacobian with a finite difference estimation. Despite its simplicity it is a relatively fragile and unreliable technique that is sensitive to the scaling of the residual and unknowns, as well as to the perturbation parameter selection. Unfortunately no universal rule can be applied to provide both a robust scaling and a perturbation. The approach we use here is to derive the analytical Jacobian for the coupled set of momentum, mass, and energy conservation equations together with the elasto-visco-plastic rheology and a marker in cell/staggered finite difference method. The software project LaMEM (Lithosphere and Mantle Evolution Model) is primarily developed for the thermo-mechanically coupled modeling of the 3D lithospheric deformation. The code is based on a staggered grid finite difference discretization in space, and uses customized scalable solvers form PETSc library to efficiently run on the massively parallel machines (such as IBM Blue Gene/Q). Currently LaMEM relies on the Jacobian-Free Newton-Krylov (JFNK) nonlinear solver, which approximates the Jacobian-vector product using a simple finite difference formula. This approach never requires an assembled Jacobian matrix and uses only the residual computation routine. We use an approximate Jacobian (Picard) matrix to precondition the Krylov solver with the Galerkin geometric multigrid. Because of the inherent problems of the finite difference Jacobian estimation, this approach doesn't always result in stable convergence. In this work we present and discuss a matrix-free technique in which the Jacobian-vector product is replaced by analytically-derived expressions and compare results with those obtained with a finite difference approximation of the Jacobian. This project is funded by ERC Starting Grant 258830 and computer facilities were provided by Jülich supercomputer center (Germany).
Numerical methods in Markov chain modeling
NASA Technical Reports Server (NTRS)
Philippe, Bernard; Saad, Youcef; Stewart, William J.
1989-01-01
Several methods for computing stationary probability distributions of Markov chains are described and compared. The main linear algebra problem consists of computing an eigenvector of a sparse, usually nonsymmetric, matrix associated with a known eigenvalue. It can also be cast as a problem of solving a homogeneous singular linear system. Several methods based on combinations of Krylov subspace techniques are presented. The performance of these methods on some realistic problems are compared.
NASA Astrophysics Data System (ADS)
Taitano, W. T.; Chacón, L.; Simakov, A. N.; Molvig, K.
2015-09-01
In this study, we demonstrate a fully implicit algorithm for the multi-species, multidimensional Rosenbluth-Fokker-Planck equation which is exactly mass-, momentum-, and energy-conserving, and which preserves positivity. Unlike most earlier studies, we base our development on the Rosenbluth (rather than Landau) form of the Fokker-Planck collision operator, which reduces complexity while allowing for an optimal fully implicit treatment. Our discrete conservation strategy employs nonlinear constraints that force the continuum symmetries of the collision operator to be satisfied upon discretization. We converge the resulting nonlinear system iteratively using Jacobian-free Newton-Krylov methods, effectively preconditioned with multigrid methods for efficiency. Single- and multi-species numerical examples demonstrate the advertised accuracy properties of the scheme, and the superior algorithmic performance of our approach. In particular, the discretization approach is numerically shown to be second-order accurate in time and velocity space and to exhibit manifestly positive entropy production. That is, H-theorem behavior is indicated for all the examples we have tested. The solution approach is demonstrated to scale optimally with respect to grid refinement (with CPU time growing linearly with the number of mesh points), and timestep (showing very weak dependence of CPU time with time-step size). As a result, the proposed algorithm delivers several orders-of-magnitude speedup vs. explicit algorithms.
Fully-Implicit Reconstructed Discontinuous Galerkin Method for Stiff Multiphysics Problems
NASA Astrophysics Data System (ADS)
Nourgaliev, Robert
2015-11-01
A new reconstructed Discontinuous Galerkin (rDG) method, based on orthogonal basis/test functions, is developed for fluid flows on unstructured meshes. Orthogonality of basis functions is essential for enabling robust and efficient fully-implicit Newton-Krylov based time integration. The method is designed for generic partial differential equations, including transient, hyperbolic, parabolic or elliptic operators, which are attributed to many multiphysics problems. We demonstrate the method's capabilities for solving compressible fluid-solid systems (in the low Mach number limit), with phase change (melting/solidification), as motivated by applications in Additive Manufacturing. We focus on the method's accuracy (in both space and time), as well as robustness and solvability of the system of linear equations involved in the linearization steps of Newton-based methods. The performance of the developed method is investigated for highly-stiff problems with melting/solidification, emphasizing the advantages from tight coupling of mass, momentum and energy conservation equations, as well as orthogonality of basis functions, which leads to better conditioning of the underlying (approximate) Jacobian matrices, and rapid convergence of the Krylov-based linear solver. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, and funded by the LDRD at LLNL under project tracking code 13-SI-002.
NASA Astrophysics Data System (ADS)
Caplan, R. M.; Mikić, Z.; Linker, J. A.; Lionello, R.
2017-05-01
We explore the performance and advantages/disadvantages of using unconditionally stable explicit super time-stepping (STS) algorithms versus implicit schemes with Krylov solvers for integrating parabolic operators in thermodynamic MHD models of the solar corona. Specifically, we compare the second-order Runge-Kutta Legendre (RKL2) STS method with the implicit backward Euler scheme computed using the preconditioned conjugate gradient (PCG) solver with both a point-Jacobi and a non-overlapping domain decomposition ILU0 preconditioner. The algorithms are used to integrate anisotropic Spitzer thermal conduction and artificial kinematic viscosity at time-steps much larger than classic explicit stability criteria allow. A key component of the comparison is the use of an established MHD model (MAS) to compute a real-world simulation on a large HPC cluster. Special attention is placed on the parallel scaling of the algorithms. It is shown that, for a specific problem and model, the RKL2 method is comparable or surpasses the implicit method with PCG solvers in performance and scaling, but suffers from some accuracy limitations. These limitations, and the applicability of RKL methods are briefly discussed.
NASA Astrophysics Data System (ADS)
Sharkov, N. A.; Sharkova, O. A.
2018-05-01
The paper identifies the importance of the Leonhard Euler's discoveries in the field of shipbuilding for the scientific evolution of academician A. N. Krylov and for the modern knowledge in survivability and safety of ships. The works by Leonard Euler "Marine Science" and "The Moon Motion New Theory" are discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, G., E-mail: gchen@lanl.gov; Chacón, L.; Leibs, C.A.
2014-02-01
A recent proof-of-principle study proposes an energy- and charge-conserving, nonlinearly implicit electrostatic particle-in-cell (PIC) algorithm in one dimension [9]. The algorithm in the reference employs an unpreconditioned Jacobian-free Newton–Krylov method, which ensures nonlinear convergence at every timestep (resolving the dynamical timescale of interest). Kinetic enslavement, which is one key component of the algorithm, not only enables fully implicit PIC as a practical approach, but also allows preconditioning the kinetic solver with a fluid approximation. This study proposes such a preconditioner, in which the linearized moment equations are closed with moments computed from particles. Effective acceleration of the linear GMRES solvemore » is demonstrated, on both uniform and non-uniform meshes. The algorithm performance is largely insensitive to the electron–ion mass ratio. Numerical experiments are performed on a 1D multi-scale ion acoustic wave test problem.« less
Parallel Cartesian grid refinement for 3D complex flow simulations
NASA Astrophysics Data System (ADS)
Angelidis, Dionysios; Sotiropoulos, Fotis
2013-11-01
A second order accurate method for discretizing the Navier-Stokes equations on 3D unstructured Cartesian grids is presented. Although the grid generator is based on the oct-tree hierarchical method, fully unstructured data-structure is adopted enabling robust calculations for incompressible flows, avoiding both the need of synchronization of the solution between different levels of refinement and usage of prolongation/restriction operators. The current solver implements a hybrid staggered/non-staggered grid layout, employing the implicit fractional step method to satisfy the continuity equation. The pressure-Poisson equation is discretized by using a novel second order fully implicit scheme for unstructured Cartesian grids and solved using an efficient Krylov subspace solver. The momentum equation is also discretized with second order accuracy and the high performance Newton-Krylov method is used for integrating them in time. Neumann and Dirichlet conditions are used to validate the Poisson solver against analytical functions and grid refinement results to a significant reduction of the solution error. The effectiveness of the fractional step method results in the stability of the overall algorithm and enables the performance of accurate multi-resolution real life simulations. This material is based upon work supported by the Department of Energy under Award Number DE-EE0005482.
Zou, Ling; Zhao, Haihua; Zhang, Hongbin
2016-03-09
This work represents a first-of-its-kind successful application to employ advanced numerical methods in solving realistic two-phase flow problems with two-fluid six-equation two-phase flow model. These advanced numerical methods include high-resolution spatial discretization scheme with staggered grids (high-order) fully implicit time integration schemes, and Jacobian-free Newton–Krylov (JFNK) method as the nonlinear solver. The computer code developed in this work has been extensively validated with existing experimental flow boiling data in vertical pipes and rod bundles, which cover wide ranges of experimental conditions, such as pressure, inlet mass flux, wall heat flux and exit void fraction. Additional code-to-code benchmark with the RELAP5-3Dmore » code further verifies the correct code implementation. The combined methods employed in this work exhibit strong robustness in solving two-phase flow problems even when phase appearance (boiling) and realistic discrete flow regimes are considered. Transitional flow regimes used in existing system analysis codes, normally introduced to overcome numerical difficulty, were completely removed in this work. As a result, this in turn provides the possibility to utilize more sophisticated flow regime maps in the future to further improve simulation accuracy.« less
Mang, Andreas; Ruthotto, Lars
2017-01-01
We present an efficient solver for diffeomorphic image registration problems in the framework of Large Deformations Diffeomorphic Metric Mappings (LDDMM). We use an optimal control formulation, in which the velocity field of a hyperbolic PDE needs to be found such that the distance between the final state of the system (the transformed/transported template image) and the observation (the reference image) is minimized. Our solver supports both stationary and non-stationary (i.e., transient or time-dependent) velocity fields. As transformation models, we consider both the transport equation (assuming intensities are preserved during the deformation) and the continuity equation (assuming mass-preservation). We consider the reduced form of the optimal control problem and solve the resulting unconstrained optimization problem using a discretize-then-optimize approach. A key contribution is the elimination of the PDE constraint using a Lagrangian hyperbolic PDE solver. Lagrangian methods rely on the concept of characteristic curves. We approximate these curves using a fourth-order Runge-Kutta method. We also present an efficient algorithm for computing the derivatives of the final state of the system with respect to the velocity field. This allows us to use fast Gauss-Newton based methods. We present quickly converging iterative linear solvers using spectral preconditioners that render the overall optimization efficient and scalable. Our method is embedded into the image registration framework FAIR and, thus, supports the most commonly used similarity measures and regularization functionals. We demonstrate the potential of our new approach using several synthetic and real world test problems with up to 14.7 million degrees of freedom.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2012-12-17
Grizzly is a simulation tool for assessing the effects of age-related degradation on systems, structures, and components of nuclear power plants. Grizzly is built on the MOOSE framework, and uses a Jacobian-free Newton Krylov method to obtain solutions to tightly coupled thermo-mechanical simulations. Grizzly runs on a wide range of hardware, from a single processor to massively parallel machines.
Minimal Krylov Subspaces for Dimension Reduction
2013-01-01
these applications realized a maximal compute time improvement with minimal Krylov subspaces. More recently, Halko et . al . [36] have investigated... Halko et . al . proposed a variety of them in [36], but we focus on the “direct eigenvalue approximation for Hermitian matri- ces with random...result due to Halko et . al . Theorem 5 ( Halko , Martinsson and Tropp [36]). Let A ∈ Rn×m be the input matrix with partitioned singular value
Effects of high-frequency damping on iterative convergence of implicit viscous solver
NASA Astrophysics Data System (ADS)
Nishikawa, Hiroaki; Nakashima, Yoshitaka; Watanabe, Norihiko
2017-11-01
This paper discusses effects of high-frequency damping on iterative convergence of an implicit defect-correction solver for viscous problems. The study targets a finite-volume discretization with a one parameter family of damped viscous schemes. The parameter α controls high-frequency damping: zero damping with α = 0, and larger damping for larger α (> 0). Convergence rates are predicted for a model diffusion equation by a Fourier analysis over a practical range of α. It is shown that the convergence rate attains its minimum at α = 1 on regular quadrilateral grids, and deteriorates for larger values of α. A similar behavior is observed for regular triangular grids. In both quadrilateral and triangular grids, the solver is predicted to diverge for α smaller than approximately 0.5. Numerical results are shown for the diffusion equation and the Navier-Stokes equations on regular and irregular grids. The study suggests that α = 1 and 4/3 are suitable values for robust and efficient computations, and α = 4 / 3 is recommended for the diffusion equation, which achieves higher-order accuracy on regular quadrilateral grids. Finally, a Jacobian-Free Newton-Krylov solver with the implicit solver (a low-order Jacobian approximately inverted by a multi-color Gauss-Seidel relaxation scheme) used as a variable preconditioner is recommended for practical computations, which provides robust and efficient convergence for a wide range of α.
Tensor-product preconditioners for a space-time discontinuous Galerkin method
NASA Astrophysics Data System (ADS)
Diosady, Laslo T.; Murman, Scott M.
2014-10-01
A space-time discontinuous Galerkin spectral element discretization is presented for direct numerical simulation of the compressible Navier-Stokes equations. An efficient solution technique based on a matrix-free Newton-Krylov method is presented. A diagonalized alternating direction implicit preconditioner is extended to a space-time formulation using entropy variables. The effectiveness of this technique is demonstrated for the direct numerical simulation of turbulent flow in a channel.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Salinger, Andy; Evans, Katherine J; Lemieux, Jean-Francois
2011-01-01
We have implemented the Jacobian-free Newton-Krylov (JFNK) method for solving the rst-order ice sheet momentum equation in order to improve the numerical performance of the Community Ice Sheet Model (CISM), the land ice component of the Community Earth System Model (CESM). Our JFNK implementation is based on signicant re-use of existing code. For example, our physics-based preconditioner uses the original Picard linear solver in CISM. For several test cases spanning a range of geometries and boundary conditions, our JFNK implementation is 1.84-3.62 times more efficient than the standard Picard solver in CISM. Importantly, this computational gain of JFNK over themore » Picard solver increases when rening the grid. Global convergence of the JFNK solver has been signicantly improved by rescaling the equation for the basal boundary condition and through the use of an inexact Newton method. While a diverse set of test cases show that our JFNK implementation is usually robust, for some problems it may fail to converge with increasing resolution (as does the Picard solver). Globalization through parameter continuation did not remedy this problem and future work to improve robustness will explore a combination of Picard and JFNK and the use of homotopy methods.« less
Solving lattice QCD systems of equations using mixed precision solvers on GPUs
NASA Astrophysics Data System (ADS)
Clark, M. A.; Babich, R.; Barros, K.; Brower, R. C.; Rebbi, C.
2010-09-01
Modern graphics hardware is designed for highly parallel numerical tasks and promises significant cost and performance benefits for many scientific applications. One such application is lattice quantum chromodynamics (lattice QCD), where the main computational challenge is to efficiently solve the discretized Dirac equation in the presence of an SU(3) gauge field. Using NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector product that performs at up to 40, 135 and 212 Gflops for double, single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have developed a new mixed precision approach for Krylov solvers using reliable updates which allows for full double precision accuracy while using only single or half precision arithmetic for the bulk of the computation. The resulting BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations until convergence, perform better than the usual defect-correction approach for mixed precision.
Efficient parallel implicit methods for rotary-wing aerodynamics calculations
NASA Astrophysics Data System (ADS)
Wissink, Andrew M.
Euler/Navier-Stokes Computational Fluid Dynamics (CFD) methods are commonly used for prediction of the aerodynamics and aeroacoustics of modern rotary-wing aircraft. However, their widespread application to large complex problems is limited lack of adequate computing power. Parallel processing offers the potential for dramatic increases in computing power, but most conventional implicit solution methods are inefficient in parallel and new techniques must be adopted to realize its potential. This work proposes alternative implicit schemes for Euler/Navier-Stokes rotary-wing calculations which are robust and efficient in parallel. The first part of this work proposes an efficient parallelizable modification of the Lower Upper-Symmetric Gauss Seidel (LU-SGS) implicit operator used in the well-known Transonic Unsteady Rotor Navier Stokes (TURNS) code. The new hybrid LU-SGS scheme couples a point-relaxation approach of the Data Parallel-Lower Upper Relaxation (DP-LUR) algorithm for inter-processor communication with the Symmetric Gauss Seidel algorithm of LU-SGS for on-processor computations. With the modified operator, TURNS is implemented in parallel using Message Passing Interface (MPI) for communication. Numerical performance and parallel efficiency are evaluated on the IBM SP2 and Thinking Machines CM-5 multi-processors for a variety of steady-state and unsteady test cases. The hybrid LU-SGS scheme maintains the numerical performance of the original LU-SGS algorithm in all cases and shows a good degree of parallel efficiency. It experiences a higher degree of robustness than DP-LUR for third-order upwind solutions. The second part of this work examines use of Krylov subspace iterative solvers for the nonlinear CFD solutions. The hybrid LU-SGS scheme is used as a parallelizable preconditioner. Two iterative methods are tested, Generalized Minimum Residual (GMRES) and Orthogonal s-Step Generalized Conjugate Residual (OSGCR). The Newton method demonstrates good parallel performance on the IBM SP2, with OS-GCR giving slightly better performance than GMRES on large numbers of processors. For steady and quasi-steady calculations, the convergence rate is accelerated but the overall solution time remains about the same as the standard hybrid LU-SGS scheme. For unsteady calculations, however, the Newton method maintains a higher degree of time-accuracy which allows tbe use of larger timesteps and results in CPU savings of 20-35%.
A Legendre–Fourier spectral method with exact conservation laws for the Vlasov–Poisson system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Manzini, Gianmarco; Delzanno, Gian Luca; Vencels, Juris
In this study, we present the design and implementation of an L 2-stable spectral method for the discretization of the Vlasov–Poisson model of a collisionless plasma in one space and velocity dimension. The velocity and space dependence of the Vlasov equation are resolved through a truncated spectral expansion based on Legendre and Fourier basis functions, respectively. The Poisson equation, which is coupled to the Vlasov equation, is also resolved through a Fourier expansion. The resulting system of ordinary differential equation is discretized by the implicit second-order accurate Crank–Nicolson time discretization. The non-linear dependence between the Vlasov and Poisson equations ismore » iteratively solved at any time cycle by a Jacobian-Free Newton–Krylov method. In this work we analyze the structure of the main conservation laws of the resulting Legendre–Fourier model, e.g., mass, momentum, and energy, and prove that they are exactly satisfied in the semi-discrete and discrete setting. The L 2-stability of the method is ensured by discretizing the boundary conditions of the distribution function at the boundaries of the velocity domain by a suitable penalty term. The impact of the penalty term on the conservation properties is investigated theoretically and numerically. An implementation of the penalty term that does not affect the conservation of mass, momentum and energy, is also proposed and studied. A collisional term is introduced in the discrete model to control the filamentation effect, but does not affect the conservation properties of the system. Numerical results on a set of standard test problems illustrate the performance of the method.« less
A Legendre–Fourier spectral method with exact conservation laws for the Vlasov–Poisson system
Manzini, Gianmarco; Delzanno, Gian Luca; Vencels, Juris; ...
2016-04-22
In this study, we present the design and implementation of an L 2-stable spectral method for the discretization of the Vlasov–Poisson model of a collisionless plasma in one space and velocity dimension. The velocity and space dependence of the Vlasov equation are resolved through a truncated spectral expansion based on Legendre and Fourier basis functions, respectively. The Poisson equation, which is coupled to the Vlasov equation, is also resolved through a Fourier expansion. The resulting system of ordinary differential equation is discretized by the implicit second-order accurate Crank–Nicolson time discretization. The non-linear dependence between the Vlasov and Poisson equations ismore » iteratively solved at any time cycle by a Jacobian-Free Newton–Krylov method. In this work we analyze the structure of the main conservation laws of the resulting Legendre–Fourier model, e.g., mass, momentum, and energy, and prove that they are exactly satisfied in the semi-discrete and discrete setting. The L 2-stability of the method is ensured by discretizing the boundary conditions of the distribution function at the boundaries of the velocity domain by a suitable penalty term. The impact of the penalty term on the conservation properties is investigated theoretically and numerically. An implementation of the penalty term that does not affect the conservation of mass, momentum and energy, is also proposed and studied. A collisional term is introduced in the discrete model to control the filamentation effect, but does not affect the conservation properties of the system. Numerical results on a set of standard test problems illustrate the performance of the method.« less
Numerical simulation of cavitating flows in shipbuilding
NASA Astrophysics Data System (ADS)
Bagaev, D.; Yegorov, S.; Lobachev, M.; Rudnichenko, A.; Taranov, A.
2018-05-01
The paper presents validation of numerical simulations of cavitating flows around different marine objects carried out at the Krylov State Research Centre (KSRC). Preliminary validation was done with reference to international test objects. The main part of the paper contains results of solving practical problems of ship propulsion design. The validation of numerical simulations by comparison with experimental data shows a good accuracy of the supercomputer technologies existing at Krylov State Research Centre for both hydrodynamic and cavitation characteristics prediction.
Recovery Discontinuous Galerkin Jacobian-free Newton-Krylov Method for all-speed flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
HyeongKae Park; Robert Nourgaliev; Vincent Mousseau
2008-07-01
There is an increasing interest to develop the next generation simulation tools for the advanced nuclear energy systems. These tools will utilize the state-of-art numerical algorithms and computer science technology in order to maximize the predictive capability, support advanced reactor designs, reduce uncertainty and increase safety margins. In analyzing nuclear energy systems, we are interested in compressible low-Mach number, high heat flux flows with a wide range of Re, Ra, and Pr numbers. Under these conditions, the focus is placed on turbulent heat transfer, in contrast to other industries whose main interest is in capturing turbulent mixing. Our objective ismore » to develop singlepoint turbulence closure models for large-scale engineering CFD code, using Direct Numerical Simulation (DNS) or Large Eddy Simulation (LES) tools, requireing very accurate and efficient numerical algorithms. The focus of this work is placed on fully-implicit, high-order spatiotemporal discretization based on the discontinuous Galerkin method solving the conservative form of the compressible Navier-Stokes equations. The method utilizes a local reconstruction procedure derived from weak formulation of the problem, which is inspired by the recovery diffusion flux algorithm of van Leer and Nomura [?] and by the piecewise parabolic reconstruction [?] in the finite volume method. The developed methodology is integrated into the Jacobianfree Newton-Krylov framework [?] to allow a fully-implicit solution of the problem.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zou, Ling; Zhao, Haihua; Zhang, Hongbin
2016-04-01
The phase appearance/disappearance issue presents serious numerical challenges in two-phase flow simulations. Many existing reactor safety analysis codes use different kinds of treatments for the phase appearance/disappearance problem. However, to our best knowledge, there are no fully satisfactory solutions. Additionally, the majority of the existing reactor system analysis codes were developed using low-order numerical schemes in both space and time. In many situations, it is desirable to use high-resolution spatial discretization and fully implicit time integration schemes to reduce numerical errors. In this work, we adapted a high-resolution spatial discretization scheme on staggered grid mesh and fully implicit time integrationmore » methods (such as BDF1 and BDF2) to solve the two-phase flow problems. The discretized nonlinear system was solved by the Jacobian-free Newton Krylov (JFNK) method, which does not require the derivation and implementation of analytical Jacobian matrix. These methods were tested with a few two-phase flow problems with phase appearance/disappearance phenomena considered, such as a linear advection problem, an oscillating manometer problem, and a sedimentation problem. The JFNK method demonstrated extremely robust and stable behaviors in solving the two-phase flow problems with phase appearance/disappearance. No special treatments such as water level tracking or void fraction limiting were used. High-resolution spatial discretization and second- order fully implicit method also demonstrated their capabilities in significantly reducing numerical errors.« less
NASA Astrophysics Data System (ADS)
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
2016-09-01
Inverse modeling seeks model parameters given a set of observations. However, for practical problems because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ˜101 to ˜102 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
NASA Astrophysics Data System (ADS)
Schmidt, Burkhard; Hartmann, Carsten
2018-07-01
WavePacket is an open-source program package for numeric simulations in quantum dynamics. It can solve time-independent or time-dependent linear Schrödinger and Liouville-von Neumann-equations in one or more dimensions. Also coupled equations can be treated, which allows, e.g., to simulate molecular quantum dynamics beyond the Born-Oppenheimer approximation. Optionally accounting for the interaction with external electric fields within the semi-classical dipole approximation, WavePacket can be used to simulate experiments involving tailored light pulses in photo-induced physics or chemistry. Being highly versatile and offering visualization of quantum dynamics 'on the fly', WavePacket is well suited for teaching or research projects in atomic, molecular and optical physics as well as in physical or theoretical chemistry. Building on the previous Part I [Comp. Phys. Comm. 213, 223-234 (2017)] which dealt with closed quantum systems and discrete variable representations, the present Part II focuses on the dynamics of open quantum systems, with Lindblad operators modeling dissipation and dephasing. This part also describes the WavePacket function for optimal control of quantum dynamics, building on rapid monotonically convergent iteration methods. Furthermore, two different approaches to dimension reduction implemented in WavePacket are documented here. In the first one, a balancing transformation based on the concepts of controllability and observability Gramians is used to identify states that are neither well controllable nor well observable. Those states are either truncated or averaged out. In the other approach, the H2-error for a given reduced dimensionality is minimized by H2 optimal model reduction techniques, utilizing a bilinear iterative rational Krylov algorithm. The present work describes the MATLAB version of WavePacket 5.3.0 which is hosted and further developed at the Sourceforge platform, where also extensive Wiki-documentation as well as numerous worked-out demonstration examples with animated graphics can be found.
Controller reduction by preserving impulse response energy
NASA Technical Reports Server (NTRS)
Craig, Roy R., Jr.; Su, Tzu-Jeng
1989-01-01
A model order reduction algorithm based on a Krylov recurrence formulation is developed to reduce order of controllers. The reduced-order controller is obtained by projecting the full-order LQG controller onto a Krylov subspace in which either the controllability or the observability grammian is equal to the identity matrix. The reduced-order controller preserves the impulse response energy of the full-order controller and has a parameter-matching property. Two numerical examples drawn from other controller reduction literature are used to illustrate the efficacy of the proposed reduction algorithm.
Adaptive Implicit Non-Equilibrium Radiation Diffusion
DOE Office of Scientific and Technical Information (OSTI.GOV)
Philip, Bobby; Wang, Zhen; Berrill, Mark A
2013-01-01
We describe methods for accurate and efficient long term time integra- tion of non-equilibrium radiation diffusion systems: implicit time integration for effi- cient long term time integration of stiff multiphysics systems, local control theory based step size control to minimize the required global number of time steps while control- ling accuracy, dynamic 3D adaptive mesh refinement (AMR) to minimize memory and computational costs, Jacobian Free Newton-Krylov methods on AMR grids for efficient nonlinear solution, and optimal multilevel preconditioner components that provide level independent solver convergence.
Implementation details of the coupled QMR algorithm
NASA Technical Reports Server (NTRS)
Freund, Roland W.; Nachtigal, Noel M.
1992-01-01
The original quasi-minimal residual method (QMR) relies on the three-term look-ahead Lanczos process, to generate basis vectors for the underlying Krylov subspaces. However, empirical observations indicate that, in finite precision arithmetic, three-term vector recurrences are less robust than mathematically equivalent coupled two-term recurrences. Therefore, we recently proposed a new implementation of the QMR method based on a coupled two-term look-ahead Lanczos procedure. In this paper, we describe implementation details of this coupled QMR algorithm, and we present results of numerical experiments.
Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Li, Xiaoye; Heber, Gerd; Biswas, Rupak
2000-01-01
The ability of computers to solve hitherto intractable problems and simulate complex processes using mathematical models makes them an indispensable part of modern science and engineering. Computer simulations of large-scale realistic applications usually require solving a set of non-linear partial differential equations (PDES) over a finite region. For example, one thrust area in the DOE Grand Challenge projects is to design future accelerators such as the SpaHation Neutron Source (SNS). Our colleagues at SLAC need to model complex RFQ cavities with large aspect ratios. Unstructured grids are currently used to resolve the small features in a large computational domain; dynamic mesh adaptation will be added in the future for additional efficiency. The PDEs for electromagnetics are discretized by the FEM method, which leads to a generalized eigenvalue problem Kx = AMx, where K and M are the stiffness and mass matrices, and are very sparse. In a typical cavity model, the number of degrees of freedom is about one million. For such large eigenproblems, direct solution techniques quickly reach the memory limits. Instead, the most widely-used methods are Krylov subspace methods, such as Lanczos or Jacobi-Davidson. In all the Krylov-based algorithms, sparse matrix-vector multiplication (SPMV) must be performed repeatedly. Therefore, the efficiency of SPMV usually determines the eigensolver speed. SPMV is also one of the most heavily used kernels in large-scale numerical simulations.
NASA Technical Reports Server (NTRS)
Datta, Anubhav; Johnson, Wayne R.
2009-01-01
This paper has two objectives. The first objective is to formulate a 3-dimensional Finite Element Model for the dynamic analysis of helicopter rotor blades. The second objective is to implement and analyze a dual-primal iterative substructuring based Krylov solver, that is parallel and scalable, for the solution of the 3-D FEM analysis. The numerical and parallel scalability of the solver is studied using two prototype problems - one for ideal hover (symmetric) and one for a transient forward flight (non-symmetric) - both carried out on up to 48 processors. In both hover and forward flight conditions, a perfect linear speed-up is observed, for a given problem size, up to the point of substructure optimality. Substructure optimality and the linear parallel speed-up range are both shown to depend on the problem size as well as on the selection of the coarse problem. With a larger problem size, linear speed-up is restored up to the new substructure optimality. The solver also scales with problem size - even though this conclusion is premature given the small prototype grids considered in this study.
Power flow prediction in vibrating systems via model reduction
NASA Astrophysics Data System (ADS)
Li, Xianhui
This dissertation focuses on power flow prediction in vibrating systems. Reduced order models (ROMs) are built based on rational Krylov model reduction which preserve power flow information in the original systems over a specified frequency band. Stiffness and mass matrices of the ROMs are obtained by projecting the original system matrices onto the subspaces spanned by forced responses. A matrix-free algorithm is designed to construct ROMs directly from the power quantities at selected interpolation frequencies. Strategies for parallel implementation of the algorithm via message passing interface are proposed. The quality of ROMs is iteratively refined according to the error estimate based on residual norms. Band capacity is proposed to provide a priori estimate of the sizes of good quality ROMs. Frequency averaging is recast as ensemble averaging and Cauchy distribution is used to simplify the computation. Besides model reduction for deterministic systems, details of constructing ROMs for parametric and nonparametric random systems are also presented. Case studies have been conducted on testbeds from Harwell-Boeing collections. Input and coupling power flow are computed for the original systems and the ROMs. Good agreement is observed in all cases.
The algebraic criteria for the stability of control systems
NASA Technical Reports Server (NTRS)
Cremer, H.; Effertz, F. H.
1986-01-01
This paper critically examines the standard algebraic criteria for the stability of linear control systems and their proofs, reveals important previously unnoticed connections, and presents new representations. Algebraic stability criteria have also acquired significance for stability studies of non-linear differential equation systems by the Krylov-Bogoljubov-Magnus Method, and allow realization conditions to be determined for classes of broken rational functions as frequency characteristics of electrical network.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Paul T.; Shadid, John N.; Sala, Marzio
In this study results are presented for the large-scale parallel performance of an algebraic multilevel preconditioner for solution of the drift-diffusion model for semiconductor devices. The preconditioner is the key numerical procedure determining the robustness, efficiency and scalability of the fully-coupled Newton-Krylov based, nonlinear solution method that is employed for this system of equations. The coupled system is comprised of a source term dominated Poisson equation for the electric potential, and two convection-diffusion-reaction type equations for the electron and hole concentration. The governing PDEs are discretized in space by a stabilized finite element method. Solution of the discrete system ismore » obtained through a fully-implicit time integrator, a fully-coupled Newton-based nonlinear solver, and a restarted GMRES Krylov linear system solver. The algebraic multilevel preconditioner is based on an aggressive coarsening graph partitioning of the nonzero block structure of the Jacobian matrix. Representative performance results are presented for various choices of multigrid V-cycles and W-cycles and parameter variations for smoothers based on incomplete factorizations. Parallel scalability results are presented for solution of up to 10{sup 8} unknowns on 4096 processors of a Cray XT3/4 and an IBM POWER eServer system.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
Inverse modeling seeks model parameters given a set of observations. However, for practical problems because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally-efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of themore » problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10 1 to ~10 2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.« less
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
2016-09-01
Inverse modeling seeks model parameters given a set of observations. However, for practical problems because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally-efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of themore » problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10 1 to ~10 2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.« less
NASA Astrophysics Data System (ADS)
Weiss, Chester J.
2013-08-01
An essential element for computational hypothesis testing, data inversion and experiment design for electromagnetic geophysics is a robust forward solver, capable of easily and quickly evaluating the electromagnetic response of arbitrary geologic structure. The usefulness of such a solver hinges on the balance among competing desires like ease of use, speed of forward calculation, scalability to large problems or compute clusters, parsimonious use of memory access, accuracy and by necessity, the ability to faithfully accommodate a broad range of geologic scenarios over extremes in length scale and frequency content. This is indeed a tall order. The present study addresses recent progress toward the development of a forward solver with these properties. Based on the Lorenz-gauged Helmholtz decomposition, a new finite volume solution over Cartesian model domains endowed with complex-valued electrical properties is shown to be stable over the frequency range 10-2-1010 Hz and range 10-3-105 m in length scale. Benchmark examples are drawn from magnetotellurics, exploration geophysics, geotechnical mapping and laboratory-scale analysis, showing excellent agreement with reference analytic solutions. Computational efficiency is achieved through use of a matrix-free implementation of the quasi-minimum-residual (QMR) iterative solver, which eliminates explicit storage of finite volume matrix elements in favor of "on the fly" computation as needed by the iterative Krylov sequence. Further efficiency is achieved through sparse coupling matrices between the vector and scalar potentials whose non-zero elements arise only in those parts of the model domain where the conductivity gradient is non-zero. Multi-thread parallelization in the QMR solver through OpenMP pragmas is used to reduce the computational cost of its most expensive step: the single matrix-vector product at each iteration. High-level MPI communicators farm independent processes to available compute nodes for simultaneous computation of multi-frequency or multi-transmitter responses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jung, Y. S.; Joo, H. G.; Yoon, J. I.
The nTRACER direct whole core transport code employing the planar MOC solution based 3-D calculation method, the subgroup method for resonance treatment, the Krylov matrix exponential method for depletion, and a subchannel thermal/hydraulic calculation solver was developed for practical high-fidelity simulation of power reactors. Its accuracy and performance is verified by comparing with the measurement data obtained for three pressurized water reactor cores. It is demonstrated that accurate and detailed multi-physic simulation of power reactors is practically realizable without any prior calculations or adjustments. (authors)
CYLINDRICAL WAVES OF FINITE AMPLITUDE IN DISSIPATIVE MEDIUM (in Russian)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naugol'nykh, K.A.; Soluyan, S.I.; Khokhlov, R.V.
1962-07-01
Propagation of diverging and converging cylindrical waves in a nonlinear, viscous, heat conducting medium is analyzed using approximation methods. The KrylovBogolyubov method was used for small Raynold's numbers, and the method of S. I. Soluyan et al. (Vest. Mosk. Univ. ser. phys. and astronomy 3, 52-81, 1981), was used for large Raynold's numbers. The formation and dissipation of shock fronts and spatial dimensions of shock phenomena were analyzed. It is shown that the problem of finiteamplitude cylindrical wave propagation is identical to the problem of plane wave propagations in a medium with variable viscosity. (tr-auth)
A Tensor-Train accelerated solver for integral equations in complex geometries
NASA Astrophysics Data System (ADS)
Corona, Eduardo; Rahimian, Abtin; Zorin, Denis
2017-04-01
We present a framework using the Quantized Tensor Train (QTT) decomposition to accurately and efficiently solve volume and boundary integral equations in three dimensions. We describe how the QTT decomposition can be used as a hierarchical compression and inversion scheme for matrices arising from the discretization of integral equations. For a broad range of problems, computational and storage costs of the inversion scheme are extremely modest O (log N) and once the inverse is computed, it can be applied in O (Nlog N) . We analyze the QTT ranks for hierarchically low rank matrices and discuss its relationship to commonly used hierarchical compression techniques such as FMM and HSS. We prove that the QTT ranks are bounded for translation-invariant systems and argue that this behavior extends to non-translation invariant volume and boundary integrals. For volume integrals, the QTT decomposition provides an efficient direct solver requiring significantly less memory compared to other fast direct solvers. We present results demonstrating the remarkable performance of the QTT-based solver when applied to both translation and non-translation invariant volume integrals in 3D. For boundary integral equations, we demonstrate that using a QTT decomposition to construct preconditioners for a Krylov subspace method leads to an efficient and robust solver with a small memory footprint. We test the QTT preconditioners in the iterative solution of an exterior elliptic boundary value problem (Laplace) formulated as a boundary integral equation in complex, multiply connected geometries.
Shadid, J. N.; Pawlowski, R. P.; Cyr, E. C.; ...
2016-02-10
Here, we discuss that the computational solution of the governing balance equations for mass, momentum, heat transfer and magnetic induction for resistive magnetohydrodynamics (MHD) systems can be extremely challenging. These difficulties arise from both the strong nonlinear, nonsymmetric coupling of fluid and electromagnetic phenomena, as well as the significant range of time- and length-scales that the interactions of these physical mechanisms produce. This paper explores the development of a scalable, fully-implicit stabilized unstructured finite element (FE) capability for 3D incompressible resistive MHD. The discussion considers the development of a stabilized FE formulation in context of the variational multiscale (VMS) method,more » and describes the scalable implicit time integration and direct-to-steady-state solution capability. The nonlinear solver strategy employs Newton–Krylov methods, which are preconditioned using fully-coupled algebraic multilevel preconditioners. These preconditioners are shown to enable a robust, scalable and efficient solution approach for the large-scale sparse linear systems generated by the Newton linearization. Verification results demonstrate the expected order-of-accuracy for the stabilized FE discretization. The approach is tested on a variety of prototype problems, that include MHD duct flows, an unstable hydromagnetic Kelvin–Helmholtz shear layer, and a 3D island coalescence problem used to model magnetic reconnection. Initial results that explore the scaling of the solution methods are also presented on up to 128K processors for problems with up to 1.8B unknowns on a CrayXK7.« less
A Jacobian-free Newton Krylov method for mortar-discretized thermomechanical contact problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hansen, Glen, E-mail: Glen.Hansen@inl.gov
2011-07-20
Multibody contact problems are common within the field of multiphysics simulation. Applications involving thermomechanical contact scenarios are also quite prevalent. Such problems can be challenging to solve due to the likelihood of thermal expansion affecting contact geometry which, in turn, can change the thermal behavior of the components being analyzed. This paper explores a simple model of a light water reactor nuclear fuel rod, which consists of cylindrical pellets of uranium dioxide (UO{sub 2}) fuel sealed within a Zircalloy cladding tube. The tube is initially filled with helium gas, which fills the gap between the pellets and cladding tube. Themore » accurate modeling of heat transfer across the gap between fuel pellets and the protective cladding is essential to understanding fuel performance, including cladding stress and behavior under irradiated conditions, which are factors that affect the lifetime of the fuel. The thermomechanical contact approach developed here is based on the mortar finite element method, where Lagrange multipliers are used to enforce weak continuity constraints at participating interfaces. In this formulation, the heat equation couples to linear mechanics through a thermal expansion term. Lagrange multipliers are used to formulate the continuity constraints for both heat flux and interface traction at contact interfaces. The resulting system of nonlinear algebraic equations are cast in residual form for solution of the transient problem. A Jacobian-free Newton Krylov method is used to provide for fully-coupled solution of the coupled thermal contact and heat equations.« less
A Jacobian-Free Newton Krylov Method for Mortar-Discretized Thermomechanical Contact Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Glen Hansen
2011-07-01
Multibody contact problems are common within the field of multiphysics simulation. Applications involving thermomechanical contact scenarios are also quite prevalent. Such problems can be challenging to solve due to the likelihood of thermal expansion affecting contact geometry which, in turn, can change the thermal behavior of the components being analyzed. This paper explores a simple model of a light water reactor nuclear reactor fuel rod, which consists of cylindrical pellets of uranium dioxide (UO2) fuel sealed within a Zircalloy cladding tube. The tube is initially filled with helium gas, which fills the gap between the pellets and cladding tube. Themore » accurate modeling of heat transfer across the gap between fuel pellets and the protective cladding is essential to understanding fuel performance, including cladding stress and behavior under irradiated conditions, which are factors that affect the lifetime of the fuel. The thermomechanical contact approach developed here is based on the mortar finite element method, where Lagrange multipliers are used to enforce weak continuity constraints at participating interfaces. In this formulation, the heat equation couples to linear mechanics through a thermal expansion term. Lagrange multipliers are used to formulate the continuity constraints for both heat flux and interface traction at contact interfaces. The resulting system of nonlinear algebraic equations are cast in residual form for solution of the transient problem. A Jacobian-free Newton Krylov method is used to provide for fully-coupled solution of the coupled thermal contact and heat equations.« less
NASA Astrophysics Data System (ADS)
Weston, Brian; Nourgaliev, Robert; Delplanque, Jean-Pierre
2017-11-01
We present a new block-based Schur complement preconditioner for simulating all-speed compressible flow with phase change. The conservation equations are discretized with a reconstructed Discontinuous Galerkin method and integrated in time with fully implicit time discretization schemes. The resulting set of non-linear equations is converged using a robust Newton-Krylov framework. Due to the stiffness of the underlying physics associated with stiff acoustic waves and viscous material strength effects, we solve for the primitive-variables (pressure, velocity, and temperature). To enable convergence of the highly ill-conditioned linearized systems, we develop a physics-based preconditioner, utilizing approximate block factorization techniques to reduce the fully-coupled 3×3 system to a pair of reduced 2×2 systems. We demonstrate that our preconditioned Newton-Krylov framework converges on very stiff multi-physics problems, corresponding to large CFL and Fourier numbers, with excellent algorithmic and parallel scalability. Results are shown for the classic lid-driven cavity flow problem as well as for 3D laser-induced phase change. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
A Newton-Krylov solver for fast spin-up of online ocean tracers
NASA Astrophysics Data System (ADS)
Lindsay, Keith
2017-01-01
We present a Newton-Krylov based solver to efficiently spin up tracers in an online ocean model. We demonstrate that the solver converges, that tracer simulations initialized with the solution from the solver have small drift, and that the solver takes orders of magnitude less computational time than the brute force spin-up approach. To demonstrate the application of the solver, we use it to efficiently spin up the tracer ideal age with respect to the circulation from different time intervals in a long physics run. We then evaluate how the spun-up ideal age tracer depends on the duration of the physics run, i.e., on how equilibrated the circulation is.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feng, Tao, E-mail: fengtao2@mail.ustc.edu.cn; Graduate School of China Academy Engineering Physics, Beijing 100083; An, Hengbin, E-mail: an_hengbin@iapcm.ac.cn
2013-03-01
Jacobian-free Newton–Krylov (JFNK) method is an effective algorithm for solving large scale nonlinear equations. One of the most important advantages of JFNK method is that there is no necessity to form and store the Jacobian matrix of the nonlinear system when JFNK method is employed. However, an approximation of the Jacobian is needed for the purpose of preconditioning. In this paper, JFNK method is employed to solve a class of non-equilibrium radiation diffusion coupled to material thermal conduction equations, and two preconditioners are designed by linearizing the equations in two methods. Numerical results show that the two preconditioning methods canmore » improve the convergence behavior and efficiency of JFNK method.« less
Scaling Optimization of the SIESTA MHD Code
NASA Astrophysics Data System (ADS)
Seal, Sudip; Hirshman, Steven; Perumalla, Kalyan
2013-10-01
SIESTA is a parallel three-dimensional plasma equilibrium code capable of resolving magnetic islands at high spatial resolutions for toroidal plasmas. Originally designed to exploit small-scale parallelism, SIESTA has now been scaled to execute efficiently over several thousands of processors P. This scaling improvement was accomplished with minimal intrusion to the execution flow of the original version. First, the efficiency of the iterative solutions was improved by integrating the parallel tridiagonal block solver code BCYCLIC. Krylov-space generation in GMRES was then accelerated using a customized parallel matrix-vector multiplication algorithm. Novel parallel Hessian generation algorithms were integrated and memory access latencies were dramatically reduced through loop nest optimizations and data layout rearrangement. These optimizations sped up equilibria calculations by factors of 30-50. It is possible to compute solutions with granularity N/P near unity on extremely fine radial meshes (N > 1024 points). Grid separation in SIESTA, which manifests itself primarily in the resonant components of the pressure far from rational surfaces, is strongly suppressed by finer meshes. Large problem sizes of up to 300 K simultaneous non-linear coupled equations have been solved on the NERSC supercomputers. Work supported by U.S. DOE under Contract DE-AC05-00OR22725 with UT-Battelle, LLC.
Recent advances in nonlinear implicit, electrostatic particle-in-cell (PIC) algorithms
NASA Astrophysics Data System (ADS)
Chen, Guangye; Chacón, Luis; Barnes, Daniel
2012-10-01
An implicit 1D electrostatic PIC algorithmfootnotetextChen, Chac'on, Barnes, J. Comput. Phys. 230 (2011) has been developed that satisfies exact energy and charge conservation. The algorithm employs a kinetic-enslaved Jacobian-free Newton-Krylov methodfootnotetextIbid. that ensures nonlinear convergence while taking timesteps comparable to the dynamical timescale of interest. Here we present two main improvements of the algorithm. The first is the formulation of a preconditioner based on linearized fluid equations, which are closed using available particle information. The computational benefit is that solving the fluid system is much cheaper than the kinetic one. The effectiveness of the preconditioner in accelerating nonlinear iterations on challenging problems will be demonstrated. A second improvement is the generalization of Ref. 1 to curvilinear meshes,footnotetextChac'on, Chen, Barnes, J. Comput. Phys. submitted (2012) with a hybrid particle update of positions and velocities in logical and physical space respectively.footnotetextSwift, J. Comp. Phys., 126 (1996) The curvilinear algorithm remains exactly charge and energy-conserving, and can be extended to multiple dimensions. We demonstrate the accuracy and efficiency of the algorithm with a 1D ion-acoustic shock wave simulation.
NASA Astrophysics Data System (ADS)
Ge, Zheng-Ming
2008-04-01
Necessary and sufficient conditions for the stability of a sleeping top described by dynamic equations of six state variables, Euler equations, and Poisson equations, by a two-degree-of-freedom system, Krylov equations, and by a one-degree-of-freedom system, nutation angle equation, is obtained by the Lyapunov direct method, Ge-Liu second instability theorem, an instability theorem, and a Ge-Yao-Chen partial region stability theorem without using the first approximation theory altogether.
NASA Astrophysics Data System (ADS)
Fuchs, A.; Androsov, A.; Harig, S.; Hiller, W.; Rakowsky, N.
2012-04-01
Based on the jeopardy of devastating tsunamis and the unpredictability of such events, tsunami modelling as part of warning systems is still a contemporary topic. The tsunami group of Alfred Wegener Institute developed the simulation tool TsunAWI as contribution to the Early Warning System in Indonesia. Although the precomputed scenarios for this purpose qualify for satisfying deliverables, the study of further improvements continues. While TsunAWI is governed by the Shallow Water Equations, an extension of the model is based on a nonhydrostatic approach. At the arrival of a tsunami wave in coastal regions with rough bathymetry, the term containing the nonhydrostatic part of pressure, that is neglected in the original hydrostatic model, gains in importance. In consideration of this term, a better approximation of the wave is expected. Differences of hydrostatic and nonhydrostatic model results are contrasted in the standard benchmark problem of a solitary wave runup on a plane beach. The observation data provided by Titov and Synolakis (1995) serves as reference. The nonhydrostatic approach implies a set of equations that are similar to the Shallow Water Equations, so the variation of the code can be implemented on top. However, this additional routines cause a lot of issues you have to cope with. So far the computations of the model were purely explicit. In the nonhydrostatic version the determination of an additional unknown and the solution of a large sparse system of linear equations is necessary. The latter constitutes the lion's share of computing time and memory requirement. Since the corresponding matrix is only symmetric in structure and not in values, an iterative Krylov Subspace Method is used, in particular the restarted Generalized Minimal Residual Algorithm GMRES(m). With regard to optimization, we present a comparison of several combinations of sequential and parallel preconditioning techniques respective number of iterations and setup/application time. Since the used software package pARMS 3.2, that provides solving and preconditioning techniques, works via MPI parallelism, in an auxiliary branch we adapted TsunAWI and switched from OpenMP to MPI with attached importance to internal partition management.
NASA Astrophysics Data System (ADS)
Lin, Y.; O'Malley, D.; Vesselinov, V. V.
2015-12-01
Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems due to the facts that the observed data sets are often large and model parameters are often numerous, conventional methods for solving the inverse modeling can be computationally expensive. We have developed a new, computationally-efficient Levenberg-Marquardt method for solving large-scale inverse modeling. Levenberg-Marquardt methods require the solution of a dense linear system of equations which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by using these computational techniques. We apply this new inverse modeling method to invert for a random transitivity field. Our algorithm is fast enough to solve for the distributed model parameters (transitivity) at each computational node in the model domain. The inversion is also aided by the use regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programing language that allows for efficient memory management and utilization of high-performance computational resources. By comparing with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a powerful tool for large-scale applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lawson, M.; Yu, Y. H.; Nelessen, A.
2014-05-01
Wave energy converters (WECs) are commonly designed and analyzed using numerical models that combine multi-body dynamics with hydrodynamic models based on the Cummins Equation and linearized hydrodynamic coefficients. These modeling methods are attractive design tools because they are computationally inexpensive and do not require the use of high performance computing resources necessitated by high-fidelity methods, such as Navier Stokes computational fluid dynamics. Modeling hydrodynamics using linear coefficients assumes that the device undergoes small motions and that the wetted surface area of the devices is approximately constant. WEC devices, however, are typically designed to undergo large motions in order to maximizemore » power extraction, calling into question the validity of assuming that linear hydrodynamic models accurately capture the relevant fluid-structure interactions. In this paper, we study how calculating buoyancy and Froude-Krylov forces from the instantaneous position of a WEC device (referred to as instantaneous buoyancy and Froude-Krylov forces from herein) changes WEC simulation results compared to simulations that use linear hydrodynamic coefficients. First, we describe the WEC-Sim tool used to perform simulations and how the ability to model instantaneous forces was incorporated into WEC-Sim. We then use a simplified one-body WEC device to validate the model and to demonstrate how accounting for these instantaneously calculated forces affects the accuracy of simulation results, such as device motions, hydrodynamic forces, and power generation.« less
NASA Astrophysics Data System (ADS)
Chui, Siu Lit; Lu, Ya Yan
2004-03-01
Wide-angle full-vector beam propagation methods (BPMs) for three-dimensional wave-guiding structures can be derived on the basis of rational approximants of a square root operator or its exponential (i.e., the one-way propagator). While the less accurate BPM based on the slowly varying envelope approximation can be efficiently solved by the alternating direction implicit (ADI) method, the wide-angle variants involve linear systems that are more difficult to handle. We present an efficient solver for these linear systems that is based on a Krylov subspace method with an ADI preconditioner. The resulting wide-angle full-vector BPM is used to simulate the propagation of wave fields in a Y branch and a taper.
Chui, Siu Lit; Lu, Ya Yan
2004-03-01
Wide-angle full-vector beam propagation methods (BPMs) for three-dimensional wave-guiding structures can be derived on the basis of rational approximants of a square root operator or its exponential (i.e., the one-way propagator). While the less accurate BPM based on the slowly varying envelope approximation can be efficiently solved by the alternating direction implicit (ADI) method, the wide-angle variants involve linear systems that are more difficult to handle. We present an efficient solver for these linear systems that is based on a Krylov subspace method with an ADI preconditioner. The resulting wide-angle full-vector BPM is used to simulate the propagation of wave fields in a Y branch and a taper.
NASA Astrophysics Data System (ADS)
Parand, K.; Nikarya, M.
2017-11-01
In this paper a novel method will be introduced to solve a nonlinear partial differential equation (PDE). In the proposed method, we use the spectral collocation method based on Bessel functions of the first kind and the Jacobian free Newton-generalized minimum residual (JFNGMRes) method with adaptive preconditioner. In this work a nonlinear PDE has been converted to a nonlinear system of algebraic equations using the collocation method based on Bessel functions without any linearization, discretization or getting the help of any other methods. Finally, by using JFNGMRes, the solution of the nonlinear algebraic system is achieved. To illustrate the reliability and efficiency of the proposed method, we solve some examples of the famous Fisher equation. We compare our results with other methods.
Stable and Spectrally Accurate Schemes for the Navier-Stokes Equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jia, Jun; Liu, Jie
2011-01-01
In this paper, we present an accurate, efficient and stable numerical method for the incompressible Navier-Stokes equations (NSEs). The method is based on (1) an equivalent pressure Poisson equation formulation of the NSE with proper pressure boundary conditions, which facilitates the design of high-order and stable numerical methods, and (2) the Krylov deferred correction (KDC) accelerated method of lines transpose (mbox MoL{sup T}), which is very stable, efficient, and of arbitrary order in time. Numerical tests with known exact solutions in three dimensions show that the new method is spectrally accurate in time, and a numerical order of convergence 9more » was observed. Two-dimensional computational results of flow past a cylinder and flow in a bifurcated tube are also reported.« less
pyCTQW: A continuous-time quantum walk simulator on distributed memory computers
NASA Astrophysics Data System (ADS)
Izaac, Josh A.; Wang, Jingbo B.
2015-01-01
In the general field of quantum information and computation, quantum walks are playing an increasingly important role in constructing physical models and quantum algorithms. We have recently developed a distributed memory software package pyCTQW, with an object-oriented Python interface, that allows efficient simulation of large multi-particle CTQW (continuous-time quantum walk)-based systems. In this paper, we present an introduction to the Python and Fortran interfaces of pyCTQW, discuss various numerical methods of calculating the matrix exponential, and demonstrate the performance behavior of pyCTQW on a distributed memory cluster. In particular, the Chebyshev and Krylov-subspace methods for calculating the quantum walk propagation are provided, as well as methods for visualization and data analysis.
Constrained H1-regularization schemes for diffeomorphic image registration
Mang, Andreas; Biros, George
2017-01-01
We propose regularization schemes for deformable registration and efficient algorithms for their numerical approximation. We treat image registration as a variational optimal control problem. The deformation map is parametrized by its velocity. Tikhonov regularization ensures well-posedness. Our scheme augments standard smoothness regularization operators based on H1- and H2-seminorms with a constraint on the divergence of the velocity field, which resembles variational formulations for Stokes incompressible flows. In our formulation, we invert for a stationary velocity field and a mass source map. This allows us to explicitly control the compressibility of the deformation map and by that the determinant of the deformation gradient. We also introduce a new regularization scheme that allows us to control shear. We use a globalized, preconditioned, matrix-free, reduced space (Gauss–)Newton–Krylov scheme for numerical optimization. We exploit variable elimination techniques to reduce the number of unknowns of our system; we only iterate on the reduced space of the velocity field. Our current implementation is limited to the two-dimensional case. The numerical experiments demonstrate that we can control the determinant of the deformation gradient without compromising registration quality. This additional control allows us to avoid oversmoothing of the deformation map. We also demonstrate that we can promote or penalize shear whilst controlling the determinant of the deformation gradient. PMID:29075361
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners
Li, Ruipeng; Saad, Yousef
2017-08-01
This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman--Morrison--Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisonsmore » with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.« less
Tensor-product preconditioners for higher-order space-time discontinuous Galerkin methods
NASA Astrophysics Data System (ADS)
Diosady, Laslo T.; Murman, Scott M.
2017-02-01
A space-time discontinuous-Galerkin spectral-element discretization is presented for direct numerical simulation of the compressible Navier-Stokes equations. An efficient solution technique based on a matrix-free Newton-Krylov method is developed in order to overcome the stiffness associated with high solution order. The use of tensor-product basis functions is key to maintaining efficiency at high-order. Efficient preconditioning methods are presented which can take advantage of the tensor-product formulation. A diagonalized Alternating-Direction-Implicit (ADI) scheme is extended to the space-time discontinuous Galerkin discretization. A new preconditioner for the compressible Euler/Navier-Stokes equations based on the fast-diagonalization method is also presented. Numerical results demonstrate the effectiveness of these preconditioners for the direct numerical simulation of subsonic turbulent flows.
Low-Rank Correction Methods for Algebraic Domain Decomposition Preconditioners
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Ruipeng; Saad, Yousef
This study presents a parallel preconditioning method for distributed sparse linear systems, based on an approximate inverse of the original matrix, that adopts a general framework of distributed sparse matrices and exploits domain decomposition (DD) and low-rank corrections. The DD approach decouples the matrix and, once inverted, a low-rank approximation is applied by exploiting the Sherman--Morrison--Woodbury formula, which yields two variants of the preconditioning methods. The low-rank expansion is computed by the Lanczos procedure with reorthogonalizations. Numerical experiments indicate that, when combined with Krylov subspace accelerators, this preconditioner can be efficient and robust for solving symmetric sparse linear systems. Comparisonsmore » with pARMS, a DD-based parallel incomplete LU (ILU) preconditioning method, are presented for solving Poisson's equation and linear elasticity problems.« less
Tensor-Product Preconditioners for Higher-Order Space-Time Discontinuous Galerkin Methods
NASA Technical Reports Server (NTRS)
Diosady, Laslo T.; Murman, Scott M.
2016-01-01
space-time discontinuous-Galerkin spectral-element discretization is presented for direct numerical simulation of the compressible Navier-Stokes equat ions. An efficient solution technique based on a matrix-free Newton-Krylov method is developed in order to overcome the stiffness associated with high solution order. The use of tensor-product basis functions is key to maintaining efficiency at high order. Efficient preconditioning methods are presented which can take advantage of the tensor-product formulation. A diagonalized Alternating-Direction-Implicit (ADI) scheme is extended to the space-time discontinuous Galerkin discretization. A new preconditioner for the compressible Euler/Navier-Stokes equations based on the fast-diagonalization method is also presented. Numerical results demonstrate the effectiveness of these preconditioners for the direct numerical simulation of subsonic turbulent flows.
PIXIE3D: A Parallel, Implicit, eXtended MHD 3D Code
NASA Astrophysics Data System (ADS)
Chacon, Luis
2006-10-01
We report on the development of PIXIE3D, a 3D parallel, fully implicit Newton-Krylov extended MHD code in general curvilinear geometry. PIXIE3D employs a second-order, finite-volume-based spatial discretization that satisfies remarkable properties such as being conservative, solenoidal in the magnetic field to machine precision, non-dissipative, and linearly and nonlinearly stable in the absence of physical dissipation. PIXIE3D employs fully-implicit Newton-Krylov methods for the time advance. Currently, second-order implicit schemes such as Crank-Nicolson and BDF2 (2^nd order backward differentiation formula) are available. PIXIE3D is fully parallel (employs PETSc for parallelism), and exhibits excellent parallel scalability. A parallel, scalable, MG preconditioning strategy, based on physics-based preconditioning ideas, has been developed for resistive MHD, and is currently being extended to Hall MHD. In this poster, we will report on progress in the algorithmic formulation for extended MHD, as well as the the serial and parallel performance of PIXIE3D in a variety of problems and geometries. L. Chac'on, Comput. Phys. Comm., 163 (3), 143-171 (2004) L. Chac'on et al., J. Comput. Phys. 178 (1), 15- 36 (2002); J. Comput. Phys., 188 (2), 573-592 (2003) L. Chac'on, 32nd EPS Conf. Plasma Physics, Tarragona, Spain, 2005 L. Chac'on et al., 33rd EPS Conf. Plasma Physics, Rome, Italy, 2006
NASA Astrophysics Data System (ADS)
Angelidis, Dionysios; Chawdhary, Saurabh; Sotiropoulos, Fotis
2016-11-01
A novel numerical method is developed for solving the 3D, unsteady, incompressible Navier-Stokes equations on locally refined fully unstructured Cartesian grids in domains with arbitrarily complex immersed boundaries. Owing to the utilization of the fractional step method on an unstructured Cartesian hybrid staggered/non-staggered grid layout, flux mismatch and pressure discontinuity issues are avoided and the divergence free constraint is inherently satisfied to machine zero. Auxiliary/hanging nodes are used to facilitate the discretization of the governing equations. The second-order accuracy of the solver is ensured by using multi-dimension Lagrange interpolation operators and appropriate differencing schemes at the interface of regions with different levels of refinement. The sharp interface immersed boundary method is augmented with local near-boundary refinement to handle arbitrarily complex boundaries. The discrete momentum equation is solved with the matrix free Newton-Krylov method and the Krylov-subspace method is employed to solve the Poisson equation. The second-order accuracy of the proposed method on unstructured Cartesian grids is demonstrated by solving the Poisson equation with a known analytical solution. A number of three-dimensional laminar flow simulations of increasing complexity illustrate the ability of the method to handle flows across a range of Reynolds numbers and flow regimes. Laminar steady and unsteady flows past a sphere and the oblique vortex shedding from a circular cylinder mounted between two end walls demonstrate the accuracy, the efficiency and the smooth transition of scales and coherent structures across refinement levels. Large-eddy simulation (LES) past a miniature wind turbine rotor, parameterized using the actuator line approach, indicates the ability of the fully unstructured solver to simulate complex turbulent flows. Finally, a geometry resolving LES of turbulent flow past a complete hydrokinetic turbine illustrates the potential of the method to simulate turbulent flows past geometrically complex bodies on locally refined meshes. In all the cases, the results are found to be in very good agreement with published data and savings in computational resources are achieved.
[The taxonomic rank and place of Colpodellida in the system of the Protista].
Myl'nikov, A P; Krylov, M V; Frolov, A O
2000-01-01
The analysis of ultrastructure organisation and divergent processes in Colpodellida, Perkinsida, Gregarinea and Coccidea has confirmed the presence of unique basic structures in all of these organisms and the necessity to combine them into the single phylum Sporozoa. A taxonomic rank and place of Colpodellida in the system of living organisms is represented as follows: phylum Sporozoa Leuckart, 1879; em. Krylov, Mylnikov, 1986. (Syn.: Apicomplexa Levine, 1970). Predators or parasites. Common basic structure: pellicular membranes, subpellicular microtubules, micropores, conoid, rhoptries and micronemes, tubular mitochondrial cristae. Class Perkinsea Levine, 1978. Predators or parasites, vegetative stages with two heterodynamic flagella. Subclass 1. Colpodellia nom. nov. (Syn.: Spiromonadia Krylov, Mylnikov, 1986). Predators, two heterodynamic flagella with string-like mastigonemes (if present), division is exclusively within a cyst, with 2-4 daughter cells being produced, extrusomes are trichocyst-like. Subclass 2. Perkinsia Levine, 1978. Parasites, zoospores with two heterodynamic flagella, mastigonemes (if present) bristle-like or string-like.
Robust large-scale parallel nonlinear solvers for simulations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bader, Brett William; Pawlowski, Roger Patrick; Kolda, Tamara Gibson
2005-11-01
This report documents research to develop robust and efficient solution techniques for solving large-scale systems of nonlinear equations. The most widely used method for solving systems of nonlinear equations is Newton's method. While much research has been devoted to augmenting Newton-based solvers (usually with globalization techniques), little has been devoted to exploring the application of different models. Our research has been directed at evaluating techniques using different models than Newton's method: a lower order model, Broyden's method, and a higher order model, the tensor method. We have developed large-scale versions of each of these models and have demonstrated their usemore » in important applications at Sandia. Broyden's method replaces the Jacobian with an approximation, allowing codes that cannot evaluate a Jacobian or have an inaccurate Jacobian to converge to a solution. Limited-memory methods, which have been successful in optimization, allow us to extend this approach to large-scale problems. We compare the robustness and efficiency of Newton's method, modified Newton's method, Jacobian-free Newton-Krylov method, and our limited-memory Broyden method. Comparisons are carried out for large-scale applications of fluid flow simulations and electronic circuit simulations. Results show that, in cases where the Jacobian was inaccurate or could not be computed, Broyden's method converged in some cases where Newton's method failed to converge. We identify conditions where Broyden's method can be more efficient than Newton's method. We also present modifications to a large-scale tensor method, originally proposed by Bouaricha, for greater efficiency, better robustness, and wider applicability. Tensor methods are an alternative to Newton-based methods and are based on computing a step based on a local quadratic model rather than a linear model. The advantage of Bouaricha's method is that it can use any existing linear solver, which makes it simple to write and easily portable. However, the method usually takes twice as long to solve as Newton-GMRES on general problems because it solves two linear systems at each iteration. In this paper, we discuss modifications to Bouaricha's method for a practical implementation, including a special globalization technique and other modifications for greater efficiency. We present numerical results showing computational advantages over Newton-GMRES on some realistic problems. We further discuss a new approach for dealing with singular (or ill-conditioned) matrices. In particular, we modify an algorithm for identifying a turning point so that an increasingly ill-conditioned Jacobian does not prevent convergence.« less
Lagardère, Louis; Jolly, Luc-Henri; Lipparini, Filippo; Aviat, Félix; Stamm, Benjamin; Jing, Zhifeng F; Harger, Matthew; Torabifard, Hedieh; Cisneros, G Andrés; Schnieders, Michael J; Gresh, Nohad; Maday, Yvon; Ren, Pengyu Y; Ponder, Jay W; Piquemal, Jean-Philip
2018-01-28
We present Tinker-HP, a massively MPI parallel package dedicated to classical molecular dynamics (MD) and to multiscale simulations, using advanced polarizable force fields (PFF) encompassing distributed multipoles electrostatics. Tinker-HP is an evolution of the popular Tinker package code that conserves its simplicity of use and its reference double precision implementation for CPUs. Grounded on interdisciplinary efforts with applied mathematics, Tinker-HP allows for long polarizable MD simulations on large systems up to millions of atoms. We detail in the paper the newly developed extension of massively parallel 3D spatial decomposition to point dipole polarizable models as well as their coupling to efficient Krylov iterative and non-iterative polarization solvers. The design of the code allows the use of various computer systems ranging from laboratory workstations to modern petascale supercomputers with thousands of cores. Tinker-HP proposes therefore the first high-performance scalable CPU computing environment for the development of next generation point dipole PFFs and for production simulations. Strategies linking Tinker-HP to Quantum Mechanics (QM) in the framework of multiscale polarizable self-consistent QM/MD simulations are also provided. The possibilities, performances and scalability of the software are demonstrated via benchmarks calculations using the polarizable AMOEBA force field on systems ranging from large water boxes of increasing size and ionic liquids to (very) large biosystems encompassing several proteins as well as the complete satellite tobacco mosaic virus and ribosome structures. For small systems, Tinker-HP appears to be competitive with the Tinker-OpenMM GPU implementation of Tinker. As the system size grows, Tinker-HP remains operational thanks to its access to distributed memory and takes advantage of its new algorithmic enabling for stable long timescale polarizable simulations. Overall, a several thousand-fold acceleration over a single-core computation is observed for the largest systems. The extension of the present CPU implementation of Tinker-HP to other computational platforms is discussed.
Model and controller reduction of large-scale structures based on projection methods
NASA Astrophysics Data System (ADS)
Gildin, Eduardo
The design of low-order controllers for high-order plants is a challenging problem theoretically as well as from a computational point of view. Frequently, robust controller design techniques result in high-order controllers. It is then interesting to achieve reduced-order models and controllers while maintaining robustness properties. Controller designed for large structures based on models obtained by finite element techniques yield large state-space dimensions. In this case, problems related to storage, accuracy and computational speed may arise. Thus, model reduction methods capable of addressing controller reduction problems are of primary importance to allow the practical applicability of advanced controller design methods for high-order systems. A challenging large-scale control problem that has emerged recently is the protection of civil structures, such as high-rise buildings and long-span bridges, from dynamic loadings such as earthquakes, high wind, heavy traffic, and deliberate attacks. Even though significant effort has been spent in the application of control theory to the design of civil structures in order increase their safety and reliability, several challenging issues are open problems for real-time implementation. This dissertation addresses with the development of methodologies for controller reduction for real-time implementation in seismic protection of civil structures using projection methods. Three classes of schemes are analyzed for model and controller reduction: nodal truncation, singular value decomposition methods and Krylov-based methods. A family of benchmark problems for structural control are used as a framework for a comparative study of model and controller reduction techniques. It is shown that classical model and controller reduction techniques, such as balanced truncation, modal truncation and moment matching by Krylov techniques, yield reduced-order controllers that do not guarantee stability of the closed-loop system, that is, the reduced-order controller implemented with the full-order plant. A controller reduction approach is proposed such that to guarantee closed-loop stability. It is based on the concept of dissipativity (or positivity) of linear dynamical systems. Utilizing passivity preserving model reduction together with dissipative-LQG controllers, effective low-order optimal controllers are obtained. Results are shown through simulations.
Exploratory High-Fidelity Aerostructural Optimization Using an Efficient Monolithic Solution Method
NASA Astrophysics Data System (ADS)
Zhang, Jenmy Zimi
This thesis is motivated by the desire to discover fuel efficient aircraft concepts through exploratory design. An optimization methodology based on tightly integrated high-fidelity aerostructural analysis is proposed, which has the flexibility, robustness, and efficiency to contribute to this goal. The present aerostructural optimization methodology uses an integrated geometry parameterization and mesh movement strategy, which was initially proposed for aerodynamic shape optimization. This integrated approach provides the optimizer with a large amount of geometric freedom for conducting exploratory design, while allowing for efficient and robust mesh movement in the presence of substantial shape changes. In extending this approach to aerostructural optimization, this thesis has addressed a number of important challenges. A structural mesh deformation strategy has been introduced to translate consistently the shape changes described by the geometry parameterization to the structural model. A three-field formulation of the discrete steady aerostructural residual couples the mesh movement equations with the three-dimensional Euler equations and a linear structural analysis. Gradients needed for optimization are computed with a three-field coupled adjoint approach. A number of investigations have been conducted to demonstrate the suitability and accuracy of the present methodology for use in aerostructural optimization involving substantial shape changes. Robustness and efficiency in the coupled solution algorithms is crucial to the success of an exploratory optimization. This thesis therefore also focuses on the design of an effective monolithic solution algorithm for the proposed methodology. This involves using a Newton-Krylov method for the aerostructural analysis and a preconditioned Krylov subspace method for the coupled adjoint solution. Several aspects of the monolithic solution method have been investigated. These include appropriate strategies for scaling and matrix-vector product evaluation, as well as block preconditioning techniques that preserve the modularity between subproblems. The monolithic solution method is applied to problems with varying degrees of fluid-structural coupling, as well as a wing span optimization study. The monolithic solution algorithm typically requires 20%-70% less computing time than its partitioned counterpart. This advantage increases with increasing wing flexibility. The performance of the monolithic solution method is also much less sensitive to the choice of the solution parameter.
NASA Astrophysics Data System (ADS)
Augustin, Christoph M.; Neic, Aurel; Liebmann, Manfred; Prassl, Anton J.; Niederer, Steven A.; Haase, Gundolf; Plank, Gernot
2016-01-01
Electromechanical (EM) models of the heart have been used successfully to study fundamental mechanisms underlying a heart beat in health and disease. However, in all modeling studies reported so far numerous simplifications were made in terms of representing biophysical details of cellular function and its heterogeneity, gross anatomy and tissue microstructure, as well as the bidirectional coupling between electrophysiology (EP) and tissue distension. One limiting factor is the employed spatial discretization methods which are not sufficiently flexible to accommodate complex geometries or resolve heterogeneities, but, even more importantly, the limited efficiency of the prevailing solver techniques which is not sufficiently scalable to deal with the incurring increase in degrees of freedom (DOF) when modeling cardiac electromechanics at high spatio-temporal resolution. This study reports on the development of a novel methodology for solving the nonlinear equation of finite elasticity using human whole organ models of cardiac electromechanics, discretized at a high para-cellular resolution. Three patient-specific, anatomically accurate, whole heart EM models were reconstructed from magnetic resonance (MR) scans at resolutions of 220 μm, 440 μm and 880 μm, yielding meshes of approximately 184.6, 24.4 and 3.7 million tetrahedral elements and 95.9, 13.2 and 2.1 million displacement DOF, respectively. The same mesh was used for discretizing the governing equations of both electrophysiology (EP) and nonlinear elasticity. A novel algebraic multigrid (AMG) preconditioner for an iterative Krylov solver was developed to deal with the resulting computational load. The AMG preconditioner was designed under the primary objective of achieving favorable strong scaling characteristics for both setup and solution runtimes, as this is key for exploiting current high performance computing hardware. Benchmark results using the 220 μm, 440 μm and 880 μm meshes demonstrate efficient scaling up to 1024, 4096 and 8192 compute cores which allowed the simulation of a single heart beat in 44.3, 87.8 and 235.3 minutes, respectively. The efficiency of the method allows fast simulation cycles without compromising anatomical or biophysical detail.
Augustin, Christoph M.; Neic, Aurel; Liebmann, Manfred; Prassl, Anton J.; Niederer, Steven A.; Haase, Gundolf; Plank, Gernot
2016-01-01
Electromechanical (EM) models of the heart have been used successfully to study fundamental mechanisms underlying a heart beat in health and disease. However, in all modeling studies reported so far numerous simplifications were made in terms of representing biophysical details of cellular function and its heterogeneity, gross anatomy and tissue microstructure, as well as the bidirectional coupling between electrophysiology (EP) and tissue distension. One limiting factor is the employed spatial discretization methods which are not sufficiently flexible to accommodate complex geometries or resolve heterogeneities, but, even more importantly, the limited efficiency of the prevailing solver techniques which are not sufficiently scalable to deal with the incurring increase in degrees of freedom (DOF) when modeling cardiac electromechanics at high spatio-temporal resolution. This study reports on the development of a novel methodology for solving the nonlinear equation of finite elasticity using human whole organ models of cardiac electromechanics, discretized at a high para-cellular resolution. Three patient-specific, anatomically accurate, whole heart EM models were reconstructed from magnetic resonance (MR) scans at resolutions of 220 μm, 440 μm and 880 μm, yielding meshes of approximately 184.6, 24.4 and 3.7 million tetrahedral elements and 95.9, 13.2 and 2.1 million displacement DOF, respectively. The same mesh was used for discretizing the governing equations of both electrophysiology (EP) and nonlinear elasticity. A novel algebraic multigrid (AMG) preconditioner for an iterative Krylov solver was developed to deal with the resulting computational load. The AMG preconditioner was designed under the primary objective of achieving favorable strong scaling characteristics for both setup and solution runtimes, as this is key for exploiting current high performance computing hardware. Benchmark results using the 220 μm, 440 μm and 880 μm meshes demonstrate efficient scaling up to 1024, 4096 and 8192 compute cores which allowed the simulation of a single heart beat in 44.3, 87.8 and 235.3 minutes, respectively. The efficiency of the method allows fast simulation cycles without compromising anatomical or biophysical detail. PMID:26819483
Multigrid for Staggered Lattice Fermions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brower, Richard C.; Clark, M. A.; Strelchenko, Alexei
Critical slowing down in Krylov methods for the Dirac operator presents a major obstacle to further advances in lattice field theory as it approaches the continuum solution. Here we formulate a multi-grid algorithm for the Kogut-Susskind (or staggered) fermion discretization which has proven difficult relative to Wilson multigrid due to its first-order anti-Hermitian structure. The solution is to introduce a novel spectral transformation by the K\\"ahler-Dirac spin structure prior to the Galerkin projection. We present numerical results for the two-dimensional, two-flavor Schwinger model, however, the general formalism is agnostic to dimension and is directly applicable to four-dimensional lattice QCD.
Coupling Schemes for Multiphysics Reactor Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vijay Mahadeven; Jean Ragusa
2007-11-01
This report documents the progress of the student Vijay S. Mahadevan from the Nuclear Engineering Department of Texas A&M University over the summer of 2007 during his visit to the INL. The purpose of his visit was to investigate the physics-based preconditioned Jacobian-free Newton-Krylov method applied to physics relevant to nuclear reactor simulation. To this end he studied two test problems that represented reaction-diffusion and advection-reaction. These two test problems will provide the basis for future work in which neutron diffusion, nonlinear heat conduction, and a twophase flow model will be tightly coupled to provide an accurate model of amore » BWR core.« less
A monolithic homotopy continuation algorithm with application to computational fluid dynamics
NASA Astrophysics Data System (ADS)
Brown, David A.; Zingg, David W.
2016-09-01
A new class of homotopy continuation methods is developed suitable for globalizing quasi-Newton methods for large sparse nonlinear systems of equations. The new continuation methods, described as monolithic homotopy continuation, differ from the classical predictor-corrector algorithm in that the predictor and corrector phases are replaced with a single phase which includes both a predictor and corrector component. Conditional convergence and stability are proved analytically. Using a Laplacian-like operator to construct the homotopy, the new algorithm is shown to be more efficient than the predictor-corrector homotopy continuation algorithm as well as an implementation of the widely-used pseudo-transient continuation algorithm for some inviscid and turbulent, subsonic and transonic external aerodynamic flows over the ONERA M6 wing and the NACA 0012 airfoil using a parallel implicit Newton-Krylov finite-difference flow solver.
Numerical Computation of Subsonic Conical Diffuser Flows with Nonuniform Turbulent Inlet Conditions
1977-09-01
Gauss - Seidel Point Iteration Method . . . . . . . . . . . . . . . 7.0 FACTORS AFFECTING THE RATE OF CONVERGENCE OF THE POINT...can be solved in several ways. For simplicity, a standard Gauss - Seidel iteration method is used to obtain the solution . The method updates the...FACTORS AFFECTING THE RATE OF CONVERGENCE OF THE POINT ITERATION ,ŘETHOD The advantage of using the Gauss - Seidel point iteration method to
NASA Astrophysics Data System (ADS)
Wang, Yi-Hong; Wu, Guo-Cheng; Baleanu, Dumitru
2013-10-01
The variational iteration method is newly used to construct various integral equations of fractional order. Some iterative schemes are proposed which fully use the method and the predictor-corrector approach. The fractional Bagley-Torvik equation is then illustrated as an example of multi-order and the results show the efficiency of the variational iteration method's new role.
A novel iterative scheme and its application to differential equations.
Khan, Yasir; Naeem, F; Šmarda, Zdeněk
2014-01-01
The purpose of this paper is to employ an alternative approach to reconstruct the standard variational iteration algorithm II proposed by He, including Lagrange multiplier, and to give a simpler formulation of Adomian decomposition and modified Adomian decomposition method in terms of newly proposed variational iteration method-II (VIM). Through careful investigation of the earlier variational iteration algorithm and Adomian decomposition method, we find unnecessary calculations for Lagrange multiplier and also repeated calculations involved in each iteration, respectively. Several examples are given to verify the reliability and efficiency of the method.
NASA Astrophysics Data System (ADS)
Trujillo Bueno, Javier; Manso Sainz, Rafael
1999-05-01
This paper shows how to generalize to non-LTE polarization transfer some operator splitting methods that were originally developed for solving unpolarized transfer problems. These are the Jacobi-based accelerated Λ-iteration (ALI) method of Olson, Auer, & Buchler and the iterative schemes based on Gauss-Seidel and successive overrelaxation (SOR) iteration of Trujillo Bueno and Fabiani Bendicho. The theoretical framework chosen for the formulation of polarization transfer problems is the quantum electrodynamics (QED) theory of Landi Degl'Innocenti, which specifies the excitation state of the atoms in terms of the irreducible tensor components of the atomic density matrix. This first paper establishes the grounds of our numerical approach to non-LTE polarization transfer by concentrating on the standard case of scattering line polarization in a gas of two-level atoms, including the Hanle effect due to a weak microturbulent and isotropic magnetic field. We begin demonstrating that the well-known Λ-iteration method leads to the self-consistent solution of this type of problem if one initializes using the ``exact'' solution corresponding to the unpolarized case. We show then how the above-mentioned splitting methods can be easily derived from this simple Λ-iteration scheme. We show that our SOR method is 10 times faster than the Jacobi-based ALI method, while our implementation of the Gauss-Seidel method is 4 times faster. These iterative schemes lead to the self-consistent solution independently of the chosen initialization. The convergence rate of these iterative methods is very high; they do not require either the construction or the inversion of any matrix, and the computing time per iteration is similar to that of the Λ-iteration method.
Optimised Iteration in Coupled Monte Carlo - Thermal-Hydraulics Calculations
NASA Astrophysics Data System (ADS)
Hoogenboom, J. Eduard; Dufek, Jan
2014-06-01
This paper describes an optimised iteration scheme for the number of neutron histories and the relaxation factor in successive iterations of coupled Monte Carlo and thermal-hydraulic reactor calculations based on the stochastic iteration method. The scheme results in an increasing number of neutron histories for the Monte Carlo calculation in successive iteration steps and a decreasing relaxation factor for the spatial power distribution to be used as input to the thermal-hydraulics calculation. The theoretical basis is discussed in detail and practical consequences of the scheme are shown, among which a nearly linear increase per iteration of the number of cycles in the Monte Carlo calculation. The scheme is demonstrated for a full PWR type fuel assembly. Results are shown for the axial power distribution during several iteration steps. A few alternative iteration method are also tested and it is concluded that the presented iteration method is near optimal.
An iterative method for near-field Fresnel region polychromatic phase contrast imaging
NASA Astrophysics Data System (ADS)
Carroll, Aidan J.; van Riessen, Grant A.; Balaur, Eugeniu; Dolbnya, Igor P.; Tran, Giang N.; Peele, Andrew G.
2017-07-01
We present an iterative method for polychromatic phase contrast imaging that is suitable for broadband illumination and which allows for the quantitative determination of the thickness of an object given the refractive index of the sample material. Experimental and simulation results suggest the iterative method provides comparable image quality and quantitative object thickness determination when compared to the analytical polychromatic transport of intensity and contrast transfer function methods. The ability of the iterative method to work over a wider range of experimental conditions means the iterative method is a suitable candidate for use with polychromatic illumination and may deliver more utility for laboratory-based x-ray sources, which typically have a broad spectrum.
New methods of testing nonlinear hypothesis using iterative NLLS estimator
NASA Astrophysics Data System (ADS)
Mahaboob, B.; Venkateswarlu, B.; Mokeshrayalu, G.; Balasiddamuni, P.
2017-11-01
This research paper discusses the method of testing nonlinear hypothesis using iterative Nonlinear Least Squares (NLLS) estimator. Takeshi Amemiya [1] explained this method. However in the present research paper, a modified Wald test statistic due to Engle, Robert [6] is proposed to test the nonlinear hypothesis using iterative NLLS estimator. An alternative method for testing nonlinear hypothesis using iterative NLLS estimator based on nonlinear hypothesis using iterative NLLS estimator based on nonlinear studentized residuals has been proposed. In this research article an innovative method of testing nonlinear hypothesis using iterative restricted NLLS estimator is derived. Pesaran and Deaton [10] explained the methods of testing nonlinear hypothesis. This paper uses asymptotic properties of nonlinear least squares estimator proposed by Jenrich [8]. The main purpose of this paper is to provide very innovative methods of testing nonlinear hypothesis using iterative NLLS estimator, iterative NLLS estimator based on nonlinear studentized residuals and iterative restricted NLLS estimator. Eakambaram et al. [12] discussed least absolute deviation estimations versus nonlinear regression model with heteroscedastic errors and also they studied the problem of heteroscedasticity with reference to nonlinear regression models with suitable illustration. William Grene [13] examined the interaction effect in nonlinear models disused by Ai and Norton [14] and suggested ways to examine the effects that do not involve statistical testing. Peter [15] provided guidelines for identifying composite hypothesis and addressing the probability of false rejection for multiple hypotheses.
Some nonlinear damping models in flexible structures
NASA Technical Reports Server (NTRS)
Balakrishnan, A. V.
1988-01-01
A class of nonlinear damping models is introduced with application to flexible flight structures characterized by low damping. Approximate solutions of engineering interest are obtained for the model using the classical averaging technique of Krylov and Bogoliubov. The results should be considered preliminary pending further investigation.
Application of a fast Newton-Krylov solver for equilibrium simulations of phosphorus and oxygen
NASA Astrophysics Data System (ADS)
Fu, Weiwei; Primeau, François
2017-11-01
Model drift due to inadequate spinup is a serious problem that complicates the interpretation of climate change simulations. Even after a 300 year spinup we show that solutions are not only still drifting but often drifting away from their eventual equilibrium over large parts of the ocean. Here we present a Newton-Krylov solver for computing cyclostationary equilibrium solutions of a biogeochemical model for the cycling of phosphorus and oxygen. In addition to using previously developed preconditioning strategies - time-averaging and coarse-graining the Jacobian matrix - we also introduce a new strategy: the adiabatic elimination of a fast variable (particulate organic phosphorus) by slaving it to a slow variable (dissolved inorganic phosphorus). We use transport matrices derived from the Community Earth System Model (CESM) with a nominal horizontal resolution of 1° × 1° and 60 vertical levels to implement and test the solver. We find that the new solver obtains seasonally-varying equilibrium solutions with no visible drift using no more than 80 simulation years.
Solution of Cubic Equations by Iteration Methods on a Pocket Calculator
ERIC Educational Resources Information Center
Bamdad, Farzad
2004-01-01
A method to provide students a vision of how they can write iteration programs on an inexpensive programmable pocket calculator, without requiring a PC or a graphing calculator is developed. Two iteration methods are used, successive-approximations and bisection methods.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kawaguchi, Tomoya; Liu, Yihua; Reiter, Anthony
Here, a one-dimensional non-iterative direct method was employed for normalized crystal truncation rod analysis. The non-iterative approach, utilizing the Kramers–Kronig relation, avoids the ambiguities due to an improper initial model or incomplete convergence in the conventional iterative methods. The validity and limitations of the present method are demonstrated through both numerical simulations and experiments with Pt(111) in a 0.1 M CsF aqueous solution. The present method is compared with conventional iterative phase-retrieval methods.
Kawaguchi, Tomoya; Liu, Yihua; Reiter, Anthony; ...
2018-04-20
Here, a one-dimensional non-iterative direct method was employed for normalized crystal truncation rod analysis. The non-iterative approach, utilizing the Kramers–Kronig relation, avoids the ambiguities due to an improper initial model or incomplete convergence in the conventional iterative methods. The validity and limitations of the present method are demonstrated through both numerical simulations and experiments with Pt(111) in a 0.1 M CsF aqueous solution. The present method is compared with conventional iterative phase-retrieval methods.
NASA Astrophysics Data System (ADS)
Muhiddin, F. A.; Sulaiman, J.
2017-09-01
The aim of this paper is to investigate the effectiveness of the Successive Over-Relaxation (SOR) iterative method by using the fourth-order Crank-Nicolson (CN) discretization scheme to derive a five-point Crank-Nicolson approximation equation in order to solve diffusion equation. From this approximation equation, clearly, it can be shown that corresponding system of five-point approximation equations can be generated and then solved iteratively. In order to access the performance results of the proposed iterative method with the fourth-order CN scheme, another point iterative method which is Gauss-Seidel (GS), also presented as a reference method. Finally the numerical results obtained from the use of the fourth-order CN discretization scheme, it can be pointed out that the SOR iterative method is superior in terms of number of iterations, execution time, and maximum absolute error.
Iterative methods for mixed finite element equations
NASA Technical Reports Server (NTRS)
Nakazawa, S.; Nagtegaal, J. C.; Zienkiewicz, O. C.
1985-01-01
Iterative strategies for the solution of indefinite system of equations arising from the mixed finite element method are investigated in this paper with application to linear and nonlinear problems in solid and structural mechanics. The augmented Hu-Washizu form is derived, which is then utilized to construct a family of iterative algorithms using the displacement method as the preconditioner. Two types of iterative algorithms are implemented. Those are: constant metric iterations which does not involve the update of preconditioner; variable metric iterations, in which the inverse of the preconditioning matrix is updated. A series of numerical experiments is conducted to evaluate the numerical performance with application to linear and nonlinear model problems.
Leapfrog variants of iterative methods for linear algebra equations
NASA Technical Reports Server (NTRS)
Saylor, Paul E.
1988-01-01
Two iterative methods are considered, Richardson's method and a general second order method. For both methods, a variant of the method is derived for which only even numbered iterates are computed. The variant is called a leapfrog method. Comparisons between the conventional form of the methods and the leapfrog form are made under the assumption that the number of unknowns is large. In the case of Richardson's method, it is possible to express the final iterate in terms of only the initial approximation, a variant of the iteration called the grand-leap method. In the case of the grand-leap variant, a set of parameters is required. An algorithm is presented to compute these parameters that is related to algorithms to compute the weights and abscissas for Gaussian quadrature. General algorithms to implement the leapfrog and grand-leap methods are presented. Algorithms for the important special case of the Chebyshev method are also given.
High resolution x-ray CMT: Reconstruction methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, J.K.
This paper qualitatively discusses the primary characteristics of methods for reconstructing tomographic images from a set of projections. These reconstruction methods can be categorized as either {open_quotes}analytic{close_quotes} or {open_quotes}iterative{close_quotes} techniques. Analytic algorithms are derived from the formal inversion of equations describing the imaging process, while iterative algorithms incorporate a model of the imaging process and provide a mechanism to iteratively improve image estimates. Analytic reconstruction algorithms are typically computationally more efficient than iterative methods; however, analytic algorithms are available for a relatively limited set of imaging geometries and situations. Thus, the framework of iterative reconstruction methods is better suited formore » high accuracy, tomographic reconstruction codes.« less
Comparison results on preconditioned SOR-type iterative method for Z-matrices linear systems
NASA Astrophysics Data System (ADS)
Wang, Xue-Zhong; Huang, Ting-Zhu; Fu, Ying-Ding
2007-09-01
In this paper, we present some comparison theorems on preconditioned iterative method for solving Z-matrices linear systems, Comparison results show that the rate of convergence of the Gauss-Seidel-type method is faster than the rate of convergence of the SOR-type iterative method.
Li, Haichen; Yaron, David J
2016-11-08
A least-squares commutator in the iterative subspace (LCIIS) approach is explored for accelerating self-consistent field (SCF) calculations. LCIIS is similar to direct inversion of the iterative subspace (DIIS) methods in that the next iterate of the density matrix is obtained as a linear combination of past iterates. However, whereas DIIS methods find the linear combination by minimizing a sum of error vectors, LCIIS minimizes the Frobenius norm of the commutator between the density matrix and the Fock matrix. This minimization leads to a quartic problem that can be solved iteratively through a constrained Newton's method. The relationship between LCIIS and DIIS is discussed. Numerical experiments suggest that LCIIS leads to faster convergence than other SCF convergence accelerating methods in a statistically significant sense, and in a number of cases LCIIS leads to stable SCF solutions that are not found by other methods. The computational cost involved in solving the quartic minimization problem is small compared to the typical cost of SCF iterations and the approach is easily integrated into existing codes. LCIIS can therefore serve as a powerful addition to SCF convergence accelerating methods in computational quantum chemistry packages.
On coupling fluid plasma and kinetic neutral physics models
Joseph, I.; Rensink, M. E.; Stotler, D. P.; ...
2017-03-01
The coupled fluid plasma and kinetic neutral physics equations are analyzed through theory and simulation of benchmark cases. It is shown that coupling methods that do not treat the coupling rates implicitly are restricted to short time steps for stability. Fast charge exchange, ionization and recombination coupling rates exist, even after constraining the solution by requiring that the neutrals are at equilibrium. For explicit coupling, the present implementation of Monte Carlo correlated sampling techniques does not allow for complete convergence in slab geometry. For the benchmark case, residuals decay with particle number and increase with grid size, indicating that theymore » scale in a manner that is similar to the theoretical prediction for nonlinear bias error. Progress is reported on implementation of a fully implicit Jacobian-free Newton–Krylov coupling scheme. The present block Jacobi preconditioning method is still sensitive to time step and methods that better precondition the coupled system are under investigation.« less
MIBPB: A software package for electrostatic analysis
Chen, Duan; Chen, Zhan; Chen, Changjun; Geng, Weihua; Wei, Guo-Wei
2010-01-01
The Poisson-Boltzmann equation (PBE) is an established model for the electrostatic analysis of biomolecules. The development of advanced computational techniques for the solution of the PBE has been an important topic in the past two decades. This paper presents a matched interface and boundary (MIB) based PBE software package, the MIBPB solver, for electrostatic analysis. The MIBPB has a unique feature that it is the first interface technique based PBE solver that rigorously enforces the solution and flux continuity conditions at the dielectric interface between the biomolecule and the solvent. For protein molecular surfaces which may possess troublesome geometrical singularities, the MIB scheme makes the MIBPB by far the only existing PBE solver that is able to deliver the second order convergence, i.e., the accuracy increases four times when the mesh size is halved. The MIBPB method is also equipped with a Dirichlet-to-Neumann mapping (DNM) technique, that builds a Green's function approach to analytically resolve the singular charge distribution in biomolecules in order to obtain reliable solutions at meshes as coarse as 1Å — while it usually takes other traditional PB solvers 0.25Å to reach similar level of reliability. The present work further accelerates the rate of convergence of linear equation systems resulting from the MIBPB by utilizing the Krylov subspace (KS) techniques. Condition numbers of the MIBPB matrices are significantly reduced by using appropriate Krylov subspace solver and preconditioner combinations. Both linear and nonlinear PBE solvers in the MIBPB package are tested by protein-solvent solvation energy calculations and analysis of salt effects on protein-protein binding energies, respectively. PMID:20845420
Annual Copper Mountain Conferences on Multigrid and Iterative Methods, Copper Mountain, Colorado
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCormick, Stephen F.
This project supported the Copper Mountain Conference on Multigrid and Iterative Methods, held from 2007 to 2015, at Copper Mountain, Colorado. The subject of the Copper Mountain Conference Series alternated between Multigrid Methods in odd-numbered years and Iterative Methods in even-numbered years. Begun in 1983, the Series represents an important forum for the exchange of ideas in these two closely related fields. This report describes the Copper Mountain Conference on Multigrid and Iterative Methods, 2007-2015. Information on the conference series is available at http://grandmaster.colorado.edu/~copper/.
NASA Technical Reports Server (NTRS)
Allan, Brian G.
2000-01-01
A reduced order modeling approach of the Navier-Stokes equations is presented for the design of a distributed optimal feedback kernel. This approach is based oil a Krylov subspace method where significant modes of the flow are captured in the model This model is then used in all optimal feedback control design where sensing and actuation is performed oil tile entire flow field. This control design approach yields all optimal feedback kernel which provides insight into the placement of sensors and actuators in the flow field. As all evaluation of this approach, a two-dimensional shear layer and driven cavity flow are investigated.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wollaber, Allan Benton; Park, HyeongKae; Lowrie, Robert Byron
Moment-based acceleration via the development of “high-order, low-order” (HO-LO) algorithms has provided substantial accuracy and efficiency enhancements for solutions of the nonlinear, thermal radiative transfer equations by CCS-2 and T-3 staff members. Accuracy enhancements over traditional, linearized methods are obtained by solving a nonlinear, timeimplicit HO-LO system via a Jacobian-free Newton Krylov procedure. This also prevents the appearance of non-physical maximum principle violations (“temperature spikes”) associated with linearization. Efficiency enhancements are obtained in part by removing “effective scattering” from the linearized system. In this highlight, we summarize recent work in which we formally extended the HO-LO radiation algorithm to includemore » operator-split radiation-hydrodynamics.« less
Relativistic electron plasma oscillations in an inhomogeneous ion background
NASA Astrophysics Data System (ADS)
Karmakar, Mithun; Maity, Chandan; Chakrabarti, Nikhil
2018-06-01
The combined effect of relativistic electron mass variation and background ion inhomogeneity on the phase mixing process of large amplitude electron oscillations in cold plasmas have been analyzed by using Lagrangian coordinates. An inhomogeneity in the ion density is assumed to be time-independent but spatially periodic, and a periodic perturbation in the electron density is considered as well. An approximate space-time dependent solution is obtained in the weakly-relativistic limit by employing the Bogolyubov and Krylov method of averaging. It is shown that the phase mixing process of relativistically corrected electron oscillations is strongly influenced by the presence of a pre-existing ion density ripple in the plasma background.
Analysis of a Delayed Delta Modulator.
1983-05-01
parallels that of Janardhanan [10] for DPCM with matched integra- tion of stationary first-order Gauss-Markov input. In Subsection A the limiting...181, 1978. [10] JANARDHANAN , E., "Differential PCM -ystems", IEEE Trans. Conmmun., vol. Com-27, pp. 82-93, 1979. [111 KANTOROVICI, L.V. and KRYLOV, V.I
Degenerate SDEs with singular drift and applications to Heisenberg groups
NASA Astrophysics Data System (ADS)
Huang, Xing; Wang, Feng-Yu
2018-09-01
By using the ultracontractivity of a reference diffusion semigroup, Krylov's estimate is established for a class of degenerate SDEs with singular drifts, which leads to existence and pathwise uniqueness by means of Zvonkin's transformation. The main result is applied to singular SDEs on generalized Heisenberg groups.
Nested Conjugate Gradient Algorithm with Nested Preconditioning for Non-linear Image Restoration.
Skariah, Deepak G; Arigovindan, Muthuvel
2017-06-19
We develop a novel optimization algorithm, which we call Nested Non-Linear Conjugate Gradient algorithm (NNCG), for image restoration based on quadratic data fitting and smooth non-quadratic regularization. The algorithm is constructed as a nesting of two conjugate gradient (CG) iterations. The outer iteration is constructed as a preconditioned non-linear CG algorithm; the preconditioning is performed by the inner CG iteration that is linear. The inner CG iteration, which performs preconditioning for outer CG iteration, itself is accelerated by an another FFT based non-iterative preconditioner. We prove that the method converges to a stationary point for both convex and non-convex regularization functionals. We demonstrate experimentally that proposed method outperforms the well-known majorization-minimization method used for convex regularization, and a non-convex inertial-proximal method for non-convex regularization functional.
An implicit-iterative solution of the heat conduction equation with a radiation boundary condition
NASA Technical Reports Server (NTRS)
Williams, S. D.; Curry, D. M.
1977-01-01
For the problem of predicting one-dimensional heat transfer between conducting and radiating mediums by an implicit finite difference method, four different formulations were used to approximate the surface radiation boundary condition while retaining an implicit formulation for the interior temperature nodes. These formulations are an explicit boundary condition, a linearized boundary condition, an iterative boundary condition, and a semi-iterative boundary method. The results of these methods in predicting surface temperature on the space shuttle orbiter thermal protection system model under a variety of heating rates were compared. The iterative technique caused the surface temperature to be bounded at each step. While the linearized and explicit methods were generally more efficient, the iterative and semi-iterative techniques provided a realistic surface temperature response without requiring step size control techniques.
Comparing direct and iterative equation solvers in a large structural analysis software system
NASA Technical Reports Server (NTRS)
Poole, E. L.
1991-01-01
Two direct Choleski equation solvers and two iterative preconditioned conjugate gradient (PCG) equation solvers used in a large structural analysis software system are described. The two direct solvers are implementations of the Choleski method for variable-band matrix storage and sparse matrix storage. The two iterative PCG solvers include the Jacobi conjugate gradient method and an incomplete Choleski conjugate gradient method. The performance of the direct and iterative solvers is compared by solving several representative structural analysis problems. Some key factors affecting the performance of the iterative solvers relative to the direct solvers are identified.
Upwind relaxation methods for the Navier-Stokes equations using inner iterations
NASA Technical Reports Server (NTRS)
Taylor, Arthur C., III; Ng, Wing-Fai; Walters, Robert W.
1992-01-01
A subsonic and a supersonic problem are respectively treated by an upwind line-relaxation algorithm for the Navier-Stokes equations using inner iterations to accelerate steady-state solution convergence and thereby minimize CPU time. While the ability of the inner iterative procedure to mimic the quadratic convergence of the direct solver method is attested to in both test problems, some of the nonquadratic inner iterative results are noted to have been more efficient than the quadratic. In the more successful, supersonic test case, inner iteration required only about 65 percent of the line-relaxation method-entailed CPU time.
Scalable Parallel Computation for Extended MHD Modeling of Fusion Plasmas
NASA Astrophysics Data System (ADS)
Glasser, Alan H.
2008-11-01
Parallel solution of a linear system is scalable if simultaneously doubling the number of dependent variables and the number of processors results in little or no increase in the computation time to solution. Two approaches have this property for parabolic systems: multigrid and domain decomposition. Since extended MHD is primarily a hyperbolic rather than a parabolic system, additional steps must be taken to parabolize the linear system to be solved by such a method. Such physics-based preconditioning (PBP) methods have been pioneered by Chac'on, using finite volumes for spatial discretization, multigrid for solution of the preconditioning equations, and matrix-free Newton-Krylov methods for the accurate solution of the full nonlinear preconditioned equations. The work described here is an extension of these methods using high-order spectral element methods and FETI-DP domain decomposition. Application of PBP to a flux-source representation of the physics equations is discussed. The resulting scalability will be demonstrated for simple wave and for ideal and Hall MHD waves.
AIR-MRF: Accelerated iterative reconstruction for magnetic resonance fingerprinting.
Cline, Christopher C; Chen, Xiao; Mailhe, Boris; Wang, Qiu; Pfeuffer, Josef; Nittka, Mathias; Griswold, Mark A; Speier, Peter; Nadar, Mariappan S
2017-09-01
Existing approaches for reconstruction of multiparametric maps with magnetic resonance fingerprinting (MRF) are currently limited by their estimation accuracy and reconstruction time. We aimed to address these issues with a novel combination of iterative reconstruction, fingerprint compression, additional regularization, and accelerated dictionary search methods. The pipeline described here, accelerated iterative reconstruction for magnetic resonance fingerprinting (AIR-MRF), was evaluated with simulations as well as phantom and in vivo scans. We found that the AIR-MRF pipeline provided reduced parameter estimation errors compared to non-iterative and other iterative methods, particularly at shorter sequence lengths. Accelerated dictionary search methods incorporated into the iterative pipeline reduced the reconstruction time at little cost of quality. Copyright © 2017 Elsevier Inc. All rights reserved.
A Two-Dimensional Helmholtz Equation Solution for the Multiple Cavity Scattering Problem
2013-02-01
obtained by using the block Gauss – Seidel iterative meth- od. To show the convergence of the iterative method, we define the error between two...models to the general multiple cavity setting. Numerical examples indicate that the convergence of the Gauss – Seidel iterative method depends on the...variational approach. A block Gauss – Seidel iterative method is introduced to solve the cou- pled system of the multiple cavity scattering problem, where
Quantum dynamics in continuum for proton transport—Generalized correlation
NASA Astrophysics Data System (ADS)
Chen, Duan; Wei, Guo-Wei
2012-04-01
As a key process of many biological reactions such as biological energy transduction or human sensory systems, proton transport has attracted much research attention in biological, biophysical, and mathematical fields. A quantum dynamics in continuum framework has been proposed to study proton permeation through membrane proteins in our earlier work and the present work focuses on the generalized correlation of protons with their environment. Being complementary to electrostatic potentials, generalized correlations consist of proton-proton, proton-ion, proton-protein, and proton-water interactions. In our approach, protons are treated as quantum particles while other components of generalized correlations are described classically and in different levels of approximations upon simulation feasibility and difficulty. Specifically, the membrane protein is modeled as a group of discrete atoms, while ion densities are approximated by Boltzmann distributions, and water molecules are represented as a dielectric continuum. These proton-environment interactions are formulated as convolutions between number densities of species and their corresponding interaction kernels, in which parameters are obtained from experimental data. In the present formulation, generalized correlations are important components in the total Hamiltonian of protons, and thus is seamlessly embedded in the multiscale/multiphysics total variational model of the system. It takes care of non-electrostatic interactions, including the finite size effect, the geometry confinement induced channel barriers, dehydration and hydrogen bond effects, etc. The variational principle or the Euler-Lagrange equation is utilized to minimize the total energy functional, which includes the total Hamiltonian of protons, and obtain a new version of generalized Laplace-Beltrami equation, generalized Poisson-Boltzmann equation and generalized Kohn-Sham equation. A set of numerical algorithms, such as the matched interface and boundary method, the Dirichlet to Neumann mapping, Gummel iteration, and Krylov space techniques, is employed to improve the accuracy, efficiency, and robustness of model simulations. Finally, comparisons between the present model predictions and experimental data of current-voltage curves, as well as current-concentration curves of the Gramicidin A channel, verify our new model.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weston, Brian T.
This dissertation focuses on the development of a fully-implicit, high-order compressible ow solver with phase change. The work is motivated by laser-induced phase change applications, particularly by the need to develop large-scale multi-physics simulations of the selective laser melting (SLM) process in metal additive manufacturing (3D printing). Simulations of the SLM process require precise tracking of multi-material solid-liquid-gas interfaces, due to laser-induced melting/ solidi cation and evaporation/condensation of metal powder in an ambient gas. These rapid density variations and phase change processes tightly couple the governing equations, requiring a fully compressible framework to robustly capture the rapid density variations ofmore » the ambient gas and the melting/evaporation of the metal powder. For non-isothermal phase change, the velocity is gradually suppressed through the mushy region by a variable viscosity and Darcy source term model. The governing equations are discretized up to 4th-order accuracy with our reconstructed Discontinuous Galerkin spatial discretization scheme and up to 5th-order accuracy with L-stable fully implicit time discretization schemes (BDF2 and ESDIRK3-5). The resulting set of non-linear equations is solved using a robust Newton-Krylov method, with the Jacobian-free version of the GMRES solver for linear iterations. Due to the sti nes associated with the acoustic waves and thermal and viscous/material strength e ects, preconditioning the GMRES solver is essential. A robust and scalable approximate block factorization preconditioner was developed, which utilizes the velocity-pressure (vP) and velocity-temperature (vT) Schur complement systems. This multigrid block reduction preconditioning technique converges for high CFL/Fourier numbers and exhibits excellent parallel and algorithmic scalability on classic benchmark problems in uid dynamics (lid-driven cavity ow and natural convection heat transfer) as well as for laser-induced phase change problems in 2D and 3D.« less
Quantum Dynamics in Continuum for Proton Transport I: Basic Formulation.
Chen, Duan; Wei, Guo-Wei
2013-01-01
Proton transport is one of the most important and interesting phenomena in living cells. The present work proposes a multiscale/multiphysics model for the understanding of the molecular mechanism of proton transport in transmembrane proteins. We describe proton dynamics quantum mechanically via a density functional approach while implicitly model other solvent ions as a dielectric continuum to reduce the number of degrees of freedom. The densities of all other ions in the solvent are assumed to obey the Boltzmann distribution. The impact of protein molecular structure and its charge polarization on the proton transport is considered explicitly at the atomic level. We formulate a total free energy functional to put proton kinetic and potential energies as well as electrostatic energy of all ions on an equal footing. The variational principle is employed to derive nonlinear governing equations for the proton transport system. Generalized Poisson-Boltzmann equation and Kohn-Sham equation are obtained from the variational framework. Theoretical formulations for the proton density and proton conductance are constructed based on fundamental principles. The molecular surface of the channel protein is utilized to split the discrete protein domain and the continuum solvent domain, and facilitate the multiscale discrete/continuum/quantum descriptions. A number of mathematical algorithms, including the Dirichlet to Neumann mapping, matched interface and boundary method, Gummel iteration, and Krylov space techniques are utilized to implement the proposed model in a computationally efficient manner. The Gramicidin A (GA) channel is used to demonstrate the performance of the proposed proton transport model and validate the efficiency of proposed mathematical algorithms. The electrostatic characteristics of the GA channel is analyzed with a wide range of model parameters. The proton conductances are studied over a number of applied voltages and reference concentrations. A comparison with experimental data verifies the present model predictions and validates the proposed model.
Robust iterative method for nonlinear Helmholtz equation
NASA Astrophysics Data System (ADS)
Yuan, Lijun; Lu, Ya Yan
2017-08-01
A new iterative method is developed for solving the two-dimensional nonlinear Helmholtz equation which governs polarized light in media with the optical Kerr nonlinearity. In the strongly nonlinear regime, the nonlinear Helmholtz equation could have multiple solutions related to phenomena such as optical bistability and symmetry breaking. The new method exhibits a much more robust convergence behavior than existing iterative methods, such as frozen-nonlinearity iteration, Newton's method and damped Newton's method, and it can be used to find solutions when good initial guesses are unavailable. Numerical results are presented for the scattering of light by a nonlinear circular cylinder based on the exact nonlocal boundary condition and a pseudospectral method in the polar coordinate system.
Twostep-by-twostep PIRK-type PC methods with continuous output formulas
NASA Astrophysics Data System (ADS)
Cong, Nguyen Huu; Xuan, Le Ngoc
2008-11-01
This paper deals with parallel predictor-corrector (PC) iteration methods based on collocation Runge-Kutta (RK) corrector methods with continuous output formulas for solving nonstiff initial-value problems (IVPs) for systems of first-order differential equations. At nth step, the continuous output formulas are used not only for predicting the stage values in the PC iteration methods but also for calculating the step values at (n+2)th step. In this case, the integration processes can be proceeded twostep-by-twostep. The resulting twostep-by-twostep (TBT) parallel-iterated RK-type (PIRK-type) methods with continuous output formulas (twostep-by-twostep PIRKC methods or TBTPIRKC methods) give us a faster integration process. Fixed stepsize applications of these TBTPIRKC methods to a few widely-used test problems reveal that the new PC methods are much more efficient when compared with the well-known parallel-iterated RK methods (PIRK methods), parallel-iterated RK-type PC methods with continuous output formulas (PIRKC methods) and sequential explicit RK codes DOPRI5 and DOP853 available from the literature.
Development of iterative techniques for the solution of unsteady compressible viscous flows
NASA Technical Reports Server (NTRS)
Sankar, Lakshmi N.; Hixon, Duane
1992-01-01
The development of efficient iterative solution methods for the numerical solution of two- and three-dimensional compressible Navier-Stokes equations is discussed. Iterative time marching methods have several advantages over classical multi-step explicit time marching schemes, and non-iterative implicit time marching schemes. Iterative schemes have better stability characteristics than non-iterative explicit and implicit schemes. In this work, another approach based on the classical conjugate gradient method, known as the Generalized Minimum Residual (GMRES) algorithm is investigated. The GMRES algorithm has been used in the past by a number of researchers for solving steady viscous and inviscid flow problems. Here, we investigate the suitability of this algorithm for solving the system of non-linear equations that arise in unsteady Navier-Stokes solvers at each time step.
2012-03-22
with performance profiles, Math. Program., 91 (2002), pp. 201–213. [6] P. DRINEAS, R. KANNAN, AND M. W. MAHONEY , Fast Monte Carlo algorithms for matrices...computing invariant subspaces of non-Hermitian matri- ces, Numer. Math., 25 ( 1975 /76), pp. 123–136. [25] , Matrix algorithms Vol. II: Eigensystems
Transport synthetic acceleration with opposing reflecting boundary conditions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zika, M.R.; Adams, M.L.
2000-02-01
The transport synthetic acceleration (TSA) scheme is extended to problems with opposing reflecting boundary conditions. This synthetic method employs a simplified transport operator as its low-order approximation. A procedure is developed that allows the use of the conjugate gradient (CG) method to solve the resulting low-order system of equations. Several well-known transport iteration algorithms are cast in a linear algebraic form to show their equivalence to standard iterative techniques. Source iteration in the presence of opposing reflecting boundary conditions is shown to be equivalent to a (poorly) preconditioned stationary Richardson iteration, with the preconditioner defined by the method of iteratingmore » on the incident fluxes on the reflecting boundaries. The TSA method (and any synthetic method) amounts to a further preconditioning of the Richardson iteration. The presence of opposing reflecting boundary conditions requires special consideration when developing a procedure to realize the CG method for the proposed system of equations. The CG iteration may be applied only to symmetric positive definite matrices; this condition requires the algebraic elimination of the boundary angular corrections from the low-order equations. As a consequence of this elimination, evaluating the action of the resulting matrix on an arbitrary vector involves two transport sweeps and a transmission iteration. Results of applying the acceleration scheme to a simple test problem are presented.« less
Reducing the latency of the Fractal Iterative Method to half an iteration
NASA Astrophysics Data System (ADS)
Béchet, Clémentine; Tallon, Michel
2013-12-01
The fractal iterative method for atmospheric tomography (FRiM-3D) has been introduced to solve the wavefront reconstruction at the dimensions of an ELT with a low-computational cost. Previous studies reported the requirement of only 3 iterations of the algorithm in order to provide the best adaptive optics (AO) performance. Nevertheless, any iterative method in adaptive optics suffer from the intrinsic latency induced by the fact that one iteration can start only once the previous one is completed. Iterations hardly match the low-latency requirement of the AO real-time computer. We present here a new approach to avoid iterations in the computation of the commands with FRiM-3D, thus allowing low-latency AO response even at the scale of the European ELT (E-ELT). The method highlights the importance of "warm-start" strategy in adaptive optics. To our knowledge, this particular way to use the "warm-start" has not been reported before. Futhermore, removing the requirement of iterating to compute the commands, the computational cost of the reconstruction with FRiM-3D can be simplified and at least reduced to half the computational cost of a classical iteration. Thanks to simulations of both single-conjugate and multi-conjugate AO for the E-ELT,with FRiM-3D on Octopus ESO simulator, we demonstrate the benefit of this approach. We finally enhance the robustness of this new implementation with respect to increasing measurement noise, wind speed and even modeling errors.
Iterative deep convolutional encoder-decoder network for medical image segmentation.
Jung Uk Kim; Hak Gu Kim; Yong Man Ro
2017-07-01
In this paper, we propose a novel medical image segmentation using iterative deep learning framework. We have combined an iterative learning approach and an encoder-decoder network to improve segmentation results, which enables to precisely localize the regions of interest (ROIs) including complex shapes or detailed textures of medical images in an iterative manner. The proposed iterative deep convolutional encoder-decoder network consists of two main paths: convolutional encoder path and convolutional decoder path with iterative learning. Experimental results show that the proposed iterative deep learning framework is able to yield excellent medical image segmentation performances for various medical images. The effectiveness of the proposed method has been proved by comparing with other state-of-the-art medical image segmentation methods.
Fully implicit adaptive mesh refinement solver for 2D MHD
NASA Astrophysics Data System (ADS)
Philip, B.; Chacon, L.; Pernice, M.
2008-11-01
Application of implicit adaptive mesh refinement (AMR) to simulate resistive magnetohydrodynamics is described. Solving this challenging multi-scale, multi-physics problem can improve understanding of reconnection in magnetically-confined plasmas. AMR is employed to resolve extremely thin current sheets, essential for an accurate macroscopic description. Implicit time stepping allows us to accurately follow the dynamical time scale of the developing magnetic field, without being restricted by fast Alfven time scales. At each time step, the large-scale system of nonlinear equations is solved by a Jacobian-free Newton-Krylov method together with a physics-based preconditioner. Each block within the preconditioner is solved optimally using the Fast Adaptive Composite grid method, which can be considered as a multiplicative Schwarz method on AMR grids. We will demonstrate the excellent accuracy and efficiency properties of the method with several challenging reduced MHD applications, including tearing, island coalescence, and tilt instabilities. B. Philip, L. Chac'on, M. Pernice, J. Comput. Phys., in press (2008)
Lagardère, Louis; Jolly, Luc-Henri; Lipparini, Filippo; Aviat, Félix; Stamm, Benjamin; Jing, Zhifeng F.; Harger, Matthew; Torabifard, Hedieh; Cisneros, G. Andrés; Schnieders, Michael J.; Gresh, Nohad; Maday, Yvon; Ren, Pengyu Y.; Ponder, Jay W.
2017-01-01
We present Tinker-HP, a massively MPI parallel package dedicated to classical molecular dynamics (MD) and to multiscale simulations, using advanced polarizable force fields (PFF) encompassing distributed multipoles electrostatics. Tinker-HP is an evolution of the popular Tinker package code that conserves its simplicity of use and its reference double precision implementation for CPUs. Grounded on interdisciplinary efforts with applied mathematics, Tinker-HP allows for long polarizable MD simulations on large systems up to millions of atoms. We detail in the paper the newly developed extension of massively parallel 3D spatial decomposition to point dipole polarizable models as well as their coupling to efficient Krylov iterative and non-iterative polarization solvers. The design of the code allows the use of various computer systems ranging from laboratory workstations to modern petascale supercomputers with thousands of cores. Tinker-HP proposes therefore the first high-performance scalable CPU computing environment for the development of next generation point dipole PFFs and for production simulations. Strategies linking Tinker-HP to Quantum Mechanics (QM) in the framework of multiscale polarizable self-consistent QM/MD simulations are also provided. The possibilities, performances and scalability of the software are demonstrated via benchmarks calculations using the polarizable AMOEBA force field on systems ranging from large water boxes of increasing size and ionic liquids to (very) large biosystems encompassing several proteins as well as the complete satellite tobacco mosaic virus and ribosome structures. For small systems, Tinker-HP appears to be competitive with the Tinker-OpenMM GPU implementation of Tinker. As the system size grows, Tinker-HP remains operational thanks to its access to distributed memory and takes advantage of its new algorithmic enabling for stable long timescale polarizable simulations. Overall, a several thousand-fold acceleration over a single-core computation is observed for the largest systems. The extension of the present CPU implementation of Tinker-HP to other computational platforms is discussed. PMID:29732110
A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures
2014-01-01
Background Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information. Results We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0. Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure. Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets. Conclusions Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster. Iterative HFold and all data used in this work are freely available at http://www.cs.ubc.ca/~hjabbari/software.php. PMID:24884954
Iterative Strain-Gage Balance Calibration Data Analysis for Extended Independent Variable Sets
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2011-01-01
A new method was developed that makes it possible to use an extended set of independent calibration variables for an iterative analysis of wind tunnel strain gage balance calibration data. The new method permits the application of the iterative analysis method whenever the total number of balance loads and other independent calibration variables is greater than the total number of measured strain gage outputs. Iteration equations used by the iterative analysis method have the limitation that the number of independent and dependent variables must match. The new method circumvents this limitation. It simply adds a missing dependent variable to the original data set by using an additional independent variable also as an additional dependent variable. Then, the desired solution of the regression analysis problem can be obtained that fits each gage output as a function of both the original and additional independent calibration variables. The final regression coefficients can be converted to data reduction matrix coefficients because the missing dependent variables were added to the data set without changing the regression analysis result for each gage output. Therefore, the new method still supports the application of the two load iteration equation choices that the iterative method traditionally uses for the prediction of balance loads during a wind tunnel test. An example is discussed in the paper that illustrates the application of the new method to a realistic simulation of temperature dependent calibration data set of a six component balance.
A comparison theorem for the SOR iterative method
NASA Astrophysics Data System (ADS)
Sun, Li-Ying
2005-09-01
In 1997, Kohno et al. have reported numerically that the improving modified Gauss-Seidel method, which was referred to as the IMGS method, is superior to the SOR iterative method. In this paper, we prove that the spectral radius of the IMGS method is smaller than that of the SOR method and Gauss-Seidel method, if the relaxation parameter [omega][set membership, variant](0,1]. As a result, we prove theoretically that this method is succeeded in improving the convergence of some classical iterative methods. Some recent results are improved.
Iterative approach as alternative to S-matrix in modal methods
NASA Astrophysics Data System (ADS)
Semenikhin, Igor; Zanuccoli, Mauro
2014-12-01
The continuously increasing complexity of opto-electronic devices and the rising demands of simulation accuracy lead to the need of solving very large systems of linear equations making iterative methods promising and attractive from the computational point of view with respect to direct methods. In particular, iterative approach potentially enables the reduction of required computational time to solve Maxwell's equations by Eigenmode Expansion algorithms. Regardless of the particular eigenmodes finding method used, the expansion coefficients are computed as a rule by scattering matrix (S-matrix) approach or similar techniques requiring order of M3 operations. In this work we consider alternatives to the S-matrix technique which are based on pure iterative or mixed direct-iterative approaches. The possibility to diminish the impact of M3 -order calculations to overall time and in some cases even to reduce the number of arithmetic operations to M2 by applying iterative techniques are discussed. Numerical results are illustrated to discuss validity and potentiality of the proposed approaches.
Efficient solution of the simplified P N equations
Hamilton, Steven P.; Evans, Thomas M.
2014-12-23
We show new solver strategies for the multigroup SPN equations for nuclear reactor analysis. By forming the complete matrix over space, moments, and energy a robust set of solution strategies may be applied. Moreover, power iteration, shifted power iteration, Rayleigh quotient iteration, Arnoldi's method, and a generalized Davidson method, each using algebraic and physics-based multigrid preconditioners, have been compared on C5G7 MOX test problem as well as an operational PWR model. These results show that the most ecient approach is the generalized Davidson method, that is 30-40 times faster than traditional power iteration and 6-10 times faster than Arnoldi's method.
ERIC Educational Resources Information Center
Mathews, John H.
1989-01-01
Describes Newton's method to locate roots of an equation using the Newton-Raphson iteration formula. Develops an adaptive method overcoming limitations of the iteration method. Provides the algorithm and computer program of the adaptive Newton-Raphson method. (YP)
An accelerated subspace iteration for eigenvector derivatives
NASA Technical Reports Server (NTRS)
Ting, Tienko
1991-01-01
An accelerated subspace iteration method for calculating eigenvector derivatives has been developed. Factors affecting the effectiveness and the reliability of the subspace iteration are identified, and effective strategies concerning these factors are presented. The method has been implemented, and the results of a demonstration problem are presented.
Iterative CT reconstruction using coordinate descent with ordered subsets of data
NASA Astrophysics Data System (ADS)
Noo, F.; Hahn, K.; Schöndube, H.; Stierstorfer, K.
2016-04-01
Image reconstruction based on iterative minimization of a penalized weighted least-square criteria has become an important topic of research in X-ray computed tomography. This topic is motivated by increasing evidence that such a formalism may enable a significant reduction in dose imparted to the patient while maintaining or improving image quality. One important issue associated with this iterative image reconstruction concept is slow convergence and the associated computational effort. For this reason, there is interest in finding methods that produce approximate versions of the targeted image with a small number of iterations and an acceptable level of discrepancy. We introduce here a novel method to produce such approximations: ordered subsets in combination with iterative coordinate descent. Preliminary results demonstrate that this method can produce, within 10 iterations and using only a constant image as initial condition, satisfactory reconstructions that retain the noise properties of the targeted image.
Special class of nonlinear damping models in flexible space structures
NASA Technical Reports Server (NTRS)
Hu, Anren; Singh, Ramendra P.; Taylor, Lawrence W.
1991-01-01
A special class of nonlinear damping models is investigated in which the damping force is proportional to the product of positive integer or the fractional power of the absolute values of displacement and velocity. For a one-degree-of-freedom system, the classical Krylov-Bogoliubov 'averaging' method is used, whereas for a distributed system, both an ad hoc perturbation technique and the finite difference method are employed to study the effects of nonlinear damping. The results are compared with linear viscous damping models. The amplitude decrement of free vibration for a single mode system with nonlinear models depends not only on the damping ratio but also on the initial amplitude, the time to measure the response, the frequency of the system, and the powers of displacement and velocity. For the distributed system, the action of nonlinear damping is found to reduce the energy of the system and to pass energy to lower modes.
Design of a Variational Multiscale Method for Turbulent Compressible Flows
NASA Technical Reports Server (NTRS)
Diosady, Laslo Tibor; Murman, Scott M.
2013-01-01
A spectral-element framework is presented for the simulation of subsonic compressible high-Reynolds-number flows. The focus of the work is maximizing the efficiency of the computational schemes to enable unsteady simulations with a large number of spatial and temporal degrees of freedom. A collocation scheme is combined with optimized computational kernels to provide a residual evaluation with computational cost independent of order of accuracy up to 16th order. The optimized residual routines are used to develop a low-memory implicit scheme based on a matrix-free Newton-Krylov method. A preconditioner based on the finite-difference diagonalized ADI scheme is developed which maintains the low memory of the matrix-free implicit solver, while providing improved convergence properties. Emphasis on low memory usage throughout the solver development is leveraged to implement a coupled space-time DG solver which may offer further efficiency gains through adaptivity in both space and time.
The influence of the great inequality on the secular disturbing function of the planetary system.
NASA Technical Reports Server (NTRS)
Musen, P.
1971-01-01
This paper derives the contribution by the great inequality to the secular disturbing function of the principal planets. Andoyer's expansion of the planetary disturbing function and von Zeipel's method of eliminating the periodic terms is employed; thereby, the corrected secular disturbing function for the planetary system is derived. The conclusion is drawn that the canonicity of the equations for the secular variation of the heliocentric elements can be preserved if there be retained, in the secular disturbing function, terms only of the second and fourth order relative to the eccentricity and inclinations. The Krylov-Bogoliubov method is suggested for eliminating periodic terms, if it is desired to include the secular perturbations of the fifth and higher order in the heliocentric elements. The additional part of the secular disturbing function derived in this paper can be included in existing theories of the secular effects of principal planets.
Parallelized implicit propagators for the finite-difference Schrödinger equation
NASA Astrophysics Data System (ADS)
Parker, Jonathan; Taylor, K. T.
1995-08-01
We describe the application of block Gauss-Seidel and block Jacobi iterative methods to the design of implicit propagators for finite-difference models of the time-dependent Schrödinger equation. The block-wise iterative methods discussed here are mixed direct-iterative methods for solving simultaneous equations, in the sense that direct methods (e.g. LU decomposition) are used to invert certain block sub-matrices, and iterative methods are used to complete the solution. We describe parallel variants of the basic algorithm that are well suited to the medium- to coarse-grained parallelism of work-station clusters, and MIMD supercomputers, and we show that under a wide range of conditions, fine-grained parallelism of the computation can be achieved. Numerical tests are conducted on a typical one-electron atom Hamiltonian. The methods converge robustly to machine precision (15 significant figures), in some cases in as few as 6 or 7 iterations. The rate of convergence is nearly independent of the finite-difference grid-point separations.
SU-D-206-03: Segmentation Assisted Fast Iterative Reconstruction Method for Cone-Beam CT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, P; Mao, T; Gong, S
2016-06-15
Purpose: Total Variation (TV) based iterative reconstruction (IR) methods enable accurate CT image reconstruction from low-dose measurements with sparse projection acquisition, due to the sparsifiable feature of most CT images using gradient operator. However, conventional solutions require large amount of iterations to generate a decent reconstructed image. One major reason is that the expected piecewise constant property is not taken into consideration at the optimization starting point. In this work, we propose an iterative reconstruction method for cone-beam CT (CBCT) using image segmentation to guide the optimization path more efficiently on the regularization term at the beginning of the optimizationmore » trajectory. Methods: Our method applies general knowledge that one tissue component in the CT image contains relatively uniform distribution of CT number. This general knowledge is incorporated into the proposed reconstruction using image segmentation technique to generate the piecewise constant template on the first-pass low-quality CT image reconstructed using analytical algorithm. The template image is applied as an initial value into the optimization process. Results: The proposed method is evaluated on the Shepp-Logan phantom of low and high noise levels, and a head patient. The number of iterations is reduced by overall 40%. Moreover, our proposed method tends to generate a smoother reconstructed image with the same TV value. Conclusion: We propose a computationally efficient iterative reconstruction method for CBCT imaging. Our method achieves a better optimization trajectory and a faster convergence behavior. It does not rely on prior information and can be readily incorporated into existing iterative reconstruction framework. Our method is thus practical and attractive as a general solution to CBCT iterative reconstruction. This work is supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LR16F010001), National High-tech R&D Program for Young Scientists by the Ministry of Science and Technology of China (Grant No. 2015AA020917).« less
Linear solver performance in elastoplastic problem solution on GPU cluster
NASA Astrophysics Data System (ADS)
Khalevitsky, Yu. V.; Konovalov, A. V.; Burmasheva, N. V.; Partin, A. S.
2017-12-01
Applying the finite element method to severe plastic deformation problems involves solving linear equation systems. While the solution procedure is relatively hard to parallelize and computationally intensive by itself, a long series of large scale systems need to be solved for each problem. When dealing with fine computational meshes, such as in the simulations of three-dimensional metal matrix composite microvolume deformation, tens and hundreds of hours may be needed to complete the whole solution procedure, even using modern supercomputers. In general, one of the preconditioned Krylov subspace methods is used in a linear solver for such problems. The method convergence highly depends on the operator spectrum of a problem stiffness matrix. In order to choose the appropriate method, a series of computational experiments is used. Different methods may be preferable for different computational systems for the same problem. In this paper we present experimental data obtained by solving linear equation systems from an elastoplastic problem on a GPU cluster. The data can be used to substantiate the choice of the appropriate method for a linear solver to use in severe plastic deformation simulations.
ERIC Educational Resources Information Center
Dobbs, David E.
2009-01-01
The main purpose of this note is to present and justify proof via iteration as an intuitive, creative and empowering method that is often available and preferable as an alternative to proofs via either mathematical induction or the well-ordering principle. The method of iteration depends only on the fact that any strictly decreasing sequence of…
A new iterative triclass thresholding technique in image segmentation.
Cai, Hongmin; Yang, Zhong; Cao, Xinhua; Xia, Weiming; Xu, Xiaoyin
2014-03-01
We present a new method in image segmentation that is based on Otsu's method but iteratively searches for subregions of the image for segmentation, instead of treating the full image as a whole region for processing. The iterative method starts with Otsu's threshold and computes the mean values of the two classes as separated by the threshold. Based on the Otsu's threshold and the two mean values, the method separates the image into three classes instead of two as the standard Otsu's method does. The first two classes are determined as the foreground and background and they will not be processed further. The third class is denoted as a to-be-determined (TBD) region that is processed at next iteration. At the succeeding iteration, Otsu's method is applied on the TBD region to calculate a new threshold and two class means and the TBD region is again separated into three classes, namely, foreground, background, and a new TBD region, which by definition is smaller than the previous TBD regions. Then, the new TBD region is processed in the similar manner. The process stops when the Otsu's thresholds calculated between two iterations is less than a preset threshold. Then, all the intermediate foreground and background regions are, respectively, combined to create the final segmentation result. Tests on synthetic and real images showed that the new iterative method can achieve better performance than the standard Otsu's method in many challenging cases, such as identifying weak objects and revealing fine structures of complex objects while the added computational cost is minimal.
Banerjee, Amartya S.; Suryanarayana, Phanish; Pask, John E.
2016-01-21
Pulay's Direct Inversion in the Iterative Subspace (DIIS) method is one of the most widely used mixing schemes for accelerating the self-consistent solution of electronic structure problems. In this work, we propose a simple generalization of DIIS in which Pulay extrapolation is performed at periodic intervals rather than on every self-consistent field iteration, and linear mixing is performed on all other iterations. Lastly, we demonstrate through numerical tests on a wide variety of materials systems in the framework of density functional theory that the proposed generalization of Pulay's method significantly improves its robustness and efficiency.
NASA Technical Reports Server (NTRS)
Krogh, F. T.; Stewart, K.
1984-01-01
Methods based on backward differentiation formulas (BDFs) for solving stiff differential equations require iterating to approximate the solution of the corrector equation on each step. One hope for reducing the cost of this is to make do with iteration matrices that are known to have errors and to do no more iterations than are necessary to maintain the stability of the method. This paper, following work by Klopfenstein, examines the effect of errors in the iteration matrix on the stability of the method. Application of the results to an algorithm is discussed briefly.
A Build-Up Interior Method for Linear Programming: Affine Scaling Form
1990-02-01
initiating a major iteration imply convergence in a finite number of iterations. Each iteration t of the Dikin algorithm starts with an interior dual...this variant with the affine scaling method of Dikin [5] (in dual form). We have also looked into the analogous variant for the related Karmarkar’s...4] G. B. Dantzig, Linear Programming and Extensions (Princeton University Press, Princeton, NJ, 1963). [5] I. I. Dikin , "Iterative solution of
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Hanming; Wang, Linyuan; Li, Lei
2016-06-15
Purpose: Metal artifact reduction (MAR) is a major problem and a challenging issue in x-ray computed tomography (CT) examinations. Iterative reconstruction from sinograms unaffected by metals shows promising potential in detail recovery. This reconstruction has been the subject of much research in recent years. However, conventional iterative reconstruction methods easily introduce new artifacts around metal implants because of incomplete data reconstruction and inconsistencies in practical data acquisition. Hence, this work aims at developing a method to suppress newly introduced artifacts and improve the image quality around metal implants for the iterative MAR scheme. Methods: The proposed method consists of twomore » steps based on the general iterative MAR framework. An uncorrected image is initially reconstructed, and the corresponding metal trace is obtained. The iterative reconstruction method is then used to reconstruct images from the unaffected sinogram. In the reconstruction step of this work, an iterative strategy utilizing unmatched projector/backprojector pairs is used. A ramp filter is introduced into the back-projection procedure to restrain the inconsistency components in low frequencies and generate more reliable images of the regions around metals. Furthermore, a constrained total variation (TV) minimization model is also incorporated to enhance efficiency. The proposed strategy is implemented based on an iterative FBP and an alternating direction minimization (ADM) scheme, respectively. The developed algorithms are referred to as “iFBP-TV” and “TV-FADM,” respectively. Two projection-completion-based MAR methods and three iterative MAR methods are performed simultaneously for comparison. Results: The proposed method performs reasonably on both simulation and real CT-scanned datasets. This approach could reduce streak metal artifacts effectively and avoid the mentioned effects in the vicinity of the metals. The improvements are evaluated by inspecting regions of interest and by comparing the root-mean-square errors, normalized mean absolute distance, and universal quality index metrics of the images. Both iFBP-TV and TV-FADM methods outperform other counterparts in all cases. Unlike the conventional iterative methods, the proposed strategy utilizing unmatched projector/backprojector pairs shows excellent performance in detail preservation and prevention of the introduction of new artifacts. Conclusions: Qualitative and quantitative evaluations of experimental results indicate that the developed method outperforms classical MAR algorithms in suppressing streak artifacts and preserving the edge structural information of the object. In particular, structures lying close to metals can be gradually recovered because of the reduction of artifacts caused by inconsistency effects.« less
Fluid-structure interaction simulations of deformable structures with non-linear thin shell elements
NASA Astrophysics Data System (ADS)
Asgharzadeh, Hafez; Hedayat, Mohammadali; Borazjani, Iman; Scientific Computing; Biofluids Laboratory Team
2017-11-01
Large deformation of structures in a fluid is simulated using a strongly coupled partitioned fluid-structure interaction (FSI) approach which is stabilized with under-relaxation and the Aitken acceleration technique. The fluid is simulated using a recently developed implicit Newton-Krylov method with a novel analytical Jacobian. Structures are simulated using a triangular thin-shell finite element formulation, which considers only translational degrees of freedom. The thin-shell method is developed on the top of a previously implemented membrane finite element formulation. A sharp interface immersed boundary method is used to handle structures in the fluid domain. The developed FSI framework is validated against two three-dimensional experiments: (1) a flexible aquatic vegetation in the fluid and (2) a heaving flexible panel in fluid. Furthermore, the developed FSI framework is used to simulate tissue heart valves, which involve large deformations and non-linear material properties. This work was supported by American Heart Association (AHA) Grant 13SDG17220022 and the Center of Computational Research (CCR) of University at Buffalo.
Electromagnetic scattering of large structures in layered earths using integral equations
NASA Astrophysics Data System (ADS)
Xiong, Zonghou; Tripp, Alan C.
1995-07-01
An electromagnetic scattering algorithm for large conductivity structures in stratified media has been developed and is based on the method of system iteration and spatial symmetry reduction using volume electric integral equations. The method of system iteration divides a structure into many substructures and solves the resulting matrix equation using a block iterative method. The block submatrices usually need to be stored on disk in order to save computer core memory. However, this requires a large disk for large structures. If the body is discretized into equal-size cells it is possible to use the spatial symmetry relations of the Green's functions to regenerate the scattering impedance matrix in each iteration, thus avoiding expensive disk storage. Numerical tests show that the system iteration converges much faster than the conventional point-wise Gauss-Seidel iterative method. The numbers of cells do not significantly affect the rate of convergency. Thus the algorithm effectively reduces the solution of the scattering problem to an order of O(N2), instead of O(N3) as with direct solvers.
NASA Astrophysics Data System (ADS)
Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie
2018-06-01
Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
NASA Astrophysics Data System (ADS)
Kochmann, Julian; Wulfinghoff, Stephan; Ehle, Lisa; Mayer, Joachim; Svendsen, Bob; Reese, Stefanie
2017-09-01
Recently, two-scale FE-FFT-based methods (e.g., Spahn et al. in Comput Methods Appl Mech Eng 268:871-883, 2014; Kochmann et al. in Comput Methods Appl Mech Eng 305:89-110, 2016) have been proposed to predict the microscopic and overall mechanical behavior of heterogeneous materials. The purpose of this work is the extension to elasto-viscoplastic polycrystals, efficient and robust Fourier solvers and the prediction of micromechanical fields during macroscopic deformation processes. Assuming scale separation, the macroscopic problem is solved using the finite element method. The solution of the microscopic problem, which is embedded as a periodic unit cell (UC) in each macroscopic integration point, is found by employing fast Fourier transforms, fixed-point and Newton-Krylov methods. The overall material behavior is defined by the mean UC response. In order to ensure spatially converged micromechanical fields as well as feasible overall CPU times, an efficient but simple solution strategy for two-scale simulations is proposed. As an example, the constitutive behavior of 42CrMo4 steel is predicted during macroscopic three-point bending tests.
NASA Astrophysics Data System (ADS)
Kang, Seokkoo; Borazjani, Iman; Sotiropoulos, Fotis
2008-11-01
Unsteady 3D simulations of flows in natural streams is a challenging task due to the complexity of the bathymetry, the shallowness of the flow, and the presence of multiple nature- and man-made obstacles. This work is motivated by the need to develop a powerful numerical method for simulating such flows using coherent-structure-resolving turbulence models. We employ the curvilinear immersed boundary method of Ge and Sotiropoulos (Journal of Computational Physics, 2007) and address the critical issue of numerical efficiency in large aspect ratio computational domains and grids such as those encountered in long and shallow open channels. We show that the matrix-free Newton-Krylov method for solving the momentum equations coupled with an algebraic multigrid method with incomplete LU preconditioner for solving the Poisson equation yield a robust and efficient procedure for obtaining time-accurate solutions in such problems. We demonstrate the potential of the numerical approach by carrying out a direct numerical simulation of flow in a long and shallow meandering stream with multiple hydraulic structures.
Implicity restarted Arnoldi/Lanczos methods for large scale eigenvalue calculations
NASA Technical Reports Server (NTRS)
Sorensen, Danny C.
1996-01-01
Eigenvalues and eigenfunctions of linear operators are important to many areas of applied mathematics. The ability to approximate these quantities numerically is becoming increasingly important in a wide variety of applications. This increasing demand has fueled interest in the development of new methods and software for the numerical solution of large-scale algebraic eigenvalue problems. In turn, the existence of these new methods and software, along with the dramatically increased computational capabilities now available, has enabled the solution of problems that would not even have been posed five or ten years ago. Until very recently, software for large-scale nonsymmetric problems was virtually non-existent. Fortunately, the situation is improving rapidly. The purpose of this article is to provide an overview of the numerical solution of large-scale algebraic eigenvalue problems. The focus will be on a class of methods called Krylov subspace projection methods. The well-known Lanczos method is the premier member of this class. The Arnoldi method generalizes the Lanczos method to the nonsymmetric case. A recently developed variant of the Arnoldi/Lanczos scheme called the Implicitly Restarted Arnoldi Method is presented here in some depth. This method is highlighted because of its suitability as a basis for software development.
ERIC Educational Resources Information Center
Smith, Michael
1990-01-01
Presents several examples of the iteration method using computer spreadsheets. Examples included are simple iterative sequences and the solution of equations using the Newton-Raphson formula, linear interpolation, and interval bisection. (YP)
Iterative methods for tomography problems: implementation to a cross-well tomography problem
NASA Astrophysics Data System (ADS)
Karadeniz, M. F.; Weber, G. W.
2018-01-01
The velocity distribution between two boreholes is reconstructed by cross-well tomography, which is commonly used in geology. In this paper, iterative methods, Kaczmarz’s algorithm, algebraic reconstruction technique (ART), and simultaneous iterative reconstruction technique (SIRT), are implemented to a specific cross-well tomography problem. Convergence to the solution of these methods and their CPU time for the cross-well tomography problem are compared. Furthermore, these three methods for this problem are compared for different tolerance values.
Li, Chuan; Li, Lin; Zhang, Jie; Alexov, Emil
2012-01-01
The Gauss-Seidel method is a standard iterative numerical method widely used to solve a system of equations and, in general, is more efficient comparing to other iterative methods, such as the Jacobi method. However, standard implementation of the Gauss-Seidel method restricts its utilization in parallel computing due to its requirement of using updated neighboring values (i.e., in current iteration) as soon as they are available. Here we report an efficient and exact (not requiring assumptions) method to parallelize iterations and to reduce the computational time as a linear/nearly linear function of the number of CPUs. In contrast to other existing solutions, our method does not require any assumptions and is equally applicable for solving linear and nonlinear equations. This approach is implemented in the DelPhi program, which is a finite difference Poisson-Boltzmann equation solver to model electrostatics in molecular biology. This development makes the iterative procedure on obtaining the electrostatic potential distribution in the parallelized DelPhi several folds faster than that in the serial code. Further we demonstrate the advantages of the new parallelized DelPhi by computing the electrostatic potential and the corresponding energies of large supramolecular structures. PMID:22674480
Pseudo-time methods for constrained optimization problems governed by PDE
NASA Technical Reports Server (NTRS)
Taasan, Shlomo
1995-01-01
In this paper we present a novel method for solving optimization problems governed by partial differential equations. Existing methods are gradient information in marching toward the minimum, where the constrained PDE is solved once (sometimes only approximately) per each optimization step. Such methods can be viewed as a marching techniques on the intersection of the state and costate hypersurfaces while improving the residuals of the design equations per each iteration. In contrast, the method presented here march on the design hypersurface and at each iteration improve the residuals of the state and costate equations. The new method is usually much less expensive per iteration step since, in most problems of practical interest, the design equation involves much less unknowns that that of either the state or costate equations. Convergence is shown using energy estimates for the evolution equations governing the iterative process. Numerical tests show that the new method allows the solution of the optimization problem in a cost of solving the analysis problems just a few times, independent of the number of design parameters. The method can be applied using single grid iterations as well as with multigrid solvers.
A New Newton-Like Iterative Method for Roots of Analytic Functions
ERIC Educational Resources Information Center
Otolorin, Olayiwola
2005-01-01
A new Newton-like iterative formula for the solution of non-linear equations is proposed. To derive the formula, the convergence criteria of the one-parameter iteration formula, and also the quasilinearization in the derivation of Newton's formula are reviewed. The result is a new formula which eliminates the limitations of other methods. There is…
Iterative Nonlocal Total Variation Regularization Method for Image Restoration
Xu, Huanyu; Sun, Quansen; Luo, Nan; Cao, Guo; Xia, Deshen
2013-01-01
In this paper, a Bregman iteration based total variation image restoration algorithm is proposed. Based on the Bregman iteration, the algorithm splits the original total variation problem into sub-problems that are easy to solve. Moreover, non-local regularization is introduced into the proposed algorithm, and a method to choose the non-local filter parameter locally and adaptively is proposed. Experiment results show that the proposed algorithms outperform some other regularization methods. PMID:23776560
NASA Astrophysics Data System (ADS)
Schultz, A.
2010-12-01
3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (DDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system. We describe our ongoing efforts to achieve massive parallelization on a novel hybrid GPU testbed machine currently configured with 12 Intel Westmere Xeon CPU cores (or 24 parallel computational threads) with 96 GB DDR3 system memory, 4 GPU subsystems which in aggregate contain 960 NVidia Tesla GPU cores with 16 GB dedicated DDR3 GPU memory, and a second interleved bank of 4 GPU subsystems containing in aggregate 1792 NVidia Fermi GPU cores with 12 GB dedicated DDR5 GPU memory. We are applying domain decomposition methods to a modified version of Weiss' (2001) 3D frequency domain full physics EM finite difference code, an open source GPL licensed f90 code available for download from www.OpenEM.org. This will be the core of a new hybrid 3D inversion that parallelizes frequencies across CPUs and individual forward solutions across GPUs. We describe progress made in modifying the code to use direct solvers in GPU cores dedicated to each small subdomain, iteratively improving the solution by matching adjacent subdomain boundary solutions, rather than iterative Krylov space sparse solvers as currently applied to the whole domain.
Iterative integral parameter identification of a respiratory mechanics model.
Schranz, Christoph; Docherty, Paul D; Chiew, Yeong Shiong; Möller, Knut; Chase, J Geoffrey
2012-07-18
Patient-specific respiratory mechanics models can support the evaluation of optimal lung protective ventilator settings during ventilation therapy. Clinical application requires that the individual's model parameter values must be identified with information available at the bedside. Multiple linear regression or gradient-based parameter identification methods are highly sensitive to noise and initial parameter estimates. Thus, they are difficult to apply at the bedside to support therapeutic decisions. An iterative integral parameter identification method is applied to a second order respiratory mechanics model. The method is compared to the commonly used regression methods and error-mapping approaches using simulated and clinical data. The clinical potential of the method was evaluated on data from 13 Acute Respiratory Distress Syndrome (ARDS) patients. The iterative integral method converged to error minima 350 times faster than the Simplex Search Method using simulation data sets and 50 times faster using clinical data sets. Established regression methods reported erroneous results due to sensitivity to noise. In contrast, the iterative integral method was effective independent of initial parameter estimations, and converged successfully in each case tested. These investigations reveal that the iterative integral method is beneficial with respect to computing time, operator independence and robustness, and thus applicable at the bedside for this clinical application.
A fully-implicit high-order system thermal-hydraulics model for advanced non-LWR safety analyses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hu, Rui
An advanced system analysis tool is being developed for advanced reactor safety analysis. This paper describes the underlying physics and numerical models used in the code, including the governing equations, the stabilization schemes, the high-order spatial and temporal discretization schemes, and the Jacobian Free Newton Krylov solution method. The effects of the spatial and temporal discretization schemes are investigated. Additionally, a series of verification test problems are presented to confirm the high-order schemes. Furthermore, it is demonstrated that the developed system thermal-hydraulics model can be strictly verified with the theoretical convergence rates, and that it performs very well for amore » wide range of flow problems with high accuracy, efficiency, and minimal numerical diffusions.« less
A fully-implicit high-order system thermal-hydraulics model for advanced non-LWR safety analyses
Hu, Rui
2016-11-19
An advanced system analysis tool is being developed for advanced reactor safety analysis. This paper describes the underlying physics and numerical models used in the code, including the governing equations, the stabilization schemes, the high-order spatial and temporal discretization schemes, and the Jacobian Free Newton Krylov solution method. The effects of the spatial and temporal discretization schemes are investigated. Additionally, a series of verification test problems are presented to confirm the high-order schemes. Furthermore, it is demonstrated that the developed system thermal-hydraulics model can be strictly verified with the theoretical convergence rates, and that it performs very well for amore » wide range of flow problems with high accuracy, efficiency, and minimal numerical diffusions.« less
A Universal Tare Load Prediction Algorithm for Strain-Gage Balance Calibration Data Analysis
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2011-01-01
An algorithm is discussed that may be used to estimate tare loads of wind tunnel strain-gage balance calibration data. The algorithm was originally developed by R. Galway of IAR/NRC Canada and has been described in the literature for the iterative analysis technique. Basic ideas of Galway's algorithm, however, are universally applicable and work for both the iterative and the non-iterative analysis technique. A recent modification of Galway's algorithm is presented that improves the convergence behavior of the tare load prediction process if it is used in combination with the non-iterative analysis technique. The modified algorithm allows an analyst to use an alternate method for the calculation of intermediate non-linear tare load estimates whenever Galway's original approach does not lead to a convergence of the tare load iterations. It is also shown in detail how Galway's algorithm may be applied to the non-iterative analysis technique. Hand load data from the calibration of a six-component force balance is used to illustrate the application of the original and modified tare load prediction method. During the analysis of the data both the iterative and the non-iterative analysis technique were applied. Overall, predicted tare loads for combinations of the two tare load prediction methods and the two balance data analysis techniques showed excellent agreement as long as the tare load iterations converged. The modified algorithm, however, appears to have an advantage over the original algorithm when absolute voltage measurements of gage outputs are processed using the non-iterative analysis technique. In these situations only the modified algorithm converged because it uses an exact solution of the intermediate non-linear tare load estimate for the tare load iteration.
Development of iterative techniques for the solution of unsteady compressible viscous flows
NASA Technical Reports Server (NTRS)
Sankar, Lakshmi N.; Hixon, Duane
1991-01-01
Efficient iterative solution methods are being developed for the numerical solution of two- and three-dimensional compressible Navier-Stokes equations. Iterative time marching methods have several advantages over classical multi-step explicit time marching schemes, and non-iterative implicit time marching schemes. Iterative schemes have better stability characteristics than non-iterative explicit and implicit schemes. Thus, the extra work required by iterative schemes can also be designed to perform efficiently on current and future generation scalable, missively parallel machines. An obvious candidate for iteratively solving the system of coupled nonlinear algebraic equations arising in CFD applications is the Newton method. Newton's method was implemented in existing finite difference and finite volume methods. Depending on the complexity of the problem, the number of Newton iterations needed per step to solve the discretized system of equations can, however, vary dramatically from a few to several hundred. Another popular approach based on the classical conjugate gradient method, known as the GMRES (Generalized Minimum Residual) algorithm is investigated. The GMRES algorithm was used in the past by a number of researchers for solving steady viscous and inviscid flow problems with considerable success. Here, the suitability of this algorithm is investigated for solving the system of nonlinear equations that arise in unsteady Navier-Stokes solvers at each time step. Unlike the Newton method which attempts to drive the error in the solution at each and every node down to zero, the GMRES algorithm only seeks to minimize the L2 norm of the error. In the GMRES algorithm the changes in the flow properties from one time step to the next are assumed to be the sum of a set of orthogonal vectors. By choosing the number of vectors to a reasonably small value N (between 5 and 20) the work required for advancing the solution from one time step to the next may be kept to (N+1) times that of a noniterative scheme. Many of the operations required by the GMRES algorithm such as matrix-vector multiplies, matrix additions and subtractions can all be vectorized and parallelized efficiently.
Global Asymptotic Behavior of Iterative Implicit Schemes
NASA Technical Reports Server (NTRS)
Yee, H. C.; Sweby, P. K.
1994-01-01
The global asymptotic nonlinear behavior of some standard iterative procedures in solving nonlinear systems of algebraic equations arising from four implicit linear multistep methods (LMMs) in discretizing three models of 2 x 2 systems of first-order autonomous nonlinear ordinary differential equations (ODEs) is analyzed using the theory of dynamical systems. The iterative procedures include simple iteration and full and modified Newton iterations. The results are compared with standard Runge-Kutta explicit methods, a noniterative implicit procedure, and the Newton method of solving the steady part of the ODEs. Studies showed that aside from exhibiting spurious asymptotes, all of the four implicit LMMs can change the type and stability of the steady states of the differential equations (DEs). They also exhibit a drastic distortion but less shrinkage of the basin of attraction of the true solution than standard nonLMM explicit methods. The simple iteration procedure exhibits behavior which is similar to standard nonLMM explicit methods except that spurious steady-state numerical solutions cannot occur. The numerical basins of attraction of the noniterative implicit procedure mimic more closely the basins of attraction of the DEs and are more efficient than the three iterative implicit procedures for the four implicit LMMs. Contrary to popular belief, the initial data using the Newton method of solving the steady part of the DEs may not have to be close to the exact steady state for convergence. These results can be used as an explanation for possible causes and cures of slow convergence and nonconvergence of steady-state numerical solutions when using an implicit LMM time-dependent approach in computational fluid dynamics.
Implementation of an improved adaptive-implicit method in a thermal compositional simulator
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tan, T.B.
1988-11-01
A multicomponent thermal simulator with an adaptive-implicit-method (AIM) formulation/inexact-adaptive-Newton (IAN) method is presented. The final coefficient matrix retains the original banded structure so that conventional iterative methods can be used. Various methods for selection of the eliminated unknowns are tested. AIM/IAN method has a lower work count per Newtonian iteration than fully implicit methods, but a wrong choice of unknowns will result in excessive Newtonian iterations. For the problems tested, the residual-error method described in the paper for selecting implicit unknowns, together with the IAN method, had an improvement of up to 28% of the CPU time over the fullymore » implicit method.« less
NASA Astrophysics Data System (ADS)
Zhou, Y.; Zhang, X.; Xiao, W.
2018-04-01
As the geomagnetic sensor is susceptible to interference, a pre-processing total least square iteration method is proposed for calibration compensation. Firstly, the error model of the geomagnetic sensor is analyzed and the correction model is proposed, then the characteristics of the model are analyzed and converted into nine parameters. The geomagnetic data is processed by Hilbert transform (HHT) to improve the signal-to-noise ratio, and the nine parameters are calculated by using the combination of Newton iteration method and the least squares estimation method. The sifter algorithm is used to filter the initial value of the iteration to ensure that the initial error is as small as possible. The experimental results show that this method does not need additional equipment and devices, can continuously update the calibration parameters, and better than the two-step estimation method, it can compensate geomagnetic sensor error well.
Perturbation-iteration theory for analyzing microwave striplines
NASA Technical Reports Server (NTRS)
Kretch, B. E.
1985-01-01
A perturbation-iteration technique is presented for determining the propagation constant and characteristic impedance of an unshielded microstrip transmission line. The method converges to the correct solution with a few iterations at each frequency and is equivalent to a full wave analysis. The perturbation-iteration method gives a direct solution for the propagation constant without having to find the roots of a transcendental dispersion equation. The theory is presented in detail along with numerical results for the effective dielectric constant and characteristic impedance for a wide range of substrate dielectric constants, stripline dimensions, and frequencies.
Steady state numerical solutions for determining the location of MEMS on projectile
NASA Astrophysics Data System (ADS)
Abiprayu, K.; Abdigusna, M. F. F.; Gunawan, P. H.
2018-03-01
This paper is devoted to compare the numerical solutions for the steady and unsteady state heat distribution model on projectile. Here, the best location for installing of the MEMS on the projectile based on the surface temperature is investigated. Numerical iteration methods, Jacobi and Gauss-Seidel have been elaborated to solve the steady state heat distribution model on projectile. The results using Jacobi and Gauss-Seidel are shown identical but the discrepancy iteration cost for each methods is gained. Using Jacobi’s method, the iteration cost is 350 iterations. Meanwhile, using Gauss-Seidel 188 iterations are obtained, faster than the Jacobi’s method. The comparison of the simulation by steady state model and the unsteady state model by a reference is shown satisfying. Moreover, the best candidate for installing MEMS on projectile is observed at pointT(10, 0) which has the lowest temperature for the other points. The temperature using Jacobi and Gauss-Seidel for scenario 1 and 2 atT(10, 0) are 307 and 309 Kelvin respectively.
Material nonlinear analysis via mixed-iterative finite element method
NASA Technical Reports Server (NTRS)
Sutjahjo, Edhi; Chamis, Christos C.
1992-01-01
The performance of elastic-plastic mixed-iterative analysis is examined through a set of convergence studies. Membrane and bending behaviors are tested using 4-node quadrilateral finite elements. The membrane result is excellent, which indicates the implementation of elastic-plastic mixed-iterative analysis is appropriate. On the other hand, further research to improve bending performance of the method seems to be warranted.
Large Scale, High Resolution, Mantle Dynamics Modeling
NASA Astrophysics Data System (ADS)
Geenen, T.; Berg, A. V.; Spakman, W.
2007-12-01
To model the geodynamic evolution of plate convergence, subduction and collision and to allow for a connection to various types of observational data, geophysical, geodetical and geological, we developed a 4D (space-time) numerical mantle convection code. The model is based on a spherical 3D Eulerian fem model, with quadratic elements, on top of which we constructed a 3D Lagrangian particle in cell(PIC) method. We use the PIC method to transport material properties and to incorporate a viscoelastic rheology. Since capturing small scale processes associated with localization phenomena require a high resolution, we spend a considerable effort on implementing solvers suitable to solve for models with over 100 million degrees of freedom. We implemented Additive Schwartz type ILU based methods in combination with a Krylov solver, GMRES. However we found that for problems with over 500 thousend degrees of freedom the convergence of the solver degraded severely. This observation is known from the literature [Saad, 2003] and results from the local character of the ILU preconditioner resulting in a poor approximation of the inverse of A for large A. The size of A for which ILU is no longer usable depends on the condition of A and on the amount of fill in allowed for the ILU preconditioner. We found that for our problems with over 5×105 degrees of freedom convergence became to slow to solve the system within an acceptable amount of walltime, one minute, even when allowing for considerable amount of fill in. We also implemented MUMPS and found good scaling results for problems up to 107 degrees of freedom for up to 32 CPU¡¯s. For problems with over 100 million degrees of freedom we implemented Algebraic Multigrid type methods (AMG) from the ML library [Sala, 2006]. Since multigrid methods are most effective for single parameter problems, we rebuild our model to use the SIMPLE method in the Stokes solver [Patankar, 1980]. We present scaling results from these solvers for 3D spherical models. We also applied the above mentioned method to a high resolution (~ 1 km) 2D mantle convection model with temperature, pressure and phase dependent rheology including several phase transitions. We focus on a model of a subducting lithospheric slab which is subject to strong folding at the bottom of the mantle's D" region which includes the postperovskite phase boundary. For a detailed description of this model we refer to poster [Mantel convection models of the D" region, U17] [Saad, 2003] Saad, Y. (2003). Iterative methods for sparse linear systems. [Sala, 2006] Sala. M (2006) An Object-Oriented Framework for the Development of Scalable Parallel Multilevel Preconditioners. ACM Transactions on Mathematical Software, 32 (3), 2006 [Patankar, 1980] Patankar, S. V.(1980) Numerical Heat Transfer and Fluid Flow, Hemisphere, Washington.
On the solution of evolution equations based on multigrid and explicit iterative methods
NASA Astrophysics Data System (ADS)
Zhukov, V. T.; Novikova, N. D.; Feodoritova, O. B.
2015-08-01
Two schemes for solving initial-boundary value problems for three-dimensional parabolic equations are studied. One is implicit and is solved using the multigrid method, while the other is explicit iterative and is based on optimal properties of the Chebyshev polynomials. In the explicit iterative scheme, the number of iteration steps and the iteration parameters are chosen as based on the approximation and stability conditions, rather than on the optimization of iteration convergence to the solution of the implicit scheme. The features of the multigrid scheme include the implementation of the intergrid transfer operators for the case of discontinuous coefficients in the equation and the adaptation of the smoothing procedure to the spectrum of the difference operators. The results produced by these schemes as applied to model problems with anisotropic discontinuous coefficients are compared.
Differential Characteristics Based Iterative Multiuser Detection for Wireless Sensor Networks
Chen, Xiaoguang; Jiang, Xu; Wu, Zhilu; Zhuang, Shufeng
2017-01-01
High throughput, low latency and reliable communication has always been a hot topic for wireless sensor networks (WSNs) in various applications. Multiuser detection is widely used to suppress the bad effect of multiple access interference in WSNs. In this paper, a novel multiuser detection method based on differential characteristics is proposed to suppress multiple access interference. The proposed iterative receive method consists of three stages. Firstly, a differential characteristics function is presented based on the optimal multiuser detection decision function; then on the basis of differential characteristics, a preliminary threshold detection is utilized to find the potential wrongly received bits; after that an error bit corrector is employed to correct the wrong bits. In order to further lower the bit error ratio (BER), the differential characteristics calculation, threshold detection and error bit correction process described above are iteratively executed. Simulation results show that after only a few iterations the proposed multiuser detection method can achieve satisfactory BER performance. Besides, BER and near far resistance performance are much better than traditional suboptimal multiuser detection methods. Furthermore, the proposed iterative multiuser detection method also has a large system capacity. PMID:28212328
The application of contraction theory to an iterative formulation of electromagnetic scattering
NASA Technical Reports Server (NTRS)
Brand, J. C.; Kauffman, J. F.
1985-01-01
Contraction theory is applied to an iterative formulation of electromagnetic scattering from periodic structures and a computational method for insuring convergence is developed. A short history of spectral (or k-space) formulation is presented with an emphasis on application to periodic surfaces. To insure a convergent solution of the iterative equation, a process called the contraction corrector method is developed. Convergence properties of previously presented iterative solutions to one-dimensional problems are examined utilizing contraction theory and the general conditions for achieving a convergent solution are explored. The contraction corrector method is then applied to several scattering problems including an infinite grating of thin wires with the solution data compared to previous works.
Benzi, Michele; Evans, Thomas M.; Hamilton, Steven P.; ...
2017-03-05
Here, we consider hybrid deterministic-stochastic iterative algorithms for the solution of large, sparse linear systems. Starting from a convergent splitting of the coefficient matrix, we analyze various types of Monte Carlo acceleration schemes applied to the original preconditioned Richardson (stationary) iteration. We expect that these methods will have considerable potential for resiliency to faults when implemented on massively parallel machines. We also establish sufficient conditions for the convergence of the hybrid schemes, and we investigate different types of preconditioners including sparse approximate inverses. Numerical experiments on linear systems arising from the discretization of partial differential equations are presented.
A non-iterative extension of the multivariate random effects meta-analysis.
Makambi, Kepher H; Seung, Hyunuk
2015-01-01
Multivariate methods in meta-analysis are becoming popular and more accepted in biomedical research despite computational issues in some of the techniques. A number of approaches, both iterative and non-iterative, have been proposed including the multivariate DerSimonian and Laird method by Jackson et al. (2010), which is non-iterative. In this study, we propose an extension of the method by Hartung and Makambi (2002) and Makambi (2001) to multivariate situations. A comparison of the bias and mean square error from a simulation study indicates that, in some circumstances, the proposed approach perform better than the multivariate DerSimonian-Laird approach. An example is presented to demonstrate the application of the proposed approach.
IMPLEMENTATION AND VALIDATION OF A FULLY IMPLICIT ACCUMULATOR MODEL IN RELAP-7
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Haihua; Zou, Ling; Zhang, Hongbin
2016-01-01
This paper presents the implementation and validation of an accumulator model in RELAP-7 under the framework of preconditioned Jacobian free Newton Krylov (JFNK) method, based on the similar model used in RELAP5. RELAP-7 is a new nuclear reactor system safety analysis code being developed at the Idaho National Laboratory (INL). RELAP-7 is a fully implicit system code. The JFNK and preconditioning methods used in RELAP-7 is briefly discussed. The slightly modified accumulator model is summarized for completeness. The implemented model was validated with LOFT L3-1 test and benchmarked with RELAP5 results. RELAP-7 and RELAP5 had almost identical results for themore » accumulator gas pressure and water level, although there were some minor difference in other parameters such as accumulator gas temperature and tank wall temperature. One advantage of the JFNK method is its easiness to maintain and modify models due to fully separation of numerical methods from physical models. It would be straightforward to extend the current RELAP-7 accumulator model to simulate the advanced accumulator design.« less
Song, Junqiang; Leng, Hongze; Lu, Fengshun
2014-01-01
We present a new numerical method to get the approximate solutions of fractional differential equations. A new operational matrix of integration for fractional-order Legendre functions (FLFs) is first derived. Then a modified variational iteration formula which can avoid “noise terms” is constructed. Finally a numerical method based on variational iteration method (VIM) and FLFs is developed for fractional differential equations (FDEs). Block-pulse functions (BPFs) are used to calculate the FLFs coefficient matrices of the nonlinear terms. Five examples are discussed to demonstrate the validity and applicability of the technique. PMID:24511303
Hudson, H M; Ma, J; Green, P
1994-01-01
Many algorithms for medical image reconstruction adopt versions of the expectation-maximization (EM) algorithm. In this approach, parameter estimates are obtained which maximize a complete data likelihood or penalized likelihood, in each iteration. Implicitly (and sometimes explicitly) penalized algorithms require smoothing of the current reconstruction in the image domain as part of their iteration scheme. In this paper, we discuss alternatives to EM which adapt Fisher's method of scoring (FS) and other methods for direct maximization of the incomplete data likelihood. Jacobi and Gauss-Seidel methods for non-linear optimization provide efficient algorithms applying FS in tomography. One approach uses smoothed projection data in its iterations. We investigate the convergence of Jacobi and Gauss-Seidel algorithms with clinical tomographic projection data.
NASA Astrophysics Data System (ADS)
Shedge, Sapana V.; Pal, Sourav; Köster, Andreas M.
2011-07-01
Recently, two non-iterative approaches have been proposed to calculate response properties within density functional theory (DFT). These approaches are auxiliary density perturbation theory (ADPT) and the non-iterative approach to the coupled-perturbed Kohn-Sham (NIA-CPKS) method. Though both methods are non-iterative, they use different techniques to obtain the perturbed Kohn-Sham matrix. In this Letter, for the first time, both of these two independent methods have been used for the calculation of dipole-quadrupole polarizabilities. To validate these methods, three tetrahedral molecules viz., P4,CH4 and adamantane (C10H16) have been used as examples. The comparison with MP2 and CCSD proves the reliability of the methodology.
NASA Technical Reports Server (NTRS)
Brand, J. C.
1985-01-01
Contraction theory is applied to an iterative formulation of electromagnetic scattering from periodic structures and a computational method for insuring convergence is developed. A short history of spectral (or k-space) formulation is presented with an emphasis on application to periodic surfaces. The mathematical background for formulating an iterative equation is covered using straightforward single variable examples including an extension to vector spaces. To insure a convergent solution of the iterative equation, a process called the contraction corrector method is developed. Convergence properties of previously presented iterative solutions to one-dimensional problems are examined utilizing contraction theory and the general conditions for achieving a convergent solution are explored. The contraction corrector method is then applied to several scattering problems including an infinite grating of thin wires with the solution data compared to previous works.
Iterative methods for elliptic finite element equations on general meshes
NASA Technical Reports Server (NTRS)
Nicolaides, R. A.; Choudhury, Shenaz
1986-01-01
Iterative methods for arbitrary mesh discretizations of elliptic partial differential equations are surveyed. The methods discussed are preconditioned conjugate gradients, algebraic multigrid, deflated conjugate gradients, an element-by-element techniques, and domain decomposition. Computational results are included.
NASA Technical Reports Server (NTRS)
Johnson, F. T.; Samant, S. S.; Bieterman, M. B.; Melvin, R. G.; Young, D. P.; Bussoletti, J. E.; Hilmes, C. L.
1992-01-01
A new computer program, called TranAir, for analyzing complex configurations in transonic flow (with subsonic or supersonic freestream) was developed. This program provides accurate and efficient simulations of nonlinear aerodynamic flows about arbitrary geometries with the ease and flexibility of a typical panel method program. The numerical method implemented in TranAir is described. The method solves the full potential equation subject to a set of general boundary conditions and can handle regions with differing total pressure and temperature. The boundary value problem is discretized using the finite element method on a locally refined rectangular grid. The grid is automatically constructed by the code and is superimposed on the boundary described by networks of panels; thus no surface fitted grid generation is required. The nonlinear discrete system arising from the finite element method is solved using a preconditioned Krylov subspace method embedded in an inexact Newton method. The solution is obtained on a sequence of successively refined grids which are either constructed adaptively based on estimated solution errors or are predetermined based on user inputs. Many results obtained by using TranAir to analyze aerodynamic configurations are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jing Yanfei, E-mail: yanfeijing@uestc.edu.c; Huang Tingzhu, E-mail: tzhuang@uestc.edu.c; Duan Yong, E-mail: duanyong@yahoo.c
This study is mainly focused on iterative solutions with simple diagonal preconditioning to two complex-valued nonsymmetric systems of linear equations arising from a computational chemistry model problem proposed by Sherry Li of NERSC. Numerical experiments show the feasibility of iterative methods to some extent when applied to the problems and reveal the competitiveness of our recently proposed Lanczos biconjugate A-orthonormalization methods to other classic and popular iterative methods. By the way, experiment results also indicate that application specific preconditioners may be mandatory and required for accelerating convergence.
NASA Astrophysics Data System (ADS)
Woollands, Robyn M.; Read, Julie L.; Probe, Austin B.; Junkins, John L.
2017-12-01
We present a new method for solving the multiple revolution perturbed Lambert problem using the method of particular solutions and modified Chebyshev-Picard iteration. The method of particular solutions differs from the well-known Newton-shooting method in that integration of the state transition matrix (36 additional differential equations) is not required, and instead it makes use of a reference trajectory and a set of n particular solutions. Any numerical integrator can be used for solving two-point boundary problems with the method of particular solutions, however we show that using modified Chebyshev-Picard iteration affords an avenue for increased efficiency that is not available with other step-by-step integrators. We take advantage of the path approximation nature of modified Chebyshev-Picard iteration (nodes iteratively converge to fixed points in space) and utilize a variable fidelity force model for propagating the reference trajectory. Remarkably, we demonstrate that computing the particular solutions with only low fidelity function evaluations greatly increases the efficiency of the algorithm while maintaining machine precision accuracy. Our study reveals that solving the perturbed Lambert's problem using the method of particular solutions with modified Chebyshev-Picard iteration is about an order of magnitude faster compared with the classical shooting method and a tenth-twelfth order Runge-Kutta integrator. It is well known that the solution to Lambert's problem over multiple revolutions is not unique and to ensure that all possible solutions are considered we make use of a reliable preexisting Keplerian Lambert solver to warm start our perturbed algorithm.
de Lima, Camila; Salomão Helou, Elias
2018-01-01
Iterative methods for tomographic image reconstruction have the computational cost of each iteration dominated by the computation of the (back)projection operator, which take roughly O(N 3 ) floating point operations (flops) for N × N pixels images. Furthermore, classical iterative algorithms may take too many iterations in order to achieve acceptable images, thereby making the use of these techniques unpractical for high-resolution images. Techniques have been developed in the literature in order to reduce the computational cost of the (back)projection operator to O(N 2 logN) flops. Also, incremental algorithms have been devised that reduce by an order of magnitude the number of iterations required to achieve acceptable images. The present paper introduces an incremental algorithm with a cost of O(N 2 logN) flops per iteration and applies it to the reconstruction of very large tomographic images obtained from synchrotron light illuminated data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gao, H
Purpose: This work is to develop a general framework, namely filtered iterative reconstruction (FIR) method, to incorporate analytical reconstruction (AR) method into iterative reconstruction (IR) method, for enhanced CT image quality. Methods: FIR is formulated as a combination of filtered data fidelity and sparsity regularization, and then solved by proximal forward-backward splitting (PFBS) algorithm. As a result, the image reconstruction decouples data fidelity and image regularization with a two-step iterative scheme, during which an AR-projection step updates the filtered data fidelity term, while a denoising solver updates the sparsity regularization term. During the AR-projection step, the image is projected tomore » the data domain to form the data residual, and then reconstructed by certain AR to a residual image which is in turn weighted together with previous image iterate to form next image iterate. Since the eigenvalues of AR-projection operator are close to the unity, PFBS based FIR has a fast convergence. Results: The proposed FIR method is validated in the setting of circular cone-beam CT with AR being FDK and total-variation sparsity regularization, and has improved image quality from both AR and IR. For example, AIR has improved visual assessment and quantitative measurement in terms of both contrast and resolution, and reduced axial and half-fan artifacts. Conclusion: FIR is proposed to incorporate AR into IR, with an efficient image reconstruction algorithm based on PFBS. The CBCT results suggest that FIR synergizes AR and IR with improved image quality and reduced axial and half-fan artifacts. The authors was partially supported by the NSFC (#11405105), the 973 Program (#2015CB856000), and the Shanghai Pujiang Talent Program (#14PJ1404500).« less
Accelerating the weighted histogram analysis method by direct inversion in the iterative subspace.
Zhang, Cheng; Lai, Chun-Liang; Pettitt, B Montgomery
The weighted histogram analysis method (WHAM) for free energy calculations is a valuable tool to produce free energy differences with the minimal errors. Given multiple simulations, WHAM obtains from the distribution overlaps the optimal statistical estimator of the density of states, from which the free energy differences can be computed. The WHAM equations are often solved by an iterative procedure. In this work, we use a well-known linear algebra algorithm which allows for more rapid convergence to the solution. We find that the computational complexity of the iterative solution to WHAM and the closely-related multiple Bennett acceptance ratio (MBAR) method can be improved by using the method of direct inversion in the iterative subspace. We give examples from a lattice model, a simple liquid and an aqueous protein solution.
Chen, Kun; Zhang, Hongyuan; Wei, Haoyun; Li, Yan
2014-08-20
In this paper, we propose an improved subtraction algorithm for rapid recovery of Raman spectra that can substantially reduce the computation time. This algorithm is based on an improved Savitzky-Golay (SG) iterative smoothing method, which involves two key novel approaches: (a) the use of the Gauss-Seidel method and (b) the introduction of a relaxation factor into the iterative procedure. By applying a novel successive relaxation (SG-SR) iterative method to the relaxation factor, additional improvement in the convergence speed over the standard Savitzky-Golay procedure is realized. The proposed improved algorithm (the RIA-SG-SR algorithm), which uses SG-SR-based iteration instead of Savitzky-Golay iteration, has been optimized and validated with a mathematically simulated Raman spectrum, as well as experimentally measured Raman spectra from non-biological and biological samples. The method results in a significant reduction in computing cost while yielding consistent rejection of fluorescence and noise for spectra with low signal-to-fluorescence ratios and varied baselines. In the simulation, RIA-SG-SR achieved 1 order of magnitude improvement in iteration number and 2 orders of magnitude improvement in computation time compared with the range-independent background-subtraction algorithm (RIA). Furthermore the computation time of the experimentally measured raw Raman spectrum processing from skin tissue decreased from 6.72 to 0.094 s. In general, the processing of the SG-SR method can be conducted within dozens of milliseconds, which can provide a real-time procedure in practical situations.
Leiner, Claude; Nemitz, Wolfgang; Schweitzer, Susanne; Kuna, Ladislav; Wenzl, Franz P; Hartmann, Paul; Satzinger, Valentin; Sommer, Christian
2016-03-20
We show that with an appropriate combination of two optical simulation techniques-classical ray-tracing and the finite difference time domain method-an optical device containing multiple diffractive and refractive optical elements can be accurately simulated in an iterative simulation approach. We compare the simulation results with experimental measurements of the device to discuss the applicability and accuracy of our iterative simulation procedure.
NASA Astrophysics Data System (ADS)
Vasil'ev, V. I.; Kardashevsky, A. M.; Popov, V. V.; Prokopev, G. A.
2017-10-01
This article presents results of computational experiment carried out using a finite-difference method for solving the inverse Cauchy problem for a two-dimensional elliptic equation. The computational algorithm involves an iterative determination of the missing boundary condition from the override condition using the conjugate gradient method. The results of calculations are carried out on the examples with exact solutions as well as at specifying an additional condition with random errors are presented. Results showed a high efficiency of the iterative method of conjugate gradients for numerical solution
NASA Technical Reports Server (NTRS)
Gartling, D. K.; Roache, P. J.
1978-01-01
The efficiency characteristics of finite element and finite difference approximations for the steady-state solution of the Navier-Stokes equations are examined. The finite element method discussed is a standard Galerkin formulation of the incompressible, steady-state Navier-Stokes equations. The finite difference formulation uses simple centered differences that are O(delta x-squared). Operation counts indicate that a rapidly converging Newton-Raphson-Kantorovitch iteration scheme is generally preferable over a Picard method. A split NOS Picard iterative algorithm for the finite difference method was most efficient.
Continuous analog of multiplicative algebraic reconstruction technique for computed tomography
NASA Astrophysics Data System (ADS)
Tateishi, Kiyoko; Yamaguchi, Yusaku; Abou Al-Ola, Omar M.; Kojima, Takeshi; Yoshinaga, Tetsuya
2016-03-01
We propose a hybrid dynamical system as a continuous analog to the block-iterative multiplicative algebraic reconstruction technique (BI-MART), which is a well-known iterative image reconstruction algorithm for computed tomography. The hybrid system is described by a switched nonlinear system with a piecewise smooth vector field or differential equation and, for consistent inverse problems, the convergence of non-negatively constrained solutions to a globally stable equilibrium is guaranteed by the Lyapunov theorem. Namely, we can prove theoretically that a weighted Kullback-Leibler divergence measure can be a common Lyapunov function for the switched system. We show that discretizing the differential equation by using the first-order approximation (Euler's method) based on the geometric multiplicative calculus leads to the same iterative formula of the BI-MART with the scaling parameter as a time-step of numerical discretization. The present paper is the first to reveal that a kind of iterative image reconstruction algorithm is constructed by the discretization of a continuous-time dynamical system for solving tomographic inverse problems. Iterative algorithms with not only the Euler method but also the Runge-Kutta methods of lower-orders applied for discretizing the continuous-time system can be used for image reconstruction. A numerical example showing the characteristics of the discretized iterative methods is presented.
Solving large mixed linear models using preconditioned conjugate gradient iteration.
Strandén, I; Lidauer, M
1999-12-01
Continuous evaluation of dairy cattle with a random regression test-day model requires a fast solving method and algorithm. A new computing technique feasible in Jacobi and conjugate gradient based iterative methods using iteration on data is presented. In the new computing technique, the calculations in multiplication of a vector by a matrix were recorded to three steps instead of the commonly used two steps. The three-step method was implemented in a general mixed linear model program that used preconditioned conjugate gradient iteration. Performance of this program in comparison to other general solving programs was assessed via estimation of breeding values using univariate, multivariate, and random regression test-day models. Central processing unit time per iteration with the new three-step technique was, at best, one-third that needed with the old technique. Performance was best with the test-day model, which was the largest and most complex model used. The new program did well in comparison to other general software. Programs keeping the mixed model equations in random access memory required at least 20 and 435% more time to solve the univariate and multivariate animal models, respectively. Computations of the second best iteration on data took approximately three and five times longer for the animal and test-day models, respectively, than did the new program. Good performance was due to fast computing time per iteration and quick convergence to the final solutions. Use of preconditioned conjugate gradient based methods in solving large breeding value problems is supported by our findings.
A heuristic statistical stopping rule for iterative reconstruction in emission tomography.
Ben Bouallègue, F; Crouzet, J F; Mariano-Goulart, D
2013-01-01
We propose a statistical stopping criterion for iterative reconstruction in emission tomography based on a heuristic statistical description of the reconstruction process. The method was assessed for MLEM reconstruction. Based on Monte-Carlo numerical simulations and using a perfectly modeled system matrix, our method was compared with classical iterative reconstruction followed by low-pass filtering in terms of Euclidian distance to the exact object, noise, and resolution. The stopping criterion was then evaluated with realistic PET data of a Hoffman brain phantom produced using the GATE platform for different count levels. The numerical experiments showed that compared with the classical method, our technique yielded significant improvement of the noise-resolution tradeoff for a wide range of counting statistics compatible with routine clinical settings. When working with realistic data, the stopping rule allowed a qualitatively and quantitatively efficient determination of the optimal image. Our method appears to give a reliable estimation of the optimal stopping point for iterative reconstruction. It should thus be of practical interest as it produces images with similar or better quality than classical post-filtered iterative reconstruction with a mastered computation time.
A Method to Solve Interior and Exterior Camera Calibration Parameters for Image Resection
NASA Technical Reports Server (NTRS)
Samtaney, Ravi
1999-01-01
An iterative method is presented to solve the internal and external camera calibration parameters, given model target points and their images from one or more camera locations. The direct linear transform formulation was used to obtain a guess for the iterative method, and herein lies one of the strengths of the present method. In all test cases, the method converged to the correct solution. In general, an overdetermined system of nonlinear equations is solved in the least-squares sense. The iterative method presented is based on Newton-Raphson for solving systems of nonlinear algebraic equations. The Jacobian is analytically derived and the pseudo-inverse of the Jacobian is obtained by singular value decomposition.
NASA Astrophysics Data System (ADS)
Chew, J. V. L.; Sulaiman, J.
2017-09-01
Partial differential equations that are used in describing the nonlinear heat and mass transfer phenomena are difficult to be solved. For the case where the exact solution is difficult to be obtained, it is necessary to use a numerical procedure such as the finite difference method to solve a particular partial differential equation. In term of numerical procedure, a particular method can be considered as an efficient method if the method can give an approximate solution within the specified error with the least computational complexity. Throughout this paper, the two-dimensional Porous Medium Equation (2D PME) is discretized by using the implicit finite difference scheme to construct the corresponding approximation equation. Then this approximation equation yields a large-sized and sparse nonlinear system. By using the Newton method to linearize the nonlinear system, this paper deals with the application of the Four-Point Newton-EGSOR (4NEGSOR) iterative method for solving the 2D PMEs. In addition to that, the efficiency of the 4NEGSOR iterative method is studied by solving three examples of the problems. Based on the comparative analysis, the Newton-Gauss-Seidel (NGS) and the Newton-SOR (NSOR) iterative methods are also considered. The numerical findings show that the 4NEGSOR method is superior to the NGS and the NSOR methods in terms of the number of iterations to get the converged solutions, the time of computation and the maximum absolute errors produced by the methods.
Modules and methods for all photonic computing
Schultz, David R.; Ma, Chao Hung
2001-01-01
A method for all photonic computing, comprising the steps of: encoding a first optical/electro-optical element with a two dimensional mathematical function representing input data; illuminating the first optical/electro-optical element with a collimated beam of light; illuminating a second optical/electro-optical element with light from the first optical/electro-optical element, the second optical/electro-optical element having a characteristic response corresponding to an iterative algorithm useful for solving a partial differential equation; iteratively recirculating the signal through the second optical/electro-optical element with light from the second optical/electro-optical element for a predetermined number of iterations; and, after the predetermined number of iterations, optically and/or electro-optically collecting output data representing an iterative optical solution from the second optical/electro-optical element.
Use of upscaled elevation and surface roughness data in two-dimensional surface water models
Hughes, J.D.; Decker, J.D.; Langevin, C.D.
2011-01-01
In this paper, we present an approach that uses a combination of cell-block- and cell-face-averaging of high-resolution cell elevation and roughness data to upscale hydraulic parameters and accurately simulate surface water flow in relatively low-resolution numerical models. The method developed allows channelized features that preferentially connect large-scale grid cells at cell interfaces to be represented in models where these features are significantly smaller than the selected grid size. The developed upscaling approach has been implemented in a two-dimensional finite difference model that solves a diffusive wave approximation of the depth-integrated shallow surface water equations using preconditioned Newton–Krylov methods. Computational results are presented to show the effectiveness of the mixed cell-block and cell-face averaging upscaling approach in maintaining model accuracy, reducing model run-times, and how decreased grid resolution affects errors. Application examples demonstrate that sub-grid roughness coefficient variations have a larger effect on simulated error than sub-grid elevation variations.
An Object-Oriented Finite Element Framework for Multiphysics Phase Field Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Michael R Tonks; Derek R Gaston; Paul C Millett
2012-01-01
The phase field approach is a powerful and popular method for modeling microstructure evolution. In this work, advanced numerical tools are used to create a phase field framework that facilitates rapid model development. This framework, called MARMOT, is based on Idaho National Laboratory's finite element Multiphysics Object-Oriented Simulation Environment. In MARMOT, the system of phase field partial differential equations (PDEs) are solved simultaneously with PDEs describing additional physics, such as solid mechanics and heat conduction, using the Jacobian-Free Newton Krylov Method. An object-oriented architecture is created by taking advantage of commonalities in phase fields models to facilitate development of newmore » models with very little written code. In addition, MARMOT provides access to mesh and time step adaptivity, reducing the cost for performing simulations with large disparities in both spatial and temporal scales. In this work, phase separation simulations are used to show the numerical performance of MARMOT. Deformation-induced grain growth and void growth simulations are included to demonstrate the muliphysics capability.« less
MOOSE: A parallel computational framework for coupled systems of nonlinear equations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Derek Gaston; Chris Newman; Glen Hansen
Systems of coupled, nonlinear partial differential equations (PDEs) often arise in simulation of nuclear processes. MOOSE: Multiphysics Object Oriented Simulation Environment, a parallel computational framework targeted at the solution of such systems, is presented. As opposed to traditional data-flow oriented computational frameworks, MOOSE is instead founded on the mathematical principle of Jacobian-free Newton-Krylov (JFNK) solution methods. Utilizing the mathematical structure present in JFNK, physics expressions are modularized into `Kernels,'' allowing for rapid production of new simulation tools. In addition, systems are solved implicitly and fully coupled, employing physics based preconditioning, which provides great flexibility even with large variance in timemore » scales. A summary of the mathematics, an overview of the structure of MOOSE, and several representative solutions from applications built on the framework are presented.« less
Convergence Results on Iteration Algorithms to Linear Systems
Wang, Zhuande; Yang, Chuansheng; Yuan, Yubo
2014-01-01
In order to solve the large scale linear systems, backward and Jacobi iteration algorithms are employed. The convergence is the most important issue. In this paper, a unified backward iterative matrix is proposed. It shows that some well-known iterative algorithms can be deduced with it. The most important result is that the convergence results have been proved. Firstly, the spectral radius of the Jacobi iterative matrix is positive and the one of backward iterative matrix is strongly positive (lager than a positive constant). Secondly, the mentioned two iterations have the same convergence results (convergence or divergence simultaneously). Finally, some numerical experiments show that the proposed algorithms are correct and have the merit of backward methods. PMID:24991640
Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems.
Wei, Qinglai; Liu, Derong; Lin, Hanquan
2016-03-01
In this paper, a value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite horizon undiscounted optimal control problems for discrete-time nonlinear systems. The present value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize the algorithm. A novel convergence analysis is developed to guarantee that the iterative value function converges to the optimal performance index function. Initialized by different initial functions, it is proven that the iterative value function will be monotonically nonincreasing, monotonically nondecreasing, or nonmonotonic and will converge to the optimum. In this paper, for the first time, the admissibility properties of the iterative control laws are developed for value iteration algorithms. It is emphasized that new termination criteria are established to guarantee the effectiveness of the iterative control laws. Neural networks are used to approximate the iterative value function and compute the iterative control law, respectively, for facilitating the implementation of the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the present method.
Molecular Quantum Mechanics: Analytic Gradients and Beyond - Program and Abstracts
2007-06-03
Kutzelnigg (Bochum, Germany) Chair: Pekka Pyykko (Helsinki, Finland) Which Masses are Vibrating or Rotating in a Molecule? 15:40-16:15 O30...Krylov (Los Angeles, CA, U.S.A.) Multiconfigurational Quantum Chemistry for Actinide Containing Systems: From Isolated Molecules to Condensed...the genetic algorithm will be critically assessed. For B4n, the double rings are notably stable. The DFT calculations provide strong indications of
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dul, F.A.; Arczewski, K.
1994-03-01
Although it has been stated that [open quotes]an attempt to solve (very large problems) by subspace iterations seems futile[close quotes], we will show that the statement is not true, especially for extremely large eigenproblems. In this paper a new two-phase subspace iteration/Rayleigh quotient/conjugate gradient method for generalized, large, symmetric eigenproblems Ax = [lambda]Bx is presented. It has the ability of solving extremely large eigenproblems, N = 216,000, for example, and finding a large number of leftmost or rightmost eigenpairs, up to 1000 or more. Multiple eigenpairs, even those with multiplicity 100, can be easily found. The use of the proposedmore » method for solving the big full eigenproblems (N [approximately] 10[sup 3]), as well as for large weakly non-symmetric eigenproblems, have been considered also. The proposed method is fully iterative; thus the factorization of matrices ins avoided. The key idea consists in joining two methods: subspace and Rayleigh quotient iterations. The systems of indefinite and almost singular linear equations (a - [sigma]B)x = By are solved by various iterative conjugate gradient method can be used without danger of breaking down due to its property that may be called [open quotes]self-correction towards the eigenvector,[close quotes] discovered recently by us. The use of various preconditioners (SSOR and IC) has also been considered. The main features of the proposed method have been analyzed in detail. Comparisons with other methods, such as, accelerated subspace iteration, Lanczos, Davidson, TLIME, TRACMN, and SRQMCG, are presented. The results of numerical tests for various physical problems (acoustic, vibrations of structures, quantum chemistry) are presented as well. 40 refs., 12 figs., 2 tabs.« less
Use of direct and iterative solvers for estimation of SNP effects in genome-wide selection
2010-01-01
The aim of this study was to compare iterative and direct solvers for estimation of marker effects in genomic selection. One iterative and two direct methods were used: Gauss-Seidel with Residual Update, Cholesky Decomposition and Gentleman-Givens rotations. For resembling different scenarios with respect to number of markers and of genotyped animals, a simulated data set divided into 25 subsets was used. Number of markers ranged from 1,200 to 5,925 and number of animals ranged from 1,200 to 5,865. Methods were also applied to real data comprising 3081 individuals genotyped for 45181 SNPs. Results from simulated data showed that the iterative solver was substantially faster than direct methods for larger numbers of markers. Use of a direct solver may allow for computing (co)variances of SNP effects. When applied to real data, performance of the iterative method varied substantially, depending on the level of ill-conditioning of the coefficient matrix. From results with real data, Gentleman-Givens rotations would be the method of choice in this particular application as it provided an exact solution within a fairly reasonable time frame (less than two hours). It would indeed be the preferred method whenever computer resources allow its use. PMID:21637627
Shading correction assisted iterative cone-beam CT reconstruction
NASA Astrophysics Data System (ADS)
Yang, Chunlin; Wu, Pengwei; Gong, Shutao; Wang, Jing; Lyu, Qihui; Tang, Xiangyang; Niu, Tianye
2017-11-01
Recent advances in total variation (TV) technology enable accurate CT image reconstruction from highly under-sampled and noisy projection data. The standard iterative reconstruction algorithms, which work well in conventional CT imaging, fail to perform as expected in cone beam CT (CBCT) applications, wherein the non-ideal physics issues, including scatter and beam hardening, are more severe. These physics issues result in large areas of shading artifacts and cause deterioration to the piecewise constant property assumed in reconstructed images. To overcome this obstacle, we incorporate a shading correction scheme into low-dose CBCT reconstruction and propose a clinically acceptable and stable three-dimensional iterative reconstruction method that is referred to as the shading correction assisted iterative reconstruction. In the proposed method, we modify the TV regularization term by adding a shading compensation image to the reconstructed image to compensate for the shading artifacts while leaving the data fidelity term intact. This compensation image is generated empirically, using image segmentation and low-pass filtering, and updated in the iterative process whenever necessary. When the compensation image is determined, the objective function is minimized using the fast iterative shrinkage-thresholding algorithm accelerated on a graphic processing unit. The proposed method is evaluated using CBCT projection data of the Catphan© 600 phantom and two pelvis patients. Compared with the iterative reconstruction without shading correction, the proposed method reduces the overall CT number error from around 200 HU to be around 25 HU and increases the spatial uniformity by a factor of 20 percent, given the same number of sparsely sampled projections. A clinically acceptable and stable iterative reconstruction algorithm for CBCT is proposed in this paper. Differing from the existing algorithms, this algorithm incorporates a shading correction scheme into the low-dose CBCT reconstruction and achieves more stable optimization path and more clinically acceptable reconstructed image. The method proposed by us does not rely on prior information and thus is practically attractive to the applications of low-dose CBCT imaging in the clinic.
Visser, R; Godart, J; Wauben, D J L; Langendijk, J A; Van't Veld, A A; Korevaar, E W
2016-05-21
The objective of this study was to introduce a new iterative method to reconstruct multi leaf collimator (MLC) positions based on low resolution ionization detector array measurements and to evaluate its error detection performance. The iterative reconstruction method consists of a fluence model, a detector model and an optimizer. Expected detector response was calculated using a radiotherapy treatment plan in combination with the fluence model and detector model. MLC leaf positions were reconstructed by minimizing differences between expected and measured detector response. The iterative reconstruction method was evaluated for an Elekta SLi with 10.0 mm MLC leafs in combination with the COMPASS system and the MatriXX Evolution (IBA Dosimetry) detector with a spacing of 7.62 mm. The detector was positioned in such a way that each leaf pair of the MLC was aligned with one row of ionization chambers. Known leaf displacements were introduced in various field geometries ranging from -10.0 mm to 10.0 mm. Error detection performance was tested for MLC leaf position dependency relative to the detector position, gantry angle dependency, monitor unit dependency, and for ten clinical intensity modulated radiotherapy (IMRT) treatment beams. For one clinical head and neck IMRT treatment beam, influence of the iterative reconstruction method on existing 3D dose reconstruction artifacts was evaluated. The described iterative reconstruction method was capable of individual MLC leaf position reconstruction with millimeter accuracy, independent of the relative detector position within the range of clinically applied MU's for IMRT. Dose reconstruction artifacts in a clinical IMRT treatment beam were considerably reduced as compared to the current dose verification procedure. The iterative reconstruction method allows high accuracy 3D dose verification by including actual MLC leaf positions reconstructed from low resolution 2D measurements.
NASA Astrophysics Data System (ADS)
Del Carpio R., Maikol; Hashemi, M. Javad; Mosqueda, Gilberto
2017-10-01
This study examines the performance of integration methods for hybrid simulation of large and complex structural systems in the context of structural collapse due to seismic excitations. The target application is not necessarily for real-time testing, but rather for models that involve large-scale physical sub-structures and highly nonlinear numerical models. Four case studies are presented and discussed. In the first case study, the accuracy of integration schemes including two widely used methods, namely, modified version of the implicit Newmark with fixed-number of iteration (iterative) and the operator-splitting (non-iterative) is examined through pure numerical simulations. The second case study presents the results of 10 hybrid simulations repeated with the two aforementioned integration methods considering various time steps and fixed-number of iterations for the iterative integration method. The physical sub-structure in these tests consists of a single-degree-of-freedom (SDOF) cantilever column with replaceable steel coupons that provides repeatable highlynonlinear behavior including fracture-type strength and stiffness degradations. In case study three, the implicit Newmark with fixed-number of iterations is applied for hybrid simulations of a 1:2 scale steel moment frame that includes a relatively complex nonlinear numerical substructure. Lastly, a more complex numerical substructure is considered by constructing a nonlinear computational model of a moment frame coupled to a hybrid model of a 1:2 scale steel gravity frame. The last two case studies are conducted on the same porotype structure and the selection of time steps and fixed number of iterations are closely examined in pre-test simulations. The generated unbalance forces is used as an index to track the equilibrium error and predict the accuracy and stability of the simulations.
Calibration and Data Analysis of the MC-130 Air Balance
NASA Technical Reports Server (NTRS)
Booth, Dennis; Ulbrich, N.
2012-01-01
Design, calibration, calibration analysis, and intended use of the MC-130 air balance are discussed. The MC-130 balance is an 8.0 inch diameter force balance that has two separate internal air flow systems and one external bellows system. The manual calibration of the balance consisted of a total of 1854 data points with both unpressurized and pressurized air flowing through the balance. A subset of 1160 data points was chosen for the calibration data analysis. The regression analysis of the subset was performed using two fundamentally different analysis approaches. First, the data analysis was performed using a recently developed extension of the Iterative Method. This approach fits gage outputs as a function of both applied balance loads and bellows pressures while still allowing the application of the iteration scheme that is used with the Iterative Method. Then, for comparison, the axial force was also analyzed using the Non-Iterative Method. This alternate approach directly fits loads as a function of measured gage outputs and bellows pressures and does not require a load iteration. The regression models used by both the extended Iterative and Non-Iterative Method were constructed such that they met a set of widely accepted statistical quality requirements. These requirements lead to reliable regression models and prevent overfitting of data because they ensure that no hidden near-linear dependencies between regression model terms exist and that only statistically significant terms are included. Finally, a comparison of the axial force residuals was performed. Overall, axial force estimates obtained from both methods show excellent agreement as the differences of the standard deviation of the axial force residuals are on the order of 0.001 % of the axial force capacity.
Application of the perturbation iteration method to boundary layer type problems.
Pakdemirli, Mehmet
2016-01-01
The recently developed perturbation iteration method is applied to boundary layer type singular problems for the first time. As a preliminary work on the topic, the simplest algorithm of PIA(1,1) is employed in the calculations. Linear and nonlinear problems are solved to outline the basic ideas of the new solution technique. The inner and outer solutions are determined with the iteration algorithm and matched to construct a composite expansion valid within all parts of the domain. The solutions are contrasted with the available exact or numerical solutions. It is shown that the perturbation-iteration algorithm can be effectively used for solving boundary layer type problems.
Spotting the difference in molecular dynamics simulations of biomolecules
NASA Astrophysics Data System (ADS)
Sakuraba, Shun; Kono, Hidetoshi
2016-08-01
Comparing two trajectories from molecular simulations conducted under different conditions is not a trivial task. In this study, we apply a method called Linear Discriminant Analysis with ITERative procedure (LDA-ITER) to compare two molecular simulation results by finding the appropriate projection vectors. Because LDA-ITER attempts to determine a projection such that the projections of the two trajectories do not overlap, the comparison does not suffer from a strong anisotropy, which is an issue in protein dynamics. LDA-ITER is applied to two test cases: the T4 lysozyme protein simulation with or without a point mutation and the allosteric protein PDZ2 domain of hPTP1E with or without a ligand. The projection determined by the method agrees with the experimental data and previous simulations. The proposed procedure, which complements existing methods, is a versatile analytical method that is specialized to find the "difference" between two trajectories.
Improved Convergence and Robustness of USM3D Solutions on Mixed-Element Grids
NASA Technical Reports Server (NTRS)
Pandya, Mohagna J.; Diskin, Boris; Thomas, James L.; Frink, Neal T.
2016-01-01
Several improvements to the mixed-element USM3D discretization and defect-correction schemes have been made. A new methodology for nonlinear iterations, called the Hierarchical Adaptive Nonlinear Iteration Method, has been developed and implemented. The Hierarchical Adaptive Nonlinear Iteration Method provides two additional hierarchies around a simple and approximate preconditioner of USM3D. The hierarchies are a matrix-free linear solver for the exact linearization of Reynolds-averaged Navier-Stokes equations and a nonlinear control of the solution update. Two variants of the Hierarchical Adaptive Nonlinear Iteration Method are assessed on four benchmark cases, namely, a zero-pressure-gradient flat plate, a bump-in-channel configuration, the NACA 0012 airfoil, and a NASA Common Research Model configuration. The new methodology provides a convergence acceleration factor of 1.4 to 13 over the preconditioner-alone method representing the baseline solver technology.
Improved Convergence and Robustness of USM3D Solutions on Mixed-Element Grids
NASA Technical Reports Server (NTRS)
Pandya, Mohagna J.; Diskin, Boris; Thomas, James L.; Frinks, Neal T.
2016-01-01
Several improvements to the mixed-elementUSM3Ddiscretization and defect-correction schemes have been made. A new methodology for nonlinear iterations, called the Hierarchical Adaptive Nonlinear Iteration Method, has been developed and implemented. The Hierarchical Adaptive Nonlinear Iteration Method provides two additional hierarchies around a simple and approximate preconditioner of USM3D. The hierarchies are a matrix-free linear solver for the exact linearization of Reynolds-averaged Navier-Stokes equations and a nonlinear control of the solution update. Two variants of the Hierarchical Adaptive Nonlinear Iteration Method are assessed on four benchmark cases, namely, a zero-pressure-gradient flat plate, a bump-in-channel configuration, the NACA 0012 airfoil, and a NASA Common Research Model configuration. The new methodology provides a convergence acceleration factor of 1.4 to 13 over the preconditioner-alone method representing the baseline solver technology.
Investigation of iterative image reconstruction in three-dimensional optoacoustic tomography
Wang, Kun; Su, Richard; Oraevsky, Alexander A; Anastasio, Mark A
2012-01-01
Iterative image reconstruction algorithms for optoacoustic tomography (OAT), also known as photoacoustic tomography, have the ability to improve image quality over analytic algorithms due to their ability to incorporate accurate models of the imaging physics, instrument response, and measurement noise. However, to date, there have been few reported attempts to employ advanced iterative image reconstruction algorithms for improving image quality in three-dimensional (3D) OAT. In this work, we implement and investigate two iterative image reconstruction methods for use with a 3D OAT small animal imager: namely, a penalized least-squares (PLS) method employing a quadratic smoothness penalty and a PLS method employing a total variation norm penalty. The reconstruction algorithms employ accurate models of the ultrasonic transducer impulse responses. Experimental data sets are employed to compare the performances of the iterative reconstruction algorithms to that of a 3D filtered backprojection (FBP) algorithm. By use of quantitative measures of image quality, we demonstrate that the iterative reconstruction algorithms can mitigate image artifacts and preserve spatial resolution more effectively than FBP algorithms. These features suggest that the use of advanced image reconstruction algorithms can improve the effectiveness of 3D OAT while reducing the amount of data required for biomedical applications. PMID:22864062
Quantum dynamics in continuum for proton transport II: Variational solvent-solute interface.
Chen, Duan; Chen, Zhan; Wei, Guo-Wei
2012-01-01
Proton transport plays an important role in biological energy transduction and sensory systems. Therefore, it has attracted much attention in biological science and biomedical engineering in the past few decades. The present work proposes a multiscale/multiphysics model for the understanding of the molecular mechanism of proton transport in transmembrane proteins involving continuum, atomic, and quantum descriptions, assisted with the evolution, formation, and visualization of membrane channel surfaces. We describe proton dynamics quantum mechanically via a new density functional theory based on the Boltzmann statistics, while implicitly model numerous solvent molecules as a dielectric continuum to reduce the number of degrees of freedom. The density of all other ions in the solvent is assumed to obey the Boltzmann distribution in a dynamic manner. The impact of protein molecular structure and its charge polarization on the proton transport is considered explicitly at the atomic scale. A variational solute-solvent interface is designed to separate the explicit molecule and implicit solvent regions. We formulate a total free-energy functional to put proton kinetic and potential energies, the free energy of all other ions, and the polar and nonpolar energies of the whole system on an equal footing. The variational principle is employed to derive coupled governing equations for the proton transport system. Generalized Laplace-Beltrami equation, generalized Poisson-Boltzmann equation, and generalized Kohn-Sham equation are obtained from the present variational framework. The variational solvent-solute interface is generated and visualized to facilitate the multiscale discrete/continuum/quantum descriptions. Theoretical formulations for the proton density and conductance are constructed based on fundamental laws of physics. A number of mathematical algorithms, including the Dirichlet-to-Neumann mapping, matched interface and boundary method, Gummel iteration, and Krylov space techniques are utilized to implement the proposed model in a computationally efficient manner. The gramicidin A channel is used to validate the performance of the proposed proton transport model and demonstrate the efficiency of the proposed mathematical algorithms. The proton channel conductances are studied over a number of applied voltages and reference concentrations. A comparison with experimental data verifies the present model predictions and confirms the proposed model. Copyright © 2011 John Wiley & Sons, Ltd.
A robust, efficient equidistribution 2D grid generation method
NASA Astrophysics Data System (ADS)
Chacon, Luis; Delzanno, Gian Luca; Finn, John; Chung, Jeojin; Lapenta, Giovanni
2007-11-01
We present a new cell-area equidistribution method for two- dimensional grid adaptation [1]. The method is able to satisfy the equidistribution constraint to arbitrary precision while optimizing desired grid properties (such as isotropy and smoothness). The method is based on the minimization of the grid smoothness integral, constrained to producing a given positive-definite cell volume distribution. The procedure gives rise to a single, non-linear scalar equation with no free-parameters. We solve this equation numerically with the Newton-Krylov technique. The ellipticity property of the linearized scalar equation allows multigrid preconditioning techniques to be effectively used. We demonstrate a solution exists and is unique. Therefore, once the solution is found, the adapted grid cannot be folded due to the positivity of the constraint on the cell volumes. We present several challenging tests to show that our new method produces optimal grids in which the constraint is satisfied numerically to arbitrary precision. We also compare the new method to the deformation method [2] and show that our new method produces better quality grids. [1] G.L. Delzanno, L. Chac'on, J.M. Finn, Y. Chung, G. Lapenta, A new, robust equidistribution method for two-dimensional grid generation, in preparation. [2] G. Liao and D. Anderson, A new approach to grid generation, Appl. Anal. 44, 285--297 (1992).
Efficient Iterative Methods Applied to the Solution of Transonic Flows
NASA Astrophysics Data System (ADS)
Wissink, Andrew M.; Lyrintzis, Anastasios S.; Chronopoulos, Anthony T.
1996-02-01
We investigate the use of an inexact Newton's method to solve the potential equations in the transonic regime. As a test case, we solve the two-dimensional steady transonic small disturbance equation. Approximate factorization/ADI techniques have traditionally been employed for implicit solutions of this nonlinear equation. Instead, we apply Newton's method using an exact analytical determination of the Jacobian with preconditioned conjugate gradient-like iterative solvers for solution of the linear systems in each Newton iteration. Two iterative solvers are tested; a block s-step version of the classical Orthomin(k) algorithm called orthogonal s-step Orthomin (OSOmin) and the well-known GMRES method. The preconditioner is a vectorizable and parallelizable version of incomplete LU (ILU) factorization. Efficiency of the Newton-Iterative method on vector and parallel computer architectures is the main issue addressed. In vectorized tests on a single processor of the Cray C-90, the performance of Newton-OSOmin is superior to Newton-GMRES and a more traditional monotone AF/ADI method (MAF) for a variety of transonic Mach numbers and mesh sizes. Newton-GMRES is superior to MAF for some cases. The parallel performance of the Newton method is also found to be very good on multiple processors of the Cray C-90 and on the massively parallel thinking machine CM-5, where very fast execution rates (up to 9 Gflops) are found for large problems.
On a new iterative method for solving linear systems and comparison results
NASA Astrophysics Data System (ADS)
Jing, Yan-Fei; Huang, Ting-Zhu
2008-10-01
In Ujevic [A new iterative method for solving linear systems, Appl. Math. Comput. 179 (2006) 725-730], the author obtained a new iterative method for solving linear systems, which can be considered as a modification of the Gauss-Seidel method. In this paper, we show that this is a special case from a point of view of projection techniques. And a different approach is established, which is both theoretically and numerically proven to be better than (at least the same as) Ujevic's. As the presented numerical examples show, in most cases, the convergence rate is more than one and a half that of Ujevic.
NASA Astrophysics Data System (ADS)
Trujillo Bueno, J.; Fabiani Bendicho, P.
1995-12-01
Iterative schemes based on Gauss-Seidel (G-S) and optimal successive over-relaxation (SOR) iteration are shown to provide a dramatic increase in the speed with which non-LTE radiation transfer (RT) problems can be solved. The convergence rates of these new RT methods are identical to those of upper triangular nonlocal approximate operator splitting techniques, but the computing time per iteration and the memory requirements are similar to those of a local operator splitting method. In addition to these properties, both methods are particularly suitable for multidimensional geometry, since they neither require the actual construction of nonlocal approximate operators nor the application of any matrix inversion procedure. Compared with the currently used Jacobi technique, which is based on the optimal local approximate operator (see Olson, Auer, & Buchler 1986), the G-S method presented here is faster by a factor 2. It gives excellent smoothing of the high-frequency error components, which makes it the iterative scheme of choice for multigrid radiative transfer. This G-S method can also be suitably combined with standard acceleration techniques to achieve even higher performance. Although the convergence rate of the optimal SOR scheme developed here for solving non-LTE RT problems is much higher than G-S, the computing time per iteration is also minimal, i.e., virtually identical to that of a local operator splitting method. While the conventional optimal local operator scheme provides the converged solution after a total CPU time (measured in arbitrary units) approximately equal to the number n of points per decade of optical depth, the time needed by this new method based on the optimal SOR iterations is only √n/2√2. This method is competitive with those that result from combining the above-mentioned Jacobi and G-S schemes with the best acceleration techniques. Contrary to what happens with the local operator splitting strategy currently in use, these novel methods remain effective even under extreme non-LTE conditions in very fine grids.
Non-homogeneous updates for the iterative coordinate descent algorithm
NASA Astrophysics Data System (ADS)
Yu, Zhou; Thibault, Jean-Baptiste; Bouman, Charles A.; Sauer, Ken D.; Hsieh, Jiang
2007-02-01
Statistical reconstruction methods show great promise for improving resolution, and reducing noise and artifacts in helical X-ray CT. In fact, statistical reconstruction seems to be particularly valuable in maintaining reconstructed image quality when the dosage is low and the noise is therefore high. However, high computational cost and long reconstruction times remain as a barrier to the use of statistical reconstruction in practical applications. Among the various iterative methods that have been studied for statistical reconstruction, iterative coordinate descent (ICD) has been found to have relatively low overall computational requirements due to its fast convergence. This paper presents a novel method for further speeding the convergence of the ICD algorithm, and therefore reducing the overall reconstruction time for statistical reconstruction. The method, which we call nonhomogeneous iterative coordinate descent (NH-ICD) uses spatially non-homogeneous updates to speed convergence by focusing computation where it is most needed. Experimental results with real data indicate that the method speeds reconstruction by roughly a factor of two for typical 3D multi-slice geometries.
Matrix completion-based reconstruction for undersampled magnetic resonance fingerprinting data.
Doneva, Mariya; Amthor, Thomas; Koken, Peter; Sommer, Karsten; Börnert, Peter
2017-09-01
An iterative reconstruction method for undersampled magnetic resonance fingerprinting data is presented. The method performs the reconstruction entirely in k-space and is related to low rank matrix completion methods. A low dimensional data subspace is estimated from a small number of k-space locations fully sampled in the temporal direction and used to reconstruct the missing k-space samples before MRF dictionary matching. Performing the iterations in k-space eliminates the need for applying a forward and an inverse Fourier transform in each iteration required in previously proposed iterative reconstruction methods for undersampled MRF data. A projection onto the low dimensional data subspace is performed as a matrix multiplication instead of a singular value thresholding typically used in low rank matrix completion, further reducing the computational complexity of the reconstruction. The method is theoretically described and validated in phantom and in-vivo experiments. The quality of the parameter maps can be significantly improved compared to direct matching on undersampled data. Copyright © 2017 Elsevier Inc. All rights reserved.
perturbation formulas of Groebner (1960) and Alexseev (1961) for the solution of ordinary differential equations. These formulas are generalized and...iteration methods are given, which include the Methods of Picard, Groebner -Knapp, Poincare, Chen, as special cases. Chapter 3 generalizes an iterated
Convergence characteristics of nonlinear vortex-lattice methods for configuration aerodynamics
NASA Technical Reports Server (NTRS)
Seginer, A.; Rusak, Z.; Wasserstrom, E.
1983-01-01
Nonlinear panel methods have no proof for the existence and uniqueness of their solutions. The convergence characteristics of an iterative, nonlinear vortex-lattice method are, therefore, carefully investigated. The effects of several parameters, including (1) the surface-paneling method, (2) an integration method of the trajectories of the wake vortices, (3) vortex-grid refinement, and (4) the initial conditions for the first iteration on the computed aerodynamic coefficients and on the flow-field details are presented. The convergence of the iterative-solution procedure is usually rapid. The solution converges with grid refinement to a constant value, but the final value is not unique and varies with the wing surface-paneling and wake-discretization methods within some range in the vicinity of the experimental result.
NASA Astrophysics Data System (ADS)
Furuichi, Mikito; Nishiura, Daisuke
2017-10-01
We developed dynamic load-balancing algorithms for Particle Simulation Methods (PSM) involving short-range interactions, such as Smoothed Particle Hydrodynamics (SPH), Moving Particle Semi-implicit method (MPS), and Discrete Element method (DEM). These are needed to handle billions of particles modeled in large distributed-memory computer systems. Our method utilizes flexible orthogonal domain decomposition, allowing the sub-domain boundaries in the column to be different for each row. The imbalances in the execution time between parallel logical processes are treated as a nonlinear residual. Load-balancing is achieved by minimizing the residual within the framework of an iterative nonlinear solver, combined with a multigrid technique in the local smoother. Our iterative method is suitable for adjusting the sub-domain frequently by monitoring the performance of each computational process because it is computationally cheaper in terms of communication and memory costs than non-iterative methods. Numerical tests demonstrated the ability of our approach to handle workload imbalances arising from a non-uniform particle distribution, differences in particle types, or heterogeneous computer architecture which was difficult with previously proposed methods. We analyzed the parallel efficiency and scalability of our method using Earth simulator and K-computer supercomputer systems.
Zhao, Jing; Zong, Haili
2018-01-01
In this paper, we propose parallel and cyclic iterative algorithms for solving the multiple-set split equality common fixed-point problem of firmly quasi-nonexpansive operators. We also combine the process of cyclic and parallel iterative methods and propose two mixed iterative algorithms. Our several algorithms do not need any prior information about the operator norms. Under mild assumptions, we prove weak convergence of the proposed iterative sequences in Hilbert spaces. As applications, we obtain several iterative algorithms to solve the multiple-set split equality problem.
The preconditioned Gauss-Seidel method faster than the SOR method
NASA Astrophysics Data System (ADS)
Niki, Hiroshi; Kohno, Toshiyuki; Morimoto, Munenori
2008-09-01
In recent years, a number of preconditioners have been applied to linear systems [A.D. Gunawardena, S.K. Jain, L. Snyder, Modified iterative methods for consistent linear systems, Linear Algebra Appl. 154-156 (1991) 123-143; T. Kohno, H. Kotakemori, H. Niki, M. Usui, Improving modified Gauss-Seidel method for Z-matrices, Linear Algebra Appl. 267 (1997) 113-123; H. Kotakemori, K. Harada, M. Morimoto, H. Niki, A comparison theorem for the iterative method with the preconditioner (I+Smax), J. Comput. Appl. Math. 145 (2002) 373-378; H. Kotakemori, H. Niki, N. Okamoto, Accelerated iteration method for Z-matrices, J. Comput. Appl. Math. 75 (1996) 87-97; M. Usui, H. Niki, T.Kohno, Adaptive Gauss-Seidel method for linear systems, Internat. J. Comput. Math. 51(1994)119-125 [10
Hesford, Andrew J.; Chew, Weng C.
2010-01-01
The distorted Born iterative method (DBIM) computes iterative solutions to nonlinear inverse scattering problems through successive linear approximations. By decomposing the scattered field into a superposition of scattering by an inhomogeneous background and by a material perturbation, large or high-contrast variations in medium properties can be imaged through iterations that are each subject to the distorted Born approximation. However, the need to repeatedly compute forward solutions still imposes a very heavy computational burden. To ameliorate this problem, the multilevel fast multipole algorithm (MLFMA) has been applied as a forward solver within the DBIM. The MLFMA computes forward solutions in linear time for volumetric scatterers. The typically regular distribution and shape of scattering elements in the inverse scattering problem allow the method to take advantage of data redundancy and reduce the computational demands of the normally expensive MLFMA setup. Additional benefits are gained by employing Kaczmarz-like iterations, where partial measurements are used to accelerate convergence. Numerical results demonstrate both the efficiency of the forward solver and the successful application of the inverse method to imaging problems with dimensions in the neighborhood of ten wavelengths. PMID:20707438
Sometimes "Newton's Method" Always "Cycles"
ERIC Educational Resources Information Center
Latulippe, Joe; Switkes, Jennifer
2012-01-01
Are there functions for which Newton's method cycles for all non-trivial initial guesses? We construct and solve a differential equation whose solution is a real-valued function that two-cycles under Newton iteration. Higher-order cycles of Newton's method iterates are explored in the complex plane using complex powers of "x." We find a class of…
Gauss-Seidel Iterative Method as a Real-Time Pile-Up Solver of Scintillation Pulses
NASA Astrophysics Data System (ADS)
Novak, Roman; Vencelj, Matja¿
2009-12-01
The pile-up rejection in nuclear spectroscopy has been confronted recently by several pile-up correction schemes that compensate for distortions of the signal and subsequent energy spectra artifacts as the counting rate increases. We study here a real-time capability of the event-by-event correction method, which at the core translates to solving many sets of linear equations. Tight time limits and constrained front-end electronics resources make well-known direct solvers inappropriate. We propose a novel approach based on the Gauss-Seidel iterative method, which turns out to be a stable and cost-efficient solution to improve spectroscopic resolution in the front-end electronics. We show the method convergence properties for a class of matrices that emerge in calorimetric processing of scintillation detector signals and demonstrate the ability of the method to support the relevant resolutions. The sole iteration-based error component can be brought below the sliding window induced errors in a reasonable number of iteration steps, thus allowing real-time operation. An area-efficient hardware implementation is proposed that fully utilizes the method's inherent parallelism.
Iterative optimization method for design of quantitative magnetization transfer imaging experiments.
Levesque, Ives R; Sled, John G; Pike, G Bruce
2011-09-01
Quantitative magnetization transfer imaging (QMTI) using spoiled gradient echo sequences with pulsed off-resonance saturation can be a time-consuming technique. A method is presented for selection of an optimum experimental design for quantitative magnetization transfer imaging based on the iterative reduction of a discrete sampling of the Z-spectrum. The applicability of the technique is demonstrated for human brain white matter imaging at 1.5 T and 3 T, and optimal designs are produced to target specific model parameters. The optimal number of measurements and the signal-to-noise ratio required for stable parameter estimation are also investigated. In vivo imaging results demonstrate that this optimal design approach substantially improves parameter map quality. The iterative method presented here provides an advantage over free form optimal design methods, in that pragmatic design constraints are readily incorporated. In particular, the presented method avoids clustering and repeated measures in the final experimental design, an attractive feature for the purpose of magnetization transfer model validation. The iterative optimal design technique is general and can be applied to any method of quantitative magnetization transfer imaging. Copyright © 2011 Wiley-Liss, Inc.
NASA Technical Reports Server (NTRS)
Wood, C. A.
1974-01-01
For polynomials of higher degree, iterative numerical methods must be used. Four iterative methods are presented for approximating the zeros of a polynomial using a digital computer. Newton's method and Muller's method are two well known iterative methods which are presented. They extract the zeros of a polynomial by generating a sequence of approximations converging to each zero. However, both of these methods are very unstable when used on a polynomial which has multiple zeros. That is, either they fail to converge to some or all of the zeros, or they converge to very bad approximations of the polynomial's zeros. This material introduces two new methods, the greatest common divisor (G.C.D.) method and the repeated greatest common divisor (repeated G.C.D.) method, which are superior methods for numerically approximating the zeros of a polynomial having multiple zeros. These methods were programmed in FORTRAN 4 and comparisons in time and accuracy are given.
Finding the Optimal Guidance for Enhancing Anchored Instruction
ERIC Educational Resources Information Center
Zydney, Janet Mannheimer; Bathke, Arne; Hasselbring, Ted S.
2014-01-01
This study investigated the effect of different methods of guidance with anchored instruction on students' mathematical problem-solving performance. The purpose of this research was to iteratively design a learning environment to find the optimal level of guidance. Two iterations of the software were compared. The first iteration used explicit…
Numerical Grid Generation and Potential Airfoil Analysis and Design
1988-01-01
Gauss- Seidel , SOR and ADI iterative methods e JACOBI METHOD In the Jacobi method each new value of a function is computed entirely from old values...preceding iteration and adding the inhomogeneous (boundary condition) term. * GAUSS- SEIDEL METHOD When we compute I in a Jacobi method, we have already...Gauss- Seidel method. Sufficient condition for p convergence of the Gauss- Seidel method is diagonal-dominance of [A].9W e SUCESSIVE OVER-RELAXATION (SOR
Influence of Primary Gage Sensitivities on the Convergence of Balance Load Iterations
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2012-01-01
The connection between the convergence of wind tunnel balance load iterations and the existence of the primary gage sensitivities of a balance is discussed. First, basic elements of two load iteration equations that the iterative method uses in combination with results of a calibration data analysis for the prediction of balance loads are reviewed. Then, the connection between the primary gage sensitivities, the load format, the gage output format, and the convergence characteristics of the load iteration equation choices is investigated. A new criterion is also introduced that may be used to objectively determine if the primary gage sensitivity of a balance gage exists. Then, it is shown that both load iteration equations will converge as long as a suitable regression model is used for the analysis of the balance calibration data, the combined influence of non linear terms of the regression model is very small, and the primary gage sensitivities of all balance gages exist. The last requirement is fulfilled, e.g., if force balance calibration data is analyzed in force balance format. Finally, it is demonstrated that only one of the two load iteration equation choices, i.e., the iteration equation used by the primary load iteration method, converges if one or more primary gage sensitivities are missing. This situation may occur, e.g., if force balance calibration data is analyzed in direct read format using the original gage outputs. Data from the calibration of a six component force balance is used to illustrate the connection between the convergence of the load iteration equation choices and the existence of the primary gage sensitivities.
Block Gauss elimination followed by a classical iterative method for the solution of linear systems
NASA Astrophysics Data System (ADS)
Alanelli, Maria; Hadjidimos, Apostolos
2004-02-01
In the last two decades many papers have appeared in which the application of an iterative method for the solution of a linear system is preceded by a step of the Gauss elimination process in the hope that this will increase the rates of convergence of the iterative method. This combination of methods has been proven successful especially when the matrix A of the system is an M-matrix. The purpose of this paper is to extend the idea of one to more Gauss elimination steps, consider other classes of matrices A, e.g., p-cyclic consistently ordered, and generalize and improve the asymptotic convergence rates of some of the methods known so far.
Modified conjugate gradient method for diagonalizing large matrices.
Jie, Quanlin; Liu, Dunhuan
2003-11-01
We present an iterative method to diagonalize large matrices. The basic idea is the same as the conjugate gradient (CG) method, i.e, minimizing the Rayleigh quotient via its gradient and avoiding reintroducing errors to the directions of previous gradients. Each iteration step is to find lowest eigenvector of the matrix in a subspace spanned by the current trial vector and the corresponding gradient of the Rayleigh quotient, as well as some previous trial vectors. The gradient, together with the previous trial vectors, play a similar role as the conjugate gradient of the original CG algorithm. Our numeric tests indicate that this method converges significantly faster than the original CG method. And the computational cost of one iteration step is about the same as the original CG method. It is suitable for first principle calculations.
NASA Astrophysics Data System (ADS)
Chun, Tae Yoon; Lee, Jae Young; Park, Jin Bae; Choi, Yoon Ho
2018-06-01
In this paper, we propose two multirate generalised policy iteration (GPI) algorithms applied to discrete-time linear quadratic regulation problems. The proposed algorithms are extensions of the existing GPI algorithm that consists of the approximate policy evaluation and policy improvement steps. The two proposed schemes, named heuristic dynamic programming (HDP) and dual HDP (DHP), based on multirate GPI, use multi-step estimation (M-step Bellman equation) at the approximate policy evaluation step for estimating the value function and its gradient called costate, respectively. Then, we show that these two methods with the same update horizon can be considered equivalent in the iteration domain. Furthermore, monotonically increasing and decreasing convergences, so called value iteration (VI)-mode and policy iteration (PI)-mode convergences, are proved to hold for the proposed multirate GPIs. Further, general convergence properties in terms of eigenvalues are also studied. The data-driven online implementation methods for the proposed HDP and DHP are demonstrated and finally, we present the results of numerical simulations performed to verify the effectiveness of the proposed methods.
Amesos2 and Belos: Direct and Iterative Solvers for Large Sparse Linear Systems
Bavier, Eric; Hoemmen, Mark; Rajamanickam, Sivasankaran; ...
2012-01-01
Solvers for large sparse linear systems come in two categories: direct and iterative. Amesos2, a package in the Trilinos software project, provides direct methods, and Belos, another Trilinos package, provides iterative methods. Amesos2 offers a common interface to many different sparse matrix factorization codes, and can handle any implementation of sparse matrices and vectors, via an easy-to-extend C++ traits interface. It can also factor matrices whose entries have arbitrary “Scalar” type, enabling extended-precision and mixed-precision algorithms. Belos includes many different iterative methods for solving large sparse linear systems and least-squares problems. Unlike competing iterative solver libraries, Belos completely decouples themore » algorithms from the implementations of the underlying linear algebra objects. This lets Belos exploit the latest hardware without changes to the code. Belos favors algorithms that solve higher-level problems, such as multiple simultaneous linear systems and sequences of related linear systems, faster than standard algorithms. The package also supports extended-precision and mixed-precision algorithms. Together, Amesos2 and Belos form a complete suite of sparse linear solvers.« less
Efficient iterative methods applied to the solution of transonic flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wissink, A.M.; Lyrintzis, A.S.; Chronopoulos, A.T.
1996-02-01
We investigate the use of an inexact Newton`s method to solve the potential equations in the transonic regime. As a test case, we solve the two-dimensional steady transonic small disturbance equation. Approximate factorization/ADI techniques have traditionally been employed for implicit solutions of this nonlinear equation. Instead, we apply Newton`s method using an exact analytical determination of the Jacobian with preconditioned conjugate gradient-like iterative solvers for solution of the linear systems in each Newton iteration. Two iterative solvers are tested; a block s-step version of the classical Orthomin(k) algorithm called orthogonal s-step Orthomin (OSOmin) and the well-known GIVIRES method. The preconditionermore » is a vectorizable and parallelizable version of incomplete LU (ILU) factorization. Efficiency of the Newton-Iterative method on vector and parallel computer architectures is the main issue addressed. In vectorized tests on a single processor of the Cray C-90, the performance of Newton-OSOmin is superior to Newton-GMRES and a more traditional monotone AF/ADI method (MAF) for a variety of transonic Mach numbers and mesh sizes. Newton- GIVIRES is superior to MAF for some cases. The parallel performance of the Newton method is also found to be very good on multiple processors of the Cray C-90 and on the massively parallel thinking machine CM-5, where very fast execution rates (up to 9 Gflops) are found for large problems. 38 refs., 14 figs., 7 tabs.« less
Sparse magnetic resonance imaging reconstruction using the bregman iteration
NASA Astrophysics Data System (ADS)
Lee, Dong-Hoon; Hong, Cheol-Pyo; Lee, Man-Woo
2013-01-01
Magnetic resonance imaging (MRI) reconstruction needs many samples that are sequentially sampled by using phase encoding gradients in a MRI system. It is directly connected to the scan time for the MRI system and takes a long time. Therefore, many researchers have studied ways to reduce the scan time, especially, compressed sensing (CS), which is used for sparse images and reconstruction for fewer sampling datasets when the k-space is not fully sampled. Recently, an iterative technique based on the bregman method was developed for denoising. The bregman iteration method improves on total variation (TV) regularization by gradually recovering the fine-scale structures that are usually lost in TV regularization. In this study, we studied sparse sampling image reconstruction using the bregman iteration for a low-field MRI system to improve its temporal resolution and to validate its usefulness. The image was obtained with a 0.32 T MRI scanner (Magfinder II, SCIMEDIX, Korea) with a phantom and an in-vivo human brain in a head coil. We applied random k-space sampling, and we determined the sampling ratios by using half the fully sampled k-space. The bregman iteration was used to generate the final images based on the reduced data. We also calculated the root-mean-square-error (RMSE) values from error images that were obtained using various numbers of bregman iterations. Our reconstructed images using the bregman iteration for sparse sampling images showed good results compared with the original images. Moreover, the RMSE values showed that the sparse reconstructed phantom and the human images converged to the original images. We confirmed the feasibility of sparse sampling image reconstruction methods using the bregman iteration with a low-field MRI system and obtained good results. Although our results used half the sampling ratio, this method will be helpful in increasing the temporal resolution at low-field MRI systems.
NASA Astrophysics Data System (ADS)
Leclerc, Arnaud; Thomas, Phillip S.; Carrington, Tucker
2017-08-01
Vibrational spectra and wavefunctions of polyatomic molecules can be calculated at low memory cost using low-rank sum-of-product (SOP) decompositions to represent basis functions generated using an iterative eigensolver. Using a SOP tensor format does not determine the iterative eigensolver. The choice of the interative eigensolver is limited by the need to restrict the rank of the SOP basis functions at every stage of the calculation. We have adapted, implemented and compared different reduced-rank algorithms based on standard iterative methods (block-Davidson algorithm, Chebyshev iteration) to calculate vibrational energy levels and wavefunctions of the 12-dimensional acetonitrile molecule. The effect of using low-rank SOP basis functions on the different methods is analysed and the numerical results are compared with those obtained with the reduced rank block power method. Relative merits of the different algorithms are presented, showing that the advantage of using a more sophisticated method, although mitigated by the use of reduced-rank SOP functions, is noticeable in terms of CPU time.
Computational aspects of helicopter trim analysis and damping levels from Floquet theory
NASA Technical Reports Server (NTRS)
Gaonkar, Gopal H.; Achar, N. S.
1992-01-01
Helicopter trim settings of periodic initial state and control inputs are investigated for convergence of Newton iteration in computing the settings sequentially and in parallel. The trim analysis uses a shooting method and a weak version of two temporal finite element methods with displacement formulation and with mixed formulation of displacements and momenta. These three methods broadly represent two main approaches of trim analysis: adaptation of initial-value and finite element boundary-value codes to periodic boundary conditions, particularly for unstable and marginally stable systems. In each method, both the sequential and in-parallel schemes are used and the resulting nonlinear algebraic equations are solved by damped Newton iteration with an optimally selected damping parameter. The impact of damped Newton iteration, including earlier-observed divergence problems in trim analysis, is demonstrated by the maximum condition number of the Jacobian matrices of the iterative scheme and by virtual elimination of divergence. The advantages of the in-parallel scheme over the conventional sequential scheme are also demonstrated.
Evaluating user reputation in online rating systems via an iterative group-based ranking method
NASA Astrophysics Data System (ADS)
Gao, Jian; Zhou, Tao
2017-05-01
Reputation is a valuable asset in online social lives and it has drawn increased attention. Due to the existence of noisy ratings and spamming attacks, how to evaluate user reputation in online rating systems is especially significant. However, most of the previous ranking-based methods either follow a debatable assumption or have unsatisfied robustness. In this paper, we propose an iterative group-based ranking method by introducing an iterative reputation-allocation process into the original group-based ranking method. More specifically, the reputation of users is calculated based on the weighted sizes of the user rating groups after grouping all users by their rating similarities, and the high reputation users' ratings have larger weights in dominating the corresponding user rating groups. The reputation of users and the user rating group sizes are iteratively updated until they become stable. Results on two real data sets with artificial spammers suggest that the proposed method has better performance than the state-of-the-art methods and its robustness is considerably improved comparing with the original group-based ranking method. Our work highlights the positive role of considering users' grouping behaviors towards a better online user reputation evaluation.
Computation of nonlinear ultrasound fields using a linearized contrast source method.
Verweij, Martin D; Demi, Libertario; van Dongen, Koen W A
2013-08-01
Nonlinear ultrasound is important in medical diagnostics because imaging of the higher harmonics improves resolution and reduces scattering artifacts. Second harmonic imaging is currently standard, and higher harmonic imaging is under investigation. The efficient development of novel imaging modalities and equipment requires accurate simulations of nonlinear wave fields in large volumes of realistic (lossy, inhomogeneous) media. The Iterative Nonlinear Contrast Source (INCS) method has been developed to deal with spatiotemporal domains measuring hundreds of wavelengths and periods. This full wave method considers the nonlinear term of the Westervelt equation as a nonlinear contrast source, and solves the equivalent integral equation via the Neumann iterative solution. Recently, the method has been extended with a contrast source that accounts for spatially varying attenuation. The current paper addresses the problem that the Neumann iterative solution converges badly for strong contrast sources. The remedy is linearization of the nonlinear contrast source, combined with application of more advanced methods for solving the resulting integral equation. Numerical results show that linearization in combination with a Bi-Conjugate Gradient Stabilized method allows the INCS method to deal with fairly strong, inhomogeneous attenuation, while the error due to the linearization can be eliminated by restarting the iterative scheme.
Compensation for the phase-type spatial periodic modulation of the near-field beam at 1053 nm
NASA Astrophysics Data System (ADS)
Gao, Yaru; Liu, Dean; Yang, Aihua; Tang, Ruyu; Zhu, Jianqiang
2017-10-01
A phase-only spatial light modulator is used to provide and compensate for the spatial periodic modulation (SPM) of the near-field beam at the near infrared at 1053nm wavelength with an improved iterative weight-based method. The transmission characteristics of the incident beam has been changed by a spatial light modulator (SLM) to shape the spatial intensity of the output beam. The propagation and reverse propagation of the light in free space are two important processes in the iterative process. The based theory is the beam angular spectrum transmit formula (ASTF) and the principle of the iterative weight-based method. We have made two improvements to the originally proposed iterative weight-based method. We select the appropriate parameter by choosing the minimum value of the output beam contrast degree and use the MATLAB built-in angle function to acquire the corresponding phase of the light wave function. The required phase that compensates for the intensity distribution of the incident SPM beam is iterated by this algorithm, which can decrease the magnitude of the SPM of the intensity on the observation plane. The experimental results show that the phase-type SPM of the near-field beam is subject to a certain restriction. We have also analyzed some factors that make the results imperfect. The experiment results verifies the possible applicability of this iterative weight-based method to compensate for the SPM of the near-field beam.
An iterative method for the localization of a neutron source in a large box (container)
NASA Astrophysics Data System (ADS)
Dubinski, S.; Presler, O.; Alfassi, Z. B.
2007-12-01
The localization of an unknown neutron source in a bulky box was studied. This can be used for the inspection of cargo, to prevent the smuggling of neutron and α emitters. It is important to localize the source from the outside for safety reasons. Source localization is necessary in order to determine its activity. A previous study showed that, by using six detectors, three on each parallel face of the box (460×420×200 mm 3), the location of the source can be found with an average distance of 4.73 cm between the real source position and the calculated one and a maximal distance of about 9 cm. Accuracy was improved in this work by applying an iteration method based on four fixed detectors and the successive iteration of positioning of an external calibrating source. The initial positioning of the calibrating source is the plane of detectors 1 and 2. This method finds the unknown source location with an average distance of 0.78 cm between the real source position and the calculated one and a maximum distance of 3.66 cm for the same box. For larger boxes, localization without iterations requires an increase in the number of detectors, while localization with iterations requires only an increase in the number of iteration steps. In addition to source localization, two methods for determining the activity of the unknown source were also studied.
NASA Technical Reports Server (NTRS)
Wu, S. T.; Sun, M. T.; Sakurai, Takashi
1990-01-01
This paper presents a comparison between two numerical methods for the extrapolation of nonlinear force-free magnetic fields, viz the Iterative Method (IM) and the Progressive Extension Method (PEM). The advantages and disadvantages of these two methods are summarized, and the accuracy and numerical instability are discussed. On the basis of this investigation, it is claimed that the two methods do resemble each other qualitatively.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gustafson, K.
1994-12-31
By means of the author`s earlier theory of antieigenvalues and antieigenvectors, a new computational approach to iterative methods is presented. This enables an explicit trigonometric understanding of iterative convergence and provides new insights into the sharpness of error bounds. Direct applications to Gradient descent, Conjugate gradient, GCR(k), Orthomin, CGN, GMRES, CGS, and other matrix iterative schemes will be given.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jamsranjav, Erdenetogtokh, E-mail: ja.erdenetogtokh@gmail.com; Shiina, Tatsuo, E-mail: shiina@faculity.chiba-u.jp; Kuge, Kenichi
2016-01-28
Soft X-ray microscopy is well recognized as a powerful tool of high-resolution imaging for hydrated biological specimens. Projection type of it has characteristics of easy zooming function, simple optical layout and so on. However the image is blurred by the diffraction of X-rays, leading the spatial resolution to be worse. In this study, the blurred images have been corrected by an iteration procedure, i.e., Fresnel and inverse Fresnel transformations repeated. This method was confirmed by earlier studies to be effective. Nevertheless it was not enough to some images showing too low contrast, especially at high magnification. In the present study,more » we tried a contrast enhancement method to make the diffraction fringes clearer prior to the iteration procedure. The method was effective to improve the images which were not successful by iteration procedure only.« less
A fast method to emulate an iterative POCS image reconstruction algorithm.
Zeng, Gengsheng L
2017-10-01
Iterative image reconstruction algorithms are commonly used to optimize an objective function, especially when the objective function is nonquadratic. Generally speaking, the iterative algorithms are computationally inefficient. This paper presents a fast algorithm that has one backprojection and no forward projection. This paper derives a new method to solve an optimization problem. The nonquadratic constraint, for example, an edge-preserving denoising constraint is implemented as a nonlinear filter. The algorithm is derived based on the POCS (projections onto projections onto convex sets) approach. A windowed FBP (filtered backprojection) algorithm enforces the data fidelity. An iterative procedure, divided into segments, enforces edge-enhancement denoising. Each segment performs nonlinear filtering. The derived iterative algorithm is computationally efficient. It contains only one backprojection and no forward projection. Low-dose CT data are used for algorithm feasibility studies. The nonlinearity is implemented as an edge-enhancing noise-smoothing filter. The patient studies results demonstrate its effectiveness in processing low-dose x ray CT data. This fast algorithm can be used to replace many iterative algorithms. © 2017 American Association of Physicists in Medicine.
Generalized Pattern Search methods for a class of nonsmooth optimization problems with structure
NASA Astrophysics Data System (ADS)
Bogani, C.; Gasparo, M. G.; Papini, A.
2009-07-01
We propose a Generalized Pattern Search (GPS) method to solve a class of nonsmooth minimization problems, where the set of nondifferentiability is included in the union of known hyperplanes and, therefore, is highly structured. Both unconstrained and linearly constrained problems are considered. At each iteration the set of poll directions is enforced to conform to the geometry of both the nondifferentiability set and the boundary of the feasible region, near the current iterate. This is the key issue to guarantee the convergence of certain subsequences of iterates to points which satisfy first-order optimality conditions. Numerical experiments on some classical problems validate the method.
Harmonics analysis of the ITER poloidal field converter based on a piecewise method
NASA Astrophysics Data System (ADS)
Xudong, WANG; Liuwei, XU; Peng, FU; Ji, LI; Yanan, WU
2017-12-01
Poloidal field (PF) converters provide controlled DC voltage and current to PF coils. The many harmonics generated by the PF converter flow into the power grid and seriously affect power systems and electric equipment. Due to the complexity of the system, the traditional integral operation in Fourier analysis is complicated and inaccurate. This paper presents a piecewise method to calculate the harmonics of the ITER PF converter. The relationship between the grid input current and the DC output current of the ITER PF converter is deduced. The grid current is decomposed into the sum of some simple functions. By calculating simple function harmonics based on the piecewise method, the harmonics of the PF converter under different operation modes are obtained. In order to examine the validity of the method, a simulation model is established based on Matlab/Simulink and a relevant experiment is implemented in the ITER PF integration test platform. Comparative results are given. The calculated results are found to be consistent with simulation and experiment. The piecewise method is proved correct and valid for calculating the system harmonics.
RF Pulse Design using Nonlinear Gradient Magnetic Fields
Kopanoglu, Emre; Constable, R. Todd
2014-01-01
Purpose An iterative k-space trajectory and radio-frequency (RF) pulse design method is proposed for Excitation using Nonlinear Gradient Magnetic fields (ENiGMa). Theory and Methods The spatial encoding functions (SEFs) generated by nonlinear gradient fields (NLGFs) are linearly dependent in Cartesian-coordinates. Left uncorrected, this may lead to flip-angle variations in excitation profiles. In the proposed method, SEFs (k-space samples) are selected using a Matching-Pursuit algorithm, and the RF pulse is designed using a Conjugate-Gradient algorithm. Three variants of the proposed approach are given: the full-algorithm, a computationally-cheaper version, and a third version for designing spoke-based trajectories. The method is demonstrated for various target excitation profiles using simulations and phantom experiments. Results The method is compared to other iterative (Matching-Pursuit and Conjugate Gradient) and non-iterative (coordinate-transformation and Jacobian-based) pulse design methods as well as uniform density spiral and EPI trajectories. The results show that the proposed method can increase excitation fidelity significantly. Conclusion An iterative method for designing k-space trajectories and RF pulses using nonlinear gradient fields is proposed. The method can either be used for selecting the SEFs individually to guide trajectory design, or can be adapted to design and optimize specific trajectories of interest. PMID:25203286
A comparison of several methods of solving nonlinear regression groundwater flow problems
Cooley, Richard L.
1985-01-01
Computational efficiency and computer memory requirements for four methods of minimizing functions were compared for four test nonlinear-regression steady state groundwater flow problems. The fastest methods were the Marquardt and quasi-linearization methods, which required almost identical computer times and numbers of iterations; the next fastest was the quasi-Newton method, and last was the Fletcher-Reeves method, which did not converge in 100 iterations for two of the problems. The fastest method per iteration was the Fletcher-Reeves method, and this was followed closely by the quasi-Newton method. The Marquardt and quasi-linearization methods were slower. For all four methods the speed per iteration was directly related to the number of parameters in the model. However, this effect was much more pronounced for the Marquardt and quasi-linearization methods than for the other two. Hence the quasi-Newton (and perhaps Fletcher-Reeves) method might be more efficient than either the Marquardt or quasi-linearization methods if the number of parameters in a particular model were large, although this remains to be proven. The Marquardt method required somewhat less central memory than the quasi-linearization metilod for three of the four problems. For all four problems the quasi-Newton method required roughly two thirds to three quarters of the memory required by the Marquardt method, and the Fletcher-Reeves method required slightly less memory than the quasi-Newton method. Memory requirements were not excessive for any of the four methods.
Adaptively Tuned Iterative Low Dose CT Image Denoising
Hashemi, SayedMasoud; Paul, Narinder S.; Beheshti, Soosan; Cobbold, Richard S. C.
2015-01-01
Improving image quality is a critical objective in low dose computed tomography (CT) imaging and is the primary focus of CT image denoising. State-of-the-art CT denoising algorithms are mainly based on iterative minimization of an objective function, in which the performance is controlled by regularization parameters. To achieve the best results, these should be chosen carefully. However, the parameter selection is typically performed in an ad hoc manner, which can cause the algorithms to converge slowly or become trapped in a local minimum. To overcome these issues a noise confidence region evaluation (NCRE) method is used, which evaluates the denoising residuals iteratively and compares their statistics with those produced by additive noise. It then updates the parameters at the end of each iteration to achieve a better match to the noise statistics. By combining NCRE with the fundamentals of block matching and 3D filtering (BM3D) approach, a new iterative CT image denoising method is proposed. It is shown that this new denoising method improves the BM3D performance in terms of both the mean square error and a structural similarity index. Moreover, simulations and patient results show that this method preserves the clinically important details of low dose CT images together with a substantial noise reduction. PMID:26089972
NASA Astrophysics Data System (ADS)
Lavery, N.; Taylor, C.
1999-07-01
Multigrid and iterative methods are used to reduce the solution time of the matrix equations which arise from the finite element (FE) discretisation of the time-independent equations of motion of the incompressible fluid in turbulent motion. Incompressible flow is solved by using the method of reduce interpolation for the pressure to satisfy the Brezzi-Babuska condition. The k-l model is used to complete the turbulence closure problem. The non-symmetric iterative matrix methods examined are the methods of least squares conjugate gradient (LSCG), biconjugate gradient (BCG), conjugate gradient squared (CGS), and the biconjugate gradient squared stabilised (BCGSTAB). The multigrid algorithm applied is based on the FAS algorithm of Brandt, and uses two and three levels of grids with a V-cycling schedule. These methods are all compared to the non-symmetric frontal solver. Copyright
Degrande, G; Lombaert, G
2001-09-01
In Krylov's analytical prediction model, the free field vibration response during the passage of a train is written as the superposition of the effect of all sleeper forces, using Lamb's approximate solution for the Green's function of a halfspace. When this formulation is extended with the Green's functions of a layered soil, considerable computational effort is required if these Green's functions are needed in a wide range of source-receiver distances and frequencies. It is demonstrated in this paper how the free field response can alternatively be computed, using the dynamic reciprocity theorem, applied to moving loads. The formulation is based on the response of the soil due to the moving load distribution for a single axle load. The equations are written in the wave-number-frequency domain, accounting for the invariance of the geometry in the direction of the track. The approach allows for a very efficient calculation of the free field vibration response, distinguishing the quasistatic contribution from the effect of the sleeper passage frequency and its higher harmonics. The methodology is validated by means of in situ vibration measurements during the passage of a Thalys high-speed train on the track between Brussels and Paris. It is shown that the model has good predictive capabilities in the near field at low and high frequencies, but underestimates the response in the midfrequency band.
Improved evaluation of optical depth components from Langley plot data
NASA Technical Reports Server (NTRS)
Biggar, S. F.; Gellman, D. I.; Slater, P. N.
1990-01-01
A simple, iterative procedure to determine the optical depth components of the extinction optical depth measured by a solar radiometer is presented. Simulated data show that the iterative procedure improves the determination of the exponent of a Junge law particle size distribution. The determination of the optical depth due to aerosol scattering is improved as compared to a method which uses only two points from the extinction data. The iterative method was used to determine spectral optical depth components for June 11-13, 1988 during the MAC III experiment.
Three-dimensional inverse modelling of damped elastic wave propagation in the Fourier domain
NASA Astrophysics Data System (ADS)
Petrov, Petr V.; Newman, Gregory A.
2014-09-01
3-D full waveform inversion (FWI) of seismic wavefields is routinely implemented with explicit time-stepping simulators. A clear advantage of explicit time stepping is the avoidance of solving large-scale implicit linear systems that arise with frequency domain formulations. However, FWI using explicit time stepping may require a very fine time step and (as a consequence) significant computational resources and run times. If the computational challenges of wavefield simulation can be effectively handled, an FWI scheme implemented within the frequency domain utilizing only a few frequencies, offers a cost effective alternative to FWI in the time domain. We have therefore implemented a 3-D FWI scheme for elastic wave propagation in the Fourier domain. To overcome the computational bottleneck in wavefield simulation, we have exploited an efficient Krylov iterative solver for the elastic wave equations approximated with second and fourth order finite differences. The solver does not exploit multilevel preconditioning for wavefield simulation, but is coupled efficiently to the inversion iteration workflow to reduce computational cost. The workflow is best described as a series of sequential inversion experiments, where in the case of seismic reflection acquisition geometries, the data has been laddered such that we first image highly damped data, followed by data where damping is systemically reduced. The key to our modelling approach is its ability to take advantage of solver efficiency when the elastic wavefields are damped. As the inversion experiment progresses, damping is significantly reduced, effectively simulating non-damped wavefields in the Fourier domain. While the cost of the forward simulation increases as damping is reduced, this is counterbalanced by the cost of the outer inversion iteration, which is reduced because of a better starting model obtained from the larger damped wavefield used in the previous inversion experiment. For cross-well data, it is also possible to launch a successful inversion experiment without laddering the damping constants. With this type of acquisition geometry, the solver is still quite effective using a small fixed damping constant. To avoid cycle skipping, we also employ a multiscale imaging approach, in which frequency content of the data is also laddered (with the data now including both reflection and cross-well data acquisition geometries). Thus the inversion process is launched using low frequency data to first recover the long spatial wavelength of the image. With this image as a new starting model, adding higher frequency data refines and enhances the resolution of the image. FWI using laddered frequencies with an efficient damping schemed enables reconstructing elastic attributes of the subsurface at a resolution that approaches half the smallest wavelength utilized to image the subsurface. We show the possibility of effectively carrying out such reconstructions using two to six frequencies, depending upon the application. Using the proposed FWI scheme, massively parallel computing resources are essential for reasonable execution times.
NASA Technical Reports Server (NTRS)
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
A general iterative procedure is given for determining the consistent maximum likelihood estimates of normal distributions. In addition, a local maximum of the log-likelihood function, Newtons's method, a method of scoring, and modifications of these procedures are discussed.
A study on Marangoni convection by the variational iteration method
NASA Astrophysics Data System (ADS)
Karaoǧlu, Onur; Oturanç, Galip
2012-09-01
In this paper, we will consider the use of the variational iteration method and Padé approximant for finding approximate solutions for a Marangoni convection induced flow over a free surface due to an imposed temperature gradient. The solutions are compared with the numerical (fourth-order Runge Kutta) solutions.
Iteration of ultrasound aberration correction methods
NASA Astrophysics Data System (ADS)
Maasoey, Svein-Erik; Angelsen, Bjoern; Varslot, Trond
2004-05-01
Aberration in ultrasound medical imaging is usually modeled by time-delay and amplitude variations concentrated on the transmitting/receiving array. This filter process is here denoted a TDA filter. The TDA filter is an approximation to the physical aberration process, which occurs over an extended part of the human body wall. Estimation of the TDA filter, and performing correction on transmit and receive, has proven difficult. It has yet to be shown that this method works adequately for severe aberration. Estimation of the TDA filter can be iterated by retransmitting a corrected signal and re-estimate until a convergence criterion is fulfilled (adaptive imaging). Two methods for estimating time-delay and amplitude variations in receive signals from random scatterers have been developed. One method correlates each element signal with a reference signal. The other method use eigenvalue decomposition of the receive cross-spectrum matrix, based upon a receive energy-maximizing criterion. Simulations of iterating aberration correction with a TDA filter have been investigated to study its convergence properties. A weak and strong human-body wall model generated aberration. Both emulated the human abdominal wall. Results after iteration improve aberration correction substantially, and both estimation methods converge, even for the case of strong aberration.
Flexible Method for Developing Tactics, Techniques, and Procedures for Future Capabilities
2009-02-01
levels of ability, military experience, and motivation, (b) number and type of significant events, and (c) other sources of natural variability...research has developed a number of specific instruments designed to aid in this process. Second, the iterative, feed-forward nature of the method allows...FLEX method), but still lack the structured KE approach and iterative, feed-forward nature of the FLEX method. To facilitate decision making
An iterative method for the Helmholtz equation
NASA Technical Reports Server (NTRS)
Bayliss, A.; Goldstein, C. I.; Turkel, E.
1983-01-01
An iterative algorithm for the solution of the Helmholtz equation is developed. The algorithm is based on a preconditioned conjugate gradient iteration for the normal equations. The preconditioning is based on an SSOR sweep for the discrete Laplacian. Numerical results are presented for a wide variety of problems of physical interest and demonstrate the effectiveness of the algorithm.
An iterative method for obtaining the optimum lightning location on a spherical surface
NASA Technical Reports Server (NTRS)
Chao, Gao; Qiming, MA
1991-01-01
A brief introduction to the basic principles of an eigen method used to obtain the optimum source location of lightning is presented. The location of the optimum source is obtained by using multiple direction finders (DF's) on a spherical surface. An improvement of this method, which takes the distance of source-DF's as a constant, is presented. It is pointed out that using a weight factor of signal strength is not the most ideal method because of the inexact inverse signal strength-distance relation and the inaccurate signal amplitude. An iterative calculation method is presented using the distance from the source to the DF as a weight factor. This improved method has higher accuracy and needs only a little more calculation time. Some computer simulations for a 4DF system are presented to show the improvement of location through use of the iterative method.
On iterative algorithms for quantitative photoacoustic tomography in the radiative transport regime
NASA Astrophysics Data System (ADS)
Wang, Chao; Zhou, Tie
2017-11-01
In this paper, we present a numerical reconstruction method for quantitative photoacoustic tomography (QPAT), based on the radiative transfer equation (RTE), which models light propagation more accurately than diffusion approximation (DA). We investigate the reconstruction of absorption coefficient and scattering coefficient of biological tissues. An improved fixed-point iterative method to retrieve the absorption coefficient, given the scattering coefficient, is proposed for its cheap computational cost; the convergence of this method is also proved. The Barzilai-Borwein (BB) method is applied to retrieve two coefficients simultaneously. Since the reconstruction of optical coefficients involves the solutions of original and adjoint RTEs in the framework of optimization, an efficient solver with high accuracy is developed from Gao and Zhao (2009 Transp. Theory Stat. Phys. 38 149-92). Simulation experiments illustrate that the improved fixed-point iterative method and the BB method are competitive methods for QPAT in the relevant cases.
An analytically iterative method for solving problems of cosmic-ray modulation
NASA Astrophysics Data System (ADS)
Kolesnyk, Yuriy L.; Bobik, Pavol; Shakhov, Boris A.; Putis, Marian
2017-09-01
The development of an analytically iterative method for solving steady-state as well as unsteady-state problems of cosmic-ray (CR) modulation is proposed. Iterations for obtaining the solutions are constructed for the spherically symmetric form of the CR propagation equation. The main solution of the considered problem consists of the zero-order solution that is obtained during the initial iteration and amendments that may be obtained by subsequent iterations. The finding of the zero-order solution is based on the CR isotropy during propagation in the space, whereas the anisotropy is taken into account when finding the next amendments. To begin with, the method is applied to solve the problem of CR modulation where the diffusion coefficient κ and the solar wind speed u are constants with an Local Interstellar Spectra (LIS) spectrum. The solution obtained with two iterations was compared with an analytical solution and with numerical solutions. Finally, solutions that have only one iteration for two problems of CR modulation with u = constant and the same form of LIS spectrum were obtained and tested against numerical solutions. For the first problem, κ is proportional to the momentum of the particle p, so it has the form κ = k0η, where η =p/m_0c. For the second problem, the diffusion coefficient is given in the form κ = k0βη, where β =v/c is the particle speed relative to the speed of light. There was a good matching of the obtained solutions with the numerical solutions as well as with the analytical solution for the problem where κ = constant.
Multiscale modeling and computation of nano-electronic transistors and transmembrane proton channels
NASA Astrophysics Data System (ADS)
Chen, Duan
The miniaturization of nano-scale electronic transistors, such as metal oxide semiconductor field effect transistors (MOSFETs), has given rise to a pressing demand in the new theoretical understanding and practical tactic for dealing with quantum mechanical effects in integrated circuits. In biology, proton dynamics and transport across membrane proteins are of paramount importance to the normal function of living cells. Similar physical characteristics are behind the two subjects, and model simulations share common mathematical interests/challenges. In this thesis work, multiscale and multiphysical models are proposed to study the mechanisms of nanotransistors and proton transport in transmembrane at the atomic level. For nano-electronic transistors, we introduce a unified two-scale energy functional to describe the electrons and the continuum electrostatic potential. This framework enables us to put microscopic and macroscopic descriptions on an equal footing at nano-scale. Additionally, this model includes layered structures and random doping effect of nano-transistors. For transmembrane proton channels, we describe proton dynamics quantum mechanically via a density functional approach while implicitly treat numerous solvent molecules as a dielectric continuum. The densities of all other ions in the solvent are assumed to obey the Boltzmann distribution. The impact of protein molecular structure and its charge polarization on the proton transport is considered in atomic details. We formulate a total free energy functional to include kinetic and potential energies of protons, as well as electrostatic energy of all other ions on an equal footing. For both nano-transistors and proton channels systems, the variational principle is employed to derive nonlinear governing equations. The Poisson-Kohn-Sham equations are derived for nano-transistors while the generalized Poisson-Boltzmann equation and Kohn-Sham equation are obtained for proton channels. Related numerical challenges in simulations are addressed: the matched interface and boundary (MIB) method, the Dirichlet-to-Neumann mapping (DNM) technique, and the Krylov subspace and preconditioner theory are introduced to improve the computational efficiency of the Poisson-type equation. The quantum transport theory is employed to solve the Kohn-Sham equation. The Gummel iteration and relaxation technique are utilized for overall self-consistent iterations. Finally, applications are considered and model validations are verified by realistic nano-transistors and transmembrane proteins. Two distinct device configurations, a double-gate MOSFET and a four-gate MOSFET, are considered in our threedimensional numerical simulations. For these devices, the current uctuation and voltage threshold lowering effect induced by discrete dopants are explored. For proton transport, a realistic channel protein, the Gramicidin A (GA) is used to demonstrate the performance of the proposed proton channel model and validate the efficiency of the proposed mathematical algorithms. The electrostatic characteristics of the GA channel is analyzed with a wide range of model parameters. Proton channel conductances are studied over a number of applied voltages and reference concentrations. Comparisons with experimental data are utilized to verify our model predictions.
Highly Productive Application Development with ViennaCL for Accelerators
NASA Astrophysics Data System (ADS)
Rupp, K.; Weinbub, J.; Rudolf, F.
2012-12-01
The use of graphics processing units (GPUs) for the acceleration of general purpose computations has become very attractive over the last years, and accelerators based on many integrated CPU cores are about to hit the market. However, there are discussions about the benefit of GPU computing when comparing the reduction of execution times with the increased development effort [1]. To counter these concerns, our open-source linear algebra library ViennaCL [2,3] uses modern programming techniques such as generic programming in order to provide a convenient access layer for accelerator and GPU computing. Other GPU-accelerated libraries are primarily tuned for performance, but less tailored to productivity and portability: MAGMA [4] provides dense linear algebra operations via a LAPACK-comparable interface, but no dedicated matrix and vector types. Cusp [5] is closest in functionality to ViennaCL for sparse matrices, but is based on CUDA and thus restricted to devices from NVIDIA. However, no convenience layer for dense linear algebra is provided with Cusp. ViennaCL is written in C++ and uses OpenCL to access the resources of accelerators, GPUs and multi-core CPUs in a unified way. On the one hand, the library provides iterative solvers from the family of Krylov methods, including various preconditioners, for the solution of linear systems typically obtained from the discretization of partial differential equations. On the other hand, dense linear algebra operations are supported, including algorithms such as QR factorization and singular value decomposition. The user application interface of ViennaCL is compatible to uBLAS [6], which is part of the peer-reviewed Boost C++ libraries [7]. This allows to port existing applications based on uBLAS with a minimum of effort to ViennaCL. Conversely, the interface compatibility allows to use the iterative solvers from ViennaCL with uBLAS types directly, thus enabling code reuse beyond CPU-GPU boundaries. Out-of-the-box support for types from the Eigen library [8] and MTL 4 [9] are provided as well, enabling a seamless transition from single-core CPU to GPU and multi-core CPU computations. Case studies from the numerical solution of PDEs are given and isolated performance benchmarks are discussed. Also, pitfalls in scientific computing with GPUs and accelerators are addressed, allowing for a first evaluation of whether these novel devices can be mapped well to certain applications. References: [1] R. Bordawekar et al., Technical Report, IBM, 2010 [2] ViennaCL library. Online: http://viennacl.sourceforge.net/ [3] K. Rupp et al., GPUScA, 2010 [4] MAGMA library. Online: http://icl.cs.utk.edu/magma/ [5] Cusp library. Online: http://code.google.com/p/cusp-library/ [6] uBLAS library. Online: http://www.boost.org/libs/numeric/ublas/ [7] Boost C++ Libraries. Online: http://www.boost.org/ [8] Eigen library. Online: http://eigen.tuxfamily.org/ [9] MTL 4 Library. Online: http://www.mtl4.org/
Nonnegative least-squares image deblurring: improved gradient projection approaches
NASA Astrophysics Data System (ADS)
Benvenuto, F.; Zanella, R.; Zanni, L.; Bertero, M.
2010-02-01
The least-squares approach to image deblurring leads to an ill-posed problem. The addition of the nonnegativity constraint, when appropriate, does not provide regularization, even if, as far as we know, a thorough investigation of the ill-posedness of the resulting constrained least-squares problem has still to be done. Iterative methods, converging to nonnegative least-squares solutions, have been proposed. Some of them have the 'semi-convergence' property, i.e. early stopping of the iteration provides 'regularized' solutions. In this paper we consider two of these methods: the projected Landweber (PL) method and the iterative image space reconstruction algorithm (ISRA). Even if they work well in many instances, they are not frequently used in practice because, in general, they require a large number of iterations before providing a sensible solution. Therefore, the main purpose of this paper is to refresh these methods by increasing their efficiency. Starting from the remark that PL and ISRA require only the computation of the gradient of the functional, we propose the application to these algorithms of special acceleration techniques that have been recently developed in the area of the gradient methods. In particular, we propose the application of efficient step-length selection rules and line-search strategies. Moreover, remarking that ISRA is a scaled gradient algorithm, we evaluate its behaviour in comparison with a recent scaled gradient projection (SGP) method for image deblurring. Numerical experiments demonstrate that the accelerated methods still exhibit the semi-convergence property, with a considerable gain both in the number of iterations and in the computational time; in particular, SGP appears definitely the most efficient one.
Physics Model-Based Scatter Correction in Multi-Source Interior Computed Tomography.
Gong, Hao; Li, Bin; Jia, Xun; Cao, Guohua
2018-02-01
Multi-source interior computed tomography (CT) has a great potential to provide ultra-fast and organ-oriented imaging at low radiation dose. However, X-ray cross scattering from multiple simultaneously activated X-ray imaging chains compromises imaging quality. Previously, we published two hardware-based scatter correction methods for multi-source interior CT. Here, we propose a software-based scatter correction method, with the benefit of no need for hardware modifications. The new method is based on a physics model and an iterative framework. The physics model was derived analytically, and was used to calculate X-ray scattering signals in both forward direction and cross directions in multi-source interior CT. The physics model was integrated to an iterative scatter correction framework to reduce scatter artifacts. The method was applied to phantom data from both Monte Carlo simulations and physical experimentation that were designed to emulate the image acquisition in a multi-source interior CT architecture recently proposed by our team. The proposed scatter correction method reduced scatter artifacts significantly, even with only one iteration. Within a few iterations, the reconstructed images fast converged toward the "scatter-free" reference images. After applying the scatter correction method, the maximum CT number error at the region-of-interests (ROIs) was reduced to 46 HU in numerical phantom dataset and 48 HU in physical phantom dataset respectively, and the contrast-noise-ratio at those ROIs increased by up to 44.3% and up to 19.7%, respectively. The proposed physics model-based iterative scatter correction method could be useful for scatter correction in dual-source or multi-source CT.
Unsupervised iterative detection of land mines in highly cluttered environments.
Batman, Sinan; Goutsias, John
2003-01-01
An unsupervised iterative scheme is proposed for land mine detection in heavily cluttered scenes. This scheme is based on iterating hybrid multispectral filters that consist of a decorrelating linear transform coupled with a nonlinear morphological detector. Detections extracted from the first pass are used to improve results in subsequent iterations. The procedure stops after a predetermined number of iterations. The proposed scheme addresses several weaknesses associated with previous adaptations of morphological approaches to land mine detection. Improvement in detection performance, robustness with respect to clutter inhomogeneities, a completely unsupervised operation, and computational efficiency are the main highlights of the method. Experimental results reveal excellent performance.
Analysis of Anderson Acceleration on a Simplified Neutronics/Thermal Hydraulics System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Toth, Alex; Kelley, C. T.; Slattery, Stuart R
ABSTRACT A standard method for solving coupled multiphysics problems in light water reactors is Picard iteration, which sequentially alternates between solving single physics applications. This solution approach is appealing due to simplicity of implementation and the ability to leverage existing software packages to accurately solve single physics applications. However, there are several drawbacks in the convergence behavior of this method; namely slow convergence and the necessity of heuristically chosen damping factors to achieve convergence in many cases. Anderson acceleration is a method that has been seen to be more robust and fast converging than Picard iteration for many problems, withoutmore » significantly higher cost per iteration or complexity of implementation, though its effectiveness in the context of multiphysics coupling is not well explored. In this work, we develop a one-dimensional model simulating the coupling between the neutron distribution and fuel and coolant properties in a single fuel pin. We show that this model generally captures the convergence issues noted in Picard iterations which couple high-fidelity physics codes. We then use this model to gauge potential improvements with regard to rate of convergence and robustness from utilizing Anderson acceleration as an alternative to Picard iteration.« less
NASA Astrophysics Data System (ADS)
Al-Chalabi, Rifat M. Khalil
1997-09-01
Development of an improvement to the computational efficiency of the existing nested iterative solution strategy of the Nodal Exapansion Method (NEM) nodal based neutron diffusion code NESTLE is presented. The improvement in the solution strategy is the result of developing a multilevel acceleration scheme that does not suffer from the numerical stalling associated with a number of iterative solution methods. The acceleration scheme is based on the multigrid method, which is specifically adapted for incorporation into the NEM nonlinear iterative strategy. This scheme optimizes the computational interplay between the spatial discretization and the NEM nonlinear iterative solution process through the use of the multigrid method. The combination of the NEM nodal method, calculation of the homogenized, neutron nodal balance coefficients (i.e. restriction operator), efficient underlying smoothing algorithm (power method of NESTLE), and the finer mesh reconstruction algorithm (i.e. prolongation operator), all operating on a sequence of coarser spatial nodes, constitutes the multilevel acceleration scheme employed in this research. Two implementations of the multigrid method into the NESTLE code were examined; the Imbedded NEM Strategy and the Imbedded CMFD Strategy. The main difference in implementation between the two methods is that in the Imbedded NEM Strategy, the NEM solution is required at every MG level. Numerical tests have shown that the Imbedded NEM Strategy suffers from divergence at coarse- grid levels, hence all the results for the different benchmarks presented here were obtained using the Imbedded CMFD Strategy. The novelties in the developed MG method are as follows: the formulation of the restriction and prolongation operators, and the selection of the relaxation method. The restriction operator utilizes a variation of the reactor physics, consistent homogenization technique. The prolongation operator is based upon a variant of the pin power reconstruction methodology. The relaxation method, which is the power method, utilizes a constant coefficient matrix within the NEM non-linear iterative strategy. The choice of the MG nesting within the nested iterative strategy enables the incorporation of other non-linear effects with no additional coding effort. In addition, if an eigenvalue problem is being solved, it remains an eigenvalue problem at all grid levels, simplifying coding implementation. The merit of the developed MG method was tested by incorporating it into the NESTLE iterative solver, and employing it to solve four different benchmark problems. In addition to the base cases, three different sensitivity studies are performed, examining the effects of number of MG levels, homogenized coupling coefficients correction (i.e. restriction operator), and fine-mesh reconstruction algorithm (i.e. prolongation operator). The multilevel acceleration scheme developed in this research provides the foundation for developing adaptive multilevel acceleration methods for steady-state and transient NEM nodal neutron diffusion equations. (Abstract shortened by UMI.)
Anderson Acceleration for Fixed-Point Iterations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walker, Homer F.
The purpose of this grant was to support research on acceleration methods for fixed-point iterations, with applications to computational frameworks and simulation problems that are of interest to DOE.
Chen, Tinggui; Xiao, Renbin
2014-01-01
Due to fierce market competition, how to improve product quality and reduce development cost determines the core competitiveness of enterprises. However, design iteration generally causes increases of product cost and delays of development time as well, so how to identify and model couplings among tasks in product design and development has become an important issue for enterprises to settle. In this paper, the shortcomings existing in WTM model are discussed and tearing approach as well as inner iteration method is used to complement the classic WTM model. In addition, the ABC algorithm is also introduced to find out the optimal decoupling schemes. In this paper, firstly, tearing approach and inner iteration method are analyzed for solving coupled sets. Secondly, a hybrid iteration model combining these two technologies is set up. Thirdly, a high-performance swarm intelligence algorithm, artificial bee colony, is adopted to realize problem-solving. Finally, an engineering design of a chemical processing system is given in order to verify its reasonability and effectiveness.
de Oliveira, Tiago E.; Netz, Paulo A.; Kremer, Kurt; ...
2016-05-03
We present a coarse-graining strategy that we test for aqueous mixtures. The method uses pair-wise cumulative coordination as a target function within an iterative Boltzmann inversion (IBI) like protocol. We name this method coordination iterative Boltzmann inversion (C–IBI). While the underlying coarse-grained model is still structure based and, thus, preserves pair-wise solution structure, our method also reproduces solvation thermodynamics of binary and/or ternary mixtures. In addition, we observe much faster convergence within C–IBI compared to IBI. To validate the robustness, we apply C–IBI to study test cases of solvation thermodynamics of aqueous urea and a triglycine solvation in aqueous urea.
NASA Astrophysics Data System (ADS)
Ahunov, Roman R.; Kuksenko, Sergey P.; Gazizov, Talgat R.
2016-06-01
A multiple solution of linear algebraic systems with dense matrix by iterative methods is considered. To accelerate the process, the recomputing of the preconditioning matrix is used. A priory condition of the recomputing based on change of the arithmetic mean of the current solution time during the multiple solution is proposed. To confirm the effectiveness of the proposed approach, the numerical experiments using iterative methods BiCGStab and CGS for four different sets of matrices on two examples of microstrip structures are carried out. For solution of 100 linear systems the acceleration up to 1.6 times, compared to the approach without recomputing, is obtained.
NASA Astrophysics Data System (ADS)
Imamura, Seigo; Ono, Kenji; Yokokawa, Mitsuo
2016-07-01
Ensemble computing, which is an instance of capacity computing, is an effective computing scenario for exascale parallel supercomputers. In ensemble computing, there are multiple linear systems associated with a common coefficient matrix. We improve the performance of iterative solvers for multiple vectors by solving them at the same time, that is, by solving for the product of the matrices. We implemented several iterative methods and compared their performance. The maximum performance on Sparc VIIIfx was 7.6 times higher than that of a naïve implementation. Finally, to deal with the different convergence processes of linear systems, we introduced a control method to eliminate the calculation of already converged vectors.
Radiation dose reduction in medical x-ray CT via Fourier-based iterative reconstruction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fahimian, Benjamin P.; Zhao Yunzhe; Huang Zhifeng
Purpose: A Fourier-based iterative reconstruction technique, termed Equally Sloped Tomography (EST), is developed in conjunction with advanced mathematical regularization to investigate radiation dose reduction in x-ray CT. The method is experimentally implemented on fan-beam CT and evaluated as a function of imaging dose on a series of image quality phantoms and anonymous pediatric patient data sets. Numerical simulation experiments are also performed to explore the extension of EST to helical cone-beam geometry. Methods: EST is a Fourier based iterative algorithm, which iterates back and forth between real and Fourier space utilizing the algebraically exact pseudopolar fast Fourier transform (PPFFT). Inmore » each iteration, physical constraints and mathematical regularization are applied in real space, while the measured data are enforced in Fourier space. The algorithm is automatically terminated when a proposed termination criterion is met. Experimentally, fan-beam projections were acquired by the Siemens z-flying focal spot technology, and subsequently interleaved and rebinned to a pseudopolar grid. Image quality phantoms were scanned at systematically varied mAs settings, reconstructed by EST and conventional reconstruction methods such as filtered back projection (FBP), and quantified using metrics including resolution, signal-to-noise ratios (SNRs), and contrast-to-noise ratios (CNRs). Pediatric data sets were reconstructed at their original acquisition settings and additionally simulated to lower dose settings for comparison and evaluation of the potential for radiation dose reduction. Numerical experiments were conducted to quantify EST and other iterative methods in terms of image quality and computation time. The extension of EST to helical cone-beam CT was implemented by using the advanced single-slice rebinning (ASSR) method. Results: Based on the phantom and pediatric patient fan-beam CT data, it is demonstrated that EST reconstructions with the lowest scanner flux setting of 39 mAs produce comparable image quality, resolution, and contrast relative to FBP with the 140 mAs flux setting. Compared to the algebraic reconstruction technique and the expectation maximization statistical reconstruction algorithm, a significant reduction in computation time is achieved with EST. Finally, numerical experiments on helical cone-beam CT data suggest that the combination of EST and ASSR produces reconstructions with higher image quality and lower noise than the Feldkamp Davis and Kress (FDK) method and the conventional ASSR approach. Conclusions: A Fourier-based iterative method has been applied to the reconstruction of fan-bean CT data with reduced x-ray fluence. This method incorporates advantageous features in both real and Fourier space iterative schemes: using a fast and algebraically exact method to calculate forward projection, enforcing the measured data in Fourier space, and applying physical constraints and flexible regularization in real space. Our results suggest that EST can be utilized for radiation dose reduction in x-ray CT via the readily implementable technique of lowering mAs settings. Numerical experiments further indicate that EST requires less computation time than several other iterative algorithms and can, in principle, be extended to helical cone-beam geometry in combination with the ASSR method.« less
Radiation dose reduction in medical x-ray CT via Fourier-based iterative reconstruction
Fahimian, Benjamin P.; Zhao, Yunzhe; Huang, Zhifeng; Fung, Russell; Mao, Yu; Zhu, Chun; Khatonabadi, Maryam; DeMarco, John J.; Osher, Stanley J.; McNitt-Gray, Michael F.; Miao, Jianwei
2013-01-01
Purpose: A Fourier-based iterative reconstruction technique, termed Equally Sloped Tomography (EST), is developed in conjunction with advanced mathematical regularization to investigate radiation dose reduction in x-ray CT. The method is experimentally implemented on fan-beam CT and evaluated as a function of imaging dose on a series of image quality phantoms and anonymous pediatric patient data sets. Numerical simulation experiments are also performed to explore the extension of EST to helical cone-beam geometry. Methods: EST is a Fourier based iterative algorithm, which iterates back and forth between real and Fourier space utilizing the algebraically exact pseudopolar fast Fourier transform (PPFFT). In each iteration, physical constraints and mathematical regularization are applied in real space, while the measured data are enforced in Fourier space. The algorithm is automatically terminated when a proposed termination criterion is met. Experimentally, fan-beam projections were acquired by the Siemens z-flying focal spot technology, and subsequently interleaved and rebinned to a pseudopolar grid. Image quality phantoms were scanned at systematically varied mAs settings, reconstructed by EST and conventional reconstruction methods such as filtered back projection (FBP), and quantified using metrics including resolution, signal-to-noise ratios (SNRs), and contrast-to-noise ratios (CNRs). Pediatric data sets were reconstructed at their original acquisition settings and additionally simulated to lower dose settings for comparison and evaluation of the potential for radiation dose reduction. Numerical experiments were conducted to quantify EST and other iterative methods in terms of image quality and computation time. The extension of EST to helical cone-beam CT was implemented by using the advanced single-slice rebinning (ASSR) method. Results: Based on the phantom and pediatric patient fan-beam CT data, it is demonstrated that EST reconstructions with the lowest scanner flux setting of 39 mAs produce comparable image quality, resolution, and contrast relative to FBP with the 140 mAs flux setting. Compared to the algebraic reconstruction technique and the expectation maximization statistical reconstruction algorithm, a significant reduction in computation time is achieved with EST. Finally, numerical experiments on helical cone-beam CT data suggest that the combination of EST and ASSR produces reconstructions with higher image quality and lower noise than the Feldkamp Davis and Kress (FDK) method and the conventional ASSR approach. Conclusions: A Fourier-based iterative method has been applied to the reconstruction of fan-bean CT data with reduced x-ray fluence. This method incorporates advantageous features in both real and Fourier space iterative schemes: using a fast and algebraically exact method to calculate forward projection, enforcing the measured data in Fourier space, and applying physical constraints and flexible regularization in real space. Our results suggest that EST can be utilized for radiation dose reduction in x-ray CT via the readily implementable technique of lowering mAs settings. Numerical experiments further indicate that EST requires less computation time than several other iterative algorithms and can, in principle, be extended to helical cone-beam geometry in combination with the ASSR method. PMID:23464329
Analysis of the iteratively regularized Gauss-Newton method under a heuristic rule
NASA Astrophysics Data System (ADS)
Jin, Qinian; Wang, Wei
2018-03-01
The iteratively regularized Gauss-Newton method is one of the most prominent regularization methods for solving nonlinear ill-posed inverse problems when the data is corrupted by noise. In order to produce a useful approximate solution, this iterative method should be terminated properly. The existing a priori and a posteriori stopping rules require accurate information on the noise level, which may not be available or reliable in practical applications. In this paper we propose a heuristic selection rule for this regularization method, which requires no information on the noise level. By imposing certain conditions on the noise, we derive a posteriori error estimates on the approximate solutions under various source conditions. Furthermore, we establish a convergence result without using any source condition. Numerical results are presented to illustrate the performance of our heuristic selection rule.
NASA Astrophysics Data System (ADS)
Fu, Lin; Hu, Xiangyu Y.; Adams, Nikolaus A.
2017-12-01
We propose efficient single-step formulations for reinitialization and extending algorithms, which are critical components of level-set based interface-tracking methods. The level-set field is reinitialized with a single-step (non iterative) "forward tracing" algorithm. A minimum set of cells is defined that describes the interface, and reinitialization employs only data from these cells. Fluid states are extrapolated or extended across the interface by a single-step "backward tracing" algorithm. Both algorithms, which are motivated by analogy to ray-tracing, avoid multiple block-boundary data exchanges that are inevitable for iterative reinitialization and extending approaches within a parallel-computing environment. The single-step algorithms are combined with a multi-resolution conservative sharp-interface method and validated by a wide range of benchmark test cases. We demonstrate that the proposed reinitialization method achieves second-order accuracy in conserving the volume of each phase. The interface location is invariant to reapplication of the single-step reinitialization. Generally, we observe smaller absolute errors than for standard iterative reinitialization on the same grid. The computational efficiency is higher than for the standard and typical high-order iterative reinitialization methods. We observe a 2- to 6-times efficiency improvement over the standard method for serial execution. The proposed single-step extending algorithm, which is commonly employed for assigning data to ghost cells with ghost-fluid or conservative interface interaction methods, shows about 10-times efficiency improvement over the standard method while maintaining same accuracy. Despite their simplicity, the proposed algorithms offer an efficient and robust alternative to iterative reinitialization and extending methods for level-set based multi-phase simulations.
Weak turbulence simulations with the Hermite-Fourier spectral method
NASA Astrophysics Data System (ADS)
Vencels, Juris; Delzanno, Gian Luca; Manzini, Gianmarco; Roytershteyn, Vadim; Markidis, Stefano
2015-11-01
Recently, a new (transform) method based on a Fourier-Hermite (FH) discretization of the Vlasov-Maxwell equations has been developed. The resulting set of moment equations is discretized implicitly in time with a Crank-Nicolson scheme and solved with a nonlinear Newton-Krylov technique. For periodic boundary conditions, this discretization delivers a scheme that conserves the total mass, momentum and energy of the system exactly. In this work, we apply the FH method to study a problem of Langmuir turbulence, where a low signal-to-noise ratio is important to follow the turbulent cascade and might require a lot of computational resources if studied with PIC. We simulate a weak (low density) electron beam moving in a Maxwellian plasma and subject to an instability that generates Langmuir waves and a weak turbulence field. We also discuss some optimization techniques to optimally select the Hermite basis in terms of its shift and scaling argument, and show that this technique improve the overall accuracy of the method. Finally, we discuss the applicability of the HF method for studying kinetic plasma turbulence. This work was funded by LDRD under the auspices of the NNSA of the U.S. by LANL under contract DE-AC52-06NA25396 and by EC through the EPiGRAM project (grant agreement no. 610598. epigram-project.eu).
Discrete fourier transform (DFT) analysis for applications using iterative transform methods
NASA Technical Reports Server (NTRS)
Dean, Bruce H. (Inventor)
2012-01-01
According to various embodiments, a method is provided for determining aberration data for an optical system. The method comprises collecting a data signal, and generating a pre-transformation algorithm. The data is pre-transformed by multiplying the data with the pre-transformation algorithm. A discrete Fourier transform of the pre-transformed data is performed in an iterative loop. The method further comprises back-transforming the data to generate aberration data.
Iterative Methods for Solving Nonlinear Parabolic Problem in Pension Saving Management
NASA Astrophysics Data System (ADS)
Koleva, M. N.
2011-11-01
In this work we consider a nonlinear parabolic equation, obtained from Riccati like transformation of the Hamilton-Jacobi-Bellman equation, arising in pension saving management. We discuss two numerical iterative methods for solving the model problem—fully implicit Picard method and mixed Picard-Newton method, which preserves the parabolic characteristics of the differential problem. Numerical experiments for comparison the accuracy and effectiveness of the algorithms are discussed. Finally, observations are given.
Domain decomposition preconditioners for the spectral collocation method
NASA Technical Reports Server (NTRS)
Quarteroni, Alfio; Sacchilandriani, Giovanni
1988-01-01
Several block iteration preconditioners are proposed and analyzed for the solution of elliptic problems by spectral collocation methods in a region partitioned into several rectangles. It is shown that convergence is achieved with a rate which does not depend on the polynomial degree of the spectral solution. The iterative methods here presented can be effectively implemented on multiprocessor systems due to their high degree of parallelism.
Single image super-resolution based on approximated Heaviside functions and iterative refinement
Wang, Xin-Yu; Huang, Ting-Zhu; Deng, Liang-Jian
2018-01-01
One method of solving the single-image super-resolution problem is to use Heaviside functions. This has been done previously by making a binary classification of image components as “smooth” and “non-smooth”, describing these with approximated Heaviside functions (AHFs), and iteration including l1 regularization. We now introduce a new method in which the binary classification of image components is extended to different degrees of smoothness and non-smoothness, these components being represented by various classes of AHFs. Taking into account the sparsity of the non-smooth components, their coefficients are l1 regularized. In addition, to pick up more image details, the new method uses an iterative refinement for the residuals between the original low-resolution input and the downsampled resulting image. Experimental results showed that the new method is superior to the original AHF method and to four other published methods. PMID:29329298
A non-iterative twin image elimination method with two in-line digital holograms
NASA Astrophysics Data System (ADS)
Kim, Jongwu; Lee, Heejung; Jeon, Philjun; Kim, Dug Young
2018-02-01
We propose a simple non-iterative in-line holographic measurement method which can effectively eliminate a twin image in digital holographic 3D imaging. It is shown that a twin image can be effectively eliminated with only two measured holograms by using a simple numerical propagation algorithm and arithmetic calculations.
A Probabilistic Collocation Based Iterative Kalman Filter for Landfill Data Assimilation
NASA Astrophysics Data System (ADS)
Qiang, Z.; Zeng, L.; Wu, L.
2016-12-01
Due to the strong spatial heterogeneity of landfill, uncertainty is ubiquitous in gas transport process in landfill. To accurately characterize the landfill properties, the ensemble Kalman filter (EnKF) has been employed to assimilate the measurements, e.g., the gas pressure. As a Monte Carlo (MC) based method, the EnKF usually requires a large ensemble size, which poses a high computational cost for large scale problems. In this work, we propose a probabilistic collocation based iterative Kalman filter (PCIKF) to estimate permeability in a liquid-gas coupling model. This method employs polynomial chaos expansion (PCE) to represent and propagate the uncertainties of model parameters and states, and an iterative form of Kalman filter to assimilate the current gas pressure data. To further reduce the computation cost, the functional ANOVA (analysis of variance) decomposition is conducted, and only the first order ANOVA components are remained for PCE. Illustrated with numerical case studies, this proposed method shows significant superiority in computation efficiency compared with the traditional MC based iterative EnKF. The developed method has promising potential in reliable prediction and management of landfill gas production.
Parallel iterative solution for h and p approximations of the shallow water equations
Barragy, E.J.; Walters, R.A.
1998-01-01
A p finite element scheme and parallel iterative solver are introduced for a modified form of the shallow water equations. The governing equations are the three-dimensional shallow water equations. After a harmonic decomposition in time and rearrangement, the resulting equations are a complex Helmholz problem for surface elevation, and a complex momentum equation for the horizontal velocity. Both equations are nonlinear and the resulting system is solved using the Picard iteration combined with a preconditioned biconjugate gradient (PBCG) method for the linearized subproblems. A subdomain-based parallel preconditioner is developed which uses incomplete LU factorization with thresholding (ILUT) methods within subdomains, overlapping ILUT factorizations for subdomain boundaries and under-relaxed iteration for the resulting block system. The method builds on techniques successfully applied to linear elements by introducing ordering and condensation techniques to handle uniform p refinement. The combined methods show good performance for a range of p (element order), h (element size), and N (number of processors). Performance and scalability results are presented for a field scale problem where up to 512 processors are used. ?? 1998 Elsevier Science Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Azarnavid, Babak; Parand, Kourosh; Abbasbandy, Saeid
2018-06-01
This article discusses an iterative reproducing kernel method with respect to its effectiveness and capability of solving a fourth-order boundary value problem with nonlinear boundary conditions modeling beams on elastic foundations. Since there is no method of obtaining reproducing kernel which satisfies nonlinear boundary conditions, the standard reproducing kernel methods cannot be used directly to solve boundary value problems with nonlinear boundary conditions as there is no knowledge about the existence and uniqueness of the solution. The aim of this paper is, therefore, to construct an iterative method by the use of a combination of reproducing kernel Hilbert space method and a shooting-like technique to solve the mentioned problems. Error estimation for reproducing kernel Hilbert space methods for nonlinear boundary value problems have yet to be discussed in the literature. In this paper, we present error estimation for the reproducing kernel method to solve nonlinear boundary value problems probably for the first time. Some numerical results are given out to demonstrate the applicability of the method.
NASA Technical Reports Server (NTRS)
Puliafito, E.; Bevilacqua, R.; Olivero, J.; Degenhardt, W.
1992-01-01
The formal retrieval error analysis of Rodgers (1990) allows the quantitative determination of such retrieval properties as measurement error sensitivity, resolution, and inversion bias. This technique was applied to five numerical inversion techniques and two nonlinear iterative techniques used for the retrieval of middle atmospheric constituent concentrations from limb-scanning millimeter-wave spectroscopic measurements. It is found that the iterative methods have better vertical resolution, but are slightly more sensitive to measurement error than constrained matrix methods. The iterative methods converge to the exact solution, whereas two of the matrix methods under consideration have an explicit constraint, the sensitivity of the solution to the a priori profile. Tradeoffs of these retrieval characteristics are presented.
An exponential time-integrator scheme for steady and unsteady inviscid flows
NASA Astrophysics Data System (ADS)
Li, Shu-Jie; Luo, Li-Shi; Wang, Z. J.; Ju, Lili
2018-07-01
An exponential time-integrator scheme of second-order accuracy based on the predictor-corrector methodology, denoted PCEXP, is developed to solve multi-dimensional nonlinear partial differential equations pertaining to fluid dynamics. The effective and efficient implementation of PCEXP is realized by means of the Krylov method. The linear stability and truncation error are analyzed through a one-dimensional model equation. The proposed PCEXP scheme is applied to the Euler equations discretized with a discontinuous Galerkin method in both two and three dimensions. The effectiveness and efficiency of the PCEXP scheme are demonstrated for both steady and unsteady inviscid flows. The accuracy and efficiency of the PCEXP scheme are verified and validated through comparisons with the explicit third-order total variation diminishing Runge-Kutta scheme (TVDRK3), the implicit backward Euler (BE) and the implicit second-order backward difference formula (BDF2). For unsteady flows, the PCEXP scheme generates a temporal error much smaller than the BDF2 scheme does, while maintaining the expected acceleration at the same time. Moreover, the PCEXP scheme is also shown to achieve the computational efficiency comparable to the implicit schemes for steady flows.
Hu, Rui; Yu, Yiqi
2016-09-08
For efficient and accurate temperature predictions of sodium fast reactor structures, a 3-D full-core conjugate heat transfer modeling capability is developed for an advanced system analysis tool, SAM. The hexagon lattice core is modeled with 1-D parallel channels representing the subassembly flow, and 2-D duct walls and inter-assembly gaps. The six sides of the hexagon duct wall and near-wall coolant region are modeled separately to account for different temperatures and heat transfer between coolant flow and each side of the duct wall. The Jacobian Free Newton Krylov (JFNK) solution method is applied to solve the fluid and solid field simultaneouslymore » in a fully coupled fashion. The 3-D full-core conjugate heat transfer modeling capability in SAM has been demonstrated by a verification test problem with 7 fuel assemblies in a hexagon lattice layout. In addition, the SAM simulation results are compared with RANS-based CFD simulations. Very good agreements have been achieved between the results of the two approaches.« less
Simulation of Quantum Many-Body Dynamics for Generic Strongly-Interacting Systems
NASA Astrophysics Data System (ADS)
Meyer, Gregory; Machado, Francisco; Yao, Norman
2017-04-01
Recent experimental advances have enabled the bottom-up assembly of complex, strongly interacting quantum many-body systems from individual atoms, ions, molecules and photons. These advances open the door to studying dynamics in isolated quantum systems as well as the possibility of realizing novel out-of-equilibrium phases of matter. Numerical studies provide insight into these systems; however, computational time and memory usage limit common numerical methods such as exact diagonalization to relatively small Hilbert spaces of dimension 215 . Here we present progress toward a new software package for dynamical time evolution of large generic quantum systems on massively parallel computing architectures. By projecting large sparse Hamiltonians into a much smaller Krylov subspace, we are able to compute the evolution of strongly interacting systems with Hilbert space dimension nearing 230. We discuss and benchmark different design implementations, such as matrix-free methods and GPU based calculations, using both pre-thermal time crystals and the Sachdev-Ye-Kitaev model as examples. We also include a simple symbolic language to describe generic Hamiltonians, allowing simulation of diverse quantum systems without any modification of the underlying C and Fortran code.
Efficient variable time-stepping scheme for intense field-atom interactions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cerjan, C.; Kosloff, R.
1993-03-01
The recently developed Residuum method [Tal-Ezer, Kosloff, and Cerjan, J. Comput. Phys. 100, 179 (1992)], a Krylov subspace technique with variable time-step integration for the solution of the time-dependent Schroedinger equation, is applied to the frequently used soft Coulomb potential in an intense laser field. This one-dimensional potential has asymptotic Coulomb dependence with a softened'' singularity at the origin; thus it models more realistic phenomena. Two of the more important quantities usually calculated in this idealized system are the photoelectron and harmonic photon generation spectra. These quantities are shown to be sensitive to the choice of a numerical integration scheme:more » some spectral features are incorrectly calculated or missing altogether. Furthermore, the Residuum method allows much larger grid spacings for equivalent or higher accuracy in addition to the advantages of variable time stepping. Finally, it is demonstrated that enhanced high-order harmonic generation accompanies intense field stabilization and that preparation of the atom in an intermediate Rydberg state leads to stabilization at much lower laser intensity.« less
An improved 3D MoF method based on analytical partial derivatives
NASA Astrophysics Data System (ADS)
Chen, Xiang; Zhang, Xiong
2016-12-01
MoF (Moment of Fluid) method is one of the most accurate approaches among various surface reconstruction algorithms. As other second order methods, MoF method needs to solve an implicit optimization problem to obtain the optimal approximate surface. Therefore, the partial derivatives of the objective function have to be involved during the iteration for efficiency and accuracy. However, to the best of our knowledge, the derivatives are currently estimated numerically by finite difference approximation because it is very difficult to obtain the analytical derivatives of the object function for an implicit optimization problem. Employing numerical derivatives in an iteration not only increase the computational cost, but also deteriorate the convergence rate and robustness of the iteration due to their numerical error. In this paper, the analytical first order partial derivatives of the objective function are deduced for 3D problems. The analytical derivatives can be calculated accurately, so they are incorporated into the MoF method to improve its accuracy, efficiency and robustness. Numerical studies show that by using the analytical derivatives the iterations are converged in all mixed cells with the efficiency improvement of 3 to 4 times.
Strongly Coupled Fluid-Body Dynamics in the Immersed Boundary Projection Method
NASA Astrophysics Data System (ADS)
Wang, Chengjie; Eldredge, Jeff D.
2014-11-01
A computational algorithm is developed to simulate dynamically coupled interaction between fluid and rigid bodies. The basic computational framework is built upon a multi-domain immersed boundary method library, whirl, developed in previous work. In this library, the Navier-Stokes equations for incompressible flow are solved on a uniform Cartesian grid by the vorticity-based immersed boundary projection method of Colonius and Taira. A solver for the dynamics of rigid-body systems is also included. The fluid and rigid-body solvers are strongly coupled with an iterative approach based on the block Gauss-Seidel method. Interfacial force, with its intimate connection with the Lagrange multipliers used in the fluid solver, is used as the primary iteration variable. Relaxation, developed from a stability analysis of the iterative scheme, is used to achieve convergence in only 2-4 iterations per time step. Several two- and three-dimensional numerical tests are conducted to validate and demonstrate the method, including flapping of flexible wings, self-excited oscillations of a system of linked plates and three-dimensional propulsion of flexible fluked tail. This work has been supported by AFOSR, under Award FA9550-11-1-0098.
Analytic Evolution of Singular Distribution Amplitudes in QCD
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tandogan Kunkel, Asli
2014-08-01
Distribution amplitudes (DAs) are the basic functions that contain information about the quark momentum. DAs are necessary to describe hard exclusive processes in quantum chromodynamics. We describe a method of analytic evolution of DAs that have singularities such as nonzero values at the end points of the support region, jumps at some points inside the support region and cusps. We illustrate the method by applying it to the evolution of a at (constant) DA, antisymmetric at DA, and then use the method for evolution of the two-photon generalized distribution amplitude. Our approach to DA evolution has advantages over the standardmore » method of expansion in Gegenbauer polynomials [1, 2] and over a straightforward iteration of an initial distribution with evolution kernel. Expansion in Gegenbauer polynomials requires an infinite number of terms in order to accurately reproduce functions in the vicinity of singular points. Straightforward iteration of an initial distribution produces logarithmically divergent terms at each iteration. In our method the logarithmic singularities are summed from the start, which immediately produces a continuous curve. Afterwards, in order to get precise results, only one or two iterations are needed.« less
1993-01-01
interaction: Ice forces on offshore structures were reviewed. Presentations were given on practices employed in the United States to derive ice design ...Mechanics Workshop, four days were strictly devoted to presentaticns. ’he American presentations were quite varied with heavy eaphasis on experimental and...institute coducts investigaticrs that contribute to specific ship designs , but does not do the actual designs itself. Similarly, Krylov cmVIucts
Grid generation and adaptation via Monge-Kantorovich optimization in 2D and 3D
NASA Astrophysics Data System (ADS)
Delzanno, Gian Luca; Chacon, Luis; Finn, John M.
2008-11-01
In a recent paper [1], Monge-Kantorovich (MK) optimization was proposed as a method of grid generation/adaptation in two dimensions (2D). The method is based on the minimization of the L2 norm of grid point displacement, constrained to producing a given positive-definite cell volume distribution (equidistribution constraint). The procedure gives rise to the Monge-Amp'ere (MA) equation: a single, non-linear scalar equation with no free-parameters. The MA equation was solved in Ref. [1] with the Jacobian Free Newton-Krylov technique and several challenging test cases were presented in squared domains in 2D. Here, we extend the work of Ref. [1]. We first formulate the MK approach in physical domains with curved boundary elements and in 3D. We then show the results of applying it to these more general cases. We show that MK optimization produces optimal grids in which the constraint is satisfied numerically to truncation error. [1] G.L. Delzanno, L. Chac'on, J.M. Finn, Y. Chung, G. Lapenta, A new, robust equidistribution method for two-dimensional grid generation, submitted to Journal of Computational Physics (2008).
Conjugate gradient coupled with multigrid for an indefinite problem
NASA Technical Reports Server (NTRS)
Gozani, J.; Nachshon, A.; Turkel, E.
1984-01-01
An iterative algorithm for the Helmholtz equation is presented. This scheme was based on the preconditioned conjugate gradient method for the normal equations. The preconditioning is one cycle of a multigrid method for the discrete Laplacian. The smoothing algorithm is red-black Gauss-Seidel and is constructed so it is a symmetric operator. The total number of iterations needed by the algorithm is independent of h. By varying the number of grids, the number of iterations depends only weakly on k when k(3)h(2) is constant. Comparisons with a SSOR preconditioner are presented.
NASA Astrophysics Data System (ADS)
Klimina, L. A.
2018-05-01
The modification of the Picard approach is suggested that is targeted to the construction of a bifurcation diagram of 2π -periodic motions of mechanical system with a cylindrical phase space. Each iterative step is based on principles of averaging and energy balance similar to the Poincare-Pontryagin approach. If the iterative procedure converges, it provides the periodic trajectory of the system depending on the bifurcation parameter of the model. The method is applied to describe self-sustained rotations in the model of an aerodynamic pendulum.
Optimized iterative decoding method for TPC coded CPM
NASA Astrophysics Data System (ADS)
Ma, Yanmin; Lai, Penghui; Wang, Shilian; Xie, Shunqin; Zhang, Wei
2018-05-01
Turbo Product Code (TPC) coded Continuous Phase Modulation (CPM) system (TPC-CPM) has been widely used in aeronautical telemetry and satellite communication. This paper mainly investigates the improvement and optimization on the TPC-CPM system. We first add the interleaver and deinterleaver to the TPC-CPM system, and then establish an iterative system to iteratively decode. However, the improved system has a poor convergence ability. To overcome this issue, we use the Extrinsic Information Transfer (EXIT) analysis to find the optimal factors for the system. The experiments show our method is efficient to improve the convergence performance.
Wang, G.L.; Chew, W.C.; Cui, T.J.; Aydiner, A.A.; Wright, D.L.; Smith, D.V.
2004-01-01
Three-dimensional (3D) subsurface imaging by using inversion of data obtained from the very early time electromagnetic system (VETEM) was discussed. The study was carried out by using the distorted Born iterative method to match the internal nonlinear property of the 3D inversion problem. The forward solver was based on the total-current formulation bi-conjugate gradient-fast Fourier transform (BCCG-FFT). It was found that the selection of regularization parameter follow a heuristic rule as used in the Levenberg-Marquardt algorithm so that the iteration is stable.
NASA Astrophysics Data System (ADS)
Elgohary, T.; Kim, D.; Turner, J.; Junkins, J.
2014-09-01
Several methods exist for integrating the motion in high order gravity fields. Some recent methods use an approximate starting orbit, and an efficient method is needed for generating warm starts that account for specific low order gravity approximations. By introducing two scalar Lagrange-like invariants and employing Leibniz product rule, the perturbed motion is integrated by a novel recursive formulation. The Lagrange-like invariants allow exact arbitrary order time derivatives. Restricting attention to the perturbations due to the zonal harmonics J2 through J6, we illustrate an idea. The recursively generated vector-valued time derivatives for the trajectory are used to develop a continuation series-based solution for propagating position and velocity. Numerical comparisons indicate performance improvements of ~ 70X over existing explicit Runge-Kutta methods while maintaining mm accuracy for the orbit predictions. The Modified Chebyshev Picard Iteration (MCPI) is an iterative path approximation method to solve nonlinear ordinary differential equations. The MCPI utilizes Picard iteration with orthogonal Chebyshev polynomial basis functions to recursively update the states. The key advantages of the MCPI are as follows: 1) Large segments of a trajectory can be approximated by evaluating the forcing function at multiple nodes along the current approximation during each iteration. 2) It can readily handle general gravity perturbations as well as non-conservative forces. 3) Parallel applications are possible. The Picard sequence converges to the solution over large time intervals when the forces are continuous and differentiable. According to the accuracy of the starting solutions, however, the MCPI may require significant number of iterations and function evaluations compared to other integrators. In this work, we provide an efficient methodology to establish good starting solutions from the continuation series method; this warm start improves the performance of the MCPI significantly and will likely be useful for other applications where efficiently computed approximate orbit solutions are needed.
NASA Astrophysics Data System (ADS)
Broggini, Filippo; Wapenaar, Kees; van der Neut, Joost; Snieder, Roel
2014-01-01
An iterative method is presented that allows one to retrieve the Green's function originating from a virtual source located inside a medium using reflection data measured only at the acquisition surface. In addition to the reflection response, an estimate of the travel times corresponding to the direct arrivals is required. However, no detailed information about the heterogeneities in the medium is needed. The iterative scheme generalizes the Marchenko equation for inverse scattering to the seismic reflection problem. To give insight in the mechanism of the iterative method, its steps for a simple layered medium are analyzed using physical arguments based on the stationary phase method. The retrieved Green's wavefield is shown to correctly contain the multiples due to the inhomogeneities present in the medium. Additionally, a variant of the iterative scheme enables decomposition of the retrieved wavefield into its downgoing and upgoing components. These wavefields then enable creation of a ghost-free image of the medium with either cross correlation or multidimensional deconvolution, presenting an advantage over standard prestack migration.
LDPC-based iterative joint source-channel decoding for JPEG2000.
Pu, Lingling; Wu, Zhenyu; Bilgin, Ali; Marcellin, Michael W; Vasic, Bane
2007-02-01
A framework is proposed for iterative joint source-channel decoding of JPEG2000 codestreams. At the encoder, JPEG2000 is used to perform source coding with certain error-resilience (ER) modes, and LDPC codes are used to perform channel coding. During decoding, the source decoder uses the ER modes to identify corrupt sections of the codestream and provides this information to the channel decoder. Decoding is carried out jointly in an iterative fashion. Experimental results indicate that the proposed method requires fewer iterations and improves overall system performance.
NASA Astrophysics Data System (ADS)
Park, S. Y.; Kim, G. A.; Cho, H. S.; Park, C. K.; Lee, D. Y.; Lim, H. W.; Lee, H. W.; Kim, K. S.; Kang, S. Y.; Park, J. E.; Kim, W. S.; Jeon, D. H.; Je, U. K.; Woo, T. H.; Oh, J. E.
2018-02-01
In recent digital tomosynthesis (DTS), iterative reconstruction methods are often used owing to the potential to provide multiplanar images of superior image quality to conventional filtered-backprojection (FBP)-based methods. However, they require enormous computational cost in the iterative process, which has still been an obstacle to put them to practical use. In this work, we propose a new DTS reconstruction method incorporated with a dual-resolution voxelization scheme in attempt to overcome these difficulties, in which the voxels outside a small region-of-interest (ROI) containing target diagnosis are binned by 2 × 2 × 2 while the voxels inside the ROI remain unbinned. We considered a compressed-sensing (CS)-based iterative algorithm with a dual-constraint strategy for more accurate DTS reconstruction. We implemented the proposed algorithm and performed a systematic simulation and experiment to demonstrate its viability. Our results indicate that the proposed method seems to be effective for reducing computational cost considerably in iterative DTS reconstruction, keeping the image quality inside the ROI not much degraded. A binning size of 2 × 2 × 2 required only about 31.9% computational memory and about 2.6% reconstruction time, compared to those for no binning case. The reconstruction quality was evaluated in terms of the root-mean-square error (RMSE), the contrast-to-noise ratio (CNR), and the universal-quality index (UQI).
Iterated unscented Kalman filter for phase unwrapping of interferometric fringes.
Xie, Xianming
2016-08-22
A fresh phase unwrapping algorithm based on iterated unscented Kalman filter is proposed to estimate unambiguous unwrapped phase of interferometric fringes. This method is the result of combining an iterated unscented Kalman filter with a robust phase gradient estimator based on amended matrix pencil model, and an efficient quality-guided strategy based on heap sort. The iterated unscented Kalman filter that is one of the most robust methods under the Bayesian theorem frame in non-linear signal processing so far, is applied to perform simultaneously noise suppression and phase unwrapping of interferometric fringes for the first time, which can simplify the complexity and the difficulty of pre-filtering procedure followed by phase unwrapping procedure, and even can remove the pre-filtering procedure. The robust phase gradient estimator is used to efficiently and accurately obtain phase gradient information from interferometric fringes, which is needed for the iterated unscented Kalman filtering phase unwrapping model. The efficient quality-guided strategy is able to ensure that the proposed method fast unwraps wrapped pixels along the path from the high-quality area to the low-quality area of wrapped phase images, which can greatly improve the efficiency of phase unwrapping. Results obtained from synthetic data and real data show that the proposed method can obtain better solutions with an acceptable time consumption, with respect to some of the most used algorithms.
Using the surface panel method to predict the steady performance of ducted propellers
NASA Astrophysics Data System (ADS)
Cai, Hao-Peng; Su, Yu-Min; Li, Xin; Shen, Hai-Long
2009-12-01
A new numerical method was developed for predicting the steady hydrodynamic performance of ducted propellers. A potential based surface panel method was applied both to the duct and the propeller, and the interaction between them was solved by an induced velocity potential iterative method. Compared with the induced velocity iterative method, the method presented can save programming and calculating time. Numerical results for a JD simplified ducted propeller series showed that the method presented is effective for predicting the steady hydrodynamic performance of ducted propellers.
Radiation dose reduction in medical x-ray CT via Fourier-based iterative reconstruction.
Fahimian, Benjamin P; Zhao, Yunzhe; Huang, Zhifeng; Fung, Russell; Mao, Yu; Zhu, Chun; Khatonabadi, Maryam; DeMarco, John J; Osher, Stanley J; McNitt-Gray, Michael F; Miao, Jianwei
2013-03-01
A Fourier-based iterative reconstruction technique, termed Equally Sloped Tomography (EST), is developed in conjunction with advanced mathematical regularization to investigate radiation dose reduction in x-ray CT. The method is experimentally implemented on fan-beam CT and evaluated as a function of imaging dose on a series of image quality phantoms and anonymous pediatric patient data sets. Numerical simulation experiments are also performed to explore the extension of EST to helical cone-beam geometry. EST is a Fourier based iterative algorithm, which iterates back and forth between real and Fourier space utilizing the algebraically exact pseudopolar fast Fourier transform (PPFFT). In each iteration, physical constraints and mathematical regularization are applied in real space, while the measured data are enforced in Fourier space. The algorithm is automatically terminated when a proposed termination criterion is met. Experimentally, fan-beam projections were acquired by the Siemens z-flying focal spot technology, and subsequently interleaved and rebinned to a pseudopolar grid. Image quality phantoms were scanned at systematically varied mAs settings, reconstructed by EST and conventional reconstruction methods such as filtered back projection (FBP), and quantified using metrics including resolution, signal-to-noise ratios (SNRs), and contrast-to-noise ratios (CNRs). Pediatric data sets were reconstructed at their original acquisition settings and additionally simulated to lower dose settings for comparison and evaluation of the potential for radiation dose reduction. Numerical experiments were conducted to quantify EST and other iterative methods in terms of image quality and computation time. The extension of EST to helical cone-beam CT was implemented by using the advanced single-slice rebinning (ASSR) method. Based on the phantom and pediatric patient fan-beam CT data, it is demonstrated that EST reconstructions with the lowest scanner flux setting of 39 mAs produce comparable image quality, resolution, and contrast relative to FBP with the 140 mAs flux setting. Compared to the algebraic reconstruction technique and the expectation maximization statistical reconstruction algorithm, a significant reduction in computation time is achieved with EST. Finally, numerical experiments on helical cone-beam CT data suggest that the combination of EST and ASSR produces reconstructions with higher image quality and lower noise than the Feldkamp Davis and Kress (FDK) method and the conventional ASSR approach. A Fourier-based iterative method has been applied to the reconstruction of fan-bean CT data with reduced x-ray fluence. This method incorporates advantageous features in both real and Fourier space iterative schemes: using a fast and algebraically exact method to calculate forward projection, enforcing the measured data in Fourier space, and applying physical constraints and flexible regularization in real space. Our results suggest that EST can be utilized for radiation dose reduction in x-ray CT via the readily implementable technique of lowering mAs settings. Numerical experiments further indicate that EST requires less computation time than several other iterative algorithms and can, in principle, be extended to helical cone-beam geometry in combination with the ASSR method.
The solution of linear systems of equations with a structural analysis code on the NAS CRAY-2
NASA Technical Reports Server (NTRS)
Poole, Eugene L.; Overman, Andrea L.
1988-01-01
Two methods for solving linear systems of equations on the NAS Cray-2 are described. One is a direct method; the other is an iterative method. Both methods exploit the architecture of the Cray-2, particularly the vectorization, and are aimed at structural analysis applications. To demonstrate and evaluate the methods, they were installed in a finite element structural analysis code denoted the Computational Structural Mechanics (CSM) Testbed. A description of the techniques used to integrate the two solvers into the Testbed is given. Storage schemes, memory requirements, operation counts, and reformatting procedures are discussed. Finally, results from the new methods are compared with results from the initial Testbed sparse Choleski equation solver for three structural analysis problems. The new direct solvers described achieve the highest computational rates of the methods compared. The new iterative methods are not able to achieve as high computation rates as the vectorized direct solvers but are best for well conditioned problems which require fewer iterations to converge to the solution.
An Automated Baseline Correction Method Based on Iterative Morphological Operations.
Chen, Yunliang; Dai, Liankui
2018-05-01
Raman spectra usually suffer from baseline drift caused by fluorescence or other reasons. Therefore, baseline correction is a necessary and crucial step that must be performed before subsequent processing and analysis of Raman spectra. An automated baseline correction method based on iterative morphological operations is proposed in this work. The method can adaptively determine the structuring element first and then gradually remove the spectral peaks during iteration to get an estimated baseline. Experiments on simulated data and real-world Raman data show that the proposed method is accurate, fast, and flexible for handling different kinds of baselines in various practical situations. The comparison of the proposed method with some state-of-the-art baseline correction methods demonstrates its advantages over the existing methods in terms of accuracy, adaptability, and flexibility. Although only Raman spectra are investigated in this paper, the proposed method is hopefully to be used for the baseline correction of other analytical instrumental signals, such as IR spectra and chromatograms.
Three dimensional iterative beam propagation method for optical waveguide devices
NASA Astrophysics Data System (ADS)
Ma, Changbao; Van Keuren, Edward
2006-10-01
The finite difference beam propagation method (FD-BPM) is an effective model for simulating a wide range of optical waveguide structures. The classical FD-BPMs are based on the Crank-Nicholson scheme, and in tridiagonal form can be solved using the Thomas method. We present a different type of algorithm for 3-D structures. In this algorithm, the wave equation is formulated into a large sparse matrix equation which can be solved using iterative methods. The simulation window shifting scheme and threshold technique introduced in our earlier work are utilized to overcome the convergence problem of iterative methods for large sparse matrix equation and wide-angle simulations. This method enables us to develop higher-order 3-D wide-angle (WA-) BPMs based on Pade approximant operators and the multistep method, which are commonly used in WA-BPMs for 2-D structures. Simulations using the new methods will be compared to the analytical results to assure its effectiveness and applicability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gingold, E; Dave, J
2014-06-01
Purpose: The purpose of this study was to compare a new model-based iterative reconstruction with existing reconstruction methods (filtered backprojection and basic iterative reconstruction) using quantitative analysis of standard image quality phantom images. Methods: An ACR accreditation phantom (Gammex 464) and a CATPHAN600 phantom were scanned using 3 routine clinical acquisition protocols (adult axial brain, adult abdomen, and pediatric abdomen) on a Philips iCT system. Each scan was acquired using default conditions and 75%, 50% and 25% dose levels. Images were reconstructed using standard filtered backprojection (FBP), conventional iterative reconstruction (iDose4) and a prototype model-based iterative reconstruction (IMR). Phantom measurementsmore » included CT number accuracy, contrast to noise ratio (CNR), modulation transfer function (MTF), low contrast detectability (LCD), and noise power spectrum (NPS). Results: The choice of reconstruction method had no effect on CT number accuracy, or MTF (p<0.01). The CNR of a 6 HU contrast target was improved by 1–67% with iDose4 relative to FBP, while IMR improved CNR by 145–367% across all protocols and dose levels. Within each scan protocol, the CNR improvement from IMR vs FBP showed a general trend of greater improvement at lower dose levels. NPS magnitude was greatest for FBP and lowest for IMR. The NPS of the IMR reconstruction showed a pronounced decrease with increasing spatial frequency, consistent with the unusual noise texture seen in IMR images. Conclusion: Iterative Model Reconstruction reduces noise and improves contrast-to-noise ratio without sacrificing spatial resolution in CT phantom images. This offers the possibility of radiation dose reduction and improved low contrast detectability compared with filtered backprojection or conventional iterative reconstruction.« less
Udzawa-type iterative method with parareal preconditioner for a parabolic optimal control problem
NASA Astrophysics Data System (ADS)
Lapin, A.; Romanenko, A.
2016-11-01
The article deals with the optimal control problem with the parabolic equation as state problem. There are point-wise constraints on the state and control functions. The objective functional involves the observation given in the domain at each moment. The conditions for convergence Udzawa's type iterative method are given. The parareal method to inverse preconditioner is given. The results of calculations are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Willert, Jeffrey; Taitano, William T.; Knoll, Dana
In this note we demonstrate that using Anderson Acceleration (AA) in place of a standard Picard iteration can not only increase the convergence rate but also make the iteration more robust for two transport applications. We also compare the convergence acceleration provided by AA to that provided by moment-based acceleration methods. Additionally, we demonstrate that those two acceleration methods can be used together in a nested fashion. We begin by describing the AA algorithm. At this point, we will describe two application problems, one from neutronics and one from plasma physics, on which we will apply AA. We provide computationalmore » results which highlight the benefits of using AA, namely that we can compute solutions using fewer function evaluations, larger time-steps, and achieve a more robust iteration.« less
Parallel solution of the symmetric tridiagonal eigenproblem. Research report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jessup, E.R.
1989-10-01
This thesis discusses methods for computing all eigenvalues and eigenvectors of a symmetric tridiagonal matrix on a distributed-memory Multiple Instruction, Multiple Data multiprocessor. Only those techniques having the potential for both high numerical accuracy and significant large-grained parallelism are investigated. These include the QL method or Cuppen's divide and conquer method based on rank-one updating to compute both eigenvalues and eigenvectors, bisection to determine eigenvalues and inverse iteration to compute eigenvectors. To begin, the methods are compared with respect to computation time, communication time, parallel speed up, and accuracy. Experiments on an IPSC hypercube multiprocessor reveal that Cuppen's method ismore » the most accurate approach, but bisection with inverse iteration is the fastest and most parallel. Because the accuracy of the latter combination is determined by the quality of the computed eigenvectors, the factors influencing the accuracy of inverse iteration are examined. This includes, in part, statistical analysis of the effect of a starting vector with random components. These results are used to develop an implementation of inverse iteration producing eigenvectors with lower residual error and better orthogonality than those generated by the EISPACK routine TINVIT. This thesis concludes with adaptions of methods for the symmetric tridiagonal eigenproblem to the related problem of computing the singular value decomposition (SVD) of a bidiagonal matrix.« less
Parallel solution of the symmetric tridiagonal eigenproblem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jessup, E.R.
1989-01-01
This thesis discusses methods for computing all eigenvalues and eigenvectors of a symmetric tridiagonal matrix on a distributed memory MIMD multiprocessor. Only those techniques having the potential for both high numerical accuracy and significant large-grained parallelism are investigated. These include the QL method or Cuppen's divide and conquer method based on rank-one updating to compute both eigenvalues and eigenvectors, bisection to determine eigenvalues, and inverse iteration to compute eigenvectors. To begin, the methods are compared with respect to computation time, communication time, parallel speedup, and accuracy. Experiments on an iPSC hyper-cube multiprocessor reveal that Cuppen's method is the most accuratemore » approach, but bisection with inverse iteration is the fastest and most parallel. Because the accuracy of the latter combination is determined by the quality of the computed eigenvectors, the factors influencing the accuracy of inverse iteration are examined. This includes, in part, statistical analysis of the effects of a starting vector with random components. These results are used to develop an implementation of inverse iteration producing eigenvectors with lower residual error and better orthogonality than those generated by the EISPACK routine TINVIT. This thesis concludes with adaptations of methods for the symmetric tridiagonal eigenproblem to the related problem of computing the singular value decomposition (SVD) of a bidiagonal matrix.« less
An iterative method for systems of nonlinear hyperbolic equations
NASA Technical Reports Server (NTRS)
Scroggs, Jeffrey S.
1989-01-01
An iterative algorithm for the efficient solution of systems of nonlinear hyperbolic equations is presented. Parallelism is evident at several levels. In the formation of the iteration, the equations are decoupled, thereby providing large grain parallelism. Parallelism may also be exploited within the solves for each equation. Convergence of the interation is established via a bounding function argument. Experimental results in two-dimensions are presented.
Accelerated iterative beam angle selection in IMRT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bangert, Mark, E-mail: m.bangert@dkfz.de; Unkelbach, Jan
2016-03-15
Purpose: Iterative methods for beam angle selection (BAS) for intensity-modulated radiation therapy (IMRT) planning sequentially construct a beneficial ensemble of beam directions. In a naïve implementation, the nth beam is selected by adding beam orientations one-by-one from a discrete set of candidates to an existing ensemble of (n − 1) beams. The best beam orientation is identified in a time consuming process by solving the fluence map optimization (FMO) problem for every candidate beam and selecting the beam that yields the largest improvement to the objective function value. This paper evaluates two alternative methods to accelerate iterative BAS based onmore » surrogates for the FMO objective function value. Methods: We suggest to select candidate beams not based on the FMO objective function value after convergence but (1) based on the objective function value after five FMO iterations of a gradient based algorithm and (2) based on a projected gradient of the FMO problem in the first iteration. The performance of the objective function surrogates is evaluated based on the resulting objective function values and dose statistics in a treatment planning study comprising three intracranial, three pancreas, and three prostate cases. Furthermore, iterative BAS is evaluated for an application in which a small number of noncoplanar beams complement a set of coplanar beam orientations. This scenario is of practical interest as noncoplanar setups may require additional attention of the treatment personnel for every couch rotation. Results: Iterative BAS relying on objective function surrogates yields similar results compared to naïve BAS with regard to the objective function values and dose statistics. At the same time, early stopping of the FMO and using the projected gradient during the first iteration enable reductions in computation time by approximately one to two orders of magnitude. With regard to the clinical delivery of noncoplanar IMRT treatments, we could show that optimized beam ensembles using only a few noncoplanar beam orientations often approach the plan quality of fully noncoplanar ensembles. Conclusions: We conclude that iterative BAS in combination with objective function surrogates can be a viable option to implement automated BAS at clinically acceptable computation times.« less
ERIC Educational Resources Information Center
De Lisle, Jerome; Seunarinesingh, Krishna; Mohammed, Rhoda; Lee-Piggott, Rinnelle
2017-01-01
In this study, methodology and theory were linked to explicate the nature of education practice within schools facing exceptionally challenging circumstances (SFECC) in Trinidad and Tobago. The research design was an iterative quan>QUAL-quan>qual multi-method research programme, consisting of 3 independent projects linked together by overall…
A Study of Morrison's Iterative Noise Removal Method. Final Report M. S. Thesis
NASA Technical Reports Server (NTRS)
Ioup, G. E.; Wright, K. A. R.
1985-01-01
Morrison's iterative noise removal method is studied by characterizing its effect upon systems of differing noise level and response function. The nature of data acquired from a linear shift invariant instrument is discussed so as to define the relationship between the input signal, the instrument response function, and the output signal. Fourier analysis is introduced, along with several pertinent theorems, as a tool to more thorough understanding of the nature of and difficulties with deconvolution. In relation to such difficulties the necessity of a noise removal process is discussed. Morrison's iterative noise removal method and the restrictions upon its application are developed. The nature of permissible response functions is discussed, as is the choice of the response functions used.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hunter, J. L.; Sutton, T. M.
2013-07-01
In Monte Carlo iterated-fission-source calculations relative uncertainties on local tallies tend to be larger in lower-power regions and smaller in higher-power regions. Reducing the largest uncertainties to an acceptable level simply by running a larger number of neutron histories is often prohibitively expensive. The uniform fission site method has been developed to yield a more spatially-uniform distribution of relative uncertainties. This is accomplished by biasing the density of fission neutron source sites while not biasing the solution. The method is integrated into the source iteration process, and does not require any auxiliary forward or adjoint calculations. For a given amountmore » of computational effort, the use of the method results in a reduction of the largest uncertainties relative to the standard algorithm. Two variants of the method have been implemented and tested. Both have been shown to be effective. (authors)« less
Subsonic panel method for designing wing surfaces from pressure distribution
NASA Technical Reports Server (NTRS)
Bristow, D. R.; Hawk, J. D.
1983-01-01
An iterative method has been developed for designing wing section contours corresponding to a prescribed subcritical distribution of pressure. The calculations are initialized by using a surface panel method to analyze a baseline wing or wing-fuselage configuration. A first-order expansion to the baseline panel method equations is then used to calculate a matrix containing the partial derivative of potential at each control point with respect to each unknown geometry parameter. In every iteration cycle, the matrix is used both to calculate the geometry perturbation and to analyze the perturbed geometry. The distribution of potential on the perturbed geometry is established by simple linear extrapolation from the baseline solution. The extrapolated potential is converted to pressure by Bernoulli's equation. Not only is the accuracy of the approach good for very large perturbations, but the computing cost of each complete iteration cycle is substantially less than one analysis solution by a conventional panel method.
NASA Astrophysics Data System (ADS)
Sudevan, Vipin; Aluri, Pavan K.; Yadav, Sarvesh Kumar; Saha, Rajib; Souradeep, Tarun
2017-06-01
We report an improved technique for diffuse foreground minimization from Cosmic Microwave Background (CMB) maps using a new multiphase iterative harmonic space internal-linear-combination (HILC) approach. Our method nullifies a foreground leakage that was present in the old and usual iterative HILC method. In phase 1 of the multiphase technique, we obtain an initial cleaned map using the single iteration HILC approach over the desired portion of the sky. In phase 2, we obtain a final CMB map using the iterative HILC approach; however, now, to nullify the leakage, during each iteration, some of the regions of the sky that are not being cleaned in the current iteration are replaced by the corresponding cleaned portions of the phase 1 map. We bring all input frequency maps to a common and maximum possible beam and pixel resolution at the beginning of the analysis, which significantly reduces data redundancy, memory usage, and computational cost, and avoids, during the HILC weight calculation, the deconvolution of partial sky harmonic coefficients by the azimuthally symmetric beam and pixel window functions, which in a strict mathematical sense, are not well defined. Using WMAP 9 year and Planck 2015 frequency maps, we obtain foreground-cleaned CMB maps and a CMB angular power spectrum for the multipole range 2≤slant {\\ell }≤slant 2500. Our power spectrum matches the published Planck results with some differences at different multipole ranges. We validate our method by performing Monte Carlo simulations. Finally, we show that the weights for HILC foreground minimization have the intrinsic characteristic that they also tend to produce a statistically isotropic CMB map.
NASA Astrophysics Data System (ADS)
Zhong, XiaoXu; Liao, ShiJun
2018-01-01
Analytic approximations of the Von Kármán's plate equations in integral form for a circular plate under external uniform pressure to arbitrary magnitude are successfully obtained by means of the homotopy analysis method (HAM), an analytic approximation technique for highly nonlinear problems. Two HAM-based approaches are proposed for either a given external uniform pressure Q or a given central deflection, respectively. Both of them are valid for uniform pressure to arbitrary magnitude by choosing proper values of the so-called convergence-control parameters c 1 and c 2 in the frame of the HAM. Besides, it is found that the HAM-based iteration approaches generally converge much faster than the interpolation iterative method. Furthermore, we prove that the interpolation iterative method is a special case of the first-order HAM iteration approach for a given external uniform pressure Q when c 1 = - θ and c 2 = -1, where θ denotes the interpolation iterative parameter. Therefore, according to the convergence theorem of Zheng and Zhou about the interpolation iterative method, the HAM-based approaches are valid for uniform pressure to arbitrary magnitude at least in the special case c 1 = - θ and c 2 = -1. In addition, we prove that the HAM approach for the Von Kármán's plate equations in differential form is just a special case of the HAM for the Von Kármán's plate equations in integral form mentioned in this paper. All of these illustrate the validity and great potential of the HAM for highly nonlinear problems, and its superiority over perturbation techniques.
NASA Astrophysics Data System (ADS)
Swastika, Windra
2017-03-01
A money's nominal value recognition system has been developed using Artificial Neural Network (ANN). ANN with Back Propagation has one disadvantage. The learning process is very slow (or never reach the target) in the case of large number of iteration, weight and samples. One way to speed up the learning process is using Quickprop method. Quickprop method is based on Newton's method and able to speed up the learning process by assuming that the weight adjustment (E) is a parabolic function. The goal is to minimize the error gradient (E'). In our system, we use 5 types of money's nominal value, i.e. 1,000 IDR, 2,000 IDR, 5,000 IDR, 10,000 IDR and 50,000 IDR. One of the surface of each nominal were scanned and digitally processed. There are 40 patterns to be used as training set in ANN system. The effectiveness of Quickprop method in the ANN system was validated by 2 factors, (1) number of iterations required to reach error below 0.1; and (2) the accuracy to predict nominal values based on the input. Our results shows that the use of Quickprop method is successfully reduce the learning process compared to Back Propagation method. For 40 input patterns, Quickprop method successfully reached error below 0.1 for only 20 iterations, while Back Propagation method required 2000 iterations. The prediction accuracy for both method is higher than 90%.
Zhou, Wenjie; Wei, Xuesong; Wang, Leqin; Wu, Guangkuan
2017-05-01
Solving the static equilibrium position is one of the most important parts of dynamic coefficients calculation and further coupled calculation of rotor system. The main contribution of this study is testing the superlinear iteration convergence method-twofold secant method, for the determination of the static equilibrium position of journal bearing with finite length. Essentially, the Reynolds equation for stable motion is solved by the finite difference method and the inner pressure is obtained by the successive over-relaxation iterative method reinforced by the compound Simpson quadrature formula. The accuracy and efficiency of the twofold secant method are higher in comparison with the secant method and dichotomy. The total number of iterative steps required for the twofold secant method are about one-third of the secant method and less than one-eighth of dichotomy for the same equilibrium position. The calculations for equilibrium position and pressure distribution for different bearing length, clearance and rotating speed were done. In the results, the eccentricity presents linear inverse proportional relationship to the attitude angle. The influence of the bearing length, clearance and bearing radius on the load-carrying capacity was also investigated. The results illustrate that larger bearing length, larger radius and smaller clearance are good for the load-carrying capacity of journal bearing. The application of the twofold secant method can greatly reduce the computational time for calculation of the dynamic coefficients and dynamic characteristics of rotor-bearing system with a journal bearing of finite length.
Newton's method: A link between continuous and discrete solutions of nonlinear problems
NASA Technical Reports Server (NTRS)
Thurston, G. A.
1980-01-01
Newton's method for nonlinear mechanics problems replaces the governing nonlinear equations by an iterative sequence of linear equations. When the linear equations are linear differential equations, the equations are usually solved by numerical methods. The iterative sequence in Newton's method can exhibit poor convergence properties when the nonlinear problem has multiple solutions for a fixed set of parameters, unless the iterative sequences are aimed at solving for each solution separately. The theory of the linear differential operators is often a better guide for solution strategies in applying Newton's method than the theory of linear algebra associated with the numerical analogs of the differential operators. In fact, the theory for the differential operators can suggest the choice of numerical linear operators. In this paper the method of variation of parameters from the theory of linear ordinary differential equations is examined in detail in the context of Newton's method to demonstrate how it might be used as a guide for numerical solutions.
Gauss Seidel-type methods for energy states of a multi-component Bose Einstein condensate
NASA Astrophysics Data System (ADS)
Chang, Shu-Ming; Lin, Wen-Wei; Shieh, Shih-Feng
2005-01-01
In this paper, we propose two iterative methods, a Jacobi-type iteration (JI) and a Gauss-Seidel-type iteration (GSI), for the computation of energy states of the time-independent vector Gross-Pitaevskii equation (VGPE) which describes a multi-component Bose-Einstein condensate (BEC). A discretization of the VGPE leads to a nonlinear algebraic eigenvalue problem (NAEP). We prove that the GSI method converges locally and linearly to a solution of the NAEP if and only if the associated minimized energy functional problem has a strictly local minimum. The GSI method can thus be used to compute ground states and positive bound states, as well as the corresponding energies of a multi-component BEC. Numerical experience shows that the GSI converges much faster than JI and converges globally within 10-20 steps.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cai, Yunfeng, E-mail: yfcai@math.pku.edu.cn; Department of Computer Science, University of California, Davis 95616; Bai, Zhaojun, E-mail: bai@cs.ucdavis.edu
2013-12-15
The iterative diagonalization of a sequence of large ill-conditioned generalized eigenvalue problems is a computational bottleneck in quantum mechanical methods employing a nonorthogonal basis for ab initio electronic structure calculations. We propose a hybrid preconditioning scheme to effectively combine global and locally accelerated preconditioners for rapid iterative diagonalization of such eigenvalue problems. In partition-of-unity finite-element (PUFE) pseudopotential density-functional calculations, employing a nonorthogonal basis, we show that the hybrid preconditioned block steepest descent method is a cost-effective eigensolver, outperforming current state-of-the-art global preconditioning schemes, and comparably efficient for the ill-conditioned generalized eigenvalue problems produced by PUFE as the locally optimal blockmore » preconditioned conjugate-gradient method for the well-conditioned standard eigenvalue problems produced by planewave methods.« less
Quality measures in applications of image restoration.
Kriete, A; Naim, M; Schafer, L
2001-01-01
We describe a new method for the estimation of image quality in image restoration applications. We demonstrate this technique on a simulated data set of fluorescent beads, in comparison with restoration by three different deconvolution methods. Both the number of iterations and a regularisation factor are varied to enforce changes in the resulting image quality. First, the data sets are directly compared by an accuracy measure. These values serve to validate the image quality descriptor, which is developed on the basis of optical information theory. This most general measure takes into account the spectral energies and the noise, weighted in a logarithmic fashion. It is demonstrated that this method is particularly helpful as a user-oriented method to control the output of iterative image restorations and to eliminate the guesswork in choosing a suitable number of iterations.
Progress in Development of the ITER Plasma Control System Simulation Platform
NASA Astrophysics Data System (ADS)
Walker, Michael; Humphreys, David; Sammuli, Brian; Ambrosino, Giuseppe; de Tommasi, Gianmaria; Mattei, Massimiliano; Raupp, Gerhard; Treutterer, Wolfgang; Winter, Axel
2017-10-01
We report on progress made and expected uses of the Plasma Control System Simulation Platform (PCSSP), the primary test environment for development of the ITER Plasma Control System (PCS). PCSSP will be used for verification and validation of the ITER PCS Final Design for First Plasma, to be completed in 2020. We discuss the objectives of PCSSP, its overall structure, selected features, application to existing devices, and expected evolution over the lifetime of the ITER PCS. We describe an archiving solution for simulation results, methods for incorporating physics models of the plasma and physical plant (tokamak, actuator, and diagnostic systems) into PCSSP, and defining characteristics of models suitable for a plasma control development environment such as PCSSP. Applications of PCSSP simulation models including resistive plasma equilibrium evolution are demonstrated. PCSSP development supported by ITER Organization under ITER/CTS/6000000037. Resistive evolution code developed under General Atomics' Internal funding. The views and opinions expressed herein do not necessarily reflect those of the ITER Organization.
A new Newton-like method for solving nonlinear equations.
Saheya, B; Chen, Guo-Qing; Sui, Yun-Kang; Wu, Cai-Ying
2016-01-01
This paper presents an iterative scheme for solving nonline ar equations. We establish a new rational approximation model with linear numerator and denominator which has generalizes the local linear model. We then employ the new approximation for nonlinear equations and propose an improved Newton's method to solve it. The new method revises the Jacobian matrix by a rank one matrix each iteration and obtains the quadratic convergence property. The numerical performance and comparison show that the proposed method is efficient.
2014-01-01
Due to fierce market competition, how to improve product quality and reduce development cost determines the core competitiveness of enterprises. However, design iteration generally causes increases of product cost and delays of development time as well, so how to identify and model couplings among tasks in product design and development has become an important issue for enterprises to settle. In this paper, the shortcomings existing in WTM model are discussed and tearing approach as well as inner iteration method is used to complement the classic WTM model. In addition, the ABC algorithm is also introduced to find out the optimal decoupling schemes. In this paper, firstly, tearing approach and inner iteration method are analyzed for solving coupled sets. Secondly, a hybrid iteration model combining these two technologies is set up. Thirdly, a high-performance swarm intelligence algorithm, artificial bee colony, is adopted to realize problem-solving. Finally, an engineering design of a chemical processing system is given in order to verify its reasonability and effectiveness. PMID:25431584
Fast non-interferometric iterative phase retrieval for holographic data storage.
Lin, Xiao; Huang, Yong; Shimura, Tsutomu; Fujimura, Ryushi; Tanaka, Yoshito; Endo, Masao; Nishimoto, Hajimu; Liu, Jinpeng; Li, Yang; Liu, Ying; Tan, Xiaodi
2017-12-11
Fast non-interferometric phase retrieval is a very important technique for phase-encoded holographic data storage and other phase based applications due to its advantage of easy implementation, simple system setup, and robust noise tolerance. Here we present an iterative non-interferometric phase retrieval for 4-level phase encoded holographic data storage based on an iterative Fourier transform algorithm and known portion of the encoded data, which increases the storage code rate to two-times that of an amplitude based method. Only a single image at the Fourier plane of the beam is captured for the iterative reconstruction. Since beam intensity at the Fourier plane of the reconstructed beam is more concentrated than the reconstructed beam itself, the requirement of diffractive efficiency of the recording media is reduced, which will improve the dynamic range of recording media significantly. The phase retrieval only requires 10 iterations to achieve a less than 5% phase data error rate, which is successfully demonstrated by recording and reconstructing a test image data experimentally. We believe our method will further advance the holographic data storage technique in the era of big data.
Casper, T. A.; Meyer, W. H.; Jackson, G. L.; ...
2010-12-08
We are exploring characteristics of ITER startup scenarios in similarity experiments conducted on the DIII-D Tokamak. In these experiments, we have validated scenarios for the ITER current ramp up to full current and developed methods to control the plasma parameters to achieve stability. Predictive simulations of ITER startup using 2D free-boundary equilibrium and 1D transport codes rely on accurate estimates of the electron and ion temperature profiles that determine the electrical conductivity and pressure profiles during the current rise. Here we present results of validation studies that apply the transport model used by the ITER team to DIII-D discharge evolutionmore » and comparisons with data from our similarity experiments.« less
Zhou, Wenjie; Wei, Xuesong; Wang, Leqin
2017-01-01
Solving the static equilibrium position is one of the most important parts of dynamic coefficients calculation and further coupled calculation of rotor system. The main contribution of this study is testing the superlinear iteration convergence method—twofold secant method, for the determination of the static equilibrium position of journal bearing with finite length. Essentially, the Reynolds equation for stable motion is solved by the finite difference method and the inner pressure is obtained by the successive over-relaxation iterative method reinforced by the compound Simpson quadrature formula. The accuracy and efficiency of the twofold secant method are higher in comparison with the secant method and dichotomy. The total number of iterative steps required for the twofold secant method are about one-third of the secant method and less than one-eighth of dichotomy for the same equilibrium position. The calculations for equilibrium position and pressure distribution for different bearing length, clearance and rotating speed were done. In the results, the eccentricity presents linear inverse proportional relationship to the attitude angle. The influence of the bearing length, clearance and bearing radius on the load-carrying capacity was also investigated. The results illustrate that larger bearing length, larger radius and smaller clearance are good for the load-carrying capacity of journal bearing. The application of the twofold secant method can greatly reduce the computational time for calculation of the dynamic coefficients and dynamic characteristics of rotor-bearing system with a journal bearing of finite length. PMID:28572997
Iterative computation of generalized inverses, with an application to CMG steering laws
NASA Technical Reports Server (NTRS)
Steincamp, J. W.
1971-01-01
A cubically convergent iterative method for computing the generalized inverse of an arbitrary M X N matrix A is developed and a FORTRAN subroutine by which the method was implemented for real matrices on a CDC 3200 is given, with a numerical example to illustrate accuracy. Application to a redundant single-gimbal CMG assembly steering law is discussed.
A multigrid LU-SSOR scheme for approximate Newton iteration applied to the Euler equations
NASA Technical Reports Server (NTRS)
Yoon, Seokkwan; Jameson, Antony
1986-01-01
A new efficient relaxation scheme in conjunction with a multigrid method is developed for the Euler equations. The LU SSOR scheme is based on a central difference scheme and does not need flux splitting for Newton iteration. Application to transonic flow shows that the new method surpasses the performance of the LU implicit scheme.
Iterative methods used in overlap astrometric reduction techniques do not always converge
NASA Astrophysics Data System (ADS)
Rapaport, M.; Ducourant, C.; Colin, J.; Le Campion, J. F.
1993-04-01
In this paper we prove that the classical Gauss-Seidel type iterative methods used for the solution of the reduced normal equations occurring in overlapping reduction methods of astrometry do not always converge. We exhibit examples of divergence. We then analyze an alternative algorithm proposed by Wang (1985). We prove the consistency of this algorithm and verify that it can be convergent while the Gauss-Seidel method is divergent. We conjecture the convergence of Wang method for the solution of astrometric problems using overlap techniques.
Iterative Methods to Solve Linear RF Fields in Hot Plasma
NASA Astrophysics Data System (ADS)
Spencer, Joseph; Svidzinski, Vladimir; Evstatiev, Evstati; Galkin, Sergei; Kim, Jin-Soo
2014-10-01
Most magnetic plasma confinement devices use radio frequency (RF) waves for current drive and/or heating. Numerical modeling of RF fields is an important part of performance analysis of such devices and a predictive tool aiding design and development of future devices. Prior attempts at this modeling have mostly used direct solvers to solve the formulated linear equations. Full wave modeling of RF fields in hot plasma with 3D nonuniformities is mostly prohibited, with memory demands of a direct solver placing a significant limitation on spatial resolution. Iterative methods can significantly increase spatial resolution. We explore the feasibility of using iterative methods in 3D full wave modeling. The linear wave equation is formulated using two approaches: for cold plasmas the local cold plasma dielectric tensor is used (resolving resonances by particle collisions), while for hot plasmas the conductivity kernel (which includes a nonlocal dielectric response) is calculated by integrating along test particle orbits. The wave equation is discretized using a finite difference approach. The initial guess is important in iterative methods, and we examine different initial guesses including the solution to the cold plasma wave equation. Work is supported by the U.S. DOE SBIR program.
Uniform convergence of multigrid V-cycle iterations for indefinite and nonsymmetric problems
NASA Technical Reports Server (NTRS)
Bramble, James H.; Kwak, Do Y.; Pasciak, Joseph E.
1993-01-01
In this paper, we present an analysis of a multigrid method for nonsymmetric and/or indefinite elliptic problems. In this multigrid method various types of smoothers may be used. One type of smoother which we consider is defined in terms of an associated symmetric problem and includes point and line, Jacobi, and Gauss-Seidel iterations. We also study smoothers based entirely on the original operator. One is based on the normal form, that is, the product of the operator and its transpose. Other smoothers studied include point and line, Jacobi, and Gauss-Seidel. We show that the uniform estimates for symmetric positive definite problems carry over to these algorithms. More precisely, the multigrid iteration for the nonsymmetric and/or indefinite problem is shown to converge at a uniform rate provided that the coarsest grid in the multilevel iteration is sufficiently fine (but not depending on the number of multigrid levels).
Low-memory iterative density fitting.
Grajciar, Lukáš
2015-07-30
A new low-memory modification of the density fitting approximation based on a combination of a continuous fast multipole method (CFMM) and a preconditioned conjugate gradient solver is presented. Iterative conjugate gradient solver uses preconditioners formed from blocks of the Coulomb metric matrix that decrease the number of iterations needed for convergence by up to one order of magnitude. The matrix-vector products needed within the iterative algorithm are calculated using CFMM, which evaluates them with the linear scaling memory requirements only. Compared with the standard density fitting implementation, up to 15-fold reduction of the memory requirements is achieved for the most efficient preconditioner at a cost of only 25% increase in computational time. The potential of the method is demonstrated by performing density functional theory calculations for zeolite fragment with 2592 atoms and 121,248 auxiliary basis functions on a single 12-core CPU workstation. © 2015 Wiley Periodicals, Inc.
2D and 3D registration methods for dual-energy contrast-enhanced digital breast tomosynthesis
NASA Astrophysics Data System (ADS)
Lau, Kristen C.; Roth, Susan; Maidment, Andrew D. A.
2014-03-01
Contrast-enhanced digital breast tomosynthesis (CE-DBT) uses an iodinated contrast agent to image the threedimensional breast vasculature. The University of Pennsylvania is conducting a CE-DBT clinical study in patients with known breast cancers. The breast is compressed continuously and imaged at four time points (1 pre-contrast; 3 postcontrast). A hybrid subtraction scheme is proposed. First, dual-energy (DE) images are obtained by a weighted logarithmic subtraction of the high-energy and low-energy image pairs. Then, post-contrast DE images are subtracted from the pre-contrast DE image. This hybrid temporal subtraction of DE images is performed to analyze iodine uptake, but suffers from motion artifacts. Employing image registration further helps to correct for motion, enhancing the evaluation of vascular kinetics. Registration using ANTS (Advanced Normalization Tools) is performed in an iterative manner. Mutual information optimization first corrects large-scale motions. Normalized cross-correlation optimization then iteratively corrects fine-scale misalignment. Two methods have been evaluated: a 2D method using a slice-by-slice approach, and a 3D method using a volumetric approach to account for out-of-plane breast motion. Our results demonstrate that iterative registration qualitatively improves with each iteration (five iterations total). Motion artifacts near the edge of the breast are corrected effectively and structures within the breast (e.g. blood vessels, surgical clip) are better visualized. Statistical and clinical evaluations of registration accuracy in the CE-DBT images are ongoing.
On the inversion of geodetic integrals defined over the sphere using 1-D FFT
NASA Astrophysics Data System (ADS)
García, R. V.; Alejo, C. A.
2005-08-01
An iterative method is presented which performs inversion of integrals defined over the sphere. The method is based on one-dimensional fast Fourier transform (1-D FFT) inversion and is implemented with the projected Landweber technique, which is used to solve constrained least-squares problems reducing the associated 1-D cyclic-convolution error. The results obtained are as precise as the direct matrix inversion approach, but with better computational efficiency. A case study uses the inversion of Hotine’s integral to obtain gravity disturbances from geoid undulations. Numerical convergence is also analyzed and comparisons with respect to the direct matrix inversion method using conjugate gradient (CG) iteration are presented. Like the CG method, the number of iterations needed to get the optimum (i.e., small) error decreases as the measurement noise increases. Nevertheless, for discrete data given over a whole parallel band, the method can be applied directly without implementing the projected Landweber method, since no cyclic convolution error exists.
Cosmic Microwave Background Mapmaking with a Messenger Field
NASA Astrophysics Data System (ADS)
Huffenberger, Kevin M.; Næss, Sigurd K.
2018-01-01
We apply a messenger field method to solve the linear minimum-variance mapmaking equation in the context of Cosmic Microwave Background (CMB) observations. In simulations, the method produces sky maps that converge significantly faster than those from a conjugate gradient descent algorithm with a diagonal preconditioner, even though the computational cost per iteration is similar. The messenger method recovers large scales in the map better than conjugate gradient descent, and yields a lower overall χ2. In the single, pencil beam approximation, each iteration of the messenger mapmaking procedure produces an unbiased map, and the iterations become more optimal as they proceed. A variant of the method can handle differential data or perform deconvolution mapmaking. The messenger method requires no preconditioner, but a high-quality solution needs a cooling parameter to control the convergence. We study the convergence properties of this new method and discuss how the algorithm is feasible for the large data sets of current and future CMB experiments.
Semi-automatic sparse preconditioners for high-order finite element methods on non-uniform meshes
NASA Astrophysics Data System (ADS)
Austin, Travis M.; Brezina, Marian; Jamroz, Ben; Jhurani, Chetan; Manteuffel, Thomas A.; Ruge, John
2012-05-01
High-order finite elements often have a higher accuracy per degree of freedom than the classical low-order finite elements. However, in the context of implicit time-stepping methods, high-order finite elements present challenges to the construction of efficient simulations due to the high cost of inverting the denser finite element matrix. There are many cases where simulations are limited by the memory required to store the matrix and/or the algorithmic components of the linear solver. We are particularly interested in preconditioned Krylov methods for linear systems generated by discretization of elliptic partial differential equations with high-order finite elements. Using a preconditioner like Algebraic Multigrid can be costly in terms of memory due to the need to store matrix information at the various levels. We present a novel method for defining a preconditioner for systems generated by high-order finite elements that is based on a much sparser system than the original high-order finite element system. We investigate the performance for non-uniform meshes on a cube and a cubed sphere mesh, showing that the sparser preconditioner is more efficient and uses significantly less memory. Finally, we explore new methods to construct the sparse preconditioner and examine their effectiveness for non-uniform meshes. We compare results to a direct use of Algebraic Multigrid as a preconditioner and to a two-level additive Schwarz method.
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Crowley, Kay
1991-01-01
Run-time methods are studied to automatically parallelize and schedule iterations of a do loop in certain cases where compile-time information is inadequate. The methods presented involve execution time preprocessing of the loop. At compile-time, these methods set up the framework for performing a loop dependency analysis. At run-time, wavefronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. Symbolic transformation rules are used to produce: inspector procedures that perform execution time preprocessing, and executors or transformed versions of source code loop structures. These transformed loop structures carry out the calculations planned in the inspector procedures. Performance results are presented from experiments conducted on the Encore Multimax. These results illustrate that run-time reordering of loop indexes can have a significant impact on performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
He, Hongxing; Fang, Hengrui; Miller, Mitchell D.
2016-07-15
An iterative transform algorithm is proposed to improve the conventional molecular-replacement method for solving the phase problem in X-ray crystallography. Several examples of successful trial calculations carried out with real diffraction data are presented. An iterative transform method proposed previously for direct phasing of high-solvent-content protein crystals is employed for enhancing the molecular-replacement (MR) algorithm in protein crystallography. Target structures that are resistant to conventional MR due to insufficient similarity between the template and target structures might be tractable with this modified phasing method. Trial calculations involving three different structures are described to test and illustrate the methodology. The relationshipmore » of the approach to PHENIX Phaser-MR and MR-Rosetta is discussed.« less
Ramani, Sathish; Liu, Zhihao; Rosen, Jeffrey; Nielsen, Jon-Fredrik; Fessler, Jeffrey A.
2012-01-01
Regularized iterative reconstruction algorithms for imaging inverse problems require selection of appropriate regularization parameter values. We focus on the challenging problem of tuning regularization parameters for nonlinear algorithms for the case of additive (possibly complex) Gaussian noise. Generalized cross-validation (GCV) and (weighted) mean-squared error (MSE) approaches (based on Stein's Unbiased Risk Estimate— SURE) need the Jacobian matrix of the nonlinear reconstruction operator (representative of the iterative algorithm) with respect to the data. We derive the desired Jacobian matrix for two types of nonlinear iterative algorithms: a fast variant of the standard iterative reweighted least-squares method and the contemporary split-Bregman algorithm, both of which can accommodate a wide variety of analysis- and synthesis-type regularizers. The proposed approach iteratively computes two weighted SURE-type measures: Predicted-SURE and Projected-SURE (that require knowledge of noise variance σ2), and GCV (that does not need σ2) for these algorithms. We apply the methods to image restoration and to magnetic resonance image (MRI) reconstruction using total variation (TV) and an analysis-type ℓ1-regularization. We demonstrate through simulations and experiments with real data that minimizing Predicted-SURE and Projected-SURE consistently lead to near-MSE-optimal reconstructions. We also observed that minimizing GCV yields reconstruction results that are near-MSE-optimal for image restoration and slightly sub-optimal for MRI. Theoretical derivations in this work related to Jacobian matrix evaluations can be extended, in principle, to other types of regularizers and reconstruction algorithms. PMID:22531764
The fractal geometry of Hartree-Fock
NASA Astrophysics Data System (ADS)
Theel, Friethjof; Karamatskou, Antonia; Santra, Robin
2017-12-01
The Hartree-Fock method is an important approximation for the ground-state electronic wave function of atoms and molecules so that its usage is widespread in computational chemistry and physics. The Hartree-Fock method is an iterative procedure in which the electronic wave functions of the occupied orbitals are determined. The set of functions found in one step builds the basis for the next iteration step. In this work, we interpret the Hartree-Fock method as a dynamical system since dynamical systems are iterations where iteration steps represent the time development of the system, as encountered in the theory of fractals. The focus is put on the convergence behavior of the dynamical system as a function of a suitable control parameter. In our case, a complex parameter λ controls the strength of the electron-electron interaction. An investigation of the convergence behavior depending on the parameter λ is performed for helium, neon, and argon. We observe fractal structures in the complex λ-plane, which resemble the well-known Mandelbrot set, determine their fractal dimension, and find that with increasing nuclear charge, the fragmentation increases as well.
Iterative pass optimization of sequence data
NASA Technical Reports Server (NTRS)
Wheeler, Ward C.
2003-01-01
The problem of determining the minimum-cost hypothetical ancestral sequences for a given cladogram is known to be NP-complete. This "tree alignment" problem has motivated the considerable effort placed in multiple sequence alignment procedures. Wheeler in 1996 proposed a heuristic method, direct optimization, to calculate cladogram costs without the intervention of multiple sequence alignment. This method, though more efficient in time and more effective in cladogram length than many alignment-based procedures, greedily optimizes nodes based on descendent information only. In their proposal of an exact multiple alignment solution, Sankoff et al. in 1976 described a heuristic procedure--the iterative improvement method--to create alignments at internal nodes by solving a series of median problems. The combination of a three-sequence direct optimization with iterative improvement and a branch-length-based cladogram cost procedure, provides an algorithm that frequently results in superior (i.e., lower) cladogram costs. This iterative pass optimization is both computation and memory intensive, but economies can be made to reduce this burden. An example in arthropod systematics is discussed. c2003 The Willi Hennig Society. Published by Elsevier Science (USA). All rights reserved.