diagonal implicit algorithm: Topics by Science.gov

Sample records for diagonal implicit algorithm

An improved semi-implicit method for structural dynamics analysis

NASA Technical Reports Server (NTRS)

Park, K. C.

1982-01-01

A semi-implicit algorithm is presented for direct time integration of the structural dynamics equations. The algorithm avoids the factoring of the implicit difference solution matrix and mitigates the unacceptable accuracy losses which plagued previous semi-implicit algorithms. This substantial accuracy improvement is achieved by augmenting the solution matrix with two simple diagonal matrices of the order of the integration truncation error.
Diagonalization of complex symmetric matrices: Generalized Householder reflections, iterative deflation and implicit shifts

NASA Astrophysics Data System (ADS)

Noble, J. H.; Lubasch, M.; Stevens, J.; Jentschura, U. D.

2017-12-01

We describe a matrix diagonalization algorithm for complex symmetric (not Hermitian) matrices, $A = A^{T}$, which is based on a two-step algorithm involving generalized Householder reflections based on the indefinite inner product $\langle u, v \rangle_{*} = \sum_i u_i v_i$. This inner product is linear in both arguments and avoids complex conjugation. The complex symmetric input matrix is transformed to tridiagonal form using generalized Householder transformations (first step). An iterative, generalized QL decomposition of the tridiagonal matrix employing an implicit shift converges toward diagonal form (second step). The QL algorithm employs iterative deflation techniques when a machine-precision zero is encountered "prematurely" on the super-/sub-diagonal. The algorithm allows for a reliable and computationally efficient computation of resonance and antiresonance energies which emerge from complex-scaled Hamiltonians, and for the numerical determination of the real energy eigenvalues of pseudo-Hermitian and PT-symmetric Hamilton matrices. Numerical reference values are provided.
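To illustrate the conjugation-free inner product described in the abstract above, the following Python sketch builds a single generalized Householder reflection that maps a complex vector onto a multiple of the first unit vector. It is not the authors' implementation: the function names (`bilinear`, `householder_vector`, `reflect`) and the sign convention are assumptions made for this example, and the sketch simply fails if the bilinear norm of the reflection vector vanishes.

```python
import cmath


def bilinear(u, v):
    """Indefinite inner product sum_i u_i * v_i (no complex conjugation)."""
    return sum(a * b for a, b in zip(u, v))


def householder_vector(x):
    """Return (v, s) such that reflecting x about v gives -s * e_1.

    s is a complex square root of the bilinear form <x, x>; the sign is
    chosen (an assumption of this sketch) to avoid cancellation in x[0] + s.
    """
    s = cmath.sqrt(bilinear(x, x))
    if abs(x[0] - s) > abs(x[0] + s):
        s = -s
    v = list(x)
    v[0] += s
    return v, s


def reflect(v, x):
    """Apply H = I - 2 v v^T / (v^T v) to x, using the bilinear form."""
    vv = bilinear(v, v)
    if vv == 0:
        raise ZeroDivisionError("reflection vector has zero bilinear norm")
    c = 2 * bilinear(v, x) / vv
    return [xi - c * vi for xi, vi in zip(x, v)]


if __name__ == "__main__":
    x = [1 + 2j, 3 - 1j, 0.5j]
    v, s = householder_vector(x)
    print(reflect(v, x))  # all entries after the first should be ~0
```

Because H is symmetric and satisfies H^T H = I under this bilinear form, applying it from both sides of a complex symmetric matrix keeps the matrix complex symmetric, which is what makes a tridiagonalization step of this kind possible.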
A diagonal algorithm for the method of pseudocompressibility. [for steady-state solution to incompressible Navier-Stokes equation]

NASA Technical Reports Server (NTRS)

Rogers, S. E.; Kwak, D.; Chang, J. L. C.

1986-01-01

The method of pseudocompressibility has been shown to be an efficient method for obtaining a steady-state solution to the incompressible Navier-Stokes equations. Recent improvements to this method include the use of a diagonal scheme for the inversion of the equations at each iteration. The necessary transformations have been derived for the pseudocompressibility equations in generalized coordinates. The diagonal algorithm reduces the computing time necessary to obtain a steady-state solution by a factor of nearly three. Implicit viscous terms are maintained in the equations, and it has become possible to use fourth-order implicit dissipation. The steady-state solution is unchanged by the approximations resulting from the diagonalization of the equations. Computed results for flow over a two-dimensional backward-facing step and a three-dimensional cylinder mounted normal to a flat plate are presented for both the old and new algorithms. The accuracy and computing efficiency of these algorithms are compared.
Nonequilibrium thermo-chemical calculations using a diagonal implicit scheme

NASA Technical Reports Server (NTRS)

Imlay, Scott T.; Roberts, Donald W.; Soetrisno, Moeljo; Eberhardt, Scott

1991-01-01

A recently developed computer program for hypersonic vehicle flow analysis is described. The program uses a diagonal implicit algorithm to solve the equations of viscous flow for a gas in thermochemical nonequilibrium. The diagonal scheme eliminates the expense of inverting large block matrices that arise when species conservation equations are introduced. The program uses multiple zones of grids patched together and includes radiation wall and rarefied gas boundary conditions. Solutions are presented for hypersonic flows of air and hydrogen air mixtures.
A diagonal implicit scheme for computing flows with finite-rate chemistry

NASA Technical Reports Server (NTRS)

Eberhardt, Scott; Imlay, Scott

1990-01-01

A new algorithm for solving steady, finite-rate chemistry, flow problems is presented. The new scheme eliminates the expense of inverting large block matrices that arise when species conservation equations are introduced. The source Jacobian matrix is replaced by a diagonal matrix which is tailored to account for the fastest reactions in the chemical system. A point-implicit procedure is discussed and then the algorithm is included into the LU-SGS scheme. Solutions are presented for hypervelocity reentry and Hydrogen-Oxygen combustion. For the LU-SGS scheme a CFL number in excess of 10,000 has been achieved.
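The idea of replacing the source Jacobian by a diagonal matrix can be sketched on a toy two-species relaxation problem. The Python example below is an illustration under stated assumptions, not the scheme from the paper: the model reaction, the function names, and the choice of simply keeping the diagonal entries of the exact Jacobian are all hypothetical. The paper tailors its diagonal matrix to the fastest reactions, which this sketch does not attempt.

```python
def source(u, k):
    """Source term for a toy reversible reaction A <-> B with rate k."""
    a, b = u
    r = k * (a - b)
    return [-r, r]


def point_implicit_step(u, dt, k):
    """One point-implicit update with a diagonal stand-in for the Jacobian.

    Solves (I - dt * D) du = dt * S(u), where D = diag(-k, -k) holds the
    diagonal entries of the exact source Jacobian (an assumption of this
    sketch). No matrix inversion is needed because D is diagonal.
    """
    s = source(u, k)
    d = [-k, -k]
    return [ui + dt * si / (1.0 - dt * di) for ui, si, di in zip(u, s, d)]


def explicit_step(u, dt, k):
    """Forward-Euler update, shown only for comparison."""
    s = source(u, k)
    return [ui + dt * si for ui, si in zip(u, s)]


if __name__ == "__main__":
    u = [1.0, 0.0]
    for _ in range(60):
        u = point_implicit_step(u, dt=3e-4, k=1e4)
    print(u)  # both species approach the equilibrium value 0.5
```

With k * dt = 3 the explicit update amplifies the difference a - b at every step, while the diagonal point-implicit update damps it and leaves both the conserved sum a + b and the steady state a = b unchanged.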
Multigrid calculation of three-dimensional turbomachinery flows

NASA Technical Reports Server (NTRS)

Caughey, David A.

1989-01-01

Research was performed in the general area of computational aerodynamics, with particular emphasis on the development of efficient techniques for the solution of the Euler and Navier-Stokes equations for transonic flows through the complex blade passages associated with turbomachines. In particular, multigrid methods were developed, using both explicit and implicit time-stepping schemes as smoothing algorithms. The specific accomplishments of the research have included: (1) the development of an explicit multigrid method to solve the Euler equations for three-dimensional turbomachinery flows based upon the multigrid implementation of Jameson's explicit Runge-Kutta scheme (Jameson 1983); (2) the development of an implicit multigrid scheme for the three-dimensional Euler equations based upon lower-upper factorization; (3) the development of a multigrid scheme using a diagonalized alternating direction implicit (ADI) algorithm; (4) the extension of the diagonalized ADI multigrid method to solve the Euler equations of inviscid flow for three-dimensional turbomachinery flows; and also (5) the extension of the diagonalized ADI multigrid scheme to solve the Reynolds-averaged Navier-Stokes equations for two-dimensional turbomachinery flows.
Block Preconditioning to Enable Physics-Compatible Implicit Multifluid Plasma Simulations

NASA Astrophysics Data System (ADS)

Phillips, Edward; Shadid, John; Cyr, Eric; Miller, Sean

2017-10-01

Multifluid plasma simulations involve large systems of partial differential equations in which many time-scales ranging over many orders of magnitude arise. Since the fastest of these time-scales may set a restrictively small time-step limit for explicit methods, the use of implicit or implicit-explicit time integrators can be more tractable for obtaining dynamics at time-scales of interest. Furthermore, to enforce properties such as charge conservation and divergence-free magnetic field, mixed discretizations using volume, nodal, edge-based, and face-based degrees of freedom are often employed in some form. Together with the presence of stiff modes due to integrating over fast time-scales, the mixed discretization makes the required linear solves for implicit methods particularly difficult for black box and monolithic solvers. This work presents a block preconditioning strategy for multifluid plasma systems that segregates the linear system based on discretization type and approximates off-diagonal coupling in block diagonal Schur complement operators. By employing multilevel methods for the block diagonal subsolves, this strategy yields algorithmic and parallel scalability which we demonstrate on a range of problems.
Semi-implicit finite difference methods for three-dimensional shallow water flow

USGS Publications Warehouse

Casulli, Vincenzo; Cheng, Ralph T.

1992-01-01

A semi-implicit finite difference method for the numerical solution of three-dimensional shallow water flows is presented and discussed. The governing equations are the primitive three-dimensional turbulent mean flow equations where the pressure distribution in the vertical has been assumed to be hydrostatic. In the method of solution a minimal degree of implicitness has been adopted in such a fashion that the resulting algorithm is stable and gives a maximal computational efficiency at a minimal computational cost. At each time step the numerical method requires the solution of one large linear system which can be formally decomposed into a set of small three-diagonal systems coupled with one five-diagonal system. All these linear systems are symmetric and positive definite. Thus the existence and uniqueness of the numerical solution are assured. When only one vertical layer is specified, this method reduces as a special case to a semi-implicit scheme for solving the corresponding two-dimensional shallow water equations. The resulting two- and three-dimensional algorithm has been shown to be fast, accurate and mass-conservative and can also be applied to simulate flooding and drying of tidal mud-flats in conjunction with three-dimensional flows. Furthermore, the resulting algorithm is fully vectorizable for an efficient implementation on modern vector computers.
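The small three-diagonal systems mentioned above are the kind of system the classical Thomas algorithm solves in linear time. The Python sketch below shows that algorithm for a generic tridiagonal system; it is offered as background and is not taken from the paper. The function name and the array layout (`lower[i]` multiplies `x[i-1]`, `upper[i]` multiplies `x[i+1]`, with `lower[0]` and `upper[-1]` unused) are assumptions of this example.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    No pivoting is performed, which is safe for symmetric positive
    definite (or diagonally dominant) systems.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    n = 5
    lower = [0.0] + [-1.0] * (n - 1)
    diag = [2.0] * n
    upper = [-1.0] * (n - 1) + [0.0]
    print(solve_tridiagonal(lower, diag, upper, [0.0, 0.0, 0.0, 0.0, 6.0]))
```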
Exponential-fitted methods for integrating stiff systems of ordinary differential equations: Applications to homogeneous gas-phase chemical kinetics

NASA Technical Reports Server (NTRS)

Pratt, D. T.

1984-01-01

Conventional algorithms for the numerical integration of ordinary differential equations (ODEs) are based on the use of polynomial functions as interpolants. However, the exact solutions of stiff ODEs behave like decaying exponential functions, which are poorly approximated by polynomials. An obvious choice of interpolant is the exponential functions themselves, or their low-order diagonal Pade (rational function) approximants. A number of explicit, A-stable, integration algorithms were derived from the use of a three-parameter exponential function as interpolant, and their relationship to low-order, polynomial-based and rational-function-based implicit and explicit methods was shown by examining their low-order diagonal Pade approximants. A robust implicit formula was derived by exponential fitting the trapezoidal rule. Application of these algorithms to integration of the ODEs governing homogeneous, gas-phase chemical kinetics was demonstrated in a developmental code CREK1D, which compares favorably with the Gear-Hindmarsh code LSODE in spite of the use of a primitive stepsize control strategy.
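The link between the trapezoidal rule, the diagonal Pade approximant, and exponential fitting can be made concrete for the scalar test equation y' = lambda * y with z = h * lambda. The Python sketch below uses a standard textbook form of exponential fitting for the theta-method; it is not claimed to be the formula derived in the report, and all function names are assumptions of this example.

```python
import math


def pade11(z):
    """Diagonal (1,1) Pade approximant of exp(z).

    This is also the amplification factor of the trapezoidal rule.
    """
    return (1.0 + z / 2.0) / (1.0 - z / 2.0)


def theta_amplification(z, theta):
    """Amplification factor of y_new = y + h*((1-theta)*f_old + theta*f_new)."""
    return (1.0 + (1.0 - theta) * z) / (1.0 - theta * z)


def fitted_theta(z):
    """Choose theta so that the theta-method is exact for y' = lambda*y.

    Solving theta_amplification(z, theta) == exp(z) for theta gives
    theta = 1/z - 1/(exp(z) - 1), which tends to 1/2 (trapezoidal rule)
    as z -> 0 and to 1 (backward Euler) as z -> -infinity.
    """
    if abs(z) < 1e-8:
        return 0.5
    return 1.0 / z - 1.0 / math.expm1(z)


if __name__ == "__main__":
    for z in (-0.1, -2.0, -100.0):
        fitted = theta_amplification(z, fitted_theta(z))
        print(z, math.exp(z), pade11(z), fitted)
```

For small |z| the Pade factor tracks exp(z) closely, but for strongly negative z it approaches -1 instead of decaying, which is the weakness that exponential fitting is meant to remove.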
Numerical Aspects of Atomic Physics: Helium Basis Sets and Matrix Diagonalization

NASA Astrophysics Data System (ADS)

Jentschura, Ulrich; Noble, Jonathan

2014-03-01

We present a matrix diagonalization algorithm for complex symmetric matrices, which can be used in order to determine the resonance energies of auto-ionizing states of comparatively simple quantum many-body systems such as helium. The algorithm is based on multi-precision arithmetic and proceeds via a tridiagonalization of the complex symmetric (not necessarily Hermitian) input matrix using generalized Householder transformations. Example calculations involving so-called PT-symmetric quantum systems lead to reference values which pertain to the imaginary cubic perturbation (the imaginary cubic anharmonic oscillator). We then proceed to novel basis sets for the helium atom and present results for Bethe logarithms in hydrogen and helium, obtained using the enhanced numerical techniques. Some intricacies of "canned" algorithms such as those used in LAPACK will be discussed. Our algorithm, for complex symmetric matrices such as those describing cubic resonances after complex scaling, is faster than LAPACK's built-in routines, for specific classes of input matrices. It also offers flexibility in terms of the calculation of the so-called implicit shift, which is used in order to "pivot" the system toward the convergence to diagonal form. We conclude with a wider overview.
Universal single level implicit algorithm for gasdynamics

NASA Technical Reports Server (NTRS)

Lombard, C. K.; Venkatapthy, E.

1984-01-01

A single level effectively explicit implicit algorithm for gasdynamics is presented. The method meets all the requirements for unconditionally stable global iteration over flows with mixed subsonic and supersonic zones including blunt body flow and boundary layer flows with strong interaction and streamwise separation. For hyperbolic (supersonic flow) regions the method is automatically equivalent to contemporary space marching methods. For elliptic (subsonic flow) regions, rapid convergence is facilitated by alternating direction solution sweeps which bring both sets of eigenvectors and the influence of both boundaries of a coordinate line equally into play. Point by point updating of the data with local iteration on the solution procedure at each spatial step as the sweeps progress not only renders the method single level in storage but also improves nonlinear accuracy to accelerate convergence by an order of magnitude over related two level linearized implicit methods. The method derives robust stability from the combination of an eigenvector split upwind difference method (CSCM) with diagonally dominant ADI (DDADI) approximate factorization and computed characteristic boundary approximations.
A comparison of two central difference schemes for solving the Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Maksymiuk, C. M.; Swanson, R. C.; Pulliam, T. H.

1990-01-01

Five viscous transonic airfoil cases were computed by two significantly different computational fluid dynamics codes: an explicit finite-volume algorithm with multigrid, and an implicit finite-difference approximate-factorization method with eigenvector diagonalization. Both methods are described in detail, and their performance on the test cases is compared. The codes utilized the same grids, turbulence model, and computer to provide the truest test of the algorithms. The two approaches produce very similar results, which, for attached flows, also agree well with experimental results; however, the explicit code is considerably faster.
Implicit high-order discontinuous Galerkin method with HWENO type limiters for steady viscous flow simulations

NASA Astrophysics Data System (ADS)

Jiang, Zhen-Hua; Yan, Chao; Yu, Jian

2013-08-01

Two types of implicit algorithms have been improved for the high-order discontinuous Galerkin (DG) method to solve the compressible Navier-Stokes (NS) equations on triangular grids. A block lower-upper symmetric Gauss-Seidel (BLU-SGS) approach is implemented as a nonlinear iterative scheme, and a modified LU-SGS (LLU-SGS) approach is suggested to reduce the memory requirements while retaining the good convergence performance of the original LU-SGS approach. Both implicit schemes have the significant advantage that only the diagonal block matrix is stored. The resulting implicit high-order DG methods are applied, in combination with Hermite weighted essentially non-oscillatory (HWENO) limiters, to solve viscous flow problems. Numerical results demonstrate that the present implicit methods achieve significant efficiency improvements over their explicit counterparts and that, for viscous flows with shocks, the HWENO limiters can be used to achieve the desired essentially non-oscillatory shock transition and the designed high-order accuracy simultaneously.
Aerodynamics of Engine-Airframe Interaction

NASA Technical Reports Server (NTRS)

Caughey, D. A.

1986-01-01

The report describes progress in research directed towards the efficient solution of the inviscid Euler and Reynolds-averaged Navier-Stokes equations for transonic flows through engine inlets, and past complete aircraft configurations, with emphasis on the flowfields in the vicinity of engine inlets. The research focusses upon the development of solution-adaptive grid procedures for these problems, and the development of multi-grid algorithms in conjunction with both implicit and explicit time-stepping schemes for the solution of three-dimensional problems. The work includes further development of mesh systems suitable for inlet and wing-fuselage-inlet geometries using a variational approach. Work during this reporting period concentrated upon two-dimensional problems, and has been in two general areas: (1) the development of solution-adaptive procedures to cluster the grid cells in regions of high (truncation) error; and (2) the development of a multigrid scheme for solution of the two-dimensional Euler equations using a diagonalized alternating direction implicit (ADI) smoothing algorithm.
Development of advanced Navier-Stokes solver

NASA Technical Reports Server (NTRS)

Yoon, Seokkwan

1994-01-01

The objective of research was to develop and validate new computational algorithms for solving the steady and unsteady Euler and Navier-Stokes equations. The end-products are new three-dimensional Euler and Navier-Stokes codes that are faster, more reliable, more accurate, and easier to use. The three-dimensional Euler and full/thin-layer Reynolds-averaged Navier-Stokes equations for compressible/incompressible flows are solved on structured hexahedral grids. The Baldwin-Lomax algebraic turbulence model is used for closure. The space discretization is based on a cell-centered finite-volume method augmented by a variety of numerical dissipation models with optional total variation diminishing limiters. The governing equations are integrated in time by an implicit method based on lower-upper factorization and symmetric Gauss-Seidel relaxation. The algorithm is vectorized on diagonal planes of sweep using two-dimensional indices in three dimensions. Convergence rates and the robustness of the codes are enhanced by the use of an implicit full approximation storage multigrid method.
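The combination of lower-upper factorization with symmetric Gauss-Seidel relaxation can be sketched on a small dense linear system. The Python example below is a scalar illustration only: it uses the textbook symmetric Gauss-Seidel operator M = (D + L) D^{-1} (D + U) as an approximate inverse inside a simple correction loop. Production LU-SGS flow solvers work with approximate flux Jacobians and sweep over grid planes, none of which is modeled here; the function names are assumptions of this example.

```python
def sgs_apply(A, r):
    """Return M^{-1} r for M = (D + L) D^{-1} (D + U).

    D, L and U are the diagonal, strictly lower and strictly upper
    parts of A. Only two triangular sweeps are needed; A is never factored.
    """
    n = len(A)
    y = [0.0] * n
    for i in range(n):  # forward sweep: (D + L) y = r
        s = r[i] - sum(A[i][j] * y[j] for j in range(i))
        y[i] = s / A[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward sweep: (D + U) x = D y
        s = A[i][i] * y[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / A[i][i]
    return x


def sgs_solve(A, b, iterations=30):
    """Iterate x <- x + M^{-1} (b - A x) starting from zero."""
    n = len(A)
    x = [0.0] * n
    for _ in range(iterations):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        dx = sgs_apply(A, r)
        x = [xi + di for xi, di in zip(x, dx)]
    return x


if __name__ == "__main__":
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    print(sgs_solve(A, [2.0, 4.0, 10.0]))
```

The example matrix is diagonally dominant, which is the property that makes the two sweeps a good approximate inverse; convergence is not guaranteed for arbitrary matrices.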
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

NASA Technical Reports Server (NTRS)

Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

1992-01-01

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
Preconditioned conjugate gradient methods for the compressible Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.

1990-01-01

The compressible Navier-Stokes equations are solved for a variety of two-dimensional inviscid and viscous problems by preconditioned conjugate gradient-like algorithms. Roe's flux difference splitting technique is used to discretize the inviscid fluxes. The viscous terms are discretized by using central differences. An algebraic turbulence model is also incorporated. The system of linear equations which arises out of the linearization of a fully implicit scheme is solved iteratively by the well known methods of GMRES (Generalized Minimum Residual technique) and Chebyshev iteration. Incomplete LU factorization and block diagonal factorization are used as preconditioners. The resulting algorithm is competitive with the best current schemes, but has wide applications in parallel computing and unstructured mesh computations.
On the development of OpenFOAM solvers based on explicit and implicit high-order Runge-Kutta schemes for incompressible flows with heat transfer

NASA Astrophysics Data System (ADS)

D'Alessandro, Valerio; Binci, Lorenzo; Montelpare, Sergio; Ricci, Renato

2018-01-01

Open-source CFD codes provide suitable environments for implementing and testing low-dissipative algorithms typically used to simulate turbulence. In this research work we developed CFD solvers for incompressible flows based on high-order explicit and diagonally implicit Runge-Kutta (RK) schemes for time integration. In particular, an iterated PISO-like procedure based on Rhie-Chow correction was used to handle pressure-velocity coupling within each implicit RK stage. For the explicit approach, a projected scheme was used to avoid the "checker-board" effect. The above-mentioned approaches were also extended to flow problems involving heat transfer. It is worth noting that the numerical technology available in the OpenFOAM library was used for space discretization. In this work, we additionally explore the reliability and effectiveness of the proposed implementations by computing several unsteady flow benchmarks; we also show that the numerical diffusion due to the time integration approach is completely canceled using the solution techniques proposed here.
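For readers unfamiliar with diagonally implicit Runge-Kutta schemes, the Python sketch below applies a well-known two-stage, second-order, L-stable scheme (with gamma = 1 - sqrt(2)/2) to the scalar test problem y' = lambda * y. It is a generic illustration, not the OpenFOAM implementation described above, and the function names are assumptions. Because the test problem is linear, each implicit stage is solved in closed form; in a flow solver each stage would instead require a linear or nonlinear solve such as the PISO-like procedure mentioned in the abstract.

```python
import math

# Diagonal coefficient of the two-stage scheme; both stages share it,
# so every stage involves the same implicit operator (1 - h*GAMMA*lam).
GAMMA = 1.0 - math.sqrt(2.0) / 2.0


def sdirk2_step(y, h, lam):
    """Advance y' = lam * y by one step of size h.

    Butcher tableau:  GAMMA | GAMMA      0
                      1     | 1-GAMMA    GAMMA
                      ------+------------------
                            | 1-GAMMA    GAMMA
    """
    denom = 1.0 - h * GAMMA * lam
    k1 = lam * y / denom                               # stage 1
    k2 = lam * (y + h * (1.0 - GAMMA) * k1) / denom    # stage 2 uses k1 only
    return y + h * ((1.0 - GAMMA) * k1 + GAMMA * k2)


def integrate(y0, lam, t_end, n_steps):
    """Integrate from t = 0 to t_end with n_steps equal steps."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = sdirk2_step(y, h, lam)
    return y


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for n in (50, 100, 200):
        print(n, abs(integrate(1.0, -1.0, 1.0, n) - exact))
```

Halving the step size should reduce the error by roughly a factor of four, consistent with second-order accuracy, and a single step with a very stiff lambda is strongly damped rather than amplified.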
An implicit LU scheme for the Euler equations applied to arbitrary cascades. [new method of factoring]

NASA Technical Reports Server (NTRS)

Buratynski, E. K.; Caughey, D. A.

1984-01-01

An implicit scheme for solving the Euler equations is derived and demonstrated. The alternating-direction implicit (ADI) technique is modified, using two implicit-operator factors corresponding to lower-block-diagonal (L) or upper-block-diagonal (U) algebraic systems which can be easily inverted. The resulting LU scheme is implemented in finite-volume mode and applied to 2D subsonic and transonic cascade flows with differing degrees of geometric complexity. The results are presented graphically and found to be in good agreement with those of other numerical and analytical approaches. The LU method is also 2.0-3.4 times faster than ADI, suggesting its value in calculating 3D problems.
Iterative algorithm for joint zero diagonalization with application in blind source separation.

PubMed

Zhang, Wei-Tao; Lou, Shun-Tian

2011-07-01

A new iterative algorithm for the nonunitary joint zero diagonalization of a set of matrices is proposed for blind source separation applications. On one hand, since the zero diagonalizer of the proposed algorithm is constructed iteratively by successive multiplications of an invertible matrix, the singular solutions that occur in the existing nonunitary iterative algorithms are naturally avoided. On the other hand, compared to the algebraic method for joint zero diagonalization, the proposed algorithm requires fewer matrices to be zero diagonalized to yield even better performance. The extension of the algorithm to the complex and nonsquare mixing cases is also addressed. Numerical simulations on both synthetic data and blind source separation using time-frequency distributions illustrate the performance of the algorithm and provide a comparison to the leading joint zero diagonalization schemes.

Upwind MacCormack Euler solver with non-equilibrium chemistry

NASA Technical Reports Server (NTRS)

Sherer, Scott E.; Scott, James N.

1993-01-01

A computer code, designated UMPIRE, is currently under development to solve the Euler equations in two dimensions with non-equilibrium chemistry. UMPIRE employs an explicit MacCormack algorithm with dissipation introduced via Roe's flux-difference split upwind method. The code also has the capability to employ a point-implicit methodology for flows where stiffness is introduced through the chemical source term. A technique consisting of diagonal sweeps across the computational domain from each corner is presented, which is used to reduce storage and execution requirements. Results depicting one dimensional shock tube flow for both calorically perfect gas and thermally perfect, dissociating nitrogen are presented to verify current capabilities of the program. Also, computational results from a chemical reactor vessel with no fluid dynamic effects are presented to check the chemistry capability and to verify the point implicit strategy.
Numerical study of supersonic combustion using a finite rate chemistry model

NASA Technical Reports Server (NTRS)

Chitsomboon, T.; Tiwari, S. N.; Kumar, A.; Drummond, J. P.

1986-01-01

The governing equations of two-dimensional chemically reacting flows are presented together with a global two-step chemistry model for H2-air combustion. The explicit unsplit MacCormack finite difference algorithm is used to advance the discrete system of the governing equations in time until convergence is attained. The source terms in the species equations are evaluated implicitly to alleviate stiffness associated with fast reactions. With implicit source terms, the species equations give rise to a block-diagonal system which can be solved very efficiently on vector-processing computers. A supersonic reacting flow in an inlet-combustor configuration is calculated for the case where H2 is injected into the flow from the side walls and the strut. Results of the calculation are compared against the results obtained by using a complete reaction model.
Parallelization of implicit finite difference schemes in computational fluid dynamics

NASA Technical Reports Server (NTRS)

Decker, Naomi H.; Naik, Vijay K.; Nicoules, Michel

1990-01-01

Implicit finite difference schemes are often the preferred numerical schemes in computational fluid dynamics, requiring less stringent stability bounds than the explicit schemes. Each iteration in an implicit scheme involves global data dependencies in the form of second and higher order recurrences. Efficient parallel implementations of such iterative methods are considerably more difficult and non-intuitive. The parallelization of the implicit schemes that are used for solving the Euler and the thin layer Navier-Stokes equations and that require inversions of large linear systems in the form of block tri-diagonal and/or block penta-diagonal matrices is discussed. Three-dimensional cases are emphasized and schemes that minimize the total execution time are presented. Partitioning and scheduling schemes for alleviating the effects of the global data dependencies are described. An analysis of the communication and the computation aspects of these methods is presented. The effect of the boundary conditions on the parallel schemes is also discussed.
An Initial Investigation of the Effects of Turbulence Models on the Convergence of the RK/Implicit Scheme

NASA Technical Reports Server (NTRS)

Swanson, R. C.; Rossow, C.-C.

2008-01-01

A three-stage Runge-Kutta (RK) scheme with multigrid and an implicit preconditioner has been shown to be an effective solver for the fluid dynamic equations. This scheme has been applied to both the compressible and essentially incompressible Reynolds-averaged Navier-Stokes (RANS) equations using the algebraic turbulence model of Baldwin and Lomax (BL). In this paper we focus on the convergence of the RK/implicit scheme when the effects of turbulence are represented by either the Spalart-Allmaras model or the Wilcox k-ω model, which are frequently used models in practical fluid dynamic applications. The convergence behavior of the scheme with these turbulence models and with the BL model is directly compared. For this initial investigation we solve the flow equations and the partial differential equations of the turbulence models indirectly coupled. With this approach we examine the convergence behavior of each system. Both point and line symmetric Gauss-Seidel are considered for approximating the inverse of the implicit operator of the flow solver. To solve the turbulence equations we use a diagonally dominant alternating direction implicit (DDADI) scheme. Computational results are presented for three airfoil flow cases and comparisons are made with experimental data. We demonstrate that the two-dimensional RANS equations and transport-type equations for turbulence modeling can be efficiently solved with an indirectly coupled algorithm that uses the RK/implicit scheme for the flow equations.
A space-time lower-upper symmetric Gauss-Seidel scheme for the time-spectral method

NASA Astrophysics Data System (ADS)

Zhan, Lei; Xiong, Juntao; Liu, Feng

2016-05-01

The time-spectral method (TSM) offers the advantage of increased order of accuracy compared to methods using finite-difference in time for periodic unsteady flow problems. Explicit Runge-Kutta pseudo-time marching and implicit schemes have been developed to solve iteratively the space-time coupled nonlinear equations resulting from TSM. Convergence of the explicit schemes is slow because of the stringent time-step limit. Many implicit methods have been developed for TSM. Their computational efficiency is, however, still limited in practice because of delayed implicit temporal coupling, multiple iterative loops, costly matrix operations, or lack of strong diagonal dominance of the implicit operator matrix. To overcome these shortcomings, an efficient space-time lower-upper symmetric Gauss-Seidel (ST-LU-SGS) implicit scheme with multigrid acceleration is presented. In this scheme, the implicit temporal coupling term is split as one additional dimension of space in the LU-SGS sweeps. To improve numerical stability for periodic flows with high frequency, a modification to the ST-LU-SGS scheme is proposed. Numerical results show that fast convergence is achieved using large or even infinite Courant-Friedrichs-Lewy (CFL) numbers for unsteady flow problems with moderately high frequency and with the use of moderately high numbers of time intervals. The ST-LU-SGS implicit scheme is also found to work well in calculating periodic flow problems where the frequency is not known a priori and needs to be determined by using a combined Fourier analysis and gradient-based search algorithm.
Multi-processing on supercomputers for computational aerodynamics

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; Mehta, Unmeel B.

1990-01-01

The MIMD concept is applied, through multitasking, with relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. An existing single processor algorithm is mapped without the need for developing a new algorithm. The procedure of designing a code utilizing this approach is automated with the Unix stream editor. A Multiple Processor Multiple Grid (MPMG) code is developed as a demonstration of this approach. This code solves the three-dimensional, Reynolds-averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. This solver is applied to a generic, oblique-wing aircraft problem on a four-processor computer using one process for data management and nonparallel computations and three processes for pseudotime advance on three different grid systems.
A Diagonal-Steering-Based Binaural Beamforming Algorithm Incorporating a Diagonal Speech Localizer for Persons With Bilateral Hearing Impairment.

PubMed

Lee, Jun Chang; Nam, Kyoung Won; Jang, Dong Pyo; Kim, In Young

2015-12-01

Previously suggested diagonal-steering algorithms for binaural hearing support devices have commonly assumed that the direction of the speech signal is known in advance, which is often not the case in real circumstances. In this study, a new diagonal-steering-based binaural speech localization (BSL) algorithm is proposed, and the performances of the BSL algorithm and the binaural beamforming algorithm, which integrates the BSL and diagonal-steering algorithms, were evaluated using actual speech-in-noise signals in several simulated listening scenarios. Testing sounds were recorded in a KEMAR mannequin setup and two objective indices, improvements in signal-to-noise ratio (SNRi) and segmental SNR (segSNRi), were utilized for performance evaluation. Experimental results demonstrated that the accuracy of the BSL was in the 90-100% range when the input SNR was in the -10 to +5 dB range. The average differences between the γ-adjusted and γ-fixed diagonal-steering algorithms (for -15 to +5 dB input SNR) in the talking in the restaurant scenario were 0.203-0.937 dB for SNRi and 0.052-0.437 dB for segSNRi, and in the listening while car driving scenario, the differences were 0.387-0.835 dB for SNRi and 0.259-1.175 dB for segSNRi. In addition, the average difference between the BSL-turned-on and the BSL-turned-off cases for the binaural beamforming algorithm in the listening while car driving scenario was 1.631-4.246 dB for SNRi and 0.574-2.784 dB for segSNRi. In all testing conditions, the γ-adjusted diagonal-steering and BSL algorithm improved the values of the indices more than the conventional algorithms. The binaural beamforming algorithm, which integrates the proposed BSL and diagonal-steering algorithm, is expected to improve the performance of the binaural hearing support devices in noisy situations. Copyright © 2015 International Center for Artificial Organs and Transplantation and Wiley Periodicals, Inc.
Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheid, Robert E.

1989-01-01

The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves a significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
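The diagonal preconditioner mentioned in this abstract (the diagonal of the system matrix used as the preconditioner) can be illustrated with a minimal sketch. It assumes a small dense symmetric positive-definite matrix stored as nested lists; the name `pcg_jacobi` and the example matrix are illustrative assumptions, and the sketch does not reproduce the manipulator-specific savings the paper describes.

```python
import math


def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Conjugate gradient for A x = b with M = diag(A) as preconditioner.

    A is assumed symmetric positive definite (hypothetical dense storage).
    Returns the approximate solution and the number of iterations used.
    """
    n = len(b)
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    matvec = lambda M, v: [dot(row, v) for row in M]

    x = [0.0] * n
    r = list(b)                              # residual for the zero initial guess
    if math.sqrt(dot(r, r)) < tol:
        return x, 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [inv_diag[i] * r[i] for i in range(n)]   # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if math.sqrt(dot(r, r)) < tol:
            return x, k
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    b = [1.0, 2.0, 3.0]
    x, iters = pcg_jacobi(A, b)
    print(x, iters)
```

Applying the preconditioner costs only one multiplication per unknown, which is why a diagonal choice is attractive when the diagonal entries are already available from another computation.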
Diagonally Implicit Runge-Kutta Methods for Ordinary Differential Equations. A Review

NASA Technical Reports Server (NTRS)

Kennedy, Christopher A.; Carpenter, Mark H.

2016-01-01

A review of diagonally implicit Runge-Kutta (DIRK) methods applied to first-order ordinary differential equations (ODEs) is undertaken. The goal of this review is to summarize the characteristics, assess the potential, and then design several nearly optimal, general purpose, DIRK-type methods. Over 20 important aspects of DIRK-type methods are reviewed. A design study is then conducted on DIRK-type methods having from two to seven implicit stages. From this, 15 schemes are selected for general purpose application. Testing of the 15 chosen methods is done on three singular perturbation problems. Based on the review of method characteristics, these methods focus on having a stage order of two, stiff accuracy, L-stability, high quality embedded and dense-output methods, small magnitudes of the algebraic stability matrix eigenvalues, small values of a_ii, and small or vanishing values of the internal stability function for large eigenvalues of the Jacobian. Among the 15 new methods, ESDIRK4(3)6L[2]SA is recommended as a good default method for solving stiff problems at moderate error tolerances.
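To make the DIRK idea concrete, here is a minimal sketch of a well-known two-stage, second-order, L-stable, stiffly accurate SDIRK scheme with diagonal coefficient γ = 1 − √2/2, applied to the linear test equation y' = λy. This is not one of the 15 schemes designed in the review. The function names are illustrative assumptions, and the stage equations are solved in closed form only because the test problem is linear and scalar.

```python
import math

GAMMA = 1.0 - math.sqrt(2.0) / 2.0  # diagonal coefficient a_11 = a_22


def sdirk2_step(lam, y, h):
    """One step of the two-stage SDIRK scheme for y' = lam * y.

    Each stage is implicit only in its own slope k_i (lower-triangular
    Butcher matrix), so the stages are solved one after the other.
    """
    denom = 1.0 - h * GAMMA * lam
    k1 = lam * y / denom                                   # stage 1
    k2 = lam * (y + h * (1.0 - GAMMA) * k1) / denom        # stage 2
    return y + h * ((1.0 - GAMMA) * k1 + GAMMA * k2)


def integrate(lam, y0, t_end, n_steps):
    """Advance y' = lam * y from t = 0 to t_end with n_steps equal steps."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = sdirk2_step(lam, y, h)
    return y


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for n in (10, 20, 40):
        print(n, abs(integrate(-1.0, 1.0, 1.0, n) - exact))
    # A very stiff mode is damped strongly in a single large step.
    print(abs(sdirk2_step(-1.0e6, 1.0, 0.1)))
```

Halving the step size should reduce the error by roughly a factor of four, consistent with second order, while the stiff case shows the damping that L-stability provides.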
A Partitioning Algorithm for Block-Diagonal Matrices With Overlap

DOE Office of Scientific and Technical Information (OSTI.GOV)

Guy Antoine Atenekeng Kahou; Laura Grigori; Masha Sosonkina

2008-02-02

We present a graph partitioning algorithm that aims at partitioning a sparse matrix into a block-diagonal form, such that any two consecutive blocks overlap. We denote this form of the matrix as the overlapped block-diagonal matrix. The partitioned matrix is suitable for applying the explicit formulation of Multiplicative Schwarz preconditioner (EFMS) described in [3]. The graph partitioning algorithm partitions the graph of the input matrix into K partitions, such that every partition Ω_i has at most two neighbors Ω_{i-1} and Ω_{i+1}. First, an ordering algorithm, such as the reverse Cuthill-McKee algorithm, that reduces the matrix profile is performed. An initial overlapped block-diagonal partition is obtained from the profile of the matrix. An iterative strategy is then used to further refine the partitioning by allowing nodes to be transferred between neighboring partitions. Experiments are performed on matrices arising from real-world applications to show the feasibility and usefulness of this approach.
Improved L-BFGS diagonal preconditioners for a large-scale 4D-Var inversion system: application to CO2 flux constraints and analysis error calculation

NASA Astrophysics Data System (ADS)

Bousserez, Nicolas; Henze, Daven; Bowman, Kevin; Liu, Junjie; Jones, Dylan; Keller, Martin; Deng, Feng

2013-04-01

This work presents improved analysis error estimates for 4D-Var systems. From operational NWP models to top-down constraints on trace gas emissions, many of today's data assimilation and inversion systems in atmospheric science rely on variational approaches. This success is due to both the mathematical clarity of these formulations and the availability of computationally efficient minimization algorithms. However, unlike Kalman Filter-based algorithms, these methods do not provide an estimate of the analysis or forecast error covariance matrices, these error statistics being propagated only implicitly by the system. From both a practical (cycling assimilation) and scientific perspective, assessing uncertainties in the solution of the variational problem is critical. For large-scale linear systems, deterministic or randomization approaches can be considered based on the equivalence between the inverse Hessian of the cost function and the covariance matrix of analysis error. For perfectly quadratic systems, like incremental 4D-Var, Lanczos/Conjugate-Gradient algorithms have proven to be most efficient in generating low-rank approximations of the Hessian matrix during the minimization. For weakly non-linear systems though, the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS), a quasi-Newton descent algorithm, is usually considered the best method for the minimization. Suitable for large-scale optimization, this method allows one to generate an approximation to the inverse Hessian using the latest m vector/gradient pairs generated during the minimization, m depending upon the available core memory. At each iteration, an initial low-rank approximation to the inverse Hessian has to be provided, which is called preconditioning. The ability of the preconditioner to retain useful information from previous iterations largely determines the efficiency of the algorithm. 
Here we assess the performance of different preconditioners to estimate the inverse Hessian of a large-scale 4D-Var system. The impact of using the diagonal preconditioners proposed by Gilbert and Le Maréchal (1989) instead of the usual Oren-Spedicato scalar will be first presented. We will also introduce new hybrid methods that combine randomization estimates of the analysis error variance with L-BFGS diagonal updates to improve the inverse Hessian approximation. Results from these new algorithms will be evaluated against standard large ensemble Monte-Carlo simulations. The methods explored here are applied to the problem of inferring global atmospheric CO2 fluxes using remote sensing observations, and are intended to be integrated with the future NASA Carbon Monitoring System.
Bayesian block-diagonal variable selection and model averaging

PubMed Central

Papaspiliopoulos, O.; Rossell, D.

2018-01-01

Summary We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501
Inexact adaptive Newton methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bertiger, W.I.; Kelsey, F.J.

1985-02-01

The Inexact Adaptive Newton method (IAN) is a modification of the Adaptive Implicit Method¹ (AIM) with improved Newton convergence. Both methods simplify the Jacobian at each time step by zeroing coefficients in regions where saturations are changing slowly. The methods differ in how the diagonal block terms are treated. On test problems with up to 3,000 cells, IAN consistently saves approximately 30% of the CPU time when compared to the fully implicit method. AIM shows similar savings on some problems, but takes as much CPU time as fully implicit on other test problems due to poor Newton convergence.
Accurate Grid-based Clustering Algorithm with Diagonal Grid Searching and Merging

NASA Astrophysics Data System (ADS)

Liu, Feng; Ye, Chengcheng; Zhu, Erzhou

2017-09-01

Due to the advent of big data, data mining technology has attracted more and more attention. As an important data analysis method, grid clustering is fast but has relatively low accuracy. This paper presents an improved clustering algorithm that combines grid and density parameters. The algorithm first divides the data space into valid meshes and invalid meshes through grid parameters. Secondly, from the starting point located at the first point of the diagonal of the grids, the algorithm takes the direction of “horizontal right, vertical down” to merge the valid meshes. Furthermore, by the boundary grid processing, the invalid grids are searched and merged when the adjacent left, above, and diagonal-direction grids are all valid ones. By doing this, the accuracy of clustering is improved. The experimental results have shown that the proposed algorithm is accurate and relatively fast when compared with some popularly used algorithms.
Implementation of the diagonalization-free algorithm in the self-consistent field procedure within the four-component relativistic scheme.

PubMed

Hrdá, Marcela; Kulich, Tomáš; Repiský, Michal; Noga, Jozef; Malkina, Olga L; Malkin, Vladimir G

2014-09-05

A recently developed Thouless-expansion-based diagonalization-free approach for improving the efficiency of self-consistent field (SCF) methods (Noga and Šimunek, J. Chem. Theory Comput. 2010, 6, 2706) has been adapted to the four-component relativistic scheme and implemented within the program package ReSpect. In addition to the implementation, the method has been thoroughly analyzed, particularly with respect to cases for which it is difficult or computationally expensive to find a good initial guess. Based on this analysis, several modifications of the original algorithm, refining its stability and efficiency, are proposed. To demonstrate the robustness and efficiency of the improved algorithm, we present the results of four-component diagonalization-free SCF calculations on several heavy-metal complexes, the largest of which contains more than 80 atoms (about 6000 4-spinor basis functions). The diagonalization-free procedure is about twice as fast as the corresponding diagonalization. Copyright © 2014 Wiley Periodicals, Inc.
Recovery Discontinuous Galerkin Jacobian-Free Newton-Krylov Method for All-Speed Flows

DOE Office of Scientific and Technical Information (OSTI.GOV)

HyeongKae Park; Robert Nourgaliev; Vincent Mousseau

2008-07-01

A novel numerical algorithm (rDG-JFNK) for all-speed fluid flows with heat conduction and viscosity is introduced. The rDG-JFNK combines the Discontinuous Galerkin spatial discretization with the implicit Runge-Kutta time integration under the Jacobian-free Newton-Krylov framework. We solve fully-compressible Navier-Stokes equations without operator-splitting of hyperbolic, diffusion and reaction terms, which enables fully-coupled high-order temporal discretization. The stability constraint is removed due to the L-stable Explicit, Singly Diagonal Implicit Runge-Kutta (ESDIRK) scheme. The governing equations are solved in the conservative form, which allows one to accurately compute shock dynamics, as well as low-speed flows. For spatial discretization, we develop a “recovery” family of DG, exhibiting nearly-spectral accuracy. To precondition the Krylov-based linear solver (GMRES), we developed an “Operator-Split”-(OS) Physics Based Preconditioner (PBP), in which we transform/simplify the fully-coupled system to a sequence of segregated scalar problems, each of which can be solved efficiently with the Multigrid method. Each scalar problem is designed to target/cluster eigenvalues of the Jacobian matrix associated with a specific physics.
Numerical solution of 3D Navier-Stokes equations with upwind implicit schemes

NASA Technical Reports Server (NTRS)

Marx, Yves P.

1990-01-01

An upwind MUSCL type implicit scheme for the three-dimensional Navier-Stokes equations is presented. Comparisons between different approximate Riemann solvers (Roe and Osher) are performed and the influence of the reconstruction schemes on the accuracy of the solution as well as on the convergence of the method is studied. A new limiter is introduced in order to remove the problems usually associated with non-linear upwind schemes. The implementation of a diagonal upwind implicit operator for the three-dimensional Navier-Stokes equations is also discussed. Finally the turbulence modeling is assessed. Good predictions of separated flows are demonstrated if a non-equilibrium turbulence model is used.
Applications and accuracy of the parallel diagonal dominant algorithm

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1993-01-01

The Parallel Diagonal Dominant (PDD) algorithm is a highly efficient, ideally scalable tridiagonal solver. In this paper, a detailed study of the PDD algorithm is given. First the PDD algorithm is introduced. Then the algorithm is extended to solve periodic tridiagonal systems. A variant, the reduced PDD algorithm, is also proposed. Accuracy analysis is provided for a class of tridiagonal systems, the symmetric, and anti-symmetric Toeplitz tridiagonal systems. Implementation results show that the analysis gives a good bound on the relative error, and the algorithm is a good candidate for the emerging massively parallel machines.
Research on numerical algorithms for large space structures

NASA Technical Reports Server (NTRS)

Denman, E. D.

1981-01-01

Numerical algorithms for analysis and design of large space structures are investigated. The sign algorithm and its application to decoupling of differential equations are presented. The generalized sign algorithm is given and its application to several problems discussed. The Laplace transforms of matrix functions and the diagonalization procedure for a finite element equation are discussed. The diagonalization of matrix polynomials is considered. The quadrature method and Laplace transforms are discussed and the identification of linear systems by the quadrature method is investigated.
A third-order implicit discontinuous Galerkin method based on a Hermite WENO reconstruction for time-accurate solution of the compressible Navier-Stokes equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xia, Yidong; Liu, Xiaodong; Luo, Hong

2015-06-01

Here, a space and time third-order discontinuous Galerkin method based on a Hermite weighted essentially non-oscillatory reconstruction is presented for the unsteady compressible Euler and Navier–Stokes equations. At each time step, a lower-upper symmetric Gauss–Seidel preconditioned generalized minimal residual solver is used to solve the systems of linear equations arising from an explicit first stage, single diagonal coefficient, diagonally implicit Runge–Kutta time integration scheme. The performance of the developed method is assessed through a variety of unsteady flow problems. Numerical results indicate that this method is able to deliver the designed third-order accuracy of convergence in both space and time, while requiring remarkably less storage than the standard third-order discontinuous Galerkin methods, and less computing time than the lower-order discontinuous Galerkin methods to achieve the same level of temporal accuracy for computing unsteady flow problems.

Development of Implicit Methods in CFD NASA Ames Research Center 1970's - 1980's

NASA Technical Reports Server (NTRS)

Pulliam, Thomas H.

2010-01-01

The focus here is on the early development (mid 1970's-1980's) at NASA Ames Research Center of implicit methods in Computational Fluid Dynamics (CFD). A class of implicit finite difference schemes of the Beam and Warming approximate factorization type will be addressed. The emphasis will be on the Euler equations. A review of material pertinent to the solution of the Euler equations within the framework of implicit methods will be presented. The eigensystem of the equations will be used extensively in developing a framework for various methods applied to the Euler equations. The development and analysis of various aspects of this class of schemes will be given along with the motivations behind many of the choices. Various acceleration and efficiency modifications such as matrix reduction, diagonalization and flux split schemes will be presented.
On Richardson extrapolation for low-dissipation low-dispersion diagonally implicit Runge-Kutta schemes

NASA Astrophysics Data System (ADS)

Havasi, Ágnes; Kazemi, Ehsan

2018-04-01

In the modeling of wave propagation phenomena it is necessary to use time integration methods which are not only sufficiently accurate, but also properly describe the amplitude and phase of the propagating waves. It is not clear if amending the developed schemes by extrapolation methods to obtain a high order of accuracy preserves the qualitative properties of these schemes in the perspective of dissipation, dispersion and stability analysis. It is illustrated that the combination of various optimized schemes with Richardson extrapolation is not optimal for minimal dissipation and dispersion errors. Optimized third-order and fourth-order methods are obtained, and it is shown that the proposed methods combined with Richardson extrapolation result in fourth and fifth orders of accuracy correspondingly, while preserving optimality and stability. The numerical applications include the linear wave equation, a stiff system of reaction-diffusion equations and the nonlinear Euler equations with oscillatory initial conditions. It is demonstrated that the extrapolated third-order scheme outperforms the recently developed fourth-order diagonally implicit Runge-Kutta scheme in terms of accuracy and stability.
Parallel algorithms for computation of the manipulator inertia matrix

NASA Technical Reports Server (NTRS)

Amin-Javaheri, Masoud; Orin, David E.

1989-01-01

The development of an O(log₂ N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations which are required to compute the diagonal elements of the matrix. It results in O(log₂ N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying size, and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log₂ N) algorithm is presented in both equation and graphic forms which clearly show the parallelism inherent in the algorithm.
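Recursive doubling, as named in this abstract, can be sketched for a generic scalar first-order linear recurrence x_i = a_i·x_{i-1} + b_i. The sketch assumes scalar coefficients and simulates the parallel levels sequentially; the function names are illustrative, and the paper's recurrences involve rigid-body quantities rather than scalars.

```python
def recursive_doubling(a, b, x0):
    """Solve x_i = a[i] * x_{i-1} + b[i] (with x_{-1} = x0) by recursive doubling.

    Each step i is the affine map x -> a[i] * x + b[i]. Composing maps is
    associative, so prefix compositions can be formed in ceil(log2(N)) levels;
    within a level, all updates are independent and could run in parallel.
    Returns the list of x_i and the number of levels used.
    """
    pairs = list(zip(a, b))
    n = len(pairs)
    stride, levels = 1, 0
    while stride < n:
        updated = list(pairs)
        for i in range(stride, n):           # independent updates at this level
            a_lo, b_lo = pairs[i - stride]   # composite of the earlier steps
            a_hi, b_hi = pairs[i]            # composite of the later steps
            updated[i] = (a_hi * a_lo, a_hi * b_lo + b_hi)
        pairs = updated
        stride *= 2
        levels += 1
    return [A * x0 + B for A, B in pairs], levels


def serial_recurrence(a, b, x0):
    """Reference O(N) serial evaluation of the same recurrence."""
    out, x = [], x0
    for ai, bi in zip(a, b):
        x = ai * x + bi
        out.append(x)
    return out


if __name__ == "__main__":
    a = [2.0, 3.0, 0.5, 1.0, -1.0, 2.0, 1.0, 0.5]
    b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    xs, levels = recursive_doubling(a, b, 1.0)
    print(levels, xs)
    print(serial_recurrence(a, b, 1.0))
```

The trade-off is more total arithmetic than the serial loop in exchange for a dependency depth that grows only logarithmically with N.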
Efficient sparse matrix multiplication scheme for the CYBER 203

NASA Technical Reports Server (NTRS)

Lambiotte, J. J., Jr.

1984-01-01

This work has been directed toward the development of an efficient algorithm for performing this computation on the CYBER-203. The desire to provide software which gives the user the choice between the often conflicting goals of minimizing central processing (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of three types of storage is selected for each diagonal. For each storage type, an initialization sub-routine estimates the CPU and storage requirements based upon results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the resources. The three storage types employed were chosen to be efficient on the CYBER-203 for diagonals which are sparse, moderately sparse, or dense; however, for many densities, no diagonal type is most efficient with respect to both resource requirements. The user-supplied weights dictate the choice.
A fully redundant double difference algorithm for obtaining minimum variance estimates from GPS observations

NASA Technical Reports Server (NTRS)

Melbourne, William G.

1986-01-01

In double differencing a regression system obtained from concurrent Global Positioning System (GPS) observation sequences, one either undersamples the system to avoid introducing colored measurement statistics, or one fully samples the system incurring the resulting non-diagonal covariance matrix for the differenced measurement errors. A suboptimal estimation result will be obtained in the undersampling case and will also be obtained in the fully sampled case unless the color noise statistics are taken into account. The latter approach requires a least squares weighting matrix derived from inversion of a non-diagonal covariance matrix for the differenced measurement errors instead of inversion of the customary diagonal one associated with white noise processes. Presented is the so-called fully redundant double differencing algorithm for generating a weighted double differenced regression system that yields equivalent estimation results, but features for certain cases a diagonal weighting matrix even though the differenced measurement error statistics are highly colored.
Development of computational methods for heavy lift launch vehicles

NASA Technical Reports Server (NTRS)

Yoon, Seokkwan; Ryan, James S.

1993-01-01

The research effort has been focused on the development of an advanced flow solver for complex viscous turbulent flows with shock waves. The three-dimensional Euler and full/thin-layer Reynolds-averaged Navier-Stokes equations for compressible flows are solved on structured hexahedral grids. The Baldwin-Lomax algebraic turbulence model is used for closure. The space discretization is based on a cell-centered finite-volume method augmented by a variety of numerical dissipation models with optional total variation diminishing limiters. The governing equations are integrated in time by an implicit method based on lower-upper factorization and symmetric Gauss-Seidel relaxation. The algorithm is vectorized on diagonal planes of sweep using two-dimensional indices in three dimensions. A new computer program named CENS3D has been developed for viscous turbulent flows with discontinuities. Details of the code are described in Appendix A and Appendix B. With the developments of the numerical algorithm and dissipation model, the simulation of three-dimensional viscous compressible flows has become more efficient and accurate. The results of the research are expected to yield a direct impact on the design process of future liquid fueled launch systems.
Using Volunteer Computing to Study Some Features of Diagonal Latin Squares

NASA Astrophysics Data System (ADS)

Vatutin, Eduard; Zaikin, Oleg; Kochemazov, Stepan; Valyaev, Sergey

2017-12-01

This study concerns several features of diagonal Latin squares (DLSs) of small order. The authors suggest an algorithm for computing minimal and maximal numbers of transversals of DLSs. According to this algorithm, all DLSs of a particular order are generated, and for each square all its transversals and diagonal transversals are constructed. The algorithm was implemented and applied to DLSs of order at most 7 on a personal computer. The experiment for order 8 was performed in the volunteer computing project Gerasim@home. In addition, the problem of finding pairs of orthogonal DLSs of order 10 was considered and reduced to the Boolean satisfiability problem. The obtained problem turned out to be very hard, therefore it was decomposed into a family of subproblems. In order to solve the problem, the volunteer computing project SAT@home was used. As a result, several dozen pairs of the described kind were found.
Singular value decomposition utilizing parallel algorithms on graphical processors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kotas, Charlotte W; Barhen, Jacob

2011-01-01

One of the current challenges in underwater acoustic array signal processing is the detection of quiet targets in the presence of noise. In order to enable robust detection, one of the key processing steps requires data and replica whitening. This, in turn, involves the eigen-decomposition of the sample spectral matrix, $C_x = \frac{1}{K}\sum_{k=1}^{K} X(k) X^H(k)$, where X(k) denotes a single frequency snapshot with an element for each element of the array. By employing the singular value decomposition (SVD) method, the eigenvectors and eigenvalues can be determined directly from the data without computing the sample covariance matrix, reducing the computational requirements for a given level of accuracy (van Trees, Optimum Array Processing). (Recall that the SVD of a complex matrix A involves determining V, Σ, and U such that $A = U \Sigma V^H$, where U and V are orthonormal and Σ is a positive, real, diagonal matrix containing the singular values of A. U and V are the eigenvectors of $A A^H$ and $A^H A$, respectively, while the singular values are the square roots of the eigenvalues of $A A^H$.) Because it is desirable to be able to compute these quantities in real time, an efficient technique for computing the SVD is vital. In addition, emerging multicore processors like graphical processing units (GPUs) are bringing parallel processing capabilities to an ever increasing number of users. Since the computational tasks involved in array signal processing are well suited for parallelization, it is expected that these computations will be implemented using GPUs as soon as users have the necessary computational tools available to them. Thus, it is important to have an SVD algorithm that is suitable for these processors. This work explores the effectiveness of two different parallel SVD implementations on an NVIDIA Tesla C2050 GPU (14 multiprocessors, 32 cores per multiprocessor, 1.15 GHz clock speed).
The first algorithm is based on a two-step algorithm which bidiagonalizes the matrix using Householder transformations, and then diagonalizes the intermediate bidiagonal matrix through implicit QR shifts. This is similar to that implemented for real matrices by Lahabar and Narayanan ("Singular Value Decomposition on GPU using CUDA", IEEE International Parallel Distributed Processing Symposium 2009). The implementation is done in a hybrid manner, with the bidiagonalization stage done using the GPU while the diagonalization stage is done using the CPU, with the GPU used to update the U and V matrices. The second algorithm is based on a one-sided Jacobi scheme utilizing a sequence of pair-wise column orthogonalizations such that A is replaced by AV until the resulting matrix is sufficiently orthogonal (that is, equal to UΣ). V is obtained from the sequence of orthogonalizations, while Σ can be found from the square root of the diagonal elements of $A^H A$ and, once Σ is known, U can be found from column scaling the resulting matrix. These implementations utilize CUDA Fortran and NVIDIA's CUBLAS library. The primary goal of this study is to quantify the comparative performance of these two techniques against each other and against other standard implementations (for example, MATLAB). Considering that there is significant overhead associated with transferring data to the GPU and with synchronization between the GPU and the host CPU, it is also important to understand when it is worthwhile to use the GPU in terms of the matrix size and number of concurrent SVDs to be calculated.
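The one-sided Jacobi scheme described here can be sketched in a few lines. The version below is a serial, real-valued illustration in plain Python, assuming a small dense matrix stored as nested lists; the abstract concerns complex matrices on a GPU, and the function name and tolerances are illustrative assumptions rather than the authors' implementation.

```python
import math


def one_sided_jacobi_svd(A, tol=1e-12, max_sweeps=50):
    """One-sided Jacobi SVD of a real m x n matrix A.

    Columns of W = A V are rotated pairwise until mutually orthogonal.
    Singular values are then the column norms of W and U is W with
    normalized columns, so that A = U * diag(sigma) * V^T.
    """
    m, n = len(A), len(A[0])
    W = [list(row) for row in A]
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = sum(W[i][p] ** 2 for i in range(m))
                beta = sum(W[i][q] ** 2 for i in range(m))
                gamma = sum(W[i][p] * W[i][q] for i in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue                  # columns already orthogonal
                rotated = True
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for M in (W, V):              # apply the same rotation to W and V
                    for row in M:
                        rp, rq = row[p], row[q]
                        row[p] = c * rp - s * rq
                        row[q] = s * rp + c * rq
        if not rotated:
            break
    sigma = [math.sqrt(sum(W[i][j] ** 2 for i in range(m))) for j in range(n)]
    U = [[W[i][j] / sigma[j] if sigma[j] > 0.0 else 0.0 for j in range(n)]
         for i in range(m)]
    return U, sigma, V


if __name__ == "__main__":
    U, sigma, V = one_sided_jacobi_svd([[3.0, 0.0], [4.0, 5.0]])
    print(sorted(sigma))
```

Because each rotation touches only two columns, disjoint column pairs can be processed concurrently, which is what makes this family of methods attractive on parallel hardware. The singular values are returned unsorted.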
Small-Noise Analysis and Symmetrization of Implicit Monte Carlo Samplers

DOE PAGES

Goodman, Jonathan; Lin, Kevin K.; Morzfeld, Matthias

2015-07-06

Implicit samplers are algorithms for producing independent, weighted samples from multivariate probability distributions. These are often applied in Bayesian data assimilation algorithms. We use Laplace asymptotic expansions to analyze two implicit samplers in the small noise regime. Our analysis suggests a symmetrization of the algorithms that leads to improved implicit sampling schemes at a relatively small additional cost. Here, computational experiments confirm the theory and show that symmetrization is effective for small noise sampling problems.
Reduced order feedback control equations for linear time and frequency domain analysis

NASA Technical Reports Server (NTRS)

Frisch, H. P.

1981-01-01

An algorithm was developed which can be used to obtain the equations. In a more general context, the algorithm computes a real nonsingular similarity transformation matrix which reduces a real nonsymmetric matrix to block diagonal form, each block of which is a real quasi upper triangular matrix. The algorithm works with both defective and derogatory matrices and when and if it fails, the resultant output can be used as a guide for the reformulation of the mathematical equations that lead up to the ill conditioned matrix which could not be block diagonalized.
Implicit solvers for unstructured meshes

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.; Mavriplis, Dimitri J.

1991-01-01

Implicit methods for unstructured mesh computations are developed and tested. The approximate system which arises from the Newton-linearization of the nonlinear evolution operator is solved by using the preconditioned generalized minimum residual technique. Three different preconditioners are investigated: the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over-relaxation (SSOR). The preconditioners have been optimized to have good vectorization properties. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also investigated. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
Computing the Density Matrix in Electronic Structure Theory on Graphics Processing Units.

PubMed

Cawkwell, M J; Sanville, E J; Mniszewski, S M; Niklasson, Anders M N

2012-11-13

The self-consistent solution of a Schrödinger-like equation for the density matrix is a critical and computationally demanding step in quantum-based models of interatomic bonding. This step was tackled historically via the diagonalization of the Hamiltonian. We have investigated the performance and accuracy of the second-order spectral projection (SP2) algorithm for the computation of the density matrix via a recursive expansion of the Fermi operator in a series of generalized matrix-matrix multiplications. We demonstrate that owing to its simplicity, the SP2 algorithm [Niklasson, A. M. N. Phys. Rev. B 2002, 66, 155115] is exceptionally well suited to implementation on graphics processing units (GPUs). The performance in double and single precision arithmetic of hybrid GPU/central processing unit (CPU) and full GPU implementations of the SP2 algorithm exceeds that of a CPU-only implementation of the SP2 algorithm and of traditional matrix diagonalization when the dimensions of the matrices exceed about 2000 × 2000. Padding schemes for arrays allocated in the GPU memory that optimize the performance of the CUBLAS implementations of the level 3 BLAS DGEMM and SGEMM subroutines for generalized matrix-matrix multiplications are described in detail. The analysis of the relative performance of the hybrid CPU/GPU and full GPU implementations indicates that the transfer of arrays between the GPU and CPU constitutes only a small fraction of the total computation time. The errors measured in the self-consistent density matrices computed using the SP2 algorithm are generally smaller than those measured in matrices computed via diagonalization. Furthermore, the errors in the density matrices computed using the SP2 algorithm do not exhibit any dependence on system size, whereas the errors increase linearly with the number of orbitals when diagonalization is employed.
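The recursive expansion described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' GPU code. The function names, the Gershgorin estimate of the spectral bounds, the rule for choosing between the two projections (take whichever brings the trace closer to the occupation number) and the convergence test are all assumptions made for illustration. The result is the projector onto the `n_occ` lowest eigenstates, without any spin factor.

```python
def matmul(a, b):
    """Dense product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def sp2_density_matrix(h, n_occ, tol=1e-12, max_iter=100):
    """Approximate the projector onto the n_occ lowest eigenstates of the
    symmetric matrix h using only matrix-matrix products (SP2-style).

    Assumes h is not a multiple of the identity (so e_max > e_min) and that
    there is a gap between occupied and unoccupied eigenvalues.
    """
    n = len(h)
    # Gershgorin circles give cheap bounds on the spectrum (illustrative choice).
    radii = [sum(abs(h[i][j]) for j in range(n) if j != i) for i in range(n)]
    e_min = min(h[i][i] - radii[i] for i in range(n))
    e_max = max(h[i][i] + radii[i] for i in range(n))
    # Map the spectrum into [0, 1], reversed so low energies land near 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / (e_max - e_min)
          for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        x2 = matmul(x, x)
        tr_x, tr_x2 = trace(x), trace(x2)
        if abs(tr_x - tr_x2) < tol:  # trace(X - X^2) ~ 0: X is idempotent
            break
        # Second-order projections: X^2 lowers the trace, 2X - X^2 raises it.
        if abs(tr_x2 - n_occ) < abs(2.0 * tr_x - tr_x2 - n_occ):
            x = x2
        else:
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)] for i in range(n)]
    return x


# Hypothetical 3-site chain with one occupied state.
h_chain = [[0.0, -1.0, 0.0], [-1.0, 0.0, -1.0], [0.0, -1.0, 0.0]]
p_chain = sp2_density_matrix(h_chain, 1)
```

Every step is a matrix-matrix multiplication plus a trace, which is why the method maps naturally onto level 3 BLAS routines such as DGEMM.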
Computational manipulation of a radiative MHD flow with Hall current and chemical reaction in the presence of rotating fluid

NASA Astrophysics Data System (ADS)

Alias Suba, Subbu; Muthucumaraswamy, R.

2018-04-01

A numerical analysis of transient radiative MHD (magnetohydrodynamic) natural convective flow of a viscous, incompressible, electrically conducting and rotating fluid along a semi-infinite isothermal vertical plate is carried out, taking into consideration Hall current, rotation and first-order chemical reaction. The coupled non-linear partial differential equations are expressed in difference form using an implicit finite difference scheme. The difference equations are then reduced to a system of linear algebraic equations with a tri-diagonal structure, which is solved by the Thomas algorithm. The primary and secondary velocity profiles, temperature profile, concentration profile, skin friction, Nusselt number and Sherwood number are depicted graphically for a range of values of rotation parameter, Hall parameter, magnetic parameter, chemical reaction parameter, radiation parameter, Prandtl number and Schmidt number. It is found that the rate of heat transfer and the rate of mass transfer decrease with increasing time, but they increase with increasing values of the radiation parameter and the Schmidt number, respectively.
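For readers unfamiliar with the Thomas algorithm mentioned in this abstract, the following is a minimal sketch of the standard forward-elimination and back-substitution procedure for a tridiagonal system. The function name and the array layout are illustrative assumptions; the sketch does no pivoting and therefore assumes a matrix (for example, a diagonally dominant one) for which in-order elimination is safe.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system A x = rhs.

    lower[i] is A[i][i-1] (lower[0] is unused), diag[i] is A[i][i],
    upper[i] is A[i][i+1] (upper[-1] is unused). All lists have length n.
    """
    n = len(diag)
    c_prime = [0.0] * n  # modified super-diagonal
    d_prime = [0.0] * n  # modified right-hand side
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Example: the 3x3 matrix with 2 on the diagonal and -1 off it.
solution = thomas_solve([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                        [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])
# solution is approximately [1.0, 2.0, 3.0]
```

The cost is linear in the number of unknowns, which is what makes implicit finite difference schemes with tridiagonal structure inexpensive per time step.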
An efficient sparse matrix multiplication scheme for the CYBER 205 computer

NASA Technical Reports Server (NTRS)

Lambiotte, Jules J., Jr.

1988-01-01

This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
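The idea of organizing a matrix-vector product by diagonals can be sketched as follows. This is only the simplest case, a fully stored diagonal, and does not reproduce the four storage types or the cost model from the paper. The function name and the dictionary layout (offset mapped to the entries of that diagonal) are assumptions chosen for illustration.

```python
def dia_matvec(diagonals, x):
    """Compute y = A x for a square matrix A stored by diagonals.

    diagonals maps an offset k to the entries A[i][i + k], listed in order
    of increasing row index i (k = 0 is the main diagonal, k > 0 lies above
    it, k < 0 below it). Diagonal k must hold len(x) - abs(k) entries.
    """
    n = len(x)
    y = [0.0] * n
    for k, values in diagonals.items():
        row_start = max(0, -k)  # first row for which column i + k is valid
        for t, a in enumerate(values):
            i = row_start + t
            y[i] += a * x[i + k]
    return y


# Example: tridiagonal matrix with 2 on the diagonal and -1 off it.
a_dia = {0: [2.0, 2.0, 2.0], 1: [-1.0, -1.0], -1: [-1.0, -1.0]}
y = dia_matvec(a_dia, [1.0, 2.0, 3.0])
```

Each diagonal contributes one long, regular loop over contiguous data, which is the property that suits vector hardware.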
Performance Comparison of a Set of Periodic and Non-Periodic Tridiagonal Solvers on SP2 and Paragon Parallel Computers

NASA Technical Reports Server (NTRS)

Sun, Xian-He; Moitra, Stuti

1996-01-01

Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely, the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on Intel Paragon and IBM SP2 machines. Measured results are reported in terms of execution time and speedup. Analytical studies are conducted for different communication topologies and for different tridiagonal systems. The measured results match the analytical results closely. In addition to addressing implementation issues, performance considerations such as problem sizes and models of speedup are also discussed.
The density matrix renormalization group algorithm on kilo-processor architectures: Implementation and trade-offs

NASA Astrophysics Data System (ADS)

Nemes, Csaba; Barcza, Gergely; Nagy, Zoltán; Legeza, Örs; Szolgay, Péter

2014-06-01

In the numerical analysis of strongly correlated quantum lattice models one of the leading algorithms developed to balance the size of the effective Hilbert space and the accuracy of the simulation is the density matrix renormalization group (DMRG) algorithm, in which the run-time is dominated by the iterative diagonalization of the Hamilton operator. As the most time-dominant step of the diagonalization can be expressed as a list of dense matrix operations, the DMRG is an appealing candidate to fully utilize the computing power residing in novel kilo-processor architectures. In the paper a smart hybrid CPU-GPU implementation is presented, which exploits the power of both CPU and GPU and tolerates problems exceeding the GPU memory size. Furthermore, a new CUDA kernel has been designed for asymmetric matrix-vector multiplication to accelerate the rest of the diagonalization. Besides the evaluation of the GPU implementation, the practical limits of an FPGA implementation are also discussed.
About the coupling of turbulence closure models with averaged Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Vandromme, D.; Ha Minh, H.

1986-01-01

The MacCormack implicit predictor-corrector model (1981) for numerical solution of the coupled Navier-Stokes equations for turbulent flows is extended to nonconservative multiequation turbulence models, as well as the inclusion of second-order Reynolds stress turbulence closure. A scalar effective pressure turbulent contribution to the pressure field is defined to approximate the effects of the Reynolds stress in strongly sheared flows. The Jacobian matrices of the transport equations are diagonalized to reduce the required computer memory and run time. Techniques are defined for including turbulence in the diagonalization. Application of the method is demonstrated with solutions generated for transonic nozzle flow and for the interaction between a supersonic flat plate boundary layer and a 12 deg compression-expansion ramp.
Newton Algorithms for Analytic Rotation: An Implicit Function Approach

ERIC Educational Resources Information Center

Boik, Robert J.

2008-01-01

In this paper implicit function-based parameterizations for orthogonal and oblique rotation matrices are proposed. The parameterizations are used to construct Newton algorithms for minimizing differentiable rotation criteria applied to "m" factors and "p" variables. The speed of the new algorithms is compared to that of existing algorithms and to…
Teaching the "Diagonalization Concept" in Linear Algebra with Technology: A Case Study at Galatasaray University

ERIC Educational Resources Information Center

Yildiz Ulus, Aysegul

2013-01-01

This paper examines experimental and algorithmic contributions of advanced calculators (graphing and computer algebra system, CAS) in teaching the concept of "diagonalization," one of the key topics in Linear Algebra courses taught at the undergraduate level. Specifically, the proposed hypothesis of this study is to assess the effective…
Asymptotic integration algorithms for nonhomogeneous, nonlinear, first order, ordinary differential equations

NASA Technical Reports Server (NTRS)

Walker, K. P.; Freed, A. D.

1991-01-01

New methods for integrating systems of stiff, nonlinear, first order, ordinary differential equations are developed by casting the differential equations into integral form. Nonlinear recursive relations are obtained that allow the solution to a system of equations at time t plus delta t to be obtained in terms of the solution at time t in explicit and implicit forms. Examples of accuracy obtained with the new technique are given by considering systems of nonlinear, first order equations which arise in the study of unified models of viscoplastic behaviors, the spread of the AIDS virus, and predator-prey populations. In general, the new implicit algorithm is unconditionally stable, and has a Jacobian of smaller dimension than that which is acquired by current implicit methods, such as the Euler backward difference algorithm; yet, it gives superior accuracy. The asymptotic explicit and implicit algorithms are suitable for solutions that are of the growing and decaying exponential kinds, respectively, whilst the implicit Euler-Maclaurin algorithm is superior when the solution oscillates, i.e., when there are regions in which both growing and decaying exponential solutions exist.

A multi-dimensional nonlinearly implicit, electromagnetic Vlasov-Darwin particle-in-cell (PIC) algorithm

NASA Astrophysics Data System (ADS)

Chen, Guangye; Chacón, Luis; CoCoMans Team

2014-10-01

For decades, the Vlasov-Darwin model has been recognized to be attractive for PIC simulations (to avoid radiative noise issues) in non-radiative electromagnetic regimes. However, the Darwin model results in elliptic field equations that render explicit time integration unconditionally unstable. Improving on linearly implicit schemes, fully implicit PIC algorithms for both electrostatic and electromagnetic regimes, with exact discrete energy and charge conservation properties, have been recently developed in 1D. This study builds on these recent algorithms to develop an implicit, orbit-averaged, time-space-centered finite difference scheme for the particle-field equations in multiple dimensions. The algorithm conserves energy, charge, and canonical-momentum exactly, even with grid packing. A simple fluid preconditioner allows efficient use of large timesteps, $O(\sqrt{m_i/m_e}\,c/v_{eT})$ larger than the explicit CFL. We demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 2D3V.
Tensor-product preconditioners for higher-order space-time discontinuous Galerkin methods

NASA Astrophysics Data System (ADS)

Diosady, Laslo T.; Murman, Scott M.

2017-02-01

A space-time discontinuous-Galerkin spectral-element discretization is presented for direct numerical simulation of the compressible Navier-Stokes equations. An efficient solution technique based on a matrix-free Newton-Krylov method is developed in order to overcome the stiffness associated with high solution order. The use of tensor-product basis functions is key to maintaining efficiency at high-order. Efficient preconditioning methods are presented which can take advantage of the tensor-product formulation. A diagonalized Alternating-Direction-Implicit (ADI) scheme is extended to the space-time discontinuous Galerkin discretization. A new preconditioner for the compressible Euler/Navier-Stokes equations based on the fast-diagonalization method is also presented. Numerical results demonstrate the effectiveness of these preconditioners for the direct numerical simulation of subsonic turbulent flows.
Tensor-Product Preconditioners for Higher-Order Space-Time Discontinuous Galerkin Methods

NASA Technical Reports Server (NTRS)

Diosady, Laslo T.; Murman, Scott M.

2016-01-01

A space-time discontinuous-Galerkin spectral-element discretization is presented for direct numerical simulation of the compressible Navier-Stokes equations. An efficient solution technique based on a matrix-free Newton-Krylov method is developed in order to overcome the stiffness associated with high solution order. The use of tensor-product basis functions is key to maintaining efficiency at high order. Efficient preconditioning methods are presented which can take advantage of the tensor-product formulation. A diagonalized Alternating-Direction-Implicit (ADI) scheme is extended to the space-time discontinuous Galerkin discretization. A new preconditioner for the compressible Euler/Navier-Stokes equations based on the fast-diagonalization method is also presented. Numerical results demonstrate the effectiveness of these preconditioners for the direct numerical simulation of subsonic turbulent flows.
3D Gaussian Beam Modeling

DTIC Science & Technology

2011-09-01

optimized building blocks such as a parallelized tri-diagonal linear solver (used in the “implicit finite differences” and split-step Pade PE models ... and Ding Lee. “A finite-difference treatment of interface conditions for the parabolic wave equation: The horizontal interface.” The Journal of the ... Acoustical Society of America, 71(4):855, 1982. 3. Ding Lee and Suzanne T. McDaniel. “A finite-difference treatment of interface conditions for
Tensor-product preconditioners for a space-time discontinuous Galerkin method

NASA Astrophysics Data System (ADS)

Diosady, Laslo T.; Murman, Scott M.

2014-10-01

A space-time discontinuous Galerkin spectral element discretization is presented for direct numerical simulation of the compressible Navier-Stokes equations. An efficient solution technique based on a matrix-free Newton-Krylov method is presented. A diagonalized alternating direction implicit preconditioner is extended to a space-time formulation using entropy variables. The effectiveness of this technique is demonstrated for the direct numerical simulation of turbulent flow in a channel.
Application of the Hughes-Liu algorithm to the 2-dimensional heat equation

NASA Technical Reports Server (NTRS)

Malkus, D. S.; Reichmann, P. I.; Haftka, R. T.

1982-01-01

An implicit-explicit algorithm for the solution of transient problems in structural dynamics is described. The method involves dividing the finite elements into implicit and explicit groups while automatically satisfying the conditions. This algorithm is applied to the solution of the linear, transient, two-dimensional heat equation subject to an initial condition derived from the solution of a steady-state problem over an L-shaped region made up of a good conductor and an insulating material. Using the IIT/PRIME computer with virtual memory, a FORTRAN computer program code was developed to make accuracy, stability, and cost comparisons among the fully explicit Euler, the Hughes-Liu, and the fully implicit Crank-Nicolson algorithms. The Hughes-Liu claim that the explicit group governs the stability of the entire region while maintaining the unconditional stability of the implicit group is illustrated.
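The stability contrast between the explicit Euler and Crank-Nicolson schemes compared in this abstract can be seen on the scalar test equation y' = -λy, a standard model for a single decaying thermal mode. The sketch below uses the generalized trapezoidal (theta) family, with theta = 0 for explicit Euler, theta = 1/2 for Crank-Nicolson and theta = 1 for backward Euler. The function names are illustrative assumptions, and the Hughes-Liu element partitioning itself is not reproduced.

```python
def amplification_factor(z, theta):
    """Per-step growth factor of the theta method applied to y' = -lam * y.

    z = lam * dt is the dimensionless time step; one step multiplies the
    solution by (1 - (1 - theta) * z) / (1 + theta * z).
    """
    return (1.0 - (1.0 - theta) * z) / (1.0 + theta * z)


def is_stable(z, theta):
    """A step is stable if it does not amplify the solution."""
    return abs(amplification_factor(z, theta)) <= 1.0


# Explicit Euler (theta = 0) is stable only for z <= 2;
# Crank-Nicolson (theta = 0.5) is stable for every z >= 0.
explicit_ok = is_stable(3.0, 0.0)        # False: factor is -2
crank_nicolson_ok = is_stable(3.0, 0.5)  # True: factor is -0.2
```

In a finite element mesh the largest λ comes from the smallest or most conductive elements, so a mixed scheme that treats only those elements implicitly can relax the explicit step limit.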
State-Based Implicit Coordination and Applications

NASA Technical Reports Server (NTRS)

Narkawicz, Anthony J.; Munoz, Cesar A.

2011-01-01

In air traffic management, pairwise coordination is the ability to achieve separation requirements when conflicting aircraft simultaneously maneuver to solve a conflict. Resolution algorithms are implicitly coordinated if they provide coordinated resolution maneuvers to conflicting aircraft when only surveillance data, e.g., position and velocity vectors, is periodically broadcast by the aircraft. This paper proposes an abstract framework for reasoning about state-based implicit coordination. The framework consists of a formalized mathematical development that enables and simplifies the design and verification of implicitly coordinated state-based resolution algorithms. The use of the framework is illustrated with several examples of algorithms and formal proofs of their coordination properties. The work presented here supports the safety case for a distributed self-separation air traffic management concept where different aircraft may use different conflict resolution algorithms and be assured that separation will be maintained.
Implementation of the SU(2) Hamiltonian Symmetry for the DMRG Algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alvarez, Gonzalo

2012-01-01

In the Density Matrix Renormalization Group (DMRG) algorithm (White, 1992, 1993), Hamiltonian symmetries play an important role. Using symmetries, the matrix representation of the Hamiltonian can be blocked. Diagonalizing each matrix block is more efficient than diagonalizing the original matrix. This paper explains how the DMRG++ code (Alvarez, 2009) has been extended to handle the non-local SU(2) symmetry in a model independent way. Improvements in CPU times compared to runs with only local symmetries are discussed for the one-orbital Hubbard model, and for a two-orbital Hubbard model for iron-based superconductors. The computational bottleneck of the algorithm and the use of shared memory parallelization are also addressed.
Block LU factorization

NASA Technical Reports Server (NTRS)

Demmel, James W.; Higham, Nicholas J.; Schreiber, Robert S.

1992-01-01

Many of the currently popular 'block algorithms' are scalar algorithms in which the operations have been grouped and reordered into matrix operations. One genuine block algorithm in practical use is block LU factorization, and this has recently been shown by Demmel and Higham to be unstable in general. It is shown here that block LU factorization is stable if A is block diagonally dominant by columns. Moreover, for a general matrix the level of instability in block LU factorization can be bounded in terms of the condition number kappa(A) and the growth factor for Gaussian elimination without pivoting. A consequence is that block LU factorization is stable for a matrix A that is symmetric positive definite or point diagonally dominant by rows or columns as long as A is well-conditioned.
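For orientation, the textbook 2 × 2 block form of the factorization discussed in this abstract is written out below. The partitioning and the symbol S for the Schur complement are notation assumed here for illustration and are not taken from the report; the formula requires the leading block A_{11} to be nonsingular.

```latex
A =
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
=
\begin{pmatrix} I & 0 \\ A_{21} A_{11}^{-1} & I \end{pmatrix}
\begin{pmatrix} A_{11} & A_{12} \\ 0 & S \end{pmatrix},
\qquad
S = A_{22} - A_{21} A_{11}^{-1} A_{12}.
```

Applying the same step recursively to S gives the general block factorization. The diagonal blocks of the upper factor are left unfactored, which is why the method consists largely of matrix-matrix operations.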
Studies of implicit and explicit solution techniques in transient thermal analysis of structures

NASA Technical Reports Server (NTRS)

Adelman, H. M.; Haftka, R. T.; Robinson, J. C.

1982-01-01

Studies aimed at an increase in the efficiency of calculating transient temperature fields in complex aerospace vehicle structures are reported. The advantages and disadvantages of explicit and implicit algorithms are discussed and a promising set of implicit algorithms with variable time steps, known as GEARIB, is described. Test problems, used for evaluating and comparing various algorithms, are discussed and finite element models of the configurations are described. These problems include a coarse model of the Space Shuttle wing, an insulated frame test article, a metallic panel for a thermal protection system, and detailed models of sections of the Space Shuttle wing. Results generally indicate a preference for implicit over explicit algorithms for transient structural heat transfer problems when the governing equations are stiff (typical of many practical problems such as insulated metal structures). The effects on algorithm performance of different models of an insulated cylinder are demonstrated. The stiffness of the problem is highly sensitive to modeling details and careful modeling can reduce the stiffness of the equations to the extent that explicit methods may become the best choice. Preliminary applications of a mixed implicit-explicit algorithm and operator splitting techniques for speeding up the solution of the algebraic equations are also described.
Multidimensional, fully implicit, exactly conserving electromagnetic particle-in-cell simulations

NASA Astrophysics Data System (ADS)

Chacon, Luis

2015-09-01

We discuss a new, conservative, fully implicit 2D-3V particle-in-cell algorithm for non-radiative, electromagnetic kinetic plasma simulations, based on the Vlasov-Darwin model. Unlike earlier linearly implicit PIC schemes and standard explicit PIC schemes, fully implicit PIC algorithms are unconditionally stable and allow exact discrete energy and charge conservation. This has been demonstrated in 1D electrostatic and electromagnetic contexts. In this study, we build on these recent algorithms to develop an implicit, orbit-averaged, time-space-centered finite difference scheme for the Darwin field and particle orbit equations for multiple species in multiple dimensions. The Vlasov-Darwin model is very attractive for PIC simulations because it avoids radiative noise issues in non-radiative electromagnetic regimes. The algorithm conserves global energy, local charge, and particle canonical-momentum exactly, even with grid packing. The nonlinear iteration is effectively accelerated with a fluid preconditioner, which allows efficient use of large timesteps, $O(\sqrt{m_i/m_e}\,c/v_{eT})$ larger than the explicit CFL. In this presentation, we will introduce the main algorithmic components of the approach, and demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 1D and 2D. Support from the LANL LDRD program and the DOE-SC ASCR office.
Studies of implicit and explicit solution techniques in transient thermal analysis of structures

NASA Astrophysics Data System (ADS)

Adelman, H. M.; Haftka, R. T.; Robinson, J. C.

1982-08-01

Studies aimed at an increase in the efficiency of calculating transient temperature fields in complex aerospace vehicle structures are reported. The advantages and disadvantages of explicit and implicit algorithms are discussed and a promising set of implicit algorithms with variable time steps, known as GEARIB, is described. Test problems, used for evaluating and comparing various algorithms, are discussed and finite element models of the configurations are described. These problems include a coarse model of the Space Shuttle wing, an insulated frame test article, a metallic panel for a thermal protection system, and detailed models of sections of the Space Shuttle wing. Results generally indicate a preference for implicit over explicit algorithms for transient structural heat transfer problems when the governing equations are stiff (typical of many practical problems such as insulated metal structures). The effects on algorithm performance of different models of an insulated cylinder are demonstrated. The stiffness of the problem is highly sensitive to modeling details and careful modeling can reduce the stiffness of the equations to the extent that explicit methods may become the best choice. Preliminary applications of a mixed implicit-explicit algorithm and operator splitting techniques for speeding up the solution of the algebraic equations are also described.
Implicit solvers for unstructured meshes

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.; Mavriplis, Dimitri J.

1991-01-01

Implicit methods were developed and tested for unstructured mesh computations. The approximate system which arises from the Newton linearization of the nonlinear evolution operator is solved by using the preconditioned GMRES (Generalized Minimum Residual) technique. Three different preconditioners were studied, namely, the incomplete LU factorization (ILU), block diagonal factorization, and the symmetric successive over relaxation (SSOR). The preconditioners were optimized to have good vectorization properties. SSOR and ILU were also studied as iterative schemes. The various methods are compared over a wide range of problems. Ordering of the unknowns, which affects the convergence of these sparse matrix iterative methods, is also studied. Results are presented for inviscid and turbulent viscous calculations on single and multielement airfoil configurations using globally and adaptively generated meshes.
Implicit Plasma Kinetic Simulation Using The Jacobian-Free Newton-Krylov Method

NASA Astrophysics Data System (ADS)

Taitano, William; Knoll, Dana; Chacon, Luis

2009-11-01

The use of fully implicit time integration methods in kinetic simulation is still an area of algorithmic research. A brute-force approach to simultaneously including the field equations and the particle distribution function would result in an intractable linear algebra problem. A number of algorithms have been put forward which rely on an extrapolation in time. They can be thought of as linearly implicit methods or one-step Newton methods. However, issues related to time accuracy of these methods still remain. We are pursuing a route to implicit plasma kinetic simulation which eliminates extrapolation, eliminates phase-space from the linear algebra problem, and converges the entire nonlinear system within a time step. We accomplish all this using the Jacobian-Free Newton-Krylov algorithm. The original research along these lines considered particle methods to advance the distribution function [1]. In the current research we are advancing the Vlasov equations on a grid. Results will be presented which highlight algorithmic details for single species electrostatic problems and coupled ion-electron electrostatic problems. [1] H. J. Kim, L. Chacón, G. Lapenta, "Fully implicit particle in cell algorithm," 47th Annual Meeting of the Division of Plasma Physics, Oct. 24-28, 2005, Denver, CO
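The "Jacobian-free" part of the Newton-Krylov approach named in this abstract rests on one idea: a Krylov solver only needs products of the Jacobian with a vector, and these can be approximated by differencing the nonlinear residual. The sketch below shows that building block only. The function names, the fixed perturbation size and the two-equation test residual are assumptions made for illustration and are unrelated to the plasma model.

```python
def jacobian_vector_product(residual, u, v, eps=1e-7):
    """Approximate J(u) v by a forward difference of the residual:

        J(u) v ~ (F(u + eps * v) - F(u)) / eps

    The Jacobian matrix is never formed or stored. The fixed eps is a
    simple illustrative choice; practical codes scale it with u and v.
    """
    f0 = residual(u)
    f1 = residual([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(f1, f0)]


def example_residual(u):
    """Hypothetical two-equation nonlinear residual used only for testing."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0]


# The exact Jacobian at u = (1, 2) is [[2, 1], [1, 4]], so J v = (3, 5)
# for v = (1, 1); the finite difference reproduces this to about 1e-7.
jv = jacobian_vector_product(example_residual, [1.0, 2.0], [1.0, 1.0])
```

Inside a Newton iteration, a Krylov method such as GMRES would call this product repeatedly to solve J δu = -F(u) without ever assembling J.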
Algorithm Development for the Multi-Fluid Plasma Model

DTIC Science & Technology

2011-05-30

392, Sep 1995. [13] L Chacon, DC Barnes, DA Knoll, and GH Miley. An implicit energy-conservative 2D Fokker-Planck algorithm. Journal of Computational ... Physics, 157(2):618–653, 2000. [14] L Chacon, DC Barnes, DA Knoll, and GH Miley. An implicit energy-conservative 2D Fokker-Planck algorithm - II
On the performance of explicit and implicit algorithms for transient thermal analysis

NASA Astrophysics Data System (ADS)

Adelman, H. M.; Haftka, R. T.

1980-09-01

The status of an effort to increase the efficiency of calculating transient temperature fields in complex aerospace vehicle structures is described. The advantages and disadvantages of explicit and implicit algorithms are discussed. A promising set of implicit algorithms, known as the GEAR package, is described. Four test problems, used for evaluating and comparing various algorithms, have been selected and finite element models of the configurations are described. These problems include a space shuttle frame component, an insulated cylinder, a metallic panel for a thermal protection system and a model of the space shuttle orbiter wing. Calculations were carried out using the SPAR finite element program, the MITAS lumped parameter program and a special purpose finite element program incorporating the GEAR algorithms. Results generally indicate a preference for implicit over explicit algorithms for solution of transient structural heat transfer problems when the governing equations are stiff. Careful attention to modeling detail such as avoiding thin or short high-conducting elements can sometimes reduce the stiffness to the extent that explicit methods become advantageous.
Multi-dimensional, fully implicit, exactly conserving electromagnetic particle-in-cell simulations in curvilinear geometry

NASA Astrophysics Data System (ADS)

Chen, Guangye; Chacon, Luis

2015-11-01

We discuss a new, conservative, fully implicit 2D3V Vlasov-Darwin particle-in-cell algorithm in curvilinear geometry for non-radiative, electromagnetic kinetic plasma simulations. Unlike standard explicit PIC schemes, fully implicit PIC algorithms are unconditionally stable and allow exact discrete energy and charge conservation. Here, we extend these algorithms to curvilinear geometry. The algorithm retains its exact conservation properties in curvilinear grids. The nonlinear iteration is effectively accelerated with a fluid preconditioner for weakly to modestly magnetized plasmas, which allows efficient use of large timesteps, $O(\sqrt{m_i/m_e}\,c/v_{eT})$ larger than the explicit CFL. In this presentation, we will introduce the main algorithmic components of the approach, and demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 1D (slow shock) and 2D (island coalescence).
On the Origins of the Linear Free Energy Relationships: Exploring the Nature of the Off-Diagonal Coupling Elements in SN2 Reactions

PubMed Central

Rosta, Edina; Warshel, Arieh

2012-01-01

Understanding the relationship between the adiabatic free energy profiles of chemical reactions and the underlying diabatic states is central to the description of chemical reactivity. The diabatic states form the theoretical basis of Linear Free Energy Relationships (LFERs) and thus play a major role in physical organic chemistry and related fields. However, the theoretical justification for some of the implicit LFER assumptions has not been fully established by quantum mechanical studies. This study follows our earlier works [1,2] and uses the ab initio frozen density functional theory (FDFT) method [3] to evaluate both the diabatic and adiabatic free energy surfaces and to determine the corresponding off-diagonal coupling matrix elements for a series of SN2 reactions. It is found that the off-diagonal coupling matrix elements are almost the same regardless of the nucleophile and the leaving group but change upon changing the central group. It is also found that the off-diagonal elements are basically the same in the gas phase and in solution, even when the solvent is explicitly included in the ab initio calculations. Furthermore, our study establishes that the FDFT diabatic profiles are parabolic to a good approximation, thus providing first-principles support for the origin of LFER. These findings further support the basic approximation of the EVB treatment. PMID:23329895
A parallel algorithm for Hamiltonian matrix construction in electron-molecule collision calculations: MPI-SCATCI

NASA Astrophysics Data System (ADS)

Al-Refaie, Ahmed F.; Tennyson, Jonathan

2017-12-01

Construction and diagonalization of the Hamiltonian matrix is the rate-limiting step in most low-energy electron-molecule collision calculations. Tennyson (1996) implemented a novel algorithm for Hamiltonian construction which took advantage of the structure of the wavefunction in such calculations. This algorithm is re-engineered to make use of modern computer architectures and the use of appropriate diagonalizers is considered. Test calculations demonstrate that significant speed-ups can be gained using multiple CPUs. This opens the way to calculations which consider higher collision energies, larger molecules and/or more target states. The methodology, which is implemented as part of the UK molecular R-matrix codes (UKRMol and UKRMol+), can also be used for studies of bound molecular Rydberg states, photoionization and positron-molecule collisions.
Rotor cascade shape optimization with unsteady passing wakes using implicit dual time stepping method

NASA Astrophysics Data System (ADS)

Lee, Eun Seok

2000-10-01

Improved aerodynamic performance of a turbine cascade shape can be achieved by understanding the flow field associated with the stator-rotor interaction. In this research, an axial gas turbine airfoil cascade shape is optimized for improved aerodynamic performance by using an unsteady Navier-Stokes solver and a parallel genetic algorithm. The objective of the research is twofold: (1) to develop a computational fluid dynamics code having a faster convergence rate and unsteady flow simulation capabilities, and (2) to optimize a turbine airfoil cascade shape with unsteady passing wakes for improved aerodynamic performance. The computer code solves the Reynolds averaged Navier-Stokes equations. It is based on the explicit, finite difference, Runge-Kutta time marching scheme and the Diagonalized Alternating Direction Implicit (DADI) scheme, with the Baldwin-Lomax algebraic and k-epsilon turbulence modeling. Improvements in the code focused on the cascade shape design capability, convergence acceleration and unsteady formulation. First, the inverse shape design method was implemented in the code to provide the design capability, where a surface transpiration concept was employed as an inverse technique to modify the geometry so that it satisfies the user-specified pressure distribution on the airfoil surface. Second, an approximation storage multigrid method was implemented as an acceleration technique. Third, the preconditioning method was adopted to speed up the convergence rate in solving low Mach number flows. Finally, the implicit dual time stepping method was incorporated in order to simulate unsteady flow fields. For the unsteady code validation, Stokes's second problem and the Poiseuille flow were chosen, and the computed results were compared with analytic solutions.
To test the code's ability to capture the natural unsteady flow phenomena, vortex shedding past a cylinder and the shock oscillation over a bicircular airfoil were simulated and compared with experiments and other research results. The rotor cascade shape optimization with unsteady passing wakes was performed to obtain an improved aerodynamic performance using the unsteady Navier-Stokes solver. Two objective functions were defined as minimization of total pressure loss and maximization of lift, while the mass flow rate was fixed. A parallel genetic algorithm was used as an optimizer and the penalty method was introduced. Each individual's objective function was computed simultaneously by using a 32 processor distributed memory computer. One optimization took about four days.

High-order solution methods for grey discrete ordinates thermal radiative transfer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maginot, Peter G., E-mail: maginot1@llnl.gov; Ragusa, Jean C., E-mail: jean.ragusa@tamu.edu; Morel, Jim E., E-mail: morel@tamu.edu

This work presents a solution methodology for solving the grey radiative transfer equations that is both spatially and temporally more accurate than the canonical radiative transfer solution technique of linear discontinuous finite element discretization in space with implicit Euler integration in time. We solve the grey radiative transfer equations by fully converging the nonlinear temperature dependence of the material specific heat, material opacities, and Planck function. The grey radiative transfer equations are discretized in space using arbitrary-order self-lumping discontinuous finite elements and integrated in time with arbitrary-order diagonally implicit Runge–Kutta time integration techniques. Iterative convergence of the radiation equation is accelerated using a modified interior penalty diffusion operator to precondition the full discrete ordinates transport operator.
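To illustrate what "diagonally implicit Runge-Kutta" means in practice, here is a minimal sketch of one well-known member of that family, the two-stage, second-order, L-stable SDIRK scheme with gamma = 1 - 1/sqrt(2). It is applied to the scalar test equation y' = lam * y, not to radiative transfer. The function names and the test equation are assumptions made for illustration, and the abstract does not say which DIRK tableaux the authors use.

```python
import math

# Diagonal coefficient of the two-stage, second-order, L-stable SDIRK scheme.
GAMMA = 1.0 - 1.0 / math.sqrt(2.0)


def sdirk2_step(y, h, lam):
    """Advance y' = lam * y by one step of size h (hypothetical helper).

    Each stage is implicit only in its own unknown, so the stages are
    solved one after another. For this linear test problem each stage
    solve reduces to a division by (1 - h * GAMMA * lam).
    """
    denom = 1.0 - h * GAMMA * lam
    k1 = lam * y / denom
    k2 = lam * (y + h * (1.0 - GAMMA) * k1) / denom
    return y + h * ((1.0 - GAMMA) * k1 + GAMMA * k2)


def integrate(y0, lam, t_end, n_steps):
    """Integrate from t = 0 to t_end with n_steps equal steps."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = sdirk2_step(y, h, lam)
    return y
```

Halving the step size should cut the error by roughly a factor of four (second order), and a single large step on a very stiff problem (for example lam = -1e6, h = 0.1) stays bounded, which is the property that makes such schemes attractive compared with explicit integrators.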
High-order solution methods for grey discrete ordinates thermal radiative transfer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maginot, Peter G.; Ragusa, Jean C.; Morel, Jim E.

This paper presents a solution methodology for solving the grey radiative transfer equations that is both spatially and temporally more accurate than the canonical radiative transfer solution technique of linear discontinuous finite element discretization in space with implicit Euler integration in time. We solve the grey radiative transfer equations by fully converging the nonlinear temperature dependence of the material specific heat, material opacities, and Planck function. The grey radiative transfer equations are discretized in space using arbitrary-order self-lumping discontinuous finite elements and integrated in time with arbitrary-order diagonally implicit Runge–Kutta time integration techniques. Iterative convergence of the radiation equation is accelerated using a modified interior penalty diffusion operator to precondition the full discrete ordinates transport operator.
High-order solution methods for grey discrete ordinates thermal radiative transfer

DOE PAGES

Maginot, Peter G.; Ragusa, Jean C.; Morel, Jim E.

2016-09-29

This paper presents a solution methodology for solving the grey radiative transfer equations that is both spatially and temporally more accurate than the canonical radiative transfer solution technique of linear discontinuous finite element discretization in space with implicit Euler integration in time. We solve the grey radiative transfer equations by fully converging the nonlinear temperature dependence of the material specific heat, material opacities, and Planck function. The grey radiative transfer equations are discretized in space using arbitrary-order self-lumping discontinuous finite elements and integrated in time with arbitrary-order diagonally implicit Runge–Kutta time integration techniques. Iterative convergence of the radiation equation is accelerated using a modified interior penalty diffusion operator to precondition the full discrete ordinates transport operator.
On the construction and application of implicit factored schemes for conservation laws. [in computational fluid dynamics

NASA Technical Reports Server (NTRS)

Warming, R. F.; Beam, R. M.

1978-01-01

Efficient, noniterative, implicit finite difference algorithms are systematically developed for nonlinear conservation laws including purely hyperbolic systems and mixed hyperbolic-parabolic systems. Utilization of rational fraction (Pade) time differencing formulas yields a direct and natural derivation of an implicit scheme in delta form. Attention is given to advantages of the delta formulation and to various properties of one- and two-dimensional algorithms.
Parallel conjugate gradient algorithms for manipulator dynamic simulation

NASA Technical Reports Server (NTRS)

Fijany, Amir; Scheld, Robert E.

1989-01-01

Parallel conjugate gradient algorithms for the computation of multibody dynamics are developed for the specialized case of a robot manipulator. For an n-dimensional positive-definite linear system, the Classical Conjugate Gradient (CCG) algorithms are guaranteed to converge in n iterations, each with a computation cost of O(n); this leads to a total computational cost of O(n^2) on a serial processor. Conjugate gradient algorithms are presented that provide greater efficiency by using a preconditioner, which reduces the number of iterations required, and by exploiting parallelism, which reduces the cost of each iteration. Two Preconditioned Conjugate Gradient (PCG) algorithms are proposed which respectively use a diagonal and a tridiagonal matrix, composed of the diagonal and tridiagonal elements of the mass matrix, as preconditioners. Parallel algorithms are developed to compute the preconditioners and their inversions in O(log_2 n) steps using n processors. A parallel algorithm is also presented which, on the same architecture, achieves the computational time of O(log_2 n) for each iteration. Simulation results for a seven degree-of-freedom manipulator are presented. Variants of the proposed algorithms are also developed which can be efficiently implemented on the Robot Mathematics Processor (RMP).
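The diagonally preconditioned variant described in this abstract can be sketched as a plain serial routine. The sketch below is a generic textbook PCG with a diagonal (Jacobi) preconditioner for a small dense symmetric positive-definite matrix. The helper names `matvec`, `dot` and `pcg_jacobi` are hypothetical, and neither the manipulator mass matrix nor the parallel O(log_2 n) evaluation from the paper is reproduced here.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg_jacobi(A, b, tol=1e-12, max_iter=None):
    """Solve A x = b for symmetric positive-definite A.

    The preconditioner is M = diag(A), so applying M^-1 is an
    elementwise scaling of the residual.
    Returns (solution, number of iterations performed).
    """
    n = len(b)
    max_iter = max_iter or 10 * n
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x, with x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

For example, with A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]] and b = [6, 10, 8], the routine should return a solution close to [1, 2, 3] within a handful of iterations.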
An Unsteady Preconditioning Scheme Based on Convective-Upwind Split-Pressure (CUSP) Artificial Dissipation

DTIC Science & Technology

2014-01-07

this can have a disastrous effect on convergence rate. Even if steady state is obtained for low Mach number flows (after many iterations), the results...rally lead do a diagonally dominant left-hand-side matrix, which causes stability problems for implicit Gauss-Seidel schemes. For this reason, matrix... convergence at the stagnation point. The iterations for each airfoil is also reported in Fig. 2. Without preconditioning, dramatic efficiency problems are seen
An implicit iterative algorithm with a tuning parameter for Itô Lyapunov matrix equations

NASA Astrophysics Data System (ADS)

Zhang, Ying; Wu, Ai-Guo; Sun, Hui-Jie

2018-01-01

In this paper, an implicit iterative algorithm is proposed for solving a class of Lyapunov matrix equations arising in Itô stochastic linear systems. A tuning parameter is introduced in this algorithm, and thus the convergence rate of the algorithm can be changed. Some conditions are presented such that the developed algorithm is convergent. In addition, an explicit expression is also derived for the optimal tuning parameter, which guarantees that the obtained algorithm achieves its fastest convergence rate. Finally, numerical examples are employed to illustrate the effectiveness of the given algorithm.
A Low-Stress Algorithm for Fractions

ERIC Educational Resources Information Center

Ruais, Ronald W.

1978-01-01

An algorithm is given for the addition and subtraction of fractions based on dividing the sum of diagonal numerator and denominator products by the product of the denominators. As an explanation of the teaching method, activities used in teaching are demonstrated. (MN)
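The rule stated in this abstract, a/b ± c/d = (a·d ± c·b)/(b·d), can be written as a short function. This is a minimal sketch. The function name is hypothetical, and the final reduction to lowest terms is an added assumption that the abstract does not mention.

```python
from math import gcd


def add_fractions(a, b, c, d, subtract=False):
    """Return a/b + c/d (or a/b - c/d) as a (numerator, denominator) pair.

    The numerator is the sum (or difference) of the diagonal products
    a*d and c*b; the denominator is the product b*d.
    """
    if b == 0 or d == 0:
        raise ZeroDivisionError("denominators must be non-zero")
    cross = a * d - c * b if subtract else a * d + c * b
    den = b * d
    if den < 0:                 # keep the sign in the numerator
        cross, den = -cross, -den
    g = gcd(cross, den)         # reduction step: an assumption, not in the abstract
    return cross // g, den // g
```

For example, 1/2 + 1/3 gives the diagonal products 1·3 and 1·2, so the result is (3 + 2)/(2·3) = 5/6.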
Spectral factorization of wavefields and wave operators

NASA Astrophysics Data System (ADS)

Rickett, James Edward

Spectral factorization is the problem of finding a minimum-phase function with a given power spectrum. Minimum phase functions have the property that they are causal with a causal (stable) inverse. In this thesis, I factor multidimensional systems into their minimum-phase components. Helical boundary conditions resolve any ambiguities over causality, allowing me to factor multi-dimensional systems with conventional one-dimensional spectral factorization algorithms. In the first part, I factor passive seismic wavefields recorded in two-dimensional spatial arrays. The result provides an estimate of the acoustic impulse response of the medium that has higher bandwidth than autocorrelation-derived estimates. Also, the function's minimum-phase nature mimics the physics of the system better than the zero-phase autocorrelation model. I demonstrate this on helioseismic data recorded by the satellite-based Michelson Doppler Imager (MDI) instrument, and shallow seismic data recorded at Long Beach, California. In the second part of this thesis, I take advantage of the stable-inverse property of minimum-phase functions to solve wave-equation partial differential equations. By factoring multi-dimensional finite-difference stencils into minimum-phase components, I can invert them efficiently, facilitating rapid implicit extrapolation without the azimuthal anisotropy that is observed with splitting approximations. The final part of this thesis describes how to calculate diagonal weighting functions that approximate the combined operation of seismic modeling and migration. These weighting functions capture the effects of irregular subsurface illumination, which can be the result of either the surface-recording geometry, or focusing and defocusing of the seismic wavefield as it propagates through the earth. Since they are diagonal, they can be easily both factored and inverted to compensate for uneven subsurface illumination in migrated images. 
Experimental results show that applying these weighting functions after migration leads to significantly improved estimates of seismic reflectivity.
Exponential integration algorithms applied to viscoplasticity

NASA Technical Reports Server (NTRS)

Freed, Alan D.; Walker, Kevin P.

1991-01-01

Four, linear, exponential, integration algorithms (two implicit, one explicit, and one predictor/corrector) are applied to a viscoplastic model to assess their capabilities. Viscoplasticity comprises a system of coupled, nonlinear, stiff, first order, ordinary differential equations which are a challenge to integrate by any means. Two of the algorithms (the predictor/corrector and one of the implicits) give outstanding results, even for very large time steps.
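The abstract does not spell out its four algorithms, so the following is only a generic sketch of the idea behind exponential integration, applied to a scalar stiff equation of the assumed form y' = -a*y + g(y) with a > 0. The linear decay term is integrated exactly while g is frozen at the start of the step (an explicit exponential Euler step). The function name and the model equation are assumptions for illustration, not the authors' schemes.

```python
import math


def exp_euler_step(y, h, a, g):
    """One explicit exponential Euler step for y' = -a*y + g(y), a > 0.

    The homogeneous part decays by exp(-a*h) exactly; the forcing g is
    evaluated once at the current state and held constant over the step.
    """
    decay = math.exp(-a * h)
    return decay * y + (1.0 - decay) / a * g(y)
```

When g is constant the step reproduces the exact solution for any step size, so a very stiff problem such as a = 1000, g = 2000 relaxes to its steady state y = 2 in a single large step without instability.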
A combined joint diagonalization-MUSIC algorithm for subsurface targets localization

NASA Astrophysics Data System (ADS)

Wang, Yinlin; Sigman, John B.; Barrowes, Benjamin E.; O'Neill, Kevin; Shubitidze, Fridon

2014-06-01

This paper presents a combined joint diagonalization (JD) and multiple signal classification (MUSIC) algorithm for estimating subsurface object locations from electromagnetic induction (EMI) sensor data, without solving ill-posed inverse-scattering problems. JD is a numerical technique that finds the common eigenvectors that diagonalize a set of multistatic response (MSR) matrices measured by a time-domain EMI sensor. Eigenvalues from targets of interest (TOI) can then be distinguished automatically from noise-related eigenvalues. Filtering is also carried out in JD to improve the signal-to-noise ratio (SNR) of the data. The MUSIC algorithm utilizes the orthogonality between the signal and noise subspaces in the MSR matrix, which can be separated with information provided by JD. An array of theoretically calculated Green's functions is then projected onto the noise subspace, and the location of the target is estimated by the minimum of the projection owing to the orthogonality. This combined method is applied to data from the Time-Domain Electromagnetic Multisensor Towed Array Detection System (TEMTADS). Examples of TEMTADS test stand data and field data collected at Spencer Range, Tennessee are analyzed and presented. Results indicate that due to its noniterative mechanism, the method can be executed fast enough to provide real-time estimation of objects' locations in the field.
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Chiou, Jin-Chern

1990-01-01

Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAEs) viewpoint. Constraint violations during the time integration process are minimized, and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion are treated with a two-stage staggered explicit-implicit numerical algorithm, which takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, parallel implementations of the present constraint treatment techniques and of the two-stage staggered explicit-implicit numerical algorithm were efficiently carried out. The DAEs and the constraint treatment techniques were transformed into arrowhead matrices, from which the Schur complement form was derived. By fully exploiting sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the system equations written in Schur complement form. A software testbed was designed and implemented on both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speed-up of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
A comparison of three-dimensional nonequilibrium solution algorithms applied to hypersonic flows with stiff chemical source terms

NASA Technical Reports Server (NTRS)

Palmer, Grant; Venkatapathy, Ethiraj

1993-01-01

Three solution algorithms, explicit underrelaxation, point implicit, and lower upper symmetric Gauss-Seidel (LUSGS), are used to compute nonequilibrium flow around the Apollo 4 return capsule at 62 km altitude. By varying the Mach number, the efficiency and robustness of the solution algorithms were tested for different levels of chemical stiffness. The performance of the solution algorithms degraded as the Mach number and stiffness of the flow increased. At Mach 15, 23, and 30, the LUSGS method produces an eight order of magnitude drop in the L2 norm of the energy residual in 1/3 to 1/2 the Cray C-90 computer time as compared to the point implicit and explicit under-relaxation methods. The explicit under-relaxation algorithm experienced convergence difficulties at Mach 23 and above. At Mach 40 the performance of the LUSGS algorithm deteriorates to the point it is out-performed by the point implicit method. The effects of the viscous terms are investigated. Grid dependency questions are explored.
An implicit time-marching method for the three-dimensional Navier-Stokes equations of contravariant velocity components

NASA Astrophysics Data System (ADS)

Daiguji, Hisaaki; Yamamoto, Satoru

1988-12-01

The implicit time-marching finite-difference method for solving the three-dimensional compressible Euler equations developed by the authors is extended to the Navier-Stokes equations. The distinctive features of this method are to make use of momentum equations of contravariant velocities instead of physical boundaries, and to be able to treat the periodic boundary condition for the three-dimensional impeller flow easily. These equations can be solved by using the same techniques as the Euler equations, such as the delta-form approximate factorization, diagonalization and upstreaming. In addition to them, a simplified total variation diminishing scheme by the authors is applied to the present method in order to capture strong shock waves clearly. Finally, the computed results of the three-dimensional flow through a transonic compressor rotor with tip clearance are shown.
Implementing the SU(2) Symmetry for the DMRG

NASA Astrophysics Data System (ADS)

Alvarez, Gonzalo

2010-03-01

In the Density Matrix Renormalization Group (DMRG) algorithm (White, 1992), Hamiltonian symmetries play an important role. Using symmetries, the matrix representation of the Hamiltonian can be blocked. Diagonalizing each matrix block is more efficient than diagonalizing the original matrix. This talk will explain how the DMRG++ code (arXiv:0902.3185; Computer Physics Communications 180 (2009) 1572-1578) has been extended to handle the non-local SU(2) symmetry in a model independent way. Improvements in CPU times compared to runs with only local symmetries will be discussed for typical tight-binding models of strongly correlated electronic systems. The computational bottleneck of the algorithm, and the use of shared memory parallelization will also be addressed. Finally, a roadmap for future work on DMRG++ will be presented.
An implicit flux-split algorithm to calculate hypersonic flowfields in chemical equilibrium

NASA Technical Reports Server (NTRS)

Palmer, Grant

1987-01-01

An implicit, finite-difference, shock-capturing algorithm that calculates inviscid, hypersonic flows in chemical equilibrium is presented. The flux vectors and flux Jacobians are differenced using a first-order, flux-split technique. The equilibrium composition of the gas is determined by minimizing the Gibbs free energy at every node point. The code is validated by comparing results over an axisymmetric hemisphere against previously published results. The algorithm is also applied to more practical configurations. The accuracy, stability, and versatility of the algorithm have been promising.
Improving stochastic estimates with inference methods: calculating matrix diagonals.

PubMed

Selig, Marco; Oppermann, Niels; Ensslin, Torsten A

2012-02-01

Estimating the diagonal entries of a matrix, that is not directly accessible but only available as a linear operator in the form of a computer routine, is a common necessity in many computational applications, especially in image reconstruction and statistical inference. Here, methods of statistical inference are used to improve the accuracy or the computational costs of matrix probing methods to estimate matrix diagonals. In particular, the generalized Wiener filter methodology, as developed within information field theory, is shown to significantly improve estimates based on only a few sampling probes, in cases in which some form of continuity of the solution can be assumed. The strength, length scale, and precise functional form of the exploited autocorrelation function of the matrix diagonal is determined from the probes themselves. The developed algorithm is successfully applied to mock and real world problems. These performance tests show that, in situations where a matrix diagonal has to be calculated from only a small number of computationally expensive probes, a speedup by a factor of 2 to 10 is possible with the proposed method. © 2012 American Physical Society
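The baseline "matrix probing" that the paper sets out to improve can be sketched as follows. The operator is only available as a routine, so its diagonal is estimated by averaging z_i * (A z)_i over random sign vectors z. This sketch shows plain probing only and does not include the generalized Wiener filter step the abstract describes. The function name, the choice of ±1 probes and the seeding are assumptions for illustration.

```python
import random


def estimate_diagonal(apply_op, n, n_probes, seed=0):
    """Estimate diag(A) when A is available only through apply_op(z) -> A z.

    Uses random +/-1 probe vectors. For each probe, z_i * (A z)_i equals
    A_ii plus zero-mean cross terms, so the average over many probes
    approaches the diagonal.
    """
    rng = random.Random(seed)
    acc = [0.0] * n
    for _ in range(n_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        az = apply_op(z)
        for i in range(n):
            acc[i] += z[i] * az[i]
    return [a / n_probes for a in acc]
```

The statistical error of each entry shrinks like 1/sqrt(n_probes) and is driven by the off-diagonal entries in the same row, which is why reducing the number of expensive probes, as the paper proposes, is valuable.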
Radiated chemical reaction impacts on natural convective MHD mass transfer flow induced by a vertical cone

NASA Astrophysics Data System (ADS)

Sambath, P.; Pullepu, Bapuji; Hussain, T.; Ali Shehzad, Sabir

2018-03-01

The effect of thermal radiation on laminar natural convective hydromagnetic flow of a viscous incompressible fluid past a vertical cone, with mass transfer under the influence of a chemical reaction and a heat source/sink, is presented here. The surface of the cone is subjected to a variable wall temperature (VWT) and a variable wall concentration (VWC). The fluid considered here is a gray, absorbing and emitting but non-scattering medium. The dimensionless boundary layer equations governing the flow are solved by an implicit finite-difference scheme of Crank-Nicolson type, which converges rapidly and is stable. This method converts the dimensionless equations into a system of tridiagonal equations, which are then solved using the well-known Thomas algorithm. Numerical solutions for momentum, temperature, concentration, local and average shear stress, and heat and mass transfer rates are obtained for various values of the parameters Pr, Sc, λ, Δ and Rd and are presented graphically. We observed that the liquid velocity decreased for higher values of the Prandtl and Schmidt numbers. The temperature increases for decreasing values of the Schmidt and Prandtl numbers. Increasing the radiative parameter supplies more heat to the liquid, so the temperature rises significantly.
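The Thomas algorithm mentioned in this abstract is a forward-elimination and back-substitution pass specialized to tridiagonal systems. Below is a minimal sketch with an assumed storage convention: `lower[i]`, `diag[i]` and `upper[i]` hold the three entries of row i, with `lower[0]` and `upper[n-1]` unused. No pivoting is performed, so the sketch assumes a well-conditioned system, for example a diagonally dominant one.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) operations.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    """
    n = len(diag)
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

For the 3x3 matrix with 2 on the diagonal and -1 on both off-diagonals and right-hand side [0, 0, 4], the routine should return a solution close to [1, 2, 3].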
A Fast parallel tridiagonal algorithm for a class of CFD applications

NASA Technical Reports Server (NTRS)

Moitra, Stuti; Sun, Xian-He

1996-01-01

The parallel diagonal dominant (PDD) algorithm is an efficient tridiagonal solver. This paper presents for study a variation of the PDD algorithm, the reduced PDD algorithm. The new algorithm maintains the minimum communication provided by the PDD algorithm, but has a reduced operation count. The PDD algorithm also has a smaller operation count than the conventional sequential algorithm for many applications. Accuracy analysis is provided for the reduced PDD algorithm for symmetric Toeplitz tridiagonal (STT) systems. Implementation results on Langley's Intel Paragon and IBM SP2 show that both the PDD and reduced PDD algorithms are efficient and scalable.
Algorithm development for Maxwell's equations for computational electromagnetism

NASA Technical Reports Server (NTRS)

Goorjian, Peter M.

1990-01-01

A new algorithm has been developed for solving Maxwell's equations for the electromagnetic field. It solves the equations in the time domain with central, finite differences. The time advancement is performed implicitly, using an alternating direction implicit procedure. The space discretization is performed with finite volumes, using curvilinear coordinates with electromagnetic components along those directions. Sample calculations are presented of scattering from a metal pin, a square and a circle to demonstrate the capabilities of the new algorithm.

A Newton-Krylov method with an approximate analytical Jacobian for implicit solution of Navier-Stokes equations on staggered overset-curvilinear grids with immersed boundaries.

PubMed

Asgharzadeh, Hafez; Borazjani, Iman

2017-02-15

The explicit and semi-implicit schemes in flow simulations involving complex geometries and moving boundaries suffer from time-step size restriction and low convergence rates. Implicit schemes can be used to overcome these restrictions, but implementing them to solve the Navier-Stokes equations is not straightforward due to their non-linearity. Among the implicit schemes for nonlinear equations, Newton-based techniques are preferred over fixed-point techniques because of their high convergence rate but each Newton iteration is more expensive than a fixed-point iteration. Krylov subspace methods are one of the most advanced iterative methods that can be combined with Newton methods, i.e., Newton-Krylov Methods (NKMs) to solve non-linear systems of equations. The success of NKMs vastly depends on the scheme for forming the Jacobian, e.g., automatic differentiation is very expensive, and matrix-free methods without a preconditioner slow down as the mesh is refined. A novel, computationally inexpensive analytical Jacobian for NKM is developed to solve unsteady incompressible Navier-Stokes momentum equations on staggered overset-curvilinear grids with immersed boundaries. Moreover, the analytical Jacobian is used to form preconditioner for matrix-free method in order to improve its performance. The NKM with the analytical Jacobian was validated and verified against Taylor-Green vortex, inline oscillations of a cylinder in a fluid initially at rest, and pulsatile flow in a 90 degree bend. The capability of the method in handling complex geometries with multiple overset grids and immersed boundaries is shown by simulating an intracranial aneurysm. It was shown that the NKM with an analytical Jacobian is 1.17 to 14.77 times faster than the fixed-point Runge-Kutta method, and 1.74 to 152.3 times (excluding an intensively stretched grid) faster than automatic differentiation depending on the grid (size) and the flow problem. 
In addition, it was shown that using only the diagonal of the Jacobian further improves the performance by 42-74% compared to the full Jacobian. The NKM with an analytical Jacobian showed better performance than the fixed-point Runge-Kutta because it converged with larger time steps and in approximately 30% fewer iterations even when the grid was stretched and the Reynolds number was increased. In fact, stretching the grid decreased the performance of all methods, but the fixed-point Runge-Kutta performance decreased 4.57 and 2.26 times more than NKM with a diagonal and full Jacobian, respectively, when the stretching factor was increased. The NKM with a diagonal analytical Jacobian and matrix-free method with an analytical preconditioner are the fastest methods and the superiority of one to another depends on the flow problem. Furthermore, the implemented methods are fully parallelized with parallel efficiency of 80-90% on the problems tested. The NKM with the analytical Jacobian can guide building preconditioners for other techniques to improve their performance in the future.
A Newton–Krylov method with an approximate analytical Jacobian for implicit solution of Navier–Stokes equations on staggered overset-curvilinear grids with immersed boundaries

PubMed Central

Asgharzadeh, Hafez; Borazjani, Iman

2016-01-01

The explicit and semi-implicit schemes in flow simulations involving complex geometries and moving boundaries suffer from time-step size restriction and low convergence rates. Implicit schemes can be used to overcome these restrictions, but implementing them to solve the Navier-Stokes equations is not straightforward due to their non-linearity. Among the implicit schemes for nonlinear equations, Newton-based techniques are preferred over fixed-point techniques because of their high convergence rate but each Newton iteration is more expensive than a fixed-point iteration. Krylov subspace methods are one of the most advanced iterative methods that can be combined with Newton methods, i.e., Newton-Krylov Methods (NKMs) to solve non-linear systems of equations. The success of NKMs vastly depends on the scheme for forming the Jacobian, e.g., automatic differentiation is very expensive, and matrix-free methods without a preconditioner slow down as the mesh is refined. A novel, computationally inexpensive analytical Jacobian for NKM is developed to solve unsteady incompressible Navier-Stokes momentum equations on staggered overset-curvilinear grids with immersed boundaries. Moreover, the analytical Jacobian is used to form preconditioner for matrix-free method in order to improve its performance. The NKM with the analytical Jacobian was validated and verified against Taylor-Green vortex, inline oscillations of a cylinder in a fluid initially at rest, and pulsatile flow in a 90 degree bend. The capability of the method in handling complex geometries with multiple overset grids and immersed boundaries is shown by simulating an intracranial aneurysm. It was shown that the NKM with an analytical Jacobian is 1.17 to 14.77 times faster than the fixed-point Runge-Kutta method, and 1.74 to 152.3 times (excluding an intensively stretched grid) faster than automatic differentiation depending on the grid (size) and the flow problem. 
In addition, it was shown that using only the diagonal of the Jacobian further improves the performance by 42–74% compared to the full Jacobian. The NKM with an analytical Jacobian showed better performance than the fixed-point Runge-Kutta because it converged with larger time steps and in approximately 30% fewer iterations even when the grid was stretched and the Reynolds number was increased. In fact, stretching the grid decreased the performance of all methods, but the fixed-point Runge-Kutta performance decreased 4.57 and 2.26 times more than NKM with a diagonal and full Jacobian, respectively, when the stretching factor was increased. The NKM with a diagonal analytical Jacobian and matrix-free method with an analytical preconditioner are the fastest methods and the superiority of one to another depends on the flow problem. Furthermore, the implemented methods are fully parallelized with parallel efficiency of 80–90% on the problems tested. The NKM with the analytical Jacobian can guide building preconditioners for other techniques to improve their performance in the future. PMID:28042172
A Newton-Krylov method with an approximate analytical Jacobian for implicit solution of Navier-Stokes equations on staggered overset-curvilinear grids with immersed boundaries

NASA Astrophysics Data System (ADS)

Asgharzadeh, Hafez; Borazjani, Iman

2017-02-01

The explicit and semi-implicit schemes in flow simulations involving complex geometries and moving boundaries suffer from time-step size restriction and low convergence rates. Implicit schemes can be used to overcome these restrictions, but implementing them to solve the Navier-Stokes equations is not straightforward due to their non-linearity. Among the implicit schemes for non-linear equations, Newton-based techniques are preferred over fixed-point techniques because of their high convergence rate but each Newton iteration is more expensive than a fixed-point iteration. Krylov subspace methods are one of the most advanced iterative methods that can be combined with Newton methods, i.e., Newton-Krylov Methods (NKMs) to solve non-linear systems of equations. The success of NKMs vastly depends on the scheme for forming the Jacobian, e.g., automatic differentiation is very expensive, and matrix-free methods without a preconditioner slow down as the mesh is refined. A novel, computationally inexpensive analytical Jacobian for NKM is developed to solve unsteady incompressible Navier-Stokes momentum equations on staggered overset-curvilinear grids with immersed boundaries. Moreover, the analytical Jacobian is used to form a preconditioner for matrix-free method in order to improve its performance. The NKM with the analytical Jacobian was validated and verified against Taylor-Green vortex, inline oscillations of a cylinder in a fluid initially at rest, and pulsatile flow in a 90 degree bend. The capability of the method in handling complex geometries with multiple overset grids and immersed boundaries is shown by simulating an intracranial aneurysm. It was shown that the NKM with an analytical Jacobian is 1.17 to 14.77 times faster than the fixed-point Runge-Kutta method, and 1.74 to 152.3 times (excluding an intensively stretched grid) faster than automatic differentiation depending on the grid (size) and the flow problem. 
In addition, it was shown that using only the diagonal of the Jacobian further improves the performance by 42-74% compared to the full Jacobian. The NKM with an analytical Jacobian showed better performance than the fixed-point Runge-Kutta method because it converged with larger time steps and in approximately 30% fewer iterations, even when the grid was stretched and the Reynolds number was increased. In fact, stretching the grid decreased the performance of all methods, but the fixed-point Runge-Kutta performance decreased 4.57 and 2.26 times more than that of the NKM with a diagonal and full Jacobian, respectively, when the stretching factor was increased. The NKM with a diagonal analytical Jacobian and the matrix-free method with an analytical preconditioner are the fastest methods, and which of the two is faster depends on the flow problem. Furthermore, the implemented methods are fully parallelized, with a parallel efficiency of 80-90% on the problems tested. The NKM with the analytical Jacobian can guide building preconditioners for other techniques to improve their performance in the future.
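The diagonal-Jacobian idea in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' solver: the residual, the coupling coefficients, and the function names are all hypothetical, and the Krylov linear solve is replaced by a plain division by the analytical Jacobian diagonal, which only works here because the off-diagonal coupling is weak.

```python
def residual(u, b):
    """F(u) for a hypothetical model problem: cubic local term plus weak neighbor coupling."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(u[i] + 0.1 * u[i] ** 3 + 0.05 * (left + right) - b[i])
    return out


def jacobian_diagonal(u):
    """Analytical dF_i/du_i for the residual above (off-diagonal entries are ignored)."""
    return [1.0 + 0.3 * ui ** 2 for ui in u]


def diagonal_newton(b, tol=1e-12, max_iter=100):
    """Newton-like iteration that keeps only the Jacobian diagonal."""
    u = [0.0] * len(b)
    for it in range(max_iter):
        f = residual(u, b)
        if max(abs(fi) for fi in f) < tol:
            return u, it
        d = jacobian_diagonal(u)
        # Each unknown is updated independently, so no linear system is assembled.
        u = [ui - fi / di for ui, fi, di in zip(u, f, d)]
    return u, max_iter


solution, iterations = diagonal_newton([1.0, 2.0, 3.0])
```

Dropping the off-diagonal terms makes each iteration cheap but reduces the convergence rate from quadratic to linear, so the trade-off depends on how strongly the unknowns are coupled.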
An implicit adaptation algorithm for a linear model reference control system

NASA Technical Reports Server (NTRS)

Mabius, L.; Kaufman, H.

1975-01-01

This paper presents a stable implicit adaptation algorithm for model reference control. The constraints for stability are found using Lyapunov's second method and do not depend on perfect model following between the system and the reference model. Methods are proposed for satisfying these constraints without estimating the parameters on which the constraints depend.
Efficiency Study of Implicit and Explicit Time Integration Operators for Finite Element Applications

DTIC Science & Technology

1977-07-01

...efficiency, wherein Beta = 0 provides an explicit algorithm, while a nonzero Beta provides an implicit algorithm. Both algorithms are used in the same...
Block recursive LU preconditioners for the thermally coupled incompressible inductionless MHD problem

NASA Astrophysics Data System (ADS)

Badia, Santiago; Martín, Alberto F.; Planas, Ramon

2014-10-01

The thermally coupled incompressible inductionless magnetohydrodynamics (MHD) problem models the flow of an electrically charged fluid under the influence of an external electromagnetic field with thermal coupling. This system of partial differential equations is strongly coupled and highly nonlinear for real cases of interest. Therefore, fully implicit time integration schemes are very desirable in order to capture the different physical scales of the problem at hand. However, solving the multiphysics linear systems of equations resulting from such algorithms is a very challenging task which requires efficient and scalable preconditioners. In this work, a new family of recursive block LU preconditioners is designed and tested for solving the thermally coupled inductionless MHD equations. These preconditioners are obtained after splitting the fully coupled matrix into one-physics problems for every variable (velocity, pressure, current density, electric potential and temperature) that can be optimally solved, e.g., using preconditioned domain decomposition algorithms. The main idea is to arrange the original matrix into an (arbitrary) 2 × 2 block matrix, and consider an LU preconditioner obtained by approximating the corresponding Schur complement. For every one of the diagonal blocks in the LU preconditioner, if it involves more than one type of unknowns, we proceed the same way in a recursive fashion. This approach is stated in an abstract way, and can be straightforwardly applied to other multiphysics problems. Further, we precisely explain a flexible and general software design for the code implementation of this type of preconditioners.
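The 2 × 2 block idea described above rests on a standard factorization, written out here as a sketch. The block names A, B, C, D and the hatted approximations are notation assumed for illustration and are not taken from the paper.

```latex
% Exact block LU factorization of a 2x2 block matrix (A assumed invertible)
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
=
\begin{pmatrix} I & 0 \\ C A^{-1} & I \end{pmatrix}
\begin{pmatrix} A & B \\ 0 & S \end{pmatrix},
\qquad
S = D - C A^{-1} B .

% A preconditioner of this type replaces A^{-1} and S by cheaper approximations
P =
\begin{pmatrix} I & 0 \\ C \hat{A}^{-1} & I \end{pmatrix}
\begin{pmatrix} \hat{A} & B \\ 0 & \hat{S} \end{pmatrix},
\qquad
\hat{A} \approx A, \quad \hat{S} \approx S .
```

If a diagonal block itself couples several unknown types, the same factorization can be applied to that block again, which is the recursive step the abstract refers to.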
Multiprocessing on supercomputers for computational aerodynamics

NASA Technical Reports Server (NTRS)

Yarrow, Maurice; Mehta, Unmeel B.

1990-01-01

Very little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPs or more) in computational aerodynamics to significantly improve turnaround time. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, the improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) through multi-tasking is applied via a strategy which requires relatively minor modifications to an existing code for a single processor. Essentially, this approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. The existing single processor code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor. As a demonstration of this approach, a Multiple Processor Multiple Grid (MPMG) code is developed. It is capable of using nine processors, and can be easily extended to a larger number of processors. This code solves the three-dimensional, Reynolds averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. The solver is applied to a generic oblique-wing aircraft problem on a four-processor Cray-2 computer. A tricubic interpolation scheme is developed to increase the accuracy of coupling of overlapped grids. For the oblique-wing aircraft problem, a speedup of two in elapsed (turnaround) time is observed in a saturated time-sharing environment.
A curvilinear, fully implicit, conservative electromagnetic PIC algorithm in multiple dimensions

DOE PAGES

Chacon, L.; Chen, G.

2016-04-19

Here, we extend a recently proposed fully implicit PIC algorithm for the Vlasov–Darwin model in multiple dimensions (Chen and Chacón (2015) [1]) to curvilinear geometry. As in the Cartesian case, the approach is based on a potential formulation (Φ, A), and overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. Conservation theorems for local charge and global energy are derived in curvilinear representation, and then enforced discretely by a careful choice of the discretization of field and particle equations. Additionally, the algorithm conserves canonical-momentum in any ignorable direction, and preserves the Coulomb gauge ∇ · A = 0 exactly. An asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. We demonstrate the accuracy and efficiency properties of the algorithm with numerical experiments in mapped meshes in 1D-3V and 2D-3V.
A curvilinear, fully implicit, conservative electromagnetic PIC algorithm in multiple dimensions

NASA Astrophysics Data System (ADS)

Chacón, L.; Chen, G.

2016-07-01

We extend a recently proposed fully implicit PIC algorithm for the Vlasov-Darwin model in multiple dimensions (Chen and Chacón (2015) [1]) to curvilinear geometry. As in the Cartesian case, the approach is based on a potential formulation (ϕ, A), and overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. Conservation theorems for local charge and global energy are derived in curvilinear representation, and then enforced discretely by a careful choice of the discretization of field and particle equations. Additionally, the algorithm conserves canonical-momentum in any ignorable direction, and preserves the Coulomb gauge ∇ · A = 0 exactly. An asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. We demonstrate the accuracy and efficiency properties of the algorithm with numerical experiments in mapped meshes in 1D-3V and 2D-3V.
A curvilinear, fully implicit, conservative electromagnetic PIC algorithm in multiple dimensions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chacon, L.; Chen, G.

Here, we extend a recently proposed fully implicit PIC algorithm for the Vlasov–Darwin model in multiple dimensions (Chen and Chacón (2015) [1]) to curvilinear geometry. As in the Cartesian case, the approach is based on a potential formulation (Φ, A), and overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. Conservation theorems for local charge and global energy are derived in curvilinear representation, and then enforced discretely by a careful choice of the discretization of field and particle equations. Additionally, the algorithm conserves canonical-momentum in any ignorable direction, and preserves the Coulomb gauge ∇ · A = 0 exactly. An asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. We demonstrate the accuracy and efficiency properties of the algorithm with numerical experiments in mapped meshes in 1D-3V and 2D-3V.
Fluid preconditioning for Newton–Krylov-based, fully implicit, electrostatic particle-in-cell simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, G., E-mail: gchen@lanl.gov; Chacón, L.; Leibs, C.A.

2014-02-01

A recent proof-of-principle study proposes an energy- and charge-conserving, nonlinearly implicit electrostatic particle-in-cell (PIC) algorithm in one dimension [9]. The algorithm in the reference employs an unpreconditioned Jacobian-free Newton–Krylov method, which ensures nonlinear convergence at every timestep (resolving the dynamical timescale of interest). Kinetic enslavement, which is one key component of the algorithm, not only enables fully implicit PIC as a practical approach, but also allows preconditioning the kinetic solver with a fluid approximation. This study proposes such a preconditioner, in which the linearized moment equations are closed with moments computed from particles. Effective acceleration of the linear GMRES solve is demonstrated on both uniform and non-uniform meshes. The algorithm performance is largely insensitive to the electron–ion mass ratio. Numerical experiments are performed on a 1D multi-scale ion acoustic wave test problem.
Radiation-MHD Simulations of Pillars and Globules in HII Regions

NASA Astrophysics Data System (ADS)

Mackey, J.

2012-07-01

Implicit and explicit raytracing-photoionisation algorithms have been implemented in the author's radiation-magnetohydrodynamics code. The algorithms are described briefly and their efficiency and parallel scaling are investigated. The implicit algorithm is more efficient for calculations where ionisation fronts have very supersonic velocities, and the explicit algorithm is favoured in the opposite limit because of its better parallel scaling. The implicit method is used to investigate the effects of initially uniform magnetic fields on the formation and evolution of dense pillars and cometary globules at the boundaries of HII regions. It is shown that for weak and medium field strengths an initially perpendicular field is swept into alignment with the pillar during its dynamical evolution, matching magnetic field observations of the ‘Pillars of Creation’ in M16. A strong perpendicular magnetic field remains in its initial configuration and also confines the photoevaporation flow into a bar-shaped, dense, ionised ribbon which partially shields the ionisation front.
Comparison of Nonequilibrium Solution Algorithms Applied to Chemically Stiff Hypersonic Flows

NASA Technical Reports Server (NTRS)

Palmer, Grant; Venkatapathy, Ethiraj

1995-01-01

Three solution algorithms, explicit under-relaxation, point implicit, and lower-upper symmetric Gauss-Seidel, are used to compute nonequilibrium flow around the Apollo 4 return capsule at the 62-km altitude point in its descent trajectory. By varying the Mach number, the efficiency and robustness of the solution algorithms were tested for different levels of chemical stiffness. The performance of the solution algorithms degraded as the Mach number and stiffness of the flow increased. At Mach 15 and 30, the lower-upper symmetric Gauss-Seidel method produces an eight-order-of-magnitude drop in the energy residual in one-third to one-half the Cray C-90 computer time as compared to the point implicit and explicit under-relaxation methods. The explicit under-relaxation algorithm experienced convergence difficulties at Mach 30 and above. At Mach 40 the performance of the lower-upper symmetric Gauss-Seidel algorithm deteriorates to the point that it is outperformed by the point implicit method. The effects of the viscous terms are investigated. Grid dependency questions are explored.
An analytical particle mover for the charge- and energy-conserving, nonlinearly implicit, electrostatic particle-in-cell algorithm

NASA Astrophysics Data System (ADS)

Chen, G.; Chacón, L.

2013-08-01

We propose a 1D analytical particle mover for the recent charge- and energy-conserving electrostatic particle-in-cell (PIC) algorithm in Ref. [G. Chen, L. Chacón, D.C. Barnes, An energy- and charge-conserving, implicit, electrostatic particle-in-cell algorithm, Journal of Computational Physics 230 (2011) 7018-7036]. The approach computes particle orbits exactly for a given piece-wise linear electric field. The resulting PIC algorithm maintains the exact charge and energy conservation properties of the original algorithm, but with improved performance (both in efficiency and robustness against the number of particles and timestep). We demonstrate the advantageous properties of the scheme with a challenging multiscale numerical test case, the ion acoustic wave. Using the analytical mover as a reference, we demonstrate that the choice of error estimator in the Crank-Nicolson mover has significant impact on the overall performance of the implicit PIC algorithm. The generalization of the approach to the multi-dimensional case is outlined, based on a novel and simple charge conserving interpolation scheme.
Improved diagonal queue medical image steganography using Chaos theory, LFSR, and Rabin cryptosystem.

PubMed

Jain, Mamta; Kumar, Anil; Choudhary, Rishabh Charan

2017-06-01

In this article, we propose an improved diagonal queue medical image steganography for transmitting secret patient medical data using a chaotic standard map, a linear feedback shift register, and the Rabin cryptosystem, improving on a previous technique (Jain and Lenka in Springer Brain Inform 3:39-51, 2016). The proposed algorithm comprises four stages: generation of pseudo-random sequences (by a linear feedback shift register and a standard chaotic map), permutation and XORing using the pseudo-random sequences, encryption using the Rabin cryptosystem, and steganography using the improved diagonal queues. Security analysis has been carried out. Performance is assessed using MSE, PSNR, and maximum embedding capacity, as well as by histogram analysis between stego and cover images for various brain diseases.
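The first two stages named above (pseudo-random sequence generation and XORing) can be sketched with a toy register. This is an illustration only: the 4-bit register width, the feedback polynomial x^4 + x^3 + 1, and the function names are assumptions, the chaotic map, permutation, Rabin encryption, and embedding stages are omitted, and a register this small offers no security.

```python
def lfsr4_stream(seed, count):
    """Return `count` bits from a toy 4-bit Fibonacci LFSR (polynomial x^4 + x^3 + 1)."""
    state = seed & 0xF
    if state == 0:
        raise ValueError("seed must be nonzero, otherwise the register stays at zero")
    bits = []
    for _ in range(count):
        bits.append(state & 1)                   # output the low bit
        feedback = (state ^ (state >> 1)) & 1    # XOR of the two tapped bits
        state = (state >> 1) | (feedback << 3)   # shift right, feed back into the top bit
    return bits


def xor_bytes(data, seed):
    """XOR each byte of `data` with 8 keystream bits; applying it twice restores the input."""
    bits = lfsr4_stream(seed, 8 * len(data))
    out = bytearray()
    for k, byte in enumerate(data):
        key = 0
        for bit in bits[8 * k: 8 * k + 8]:
            key = (key << 1) | bit
        out.append(byte ^ key)
    return bytes(out)


message = b"scan-042"            # hypothetical payload
masked = xor_bytes(message, seed=1)
restored = xor_bytes(masked, seed=1)
```

Because XOR is its own inverse, the receiver recovers the data by regenerating the same keystream from the shared seed.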
Design of a Variational Multiscale Method for Turbulent Compressible Flows

NASA Technical Reports Server (NTRS)

Diosady, Laslo Tibor; Murman, Scott M.

2013-01-01

A spectral-element framework is presented for the simulation of subsonic compressible high-Reynolds-number flows. The focus of the work is maximizing the efficiency of the computational schemes to enable unsteady simulations with a large number of spatial and temporal degrees of freedom. A collocation scheme is combined with optimized computational kernels to provide a residual evaluation with computational cost independent of order of accuracy up to 16th order. The optimized residual routines are used to develop a low-memory implicit scheme based on a matrix-free Newton-Krylov method. A preconditioner based on the finite-difference diagonalized ADI scheme is developed which maintains the low memory of the matrix-free implicit solver, while providing improved convergence properties. Emphasis on low memory usage throughout the solver development is leveraged to implement a coupled space-time DG solver which may offer further efficiency gains through adaptivity in both space and time.
Chemistry-split techniques for viscous reactive blunt body flow computations

NASA Technical Reports Server (NTRS)

Li, C. P.

1987-01-01

The weak-coupling structure between the fluid and species equations has been exploited and resulted in three, closely related, time-iterative implicit techniques. While the primitive variables are solved in two separated groups and each by an Alternating Direction Implicit (ADI) factorization scheme, the rate-species Jacobian can be treated in either full or diagonal matrix form, or simply ignored. The latter two versions render the split technique to solving for species as scalar rather than vector variables. The solution is completed at the end of each iteration after determining temperature and pressure from the flow density, energy and species concentrations. Numerical experimentation has shown that the split scalar technique, using partial rate Jacobian, yields the best overall stability and consistency. Satisfactory viscous solutions were obtained for an ellipsoidal body of axis ratio 3:1 at Mach 35 and an angle of attack of 20 degrees.
A conservative implicit finite difference algorithm for the unsteady transonic full potential equation

NASA Technical Reports Server (NTRS)

Steger, J. L.; Caradonna, F. X.

1980-01-01

An implicit finite difference procedure is developed to solve the unsteady full potential equation in conservation law form. Computational efficiency is maintained by use of approximate factorization techniques. The numerical algorithm is first order in time and second order in space. A circulation model and difference equations are developed for lifting airfoils in unsteady flow; however, thin airfoil body boundary conditions have been used with stretching functions to simplify the development of the numerical algorithm.
Implicit treatment of diffusion terms in lower-upper algorithms

NASA Technical Reports Server (NTRS)

Shih, T. I.-P.; Steinthorsson, E.; Chyu, W. J.

1993-01-01

A method is presented which allows diffusion terms to be treated implicitly in the lower-upper (LU) algorithm (which is a commonly used method for solving 'compressible' Euler and Navier-Stokes equations) so that the algorithm's good stability properties will not be impaired. The new method generalizes the concept of LU factorization from that associated with the sign of eigenvalues to that associated with backward- and forward-difference operators without regard to eigenvalues. The method is verified in a turbulent boundary layer study.
Efficient Numerical Diagonalization of Hermitian 3 × 3 Matrices

NASA Astrophysics Data System (ADS)

Kopp, Joachim

A very common problem in science is the numerical diagonalization of symmetric or hermitian 3 × 3 matrices. Since standard "black box" packages may be too inefficient if the number of matrices is large, we study several alternatives. We consider optimized implementations of the Jacobi, QL, and Cuppen algorithms and compare them with an analytical method relying on Cardano's formula for the eigenvalues and on vector cross products for the eigenvectors. Jacobi is the most accurate, but also the slowest method, while QL and Cuppen are good general purpose algorithms. The analytical algorithm outperforms the others by more than a factor of 2, but becomes inaccurate or may even fail completely if the matrix entries differ greatly in magnitude. This can mostly be circumvented by using a hybrid method, which falls back to QL if conditions are such that the analytical calculation might become too inaccurate. For all algorithms, we give an overview of the underlying mathematical ideas, and present detailed benchmark results. C and Fortran implementations of our code are available for download from .
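As an illustration of the analytical route mentioned above, the sketch below computes the eigenvalues of a real symmetric 3 × 3 matrix from the trigonometric form of the cubic solution. It is not the authors' C or Fortran code: the function name is hypothetical, only the real symmetric case is covered, and eigenvectors are not computed.

```python
import math


def symmetric_3x3_eigenvalues(a):
    """Eigenvalues (descending) of a real symmetric 3x3 matrix via the closed-form cubic solution."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:
        # Already diagonal: the eigenvalues are the diagonal entries.
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0            # mean eigenvalue (trace / 3)
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # B = (A - q I) / p has eigenvalues in [-2, 2] and zero trace.
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))               # clamp against rounding error
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3                             # the trace fixes the third eigenvalue
    return [e1, e2, e3]


eigs = symmetric_3x3_eigenvalues([[2.0, 0.0, 0.0], [0.0, 3.0, 4.0], [0.0, 4.0, 9.0]])
```

The subtraction in the last eigenvalue is one place where accuracy can be lost when entries differ greatly in magnitude, which is consistent with the accuracy caveat in the abstract.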

Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units

USDA-ARS's Scientific Manuscript database

This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...
A splitting algorithm for the wavelet transform of cubic splines on a nonuniform grid

NASA Astrophysics Data System (ADS)

Sulaimanov, Z. M.; Shumilov, B. M.

2017-10-01

For cubic splines with nonuniform nodes, splitting with respect to the even and odd nodes is used to obtain a wavelet expansion algorithm in the form of the solution to a three-diagonal system of linear algebraic equations for the coefficients. Computations by hand are used to investigate the application of this algorithm for numerical differentiation. The results are illustrated by solving a prediction problem.
Diagonal dominance for the multivariable Nyquist array using function minimization

NASA Technical Reports Server (NTRS)

Leininger, G. G.

1977-01-01

A new technique for the design of multivariable control systems using the multivariable Nyquist array method was developed. A conjugate direction function minimization algorithm is utilized to achieve a diagonal dominant condition over the extended frequency range of the control system. The minimization is performed on the ratio of the moduli of the off-diagonal terms to the moduli of the diagonal terms of either the inverse or direct open loop transfer function matrix. Several new feedback design concepts were also developed, including: (1) dominance control parameters for each control loop; (2) compensator normalization to evaluate open loop conditions for alternative design configurations; and (3) an interaction index to determine the degree and type of system interaction when all feedback loops are closed simultaneously. This new design capability was implemented on an IBM 360/75 in a batch mode but can be easily adapted to an interactive computer facility. The method was applied to the Pratt and Whitney F100 turbofan engine.
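The quantity being minimized above, the ratio of off-diagonal moduli to the diagonal modulus, can be evaluated for one frequency point as follows. This is a minimal sketch under assumptions: row dominance is used, the function names and the sample matrix are hypothetical, and the conjugate direction minimization itself is not shown.

```python
def row_dominance_ratios(matrix):
    """For each row i, return sum_{j != i} |g_ij| / |g_ii| of a complex transfer matrix."""
    ratios = []
    for i, row in enumerate(matrix):
        off_diagonal = sum(abs(value) for j, value in enumerate(row) if j != i)
        ratios.append(off_diagonal / abs(row[i]))
    return ratios


def is_row_diagonally_dominant(matrix):
    """True when every row ratio is below one at this frequency point."""
    return all(ratio < 1.0 for ratio in row_dominance_ratios(matrix))


# Hypothetical 2x2 transfer matrix evaluated at a single frequency.
G = [[4 + 3j, 1 + 0j],
     [0.5j, 2 + 0j]]
ratios = row_dominance_ratios(G)
```

In a design loop, a function like this would be evaluated over the whole frequency range of interest, and the compensator parameters adjusted to push the largest ratio below one.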
Single-step methods for predicting orbital motion considering its periodic components

NASA Astrophysics Data System (ADS)

Lavrov, K. N.

1989-01-01

Modern numerical methods for integration of ordinary differential equations can provide accurate and universal solutions to celestial mechanics problems. The implicit single sequence algorithms of Everhart and multiple step computational schemes using a priori information on periodic components can be combined to construct implicit single sequence algorithms which combine their advantages. The construction and analysis of the properties of such algorithms are studied, utilizing trigonometric approximation of the solutions of differential equations containing periodic components. The algorithms require 10 percent more machine memory than the Everhart algorithms, but are twice as fast, and yield short term predictions valid for five to ten orbits with good accuracy and five to six times faster than algorithms using other methods.
A robust recognition and accurate locating method for circular coded diagonal target

NASA Astrophysics Data System (ADS)

Bao, Yunna; Shang, Yang; Sun, Xiaoliang; Zhou, Jiexin

2017-10-01

As a category of special control points which can be automatically identified, artificial coded targets have been widely developed in the fields of computer vision, photogrammetry, augmented reality, etc. In this paper, a new circular coded target designed by RockeTech technology Corp. Ltd is analyzed and studied, which is called circular coded diagonal target (CCDT). A novel detection and recognition method with good robustness is proposed in the paper, and implemented in Visual Studio. In this algorithm, firstly, the ellipse features of the center circle are used for rough positioning. Then, according to the characteristics of the center diagonal target, a circular frequency filter is designed to choose the correct center circle and eliminate non-target noise. The precise positioning of the coded target is done by the correlation coefficient fitting extreme value method. Finally, the coded target recognition is achieved by decoding the binary sequence in the outer ring of the extracted target. To test the proposed algorithm, this paper has carried out simulation experiments and real experiments. The results show that the CCDT recognition and accurate locating method proposed in this paper can robustly recognize and accurately locate the targets against complex and noisy backgrounds.
A Semi-Implicit, Three-Dimensional Model for Estuarine Circulation

USGS Publications Warehouse

Smith, Peter E.

2006-01-01

A semi-implicit, finite-difference method for the numerical solution of the three-dimensional equations for circulation in estuaries is presented and tested. The method uses a three-time-level, leapfrog-trapezoidal scheme that is essentially second-order accurate in the spatial and temporal numerical approximations. The three-time-level scheme is shown to be preferred over a two-time-level scheme, especially for problems with strong nonlinearities. The stability of the semi-implicit scheme is free from any time-step limitation related to the terms describing vertical diffusion and the propagation of the surface gravity waves. The scheme does not rely on any form of vertical/horizontal mode-splitting to treat the vertical diffusion implicitly. At each time step, the numerical method uses a double-sweep method to transform a large number of small tridiagonal equation systems and then uses the preconditioned conjugate-gradient method to solve a single, large, five-diagonal equation system for the water surface elevation. The governing equations for the multi-level scheme are prepared in a conservative form by integrating them over the height of each horizontal layer. The layer-integrated volumetric transports replace velocities as the dependent variables so that the depth-integrated continuity equation that is used in the solution for the water surface elevation is linear. Volumetric transports are computed explicitly from the momentum equations. The resulting method is mass conservative, efficient, and numerically accurate.
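The double-sweep treatment of tridiagonal systems mentioned above is commonly identified with the Thomas algorithm. The sketch below shows that generic algorithm, not the model's actual code: the function name and argument layout are assumptions, and no pivoting is done, so it presumes a well-conditioned (for example diagonally dominant) system.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with a forward sweep and a backward sweep.

    lower[i - 1] is the sub-diagonal entry in row i, upper[i] the super-diagonal
    entry in row i; both have length n - 1, while diag and rhs have length n.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0] if n > 1 else 0.0
    d_prime[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c_prime[i - 1]
        if i < n - 1:
            c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i - 1] * d_prime[i - 1]) / denom
    # Backward sweep: back-substitute from the last unknown.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


x = solve_tridiagonal([-1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0], [1.0, 0.0, 1.0])
```

The cost grows linearly with the number of unknowns, which is why solving many small tridiagonal systems per time step stays cheap compared with the single large five-diagonal solve.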
An Optimally Stable and Accurate Second-Order SSP Runge-Kutta IMEX Scheme for Atmospheric Applications

NASA Astrophysics Data System (ADS)

Rokhzadi, Arman; Mohammadian, Abdolmajid; Charron, Martin

2018-01-01

The objective of this paper is to develop an optimized implicit-explicit (IMEX) Runge-Kutta scheme for atmospheric applications focusing on stability and accuracy. Following the common terminology, the proposed method is called IMEX-SSP2(2,3,2), as it has second-order accuracy and is composed of diagonally implicit two-stage and explicit three-stage parts. This scheme enjoys the Strong Stability Preserving (SSP) property for both parts. This new scheme is applied to nonhydrostatic compressible Boussinesq equations in two different arrangements, including (i) semi-implicit and (ii) Horizontally Explicit-Vertically Implicit (HEVI) forms. The new scheme preserves the SSP property for larger regions of absolute monotonicity compared to the well-studied scheme in the same class. In addition, numerical tests confirm that the IMEX-SSP2(2,3,2) improves the maximum stable time step as well as the level of accuracy and computational cost compared to other schemes in the same class. It is demonstrated that the A-stability property as well as satisfying "second-stage order" and stiffly accurate conditions lead the proposed scheme to better performance than existing schemes for the applications examined herein.
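The implicit-explicit splitting behind schemes of this kind can be shown with the simplest member of the family, a first-order IMEX Euler step. This is a sketch of the general idea only, not IMEX-SSP2(2,3,2): the test equation y' = -k y + s(t), the parameter values, and the function names are assumptions chosen for illustration.

```python
def imex_euler(y0, dt, steps, stiff_rate, slow_term):
    """Integrate y' = -stiff_rate * y + slow_term(t), stiff part implicit, slow part explicit."""
    y, t = y0, 0.0
    for _ in range(steps):
        # (y_new - y) / dt = -stiff_rate * y_new + slow_term(t), solved for y_new
        y = (y + dt * slow_term(t)) / (1.0 + dt * stiff_rate)
        t += dt
    return y


def explicit_euler(y0, dt, steps, stiff_rate, slow_term):
    """Same equation with every term treated explicitly, for comparison."""
    y, t = y0, 0.0
    for _ in range(steps):
        y = y + dt * (slow_term(t) - stiff_rate * y)
        t += dt
    return y


# Time step far above the explicit stability limit of 2 / stiff_rate.
y_imex = imex_euler(1.0, 0.1, 20, 1000.0, lambda t: 1.0)
y_explicit = explicit_euler(1.0, 0.1, 20, 1000.0, lambda t: 1.0)
```

With these values the IMEX result settles at the steady state 1/1000, while the fully explicit update grows without bound, which is the time-step restriction that implicit treatment of the stiff terms removes.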
Kato expansion in quantum canonical perturbation theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nikolaev, Andrey, E-mail: Andrey.Nikolaev@rdtex.ru

2016-06-15

This work establishes a connection between canonical perturbation series in quantum mechanics and a Kato expansion for the resolvent of the Liouville superoperator. Our approach leads to an explicit expression for a generator of a block-diagonalizing Dyson’s ordered exponential in arbitrary perturbation order. Unitary intertwining of perturbed and unperturbed averaging superprojectors allows for a description of ambiguities in the generator and block-diagonalized Hamiltonian. We compare the efficiency of the corresponding computational algorithm with the efficiencies of the Van Vleck and Magnus methods for high perturbative orders.
A semi-implicit finite difference model for three-dimensional tidal circulation

USGS Publications Warehouse

Casulli, V.; Cheng, R.T.

1992-01-01

A semi-implicit finite difference formulation for the numerical solution of three-dimensional tidal circulation is presented. The governing equations are the three-dimensional Reynolds equations in which the pressure is assumed to be hydrostatic. A minimal degree of implicitness has been introduced in the finite difference formula so that in the absence of horizontal viscosity the resulting algorithm is unconditionally stable at a minimal computational cost. When only one vertical layer is specified this method reduces, as a particular case, to a semi-implicit scheme for the solutions of the corresponding two-dimensional shallow water equations. The resulting two- and three-dimensional algorithm is fast, accurate and mass conservative. This formulation includes the simulation of flooding and drying of tidal flats, and is fully vectorizable for an efficient implementation on modern vector computers.
Bearing Fault Diagnosis under Variable Speed Using Convolutional Neural Networks and the Stochastic Diagonal Levenberg-Marquardt Algorithm

PubMed Central

Tra, Viet; Kim, Jaeyoung; Kim, Jong-Myon

2017-01-01

This paper presents a novel method for diagnosing incipient bearing defects under variable operating speeds using convolutional neural networks (CNNs) trained via the stochastic diagonal Levenberg-Marquardt (S-DLM) algorithm. The CNNs utilize the spectral energy maps (SEMs) of the acoustic emission (AE) signals as inputs and automatically learn the optimal features, which yield the best discriminative models for diagnosing incipient bearing defects under variable operating speeds. The SEMs are two-dimensional maps that show the distribution of energy across different bands of the AE spectrum. It is hypothesized that the variation of a bearing’s speed would not alter the overall shape of the AE spectrum; rather, it may only scale and translate it. Thus, at different speeds, the same defect would yield SEMs that are scaled and shifted versions of each other. This hypothesis is confirmed by the experimental results, where CNNs trained using the S-DLM algorithm yield significantly better diagnostic performance under variable operating speeds compared to existing methods. In this work, the performance of different training algorithms is also evaluated to select the best training algorithm for the CNNs. The proposed method is used to diagnose both single and compound defects at six different operating speeds. PMID:29211025
An unconditionally stable staggered algorithm for transient finite element analysis of coupled thermoelastic problems

NASA Technical Reports Server (NTRS)

Farhat, C.; Park, K. C.; Dubois-Pelerin, Y.

1991-01-01

An unconditionally stable second order accurate implicit-implicit staggered procedure for the finite element solution of fully coupled thermoelasticity transient problems is proposed. The procedure is stabilized with a semi-algebraic augmentation technique. A comparative cost analysis reveals the superiority of the proposed computational strategy to other conventional staggered procedures. Numerical examples of one and two-dimensional thermomechanical coupled problems demonstrate the accuracy of the proposed numerical solution algorithm.
Extension of a streamwise upwind algorithm to a moving grid system

NASA Technical Reports Server (NTRS)

Obayashi, Shigeru; Goorjian, Peter M.; Guruswamy, Guru P.

1990-01-01

A new streamwise upwind algorithm was derived to compute unsteady flow fields with the use of a moving-grid system. The temporally nonconservative LU-ADI (lower-upper-factored, alternating-direction-implicit) method was applied for time marching computations. A comparison of the temporally nonconservative method with a time-conservative implicit upwind method indicates that the solutions are insensitive to the conservative properties of the implicit solvers when practical time steps are used. Using this new method, computations were made for an oscillating wing at a transonic Mach number. The computed results confirm that the present upwind scheme captures the shock motion better than the central-difference scheme based on the Beam-Warming algorithm. The new upwind option of the code allows larger time steps and thus is more efficient, even though it requires slightly more computational time per time step than the central-difference option.
Computational plasticity algorithm for particle dynamics simulations

NASA Astrophysics Data System (ADS)

Krabbenhoft, K.; Lyamin, A. V.; Vignes, C.

2018-01-01

The problem of particle dynamics simulation is interpreted in the framework of computational plasticity leading to an algorithm which is mathematically indistinguishable from the common implicit scheme widely used in the finite element analysis of elastoplastic boundary value problems. This algorithm provides somewhat of a unification of two particle methods, the discrete element method and the contact dynamics method, which usually are thought of as being quite disparate. In particular, it is shown that the former appears as the special case where the time stepping is explicit while the use of implicit time stepping leads to the kind of schemes usually labelled contact dynamics methods. The framing of particle dynamics simulation within computational plasticity paves the way for new approaches similar (or identical) to those frequently employed in nonlinear finite element analysis. These include mixed implicit-explicit time stepping, dynamic relaxation and domain decomposition schemes.
Overcoming Geometry-Induced Stiffness with IMplicit-Explicit (IMEX) Runge-Kutta Algorithms on Unstructured Grids with Applications to CEM, CFD, and CAA

NASA Technical Reports Server (NTRS)

Kanevsky, Alex

2004-01-01

My goal is to develop and implement efficient, accurate, and robust Implicit-Explicit Runge-Kutta (IMEX RK) methods [9] for overcoming geometry-induced stiffness with applications to computational electromagnetics (CEM), computational fluid dynamics (CFD) and computational aeroacoustics (CAA). IMEX algorithms solve the non-stiff portions of the domain using explicit methods, and isolate and solve the more expensive stiff portions using implicit methods. Current algorithms in CEM can only simulate purely harmonic (up to 10 GHz plane wave) EM scattering by fighter aircraft, which are assumed to be pure metallic shells, and cannot handle the inclusion of coatings, penetration into and radiation out of the aircraft. Efficient IMEX RK methods could potentially increase current CEM capabilities by 1-2 orders of magnitude, allowing scientists and engineers to attack more challenging and realistic problems.
Multigrid Methods for Aerodynamic Problems in Complex Geometries

NASA Technical Reports Server (NTRS)

Caughey, David A.

1995-01-01

Work has been directed at the development of efficient multigrid methods for the solution of aerodynamic problems involving complex geometries, including the development of computational methods for the solution of both inviscid and viscous transonic flow problems. The emphasis is on problems of complex, three-dimensional geometry. The methods developed are based upon finite-volume approximations to both the Euler and the Reynolds-Averaged Navier-Stokes equations. The methods are developed for use on multi-block grids using diagonalized implicit multigrid methods to achieve computational efficiency. The work is focused upon aerodynamic problems involving complex geometries, including advanced engine inlets.
Algorithm For Hypersonic Flow In Chemical Equilibrium

NASA Technical Reports Server (NTRS)

Palmer, Grant

1989-01-01

Implicit, finite-difference, shock-capturing algorithm calculates inviscid, hypersonic flows in chemical equilibrium. Implicit formulation chosen because overcomes limitation on mathematical stability encountered in explicit formulations. For dynamical portion of problem, Euler equations written in conservation-law form in Cartesian coordinate system for two-dimensional or axisymmetric flow. For chemical portion of problem, equilibrium state of gas at each point in computational grid determined by minimizing local Gibbs free energy, subject to local conservation of molecules, atoms, ions, and total enthalpy. Major advantage: resulting algorithm naturally stable and captures strong shocks without help of artificial-dissipation terms to damp out spurious numerical oscillations.
Inductive reasoning and implicit memory: evidence from intact and impaired memory systems.

PubMed

Girelli, Luisa; Semenza, Carlo; Delazer, Margarete

2004-01-01

In this study, we modified a classic problem solving task, number series completion, in order to explore the contribution of implicit memory to inductive reasoning. Participants were required to complete number series sharing the same underlying algorithm (e.g., +2), differing in both constituent elements (e.g., 2, 4, 6, 8 versus 5, 7, 9, 11) and correct answers (e.g., 10 versus 13). In Experiment 1, reliable priming effects emerged, whether primes and targets were separated by four or ten fillers. Experiment 2 provided direct evidence that the observed facilitation arises at central stages of problem solving, namely the identification of the algorithm and its subsequent extrapolation. The observation of analogous priming effects in a severely amnesic patient strongly supports the hypothesis that the facilitation in number series completion was largely determined by implicit memory processes. These findings demonstrate that the influence of implicit processes extends to higher-level cognitive domains such as inductive reasoning.
Semi-implicit time integration of atmospheric flows with characteristic-based flux partitioning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ghosh, Debojyoti; Constantinescu, Emil M.

2016-06-23

This paper presents a characteristic-based flux partitioning for the semi-implicit time integration of atmospheric flows. Nonhydrostatic models require the solution of the compressible Euler equations. The acoustic time scale is significantly faster than the advective scale, yet it is typically not relevant to atmospheric and weather phenomena. The acoustic and advective components of the hyperbolic flux are separated in the characteristic space. High-order, conservative additive Runge-Kutta methods are applied to the partitioned equations so that the acoustic component is integrated in time implicitly with an unconditionally stable method, while the advective component is integrated explicitly. The time step of the overall algorithm is thus determined by the advective scale. Benchmark flow problems are used to demonstrate the accuracy, stability, and convergence of the proposed algorithm. The computational cost of the partitioned semi-implicit approach is compared with that of explicit time integration.
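The implicit/explicit split described in the abstract above can be illustrated on a scalar toy problem. The sketch below is not the paper's high-order additive Runge-Kutta scheme; it is a first-order IMEX Euler step applied to the assumed model equation dy/dt = -k*y + source, where the stiff term -k*y stands in for the fast (acoustic) component and the constant source stands in for the slow (advective) component. All names (imex_euler, explicit_euler, k, source) are hypothetical.

```python
def imex_euler(y0, k, source, dt, steps):
    """First-order IMEX Euler: stiff decay -k*y implicit, source explicit."""
    y = y0
    for _ in range(steps):
        # Solve y_new = y + dt * (source - k * y_new) for y_new.
        y = (y + dt * source) / (1.0 + dt * k)
    return y


def explicit_euler(y0, k, source, dt, steps):
    """Fully explicit Euler on the same equation, for comparison."""
    y = y0
    for _ in range(steps):
        y = y + dt * (source - k * y)
    return y


if __name__ == "__main__":
    k, source, dt = 1000.0, 1.0, 0.1   # dt is far above the explicit limit 2/k
    print("steady state  :", source / k)
    print("IMEX Euler    :", imex_euler(1.0, k, source, dt, 50))
    print("explicit Euler:", explicit_euler(1.0, k, source, dt, 50))
```

With the time step chosen 50 times larger than the explicit stability limit 2/k, the IMEX update settles onto the steady state source/k, while the fully explicit update grows without bound. This mirrors the point made in the abstract that the step size is then set by the slow scale only.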
Performance of blind source separation algorithms for fMRI analysis using a group ICA method.

PubMed

Correa, Nicolle; Adali, Tülay; Calhoun, Vince D

2007-06-01

Independent component analysis (ICA) is a popular blind source separation technique that has proven to be promising for the analysis of functional magnetic resonance imaging (fMRI) data. A number of ICA approaches have been used for fMRI data analysis, and even more ICA algorithms exist; however, the impact of using different algorithms on the results is largely unexplored. In this paper, we study the performance of four major classes of algorithms for spatial ICA, namely, information maximization, maximization of non-Gaussianity, joint diagonalization of cross-cumulant matrices and second-order correlation-based methods, when they are applied to fMRI data from subjects performing a visuo-motor task. We use a group ICA method to study variability among different ICA algorithms, and we propose several analysis techniques to evaluate their performance. We compare how different ICA algorithms estimate activations in expected neuronal areas. The results demonstrate that the ICA algorithms using higher-order statistical information prove to be quite consistent for fMRI data analysis. Infomax, FastICA and joint approximate diagonalization of eigenmatrices (JADE) all yield reliable results, with each having its strengths in specific areas. Eigenvalue decomposition (EVD), an algorithm using second-order statistics, does not perform reliably for fMRI data. Additionally, for iterative ICA algorithms, it is important to investigate the variability of estimates from different runs. We test the consistency of the iterative algorithms Infomax and FastICA by running the algorithm a number of times with different initializations, and we note that they yield consistent results over these multiple runs. Our results greatly improve our confidence in the consistency of ICA for fMRI data analysis.
A scalable, fully implicit algorithm for the reduced two-field low-β extended MHD model

DOE PAGES

Chacon, Luis; Stanier, Adam John

2016-12-01

Here, we demonstrate a scalable fully implicit algorithm for the two-field low-β extended MHD model. This reduced model describes plasma behavior in the presence of strong guide fields, and is of significant practical impact both in nature and in laboratory plasmas. The model displays strong hyperbolic behavior, as manifested by the presence of fast dispersive waves, which make a fully implicit treatment very challenging. In this study, we employ a Jacobian-free Newton–Krylov nonlinear solver, for which we propose a physics-based preconditioner that renders the linearized set of equations suitable for inversion with multigrid methods. As a result, the algorithm is shown to scale both algorithmically (i.e., the iteration count is insensitive to grid refinement and timestep size) and in parallel in a weak-scaling sense, with the wall-clock time scaling weakly with the number of cores for up to 4096 cores. For a 4096 × 4096 mesh, we demonstrate a wall-clock-time speedup of ~6700 with respect to explicit algorithms. The model is validated linearly (against linear theory predictions) and nonlinearly (against fully kinetic simulations), demonstrating excellent agreement.

A multi-dimensional, energy- and charge-conserving, nonlinearly implicit, electromagnetic Vlasov–Darwin particle-in-cell algorithm

DOE PAGES

Chen, G.; Chacón, L.

2015-08-11

For decades, the Vlasov–Darwin model has been recognized to be attractive for particle-in-cell (PIC) kinetic plasma simulations in non-radiative electromagnetic regimes, to avoid radiative noise issues and gain computational efficiency. However, the Darwin model results in an elliptic set of field equations that renders conventional explicit time integration unconditionally unstable. We explore a fully implicit PIC algorithm for the Vlasov–Darwin model in multiple dimensions, which overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. The finite-difference scheme for Darwin field equations and particle equations of motion is space–time-centered, employing particle sub-cycling and orbit-averaging. This algorithm conserves total energy, local charge, canonical-momentum in the ignorable direction, and preserves the Coulomb gauge exactly. An asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. Finally, we demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 2D–3V.
ITrace: An implicit trust inference method for trust-aware collaborative filtering

NASA Astrophysics Data System (ADS)

He, Xu; Liu, Bin; Chen, Kejia

2018-04-01

The growth of Internet commerce has stimulated the use of collaborative filtering (CF) algorithms as recommender systems. A CF algorithm recommends items of interest to the target user by leveraging the votes given by other similar users. In a standard CF framework, it is assumed that the credibility of every voting user is exactly the same with respect to the target user. This assumption is not satisfied and thus may lead to misleading recommendations in many practical applications. A natural countermeasure is to design a trust-aware CF (TaCF) algorithm, which can take account of the difference in the credibilities of the voting users when performing CF. To this end, this paper presents a trust inference approach, which can predict the implicit trust of the target user on every voting user from a sparse explicit trust matrix. Then an improved CF algorithm termed iTrace is proposed, which takes advantage of both the explicit and the predicted implicit trust to provide recommendations with the CF framework. An empirical evaluation on a public dataset demonstrates that the proposed algorithm provides a significant improvement in recommendation quality in terms of mean absolute error.
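To make the idea of weighting votes by credibility concrete, the following sketch computes a trust-weighted average of the votes cast on an item. It is not the iTrace algorithm and does not perform the trust inference step the abstract describes; it only assumes that some trust value between the target user and each voting user is already available. The function name predict_rating, the dictionary layouts, and the sample data are all hypothetical.

```python
def predict_rating(target, item, ratings, trust):
    """Trust-weighted average of the votes other users gave to `item`.

    ratings: {user: {item: vote}}
    trust:   {(target_user, voting_user): weight >= 0}
    Returns None when no trusted user has voted on the item.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for user, votes in ratings.items():
        if user == target or item not in votes:
            continue
        weight = trust.get((target, user), 0.0)
        weighted_sum += weight * votes[item]
        total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight


if __name__ == "__main__":
    ratings = {"a": {}, "b": {"x": 4.0}, "c": {"x": 2.0}}
    equal_trust = {("a", "b"): 0.5, ("a", "c"): 0.5}
    skewed_trust = {("a", "b"): 0.75, ("a", "c"): 0.25}
    print(predict_rating("a", "x", ratings, equal_trust))
    print(predict_rating("a", "x", ratings, skewed_trust))
```

With equal weights the prediction is the plain mean of the votes, which corresponds to the standard assumption that every voting user is equally credible; unequal weights pull the prediction toward the more trusted user's vote.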
Directional Agglomeration Multigrid Techniques for High-Reynolds Number Viscous Flows

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

1998-01-01

A preconditioned directional-implicit agglomeration algorithm is developed for solving two- and three-dimensional viscous flows on highly anisotropic unstructured meshes of mixed-element types. The multigrid smoother consists of a pre-conditioned point- or line-implicit solver which operates on lines constructed in the unstructured mesh using a weighted graph algorithm. Directional coarsening or agglomeration is achieved using a similar weighted graph algorithm. A tight coupling of the line construction and directional agglomeration algorithms enables the use of aggressive coarsening ratios in the multigrid algorithm, which in turn reduces the cost of a multigrid cycle. Convergence rates which are independent of the degree of grid stretching are demonstrated in both two and three dimensions. Further improvement of the three-dimensional convergence rates through a GMRES technique is also demonstrated.
Reducing Memory Cost of Exact Diagonalization using Singular Value Decomposition

NASA Astrophysics Data System (ADS)

Weinstein, Marvin; Chandra, Ravi; Auerbach, Assa

2012-02-01

We present a modified Lanczos algorithm to diagonalize lattice Hamiltonians with dramatically reduced memory requirements. In contrast to variational approaches and most implementations of DMRG, Lanczos rotations towards the ground state do not involve incremental minimizations (e.g., sweeping procedures), which may get stuck in false local minima. The lattice of size N is partitioned into two subclusters. At each iteration the rotating Lanczos vector is compressed into two sets of n_svd small subcluster vectors using singular value decomposition. For low entanglement entropy S_ee (satisfied by short-range Hamiltonians), the truncation error is bounded by exp(-n_svd^(1/S_ee)). Convergence is tested for the Heisenberg model on Kagomé clusters of 24, 30 and 36 sites, with no lattice symmetries exploited, using less than 15 GB of dynamical memory. Generalization of the Lanczos-SVD algorithm to multiple partitioning is discussed, and comparisons to other techniques are given. Reference: arXiv:1105.0007
Transonic Navier-Stokes wing solution using a zonal approach. Part 1: Solution methodology and code validation

NASA Technical Reports Server (NTRS)

Flores, J.; Gundy, K.

1986-01-01

A fast diagonalized Beam-Warming algorithm is coupled with a zonal approach to solve the three-dimensional Euler/Navier-Stokes equations. The computer code, called Transonic Navier-Stokes (TNS), uses a total of four zones for wing configurations (or can be extended to complete aircraft configurations by adding zones). In the inner blocks near the wing surface, the thin-layer Navier-Stokes equations are solved, while in the outer two blocks the Euler equations are solved. The diagonal algorithm yields a speedup of as much as a factor of 40 over the original algorithm/zonal method code. The TNS code, in addition, has the capability to model wind tunnel walls. Transonic viscous solutions are obtained on a 150,000-point mesh for a NACA 0012 wing. A three-order-of-magnitude drop in the L2-norm of the residual requires approximately 500 iterations, which takes about 45 min of CPU time on a Cray-XMP processor. Simulations are also conducted for a different geometrical wing called WING C. All cases show good agreement with experimental data.
Time-asymptotic solutions of the Navier-Stokes equation for free shear flows using an alternating-direction implicit method

NASA Technical Reports Server (NTRS)

Rudy, D. H.; Morris, D. J.

1976-01-01

An uncoupled time-asymptotic alternating-direction implicit method for solving the Navier-Stokes equations was tested on two laminar parallel mixing flows. A constant total temperature was assumed in order to eliminate the need to solve the full energy equation; consequently, static temperature was evaluated by using an algebraic relationship. For the mixing of two supersonic streams at a Reynolds number of 1,000, convergent solutions were obtained for a time step 5 times the maximum allowable size for an explicit method. The solution diverged for a time step 10 times the explicit limit. Improved convergence was obtained when upwind differencing was used for convective terms. Larger time steps were not possible with either upwind differencing or the diagonally dominant scheme. Artificial viscosity was added to the continuity equation in order to eliminate divergence for the mixing of a subsonic stream with a supersonic stream at a Reynolds number of 1,000.
A multigrid nonoscillatory method for computing high speed flows

NASA Technical Reports Server (NTRS)

Li, C. P.; Shieh, T. H.

1993-01-01

A multigrid method using different smoothers has been developed to solve the Euler equations discretized by a nonoscillatory scheme up to fourth-order accuracy. The best smoothing property is provided by a five-stage Runge-Kutta technique with optimized coefficients, yet the most efficient smoother is a backward Euler technique in factored and diagonalized form. The single-grid solution for a hypersonic, viscous conic flow is in excellent agreement with the solution obtained by the third-order MUSCL and Roe's method. Mach 8 inviscid flow computations for a complete entry probe have shown that the accuracy is at least as good as the symmetric TVD scheme of Yee and Harten. The implicit multigrid method is four times more efficient than the explicit multigrid technique and 3.5 times faster than the single-grid implicit technique. For a Mach 8.7 inviscid flow over a blunt delta wing at 30 deg incidence, the CPU reduction factor from the three-level multigrid computation is 2.2 on a grid of 37 x 41 x 73 nodes.
State-of-charge estimation in lithium-ion batteries: A particle filter approach

NASA Astrophysics Data System (ADS)

Tulsyan, Aditya; Tsai, Yiting; Gopaluni, R. Bhushan; Braatz, Richard D.

2016-11-01

The dynamics of lithium-ion batteries are complex and are often approximated by models consisting of partial differential equations (PDEs) relating the internal ionic concentrations and potentials. The Pseudo two-dimensional model (P2D) is one model that performs sufficiently accurately under various operating conditions and battery chemistries. Despite its widespread use for prediction, this model is too complex for standard estimation and control applications. This article presents an original algorithm for state-of-charge estimation using the P2D model. Partial differential equations are discretized using implicit stable algorithms and reformulated into a nonlinear state-space model. This discrete, high-dimensional model (consisting of tens to hundreds of states) contains implicit, nonlinear algebraic equations. The uncertainty in the model is characterized by additive Gaussian noise. By exploiting the special structure of the pseudo two-dimensional model, a novel particle filter algorithm that sweeps in time and spatial coordinates independently is developed. This algorithm circumvents the degeneracy problems associated with high-dimensional state estimation and avoids the repetitive solution of implicit equations by defining a 'tether' particle. The approach is illustrated through extensive simulations.
Advancing parabolic operators in thermodynamic MHD models: Explicit super time-stepping versus implicit schemes with Krylov solvers

NASA Astrophysics Data System (ADS)

Caplan, R. M.; Mikić, Z.; Linker, J. A.; Lionello, R.

2017-05-01

We explore the performance and advantages/disadvantages of using unconditionally stable explicit super time-stepping (STS) algorithms versus implicit schemes with Krylov solvers for integrating parabolic operators in thermodynamic MHD models of the solar corona. Specifically, we compare the second-order Runge-Kutta Legendre (RKL2) STS method with the implicit backward Euler scheme computed using the preconditioned conjugate gradient (PCG) solver with both a point-Jacobi and a non-overlapping domain decomposition ILU0 preconditioner. The algorithms are used to integrate anisotropic Spitzer thermal conduction and artificial kinematic viscosity at time-steps much larger than classic explicit stability criteria allow. A key component of the comparison is the use of an established MHD model (MAS) to compute a real-world simulation on a large HPC cluster. Special attention is placed on the parallel scaling of the algorithms. It is shown that, for a specific problem and model, the RKL2 method is comparable to or surpasses the implicit method with PCG solvers in performance and scaling, but suffers from some accuracy limitations. These limitations, and the applicability of RKL methods, are briefly discussed.
A Comparison between the Decimated Padé Approximant and Decimated Signal Diagonalization Methods for Leak Detection in Pipelines Equipped with Pressure Sensors.

PubMed

Lay-Ekuakille, Aimé; Fabbiano, Laura; Vacca, Gaetano; Kitoko, Joël Kidiamboko; Kulapa, Patrice Bibala; Telesca, Vito

2018-06-04

Pipelines conveying fluids are considered strategic infrastructures to be protected and maintained. They generally serve for transportation of important fluids such as drinkable water, waste water, oil, gas, chemicals, etc. Monitoring and continuous testing, especially on-line, are necessary to assess the condition of pipelines. The paper presents findings related to a comparison between two spectral response algorithms based on the decimated signal diagonalization (DSD) and decimated Padé approximant (DPA) techniques that allow one to process signals delivered by pressure sensors mounted on an experimental pipeline.
Convergence to Diagonal Form of Block Jacobi-type Processes

NASA Astrophysics Data System (ADS)

Hari, Vjeran

2008-09-01

The main result of recent research on convergence to diagonal form of block Jacobi-type processes is presented. For this purpose, all notions needed to describe the result are introduced. In particular, elementary block transformation matrices, simple and non-simple algorithms, block pivot strategies together with the appropriate equivalence relations are defined. The general block Jacobi-type process considered here can be specialized to take the form of almost any known Jacobi-type method for solving the ordinary or the generalized matrix eigenvalue and singular value problems. The assumptions used in the result are satisfied by many concrete methods.
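As a point of reference for the abstract above, the sketch below shows the simplest member of the Jacobi family: the classical cyclic Jacobi method for a real symmetric matrix, in which every "block" is a single entry and each step applies one plane rotation. It does not implement the general block process or the pivot strategies analyzed in the paper; the function name jacobi_eigenvalues and the row-cyclic sweep order are assumptions chosen for illustration.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]          # work on a copy
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:                        # matrix is numerically diagonal
            break
        for p in range(n - 1):               # row-cyclic pivot strategy
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that annihilates the pivot entry a[p][q].
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):           # A <- A J  (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):           # A <- J^T A  (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


if __name__ == "__main__":
    print(jacobi_eigenvalues([[2.0, 1.0], [1.0, 2.0]]))
    print(jacobi_eigenvalues([[2.0, 0.0, 0.0], [0.0, 3.0, 4.0], [0.0, 4.0, 9.0]]))
```

Each rotation is an orthogonal similarity transformation, so the eigenvalues are preserved while the off-diagonal norm decreases; the diagonal entries are read off once that norm falls below the tolerance.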
Application of the Yoshida-Ruth Techniques to Implicit Integration and Multi-Map Explicit Integration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Forest, E.; Bengtsson, J.; Reusch, M.F.

1991-04-01

The full power of Yoshida's technique is exploited to produce an arbitrary order implicit symplectic integrator and multi-map explicit integrator. This implicit integrator uses a characteristic function involving the force term alone. Also we point out the usefulness of the plain Ruth algorithm in computing Taylor series map using the techniques first introduced by Berz in his 'COSY-INFINITY' code.
Use of the preconditioned conjugate gradient algorithm as a generic solver for mixed-model equations in animal breeding applications.

PubMed

Tsuruta, S; Misztal, I; Strandén, I

2001-05-01

Utility of the preconditioned conjugate gradient algorithm with a diagonal preconditioner for solving mixed-model equations in animal breeding applications was evaluated with 16 test problems. The problems included single- and multiple-trait analyses, with data on beef, dairy, and swine ranging from small examples to national data sets. Multiple-trait models considered low and high genetic correlations. Convergence was based on relative differences between left- and right-hand sides. The ordering of equations was fixed effects followed by random effects, with no special ordering within random effects. The preconditioned conjugate gradient program implemented with double precision converged for all models. However, when implemented in single precision, the preconditioned conjugate gradient algorithm did not converge for seven large models. The preconditioned conjugate gradient and successive overrelaxation algorithms were subsequently compared for 13 of the test problems. The preconditioned conjugate gradient algorithm was easy to implement with the iteration on data for general models. However, successive overrelaxation requires specific programming for each set of models. On average, the preconditioned conjugate gradient algorithm converged in three times fewer rounds of iteration than successive overrelaxation. With straightforward implementations, programs using the preconditioned conjugate gradient algorithm may be two or more times faster than those using successive overrelaxation. However, programs using the preconditioned conjugate gradient algorithm would use more memory than would comparable implementations using successive overrelaxation. Extensive optimization of either algorithm can influence rankings. 
The preconditioned conjugate gradient implemented with iteration on data, a diagonal preconditioner, and in double precision may be the algorithm of choice for solving mixed-model equations when sufficient memory is available and ease of implementation is essential.
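A minimal sketch of conjugate gradients with a diagonal (Jacobi) preconditioner, the combination evaluated in the abstract above, is given here for a small dense symmetric positive definite system. It is not the authors' program: it stores the matrix explicitly rather than iterating on data, and its stopping rule (relative residual norm) is an assumption that may differ from the convergence criterion used in the study. The function name pcg_diagonal and the test matrix are hypothetical.

```python
def pcg_diagonal(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b (A symmetric positive definite) by Jacobi-preconditioned CG.

    Returns the solution and the number of iterations performed.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]    # diagonal preconditioner
    x = [0.0] * n
    r = list(b)                                     # residual for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]      # preconditioned residual
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    b_norm = sum(v * v for v in b) ** 0.5 or 1.0
    for iteration in range(1, max_iter + 1):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if sum(v * v for v in r) ** 0.5 / b_norm < tol:
            return x, iteration
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    b = [1.0, 2.0, 3.0]
    solution, iterations = pcg_diagonal(A, b)
    print(solution, iterations)
```

The only matrix information the preconditioner needs is the diagonal, which is part of what makes this variant easy to implement for general models; the cost is the extra work vectors (r, z, p, Ap) kept alongside the solution.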
Sensitivity of coronal loop sausage mode frequencies and decay rates to radial and longitudinal density inhomogeneities: a spectral approach

NASA Astrophysics Data System (ADS)

Cally, Paul S.; Xiong, Ming

2018-01-01

Fast sausage modes in solar magnetic coronal loops are only fully contained in unrealistically short dense loops. Otherwise they are leaky, losing energy to their surrounds as outgoing waves. This causes any oscillation to decay exponentially in time. Simultaneous observations of both period and decay rate therefore reveal the eigenfrequency of the observed mode, and potentially insight into the tubes’ nonuniform internal structure. In this article, a global spectral description of the oscillations is presented that results in an implicit matrix eigenvalue equation where the eigenvalues are associated predominantly with the diagonal terms of the matrix. The off-diagonal terms vanish identically if the tube is uniform. A linearized perturbation approach, applied with respect to a uniform reference model, is developed that makes the eigenvalues explicit. The implicit eigenvalue problem is easily solved numerically though, and it is shown that knowledge of the real and imaginary parts of the eigenfrequency is sufficient to determine the width and density contrast of a boundary layer over which the tubes’ enhanced internal densities drop to ambient values. Linearized density kernels are developed that show sensitivity only to the extreme outside of the loops for radial fundamental modes, especially for small density enhancements, with no sensitivity to the core. Higher radial harmonics do show some internal sensitivity, but these will be more difficult to observe. Only kink modes are sensitive to the tube centres. Variation in internal and external Alfvén speed along the loop is shown to have little effect on the fundamental dimensionless eigenfrequency, though the associated eigenfunction becomes more compact at the loop apex as stratification increases, or may even displace from the apex.
A charge- and energy-conserving implicit, electrostatic particle-in-cell algorithm on mapped computational meshes

NASA Astrophysics Data System (ADS)

Chacón, L.; Chen, G.; Barnes, D. C.

2013-01-01

We describe the extension of the recent charge- and energy-conserving one-dimensional electrostatic particle-in-cell algorithm in Ref. [G. Chen, L. Chacón, D.C. Barnes, An energy- and charge-conserving, implicit electrostatic particle-in-cell algorithm, Journal of Computational Physics 230 (2011) 7018-7036] to mapped (body-fitted) computational meshes. The approach maintains exact charge and energy conservation properties. Key to the algorithm is a hybrid push, where particle positions are updated in logical space, while velocities are updated in physical space. The effectiveness of the approach is demonstrated with a challenging numerical test case, the ion acoustic shock wave. The generalization of the approach to multiple dimensions is outlined.
Symmetry boost of the fidelity of Shor factoring

NASA Astrophysics Data System (ADS)

Nam, Y. S.; Blümel, R.

2018-05-01

In Shor's algorithm quantum subroutines occur with the structure F U F⁻¹, where F is a unitary transform and U is performing a quantum computation. Examples are quantum adders and subunits of quantum modulo adders. In this paper we show, both analytically and numerically, that if, in analogy to spin echoes, F and F⁻¹ can be implemented symmetrically when executing Shor's algorithm on actual, imperfect quantum hardware, such that F and F⁻¹ have the same hardware errors, a symmetry boost in the fidelity of the combined F U F⁻¹ quantum operation results when compared to the case in which the errors in F and F⁻¹ are independently random. Running the complete gate-by-gate implemented Shor algorithm, we show that the symmetry-induced fidelity boost can be as large as a factor 4. While most of our analytical and numerical results concern the case of over- and under-rotation of controlled rotation gates, in the numerically accessible case of Shor's algorithm with a small number of qubits, we show explicitly that the symmetry boost is robust with respect to more general types of errors. While, expectedly, additional error types reduce the symmetry boost, we show explicitly, by implementing general off-diagonal SU(N) errors (N = 2, 4, 8), that the boost factor scales like a Lorentzian in δ/σ, where σ and δ are the error strengths of the diagonal over- and under-rotation errors and the off-diagonal SU(N) errors, respectively. The Lorentzian shape also shows that, while the boost factor may become small with increasing δ, it declines slowly (essentially like a power law) and is never completely erased. We also investigate the effect of diagonal nonunitary errors, which, in analogy to unitary errors, reduce but never erase the symmetry boost. Going beyond the case of small quantum processors, we present analytical scaling results that show that the symmetry boost persists in the practically interesting case of a large number of qubits.
We illustrate this result explicitly for the case of Shor factoring of the semiprime RSA-1024, where, analytically, focusing on over- and underrotation errors, we obtain a boost factor of about 10. In addition, we provide a proof of the fidelity product formula, including its range of applicability.
Coordination Logic for Repulsive Resolution Maneuvers

NASA Technical Reports Server (NTRS)

Narkawicz, Anthony J.; Munoz, Cesar A.; Dutle, Aaron M.

2016-01-01

This paper presents an algorithm for determining the direction an aircraft should maneuver in the event of a potential conflict with another aircraft. The algorithm is implicitly coordinated, meaning that with perfectly reliable computations and information, it will independently provide directional information that is guaranteed to be coordinated without any additional information exchange or direct communication. The logic is inspired by the logic of TCAS II, the airborne system designed to reduce the risk of mid-air collisions between aircraft. TCAS II provides pilots with only vertical resolution advice, while the proposed algorithm, using a similar logic, provides implicitly coordinated vertical and horizontal directional advice.
A survey of implicit particle filters for data assimilation [Implicit particle filters for data assimilation]

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chorin, Alexandre J.; Morzfeld, Matthias; Tu, Xuemin

Implicit particle filters for data assimilation update the particles by first choosing probabilities and then looking for particle locations that assume them, guiding the particles one by one to the high probability domain. We provide a detailed description of these filters, with illustrative examples, together with new, more general, methods for solving the algebraic equations and with a new algorithm for parameter identification.
Investigation of flow-induced numerical instability in a mixed semi-implicit, implicit leapfrog time discretization

NASA Astrophysics Data System (ADS)

King, Jacob; Kruger, Scott

2017-10-01

Flow can impact the stability and nonlinear evolution of a range of instabilities (e.g. RWMs, NTMs, sawteeth, locked modes, PBMs, and high-k turbulence) and thus robust numerical algorithms for simulations with flow are essential. Recent simulations of DIII-D QH-mode [King et al., Phys. Plasmas and Nucl. Fus. 2017] with flow have been restricted to smaller time-step sizes than corresponding computations without flow. These computations use a mixed semi-implicit, implicit leapfrog time discretization as implemented in the NIMROD code [Sovinec et al., JCP 2004]. While prior analysis has shown that this algorithm is unconditionally stable with respect to the effect of large flows on the MHD waves in slab geometry [Sovinec et al., JCP 2010], our present von Neumann stability analysis shows that a flow-induced numerical instability may arise when ad-hoc cylindrical curvature is included. Computations with the NIMROD code in cylindrical geometry with rigid rotation and without free-energy drive from current or pressure gradients qualitatively confirm this analysis. We explore potential methods to circumvent this flow-induced numerical instability such as using a semi-Lagrangian formulation instead of time-centered implicit advection and/or modification to the semi-implicit operator. This work is supported by the DOE Office of Science (Office of Fusion Energy Sciences).

Implicit Kalman filtering

NASA Technical Reports Server (NTRS)

Skliar, M.; Ramirez, W. F.

1997-01-01

For an implicitly defined discrete system, a new algorithm for Kalman filtering is developed and an efficient numerical implementation scheme is proposed. Unlike the traditional explicit approach, the implicit filter can be readily applied to ill-conditioned systems and allows for generalization to descriptor systems. The implementation of the implicit filter depends on the solution of the congruence matrix equation $A_1 P_x A_1^T = P_y$. We develop a general iterative method for the solution of this equation, and prove necessary and sufficient conditions for convergence. It is shown that when the system matrices of an implicit system are sparse, the implicit Kalman filter requires significantly less computer time and storage to implement as compared to the traditional explicit Kalman filter. Simulation results are presented to illustrate and substantiate the theoretical developments.
Analysis of superimposed ultrasonic guided waves in long bones by the joint approximate diagonalization of eigen-matrices algorithm.

PubMed

Song, Xiaojun; Ta, Dean; Wang, Weiqi

2011-10-01

The parameters of ultrasonic guided waves (GWs) are very sensitive to mechanical and structural changes in long cortical bones. However, it is a challenge to obtain the group velocity and other parameters of GWs because of the presence of mixed multiple modes. This paper proposes a blind identification algorithm using the joint approximate diagonalization of eigen-matrices (JADE) and applies it to the separation of superimposed GWs in long bones. For the simulation case, the velocity of the single mode was calculated after separation. A strong agreement was obtained between the estimated velocity and the theoretical expectation. For the experiments in bovine long bones, by using the calculated velocity and a theoretical model, the cortical thickness (CTh) was obtained. For comparison with the JADE approach, an adaptive Gaussian chirplet time-frequency (ACGTF) method was also used to estimate the CTh. The results showed that the mean error of the CTh acquired by the JADE approach was 4.3%, which was smaller than that of the ACGTF method (13.6%). This suggested that the JADE algorithm may be used to separate the superimposed GWs and that the JADE algorithm could potentially be used to evaluate long bones. Copyright © 2011 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
A new solution method for wheel/rail rolling contact.

PubMed

Yang, Jian; Song, Hua; Fu, Lihua; Wang, Meng; Li, Wei

2016-01-01

To solve the problem of wheel/rail rolling contact of nonlinear steady-state curving, a three-dimensional transient finite element (FE) model is developed with the explicit software ANSYS/LS-DYNA. To improve the solving speed and efficiency, an explicit-explicit order solution method is put forward based on analysis of the features of implicit and explicit algorithms. The solution method was first applied to calculate the pre-loading of wheel/rail rolling contact with the explicit algorithm, and then the results became the initial conditions in solving the dynamic process of wheel/rail rolling contact, again with the explicit algorithm. For comparison, the common implicit-explicit order solution method is also used to solve the FE model. Results show that the explicit-explicit order solution method has faster operation speed and higher efficiency than the implicit-explicit order solution method while the solution accuracy is almost the same. Hence, the explicit-explicit order solution method is more suitable for the wheel/rail rolling contact model with large scale and high nonlinearity.
Advanced particle-in-cell simulation techniques for modeling the Lockheed Martin Compact Fusion Reactor

NASA Astrophysics Data System (ADS)

Welch, Dale; Font, Gabriel; Mitchell, Robert; Rose, David

2017-10-01

We report on particle-in-cell developments of the study of the Compact Fusion Reactor. Millisecond, two- and three-dimensional simulations (cubic meter volume) of confinement and neutral beam heating of the magnetic confinement device require accurate representation of the complex orbits, near perfect energy conservation, and significant computational power. In order to determine initial plasma fill and neutral beam heating, these simulations include ionization, elastic and charge exchange hydrogen reactions. To this end, we are pursuing fast electromagnetic kinetic modeling algorithms including two implicit techniques and a hybrid quasi-neutral algorithm with kinetic ions. The kinetic modeling includes use of the Poisson-corrected direct implicit, magnetic implicit, as well as second-order cloud-in-cell techniques. The hybrid algorithm, ignoring electron inertial effects, is two orders of magnitude faster than kinetic but not as accurate with respect to confinement. The advantages and disadvantages of these techniques will be presented. Funded by Lockheed Martin.
Comments on "Including the effects of temperature-dependent opacities in the implicit Monte Carlo algorithm" by N.A. Gentile [J. Comput. Phys. 230 (2011) 5100-5114]

NASA Astrophysics Data System (ADS)

Ghosh, Karabi

2017-02-01

We briefly comment on a paper by N.A. Gentile [J. Comput. Phys. 230 (2011) 5100-5114] in which the Fleck factor has been modified to include the effects of temperature-dependent opacities in the implicit Monte Carlo algorithm developed by Fleck and Cummings [1,2]. Instead of the Fleck factor, $f = 1/(1 + \beta c \Delta t \sigma_P)$, the author derived the modified Fleck factor $g = 1/(1 + \beta c \Delta t \sigma_P - \min[\sigma_P' (a T_r^4 - a T^4) c \Delta t / (\rho C_V), 0])$ to be used in the Implicit Monte Carlo (IMC) algorithm in order to obtain more accurate solutions with much larger time steps. Here $\beta = 4 a T^3 / (\rho C_V)$, $\sigma_P$ is the Planck opacity and the derivative of the Planck opacity w.r.t. the material temperature is $\sigma_P' = d\sigma_P / dT$.
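To make the two factors quoted above concrete, the following is a minimal sketch that evaluates them exactly as written in the abstract. The function and argument names are illustrative choices, not taken from the paper, and the numbers in the usage note are arbitrary values in consistent units rather than physical data.

```python
def beta_factor(a, T, rho, c_v):
    """beta = 4 a T^3 / (rho C_V), as defined in the abstract."""
    return 4.0 * a * T**3 / (rho * c_v)


def fleck_factor(beta, c, dt, sigma_p):
    """Standard Fleck factor f = 1 / (1 + beta c dt sigma_P)."""
    return 1.0 / (1.0 + beta * c * dt * sigma_p)


def modified_fleck_factor(beta, c, dt, sigma_p, dsigma_p_dT, a, T_r, T, rho, c_v):
    """Modified factor g as quoted in the abstract.

    The min(..., 0) term only acts when sigma_P' (a T_r^4 - a T^4) is negative;
    it then enlarges the denominator, so g <= f.
    """
    correction = min(dsigma_p_dT * (a * T_r**4 - a * T**4) * c * dt / (rho * c_v), 0.0)
    return 1.0 / (1.0 + beta * c * dt * sigma_p - correction)
```

With a = T = rho = C_V = c = 1, dt = 0.5 and sigma_P = 0.5, beta is 4 and f is 0.5. Setting sigma_P' = 0 gives g = f, while sigma_P' = -1 with T_r = 2 makes the correction term negative and gives g = 1/9.5.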
A parallel algorithm for the two-dimensional time fractional diffusion equation with implicit difference method.

PubMed

Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie

2014-01-01

It is very time consuming to solve fractional differential equations. The computational complexity of the two-dimensional time fractional diffusion equation (2D-TFDE) with an iterative implicit finite difference method is $O(M_x M_y N^2)$. In this paper, we present a parallel algorithm for the 2D-TFDE and give an in-depth discussion of this algorithm. A task distribution model and data layout with virtual boundary are designed for this parallel algorithm. The experimental results show that the parallel algorithm compares well with the exact solution. The parallel algorithm on a single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on a single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We expect that parallel computing will become a basic method for computationally intensive fractional applications in the near future.
An energy- and charge-conserving, nonlinearly implicit, electromagnetic 1D-3V Vlasov-Darwin particle-in-cell algorithm

NASA Astrophysics Data System (ADS)

Chen, G.; Chacón, L.

2014-10-01

A recent proof-of-principle study proposes a nonlinear electrostatic implicit particle-in-cell (PIC) algorithm in one dimension (Chen et al., 2011). The algorithm employs a kinetically enslaved Jacobian-free Newton-Krylov (JFNK) method, and conserves energy and charge to numerical round-off. In this study, we generalize the method to electromagnetic simulations in 1D using the Darwin approximation to Maxwell's equations, which avoids radiative noise issues by ordering out the light wave. An implicit, orbit-averaged, time-space-centered finite difference scheme is employed in both the 1D Darwin field equations (in potential form) and the 1D-3V particle orbit equations to produce a discrete system that remains exactly charge- and energy-conserving. Furthermore, enabled by the implicit Darwin equations, exact conservation of the canonical momentum per particle in any ignorable direction is enforced via a suitable scattering rule for the magnetic field. We have developed a simple preconditioner that targets electrostatic waves and skin currents, and allows us to employ time steps $O(\sqrt{m_i/m_e}\, c/v_{eT})$ larger than the explicit CFL. Several 1D numerical experiments demonstrate the accuracy, performance, and conservation properties of the algorithm. In particular, the scheme is shown to be second-order accurate, and CPU speedups of more than three orders of magnitude vs. an explicit Vlasov-Maxwell solver are demonstrated in the "cold" plasma regime (where $k\lambda_D \ll 1$).
An implicit numerical model for multicomponent compressible two-phase flow in porous media

NASA Astrophysics Data System (ADS)

Zidane, Ali; Firoozabadi, Abbas

2015-11-01

We introduce a new implicit approach to model multicomponent compressible two-phase flow in porous media with species transfer between the phases. In the implicit discretization of the species transport equation in our formulation we calculate for the first time the derivative of the molar concentration of component i in phase α ($c_{\alpha,i}$) with respect to the total molar concentration ($c_i$) under the conditions of a constant volume V and temperature T. The species transport equation is discretized by the finite volume (FV) method. The fluxes are calculated based on powerful features of the mixed finite element (MFE) method which provides the pressure at grid-cell interfaces in addition to the pressure at the grid-cell center. The efficiency of the proposed model is demonstrated by comparing our results with three existing implicit compositional models. Our algorithm has low numerical dispersion despite the fact it is based on first-order space discretization. The proposed algorithm is very robust.
Recovering hidden diagonal structures via non-negative matrix factorization with multiple constraints.

PubMed

Yang, Xi; Han, Guoqiang; Cai, Hongmin; Song, Yan

2017-03-31

Revealing data with intrinsically diagonal block structures is particularly useful for analyzing groups of highly correlated variables. Earlier research based on non-negative matrix factorization (NMF) has been shown to be effective in representing such data by decomposing the observed data into two factors, where one factor is considered to be the feature and the other the expansion loading from a linear algebra perspective. If the data are sampled from multiple independent subspaces, the loading factor would possess a diagonal structure under an ideal matrix decomposition. However, the standard NMF method and its variants have not been reported to exploit this type of data via direct estimation. To address this issue, a non-negative matrix factorization with multiple constraints model is proposed in this paper. The constraints include a sparsity norm on the feature matrix and a total variational norm on each column of the loading matrix. The proposed model is shown to be capable of efficiently recovering diagonal block structures hidden in observed samples. An efficient numerical algorithm using the alternating direction method of multipliers model is proposed for optimizing the new model. Compared with several benchmark models, the proposed method performs robustly and effectively for simulated and real biological data.
Multigrid treatment of implicit continuum diffusion

NASA Astrophysics Data System (ADS)

Francisquez, Manaure; Zhu, Ben; Rogers, Barrett

2017-10-01

Implicit treatment of diffusive terms of various differential orders common in continuum mechanics modeling, such as computational fluid dynamics, is investigated with spectral and multigrid algorithms in non-periodic 2D domains. In doubly periodic time dependent problems these terms can be efficiently and implicitly handled by spectral methods, but in non-periodic systems solved with distributed memory parallel computing and 2D domain decomposition, this efficiency is lost for large numbers of processors. We built and present here a multigrid algorithm for these types of problems which outperforms a spectral solution that employs the highly optimized FFTW library. This multigrid algorithm is not only suitable for high performance computing but may also be able to efficiently treat implicit diffusion of arbitrary order by introducing auxiliary equations of lower order. We test these solvers for fourth and sixth order diffusion with idealized harmonic test functions as well as a turbulent 2D magnetohydrodynamic simulation. It is also shown that an anisotropic operator without cross-terms can improve model accuracy and speed, and we examine the impact that the various diffusion operators have on the energy, the enstrophy, and the qualitative aspect of a simulation. This work was supported by DOE-SC-0010508. This research used resources of the National Energy Research Scientific Computing Center (NERSC).
Fully implicit adaptive mesh refinement MHD algorithm

NASA Astrophysics Data System (ADS)

Philip, Bobby

2005-10-01

In the macroscopic simulation of plasmas, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. The former results in stiffness due to the presence of very fast waves. The latter requires one to resolve the localized features that the system develops. Traditional approaches based on explicit time integration techniques and fixed meshes are not suitable for this challenge, as such approaches prevent the modeler from using realistic plasma parameters to keep the computation feasible. We propose here a novel approach, based on implicit methods and structured adaptive mesh refinement (SAMR). Our emphasis is on both accuracy and scalability with the number of degrees of freedom. To our knowledge, a scalable, fully implicit AMR algorithm has not been accomplished before for MHD. As a proof-of-principle, we focus on the reduced resistive MHD model as a basic MHD model paradigm, which is truly multiscale. The approach taken here is to adapt mature physics-based technology [L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002)] to AMR grids, and employ AMR-aware multilevel techniques (such as fast adaptive composite (FAC) algorithms) for scalability. We will demonstrate that the concept is indeed feasible, featuring optimal scalability under grid refinement. Results of fully-implicit, dynamically-adaptive AMR simulations will be presented on a variety of problems.
Breaking Megrelishvili protocol using matrix diagonalization

NASA Astrophysics Data System (ADS)

Arzaki, Muhammad; Triantoro Murdiansyah, Danang; Adi Prabowo, Satrio

2018-03-01

In this article we conduct a theoretical security analysis of Megrelishvili protocol—a linear algebra-based key agreement between two participants. We study the computational complexity of Megrelishvili vector-matrix problem (MVMP) as a mathematical problem that strongly relates to the security of Megrelishvili protocol. In particular, we investigate the asymptotic upper bounds for the running time and memory requirement of the MVMP that involves diagonalizable public matrix. Specifically, we devise a diagonalization method for solving the MVMP that is asymptotically faster than all of the previously existing algorithms. We also found an important counterintuitive result: the utilization of primitive matrix in Megrelishvili protocol makes the protocol more vulnerable to attacks.
Effective Methods for Solving Band SLEs after Parabolic Nonlinear PDEs

NASA Astrophysics Data System (ADS)

Veneva, Milena; Ayriyan, Alexander

2018-04-01

A class of models of heat transfer processes in a multilayer domain is considered. The governing equation is a nonlinear heat-transfer equation with different temperature-dependent densities and thermal coefficients in each layer. Homogeneous Neumann boundary conditions and ideal contact ones are applied. A finite difference scheme on a special uneven mesh with a second-order approximation in the case of a piecewise constant spatial step is built. This discretization leads to a pentadiagonal system of linear equations (SLE) with a matrix which is neither diagonally dominant, nor positive definite. Two different methods for solving such an SLE are developed - diagonal dominantization and symbolic algorithms.
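For orientation, a pentadiagonal system can be solved in O(n) operations by Gaussian elimination restricted to the band. The sketch below is a generic textbook elimination without pivoting, not the diagonal dominantization or symbolic algorithms of the abstract. Because it does not pivot, it can fail or lose accuracy on matrices that are not diagonally dominant, which is the difficulty those methods address. The dense list-of-rows storage is an assumption made for readability.

```python
def solve_pentadiagonal(A, b):
    """Solve A x = b for a pentadiagonal matrix A (list of rows), no pivoting.

    Only entries within two positions of the diagonal are touched, so the
    arithmetic cost is O(n) even though A is stored densely here.
    """
    n = len(b)
    A = [row[:] for row in A]   # work on copies
    b = b[:]
    # Forward elimination: each pivot row affects at most the two rows below it.
    for k in range(n - 1):
        if A[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot; a pivoting strategy would be needed")
        for i in range(k + 1, min(k + 3, n)):
            m = A[i][k] / A[k][k]
            for j in range(k, min(k + 3, n)):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    # Back substitution: the upper factor has two superdiagonals.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i]
        for j in range(i + 1, min(i + 3, n)):
            s -= A[i][j] * x[j]
        x[i] = s / A[i][i]
    return x
```

A quick check is to build a small pentadiagonal matrix, multiply it by a known vector, and confirm that the solver recovers that vector.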
Comparison of Conjugate Gradient Density Matrix Search and Chebyshev Expansion Methods for Avoiding Diagonalization in Large-Scale Electronic Structure Calculations

NASA Technical Reports Server (NTRS)

Bates, Kevin R.; Daniels, Andrew D.; Scuseria, Gustavo E.

1998-01-01

We report a comparison of two linear-scaling methods which avoid the diagonalization bottleneck of traditional electronic structure algorithms. The Chebyshev expansion method (CEM) is implemented for carbon tight-binding calculations of large systems and its memory and timing requirements compared to those of our previously implemented conjugate gradient density matrix search (CG-DMS). Benchmark calculations are carried out on icosahedral fullerenes from C60 to C8640 and the linear scaling memory and CPU requirements of the CEM demonstrated. We show that the CPU requisites of the CEM and CG-DMS are similar for calculations with comparable accuracy.
A deterministic global optimization using smooth diagonal auxiliary functions

NASA Astrophysics Data System (ADS)

Sergeyev, Yaroslav D.; Kvasov, Dmitri E.

2015-04-01

In many practical decision-making problems it happens that functions involved in optimization process are black-box with unknown analytical representations and hard to evaluate. In this paper, a global optimization problem is considered where both the goal function f (x) and its gradient f‧ (x) are black-box functions. It is supposed that f‧ (x) satisfies the Lipschitz condition over the search hyperinterval with an unknown Lipschitz constant K. A new deterministic 'Divide-the-Best' algorithm based on efficient diagonal partitions and smooth auxiliary functions is proposed in its basic version, its convergence conditions are studied and numerical experiments executed on eight hundred test functions are presented.
An Implicit Upwind Algorithm for Computing Turbulent Flows on Unstructured Grids

NASA Technical Reports Server (NTRS)

Anderson, W. Kyle; Bonhaus, Daryl L.

1994-01-01

An implicit, Navier-Stokes solution algorithm is presented for the computation of turbulent flow on unstructured grids. The inviscid fluxes are computed using an upwind algorithm and the solution is advanced in time using a backward-Euler time-stepping scheme. At each time step, the linear system of equations is approximately solved with a point-implicit relaxation scheme. This methodology provides a viable and robust algorithm for computing turbulent flows on unstructured meshes. Results are shown for subsonic flow over a NACA 0012 airfoil and for transonic flow over a RAE 2822 airfoil exhibiting a strong upper-surface shock. In addition, results are shown for 3 element and 4 element airfoil configurations. For the calculations, two one equation turbulence models are utilized. For the NACA 0012 airfoil, a pressure distribution and force data are compared with other computational results as well as with experiment. Comparisons of computed pressure distributions and velocity profiles with experimental data are shown for the RAE airfoil and for the 3 element configuration. For the 4 element case, comparisons of surface pressure distributions with experiment are made. In general, the agreement between the computations and the experiment is good.
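The time-stepping strategy described above (a backward-Euler step whose linear system is solved only approximately by a point-implicit relaxation) can be illustrated on a small linear model problem. The sketch below assumes the model equation du/dt = -A u, so each step requires (I + dt A) u_new = u_old. It uses Jacobi-type sweeps in which only the diagonal entry of each equation is treated implicitly. It is a scalar stand-in for the idea, not the flow solver of the abstract, and the function name and the model problem are hypothetical.

```python
def backward_euler_point_implicit(A, u_old, dt, sweeps=50):
    """One backward-Euler step for du/dt = -A u.

    Solves (I + dt*A) u_new = u_old approximately: in each sweep, every
    unknown is updated from its own diagonal coefficient while the
    off-diagonal couplings use values from the previous sweep.
    """
    n = len(u_old)
    u_new = u_old[:]                      # initial guess: previous time level
    for _ in range(sweeps):
        nxt = [0.0] * n
        for i in range(n):
            off_diag = sum(dt * A[i][j] * u_new[j] for j in range(n) if j != i)
            nxt[i] = (u_old[i] - off_diag) / (1.0 + dt * A[i][i])
        u_new = nxt
    return u_new
```

In practice a fixed, small number of sweeps per time step is often used, since the linear system only needs to be solved well enough for the outer iteration to converge.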
A fully implicit Hall MHD algorithm based on the ion Ohm's law

NASA Astrophysics Data System (ADS)

Chacón, Luis

2010-11-01

Hall MHD is characterized by extreme hyperbolic numerical stiffness stemming from fast dispersive waves. Implicit algorithms are potentially advantageous, but very difficult to implement efficiently due to the condition numbers of the associated matrices. Here, we explore the extension of a successful fully implicit, fully nonlinear algorithm for resistive MHD [L. Chacón, Phys. Plasmas, 15 (2008)], based on Jacobian-free Newton-Krylov methods with physics-based preconditioning, to Hall MHD. Traditionally, Hall MHD has been formulated using the electron equation of motion (EOM) to determine the electric field in the plasma (the so-called Ohm's law). However, given that the center-of-mass EOM, the ion EOM, and the electron EOM are linearly dependent, one could equivalently employ the ion EOM as the Ohm's law for a Hall MHD formulation. While, from a physical standpoint, there is no a priori advantage for using one Ohm's law vs. the other, we argue in this poster that there is an algorithmic one. We will show that, while the electron Ohm's law prevents the extension of the resistive MHD preconditioning strategy to Hall MHD, an ion Ohm's law allows it trivially. Verification and performance numerical results on relevant problems will be presented.
A fast algorithm for computer aided collimation gamma camera (CACAO)

NASA Astrophysics Data System (ADS)

Jeanguillaume, C.; Begot, S.; Quartuccio, M.; Douiri, A.; Franck, D.; Pihet, P.; Ballongue, P.

2000-08-01

The computer aided collimation gamma camera is aimed at breaking down the resolution-sensitivity trade-off of the conventional parallel hole collimator. It uses larger and longer holes, having an added linear movement at the acquisition sequence. A dedicated algorithm including shift and sum, deconvolution, parabolic filtering and rotation is described. Examples of reconstruction are given. This work shows that a simple and fast algorithm, based on a diagonally dominant approximation of the problem, can be derived. It gives a practical solution to the CACAO reconstruction problem.
Implicit assumptions underlying simple harvest models of marine bird populations can mislead environmental management decisions.

PubMed

O'Brien, Susan H; Cook, Aonghais S C P; Robinson, Robert A

2017-10-01

Assessing the potential impact of additional mortality from anthropogenic causes on animal populations requires detailed demographic information. However, these data are frequently lacking, making simple algorithms, which require little data, appealing. Because of their simplicity, these algorithms often rely on implicit assumptions, some of which may be quite restrictive. Potential Biological Removal (PBR) is a simple harvest model that estimates the number of additional mortalities that a population can theoretically sustain without causing population extinction. However, PBR relies on a number of implicit assumptions, particularly around density dependence and population trajectory that limit its applicability in many situations. Among several uses, it has been widely employed in Europe in Environmental Impact Assessments (EIA), to examine the acceptability of potential effects of offshore wind farms on marine bird populations. As a case study, we use PBR to estimate the number of additional mortalities that a population with characteristics typical of a seabird population can theoretically sustain. We incorporated this level of additional mortality within Leslie matrix models to test assumptions within the PBR algorithm about density dependence and current population trajectory. Our analyses suggest that the PBR algorithm identifies levels of mortality which cause population declines for most population trajectories and forms of population regulation. Consequently, we recommend that practitioners do not use PBR in an EIA context for offshore wind energy developments. Rather than using simple algorithms that rely on potentially invalid implicit assumptions, we recommend use of Leslie matrix models for assessing the impact of additional mortality on a population, enabling the user to explicitly define assumptions and test their importance. Copyright © 2017 Elsevier Ltd. All rights reserved.
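The contrast drawn above between PBR and Leslie matrix models can be sketched in a few lines. The abstract does not state the PBR formula. The code assumes its commonly cited form, PBR = 0.5 · R_max · N_min · F_r (maximum growth rate, minimum population estimate, recovery factor). The two-stage matrix, the rule of removing birds from the adult class only, and all numbers are hypothetical illustrations, not values from the study.

```python
def potential_biological_removal(n_min, r_max, recovery_factor):
    """PBR in its commonly cited form: 0.5 * R_max * N_min * F_r (assumed here)."""
    return 0.5 * r_max * n_min * recovery_factor


def project_population(leslie, n0, years, annual_removals=0.0):
    """Project stage abundances with a Leslie matrix, n_{t+1} = L n_t.

    After each yearly projection, `annual_removals` individuals are taken
    from the last (adult) stage; this removal rule is an assumption made
    for illustration only.
    """
    n = list(n0)
    for _ in range(years):
        n = [sum(a * x for a, x in zip(row, n)) for row in leslie]
        n[-1] = max(n[-1] - annual_removals, 0.0)
    return n
```

Running `project_population` with and without removals equal to the PBR value makes the assumptions explicit: the fecundity and survival entries of the matrix determine whether the population still grows once the extra mortality is applied.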
Arikan and Alamouti matrices based on fast block-wise inverse Jacket transform

NASA Astrophysics Data System (ADS)

Lee, Moon Ho; Khan, Md Hashem Ali; Kim, Kyeong Jin

2013-12-01

Recently, Lee and Hou (IEEE Signal Process Lett 13: 461-464, 2006) proposed one-dimensional and two-dimensional fast algorithms for block-wise inverse Jacket transforms (BIJTs). Their BIJTs are not real inverse Jacket transforms from a mathematical point of view because their inverses do not satisfy the usual condition, i.e., the multiplication of a matrix with its inverse matrix is not equal to the identity matrix. Therefore, we mathematically propose a fast block-wise inverse Jacket transform of orders $N = 2^k$, $3^k$, $5^k$, and $6^k$, where k is a positive integer. Based on the Kronecker product of the successive lower order Jacket matrices and the basis matrix, the fast algorithms for realizing these transforms are obtained. Due to the simple inverse and fast algorithms of Arikan polar binary and Alamouti multiple-input multiple-output (MIMO) non-binary matrices, which are obtained from BIJTs, they can be applied in areas such as 3GPP physical layer for ultra mobile broadband permutation matrices design, first-order q-ary Reed-Muller code design, diagonal channel design, diagonal subchannel decompose for interference alignment, and 4G MIMO long-term evolution Alamouti precoding design.
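The "usual condition" mentioned above can be checked numerically. A Jacket matrix J of order N has the property that its inverse is the transpose of its element-wise reciprocal divided by N, and the Kronecker product of two Jacket matrices is again a Jacket matrix. The sketch below verifies both facts for an order-2 Hadamard matrix and an order-4 center-weighted example with weight 2. It illustrates the defining property only and is not the fast block-wise algorithm of the abstract; the helper names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def jacket_inverse(j):
    """Inverse of a Jacket matrix: transpose of the element-wise reciprocal, over N."""
    n = len(j)
    return [[1.0 / (n * j[k][i]) for k in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


H2 = [[1.0, 1.0],
      [1.0, -1.0]]
W = 2.0  # center weight, an arbitrary nonzero choice
J4 = [[1.0, 1.0, 1.0, 1.0],
      [1.0, -W, W, -1.0],
      [1.0, W, -W, -1.0],
      [1.0, -1.0, -1.0, 1.0]]
J8 = kron(H2, J4)                           # order-8 Jacket matrix built from lower orders
identity_check = matmul(J8, jacket_inverse(J8))
```

Here `identity_check` should equal the 8×8 identity up to round-off, which is the condition the abstract says a true inverse Jacket transform must satisfy.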

An implicit higher-order spatially accurate scheme for solving time dependent flows on unstructured meshes

NASA Astrophysics Data System (ADS)

Tomaro, Robert F.

1998-07-01

The present research is aimed at developing a higher-order, spatially accurate scheme for both steady and unsteady flow simulations using unstructured meshes. The resulting scheme must work on a variety of general problems to ensure the creation of a flexible, reliable and accurate aerodynamic analysis tool. To calculate the flow around complex configurations, unstructured grids and the associated flow solvers have been developed. Efficient simulations require the minimum use of computer memory and computational times. Unstructured flow solvers typically require more computer memory than a structured flow solver due to the indirect addressing of the cells. The approach taken in the present research was to modify an existing three-dimensional unstructured flow solver to first decrease the computational time required for a solution and then to increase the spatial accuracy. The terms required to simulate flow involving non-stationary grids were also implemented. First, an implicit solution algorithm was implemented to replace the existing explicit procedure. Several test cases, including internal and external, inviscid and viscous, two-dimensional, three-dimensional and axi-symmetric problems, were simulated for comparison between the explicit and implicit solution procedures. The increased efficiency and robustness of modified code due to the implicit algorithm was demonstrated. Two unsteady test cases, a plunging airfoil and a wing undergoing bending and torsion, were simulated using the implicit algorithm modified to include the terms required for a moving and/or deforming grid. Secondly, a higher than second-order spatially accurate scheme was developed and implemented into the baseline code. Third- and fourth-order spatially accurate schemes were implemented and tested. The original dissipation was modified to include higher-order terms and modified near shock waves to limit pre- and post-shock oscillations. 
The unsteady cases were repeated using the higher-order spatially accurate code. The new solutions were compared with those obtained using the second-order spatially accurate scheme. Finally, the increased efficiency of using an implicit solution algorithm in a production Computational Fluid Dynamics flow solver was demonstrated for steady and unsteady flows. A third- and fourth-order spatially accurate scheme has been implemented creating a basis for a state-of-the-art aerodynamic analysis tool.
Numerical solutions of nonlinear STIFF initial value problems by perturbed functional iterations

NASA Technical Reports Server (NTRS)

Dey, S. K.

1982-01-01

Numerical solution of nonlinear stiff initial value problems by a perturbed functional iterative scheme is discussed. The algorithm does not fully linearize the system and requires only the diagonal terms of the Jacobian. Some examples related to chemical kinetics are presented.
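The idea of using only the diagonal terms of the Jacobian can be illustrated with a backward-Euler step solved by a diagonal Newton-type iteration. The sketch below is a generic nonlinear Jacobi scheme written in that spirit. It is not the perturbed functional iteration of the paper, whose details the abstract does not give. The two-species right-hand side is a made-up stiff example, not one of the paper's chemical kinetics cases.

```python
def backward_euler_diagonal(f, dfdy_diag, y_old, h, iterations=100):
    """Solve y - y_old - h f(y) = 0 using only the diagonal of df/dy.

    Each component is corrected by its own residual divided by the diagonal
    entry (1 - h * df_i/dy_i) of the backward-Euler Jacobian; off-diagonal
    couplings enter only through the residual.
    """
    y = list(y_old)
    for _ in range(iterations):
        fy = f(y)
        diag = dfdy_diag(y)
        y = [y[i] - (y[i] - y_old[i] - h * fy[i]) / (1.0 - h * diag[i])
             for i in range(len(y))]
    return y


def rhs(y):
    """Hypothetical stiff system: fast decay of y[0], slow nonlinear decay of y[1]."""
    return [-1000.0 * y[0] + y[1], y[0] - y[1] - y[1] ** 2]


def rhs_diag(y):
    """Diagonal entries of the Jacobian of rhs."""
    return [-1000.0, -1.0 - 2.0 * y[1]]
```

Calling `backward_euler_diagonal(rhs, rhs_diag, [1.0, 1.0], 0.1)` takes a step far larger than an explicit method would tolerate for the fast component. The iteration converges here because the off-diagonal coupling is weak relative to the diagonal; that is a property of this example, not a general guarantee.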
Measurement-induced nonlocality in arbitrary dimensions in terms of the inverse approximate joint diagonalization

NASA Astrophysics Data System (ADS)

Zhang, Li-qiang; Ma, Ting-ting; Yu, Chang-shui

2018-03-01

The computability of the quantifier of a given quantum resource is the essential challenge in the resource theory and the inevitable bottleneck for its application. Here we focus on the measurement-induced nonlocality and present a redefinition in terms of the skew information subject to a broken observable. It is shown that the obtained quantity possesses an obvious operational meaning, can tackle the noncontractivity of the measurement-induced nonlocality and has analytic expressions for pure states, (2 ⊗ d)-dimensional quantum states, and some particular high-dimensional quantum states. Most importantly, an inverse approximate joint diagonalization algorithm, due to its simplicity, high efficiency, stability, and state independence, is presented to provide almost-analytic expressions for any quantum state, which can also shed light on other aspects in physics. To illustrate applications as well as demonstrate the validity of the algorithm, we compare the analytic and numerical expressions of various examples and show their perfect consistency.
Implicit methods for the Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Yoon, S.; Kwak, D.

1990-01-01

Numerical solutions of the Navier-Stokes equations using explicit schemes can be obtained at the expense of efficiency. Conventional implicit methods which often achieve fast convergence rates suffer high cost per iteration. A new implicit scheme based on lower-upper factorization and symmetric Gauss-Seidel relaxation offers very low cost per iteration as well as fast convergence. High efficiency is achieved by accomplishing the complete vectorizability of the algorithm on oblique planes of sweep in three dimensions.
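The relaxation idea named in this abstract can be illustrated on a small linear system. The sketch below is only a generic symmetric Gauss-Seidel iteration (a forward sweep followed by a backward sweep), not the authors' LU-SGS scheme; the function name `sgs_solve` and the 3x3 test matrix are assumptions chosen for illustration.

```python
def sgs_solve(A, b, sweeps=50):
    """Approximately solve A x = b with symmetric Gauss-Seidel sweeps."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        # Forward sweep (lower-triangular pass), then backward sweep (upper pass).
        for order in (range(n), range(n - 1, -1, -1)):
            for i in order:
                s = sum(A[i][j] * x[j] for j in range(n) if j != i)
                x[i] = (b[i] - s) / A[i][i]
    return x


# Hypothetical diagonally dominant matrix standing in for an implicit operator.
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = sgs_solve(A, b)
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
print(x, residual)
```

Each sweep needs only the entries of one row at a time, so no matrix factorization is stored, which is the source of the low cost per iteration mentioned above.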
An implicit dispersive transport algorithm for the US Geological Survey MOC3D solute-transport model

USGS Publications Warehouse

Kipp, K.L.; Konikow, Leonard F.; Hornberger, G.Z.

1998-01-01

This report documents an extension to the U.S. Geological Survey MOC3D transport model that incorporates an implicit-in-time difference approximation for the dispersive transport equation, including source/sink terms. The original MOC3D transport model (Version 1) uses the method of characteristics to solve the transport equation on the basis of the velocity field. The original MOC3D solution algorithm incorporates particle tracking to represent advective processes and an explicit finite-difference formulation to calculate dispersive fluxes. The new implicit procedure eliminates several stability criteria required for the previous explicit formulation. This allows much larger transport time increments to be used in dispersion-dominated problems. The decoupling of advective and dispersive transport in MOC3D, however, is unchanged. With the implicit extension, the MOC3D model is upgraded to Version 2. A description of the numerical method of the implicit dispersion calculation, the data-input requirements and output options, and the results of simulator testing and evaluation are presented. Version 2 of MOC3D was evaluated for the same set of problems used for verification of Version 1. These test results indicate that the implicit calculation of Version 2 matches the accuracy of Version 1, yet is more efficient than the explicit calculation for transport problems that are characterized by a grid Peclet number less than about 1.0.
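The stability benefit of an implicit-in-time dispersion step can be seen in a one-dimensional toy problem. The sketch below is not MOC3D code; it assumes a uniform grid, zero concentration at both ends, and a dimensionless parameter `r = D*dt/dx**2`, and it compares a backward-Euler step (solved with the Thomas algorithm) against a forward-Euler step at a value of `r` ten times the explicit stability limit of 0.5.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are unused."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_step(u, r):
    """One backward-Euler step of u_t = D u_xx with u = 0 at both ends."""
    n = len(u)
    return thomas([-r] * n, [1.0 + 2.0 * r] * n, [-r] * n, u)


def explicit_step(u, r):
    """One forward-Euler step of the same problem (stable only for r <= 0.5)."""
    padded = [0.0] + u + [0.0]
    return [padded[i + 1] + r * (padded[i] - 2.0 * padded[i + 1] + padded[i + 2])
            for i in range(len(u))]


r = 5.0  # ten times the explicit limit
u_imp = [0.0] * 10 + [1.0] + [0.0] * 10  # initial concentration spike
u_exp = list(u_imp)
for _ in range(20):
    u_imp = implicit_step(u_imp, r)
    u_exp = explicit_step(u_exp, r)
print(max(abs(v) for v in u_imp), max(abs(v) for v in u_exp))
```

The implicit result stays bounded between 0 and 1, while the explicit result grows without bound, which is why larger transport time increments become usable.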
A mass, momentum, and energy conserving, fully implicit, scalable algorithm for the multi-dimensional, multi-species Rosenbluth-Fokker-Planck equation

NASA Astrophysics Data System (ADS)

Taitano, W. T.; Chacón, L.; Simakov, A. N.; Molvig, K.

2015-09-01

In this study, we demonstrate a fully implicit algorithm for the multi-species, multidimensional Rosenbluth-Fokker-Planck equation which is exactly mass-, momentum-, and energy-conserving, and which preserves positivity. Unlike most earlier studies, we base our development on the Rosenbluth (rather than Landau) form of the Fokker-Planck collision operator, which reduces complexity while allowing for an optimal fully implicit treatment. Our discrete conservation strategy employs nonlinear constraints that force the continuum symmetries of the collision operator to be satisfied upon discretization. We converge the resulting nonlinear system iteratively using Jacobian-free Newton-Krylov methods, effectively preconditioned with multigrid methods for efficiency. Single- and multi-species numerical examples demonstrate the advertised accuracy properties of the scheme, and the superior algorithmic performance of our approach. In particular, the discretization approach is numerically shown to be second-order accurate in time and velocity space and to exhibit manifestly positive entropy production. That is, H-theorem behavior is indicated for all the examples we have tested. The solution approach is demonstrated to scale optimally with respect to grid refinement (with CPU time growing linearly with the number of mesh points), and timestep (showing very weak dependence of CPU time with time-step size). As a result, the proposed algorithm delivers several orders-of-magnitude speedup vs. explicit algorithms.
Updates to Multi-Dimensional Flux Reconstruction for Hypersonic Simulations on Tetrahedral Grids

NASA Technical Reports Server (NTRS)

Gnoffo, Peter A.

2010-01-01

The quality of simulated hypersonic stagnation region heating with tetrahedral meshes is investigated by using an updated three-dimensional, upwind reconstruction algorithm for the inviscid flux vector. An earlier implementation of this algorithm provided improved symmetry characteristics on tetrahedral grids compared to conventional reconstruction methods. The original formulation however displayed quantitative differences in heating and shear that were as large as 25% compared to a benchmark, structured-grid solution. The primary cause of this discrepancy is found to be an inherent inconsistency in the formulation of the flux limiter. The inconsistency is removed by employing a Green-Gauss formulation of primitive gradients at nodes to replace the previous Gram-Schmidt algorithm. Current results are now in good agreement with benchmark solutions for two challenge problems: (1) hypersonic flow over a three-dimensional cylindrical section with special attention to the uniformity of the solution in the spanwise direction and (2) hypersonic flow over a three-dimensional sphere. The tetrahedral cells used in the simulation are derived from a structured grid where cell faces are bisected across the diagonal resulting in a consistent pattern of diagonals running in a biased direction across the otherwise symmetric domain. This grid is known to accentuate problems in both shock capturing and stagnation region heating encountered with conventional, quasi-one-dimensional inviscid flux reconstruction algorithms. Therefore the test problems provide a sensitive indicator for algorithmic effects on heating. Additional simulations on a sharp, double cone and the shuttle orbiter are then presented to demonstrate the capabilities of the new algorithm on more geometrically complex flows with tetrahedral grids. 
These results provide the first indication that pure tetrahedral elements utilizing the updated, three-dimensional, upwind reconstruction algorithm may be used for the simulation of heating and shear in hypersonic flows in upwind, finite volume formulations.
Implicit timing activates the left inferior parietal cortex.

PubMed

Wiener, Martin; Turkeltaub, Peter E; Coslett, H Branch

2010-11-01

Coull and Nobre (2008) suggested that tasks that employ temporal cues might be divided on the basis of whether these cues are explicitly or implicitly processed. Furthermore, they suggested that implicit timing preferentially engages the left cerebral hemisphere. We tested this hypothesis by conducting a quantitative meta-analysis of eleven neuroimaging studies of implicit timing using the activation-likelihood estimation (ALE) algorithm (Turkeltaub, Eden, Jones, & Zeffiro, 2002). Our analysis revealed a single but robust cluster of activation-likelihood in the left inferior parietal cortex (supramarginal gyrus). This result is in accord with the hypothesis that the left hemisphere subserves implicit timing mechanisms. Furthermore, in conjunction with a previously reported meta-analysis of explicit timing tasks, our data support the claim that implicit and explicit timing are supported by at least partially distinct neural structures. Copyright © 2010 Elsevier Ltd. All rights reserved.
An environment-dependent semi-empirical tight binding model suitable for electron transport in bulk metals, metal alloys, metallic interfaces, and metallic nanostructures. I. Model and validation

NASA Astrophysics Data System (ADS)

Hegde, Ganesh; Povolotskyi, Michael; Kubis, Tillmann; Boykin, Timothy; Klimeck, Gerhard

2014-03-01

Semi-empirical Tight Binding (TB) is known to be a scalable and accurate atomistic representation for electron transport for realistically extended nano-scaled semiconductor devices that might contain millions of atoms. In this paper, an environment-aware and transferable TB model suitable for electronic structure and transport simulations in technologically relevant metals, metallic alloys, metal nanostructures, and metallic interface systems is described. Part I of this paper describes the development and validation of the new TB model. The new model incorporates intra-atomic diagonal and off-diagonal elements for implicit self-consistency and greater transferability across bonding environments. The dependence of the on-site energies on strain has been obtained by appealing to the Moments Theorem that links closed electron paths in the system to energy moments of angular momentum resolved local density of states obtained ab initio. The model matches self-consistent density functional theory electronic structure results for bulk face centered cubic metals with and without strain, metallic alloys, metallic interfaces, and metallic nanostructures with high accuracy and can be used in predictive electronic structure and transport problems in metallic systems at realistically extended length scales.
On improving the iterative convergence properties of an implicit approximate-factorization finite difference algorithm. [considering transonic flow

NASA Technical Reports Server (NTRS)

Desideri, J. A.; Steger, J. L.; Tannehill, J. C.

1978-01-01

The iterative convergence properties of an approximate-factorization implicit finite-difference algorithm are analyzed both theoretically and numerically. Modifications to the base algorithm were made to remove the inconsistency in the original implementation of artificial dissipation. In this way, the steady-state solution became independent of the time-step, and much larger time-steps can be used stably. To accelerate the iterative convergence, large time-steps and a cyclic sequence of time-steps were used. For a model transonic flow problem governed by the Euler equations, convergence was achieved with 10 times fewer time-steps using the modified differencing scheme. A particular form of instability due to variable coefficients is also analyzed.
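For orientation, a generic two-dimensional approximate factorization in delta form can be written as follows. This is a textbook-style sketch, not the specific scheme analyzed in the report; the symbols A_x, A_y (one-dimensional implicit operators), R (steady-state residual) and ΔU^n (solution update) are assumed notation.

```latex
% Unfactored implicit step:
\left[ I + \Delta t\,(A_x + A_y) \right] \Delta U^n = -\Delta t\, R(U^n)

% Approximately factored step (two one-dimensional solves):
\left( I + \Delta t\, A_x \right)\left( I + \Delta t\, A_y \right) \Delta U^n = -\Delta t\, R(U^n)

% Expanding the product exposes the factorization error term:
\left( I + \Delta t\, A_x \right)\left( I + \Delta t\, A_y \right)
  = I + \Delta t\,(A_x + A_y) + \Delta t^{2} A_x A_y
```

In this form the converged state satisfies ΔU^n = 0 and hence R(U) = 0 for any Δt, provided that R itself contains no Δt-dependent terms. An artificial dissipation term scaled by the time step would break that property, which is consistent with the inconsistency the abstract says was removed.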
A comparative study of Rosenbrock-type and implicit Runge-Kutta time integration for discontinuous Galerkin method for unsteady 3D compressible Navier-Stokes equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Xiaodong; Xia, Yidong; Luo, Hong

A comparative study of two classes of third-order implicit time integration schemes is presented for a third-order hierarchical WENO reconstructed discontinuous Galerkin (rDG) method to solve the 3D unsteady compressible Navier-Stokes equations: 1) the explicit first stage, single diagonally implicit Runge-Kutta (ESDIRK3) scheme, and 2) the Rosenbrock-Wanner (ROW) schemes based on the differential algebraic equations (DAEs) of Index-2. Compared with the ESDIRK3 scheme, a remarkable feature of the ROW schemes is that they require only one approximate Jacobian matrix calculation every time step, thus considerably reducing the overall computational cost. A variety of test cases, ranging from inviscid flows to DNS of turbulent flows, are presented to assess the performance of these schemes. Here, numerical experiments demonstrate that the third-order ROW scheme for the DAEs of index-2 can not only achieve the designed formal order of temporal convergence accuracy in a benchmark test, but also require significantly less computing time than its ESDIRK3 counterpart to converge to the same level of discretization errors in all of the flow simulations in this study, indicating that the ROW methods provide an attractive alternative for the higher-order time-accurate integration of the unsteady compressible Navier-Stokes equations.
A comparative study of Rosenbrock-type and implicit Runge-Kutta time integration for discontinuous Galerkin method for unsteady 3D compressible Navier-Stokes equations

DOE PAGES

Liu, Xiaodong; Xia, Yidong; Luo, Hong; ...

2016-10-05

A comparative study of two classes of third-order implicit time integration schemes is presented for a third-order hierarchical WENO reconstructed discontinuous Galerkin (rDG) method to solve the 3D unsteady compressible Navier-Stokes equations: 1) the explicit first stage, single diagonally implicit Runge-Kutta (ESDIRK3) scheme, and 2) the Rosenbrock-Wanner (ROW) schemes based on the differential algebraic equations (DAEs) of Index-2. Compared with the ESDIRK3 scheme, a remarkable feature of the ROW schemes is that they require only one approximate Jacobian matrix calculation every time step, thus considerably reducing the overall computational cost. A variety of test cases, ranging from inviscid flows to DNS of turbulent flows, are presented to assess the performance of these schemes. Here, numerical experiments demonstrate that the third-order ROW scheme for the DAEs of index-2 can not only achieve the designed formal order of temporal convergence accuracy in a benchmark test, but also require significantly less computing time than its ESDIRK3 counterpart to converge to the same level of discretization errors in all of the flow simulations in this study, indicating that the ROW methods provide an attractive alternative for the higher-order time-accurate integration of the unsteady compressible Navier-Stokes equations.
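To make the term "diagonally implicit Runge-Kutta" concrete, the sketch below applies a much simpler relative of ESDIRK3: the two-stage, second-order, L-stable SDIRK scheme usually attributed to Alexander, on the scalar test equation y' = λy. It is an illustration only, not the third-order schemes compared in the paper; `sdirk2_linear` and the chosen values of λ and Δt are assumptions. Because the problem is linear, each implicit stage can be solved in closed form.

```python
import math

GAMMA = 1.0 - math.sqrt(2.0) / 2.0  # diagonal coefficient shared by both stages


def sdirk2_linear(lam, y0, dt, steps):
    """Integrate y' = lam * y with a two-stage, second-order SDIRK scheme."""
    y = y0
    for _ in range(steps):
        # Stage 1: k1 = lam * (y + dt*GAMMA*k1), solved exactly for k1.
        k1 = lam * y / (1.0 - dt * GAMMA * lam)
        # Stage 2 depends on k1 explicitly and on k2 implicitly, with the same GAMMA.
        k2 = lam * (y + dt * (1.0 - GAMMA) * k1) / (1.0 - dt * GAMMA * lam)
        y = y + dt * ((1.0 - GAMMA) * k1 + GAMMA * k2)
    return y


exact = math.exp(-1.0)
err_coarse = abs(sdirk2_linear(-1.0, 1.0, 0.1, 10) - exact)
err_fine = abs(sdirk2_linear(-1.0, 1.0, 0.05, 20) - exact)
stiff = sdirk2_linear(-1.0e6, 1.0, 0.1, 1)  # very stiff decay, one large step
print(err_coarse / err_fine, stiff)
```

Halving the step reduces the error by roughly a factor of four (second order), and the stiff mode is strongly damped in a single large step. In a nonlinear solver each stage would instead require a Newton-type solve, which is the cost that Rosenbrock-type schemes reduce by using one approximate Jacobian per step.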
Low-rank factorization of electron integral tensors and its application in electronic structure theory

DOE PAGES

Peng, Bo; Kowalski, Karol

2017-01-25

In this paper, we apply the reverse Cuthill-McKee (RCM) algorithm to transform two-electron integral tensors to their block diagonal forms. By further applying Cholesky decomposition (CD) on each of the diagonal blocks, we are able to represent the high-dimensional two-electron integral tensors in terms of permutation matrices and low-rank Cholesky vectors. This representation facilitates low-rank factorizations of high-dimensional tensor contractions in post-Hartree-Fock calculations. Finally, we discuss the second-order Møller-Plesset (MP2) method and the linear coupled-cluster model with doubles (L-CCD) as examples to demonstrate the efficiency of this technique in representing the two-electron integrals in a compact form.
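The way a Cholesky decomposition yields low-rank vectors can be sketched on a small symmetric positive semidefinite matrix. The code below uses a pivoted Cholesky decomposition with a stopping tolerance, which is an assumed variant chosen for illustration and not necessarily the procedure used in the paper; the RCM reordering step is not shown, and the test matrix `M` is hypothetical.

```python
def pivoted_cholesky(M, tol=1e-10):
    """Return vectors L_k with M ~ sum_k outer(L_k, L_k) for symmetric PSD M."""
    n = len(M)
    diag = [M[i][i] for i in range(n)]  # diagonal of the current residual
    vectors = []
    while len(vectors) < n:
        p = max(range(n), key=lambda i: diag[i])  # largest remaining diagonal
        if diag[p] <= tol:
            break
        # Column p of the residual M - sum_k outer(L_k, L_k).
        col = [M[i][p] - sum(v[i] * v[p] for v in vectors) for i in range(n)]
        scale = diag[p] ** 0.5
        new = [c / scale for c in col]
        vectors.append(new)
        for i in range(n):
            diag[i] -= new[i] ** 2
    return vectors


# Hypothetical rank-2 matrix built from two known vectors.
a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 0.0, -1.0, 0.0]
M = [[a[i] * a[j] + b[i] * b[j] for j in range(4)] for i in range(4)]
vecs = pivoted_cholesky(M)
rebuilt = [[sum(v[i] * v[j] for v in vecs) for j in range(4)] for i in range(4)]
max_err = max(abs(M[i][j] - rebuilt[i][j]) for i in range(4) for j in range(4))
print(len(vecs), max_err)
```

The 4x4 matrix is reproduced from only two vectors, which is the kind of compression the abstract refers to when it mentions low-rank Cholesky vectors.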
Low-rank factorization of electron integral tensors and its application in electronic structure theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peng, Bo; Kowalski, Karol

In this paper, we apply the reverse Cuthill-McKee (RCM) algorithm to transform two-electron integral tensors to their block diagonal forms. By further applying Cholesky decomposition (CD) on each of the diagonal blocks, we are able to represent the high-dimensional two-electron integral tensors in terms of permutation matrices and low-rank Cholesky vectors. This representation facilitates low-rank factorizations of high-dimensional tensor contractions in post-Hartree-Fock calculations. Finally, we discuss the second-order Møller-Plesset (MP2) method and the linear coupled-cluster model with doubles (L-CCD) as examples to demonstrate the efficiency of this technique in representing the two-electron integrals in a compact form.
Fully implicit Particle-in-cell algorithms for multiscale plasma simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chacon, Luis

The outline of the paper is as follows: Particle-in-cell (PIC) methods for fully ionized collisionless plasmas, explicit vs. implicit PIC, 1D ES implicit PIC (charge and energy conservation, moment-based acceleration), and generalization to Multi-D EM PIC: Vlasov-Darwin model (review and motivation for Darwin model, conservation properties (energy, charge, and canonical momenta), and numerical benchmarks). The author demonstrates a fully implicit, fully nonlinear, multidimensional PIC formulation that features exact local charge conservation (via a novel particle mover strategy), exact global energy conservation (no particle self-heating or self-cooling), adaptive particle orbit integrator to control errors in momentum conservation, and canonical momenta (EM-PIC only, reduced dimensionality). The approach is free of numerical instabilities: ω_pe Δt >> 1, and Δx >> λ_D. It requires many fewer dofs (vs. explicit PIC) for comparable accuracy in challenging problems. Significant CPU gains (vs explicit PIC) have been demonstrated. The method has much potential for efficiency gains vs. explicit in long-time-scale applications. Moment-based acceleration is effective in minimizing N_FE, leading to an optimal algorithm.
Fully implicit moving mesh adaptive algorithm

NASA Astrophysics Data System (ADS)

Chacon, Luis

2005-10-01

In many problems of interest, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. The former is best handled by fully implicit methods, which are able to step over fast frequencies to resolve the dynamical time scale of interest. The latter requires grid adaptivity for efficiency. Moving-mesh grid adaptive methods are attractive because they can be designed to minimize the numerical error for a given resolution. However, the required grid governing equations are typically very nonlinear and stiff, and considerably difficult to treat numerically. Not surprisingly, fully coupled, implicit approaches where the grid and the physics equations are solved simultaneously are rare in the literature, and circumscribed to 1D geometries. In this study, we present a fully implicit algorithm for moving mesh methods that is feasible for multidimensional geometries. A crucial element is the development of an effective multilevel treatment of the grid equation (L. Chacón, G. Lapenta, A fully implicit, nonlinear adaptive grid strategy, J. Comput. Phys., accepted (2005)). We will show that such an approach is competitive vs. uniform grids both from the accuracy (due to adaptivity) and the efficiency standpoints. Results for a variety of models in 1D and 2D geometries, including nonlinear diffusion, radiation-diffusion, Burgers equation, and gas dynamics will be presented.
Error analysis of multipoint flux domain decomposition methods for evolutionary diffusion problems

NASA Astrophysics Data System (ADS)

Arrarás, A.; Portero, L.; Yotov, I.

2014-01-01

We study space and time discretizations for mixed formulations of parabolic problems. The spatial approximation is based on the multipoint flux mixed finite element method, which reduces to an efficient cell-centered pressure system on general grids, including triangles, quadrilaterals, tetrahedra, and hexahedra. The time integration is performed by using a domain decomposition time-splitting technique combined with multiterm fractional step diagonally implicit Runge-Kutta methods. The resulting scheme is unconditionally stable and computationally efficient, as it reduces the global system to a collection of uncoupled subdomain problems that can be solved in parallel without the need for Schwarz-type iteration. Convergence analysis for both the semidiscrete and fully discrete schemes is presented.
Numerical solution of Euler's equation by perturbed functionals

NASA Technical Reports Server (NTRS)

Dey, S. K.

1985-01-01

A perturbed functional iteration has been developed to solve nonlinear systems. It adds at each iteration level, unique perturbation parameters to nonlinear Gauss-Seidel iterates which enhances its convergence properties. As convergence is approached these parameters are damped out. Local linearization along the diagonal has been used to compute these parameters. The method requires no computation of Jacobian or factorization of matrices. Analysis of convergence depends on properties of certain contraction-type mappings, known as D-mappings. In this article, application of this method to solve an implicit finite difference approximation of Euler's equation is studied. Some representative results for the well known shock tube problem and compressible flows in a nozzle are given.
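The flavor of an iteration that sweeps through the equations and linearizes only along the diagonal can be sketched as follows. This is a simplified stand-in (a nonlinear Gauss-Seidel sweep with a scalar Newton correction per equation), not the perturbed functional iteration of the article, whose perturbation parameters and damping are not reproduced here. The test system `F`, the finite-difference step `h`, and the function name are all assumptions.

```python
def diagonal_newton_gauss_seidel(F, x0, iters=100, h=1e-7):
    """Sweep through the unknowns, correcting x[i] using only dF_i/dx_i."""
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        for i in range(n):
            fi = F(x)[i]
            x[i] += h
            dfi = (F(x)[i] - fi) / h  # finite-difference estimate of the diagonal entry
            x[i] -= h
            x[i] -= fi / dfi          # scalar Newton correction for equation i
    return x


def F(x):
    # Hypothetical mildly nonlinear, diagonally dominant system with root (1, 1).
    return [4.0 * x[0] - x[1] + 0.1 * x[0] ** 3 - 3.1,
            -x[0] + 4.0 * x[1] + 0.1 * x[1] ** 3 - 3.1]


root = diagonal_newton_gauss_seidel(F, [0.0, 0.0])
print(root)
```

No full Jacobian is formed and no matrix is factored; each update touches a single equation and a single unknown, which is the property the abstract emphasizes.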
Nonlinear Fluid Computations in a Distributed Environment

NASA Technical Reports Server (NTRS)

Atwood, Christopher A.; Smith, Merritt H.

1995-01-01

The performance of a loosely and tightly-coupled workstation cluster is compared against a conventional vector supercomputer for the solution of the Reynolds-averaged Navier-Stokes equations. The application geometries include a transonic airfoil, a tiltrotor wing/fuselage, and a wing/body/empennage/nacelle transport. Decomposition is of the manager-worker type, with solution of one grid zone per worker process coupled using the PVM message passing library. Task allocation is determined by grid size and processor speed, subject to available memory penalties. Each fluid zone is computed using an implicit diagonal scheme in an overset mesh framework, while relative body motion is accomplished using an additional worker process to re-establish grid communication.
Implicit flux-split schemes for the Euler equations

NASA Technical Reports Server (NTRS)

Thomas, J. L.; Walters, R. W.; Van Leer, B.

1985-01-01

Recent progress in the development of implicit algorithms for the Euler equations using the flux-vector splitting method is described. Comparisons of the relative efficiency of relaxation and spatially-split approximately factored methods on a vector processor for two-dimensional flows are made. For transonic flows, the higher convergence rate per iteration of the Gauss-Seidel relaxation algorithms, which are only partially vectorizable, is amply compensated for by the faster computational rate per iteration of the approximately factored algorithm. For supersonic flows, the fully-upwind line-relaxation method is more efficient since the numerical domain of dependence is more closely matched to the physical domain of dependence. A hybrid three-dimensional algorithm using relaxation in one coordinate direction and approximate factorization in the cross-flow plane is developed and applied to a forebody shape at supersonic speeds and a swept, tapered wing at transonic speeds.

Three-Dimensional High-Lift Analysis Using a Parallel Unstructured Multigrid Solver

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

1998-01-01

A directional implicit unstructured agglomeration multigrid solver is ported to shared and distributed memory massively parallel machines using the explicit domain-decomposition and message-passing approach. Because the algorithm operates on local implicit lines in the unstructured mesh, special care is required in partitioning the problem for parallel computing. A weighted partitioning strategy is described which avoids breaking the implicit lines across processor boundaries, while incurring minimal additional communication overhead. Good scalability is demonstrated on a 128 processor SGI Origin 2000 machine and on a 512 processor CRAY T3E machine for reasonably fine grids. The feasibility of performing large-scale unstructured grid calculations with the parallel multigrid algorithm is demonstrated by computing the flow over a partial-span flap wing high-lift geometry on a highly resolved grid of 13.5 million points in approximately 4 hours of wall clock time on the CRAY T3E.
A fast Poisson solver for unsteady incompressible Navier-Stokes equations on the half-staggered grid

NASA Technical Reports Server (NTRS)

Golub, G. H.; Huang, L. C.; Simon, H.; Tang, W. -P.

1995-01-01

In this paper, a fast Poisson solver for unsteady, incompressible Navier-Stokes equations with finite difference methods on the non-uniform, half-staggered grid is presented. To achieve this, new algorithms for diagonalizing a semi-definite pair are developed. Our fast solver can also be extended to the three dimensional case. The motivation and related issues in using this second kind of staggered grid are also discussed. Numerical testing has indicated the effectiveness of this algorithm.
Algorithm and code development for unsteady three-dimensional Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Obayashi, Shigeru

1991-01-01

A streamwise upwind algorithm for solving the unsteady 3-D Navier-Stokes equations was extended to handle the moving grid system. It is noted that the finite volume concept is essential to extend the algorithm. The resulting algorithm is conservative for any motion of the coordinate system. Two extensions to an implicit method were considered and the implicit extension that makes the algorithm computationally efficient is implemented into Ames's aeroelasticity code, ENSAERO. The new flow solver has been validated through the solution of test problems. Test cases include three-dimensional problems with fixed and moving grids. The first test case shown is an unsteady viscous flow over an F-5 wing, while the second test considers the motion of the leading edge vortex as well as the motion of the shock wave for a clipped delta wing. The resulting algorithm has been implemented into ENSAERO. The upwind version leads to higher accuracy in both steady and unsteady computations than the previously used central-difference method does, while the increase in the computational time is small.
A low-dispersion, exactly energy-charge-conserving semi-implicit relativistic particle-in-cell algorithm

NASA Astrophysics Data System (ADS)

Chen, Guangye; Luis, Chacon; Bird, Robert; Stark, David; Yin, Lin; Albright, Brian

2017-10-01

Leap-frog based explicit algorithms, either "energy-conserving" or "momentum-conserving", do not conserve energy discretely. Time-centered fully implicit algorithms can conserve discrete energy exactly, but introduce large dispersion errors in the light-wave modes, regardless of timestep sizes. This can lead to intolerable simulation errors where highly accurate light propagation is needed (e.g. laser-plasma interactions, LPI). In this study, we selectively combine the leap-frog and Crank-Nicolson methods to produce a low-dispersion, exactly energy-and-charge-conserving PIC algorithm. Specifically, we employ the leap-frog method for Maxwell equations, and the Crank-Nicolson method for particle equations. Such an algorithm admits exact global energy conservation, exact local charge conservation, and preserves the dispersion properties of the leap-frog method for the light wave. The algorithm has been implemented in a code named iVPIC, based on the VPIC code developed at LANL. We will present numerical results that demonstrate the properties of the scheme with sample test problems (e.g. Weibel instability run for 107 timesteps, and LPI applications).
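The exact energy conservation of a time-centered (Crank-Nicolson) update can be checked on a toy problem. The sketch below integrates a single harmonic oscillator, x' = v and v' = -x, and is not a particle-in-cell algorithm; the function name and step size are assumptions. For this linear system the implicit update can be solved in closed form.

```python
def crank_nicolson_oscillator(x, v, dt, steps):
    """Advance x' = v, v' = -x with the time-centered (Crank-Nicolson) rule."""
    a = 0.5 * dt
    for _ in range(steps):
        # Closed-form solution of x1 = x + a*(v + v1), v1 = v - a*(x + x1).
        x1 = ((1.0 - a * a) * x + 2.0 * a * v) / (1.0 + a * a)
        v1 = ((1.0 - a * a) * v - 2.0 * a * x) / (1.0 + a * a)
        x, v = x1, v1
    return x, v


x_end, v_end = crank_nicolson_oscillator(1.0, 0.0, 0.5, 10000)
energy = x_end ** 2 + v_end ** 2  # twice the oscillator energy; starts at 1.0
print(energy)
```

The update is a pure rotation in the (x, v) plane, so the energy is preserved to rounding error for any step size, while the phase (and hence the dispersion) is only approximate. That trade-off is the motivation the abstract gives for keeping leap-frog on the field equations.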
Advanced MHD Algorithm for Solar and Space Science: 1st Year Semi Annual Progress Report

NASA Technical Reports Server (NTRS)

Schnack, Dalton D.; Lionello, Roberto

2003-01-01

We report progress for the development of MH4D for the first and second quarters of FY2004, December 29, 2002 - June 6, 2003. The present version of MH4D can now solve the full viscous and resistive MHD equations using either an explicit or a semi-implicit time advancement algorithm. In this report we describe progress in the following areas. During the last two quarters we presented a poster at the EGS-AGU-EUG Joint Assembly in Nice, France, April 6-11, 2003, and a poster at the 2003 International Sherwood Theory Conference in Corpus Christi, Texas, April 28-30, 2003. In the area of code development, we have implemented the MHD equations and the semi-implicit algorithm. The new features have been tested.
Development of iterative techniques for the solution of unsteady compressible viscous flows

NASA Technical Reports Server (NTRS)

Sankar, Lakshmi N.; Hixon, Duane

1992-01-01

The development of efficient iterative solution methods for the numerical solution of two- and three-dimensional compressible Navier-Stokes equations is discussed. Iterative time marching methods have several advantages over classical multi-step explicit time marching schemes, and non-iterative implicit time marching schemes. Iterative schemes have better stability characteristics than non-iterative explicit and implicit schemes. In this work, another approach based on the classical conjugate gradient method, known as the Generalized Minimum Residual (GMRES) algorithm is investigated. The GMRES algorithm has been used in the past by a number of researchers for solving steady viscous and inviscid flow problems. Here, we investigate the suitability of this algorithm for solving the system of non-linear equations that arise in unsteady Navier-Stokes solvers at each time step.
Multi-Dimensional, Inviscid Flux Reconstruction for Simulation of Hypersonic Heating on Tetrahedral Grids

NASA Technical Reports Server (NTRS)

Gnoffo, Peter A.

2009-01-01

The quality of simulated hypersonic stagnation region heating on tetrahedral meshes is investigated by using a three-dimensional, upwind reconstruction algorithm for the inviscid flux vector. Two test problems are investigated: hypersonic flow over a three-dimensional cylinder with special attention to the uniformity of the solution in the spanwise direction and hypersonic flow over a three-dimensional sphere. The tetrahedral cells used in the simulation are derived from a structured grid where cell faces are bisected across the diagonal resulting in a consistent pattern of diagonals running in a biased direction across the otherwise symmetric domain. This grid is known to accentuate problems in both shock capturing and stagnation region heating encountered with conventional, quasi-one-dimensional inviscid flux reconstruction algorithms. Therefore the test problem provides a sensitive test for algorithmic effects on heating. This investigation is believed to be unique in its focus on three-dimensional, rotated upwind schemes for the simulation of hypersonic heating on tetrahedral grids. This study attempts to fill the void left by the inability of conventional (quasi-one-dimensional) approaches to accurately simulate heating in a tetrahedral grid system. Results show significant improvement in spanwise uniformity of heating with some penalty of ringing at the captured shock. Issues with accuracy near the peak shear location are identified and require further study.
Statistical image reconstruction from correlated data with applications to PET

PubMed Central

Alessio, Adam; Sauer, Ken; Kinahan, Paul

2008-01-01

Most statistical reconstruction methods for emission tomography are designed for data modeled as conditionally independent Poisson variates. In reality, due to scanner detectors, electronics and data processing, correlations are introduced into the data resulting in dependent variates. In general, these correlations are ignored because they are difficult to measure and lead to computationally challenging statistical reconstruction algorithms. This work addresses the second concern, seeking to simplify the reconstruction of correlated data and provide a more precise image estimate than the conventional independent methods. In general, correlated variates have a large non-diagonal covariance matrix that is computationally challenging to use as a weighting term in a reconstruction algorithm. This work proposes two methods to simplify the use of a non-diagonal covariance matrix as the weighting term by (a) limiting the number of dimensions in which the correlations are modeled and (b) adopting flexible, yet computationally tractable, models for correlation structure. We apply and test these methods with simple simulated PET data and data processed with the Fourier rebinning algorithm which include the one-dimensional correlations in the axial direction and the two-dimensional correlations in the transaxial directions. The methods are incorporated into a penalized weighted least-squares 2D reconstruction and compared with a conventional maximum a posteriori approach. PMID:17921576
A three-dimensional finite-volume Eulerian-Lagrangian Localized Adjoint Method (ELLAM) for solute-transport modeling

USGS Publications Warehouse

Heberton, C.I.; Russell, T.F.; Konikow, Leonard F.; Hornberger, G.Z.

2000-01-01

This report documents the U.S. Geological Survey Eulerian-Lagrangian Localized Adjoint Method (ELLAM) algorithm that solves an integral form of the solute-transport equation, incorporating an implicit-in-time difference approximation for the dispersive and sink terms. Like the algorithm in the original version of the U.S. Geological Survey MOC3D transport model, ELLAM uses a method of characteristics approach to solve the transport equation on the basis of the velocity field. The ELLAM algorithm, however, is based on an integral formulation of conservation of mass and uses appropriate numerical techniques to obtain global conservation of mass. The implicit procedure eliminates several stability criteria required for an explicit formulation. Consequently, ELLAM allows large transport time increments to be used. ELLAM can produce qualitatively good results using a small number of transport time steps. A description of the ELLAM numerical method, the data-input requirements and output options, and the results of simulator testing and evaluation are presented. The ELLAM algorithm was evaluated for the same set of problems used to test and evaluate Version 1 and Version 2 of MOC3D. These test results indicate that ELLAM offers a viable alternative to the explicit and implicit solvers in MOC3D. Its use is desirable when mass balance is imperative or a fast, qualitative model result is needed. Although accurate solutions can be generated using ELLAM, its efficiency relative to the two previously documented solution algorithms is problem dependent.
Three-Dimensional Navier-Stokes Method with Two-Equation Turbulence Models for Efficient Numerical Simulation of Hypersonic Flows

NASA Technical Reports Server (NTRS)

Bardina, J. E.

1994-01-01

A new computationally efficient 3-D compressible Reynolds-averaged implicit Navier-Stokes method with advanced two-equation turbulence models for high speed flows is presented. All convective terms are modeled using an entropy satisfying higher-order Total Variation Diminishing (TVD) scheme based on implicit upwind flux-difference split approximations and an arithmetic averaging procedure of primitive variables. This method combines the best features of data management and computational efficiency of space marching procedures with the generality and stability of time dependent Navier-Stokes procedures to solve flows with mixed supersonic and subsonic zones, including streamwise separated flows. Its robust stability derives from a combination of conservative implicit upwind flux-difference splitting with Roe's property U to provide accurate shock capturing capability that non-conservative schemes do not guarantee, an alternating symmetric Gauss-Seidel 'method of planes' relaxation procedure coupled with a three-dimensional two-factor diagonal-dominant approximate factorization scheme, TVD flux limiters of higher-order flux differences satisfying realizability, and well-posed characteristic-based implicit boundary-point approximations consistent with the local characteristics domain of dependence. The efficiency of the method is highly increased with Newton-Raphson acceleration which allows convergence in essentially one forward sweep for supersonic flows. The method is verified by comparing with experiment and other Navier-Stokes methods. Here, results of adiabatic and cooled flat plate flows, compression corner flow, and 3-D hypersonic shock-wave/turbulent boundary layer interaction flows are presented. The robust 3-D method achieves a better computational efficiency of at least one order of magnitude over the CNS Navier-Stokes code. It provides cost-effective aerodynamic predictions in agreement with experiment, and the capability of predicting complex flow structures in complex geometries with good accuracy.
An implicit boundary integral method for computing electric potential of macromolecules in solvent

NASA Astrophysics Data System (ADS)

Zhong, Yimin; Ren, Kui; Tsai, Richard

2018-04-01

A numerical method using implicit surface representations is proposed to solve the linearized Poisson-Boltzmann equation that arises in mathematical models for the electrostatics of molecules in solvent. The proposed method uses an implicit boundary integral formulation to derive a linear system defined on Cartesian nodes in a narrowband surrounding the closed surface that separates the molecule and the solvent. The needed implicit surface is constructed from the given atomic description of the molecules, by a sequence of standard level set algorithms. A fast multipole method is applied to accelerate the solution of the linear system. A few numerical studies involving some standard test cases are presented and compared to other existing results.
Noniterative implicit method for tracking particles in mixed Lagrangian-Eulerian formulations

NASA Technical Reports Server (NTRS)

Shih, T. I.-P.; Dasgupta, A.

1993-01-01

The existing implicit methods for the current initial value problems (IVPs) concerning particle-laden flows are complicated and iterative in nature. This paper presents a noniterative implicit method which can be used with pressure-based as well as with density-based algorithms. The method is illustrated by analyzing a dilute dispersion of noninteracting solid particles in an isothermal flow in a passage bounded by one straight wall and one wavy wall, in which all particles are spherical and have a finite velocity relative to the continuum phase at the inflow boundary.
On optimal improvements of classical iterative schemes for Z-matrices

NASA Astrophysics Data System (ADS)

Noutsos, D.; Tzoumas, M.

2006-04-01

Many researchers have considered preconditioners, applied to linear systems whose coefficient matrix is a Z- or an M-matrix, that make the associated Jacobi and Gauss-Seidel methods converge asymptotically faster than the unpreconditioned ones. Such preconditioners are chosen so that they eliminate the off-diagonal elements of the same column or the elements of the first upper diagonal [Milaszewicz, LAA 93 (1987) 161-170], Gunawardena et al. [LAA 154-156 (1991) 123-143]. In this work we generalize the previous preconditioners to obtain optimal methods. "Good" Jacobi and Gauss-Seidel algorithms are given, and preconditioners that eliminate more than one entry per row are also proposed and analyzed. Moreover, the behavior of the above preconditioners when applied to Krylov subspace methods is studied.
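The idea of eliminating the first upper diagonal before running Gauss-Seidel can be illustrated with a minimal sketch. It assumes a matrix with unit diagonal and uses the preconditioner P = I + S, where S carries the negated first-upper-diagonal entries of A; this is our reading of the Gunawardena et al. construction cited above, not the generalized "optimal" preconditioners of the paper. All function names are hypothetical.

```python
def matmul(a, b):
    """Dense matrix product for small square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def upper_diagonal_preconditioner(a):
    """Build P = I + S with S[i][i+1] = -a[i][i+1].

    Assumption: a has unit diagonal, so (P a)[i][i+1] = a[i][i+1] - a[i][i+1] = 0.
    """
    n = len(a)
    p = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        p[i][i + 1] = -a[i][i + 1]
    return p


def gauss_seidel(a, b, sweeps):
    """Plain Gauss-Seidel sweeps starting from the zero vector."""
    n = len(a)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            off_diagonal = sum(a[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diagonal) / a[i][i]
    return x
```

To use it, form `pa = matmul(p, a)` and `pb = matvec(p, b)` and run `gauss_seidel(pa, pb, sweeps)`; the preconditioned system has the same solution as the original one.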
Forebody and base region real gas flow in severe planetary entry by a factored implicit numerical method. II - Equilibrium reactive gas

NASA Technical Reports Server (NTRS)

Davy, W. C.; Green, M. J.; Lombard, C. K.

1981-01-01

The factored-implicit, gas-dynamic algorithm has been adapted to the numerical simulation of equilibrium reactive flows. Changes required in the perfect gas version of the algorithm are developed, and the method of coupling gas-dynamic and chemistry variables is discussed. A flow-field solution that approximates a Jovian entry case was obtained by this method and compared with the same solution obtained by HYVIS, a computer program much used for the study of planetary entry. Comparison of surface pressure distribution and stagnation line shock-layer profiles indicates that the two solutions agree well.
An Implicit Characteristic Based Method for Electromagnetics

NASA Technical Reports Server (NTRS)

Beggs, John H.; Briley, W. Roger

2001-01-01

An implicit characteristic-based approach for numerical solution of Maxwell's time-dependent curl equations in flux conservative form is introduced. This method combines a characteristic based finite difference spatial approximation with an implicit lower-upper approximate factorization (LU/AF) time integration scheme. This approach is advantageous for three-dimensional applications because the characteristic differencing enables a two-factor approximate factorization that retains its unconditional stability in three space dimensions, and it does not require solution of tridiagonal systems. Results are given both for a Fourier analysis of stability, damping and dispersion properties, and for one-dimensional model problems involving propagation and scattering for free space and dielectric materials using both uniform and nonuniform grids. The explicit Finite Difference Time Domain Method (FDTD) algorithm is used as a convenient reference algorithm for comparison. The one-dimensional results indicate that for low frequency problems on a highly resolved uniform or nonuniform grid, this LU/AF algorithm can produce accurate solutions at Courant numbers significantly greater than one, with a corresponding improvement in efficiency for simulating a given period of time. This approach appears promising for development of dispersion optimized LU/AF schemes for three dimensional applications.
Multigrid Strategies for Viscous Flow Solvers on Anisotropic Unstructured Meshes

NASA Technical Reports Server (NTRS)

Mavriplis, Dimitri J.

1998-01-01

Unstructured multigrid techniques for relieving the stiffness associated with high-Reynolds number viscous flow simulations on extremely stretched grids are investigated. One approach consists of employing a semi-coarsening or directional-coarsening technique, based on the directions of strong coupling within the mesh, in order to construct more optimal coarse grid levels. An alternate approach is developed which employs directional implicit smoothing with regular fully coarsened multigrid levels. The directional implicit smoothing is obtained by constructing implicit lines in the unstructured mesh based on the directions of strong coupling. Both approaches yield large increases in convergence rates over the traditional explicit full-coarsening multigrid algorithm. However, maximum benefits are achieved by combining the two approaches in a coupled manner into a single algorithm. An order of magnitude increase in convergence rate over the traditional explicit full-coarsening algorithm is demonstrated, and convergence rates for high-Reynolds number viscous flows which are independent of the grid aspect ratio are obtained. Further acceleration is provided by incorporating low-Mach-number preconditioning techniques, and a Newton-GMRES strategy which employs the multigrid scheme as a preconditioner. The compounding effects of these various techniques on speed of convergence are documented through several example test cases.
Globalized Newton-Krylov-Schwarz Algorithms and Software for Parallel Implicit CFD

NASA Technical Reports Server (NTRS)

Gropp, W. D.; Keyes, D. E.; McInnes, L. C.; Tidriri, M. D.

1998-01-01

Implicit solution methods are important in applications modeled by PDEs with disparate temporal and spatial scales. Because such applications require high resolution with reasonable turnaround, "routine" parallelization is essential. The pseudo-transient matrix-free Newton-Krylov-Schwarz (Psi-NKS) algorithmic framework is presented as an answer. We show that, for the classical problem of three-dimensional transonic Euler flow about an M6 wing, Psi-NKS can simultaneously deliver: globalized, asymptotically rapid convergence through adaptive pseudo-transient continuation and Newton's method; reasonable parallelizability for an implicit method through deferred synchronization and favorable communication-to-computation scaling in the Krylov linear solver; and high per-processor performance through attention to distributed memory and cache locality, especially through the Schwarz preconditioner. Two discouraging features of Psi-NKS methods are their sensitivity to the coding of the underlying PDE discretization and the large number of parameters that must be selected to govern convergence. We therefore distill several recommendations from our experience and from our reading of the literature on various algorithmic components of Psi-NKS, and we describe a freely available, MPI-based portable parallel software implementation of the solver employed here.
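A scalar sketch may help to show how adaptive pseudo-transient continuation blends into Newton's method. It assumes the common "switched evolution relaxation" (SER) rule, in which the pseudo-time step grows in inverse proportion to the residual norm; whether the record above uses exactly this rule is an assumption, and the Krylov and Schwarz components are omitted entirely. The function name and default parameters are hypothetical.

```python
def pseudo_transient_newton(residual, jacobian, u0, delta0=0.1,
                            tol=1e-12, max_iter=100):
    """Solve residual(u) = 0 for a scalar u by pseudo-transient continuation.

    Each step solves (1/delta + J(u)) * step = -F(u). A small delta gives a
    damped, time-marching-like step; a large delta recovers Newton's method.
    """
    u = u0
    delta = delta0
    r = residual(u)
    history = [abs(r)]
    for _ in range(max_iter):
        if abs(r) < tol:
            break
        step = -r / (1.0 / delta + jacobian(u))
        u += step
        r_new = residual(u)
        if r_new != 0.0:
            # SER rule (assumed): enlarge the pseudo-time step as the residual drops.
            delta *= abs(r) / abs(r_new)
        r = r_new
        history.append(abs(r))
    return u, history
```

For example, `pseudo_transient_newton(lambda u: u**3 + u - 2.0, lambda u: 3.0 * u * u + 1.0, 3.0)` starts with heavily damped steps and approaches the root u = 1 with Newton-like speed once the residual is small.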
Development of a fully implicit particle-in-cell scheme for gyrokinetic electromagnetic turbulence simulation in XGC1

NASA Astrophysics Data System (ADS)

Ku, Seung-Hoe; Hager, R.; Chang, C. S.; Chacon, L.; Chen, G.; EPSI Team

2016-10-01

The cancelation problem has been a long-standing issue for long-wavelength modes in electromagnetic gyrokinetic PIC simulations in toroidal geometry. In an attempt to resolve this issue, we implemented a fully implicit time integration scheme in the full-f, gyrokinetic PIC code XGC1. The new scheme - based on the implicit Vlasov-Darwin PIC algorithm by G. Chen and L. Chacon - can potentially resolve the cancelation problem. The time advance for the field and the particle equations is space-time-centered, with particle sub-cycling. The resulting system of equations is solved by a Picard iteration solver with a fixed-point accelerator. The algorithm is implemented in the parallel velocity formalism instead of the canonical parallel momentum formalism. XGC1 specializes in simulating the tokamak edge plasma with magnetic separatrix geometry. A fully implicit scheme could be a route to accurate and efficient gyrokinetic simulations. We will test whether this numerical scheme overcomes the cancelation problem and reproduces the dispersion relation of Alfven waves and tearing modes in cylindrical geometry. Funded by US DOE FES and ASCR, and computing resources provided by OLCF through ALCC.
Convergence issues in domain decomposition parallel computation of hovering rotor

NASA Astrophysics Data System (ADS)

Xiao, Zhongyun; Liu, Gang; Mou, Bin; Jiang, Xiong

2018-05-01

The implicit LU-SGS time integration algorithm has been widely used in parallel computation in spite of its lack of information from adjacent domains. When applied to parallel computation of hovering rotor flows in a rotating frame, it brings about convergence issues. To remedy the problem, three LU factorization-based implicit schemes (LU-SGS, DP-LUR and HLU-SGS) are investigated comparatively. A test case of pure grid rotation is designed to verify these algorithms, which shows that the LU-SGS algorithm introduces errors on boundary cells. When partition boundaries are circumferential, errors arise in proportion to grid speed, accumulate along with the rotation, and lead to computational failure in the end. Meanwhile, the DP-LUR and HLU-SGS methods show good convergence owing to their boundary treatments, which are desirable in domain decomposition parallel computations.
Filtering of non-linear instabilities

NASA Technical Reports Server (NTRS)

Khosla, P. K.; Rubin, S. G.

1978-01-01

For Courant numbers larger than one and cell Reynolds numbers larger than two, oscillations and in some cases instabilities are typically found with implicit numerical solutions of the fluid dynamics equations. This behavior has sometimes been associated with the loss of diagonal dominance of the coefficient matrix. It is shown that these problems can be related to the choice of the spatial differences, with the resulting instability related to aliasing or nonlinear interaction. Appropriate filtering can reduce the intensity of these oscillations and possibly eliminate the instability. These filtering procedures are equivalent to a weighted average of conservation and nonconservation differencing. The entire spectrum of filtered equations retains a three point character as well as second order spatial accuracy. Burgers equation was considered as a model.
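The link between a cell Reynolds number above two and the loss of diagonal dominance can be checked with a short sketch. It assumes the steady model equation u*phi_x = nu*phi_xx with central differences on a uniform grid of spacing h; this is a standard textbook setting chosen for illustration and is not taken from the record, which uses Burgers equation. Multiplying the discrete equation by h^2/nu gives the three-point coefficients below, with R = u*h/nu.

```python
def central_difference_stencil(cell_reynolds):
    """Scaled (west, center, east) coefficients of the three-point scheme.

    Derived from u*(p[i+1]-p[i-1])/(2h) - nu*(p[i+1]-2p[i]+p[i-1])/h**2 = 0
    after multiplying by h**2/nu, with cell_reynolds = u*h/nu.
    """
    west = -(1.0 + 0.5 * cell_reynolds)
    center = 2.0
    east = -(1.0 - 0.5 * cell_reynolds)
    return west, center, east


def is_diagonally_dominant(stencil):
    """Weak diagonal dominance: |west| + |east| <= |center|."""
    west, center, east = stencil
    return abs(west) + abs(east) <= abs(center)
```

For R up to 2 the off-diagonal magnitudes sum to exactly 2, so dominance holds weakly; for R above 2 the east coefficient changes sign and the sum becomes R, which exceeds the diagonal. The record argues that this observation alone does not explain the oscillations.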

Implicit Numerical Solution for a Normal Shock.

DTIC Science & Technology

1983-05-01

...as predictor: [equation garbled in source] corrector: [equation garbled in source] (43) Regrouping, the left hand side is now written as [equation garbled in source] (44) The matrix (I + ... DA) is diagonal so that its...the terms from the Y-momentum equation. The matrices Y^-1 and D are found in the same way as X and D were. They are given as [equation garbled in source]
Formulation of the relativistic moment implicit particle-in-cell method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Noguchi, Koichi; Tronci, Cesare; Zuccaro, Gianluca

2007-04-15

A new formulation is presented for the implicit moment method applied to the time-dependent relativistic Vlasov-Maxwell system. The new approach is based on a specific formulation of the implicit moment method that allows us to retain the same formalism that is valid in the classical case despite the formidable complication introduced by the nonlinear nature of the relativistic equations of motion. To demonstrate the validity of the new formulation, an implicit finite difference algorithm is developed to solve Maxwell's equations and the equations of motion. A number of benchmark problems are run: two stream instability, ion acoustic wave damping, Weibel instability, and Poynting flux acceleration. The numerical results are all in agreement with analytical solutions.
Compressible, multiphase semi-implicit method with moment of fluid interface representation

DOE PAGES

Jemison, Matthew; Sussman, Mark; Arienti, Marco

2014-09-16

A unified method for simulating multiphase flows using an exactly mass, momentum, and energy conserving Cell-Integrated Semi-Lagrangian advection algorithm is presented. The deforming material boundaries are represented using the moment-of-fluid method. Our new algorithm uses a semi-implicit pressure update scheme that asymptotically preserves the standard incompressible pressure projection method in the limit of infinite sound speed. The asymptotically preserving attribute makes the new method applicable to compressible and incompressible flows including stiff materials, enabling large time steps characteristic of incompressible flow algorithms rather than the small time steps required by explicit methods. Moreover, shocks are captured and material discontinuities are tracked, without the aid of any approximate or exact Riemann solvers. As a result, simulations of underwater explosions and fluid jetting in one, two, and three dimensions are presented which illustrate the effectiveness of the new algorithm at efficiently computing multiphase flows containing shock waves and material discontinuities with large “impedance mismatch.”
Formally Verified Practical Algorithms for Recovery from Loss of Separation

NASA Technical Reports Server (NTRS)

Butler, Ricky W.; Munoz, Caesar A.

2009-01-01

In this paper, we develop and formally verify practical algorithms for recovery from loss of separation. The formal verification is performed in the context of a criteria-based framework. This framework provides rigorous definitions of horizontal and vertical maneuver correctness that guarantee divergence and achieve horizontal and vertical separation. The algorithms are shown to be independently correct, that is, separation is achieved when only one aircraft maneuvers, and implicitly coordinated, that is, separation is also achieved when both aircraft maneuver. In this paper we improve the horizontal criteria over our previous work. An important benefit of the criteria approach is that different aircraft can execute different algorithms and implicit coordination will still be achieved, as long as they all meet the explicit criteria of the framework. Towards this end we have sought to make the criteria as general as possible. The framework presented in this paper has been formalized and mechanically verified in the Prototype Verification System (PVS).
Haptics-based dynamic implicit solid modeling.

PubMed

Hua, Jing; Qin, Hong

2004-01-01

This paper systematically presents a novel, interactive solid modeling framework, Haptics-based Dynamic Implicit Solid Modeling, which is founded upon volumetric implicit functions and powerful physics-based modeling. In particular, we augment our modeling framework with a haptic mechanism in order to take advantage of additional realism associated with a 3D haptic interface. Our dynamic implicit solids are semi-algebraic sets of volumetric implicit functions and are governed by the principles of dynamics, hence responding to sculpting forces in a natural and predictable manner. In order to directly manipulate existing volumetric data sets as well as point clouds, we develop a hierarchical fitting algorithm to reconstruct and represent discrete data sets using our continuous implicit functions, which permit users to further design and edit those existing 3D models in real-time using a large variety of haptic and geometric toolkits, and visualize their interactive deformation at arbitrary resolution. The additional geometric and physical constraints afford more sophisticated control of the dynamic implicit solids. The versatility of our dynamic implicit modeling enables the user to easily modify both the geometry and the topology of modeled objects, while the inherent physical properties can offer an intuitive haptic interface for direct manipulation with force feedback.
A Block Preconditioned Conjugate Gradient-type Iterative Solver for Linear Systems in Thermal Reservoir Simulation

NASA Astrophysics Data System (ADS)

Betté, Srinivas; Diaz, Julio C.; Jines, William R.; Steihaug, Trond

1986-11-01

A preconditioned residual-norm-reducing iterative solver is described. Based on a truncated form of the generalized-conjugate-gradient method for nonsymmetric systems of linear equations, the iterative scheme is very effective for linear systems generated in reservoir simulation of thermal oil recovery processes. As a consequence of employing an adaptive implicit finite-difference scheme to solve the model equations, the number of variables per cell-block varies dynamically over the grid. The data structure allows for 5- and 9-point operators in the areal model, 5-point in the cross-sectional model, and 7- and 11-point operators in the three-dimensional model. Block-diagonal-scaling of the linear system, done prior to iteration, is found to have a significant effect on the rate of convergence. Block-incomplete-LU-decomposition (BILU) and block-symmetric-Gauss-Seidel (BSGS) methods, which result in no fill-in, are used as preconditioning procedures. A full factorization is done on the well terms, and the cells are ordered in a manner which minimizes the fill-in in the well-column due to this factorization. The convergence criterion for the linear (inner) iteration is linked to that of the nonlinear (Newton) iteration, thereby enhancing the efficiency of the computation. The algorithm, with both BILU and BSGS preconditioners, is evaluated in the context of a variety of thermal simulation problems. The solver is robust and can be used with little or no user intervention.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high order accuracy in numerical discretization. 
The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
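For context on why "conventional sequential algorithms may be serial in nature", the following sketch shows the classic Thomas algorithm for a tridiagonal system. As we understand it, the PDD algorithm targets diagonally dominant tridiagonal systems of this kind, but that is background knowledge rather than something stated in the record, and the code below is the serial baseline only, not PDD itself. The function name and argument layout are hypothetical.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    Row i reads lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i];
    lower[0] and upper[-1] are ignored. Assumes no pivoting is needed,
    which holds for diagonally dominant systems.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper-diagonal coefficients
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        # Each step needs c[i-1] and d[i-1]: this recurrence is inherently serial.
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Both loops carry a dependence from one row to the next, which is the obstacle that partitioned parallel tridiagonal solvers are designed to work around.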
On multigrid solution of the implicit equations of hydrodynamics. Experiments for the compressible Euler equations in general coordinates

NASA Astrophysics Data System (ADS)

Kifonidis, K.; Müller, E.

2012-08-01

Aims: We describe and study a family of new multigrid iterative solvers for the multidimensional, implicitly discretized equations of hydrodynamics. Schemes of this class are free of the Courant-Friedrichs-Lewy condition. They are intended for simulations in which widely differing wave propagation timescales are present. A preferred solver in this class is identified. Applications to some simple stiff test problems that are governed by the compressible Euler equations, are presented to evaluate the convergence behavior, and the stability properties of this solver. Algorithmic areas are determined where further work is required to make the method sufficiently efficient and robust for future application to difficult astrophysical flow problems. Methods: The basic equations are formulated and discretized on non-orthogonal, structured curvilinear meshes. Roe's approximate Riemann solver and a second-order accurate reconstruction scheme are used for spatial discretization. Implicit Runge-Kutta (ESDIRK) schemes are employed for temporal discretization. The resulting discrete equations are solved with a full-coarsening, non-linear multigrid method. Smoothing is performed with multistage-implicit smoothers. These are applied here to the time-dependent equations by means of dual time stepping. Results: For steady-state problems, our results show that the efficiency of the present approach is comparable to the best implicit solvers for conservative discretizations of the compressible Euler equations that can be found in the literature. The use of red-black as opposed to symmetric Gauss-Seidel iteration in the multistage-smoother is found to have only a minor impact on multigrid convergence. This should enable scalable parallelization without having to seriously compromise the method's algorithmic efficiency. For time-dependent test problems, our results reveal that the multigrid convergence rate degrades with increasing Courant numbers (i.e. time step sizes). 
Beyond a Courant number of nine thousand, even complete multigrid breakdown is observed. Local Fourier analysis indicates that the degradation of the convergence rate is associated with the coarse-grid correction algorithm. An implicit scheme for the Euler equations that makes use of the present method was, nevertheless, able to outperform a standard explicit scheme on a time-dependent problem with a Courant number of order 1000. Conclusions: For steady-state problems, the described approach enables the construction of parallelizable, efficient, and robust implicit hydrodynamics solvers. The applicability of the method to time-dependent problems is presently restricted to cases with moderately high Courant numbers. This is due to an insufficient coarse-grid correction of the employed multigrid algorithm for large time steps. Further research will be required to help us to understand and overcome the observed multigrid convergence difficulties for time-dependent problems.
High-Order/Low-Order methods for ocean modeling

DOE PAGES

Newman, Christopher; Womeldorff, Geoff; Chacón, Luis; ...

2015-06-01

In this study, we examine a High Order/Low Order (HOLO) approach for a z-level ocean model and show that the traditional semi-implicit and split-explicit methods, as well as a recent preconditioning strategy, can easily be cast in the framework of HOLO methods. The HOLO formulation admits an implicit-explicit method that is algorithmically scalable and second-order accurate, allowing timesteps much larger than the barotropic time scale. We show how HOLO approaches, in particular the implicit-explicit method, can provide a solid route for ocean simulation to heterogeneous computing and exascale environments.
A second-order accurate parabolized Navier-Stokes algorithm for internal flows

NASA Technical Reports Server (NTRS)

Chitsomboon, T.; Tiwari, S. N.

1984-01-01

A parabolized implicit Navier-Stokes algorithm which is of second-order accuracy in both the cross flow and marching directions is presented. The algorithm is used to analyze three model supersonic flow problems (the flow over a 10-degree edge). The results are found to be in good agreement with the results of other techniques available in the literature.
Comparison of GOES Cloud Classification Algorithms Employing Explicit and Implicit Physics

NASA Technical Reports Server (NTRS)

Bankert, Richard L.; Mitrescu, Cristian; Miller, Steven D.; Wade, Robert H.

2009-01-01

Cloud-type classification based on multispectral satellite imagery data has been widely researched and demonstrated to be useful for distinguishing a variety of classes using a wide range of methods. The research described here is a comparison of the classifier output from two very different algorithms applied to Geostationary Operational Environmental Satellite (GOES) data over the course of one year. The first algorithm employs spectral channel thresholding and additional physically based tests. The second algorithm was developed through a supervised learning method with characteristic features of expertly labeled image samples used as training data for a 1-nearest-neighbor classification. The latter's ability to identify classes is also based in physics, but those relationships are embedded implicitly within the algorithm. A pixel-to-pixel comparison analysis was done for hourly daytime scenes within a region in the northeastern Pacific Ocean. Considerable agreement was found in this analysis, with many of the mismatches or disagreements providing insight to the strengths and limitations of each classifier. Depending upon user needs, a rule-based or other postprocessing system that combines the output from the two algorithms could provide the most reliable cloud-type classification.
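The second classifier's decision rule, 1-nearest-neighbor, can be sketched in a few lines. The feature vectors and class names in the usage note are made up for illustration; the record does not list the actual features or the distance measure, so Euclidean distance is an assumption here.

```python
import math


def nearest_neighbor_label(sample, training_features, training_labels):
    """Return the label of the training sample closest to `sample`.

    Assumption: Euclidean distance in feature space (math.dist, Python 3.8+).
    """
    best_index = min(
        range(len(training_features)),
        key=lambda i: math.dist(sample, training_features[i]),
    )
    return training_labels[best_index]
```

With hypothetical training data such as `[(0.1, 290.0), (0.6, 220.0)]` labeled `["class A", "class B"]`, the sample `(0.5, 225.0)` is assigned `"class B"`. In practice the features would need to be scaled so that no single channel dominates the distance.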
Intra Frame Coding In Advanced Video Coding Standard (H.264) to Obtain Consistent PSNR and Reduce Bit Rate for Diagonal Down Left Mode Using Gaussian Pulse

NASA Astrophysics Data System (ADS)

Manjanaik, N.; Parameshachari, B. D.; Hanumanthappa, S. N.; Banu, Reshma

2017-08-01

The intra prediction process of the H.264 video coding standard is used to code the first frame of a video, i.e. the intra frame, and obtains good coding efficiency compared to previous video coding standards. The main benefits of intra frame coding are that it reduces spatial pixel redundancy within the current frame, reduces computational complexity, and provides better rate-distortion performance. Intra frames are conventionally coded with the Rate Distortion Optimization (RDO) method. This method increases computational complexity, increases bit rate, and reduces picture quality, so it is difficult to implement in real-time applications, and many researchers have therefore developed fast mode decision algorithms for intra frame coding. Previous work on intra frame coding in H.264 using fast mode decision intra prediction algorithms based on different techniques resulted in increased bit rate and degraded picture quality (PSNR) for different quantization parameters. Many of these approaches achieved only a reduction in computational complexity or encoding time, at the cost of increased bit rate and loss of picture quality. To avoid the increase in bit rate and the loss of picture quality, this paper develops a better approach, a Gaussian pulse for intra frame coding using the diagonal down left intra prediction mode, to achieve higher coding efficiency in terms of PSNR and bit rate. In the proposed method, a Gaussian pulse is multiplied with the 4x4 frequency domain coefficients of each 4x4 sub-macroblock of each macroblock of the current frame before the quantization process. Multiplying the 4x4 integer-transformed coefficients by the Gaussian pulse at the macroblock level scales the information in the coefficients in a reversible manner. The resulting signal becomes abstract: the frequency samples are made abstract in a known and controllable manner without intermixing of coefficients, which avoids severe degradation of the picture at higher values of the quantization parameter. The proposed work was implemented using MATLAB and the JM 18.6 reference software. It measures PSNR, bit rate, and compression of the intra frame of YUV video sequences in QCIF resolution under different values of the quantization parameter, with the Gaussian value applied for the diagonal down left intra prediction mode. The simulation results of the proposed algorithm are tabulated and compared with a previous algorithm, the method of Tian et al. The proposed algorithm achieved an average bit rate reduction of 30.98% and maintained consistent picture quality for QCIF sequences compared to the method of Tian et al.
Mixed time integration methods for transient thermal analysis of structures

NASA Technical Reports Server (NTRS)

Liu, W. K.

1982-01-01

The computational methods used to predict and optimize the thermal-structural behavior of aerospace vehicle structures are reviewed. In general, two classes of algorithms, implicit and explicit, are used in transient thermal analysis of structures. Each of these two methods has its own merits. Due to the different time scales of the mechanical and thermal responses, the selection of a time integration method can be a difficult yet critical factor in the efficient solution of such problems. Therefore mixed time integration methods for transient thermal analysis of structures are being developed. The computer implementation aspects and numerical evaluation of these mixed time implicit-explicit algorithms in thermal analysis of structures are presented. A computationally useful method of estimating the critical time step for linear quadrilateral element is also given. Numerical tests confirm the stability criterion and accuracy characteristics of the methods. The superiority of these mixed time methods to the fully implicit method or the fully explicit method is also demonstrated.
Mixed time integration methods for transient thermal analysis of structures

NASA Technical Reports Server (NTRS)

Liu, W. K.

1983-01-01

The computational methods used to predict and optimize the thermal-structural behavior of aerospace vehicle structures are reviewed. In general, two classes of algorithms, implicit and explicit, are used in transient thermal analysis of structures. Each of these two methods has its own merits. Due to the different time scales of the mechanical and thermal responses, the selection of a time integration method can be a difficult yet critical factor in the efficient solution of such problems. Therefore mixed time integration methods for transient thermal analysis of structures are being developed. The computer implementation aspects and numerical evaluation of these mixed time implicit-explicit algorithms in thermal analysis of structures are presented. A computationally-useful method of estimating the critical time step for linear quadrilateral element is also given. Numerical tests confirm the stability criterion and accuracy characteristics of the methods. The superiority of these mixed time methods to the fully implicit method or the fully explicit method is also demonstrated.
An Implicit Algorithm for the Numerical Simulation of Shape-Memory Alloys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Becker, R; Stolken, J; Jannetti, C

Shape-memory alloys (SMA) have the potential to be used in a variety of interesting applications due to their unique properties of pseudoelasticity and the shape-memory effect. However, in order to design SMA devices efficiently, a physics-based constitutive model is required to accurately simulate the behavior of shape-memory alloys. The scope of this work is to extend the numerical capabilities of the SMA constitutive model developed by Jannetti et al. (2003) to handle large-scale polycrystalline simulations. The constitutive model is implemented within the finite-element software ABAQUS/Standard using a user-defined material subroutine, or UMAT. To improve the efficiency of the numerical simulations, so that polycrystalline specimens of shape-memory alloys can be modeled, a fully implicit algorithm has been implemented to integrate the constitutive equations. Using an implicit integration scheme increases the efficiency of the UMAT over the previously implemented explicit integration method by a factor of more than 100 for single crystal simulations.
Numerical Simulation of a Solar Domestic Hot Water System

NASA Astrophysics Data System (ADS)

Mongibello, L.; Bianco, N.; Di Somma, M.; Graditi, G.; Naso, V.

2014-11-01

An innovative transient numerical model is presented for the simulation of a solar Domestic Hot Water (DHW) system. The solar collectors have been simulated by using a zero-dimensional analytical model. The temperature distributions in the heat transfer fluid and in the water inside the tank have been evaluated by one-dimensional models. The reversion elimination algorithm has been used to include the effects of natural convection among the water layers at different heights in the tank on the thermal stratification. A finite difference implicit scheme has been implemented to solve the energy conservation equation in the coil heat exchanger, and the energy conservation equation in the tank has been solved by using the finite difference Euler implicit scheme. Energy conservation equations for the solar DHW component models have been coupled by means of a home-made implicit algorithm. Results of the simulation performed using as input data the experimental values of the ambient temperature and the solar irradiance on a summer day are presented and discussed.
Towards full-Braginskii implicit extended MHD

NASA Astrophysics Data System (ADS)

Chacon, Luis

2009-05-01

Recently, viable algorithms have been proposed for the scalable, fully-implicit temporal integration of 3D resistive MHD and cold-ion extended MHD models. While significant, these achievements must be tempered by the fact that such models lack predictive capabilities in regimes of interest for magnetic fusion. Short of including kinetic closures, a natural evolution path towards predictability starts by considering additional terms as described in Braginskii's fluid closures in the collisional regime. Here, we focus on the inclusion of two fundamental elements of relevance for fusion plasmas: anisotropic parallel electron transport, and warm-ion physics (i.e., ion finite Larmor radius effects, included via gyroviscosity). Both these elements introduce significant numerical difficulties, due to the strong anisotropy in the former, and the presence of dispersive waves in the latter. In this presentation, we will discuss progress in our fully implicit algorithmic formulation towards the inclusion of both these elements. L. Chacón, Phys. Plasmas, 15, 056103 (2008); L. Chacón, J. Physics: Conf. Series, 125, 012041 (2008)
Explicit and implicit springback simulation in sheet metal forming using fully coupled ductile damage and distortional hardening model

NASA Astrophysics Data System (ADS)

Yetna n'jock, M.; Houssem, B.; Labergere, C.; Saanouni, K.; Zhenming, Y.

2018-05-01

Springback is an important phenomenon that accompanies the forming of metallic sheets, especially for high-strength materials. A quantitative prediction of springback becomes very important for newly developed materials with high mechanical characteristics. In this work, a numerical methodology is developed to quantify this undesirable phenomenon. This methodology is based on the use of both the explicit and implicit finite element solvers of Abaqus®. The most important ingredient of this methodology is the use of a highly predictive mechanical model. A thermodynamically-consistent, non-associative and fully anisotropic elastoplastic constitutive model strongly coupled with isotropic ductile damage and accounting for distortional hardening is then used. An algorithm for local integration of the complete set of the constitutive equations is developed. This algorithm considers the rotated frame formulation (RFF) to ensure the incremental objectivity of the model in the framework of finite strains. This algorithm is implemented in both the explicit (Abaqus/Explicit®) and implicit (Abaqus/Standard®) solvers of Abaqus® through the user routines VUMAT and UMAT, respectively. The implicit solver of Abaqus® has been used to study springback, as it is generally a quasi-static unloading. In order to compare the methods' efficiency, the explicit method (Dynamic Relaxation Method) proposed by Rayleigh has also been used for springback prediction. The results obtained within the U draw/bending benchmark are studied, discussed and compared with experimental results as reference. Finally, the purpose of this work is to evaluate the reliability of different methods to efficiently predict springback in sheet metal forming.
Modified conjugate gradient method for diagonalizing large matrices.

PubMed

Jie, Quanlin; Liu, Dunhuan

2003-11-01

We present an iterative method to diagonalize large matrices. The basic idea is the same as the conjugate gradient (CG) method, i.e., minimizing the Rayleigh quotient via its gradient and avoiding reintroducing errors to the directions of previous gradients. Each iteration step finds the lowest eigenvector of the matrix in a subspace spanned by the current trial vector and the corresponding gradient of the Rayleigh quotient, as well as some previous trial vectors. The gradient, together with the previous trial vectors, plays a role similar to that of the conjugate gradient in the original CG algorithm. Our numerical tests indicate that this method converges significantly faster than the original CG method, while the computational cost of one iteration step is about the same as in the original CG method. It is suitable for first-principles calculations.
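The subspace step described in this abstract can be illustrated with a minimal Python sketch. It is a deliberate simplification and not the authors' algorithm: the subspace here contains only the current trial vector and the Rayleigh-quotient gradient (no previous trial vectors), so it behaves like steepest descent with an exact subspace minimization. The function names, tolerance and iteration limit are assumptions chosen for illustration.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def lowest_eigenpair(A, x0, tol=1e-10, max_iter=500):
    """Approximate the lowest eigenpair of a real symmetric matrix A.

    Each step minimizes the Rayleigh quotient over span{x, g}, where g is
    the (normalized) residual A x - rho x, i.e. the gradient direction.
    """
    n0 = math.sqrt(dot(x0, x0))
    x = [c / n0 for c in x0]
    rho = dot(x, matvec(A, x))
    for _ in range(max_iter):
        Ax = matvec(A, x)
        rho = dot(x, Ax)
        g = [axi - rho * xi for axi, xi in zip(Ax, x)]  # orthogonal to x
        gnorm = math.sqrt(dot(g, g))
        if gnorm < tol:
            break
        g = [gi / gnorm for gi in g]
        Ag = matvec(A, g)
        b = dot(x, Ag)
        c = dot(g, Ag)
        # Lowest eigenvalue of the projected 2x2 matrix [[rho, b], [b, c]].
        lam = 0.5 * (rho + c) - math.sqrt(0.25 * (rho - c) ** 2 + b * b)
        # Its eigenvector is (b, lam - rho) in the basis {x, g}.
        y = [b * xi + (lam - rho) * gi for xi, gi in zip(x, g)]
        ynorm = math.sqrt(dot(y, y))
        x = [yi / ynorm for yi in y]
    return rho, x
```

Adding one or more previous trial vectors to the subspace, as the abstract describes, would replace the closed-form 2x2 solve with a slightly larger projected eigenproblem.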
Color constancy by characterization of illumination chromaticity

NASA Astrophysics Data System (ADS)

Nikkanen, Jarno T.

2011-05-01

Computational color constancy algorithms play a key role in achieving desired color reproduction in digital cameras. Failure to estimate illumination chromaticity correctly will result in an invalid overall color cast in the image that will be easily detected by human observers. A new algorithm is presented for computational color constancy. Low computational complexity and low memory requirement make the algorithm suitable for resource-limited camera devices, such as consumer digital cameras and camera phones. Operation of the algorithm relies on characterization of the range of possible illumination chromaticities in terms of camera sensor response. The fact that only illumination chromaticity is characterized instead of the full color gamut, for example, increases robustness against variations in sensor characteristics and against failure of the diagonal model of illumination change. Multiple databases are used in order to demonstrate the good performance of the algorithm in comparison to the state-of-the-art color constancy algorithms.

Group implicit concurrent algorithms in nonlinear structural dynamics

NASA Technical Reports Server (NTRS)

Ortiz, M.; Sotelino, E. D.

1989-01-01

During the 70's and 80's, considerable effort was devoted to developing efficient and reliable time stepping procedures for transient structural analysis. Mathematically, the equations governing this type of problem are generally stiff, i.e., they exhibit a wide spectrum in the linear range. The algorithms best suited to this type of application are those which accurately integrate the low frequency content of the response without necessitating the resolution of the high frequency modes. This means that the algorithms must be unconditionally stable, which in turn rules out explicit integration. The most exciting possibility in the algorithms development area in recent years has been the advent of parallel computers with multiprocessing capabilities. This work is therefore mainly concerned with the development of parallel algorithms in the area of structural dynamics. A primary objective is to devise unconditionally stable and accurate time stepping procedures which lend themselves to an efficient implementation in concurrent machines. Some features of the new computer architecture are summarized. A brief survey of current efforts in the area is presented. A new class of concurrent procedures, or Group Implicit (GI) algorithms, is introduced and analyzed. The numerical simulation shows that GI algorithms hold considerable promise for application in coarse grain as well as medium grain parallel computers.
A Formal Framework for the Analysis of Algorithms That Recover From Loss of Separation

NASA Technical Reports Server (NTRS)

Butler, Ricky W.; Munoz, Cesar A.

2008-01-01

We present a mathematical framework for the specification and verification of state-based conflict resolution algorithms that recover from loss of separation. In particular, we propose rigorous definitions of horizontal and vertical maneuver correctness that yield horizontal and vertical separation, respectively, in a bounded amount of time. We also provide sufficient conditions for independent correctness, i.e., separation under the assumption that only one aircraft maneuvers, and for implicitly coordinated correctness, i.e., separation under the assumption that both aircraft maneuver. An important benefit of this approach is that different aircraft can execute different algorithms and implicit coordination will still be achieved, as long as they all meet the explicit criteria of the framework. Towards this end we have sought to make the criteria as general as possible. The framework presented in this paper has been formalized and mechanically verified in the Prototype Verification System (PVS).
Robust Integration Schemes for Generalized Viscoplasticity with Internal-State Variables

NASA Technical Reports Server (NTRS)

Saleeb, Atef F.; Li, W.; Wilt, Thomas E.

1997-01-01

The scope of the work in this presentation focuses on the development of algorithms for the integration of rate dependent constitutive equations. In view of their robustness; i.e., their superior stability and convergence properties for isotropic and anisotropic coupled viscoplastic-damage models, implicit integration schemes have been selected. This is the simplest in its class and is one of the most widely used implicit integrators at present.
Time integration algorithms for the two-dimensional Euler equations on unstructured meshes

NASA Technical Reports Server (NTRS)

Slack, David C.; Whitaker, D. L.; Walters, Robert W.

1994-01-01

Explicit and implicit time integration algorithms for the two-dimensional Euler equations on unstructured grids are presented. Both cell-centered and cell-vertex finite volume upwind schemes utilizing Roe's approximate Riemann solver are developed. For the cell-vertex scheme, a four-stage Runge-Kutta time integration, a four-stage Runge-Kutta time integration with implicit residual averaging, a point Jacobi method, a symmetric point Gauss-Seidel method and two methods utilizing preconditioned sparse matrix solvers are presented. For the cell-centered scheme, a Runge-Kutta scheme, an implicit tridiagonal relaxation scheme modeled after line Gauss-Seidel, a fully implicit lower-upper (LU) decomposition, and a hybrid scheme utilizing both Runge-Kutta and LU methods are presented. A reverse Cuthill-McKee renumbering scheme is employed for the direct solver to decrease CPU time by reducing the fill of the Jacobian matrix. A comparison of the various time integration schemes is made for both first-order and higher order accurate solutions using several mesh sizes. Higher order accuracy is achieved by using multidimensional monotone linear reconstruction procedures. The results obtained for a transonic flow over a circular arc suggest that the preconditioned sparse matrix solvers perform better than the other methods as the number of elements in the mesh increases.
Multi-Target Angle Tracking Algorithm for Bistatic MIMO Radar Based on the Elements of the Covariance Matrix

PubMed Central

Zhang, Zhengyan; Zhang, Jianyun; Zhou, Qingsong; Li, Xiaobo

2018-01-01

In this paper, we consider the problem of tracking the direction of arrivals (DOA) and the direction of departure (DOD) of multiple targets for bistatic multiple-input multiple-output (MIMO) radar. A high-precision tracking algorithm for target angle is proposed. First, the linear relationship between the covariance matrix difference and the angle difference of the adjacent moment was obtained through three approximate relations. Then, the proposed algorithm obtained the relationship between the elements in the covariance matrix difference. On this basis, the performance of the algorithm was improved by averaging the covariance matrix element. Finally, the least square method was used to estimate the DOD and DOA. The algorithm realized the automatic correlation of the angle and provided better performance when compared with the adaptive asymmetric joint diagonalization (AAJD) algorithm. The simulation results demonstrated the effectiveness of the proposed algorithm. The algorithm provides the technical support for the practical application of MIMO radar. PMID:29518957
Multi-Target Angle Tracking Algorithm for Bistatic Multiple-Input Multiple-Output (MIMO) Radar Based on the Elements of the Covariance Matrix.

PubMed

Zhang, Zhengyan; Zhang, Jianyun; Zhou, Qingsong; Li, Xiaobo

2018-03-07

In this paper, we consider the problem of tracking the direction of arrivals (DOA) and the direction of departure (DOD) of multiple targets for bistatic multiple-input multiple-output (MIMO) radar. A high-precision tracking algorithm for target angle is proposed. First, the linear relationship between the covariance matrix difference and the angle difference of the adjacent moment was obtained through three approximate relations. Then, the proposed algorithm obtained the relationship between the elements in the covariance matrix difference. On this basis, the performance of the algorithm was improved by averaging the covariance matrix element. Finally, the least square method was used to estimate the DOD and DOA. The algorithm realized the automatic correlation of the angle and provided better performance when compared with the adaptive asymmetric joint diagonalization (AAJD) algorithm. The simulation results demonstrated the effectiveness of the proposed algorithm. The algorithm provides the technical support for the practical application of MIMO radar.
Semi-implicit and fully implicit shock-capturing methods for hyperbolic conservation laws with stiff source terms

NASA Technical Reports Server (NTRS)

Yee, H. C.; Shinn, J. L.

1986-01-01

Some numerical aspects of finite-difference algorithms for nonlinear multidimensional hyperbolic conservation laws with stiff nonhomogeneous (source) terms are discussed. If the stiffness is entirely dominated by the source term, a semi-implicit shock-capturing method is proposed provided that the Jacobian of the source terms possesses certain properties. The proposed semi-implicit method can be viewed as a variant of the Bussing and Murman point-implicit scheme with a more appropriate numerical dissipation for the computation of strong shock waves. However, if the stiffness is not solely dominated by the source terms, a fully implicit method would be a better choice. The situation is complicated by problems that are higher than one dimension, and the presence of stiff source terms further complicates the solution procedures for alternating direction implicit (ADI) methods. Several alternatives are discussed. The primary motivation for constructing these schemes was to address thermally and chemically nonequilibrium flows in the hypersonic regime. Due to the unique structure of the eigenvalues and eigenvectors for fluid flows of this type, the computation can be simplified, thus providing a more efficient solution procedure than one might have anticipated.
Fourier-Legendre spectral methods for incompressible channel flow

NASA Technical Reports Server (NTRS)

Zang, T. A.; Hussaini, M. Y.

1984-01-01

An iterative collocation technique is described for modeling implicit viscosity in three-dimensional incompressible wall bounded shear flow. The viscosity can vary temporally and in the vertical direction. Channel flow is modeled with a Fourier-Legendre approximation and the mean streamwise advection is treated implicitly. Explicit terms are handled with an Adams-Bashforth method to increase the allowable time-step for calculation of the implicit terms. The algorithm is applied to low amplitude unstable waves in a plane Poiseuille flow at an Re of 7500. Comparisons are made between results using the Legendre method and with Chebyshev polynomials. Comparable accuracy is obtained for the perturbation kinetic energy predicted using both discretizations.
Compliant energy and momentum conservation in NEGF simulation of electron-phonon scattering in semiconductor nano-wire transistors

NASA Astrophysics Data System (ADS)

Barker, J. R.; Martinez, A.; Aldegunde, M.

2012-05-01

The modelling of spatially inhomogeneous silicon nanowire field-effect transistors has benefited from powerful simulation tools built around the Keldysh formulation of non-equilibrium Green function (NEGF) theory. The methodology is highly efficient for situations where the self-energies are diagonal (local) in space coordinates. It has thus been common practice to adopt diagonality (locality) approximations. We demonstrate here that the scattering kernel that controls the self-energies for electron-phonon interactions is generally non-local on the scale of at least a few lattice spacings (and thus within the spatial scale of features in extreme nano-transistors) and for polar optical phonon-electron interactions may be very much longer. It is shown that the diagonality approximation strongly under-estimates the scattering rates for scattering on polar optical phonons. This is an unexpected problem in silicon devices but occurs due to strong polar SO phonon-electron interactions extending into a narrow silicon channel surrounded by high kappa dielectric in wrap-round gate devices. Since dissipative inelastic scattering is already a serious problem for highly confined devices it is concluded that new algorithms need to be forthcoming to provide appropriate and efficient NEGF tools.
Ground State and Finite Temperature Lanczos Methods

NASA Astrophysics Data System (ADS)

Prelovšek, P.; Bonča, J.

The present review will focus on recent developments of exact-diagonalization (ED) methods that use the Lanczos algorithm to transform large sparse matrices into tridiagonal form. We begin with a review of the basic principles of the Lanczos method for computing ground-state static as well as dynamical properties. Next, the generalization to finite temperatures in the form of the well-established finite-temperature Lanczos method is described. The latter allows for the evaluation of static and dynamic quantities at temperatures T>0 within various correlated models. Several extensions and modifications of the latter method introduced more recently are analysed, in particular the low-temperature Lanczos method and the microcanonical Lanczos method, especially applicable within the high-T regime. In order to overcome the problems of exponentially growing Hilbert spaces that prevent ED calculations on larger lattices, different approaches based on Lanczos diagonalization within a reduced basis have been developed. In this context, a recently developed method based on ED within a limited functional space is reviewed. Finally, we briefly discuss the real-time evolution of correlated systems far from equilibrium, which can be simulated using the ED and Lanczos-based methods, as well as approaches based on the diagonalization in a reduced basis.
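As a rough illustration of the tridiagonalization step mentioned in this abstract, the following Python sketch runs the basic three-term Lanczos recurrence on a small dense symmetric matrix. It is only a sketch under stated assumptions: the function names are hypothetical, no reorthogonalization is performed (production codes typically need some safeguard against loss of orthogonality), and a real ED code would apply a sparse Hamiltonian instead of a dense list-of-rows matrix.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def lanczos(A, v0, m):
    """Run m Lanczos steps on a real symmetric matrix A.

    Returns (alpha, beta): the diagonal and off-diagonal entries of the
    tridiagonal matrix T that represents A in the Krylov basis.
    """
    norm0 = math.sqrt(sum(c * c for c in v0))
    v = [c / norm0 for c in v0]
    v_prev = [0.0] * len(v0)
    alpha, beta = [], []
    b = 0.0
    for j in range(m):
        w = matvec(A, v)
        # Remove the component along the previous Lanczos vector.
        w = [wi - b * pi for wi, pi in zip(w, v_prev)]
        a = sum(wi * vi for wi, vi in zip(w, v))
        # Remove the component along the current Lanczos vector.
        w = [wi - a * vi for wi, vi in zip(w, v)]
        alpha.append(a)
        b = math.sqrt(sum(wi * wi for wi in w))
        if j == m - 1 or b < 1e-12:  # finished, or invariant subspace found
            break
        beta.append(b)
        v_prev, v = v, [wi / b for wi in w]
    return alpha, beta
```

The lowest eigenvalue of the small tridiagonal matrix defined by `alpha` and `beta` then approximates the ground-state energy, which is the quantity the ground-state Lanczos method targets.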
Optimal implicit 2-D finite differences to model wave propagation in poroelastic media

NASA Astrophysics Data System (ADS)

Itzá, Reymundo; Iturrarán-Viveros, Ursula; Parra, Jorge O.

2016-08-01

Numerical modeling of seismic waves in heterogeneous porous reservoir rocks is an important tool for the interpretation of seismic surveys in reservoir engineering. We apply globally optimal implicit staggered-grid finite differences (FD) to model 2-D wave propagation in heterogeneous poroelastic media at a low-frequency range (<10 kHz). We validate the numerical solution by comparing it to an analytical-transient solution, obtaining clear seismic wavefields including fast P and slow P and S waves (for a porous medium saturated with fluid). The numerical dispersion and stability conditions are derived using von Neumann analysis, showing that over a wide range of porous materials the Courant condition governs the stability and this optimal implicit scheme improves the stability of explicit schemes. High-order explicit FD can be replaced by some lower order optimal implicit FD so that the computational cost will not be as expensive while maintaining the accuracy. Here, we compute weights for the optimal implicit FD scheme to attain an accuracy of γ = 10^-8. The implicit spatial differentiation involves solving tridiagonal linear systems of equations through Thomas' algorithm.
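The Thomas algorithm named at the end of this abstract is the standard forward-elimination and back-substitution solver for tridiagonal systems. A minimal Python sketch is given below; the function name and argument layout are assumptions for illustration, and the routine performs no pivoting, so it presumes a well-conditioned system (for example a diagonally dominant one).

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal linear system in O(n) operations.

    lower[i], diag[i], upper[i] are the sub-, main- and super-diagonal
    entries of row i; lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

For example, `thomas_solve([0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0], [0.0, 0.0, 4.0])` solves a 3x3 system whose exact solution is (1, 2, 3).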
Recent advances in nonlinear implicit, electrostatic particle-in-cell (PIC) algorithms

NASA Astrophysics Data System (ADS)

Chen, Guangye; Chacón, Luis; Barnes, Daniel

2012-10-01

An implicit 1D electrostatic PIC algorithm [1] has been developed that satisfies exact energy and charge conservation. The algorithm employs a kinetic-enslaved Jacobian-free Newton-Krylov method [1] that ensures nonlinear convergence while taking timesteps comparable to the dynamical timescale of interest. Here we present two main improvements of the algorithm. The first is the formulation of a preconditioner based on linearized fluid equations, which are closed using available particle information. The computational benefit is that solving the fluid system is much cheaper than the kinetic one. The effectiveness of the preconditioner in accelerating nonlinear iterations on challenging problems will be demonstrated. A second improvement is the generalization of Ref. 1 to curvilinear meshes [2], with a hybrid particle update of positions and velocities in logical and physical space, respectively [3]. The curvilinear algorithm remains exactly charge and energy-conserving, and can be extended to multiple dimensions. We demonstrate the accuracy and efficiency of the algorithm with a 1D ion-acoustic shock wave simulation. [1] Chen, Chacón, Barnes, J. Comput. Phys. 230 (2011). [2] Chacón, Chen, Barnes, J. Comput. Phys., submitted (2012). [3] Swift, J. Comp. Phys., 126 (1996).
Quadratic RK shooting solution for a environmental parameter prediction boundary value problem

NASA Astrophysics Data System (ADS)

Famelis, Ioannis Th.; Tsitouras, Ch.

2014-10-01

Using tools of Information Geometry, the minimum distance between two elements of a statistical manifold is defined by the corresponding geodesic, i.e. the minimum length curve that connects them. Such a curve, where the probability distribution functions in the case of our meteorological data are two-parameter Weibull distributions, satisfies a 2nd order Boundary Value (BV) system. We study the numerical treatment of the resulting special quadratic form system using the shooting method. We compare the solutions of the problem when we employ a classical Singly Diagonally Implicit Runge-Kutta (SDIRK) 4(3) pair of methods and a quadratic SDIRK 5(3) pair. Both pairs have the same computational costs whereas the second one attains higher order as it is specially constructed for quadratic problems.
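To show what "singly diagonally implicit" means in practice, the sketch below implements a well-known two-stage, second-order SDIRK scheme (diagonal coefficient γ = 1 - √2/2) for the scalar linear test equation y' = λy. This is not the 4(3) or 5(3) pair studied in the abstract, and it is not applied to the boundary value problem there; it is an assumed stand-in chosen because each implicit stage equation can be solved in closed form, which keeps the structure visible. The function name is hypothetical.

```python
import math

# Every stage shares the same diagonal coefficient, which is what makes the
# method "singly" diagonally implicit.
GAMMA = 1.0 - math.sqrt(2.0) / 2.0

def sdirk2_linear(lam, y0, h, n_steps):
    """Integrate y' = lam * y with a two-stage, second-order SDIRK method.

    Butcher tableau:  c = (GAMMA, 1),
                      A = [[GAMMA, 0], [1 - GAMMA, GAMMA]],
                      b = (1 - GAMMA, GAMMA).
    For this linear problem each stage equation is solved exactly; a
    nonlinear problem would need a Newton-type iteration per stage.
    """
    y = y0
    for _ in range(n_steps):
        # Stage 1: k1 = lam * (y + h * GAMMA * k1)
        k1 = lam * y / (1.0 - h * GAMMA * lam)
        # Stage 2: k2 = lam * (y + h * (1 - GAMMA) * k1 + h * GAMMA * k2)
        k2 = lam * (y + h * (1.0 - GAMMA) * k1) / (1.0 - h * GAMMA * lam)
        y = y + h * ((1.0 - GAMMA) * k1 + GAMMA * k2)
    return y
```

Because both stages share the factor (1 - hγλ), a system of equations would need only one matrix factorization per step, which is the usual practical argument for SDIRK methods.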
Multi-partitioning for ADI-schemes on message passing architectures

NASA Technical Reports Server (NTRS)

Vanderwijngaart, Rob F.

1994-01-01

A kind of discrete-operator splitting called Alternating Direction Implicit (ADI) has been found to be useful in simulating fluid flow problems. In particular, it is being used to study the effects of hot exhaust jets from high performance aircraft on landing surfaces. Decomposition techniques that minimize load imbalance and message-passing frequency are described. Three strategies that are investigated for implementing the NAS Scalar Penta-diagonal Parallel Benchmark (SP) are transposition, pipelined Gaussian elimination, and multipartitioning. The multipartitioning strategy, which was used on Ethernet, was found to be the most efficient, although it was considered only a moderate success because of Ethernet's limited communication properties. The efficiency derived largely from the coarse granularity of the strategy, which reduced latencies and allowed overlap of communication and computation.
Enhancing scattering images for orientation recovery with diffusion map

DOE PAGES

Winter, Martin; Saalmann, Ulf; Rost, Jan M.

2016-02-12

We explore the possibility for orientation recovery in single-molecule coherent diffractive imaging with diffusion map. This algorithm approximates the Laplace-Beltrami operator, which we diagonalize with a metric that corresponds to the mapping of Euler angles onto scattering images. While suitable for images of objects with specific properties we show why this approach fails for realistic molecules. Here, we introduce a modification of the form factor in the scattering images which facilitates the orientation recovery and should be suitable for all recovery algorithms based on the distance of individual images. (C) 2016 Optical Society of America
An experimental SMI adaptive antenna array simulator for weak interfering signals

NASA Technical Reports Server (NTRS)

Dilsavor, Ronald S.; Gupta, Inder J.

1991-01-01

An experimental sample matrix inversion (SMI) adaptive antenna array for suppressing weak interfering signals is described. The experimental adaptive array uses a modified SMI algorithm to increase the interference suppression. In the modified SMI algorithm, the sample covariance matrix is redefined to reduce the effect of thermal noise on the weights of an adaptive array. This is accomplished by subtracting a fraction of the smallest eigenvalue of the original covariance matrix from its diagonal entries. The test results obtained using the experimental system are compared with theoretical results. The two show a good agreement.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chacon, Luis; Stanier, Adam John

Here, we demonstrate a scalable fully implicit algorithm for the two-field low-β extended MHD model. This reduced model describes plasma behavior in the presence of strong guide fields, and is of significant practical impact both in nature and in laboratory plasmas. The model displays strong hyperbolic behavior, as manifested by the presence of fast dispersive waves, which make a fully implicit treatment very challenging. In this study, we employ a Jacobian-free Newton–Krylov nonlinear solver, for which we propose a physics-based preconditioner that renders the linearized set of equations suitable for inversion with multigrid methods. As a result, the algorithm is shown to scale both algorithmically (i.e., the iteration count is insensitive to grid refinement and timestep size) and in parallel in a weak-scaling sense, with the wall-clock time scaling weakly with the number of cores for up to 4096 cores. For a 4096 × 4096 mesh, we demonstrate a wall-clock-time speedup of ~6700 with respect to explicit algorithms. The model is validated linearly (against linear theory predictions) and nonlinearly (against fully kinetic simulations), demonstrating excellent agreement.
An O(Nm^2) Plane Solver for the Compressible Navier-Stokes Equations

NASA Technical Reports Server (NTRS)

Thomas, J. L.; Bonhaus, D. L.; Anderson, W. K.; Rumsey, C. L.; Biedron, R. T.

1999-01-01

A hierarchical multigrid algorithm for efficient steady solutions to the two-dimensional compressible Navier-Stokes equations is developed and demonstrated. The algorithm applies multigrid in two ways: a Full Approximation Scheme (FAS) for a nonlinear residual equation and a Correction Scheme (CS) for a linearized defect correction implicit equation. Multigrid analyses which include the effect of boundary conditions in one direction are used to estimate the convergence rate of the algorithm for a model convection equation. Three alternating-line-implicit algorithms are compared in terms of efficiency. The analyses indicate that full multigrid efficiency is not attained in the general case; the number of cycles to attain convergence is dependent on the mesh density for high-frequency cross-stream variations. However, the dependence is reasonably small and fast convergence is eventually attained for any given frequency with either the FAS or the CS scheme alone. The paper summarizes numerical computations for which convergence has been attained to within truncation error in a few multigrid cycles for both inviscid and viscous flow simulations on highly stretched meshes.
Progress report on PIXIE3D, a fully implicit 3D extended MHD solver

NASA Astrophysics Data System (ADS)

Chacon, Luis

2008-11-01

Recently (invited talk at DPP07), an optimal, massively parallel implicit algorithm for 3D resistive magnetohydrodynamics (PIXIE3D) was demonstrated. Excellent algorithmic and parallel results were obtained with up to 4096 processors and 138 million unknowns. While this is a remarkable result, further developments are still needed for PIXIE3D to become a 3D extended MHD production code in general geometries. In this poster, we present an update on the status of PIXIE3D on several fronts. On the physics side, we will describe our progress towards the full Braginskii model, including electron Hall terms, anisotropic heat conduction, and gyroviscous corrections. Algorithmically, we will discuss progress towards a robust, optimal, nonlinear solver for arbitrary geometries, including preconditioning for the new physical effects described, the implementation of a coarse processor-grid solver (to maintain optimal algorithmic performance for an arbitrarily large number of processors in massively parallel computations), and of a multiblock capability to deal with complicated geometries. L. Chacón, Phys. Plasmas 15, 056103 (2008).
A threshold-based fixed predictor for JPEG-LS image compression

NASA Astrophysics Data System (ADS)

Deng, Lihua; Huang, Zhenghua; Yao, Shoukui

2018-03-01

In JPEG-LS, the fixed predictor based on the median edge detector (MED) detects only horizontal and vertical edges, and thus produces large prediction errors in the vicinity of diagonal edges. In this paper, we propose a threshold-based edge detection scheme for the fixed predictor. The proposed scheme can detect not only horizontal and vertical edges, but also diagonal edges. For certain thresholds, the proposed scheme reduces to other existing schemes, so it can also be regarded as an integration of these existing schemes. For a suitable threshold, the accuracy of horizontal and vertical edge detection is higher than that of the existing median edge detection in JPEG-LS. Thus, the proposed fixed predictor outperforms the existing JPEG-LS predictors for all images tested, while the complexity of the overall algorithm is maintained at a similar level.
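For reference, the baseline MED predictor that this abstract sets out to improve can be sketched as follows. This is the standard JPEG-LS fixed predictor only; the threshold-based scheme proposed by the authors is not reproduced here, and the function and argument names are illustrative.

```python
def med_predict(left, above, upper_left):
    """Median edge detector (MED) prediction from three causal neighbours."""
    if upper_left >= max(left, above):
        # Edge detected: predict with the smaller of the two neighbours.
        return min(left, above)
    if upper_left <= min(left, above):
        # Edge detected in the other sense: predict with the larger neighbour.
        return max(left, above)
    # No edge detected: planar prediction.
    return left + above - upper_left
```

The three branches only distinguish horizontal edges, vertical edges, and smooth regions, which is consistent with the abstract's remark that diagonal edges are not handled by this predictor.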

Off-diagonal expansion quantum Monte Carlo

NASA Astrophysics Data System (ADS)

Albash, Tameem; Wagenbreth, Gene; Hen, Itay

2017-12-01

We propose a Monte Carlo algorithm designed to simulate quantum as well as classical systems at equilibrium, bridging the algorithmic gap between quantum and classical thermal simulation algorithms. The method is based on a decomposition of the quantum partition function that can be viewed as a series expansion about its classical part. We argue that the algorithm not only provides a theoretical advancement in the field of quantum Monte Carlo simulations, but is optimally suited to tackle quantum many-body systems that exhibit a range of behaviors from "fully quantum" to "fully classical," in contrast to many existing methods. We demonstrate the advantages, sometimes by orders of magnitude, of the technique by comparing it against existing state-of-the-art schemes such as path integral quantum Monte Carlo and stochastic series expansion. We also illustrate how our method allows for the unification of quantum and classical thermal parallel tempering techniques into a single algorithm and discuss its practical significance.
On a fourth order accurate implicit finite difference scheme for hyperbolic conservation laws. II - Five-point schemes

NASA Technical Reports Server (NTRS)

Harten, A.; Tal-Ezer, H.

1981-01-01

This paper presents a family of two-level five-point implicit schemes for the solution of one-dimensional systems of hyperbolic conservation laws, which generalize the Crank-Nicolson scheme to fourth order accuracy (4-4) in both time and space. These 4-4 schemes are nondissipative and unconditionally stable. Special attention is given to the system of linear equations associated with these 4-4 implicit schemes. The regularity of this system is analyzed and the efficiency of solution algorithms is examined. A two-datum representation of these 4-4 implicit schemes brings about a compactification of the stencil to three mesh points at each time level. This compact two-datum representation is particularly useful in deriving boundary treatments. Numerical results are presented to illustrate some properties of the proposed scheme.
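The terms "nondissipative" and "unconditionally stable" can be made concrete with the standard second-order Crank-Nicolson scheme that the 4-4 family generalizes. The sketch below is for the scalar model equation u_t + a u_x = 0 with centered differences, not the five-point schemes of the paper; the function name and the sampled Courant numbers are assumptions chosen for illustration.

```python
import math

def crank_nicolson_amplification(courant, theta):
    """Von Neumann amplification factor G for u_t + a u_x = 0.

    courant = a * dt / dx, theta = wavenumber * dx. With centered space
    differences, G = (1 - z) / (1 + z) where z = (i * courant / 2) * sin(theta).
    """
    z = 0.5j * courant * math.sin(theta)
    return (1 - z) / (1 + z)

# Sample small, moderate, and very large Courant numbers.
samples = [crank_nicolson_amplification(nu, k * math.pi / 8)
           for nu in (0.5, 2.0, 50.0) for k in range(9)]
max_deviation = max(abs(abs(g) - 1.0) for g in samples)
```

Because z is purely imaginary, the numerator and denominator have the same modulus, so |G| = 1 for every Courant number: no Fourier mode is damped or amplified, whatever the time step.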
Large scale shell model study of the evolution of mixed-symmetry states in chains of nuclei around 132Sn

NASA Astrophysics Data System (ADS)

Lo Iudice, N.; Bianco, D.; Andreozzi, F.; Porrino, A.; Knapp, F.

2012-10-01

Large scale shell model calculations based on a new diagonalization algorithm are performed in order to investigate the mixed symmetry states in chains of nuclei in the proximity of N=82. The resulting spectra and transitions are in agreement with the experiments and consistent with the scheme provided by the interacting boson model.
Improving the Numerical Stability of Fast Matrix Multiplication

DOE PAGES

Ballard, Grey; Benson, Austin R.; Druinsky, Alex; ...

2016-10-04

Fast algorithms for matrix multiplication, namely those that perform asymptotically fewer scalar operations than the classical algorithm, have been considered primarily of theoretical interest. Apart from Strassen's original algorithm, few fast algorithms have been efficiently implemented or used in practical applications. However, there exist many practical alternatives to Strassen's algorithm with varying performance and numerical properties. Fast algorithms are known to be numerically stable, but because their error bounds are slightly weaker than the classical algorithm, they are not used even in cases where they provide a performance benefit. We argue in this study that the numerical sacrifice of fast algorithms, particularly for the typical use cases of practical algorithms, is not prohibitive, and we explore ways to improve the accuracy both theoretically and empirically. The numerical accuracy of fast matrix multiplication depends on properties of the algorithm and of the input matrices, and we consider both contributions independently. We generalize and tighten previous error analyses of fast algorithms and compare their properties. We discuss algorithmic techniques for improving the error guarantees from two perspectives: manipulating the algorithms, and reducing input anomalies by various forms of diagonal scaling. In conclusion, we benchmark performance and demonstrate our improved numerical accuracy.
Numerical Simulation of the Interaction of a Vortex with Stationary Airfoil in Transonic Flow

DTIC Science & Technology

1984-01-12

Goorjian, P. M., "Implicit Vortex Wakes ," AIAA Journal, Vol. 15, No. 4, April Finite- Difference Computations of Unsteady Transonic 1977, pp. 581-590... Difference Simulations of Three- tion of Wing- Vortex Interaction in Transonic Flow Dimensional Flow," AIAA Journal, Vol. 18, No. 2, Using Implicit...assumptions are made in p = density modeling the nonlinear vortex wake structure. Numerical algorithms based on the Euler equations p_ = free stream density
Fully implicit moving mesh adaptive algorithm

NASA Astrophysics Data System (ADS)

Serazio, C.; Chacon, L.; Lapenta, G.

2006-10-01

In many problems of interest, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. The former is best handled with fully implicit methods, which are able to step over fast frequencies to resolve the dynamical time scale of interest. The latter requires grid adaptivity for efficiency. Moving-mesh grid adaptive methods are attractive because they can be designed to minimize the numerical error for a given resolution. However, the required grid governing equations are typically very nonlinear and stiff, and considerably difficult to treat numerically. Not surprisingly, fully coupled, implicit approaches where the grid and the physics equations are solved simultaneously are rare in the literature, and circumscribed to 1D geometries. In this study, we present a fully implicit algorithm for moving mesh methods that is feasible for multidimensional geometries. Crucial elements are the development of an effective multilevel treatment of the grid equation, and a robust, rigorous error estimator. For the latter, we explore the effectiveness of a coarse grid correction error estimator, which faithfully reproduces spatial truncation errors for conservative equations. We will show that the moving mesh approach is competitive vs. uniform grids both in accuracy (due to adaptivity) and efficiency. Results for a variety of models in 1D and 2D geometries will be presented. L. Chacón, G. Lapenta, J. Comput. Phys., 212 (2), 703 (2006); G. Lapenta, L. Chacón, J. Comput. Phys., accepted (2006).
Fully Implicit, Nonlinear 3D Extended Magnetohydrodynamics

NASA Astrophysics Data System (ADS)

Chacon, Luis; Knoll, Dana

2003-10-01

Extended magnetohydrodynamics (XMHD) includes nonideal effects such as nonlinear, anisotropic transport and two-fluid (Hall) effects. XMHD supports multiple, separate time scales that make explicit time differencing approaches extremely inefficient. While a fully implicit implementation promises efficiency without sacrificing numerical accuracy (D. A. Knoll et al., J. Comput. Phys. 185 (2), 583-611 (2003)), the nonlinear nature of the XMHD system and the numerical stiffness associated with the fast waves make this endeavor difficult. Newton-Krylov methods are, however, ideally suited for such a task. These synergistically combine Newton's method for nonlinear convergence, and Krylov techniques to solve the associated Jacobian (linear) systems. Krylov methods can be implemented Jacobian-free and can be preconditioned for efficiency. Successful preconditioning strategies have been developed for 2D incompressible resistive (L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002)) and Hall (L. Chacón and D. A. Knoll, J. Comput. Phys. 188 (2), 573-592 (2003)) MHD models. These are based on "physics-based" ideas, in which knowledge of the physics is exploited to derive well-conditioned (diagonally-dominant) approximations to the original system that are amenable to optimal solver technologies (multigrid). In this work, we will describe the status of the extension of the 2D preconditioning ideas for a 3D compressible, single-fluid XMHD model.
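The statement that Krylov methods can be implemented Jacobian-free rests on the fact that they only need Jacobian-vector products, which can be approximated by a finite difference of the nonlinear residual. A minimal sketch of that one ingredient follows; the two-equation residual, the perturbation size, and the function names are hypothetical choices for illustration and have nothing to do with the XMHD system itself.

```python
def residual(x):
    """Hypothetical nonlinear residual F(x) for a two-unknown system."""
    return [x[0] ** 2 + x[1] - 3.0,
            x[0] + x[1] ** 3 - 5.0]

def jacobian_free_matvec(F, x, v, eps=1e-7):
    """Approximate J(x) v by (F(x + eps * v) - F(x)) / eps, without forming J."""
    base = F(x)
    perturbed = F([xi + eps * vi for xi, vi in zip(x, v)])
    return [(fp - fb) / eps for fp, fb in zip(perturbed, base)]

def analytic_matvec(x, v):
    """Exact J(x) v for the residual above, used only as a check."""
    return [2.0 * x[0] * v[0] + v[1],
            v[0] + 3.0 * x[1] ** 2 * v[1]]

x0 = [1.0, 2.0]
direction = [0.5, -1.0]
approx = jacobian_free_matvec(residual, x0, direction)
exact = analytic_matvec(x0, direction)
```

A Krylov solver such as GMRES would call `jacobian_free_matvec` once per iteration; the preconditioner discussed in the abstract is a separate ingredient that this sketch does not attempt to show.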
An outlet breaching algorithm for the treatment of closed depressions in a raster DEM

NASA Astrophysics Data System (ADS)

Martz, Lawrence W.; Garbrecht, Jurgen

1999-08-01

Automated drainage analysis of raster DEMs typically begins with the simulated filling of all closed depressions and the imposition of a drainage pattern on the resulting flat areas. The elimination of closed depressions by filling implicitly assumes that all depressions are caused by elevation underestimation. This assumption is difficult to support, as depressions can be produced by overestimation as well as by underestimation of DEM values. This paper presents a new algorithm that is applied in conjunction with conventional depression filling to provide a more realistic treatment of those depressions that are likely due to overestimation errors. The algorithm lowers the elevation of selected cells on the edge of closed depressions to simulate breaching of the depression outlets. Application of this breaching algorithm prior to depression filling can substantially reduce the number and size of depressions that need to be filled, especially in low relief terrain. Removing or reducing the size of a depression by breaching implicitly assumes that the depression is due to a spurious flow blockage caused by elevation overestimation. Removing a depression by filling, on the other hand, implicitly assumes that the depression is a direct artifact of elevation underestimation. Although the breaching algorithm cannot distinguish between overestimation and underestimation errors in a DEM, a constraining parameter for breaching length can be used to restrict breaching to closed depressions caused by narrow blockages along well-defined drainage courses. These are considered the depressions most likely to have arisen from overestimation errors. Applying the constrained breaching algorithm prior to a conventional depression-filling algorithm allows both positive and negative elevation adjustments to be used to remove depressions. The breaching algorithm was incorporated into the DEM pre-processing operations of the TOPAZ software system.
The effect of the algorithm is illustrated by the application of TOPAZ to a DEM of a low-relief landscape. The use of the breaching algorithm during DEM pre-processing substantially reduced the number of cells that needed to be subsequently raised in elevation to remove depressions. The number and kind of depression cells that were eliminated by the breaching algorithm suggested that the algorithm effectively targeted those topographic situations for which it was intended. A detailed inspection of a portion of the DEM that was processed using the breaching algorithm in conjunction with depression-filling also suggested the effects of the algorithm were as intended. The breaching algorithm provides an empirically satisfactory and robust approach to treating closed depressions in a raster DEM. It recognises that depressions in certain topographic settings are as likely to be due to elevation overestimation as to elevation underestimation errors. The algorithm allows a more realistic treatment of depressions in these situations than conventional methods that rely solely on depression-filling.
A Numerical Model for Trickle Bed Reactors

NASA Astrophysics Data System (ADS)

Propp, Richard M.; Colella, Phillip; Crutchfield, William Y.; Day, Marcus S.

2000-12-01

Trickle bed reactors are governed by equations of flow in porous media such as Darcy's law and the conservation of mass. Our numerical method for solving these equations is based on a total-velocity splitting, sequential formulation which leads to an implicit pressure equation and a semi-implicit mass conservation equation. We use high-resolution finite-difference methods to discretize these equations. Our solution scheme extends previous work in modeling porous media flows in two ways. First, we incorporate physical effects due to capillary pressure, a nonlinear inlet boundary condition, spatial porosity variations, and inertial effects on phase mobilities. In particular, capillary forces introduce a parabolic component into the recast evolution equation, and the inertial effects give rise to hyperbolic nonconvexity. Second, we introduce a modification of the slope-limiting algorithm to prevent our numerical method from producing spurious shocks. We present a numerical algorithm for accommodating these difficulties, show the algorithm is second-order accurate, and demonstrate its performance on a number of simplified problems relevant to trickle bed reactor modeling.
An Implicit LU/AF FDTD Method

NASA Technical Reports Server (NTRS)

Beggs, John H.; Briley, W. Roger

2001-01-01

There has been some recent work to develop two- and three-dimensional alternating direction implicit (ADI) FDTD schemes. These ADI schemes are based upon the original ADI concept developed by Peaceman and Rachford and Douglas and Gunn, which is a popular solution method in Computational Fluid Dynamics (CFD). These ADI schemes work well and they require solution of a tridiagonal system of equations. A new approach proposed in this paper applies an LU/AF approximate factorization technique from CFD to Maxwell's equations in flux conservative form for one space dimension. The result is a scheme that will retain its unconditional stability in three space dimensions, but does not require the solution of tridiagonal systems. The theory for this new algorithm is outlined in a one-dimensional context for clarity. An extension to two- and three-dimensional cases is discussed. Results of Fourier analysis are discussed for both stability and dispersion/damping properties of the algorithm. Results are presented for a one-dimensional model problem, and the explicit FDTD algorithm is chosen as a convenient reference for comparison.
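The tridiagonal solves that ADI schemes require, and that the LU/AF approach is said to avoid, are usually done with the Thomas algorithm. The sketch below shows that standard algorithm on a small made-up system; the storage convention, the coefficients, and the names are assumptions for illustration and are not taken from the paper.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    Row i reads lower[i] * x[i-1] + diag[i] * x[i] + upper[i] * x[i+1] = rhs[i];
    lower[0] and upper[-1] are ignored. No pivoting is performed, so the
    matrix is assumed to be diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Hypothetical implicit line: -0.5 * x[i-1] + 2 * x[i] - 0.5 * x[i+1] = rhs[i]
lower = [0.0, -0.5, -0.5, -0.5]
diag = [2.0, 2.0, 2.0, 2.0]
upper = [-0.5, -0.5, -0.5, 0.0]
rhs = [1.0, 2.0, 3.0, 6.5]
solution = thomas_solve(lower, diag, upper, rhs)
```

The cost is linear in the number of unknowns, but the forward and backward passes are inherently sequential along each line.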
Fast preconditioned multigrid solution of the Euler and Navier-Stokes equations for steady, compressible flows

NASA Astrophysics Data System (ADS)

Caughey, David A.; Jameson, Antony

2003-10-01

New versions of implicit algorithms are developed for the efficient solution of the Euler and Navier-Stokes equations of compressible flow. The methods are based on a preconditioned, lower-upper (LU) implementation of a non-linear, symmetric Gauss-Seidel (SGS) algorithm for use as a smoothing algorithm in a multigrid method. Previously, this method had been implemented for flows in quasi-one-dimensional ducts and for two-dimensional flows past airfoils on boundary-conforming O-type grids for a variety of symmetric limited positive (SLIP) spatial approximations, including the scalar dissipation and convective upwind split pressure (CUSP) schemes. Here results are presented for both inviscid and viscous (laminar) flows past airfoils on boundary-conforming C-type grids. The method is significantly faster than earlier explicit or implicit methods for inviscid problems, allowing solution of these problems to the level of truncation error in three to five multigrid cycles. Viscous solutions still require as many as twenty multigrid cycles.
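The idea of a symmetric Gauss-Seidel sweep (a forward pass followed by a backward pass over the unknowns) can be sketched on a small linear system. This is only the linear analogue: the paper applies a nonlinear, preconditioned LU-SGS variant as a multigrid smoother for the flow equations, which is not reproduced here. The model matrix, right-hand side, sweep count, and names are illustrative assumptions.

```python
def sgs_sweep(A, b, x):
    """One symmetric Gauss-Seidel sweep (forward then backward), updating x in place."""
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diag) / A[i][i]
    return x

def residual_norm(A, b, x):
    """Maximum-norm residual of A x = b."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n)))
               for i in range(n))

# Hypothetical model problem: 1D Laplacian-like tridiagonal matrix.
n = 5
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
      for j in range(n)] for i in range(n)]
b = [1.0] * n
x = [0.0] * n
initial_residual = residual_norm(A, b, x)
for _ in range(200):
    sgs_sweep(A, b, x)
final_residual = residual_norm(A, b, x)
```

Used alone, as here, the sweep is a slow iterative solver; in a multigrid setting only a few sweeps per level are applied, because their role is to damp high-frequency error rather than to converge.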
A fast efficient implicit scheme for the gasdynamic equations using a matrix reduction technique

NASA Technical Reports Server (NTRS)

Barth, T. J.; Steger, J. L.

1985-01-01

An efficient implicit finite-difference algorithm for the gasdynamic equations utilizing matrix reduction techniques is presented. A significant reduction in arithmetic operations is achieved without loss of the stability characteristics and generality found in the Beam and Warming approximate factorization algorithm. Steady-state solutions to the conservative Euler equations in generalized coordinates are obtained for transonic flows and used to show that the method offers computational advantages over the conventional Beam and Warming scheme. Existing Beam and Warming codes can be retrofitted with minimal effort. The theoretical extension of the matrix reduction technique to the full Navier-Stokes equations in Cartesian coordinates is presented in detail. Linear stability, using a Fourier stability analysis, is demonstrated and discussed for the one-dimensional Euler equations.
Impact of the Parameter Identification of Plastic Potentials on the Finite Element Simulation of Sheet Metal Forming

NASA Astrophysics Data System (ADS)

Rabahallah, M.; Bouvier, S.; Balan, T.; Bacroix, B.; Teodosiu, C.

2007-04-01

In this work, an implicit, backward Euler time integration scheme is developed for an anisotropic, elastic-plastic model based on strain-rate potentials. The constitutive algorithm includes a sub-stepping procedure to deal with the strong nonlinearity of the plastic potentials when applied to FCC materials. The algorithm is implemented in the static implicit version of the Abaqus finite element code. Several recent plastic potentials have been implemented in this framework. The most accurate potentials require the identification of about twenty material parameters. Both mechanical tests and micromechanical simulations have been used for their identification, for a number of BCC and FCC materials. The impact of the identification procedure on the prediction of ears in cup drawing is investigated.
A particle tracking method for analyzing chaotic electroosmotic flow mixing in 3D microchannels with patterned charged surfaces

NASA Astrophysics Data System (ADS)

Chang, Chih-Chang; Yang, Ruey-Jen

2006-08-01

This paper presents a numerical simulation investigation into electroosmotic flow mixing in three-dimensional microchannels with patterned non-uniform surface zeta potentials. Three types of micromixers are investigated, namely a straight diagonal strip mixer (i.e. the non-uniform surface zeta potential is applied along straight, diagonal strips on the lower wall of the mixing channel), a staggered asymmetric herringbone strip mixer and a straight diagonal/symmetric herringbone strip mixer. A particle tracing algorithm is used to visualize and evaluate the mixing performance of the various mixers. The particle trajectories and Poincaré maps of the various mixers are calculated from the three-dimensional flow fields. The surface charge patterns on the lower walls of the microchannels induce electroosmotic chaotic advection in the low Reynolds number flow regime, and hence enhance the passive mixing effect in the microfluidic devices. A quantitative measure of the mixing performance based on Shannon entropy is employed to quantify the mixing of two miscible fluids. The results show that the mixing efficiency increases as the magnitude of the heterogeneous zeta potential ratio (|ζR|) is increased, but decreases as the aspect ratio (H/W) is increased. The mixing efficiency of the straight diagonal strip mixer with a length ratio of l/W = 0.5 is slightly higher than that obtained from the same mixer with l/W = 1.0. Finally, the staggered asymmetric herringbone strip mixer with θ = 45°, ζR = -1, l/W = 0.5 and H/W = 0.2 provides the optimal mixing performance of all the mixers presented in this study.
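One common way to build a Shannon-entropy mixing measure from tracked particles is to bin a channel cross-section and compute the species entropy in each bin. The abstract does not give the exact definition used in the paper, so the weighting, the normalization by ln 2, and the names below are assumptions made for illustration.

```python
import math

def mixing_index(bins):
    """Normalized Shannon entropy of two particle species over spatial bins.

    bins is a list of (count_a, count_b) pairs, one per bin. The result is 0
    when every occupied bin holds a single species and 1 when every occupied
    bin holds both species in equal numbers.
    """
    total = sum(a + b for a, b in bins)
    entropy = 0.0
    for a, b in bins:
        n = a + b
        if n == 0:
            continue  # empty bins carry no weight
        for count in (a, b):
            if count > 0:
                p = count / n
                entropy -= (n / total) * p * math.log(p)
    return entropy / math.log(2)

segregated = mixing_index([(10, 0), (0, 10)])
well_mixed = mixing_index([(5, 5), (5, 5)])
partially_mixed = mixing_index([(8, 2), (2, 8)])
```

The value depends on the bin size, so comparisons between mixer geometries are only meaningful when the same binning is used for all of them.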
Algorithmically scalable block preconditioner for fully implicit shallow-water equations in CAM-SE

DOE PAGES

Lott, P. Aaron; Woodward, Carol S.; Evans, Katherine J.

2014-10-19

Performing accurate and efficient numerical simulation of global atmospheric climate models is challenging due to the disparate length and time scales over which physical processes interact. Implicit solvers enable the physical system to be integrated with a time step commensurate with processes being studied. The dominant cost of an implicit time step is the ancillary linear system solves, so we have developed a preconditioner aimed at improving the efficiency of these linear system solves. Our preconditioner is based on an approximate block factorization of the linearized shallow-water equations and has been implemented within the spectral element dynamical core within the Community Atmospheric Model (CAM-SE). Furthermore, in this paper we discuss the development and scalability of the preconditioner for a suite of test cases with the implicit shallow-water solver within CAM-SE.
Studies of numerical algorithms for gyrokinetics and the effects of shaping on plasma turbulence

NASA Astrophysics Data System (ADS)

Belli, Emily Ann

Advanced numerical algorithms for gyrokinetic simulations are explored for more effective studies of plasma turbulent transport. The gyrokinetic equations describe the dynamics of particles in 5-dimensional phase space, averaging over the fast gyromotion, and provide a foundation for studying plasma microturbulence in fusion devices and in astrophysical plasmas. Several algorithms for Eulerian/continuum gyrokinetic solvers are compared. An iterative implicit scheme based on numerical approximations of the plasma response is developed. This method reduces the long time needed to set up implicit arrays, yet still has larger time step advantages similar to a fully implicit method. Various model preconditioners and iteration schemes, including Krylov-based solvers, are explored. An Alternating Direction Implicit algorithm is also studied and is surprisingly found to yield a severe stability restriction on the time step. Overall, an iterative Krylov algorithm might be the best approach for extensions of core tokamak gyrokinetic simulations to edge kinetic formulations and may be particularly useful for studies of large-scale ExB shear effects. The effects of flux surface shape on the gyrokinetic stability and transport of tokamak plasmas are studied using the nonlinear GS2 gyrokinetic code with analytic equilibria based on interpolations of representative JET-like shapes. High shaping is found to be a stabilizing influence on both the linear ITG instability and nonlinear ITG turbulence. A scaling of the heat flux with elongation of chi ~ kappa^-1.5 or kappa^-2 (depending on the triangularity) is observed, which is consistent with previous gyrofluid simulations. Thus, the GS2 turbulence simulations are explaining a significant fraction, but not all, of the empirical elongation scaling.
The remainder of the scaling may come from (1) the edge boundary conditions for core turbulence, and (2) the larger Dimits nonlinear critical temperature gradient shift due to the enhancement of zonal flows with shaping, which is observed with the GS2 simulations. Finally, a local linear trial function-based gyrokinetic code is developed to aid in fast scoping studies of gyrokinetic linear stability. This code is successfully benchmarked with the full GS2 code in the collisionless, electrostatic limit, as well as in the more general electromagnetic description with higher-order Hermite basis functions.
An energy- and charge-conserving, implicit, electrostatic particle-in-cell algorithm

NASA Astrophysics Data System (ADS)

Chen, G.; Chacón, L.; Barnes, D. C.

2011-08-01

This paper discusses a novel fully implicit formulation for a one-dimensional electrostatic particle-in-cell (PIC) plasma simulation approach. Unlike earlier implicit electrostatic PIC approaches (which are based on a linearized Vlasov-Poisson formulation), ours is based on a nonlinearly converged Vlasov-Ampère (VA) model. By iterating particles and fields to a tight nonlinear convergence tolerance, the approach features superior stability and accuracy properties, avoiding most of the accuracy pitfalls in earlier implicit PIC implementations. In particular, the formulation is stable against temporal (Courant-Friedrichs-Lewy) and spatial (aliasing) instabilities. It is charge- and energy-conserving to numerical round-off for arbitrary implicit time steps (unlike the earlier "energy-conserving" explicit PIC formulation, which only conserves energy in the limit of arbitrarily small time steps). While momentum is not exactly conserved, errors are kept small by an adaptive particle sub-stepping orbit integrator, which is instrumental to prevent particle tunneling (a deleterious effect for long-term accuracy). The VA model is orbit-averaged along particle orbits to enforce an energy conservation theorem with particle sub-stepping. As a result, very large time steps, constrained only by the dynamical time scale of interest, are possible without accuracy loss. Algorithmically, the approach features a Jacobian-free Newton-Krylov solver. A main development in this study is the nonlinear elimination of the new-time particle variables (positions and velocities). Such nonlinear elimination, which we term particle enslavement, results in a nonlinear formulation with memory requirements comparable to those of a fluid computation, and affords us substantial freedom in regards to the particle orbit integrator. Numerical examples are presented that demonstrate the advertised properties of the scheme.
In particular, long-time ion acoustic wave simulations show that numerical accuracy does not degrade even with very large implicit time steps, and that significant CPU gains are possible.
Multiple-body simulation with emphasis on integrated Space Shuttle vehicle

NASA Technical Reports Server (NTRS)

Chiu, Ing-Tsau

1993-01-01

The program to obtain intergrid communications - Pegasus - was enhanced to make better use of computing resources. Periodic block tridiagonal and pentadiagonal routines in OVERFLOW were modified to use a better algorithm to speed up the calculation for grids with periodic boundary conditions. Several programs were added to collar grid tools and a user friendly shell script was developed to help users generate collar grids. User interface for HYPGEN was modified to cope with the changes in HYPGEN. ET/SRB attach hardware grids were added to the computational model for the space shuttle and are currently incorporated into the refined shuttle model jointly developed at Johnson Space Center and Ames Research Center. Flow simulation for the integrated space shuttle vehicle at flight Reynolds number was carried out and compared with flight data as well as the earlier simulation for wind tunnel Reynolds number.
Characterizing the inverses of block tridiagonal, block Toeplitz matrices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boffi, Nicholas M.; Hill, Judith C.; Reuter, Matthew G.

2014-12-04

We consider the inversion of block tridiagonal, block Toeplitz matrices and comment on the behaviour of these inverses as one moves away from the diagonal. Using matrix Möbius transformations, we first present an O(1) representation (with respect to the number of block rows and block columns) for the inverse matrix and subsequently use this representation to characterize the inverse matrix. There are four symmetry-distinct cases where the blocks of the inverse matrix (i) decay to zero on both sides of the diagonal, (ii) oscillate on both sides, (iii) decay on one side and oscillate on the other and (iv) decay on one side and grow on the other. This characterization exposes the necessary conditions for the inverse matrix to be numerically banded and may also aid in the design of preconditioners and fast algorithms. Finally, we present numerical examples of these matrix types.

Symmetric tridiagonal structure preserving finite element model updating problem for the quadratic model

NASA Astrophysics Data System (ADS)

Rakshit, Suman; Khare, Swanand R.; Datta, Biswa Nath

2018-07-01

One of the most important yet difficult aspects of the Finite Element Model Updating Problem is to preserve the finite element inherited structures in the updated model. Finite element matrices are in general symmetric, positive definite (or semi-definite) and banded (tridiagonal, diagonal, penta-diagonal, etc.). Though a large number of papers have been published in recent years on various aspects of solutions of this problem, papers dealing with structure preservation almost do not exist. A novel optimization based approach that preserves the symmetric tridiagonal structures of the stiffness and damping matrices is proposed in this paper. An analytical expression for the global minimum solution of the associated optimization problem along with the results of numerical experiments obtained by both the analytical expressions and by an appropriate numerical optimization algorithm are presented. The results of numerical experiments support the validity of the proposed method.
Entanglement distillation protocols and number theory

NASA Astrophysics Data System (ADS)

Bombin, H.; Martin-Delgado, M. A.

2005-09-01

We show that the analysis of entanglement distillation protocols for qudits of arbitrary dimension D benefits from applying basic concepts from number theory, since the set Z_D^n associated with Bell diagonal states is a module rather than a vector space. We find that a partition of Z_D^n into divisor classes characterizes the invariant properties of mixed Bell diagonal states under local permutations. We construct a very general class of recursion protocols by means of unitary operations implementing these local permutations. We study these distillation protocols depending on whether we use twirling operations in the intermediate steps or not, and we study them both analytically and numerically with Monte Carlo methods. In the absence of twirling operations, we construct extensions of the quantum privacy algorithms valid for secure communications with qudits of any dimension D. When D is a prime number, we show that distillation protocols are optimal both qualitatively and quantitatively.
Low-rank factorization of electron integral tensors and its application in electronic structure theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peng, Bo; Kowalski, Karol

In this letter, we introduce the reverse Cuthill-McKee (RCM) algorithm, which is often used for the bandwidth reduction of sparse tensors, to transform the two-electron integral tensors to their block diagonal forms. By further applying the pivoted Cholesky decomposition (CD) on each of the diagonal blocks, we are able to represent the high-dimensional two-electron integral tensors in terms of permutation matrices and low-rank Cholesky vectors. This representation facilitates the low-rank factorization of the high-dimensional tensor contractions that are usually encountered in post-Hartree-Fock calculations. In this letter, we discuss the second-order Møller-Plesset (MP2) method and linear coupled-cluster model with doubles (L-CCD) as two simple examples to demonstrate the efficiency of the RCM-CD technique in representing two-electron integrals in a compact form.
Adaptive Mesh Refinement in Curvilinear Body-Fitted Grid Systems

NASA Technical Reports Server (NTRS)

Steinthorsson, Erlendur; Modiano, David; Colella, Phillip

1995-01-01

To be truly compatible with structured grids, an AMR algorithm should employ a block structure for the refined grids to allow flow solvers to take advantage of the strengths of structured grid systems, such as efficient solution algorithms for implicit discretizations and multigrid schemes. One such algorithm, the AMR algorithm of Berger and Colella, has been applied to and adapted for use with body-fitted structured grid systems. Results are presented for a transonic flow over a NACA0012 airfoil (AGARD-03 test case) and a reflection of a shock over a double wedge.
Solution algorithms for the two-dimensional Euler equations on unstructured meshes

NASA Technical Reports Server (NTRS)

Whitaker, D. L.; Slack, David C.; Walters, Robert W.

1990-01-01

The objective of the study was to analyze implicit techniques employed in structured grid algorithms for solving two-dimensional Euler equations and extend them to unstructured solvers in order to accelerate convergence rates. A comparison is made between nine different algorithms for both first-order and second-order accurate solutions. Higher-order accuracy is achieved by using multidimensional monotone linear reconstruction procedures. The discussion is illustrated by results for flow over a transonic circular arc.
Efficiency analysis of numerical integrations for finite element substructure in real-time hybrid simulation

NASA Astrophysics Data System (ADS)

Wang, Jinting; Lu, Liqiao; Zhu, Fei

2018-01-01

Finite element (FE) is a powerful tool and has been applied by investigators to real-time hybrid simulations (RTHSs). This study focuses on the computational efficiency, including the computational time and accuracy, of numerical integrations in solving FE numerical substructure in RTHSs. First, sparse matrix storage schemes are adopted to decrease the computational time of FE numerical substructure. In this way, the task execution time (TET) decreases such that the scale of the numerical substructure model increases. Subsequently, several commonly used explicit numerical integration algorithms, including the central difference method (CDM), the Newmark explicit method, the Chang method and the Gui-λ method, are comprehensively compared to evaluate their computational time in solving FE numerical substructure. CDM is better than the other explicit integration algorithms when the damping matrix is diagonal, while the Gui-λ (λ = 4) method is advantageous when the damping matrix is non-diagonal. Finally, the effect of time delay on the computational accuracy of RTHSs is investigated by simulating structure-foundation systems. Simulation results show that the influences of time delay on the displacement response become obvious with the mass ratio increasing, and delay compensation methods may reduce the relative error of the displacement peak value to less than 5% even under the large time-step and large time delay.
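The advantage of the central difference method (CDM) with a diagonal damping matrix can be seen from its update formula: the only matrix that has to be "inverted" each step is M/dt^2 + C/(2 dt). The following is a minimal sketch, assuming a lumped (diagonal) mass matrix in addition to the diagonal damping matrix, so that the solve reduces to an element-wise division. The function name `cdm_integrate` and the two-degree-of-freedom test system are illustrative assumptions, not taken from the paper.

```python
import math


def cdm_integrate(m_diag, c_diag, K, u0, v0, dt, n_steps, force=None):
    """Central difference integration of M u'' + C u' + K u = f(t).

    M and C are given by their diagonals (lists); K is a dense list of rows.
    Returns the displacement after n_steps steps of size dt.
    """
    n = len(u0)
    if force is None:
        force = lambda t: [0.0] * n

    def matvec(A, x):
        return [sum(a * xj for a, xj in zip(row, x)) for row in A]

    # Initial acceleration from the equation of motion at t = 0.
    f0 = force(0.0)
    Ku0 = matvec(K, u0)
    a0 = [(f0[i] - c_diag[i] * v0[i] - Ku0[i]) / m_diag[i] for i in range(n)]
    # Fictitious displacement at t = -dt needed to start the scheme.
    u_prev = [u0[i] - dt * v0[i] + 0.5 * dt * dt * a0[i] for i in range(n)]
    u_curr = list(u0)

    # Diagonal left-hand side: no factorization, just element-wise division.
    lhs = [m_diag[i] / dt**2 + c_diag[i] / (2.0 * dt) for i in range(n)]

    for step in range(n_steps):
        f = force(step * dt)
        Ku = matvec(K, u_curr)
        u_next = [
            (f[i] - Ku[i]
             + 2.0 * m_diag[i] / dt**2 * u_curr[i]
             - (m_diag[i] / dt**2 - c_diag[i] / (2.0 * dt)) * u_prev[i]) / lhs[i]
            for i in range(n)
        ]
        u_prev, u_curr = u_curr, u_next
    return u_curr


# Two-DOF example: unit masses, light diagonal damping, free vibration.
K = [[2.0, -1.0], [-1.0, 2.0]]
u_end = cdm_integrate([1.0, 1.0], [0.1, 0.1], K,
                      u0=[1.0, 1.0], v0=[0.0, 0.0], dt=1e-3, n_steps=1000)
print(u_end)
```

With a non-diagonal damping matrix the left-hand side would no longer be diagonal and each step would require a linear solve, which is the situation where the abstract reports other explicit methods to be preferable.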
Digital adaptive controllers for VTOL vehicles. Volume 2: Software documentation

NASA Technical Reports Server (NTRS)

Hartmann, G. L.; Stein, G.; Pratt, S. G.

1979-01-01

The VTOL approach and landing test (VALT) adaptive software is documented. Two self-adaptive algorithms, one based on an implicit model reference design and the other on an explicit parameter estimation technique, were evaluated. The organization of the software, user options, and a nominal set of input data are presented along with a flow chart and program listing of each algorithm.
Multi-Agent Task Negotiation Among UAVs to Defend Against Swarm Attacks

DTIC Science & Technology

2012-03-01

are based on economic models [39]. Auction methods of task coordination also attempt to deal with agents dealing with noisy, dynamic environments...August 2006. [34] M. Alighanbari, “ Robust and decentralized task assignment algorithms for uavs,” Ph.D. dissertation, Massachusetts Institute of Technology...Implicit Coordination . . . . . . . . . . . . . 12 2.4 Decentralized Algorithm B - Market- Based . . . . . . . . . . . . . . . . 12 2.5 Decentralized
One-dimensional Lagrangian implicit hydrodynamic algorithm for Inertial Confinement Fusion applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ramis, Rafael

A new one-dimensional hydrodynamic algorithm, specifically developed for Inertial Confinement Fusion (ICF) applications, is presented. The scheme uses a fully conservative Lagrangian formulation in planar, cylindrical, and spherically symmetric geometries, and supports arbitrary equations of state with separate ion and electron components. Fluid equations are discretized on a staggered grid and stabilized by means of an artificial viscosity formulation. The space discretized equations are advanced in time using an implicit algorithm. The method includes several numerical parameters that can be adjusted locally. In regions with low Courant–Friedrichs–Lewy (CFL) number, where stability is not an issue, they can be adjusted to optimize the accuracy. In typical problems, the truncation error can be reduced by a factor between 2 and 10 in comparison with conventional explicit algorithms. On the other hand, in regions with high CFL numbers, the parameters can be set to guarantee unconditional stability. The method can be integrated into complex ICF codes. This is demonstrated through several examples covering a wide range of situations: from thermonuclear ignition physics, where alpha particles are managed as an additional species, to low intensity laser–matter interaction, where liquid–vapor phase transitions occur.
A joint precoding scheme for indoor downlink multi-user MIMO VLC systems

NASA Astrophysics Data System (ADS)

Zhao, Qiong; Fan, Yangyu; Kang, Bochao

2017-11-01

In this study, we aim to improve the system performance and reduce the implementation complexity of precoding scheme for visible light communication (VLC) systems. By incorporating the power-method algorithm and the block diagonalization (BD) algorithm, we propose a joint precoding scheme for indoor downlink multi-user multi-input-multi-output (MU-MIMO) VLC systems. In this scheme, we apply the BD algorithm to eliminate the co-channel interference (CCI) among users firstly. Secondly, the power-method algorithm is used to search the precoding weight for each user based on the optimal criterion of signal to interference plus noise ratio (SINR) maximization. Finally, the optical power restrictions of VLC systems are taken into account to constrain the precoding weight matrix. Comprehensive computer simulations in two scenarios indicate that the proposed scheme always has better bit error rate (BER) performance and lower computation complexity than that of the traditional scheme.
Computational electromagnetics: the physics of smooth versus oscillatory fields.

PubMed

Chew, W C

2004-03-15

This paper starts by discussing the difference in the physics between solutions to Laplace's equation (static) and Maxwell's equations for dynamic problems (Helmholtz equation). Their differing physical characters are illustrated by how the two fields convey information away from their source point. The paper elucidates the fact that their differing physical characters affect the use of Laplacian field and Helmholtz field in imaging. They also affect the design of fast computational algorithms for electromagnetic scattering problems. Specifically, a comparison is made between fast algorithms developed using wavelets, the simple fast multipole method, and the multi-level fast multipole algorithm for electrodynamics. The impact of the physical characters of the dynamic field on the parallelization of the multi-level fast multipole algorithm is also discussed. The relationship of diagonalization of translators to group theory is presented. Finally, future areas of research for computational electromagnetics are described.
The Power of Implicit Social Relation in Rating Prediction of Social Recommender Systems

PubMed Central

Reafee, Waleed; Salim, Naomie; Khan, Atif

2016-01-01

The explosive growth of social networks in recent times has presented a powerful source of information to be utilized as an extra source for assisting in the social recommendation problems. The social recommendation methods that are based on probabilistic matrix factorization improved the recommendation accuracy and partly solved the cold-start and data sparsity problems. However, these methods only exploited the explicit social relations and almost completely ignored the implicit social relations. In this article, we firstly propose an algorithm to extract the implicit relation in the undirected graphs of social networks by exploiting the link prediction techniques. Furthermore, we propose a new probabilistic matrix factorization method to alleviate the data sparsity problem through incorporating explicit friendship and implicit friendship. We evaluate our proposed approach on two real datasets, Last.Fm and Douban. The experimental results show that our method performs much better than the state-of-the-art approaches, which indicates the importance of incorporating implicit social relations in the recommendation process to address the poor prediction accuracy. PMID:27152663
Low-storage implicit/explicit Runge-Kutta schemes for the simulation of stiff high-dimensional ODE systems

NASA Astrophysics Data System (ADS)

Cavaglieri, Daniele; Bewley, Thomas

2015-04-01

Implicit/explicit (IMEX) Runge-Kutta (RK) schemes are effective for time-marching ODE systems with both stiff and nonstiff terms on the RHS; such schemes implement an (often A-stable or better) implicit RK scheme for the stiff part of the ODE, which is often linear, and, simultaneously, a (more convenient) explicit RK scheme for the nonstiff part of the ODE, which is often nonlinear. Low-storage RK schemes are especially effective for time-marching high-dimensional ODE discretizations of PDE systems on modern (cache-based) computational hardware, in which memory management is often the most significant computational bottleneck. In this paper, we develop and characterize eight new low-storage implicit/explicit RK schemes which have higher accuracy and better stability properties than the only low-storage implicit/explicit RK scheme available previously, the venerable second-order Crank-Nicolson/Runge-Kutta-Wray (CN/RKW3) algorithm that has dominated the DNS/LES literature for the last 25 years, while requiring similar storage (two, three, or four registers of length N) and comparable floating-point operations per timestep.
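The basic IMEX idea can be illustrated with a far simpler scheme than those in the abstract. The following is a minimal sketch, assuming a scalar model problem u' = -lam*u + g(t, u) with a stiff linear term and a nonstiff forcing, advanced with first-order IMEX Euler (backward Euler on the stiff term, forward Euler on the nonstiff term). It is not one of the low-storage schemes of the paper, nor CN/RKW3; the function names and the test problem are illustrative assumptions.

```python
import math


def imex_euler(lam, nonstiff, u0, dt, n_steps):
    """First-order IMEX Euler for u' = -lam*u + nonstiff(t, u).

    The stiff linear term is treated implicitly, the nonstiff term explicitly:
        (1 + dt*lam) * u_new = u_old + dt * nonstiff(t_old, u_old)
    """
    u, t = u0, 0.0
    for _ in range(n_steps):
        u = (u + dt * nonstiff(t, u)) / (1.0 + dt * lam)
        t += dt
    return u


def explicit_euler(lam, nonstiff, u0, dt, n_steps):
    """Fully explicit Euler on the same problem, for comparison."""
    u, t = u0, 0.0
    for _ in range(n_steps):
        u = u + dt * (-lam * u + nonstiff(t, u))
        t += dt
    return u


lam = 1000.0                       # stiff decay rate
forcing = lambda t, u: math.sin(t)  # nonstiff term
dt, n_steps = 0.01, 100            # lam*dt = 10, beyond the explicit stability limit

u_imex = imex_euler(lam, forcing, 1.0, dt, n_steps)
u_explicit = explicit_euler(lam, forcing, 1.0, dt, n_steps)
print("IMEX Euler:    ", u_imex)
print("explicit Euler:", u_explicit)
```

At this step size the explicit integrator diverges while the IMEX update remains bounded and tracks the slowly varying solution, which is the property that motivates treating only the stiff part implicitly.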
UDU/T/ covariance factorization for Kalman filtering

NASA Technical Reports Server (NTRS)

Thornton, C. L.; Bierman, G. J.

1980-01-01

There has been strong motivation to produce numerically stable formulations of the Kalman filter algorithms because it has long been known that the original discrete-time Kalman formulas are numerically unreliable. Numerical instability can be avoided by propagating certain factors of the estimate error covariance matrix rather than the covariance matrix itself. This paper documents filter algorithms that correspond to the covariance factorization P = UDU^T, where U is a unit upper triangular matrix and D is diagonal. Emphasis is on computational efficiency and numerical stability, since these properties are of key importance in real-time filter applications. The history of square-root and U-D covariance filters is reviewed. Simple examples are given to illustrate the numerical inadequacy of the Kalman covariance filter algorithms; these examples show how factorization techniques can give improved computational reliability.
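The factorization named in the abstract, with U unit upper triangular and D diagonal, can be computed column by column starting from the last column. The following is a minimal sketch of that factorization step only, assuming a symmetric positive definite input; it does not implement the filter time or measurement updates documented in the paper, and the function name `ud_factorize` and the sample matrix are illustrative.

```python
def ud_factorize(P):
    """Factor a symmetric positive definite matrix as P = U * diag(D) * U^T.

    U is unit upper triangular and D holds the diagonal entries.
    Only the upper triangle of P is read.
    """
    n = len(P)
    A = [row[:] for row in P]  # working copy; its upper triangle is overwritten
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n - 1, -1, -1):
        D[j] = A[j][j]
        # Column j of U above the diagonal.
        for k in range(j):
            U[k][j] = A[k][j] / D[j]
        # Remove the rank-one contribution of column j from the leading block.
        for i in range(j):
            for k in range(i + 1):
                A[k][i] -= U[k][j] * D[j] * U[i][j]
    return U, D


P = [[4.0, 2.0, 0.6],
     [2.0, 2.0, 0.5],
     [0.6, 0.5, 1.0]]
U, D = ud_factorize(P)

# Rebuild P from the factors to check the result.
n = len(P)
rebuilt = [[sum(U[i][m] * D[m] * U[k][m] for m in range(n)) for k in range(n)]
           for i in range(n)]
print(D)
print(rebuilt)
```

Propagating U and D instead of P keeps the implied covariance symmetric, and positive definite as long as the entries of D stay positive, which is the numerical motivation given in the abstract.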
An implicit fast Fourier transform method for integration of the time dependent Schrodinger equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Riley, M.E.; Ritchie, A.B.

1997-12-31

One finds that the conventional exponentiated split operator procedure is subject to difficulties when solving the time-dependent Schrodinger equation for Coulombic systems. By rearranging the kinetic and potential energy terms in the temporal propagator of the finite difference equations, one can find a propagation algorithm for three dimensions that looks much like the Crank-Nicolson and alternating direction implicit methods for one- and two-space-dimensional partial differential equations. The authors report investigations of this novel implicit split operator procedure. The results look promising for a purely numerical approach to certain electron quantum mechanical problems. A charge exchange calculation is presented as an example of the power of the method.
A Spectral Element Discretisation on Unstructured Triangle / Tetrahedral Meshes for Elastodynamics

NASA Astrophysics Data System (ADS)

May, Dave A.; Gabriel, Alice-A.

2017-04-01

The spectral element method (SEM) defined over quadrilateral and hexahedral element geometries has proven to be a fast, accurate and scalable approach to study wave propagation phenomena. In the context of regional scale seismology and/or simulations incorporating finite earthquake sources, the geometric restrictions associated with hexahedral elements can limit the applicability of the classical quad./hex. SEM. Here we describe a continuous Galerkin spectral element discretisation defined over unstructured meshes composed of triangles (2D), or tetrahedra (3D). The method uses a stable, nodal basis constructed from PKD polynomials and thus retains the spectral accuracy and low dispersive properties of the classical SEM, in addition to the geometric versatility provided by unstructured simplex meshes. For the particular basis and quadrature rule we have adopted, the discretisation results in a mass matrix which is not diagonal, thereby mandating linear solvers be utilised. To that end, we have developed efficient solvers and preconditioners which are robust with respect to the polynomial order (p), and possess high arithmetic intensity. Furthermore, we also consider using implicit time integrators, together with a p-multigrid preconditioner to circumvent the CFL condition. Implicit time integrators become particularly relevant when considering solving problems on poor quality meshes, or meshes containing elements with a widely varying range of length scales - both of which frequently arise when meshing non-trivial geometries. We demonstrate the applicability of the new method by examining a number of two- and three-dimensional wave propagation scenarios. These scenarios serve to characterise the accuracy and cost of the new method. Lastly, we will assess the potential benefits of using implicit time integrators for regional scale wave propagation simulations.
Semi-implicit iterative methods for low Mach number turbulent reacting flows: Operator splitting versus approximate factorization

NASA Astrophysics Data System (ADS)

MacArt, Jonathan F.; Mueller, Michael E.

2016-12-01

Two formally second-order accurate, semi-implicit, iterative methods for the solution of scalar transport-reaction equations are developed for Direct Numerical Simulation (DNS) of low Mach number turbulent reacting flows. The first is a monolithic scheme based on a linearly implicit midpoint method utilizing an approximately factorized exact Jacobian of the transport and reaction operators. The second is an operator splitting scheme based on the Strang splitting approach. The accuracy properties of these schemes, as well as their stability, cost, and the effect of chemical mechanism size on relative performance, are assessed in two one-dimensional test configurations comprising an unsteady premixed flame and an unsteady nonpremixed ignition, which have substantially different Damköhler numbers and relative stiffness of transport to chemistry. All schemes demonstrate their formal order of accuracy in the fully-coupled convergence tests. Compared to a (non-)factorized scheme with a diagonal approximation to the chemical Jacobian, the monolithic, factorized scheme using the exact chemical Jacobian is shown to be both more stable and more economical. This is due to an improved convergence rate of the iterative procedure, and the difference between the two schemes in convergence rate grows as the time step increases. The stability properties of the Strang splitting scheme are demonstrated to outpace those of Lie splitting and monolithic schemes in simulations at high Damköhler number; however, in this regime, the monolithic scheme using the approximately factorized exact Jacobian is found to be the most economical at practical CFL numbers. The performance of the schemes is further evaluated in a simulation of a three-dimensional, spatially evolving, turbulent nonpremixed planar jet flame.
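The difference in formal order between Lie and Strang splitting can be checked on a toy problem. The following is a minimal sketch, assuming the scalar equation u' = -u - u^2, split into a linear decay part and a quadratic "reaction" part that each have an exact flow. This stands in for the transport and chemistry operators only by analogy; it is not the scheme or the test configuration of the paper, and all function names are illustrative.

```python
import math


def decay_flow(u, h):
    """Exact flow of u' = -u over a time h."""
    return u * math.exp(-h)


def reaction_flow(u, h):
    """Exact flow of u' = -u**2 over a time h."""
    return u / (1.0 + u * h)


def lie_step(u, h):
    # First-order splitting: full decay step, then full reaction step.
    return reaction_flow(decay_flow(u, h), h)


def strang_step(u, h):
    # Second-order splitting: half decay, full reaction, half decay.
    return decay_flow(reaction_flow(decay_flow(u, 0.5 * h), h), 0.5 * h)


def integrate(step, u0, h, t_end):
    u = u0
    for _ in range(int(round(t_end / h))):
        u = step(u, h)
    return u


def exact(u0, t):
    """Closed-form solution of u' = -u - u**2 (a Bernoulli equation)."""
    e = math.exp(-t)
    return u0 * e / (1.0 + u0 * (1.0 - e))


u0, t_end = 1.0, 1.0
errors = {}
for name, step in (("lie", lie_step), ("strang", strang_step)):
    errors[name] = [abs(integrate(step, u0, h, t_end) - exact(u0, t_end))
                    for h in (0.1, 0.05)]
    print(name, errors[name], "ratio:", errors[name][0] / errors[name][1])
```

Halving the step should reduce the Lie error by a factor close to 2 and the Strang error by a factor close to 4, consistent with first- and second-order accuracy respectively.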
An efficient mixed-precision, hybrid CPU-GPU implementation of a nonlinearly implicit one-dimensional particle-in-cell algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Guangye; Chacon, Luis; Barnes, Daniel C

2012-01-01

Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been developed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230, 18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver and is capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle orbit integrations from the field solver, while remaining fully self-consistent. This provides great flexibility, and dramatically improves the solver efficiency by reducing the degrees of freedom of the associated nonlinear system. However, it requires a particle push per nonlinear residual evaluation, which makes the particle push the most time-consuming operation in the algorithm. This paper describes a very efficient mixed-precision, hybrid CPU-GPU implementation of the implicit PIC algorithm. The JFNK solver is kept on the CPU (in double precision), while the inherent data parallelism of the particle mover is exploited by implementing it in single-precision on a graphics processing unit (GPU) using CUDA. Performance-oriented optimizations, with the aid of an analytical performance model, the roofline model, are employed. Despite being highly dynamic, the adaptive, charge-conserving particle mover algorithm achieves up to 300-400 GOp/s (including single-precision floating-point, integer, and logic operations) on a Nvidia GeForce GTX580, corresponding to 20-25% absolute GPU efficiency (against the peak theoretical performance) and 50-70% intrinsic efficiency (against the algorithm's maximum operational throughput, which neglects all latencies). This is about 200-300 times faster than an equivalent serial CPU implementation. When the single-precision GPU particle mover is combined with a double-precision CPU JFNK field solver, overall performance gains of 100 vs. the double-precision CPU-only serial version are obtained, with no apparent loss of robustness or accuracy when applied to a challenging long-time scale ion acoustic wave simulation.
An Energy- and Charge-conserving, Implicit, Electrostatic Particle-in-Cell Algorithm in curvilinear geometry

NASA Astrophysics Data System (ADS)

Chen, G.; Chacón, L.; Barnes, D. C.

2012-03-01

A recent proof-of-principle study proposes an energy- and charge-conserving, fully implicit particle-in-cell algorithm in one dimension [1], which is able to use timesteps comparable to the dynamical timescale of interest. Here, we generalize the method to employ non-uniform meshes via a curvilinear map. The key enabling technology is a hybrid particle pusher [2], with particle positions updated in logical space and particle velocities updated in physical space. The self-adaptive, charge-conserving particle mover of Ref. [1] is extended to the non-uniform mesh case. The fully implicit implementation, using a Jacobian-free Newton-Krylov iterative solver, remains exactly charge- and energy-conserving. The extension of the formulation to multiple dimensions will be discussed. We present numerical experiments of 1D electrostatic, long-timescale ion-acoustic wave and ion-acoustic shock wave simulations, demonstrating that charge and energy are conserved to round-off for arbitrary mesh non-uniformity, and that the total momentum remains well conserved. [1] Chen, Chacón, Barnes, J. Comput. Phys. 230 (2011). [2] Camporeale and Delzanno, Bull. Am. Phys. Soc. 56(6) (2011); Wang, et al., J. Plasma Physics, 61 (1999).
Implicit Geometry Meshing for the simulation of Rotary Friction Welding

NASA Astrophysics Data System (ADS)

Schmicker, D.; Persson, P.-O.; Strackeljan, J.

2014-08-01

The simulation of Rotary Friction Welding (RFW) is a challenging task, since it states a coupled problem of phenomena like large plastic deformations, heat flux, contact and friction. In particular the mesh generation and its restoration when using a Lagrangian description of motion is of significant severity. In this regard Implicit Geometry Meshing (IGM) algorithms are promising alternatives to the more conventional explicit methods. Because of the implicit description of the geometry during remeshing, the IGM procedure turns out to be highly robust and generates spatial discretizations of high quality regardless of the complexity of the flash shape and its inclusions. A model for efficient RFW simulation is presented, which is based on a Carreau fluid law, an Augmented Lagrange approach in mapping the incompressible deformations, a penalty contact approach, a fully regularized Coulomb-/fluid friction law and a hybrid time integration strategy. The implementation of the IGM algorithm using 6-node triangular finite elements is described in detail. The techniques are demonstrated on a fairly complex friction welding problem, demonstrating the performance and the potentials of the proposed method. The techniques are general and straight-forward to implement, and offer the potential of successful adoption to a wide range of other engineering problems.

Physics Based Model for Cryogenic Chilldown and Loading. Part I: Algorithm

NASA Technical Reports Server (NTRS)

Luchinsky, Dmitry G.; Smelyanskiy, Vadim N.; Brown, Barbara

2014-01-01

We report progress in the development of a physics based model for cryogenic chilldown and loading. The chilldown and loading is modeled as a fully separated non-equilibrium two-phase flow of cryogenic fluid thermally coupled to the pipe walls. The solution closely follows the nearly-implicit and semi-implicit algorithms developed by Idaho National Laboratory for autonomous control of thermal-hydraulic systems. Special attention is paid to the treatment of instabilities. The model is applied to the analysis of chilldown in the rapid loading system developed at NASA-Kennedy Space Center. The nontrivial characteristic feature of the analyzed chilldown regime is its active control by dump valves. The numerical predictions are in reasonable agreement with the experimental time traces. The obtained results pave the way to the development of autonomous loading operation on the ground and in space.
A tightly-coupled domain-decomposition approach for highly nonlinear stochastic multiphysics systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Taverniers, Søren; Tartakovsky, Daniel M.

2017-02-01

Multiphysics simulations often involve nonlinear components that are driven by internally generated or externally imposed random fluctuations. When used with a domain-decomposition (DD) algorithm, such components have to be coupled in a way that both accurately propagates the noise between the subdomains and lends itself to a stable and cost-effective temporal integration. We develop a conservative DD approach in which tight coupling is obtained by using a Jacobian-free Newton–Krylov (JfNK) method with a generalized minimum residual iterative linear solver. This strategy is tested on a coupled nonlinear diffusion system forced by a truncated Gaussian noise at the boundary. Enforcement of path-wise continuity of the state variable and its flux, as opposed to continuity in the mean, at interfaces between subdomains enables the DD algorithm to correctly propagate boundary fluctuations throughout the computational domain. Reliance on a single Newton iteration (explicit coupling), rather than on the fully converged JfNK (implicit) coupling, may increase the solution error by an order of magnitude. Increase in communication frequency between the DD components reduces the explicit coupling's error, but makes it less efficient than the implicit coupling at comparable error levels for all noise strengths considered. Finally, the DD algorithm with the implicit JfNK coupling resolves temporally-correlated fluctuations of the boundary noise when the correlation time of the latter exceeds some multiple of an appropriately defined characteristic diffusion time.
Implicit, nonswitching, vector-oriented algorithm for steady transonic flow

NASA Technical Reports Server (NTRS)

Lottati, I.

1983-01-01

A rapid computation of a sequence of transonic flow solutions has to be performed in many areas of aerodynamic technology. The employment of low-cost vector array processors makes the conduction of such calculations economically feasible. However, for a full utilization of the new hardware, the developed algorithms must take advantage of the special characteristics of the vector array processor. The present investigation has the objective to develop an efficient algorithm for solving transonic flow problems governed by mixed partial differential equations on an array processor.
Numerical simulation of steady supersonic flow. [spatial marching

NASA Technical Reports Server (NTRS)

Schiff, L. B.; Steger, J. L.

1981-01-01

A noniterative, implicit, space-marching, finite-difference algorithm was developed for the steady thin-layer Navier-Stokes equations in conservation-law form. The numerical algorithm is applicable to steady supersonic viscous flow over bodies of arbitrary shape. In addition, the same code can be used to compute supersonic inviscid flow or three-dimensional boundary layers. Computed results from two-dimensional and three-dimensional versions of the numerical algorithm are in good agreement with those obtained from more costly time-marching techniques.
A Generalized Sampling and Preconditioning Scheme for Sparse Approximation of Polynomial Chaos Expansions

DOE PAGES

Jakeman, John D.; Narayan, Akil; Zhou, Tao

2017-06-22

We propose an algorithm for recovering sparse orthogonal polynomial expansions via collocation. A standard sampling approach for recovering sparse polynomials uses Monte Carlo sampling from the density of orthogonality, which results in poor function recovery when the polynomial degree is high. Our proposed approach aims to mitigate this limitation by sampling with respect to the weighted equilibrium measure of the parametric domain and subsequently solves a preconditioned $\ell^1$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. Our algorithm can be applied to a wide class of orthogonal polynomial families on bounded and unbounded domains, including all classical families. We present theoretical analysis to motivate the algorithm and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. In conclusion, numerical examples are also provided to demonstrate that our proposed algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.
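The diagonal preconditioning described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes Legendre polynomials orthonormal with respect to the uniform density on [-1, 1], samples from the arcsine (Chebyshev) density as the equilibrium measure of that interval, and uses the weight sqrt(N / K(x)) with K(x) the sum of squared basis values, so that the weight squared is N times the Christoffel function. The function names and this normalization are assumptions made for illustration; the $\ell^1$ solve itself is omitted.

```python
import math
import random

def orthonormal_legendre(x, n_terms):
    """Values p_0(x)..p_{n_terms-1}(x), orthonormal w.r.t. the density 1/2 on [-1, 1]."""
    vals = []
    p_prev, p_curr = 1.0, x  # classical Legendre P_0 and P_1
    for k in range(n_terms):
        if k == 0:
            pk = 1.0
        elif k == 1:
            pk = x
        else:
            # three-term recurrence: k P_k = (2k-1) x P_{k-1} - (k-1) P_{k-2}
            pk = ((2 * k - 1) * x * p_curr - (k - 1) * p_prev) / k
            p_prev, p_curr = p_curr, pk
        vals.append(math.sqrt(2 * k + 1) * pk)
    return vals

def preconditioned_rows(points, n_terms):
    """Rows of W*A, where A[i][k] = p_k(x_i) and W is the diagonal preconditioner."""
    rows = []
    for x in points:
        p = orthonormal_legendre(x, n_terms)
        k_val = sum(v * v for v in p)      # K(x): reciprocal of the Christoffel function
        w = math.sqrt(n_terms / k_val)     # assumed normalization of the weight
        rows.append([w * v for v in p])
    return rows

# Sample from the arcsine density on [-1, 1] (hypothetical seed and sizes).
rng = random.Random(0)
samples = [math.cos(math.pi * rng.random()) for _ in range(20)]
WA = preconditioned_rows(samples, n_terms=8)
```

With this choice of weight every row of the preconditioned matrix has squared Euclidean norm equal to the number of basis terms, which is one way to see how the weighting evens out the influence of the samples.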
Tunneling splitting in double-proton transfer: direct diagonalization results for porphycene.

PubMed

Smedarchina, Zorka; Siebrand, Willem; Fernández-Ramos, Antonio

2014-11-07

Zero-point and excited level splittings due to double-proton tunneling are calculated for porphycene and the results are compared with experiment. The calculation makes use of a multidimensional imaginary-mode Hamiltonian, diagonalized directly by an effective reduction of its dimensionality. Porphycene has a complex potential energy surface with nine stationary configurations that allow a variety of tunneling paths, many of which include classically accessible regions. A symmetry-based approach is used to show that the zero-point level, although located above the cis minimum, corresponds to concerted tunneling along a direct trans - trans path; a corresponding cis - cis path is predicted at higher energy. This supports the conclusion of a previous paper [Z. Smedarchina, W. Siebrand, and A. Fernández-Ramos, J. Chem. Phys. 127, 174513 (2007)] based on the instanton approach to a model Hamiltonian of correlated double-proton transfer. A multidimensional tunneling Hamiltonian is then generated, based on a double-minimum potential along the coordinate of concerted proton motion, which is newly evaluated at the RI-CC2/cc-pVTZ level of theory. To make it suitable for diagonalization, its dimensionality is reduced by treating fast weakly coupled modes in the adiabatic approximation. This results in a coordinate-dependent mass of tunneling, which is included in a unique Hermitian form into the kinetic energy operator. The reduced Hamiltonian contains three symmetric and one antisymmetric mode coupled to the tunneling mode and is diagonalized by a modified Jacobi-Davidson algorithm implemented in the Jadamilu software for sparse matrices. The results are in satisfactory agreement with the observed splitting of the zero-point level and several vibrational fundamentals after a partial reassignment, imposed by recently derived selection rules. They also agree well with instanton calculations based on the same Hamiltonian.
ACOSS Eleven (Active Control of Space Structures)

DTIC Science & Technology

1984-09-01

spatial integration with threshold level and system track threshold level reduction factor. 2.2.3 Track Acquisition In the HRAP/LRTP simulation, input ...in both row and column, however, then the track direction is determined to be diagonal. Also, as with the first tier, multiple hits are processed...for any system track before thresholding, clustering, and centroiding can produce the next frame to be input to the two tier algorithm. As Figure 2-10
Multi-zonal Navier-Stokes code with the LU-SGS scheme

NASA Technical Reports Server (NTRS)

Klopfer, G. H.; Yoon, S.

1993-01-01

The LU-SGS (lower upper symmetric Gauss Seidel) algorithm has been implemented into the Compressible Navier-Stokes, Finite Volume (CNSFV) code and validated with a multizonal Navier-Stokes simulation of a transonic turbulent flow around an Onera M6 transport wing. The convergence rate and robustness of the code have been improved and the computational cost has been reduced by at least a factor of 2 over the diagonal Beam-Warming scheme.
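As a rough illustration of the symmetric Gauss-Seidel idea behind LU-SGS, the sketch below applies a forward sweep followed by a backward sweep to a small linear system. It shows only the linear-algebra building block on a dense list-of-lists matrix; it is not the CNSFV implementation, and the function name and test matrix are hypothetical.

```python
def sgs_sweep(A, b, x):
    """One symmetric Gauss-Seidel iteration: forward sweep, then backward sweep.

    A is a square matrix (list of rows) with nonzero diagonal; x is updated in place.
    """
    n = len(b)
    for order in (range(n), range(n - 1, -1, -1)):
        for i in order:
            # use the most recent values of all other unknowns
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x

# Diagonally dominant test system whose exact solution is [1, 1, 1].
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [3.0, 2.0, 3.0]
x = [0.0, 0.0, 0.0]
for _ in range(30):
    sgs_sweep(A, b, x)
```

No matrix is factored or inverted: each sweep only divides by diagonal entries, which is the property that makes this family of schemes cheap per iteration.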
A High-Order Low-Order Algorithm with Exponentially Convergent Monte Carlo for Thermal Radiative Transfer

DOE PAGES

Bolding, Simon R.; Cleveland, Mathew Allen; Morel, Jim E.

2016-10-21

In this paper, we have implemented a new high-order low-order (HOLO) algorithm for solving thermal radiative transfer problems. The low-order (LO) system is based on the spatial and angular moments of the transport equation and a linear-discontinuous finite-element spatial representation, producing equations similar to the standard S2 equations. The LO solver is fully implicit in time and efficiently resolves the nonlinear temperature dependence at each time step. The high-order (HO) solver utilizes exponentially convergent Monte Carlo (ECMC) to give a globally accurate solution for the angular intensity to a fixed-source pure-absorber transport problem. This global solution is used to compute consistency terms, which require the HO and LO solutions to converge toward the same solution. The use of ECMC allows for the efficient reduction of statistical noise in the Monte Carlo solution, reducing inaccuracies introduced through the LO consistency terms. Finally, we compare results with an implicit Monte Carlo code for one-dimensional gray test problems and demonstrate the efficiency of ECMC over standard Monte Carlo in this HOLO algorithm.
A fully implicit finite element method for bidomain models of cardiac electromechanics

PubMed Central

Dal, Hüsnü; Göktepe, Serdar; Kaliske, Michael; Kuhl, Ellen

2012-01-01

We propose a novel, monolithic, and unconditionally stable finite element algorithm for the bidomain-based approach to cardiac electromechanics. We introduce the transmembrane potential, the extracellular potential, and the displacement field as independent variables, and extend the common two-field bidomain formulation of electrophysiology to a three-field formulation of electromechanics. The intrinsic coupling arises from both excitation-induced contraction of cardiac cells and the deformation-induced generation of intra-cellular currents. The coupled reaction-diffusion equations of the electrical problem and the momentum balance of the mechanical problem are recast into their weak forms through a conventional isoparametric Galerkin approach. As a novel aspect, we propose a monolithic approach to solve the governing equations of excitation-contraction coupling in a fully coupled, implicit sense. We demonstrate the consistent linearization of the resulting set of non-linear residual equations. To assess the algorithmic performance, we illustrate characteristic features by means of representative three-dimensional initial-boundary value problems. The proposed algorithm may open new avenues to patient specific therapy design by circumventing stability and convergence issues inherent to conventional staggered solution schemes. PMID:23175588
Treecode-based generalized Born method

NASA Astrophysics Data System (ADS)

Xu, Zhenli; Cheng, Xiaolin; Yang, Haizhao

2011-02-01

We have developed a treecode-based O(N log N) algorithm for the generalized Born (GB) implicit solvation model. Our treecode-based GB (tGB) is based on the GBr6 [J. Phys. Chem. B 111, 3055 (2007)], an analytical GB method with a pairwise descreening approximation for the R6 volume integral expression. The algorithm is composed of a cutoff scheme for the effective Born radii calculation, and a treecode implementation of the GB charge-charge pair interactions. Test results demonstrate that the tGB algorithm can reproduce the vdW surface based Poisson solvation energy with an average relative error less than 0.6% while providing an almost linear-scaling calculation for a representative set of 25 proteins with different sizes (from 2815 atoms to 65456 atoms). For a typical system of 10k atoms, the tGB calculation is three times faster than the direct summation as implemented in the original GBr6 model. Thus, our tGB method provides an efficient way for performing implicit solvent GB simulations of larger biomolecular systems at longer time scales.
A computational procedure for large rotational motions in multibody dynamics

NASA Technical Reports Server (NTRS)

Park, K. C.; Chiou, J. C.

1987-01-01

A computational procedure suitable for the solution of equations of motion for multibody systems is presented. The present procedure adopts a differential partitioning of the translational motions and the rotational motions. The translational equations of motion are then treated by either a conventional explicit or an implicit direct integration method. A principal feature of this procedure is a nonlinearly implicit algorithm for updating rotations via the Euler four-parameter representation. This procedure is applied to the rolling of a sphere through a specific trajectory, which shows that it yields robust solutions.
A Conservative Discontinuous Galerkin Semi-Implicit Formulation for the Navier-Stokes Equations in Nonhydrostatic Mesoscale Modeling

DTIC Science & Technology

2009-01-01

is usually implemented as an implicit correction to an explicit predictor substep [43]. In our case, this leads to the following algorithm: (i... Fig. 6.7. Self-convergence experiment for the density current test as in [51], Figure... SIAM J. Sci. Comput., © 2009 Society for Industrial and Applied Mathematics.
An Implicit Solver on A Parallel Block-Structured Adaptive Mesh Grid for FLASH

NASA Astrophysics Data System (ADS)

Lee, D.; Gopal, S.; Mohapatra, P.

2012-07-01

We introduce a fully implicit solver for FLASH based on a Jacobian-Free Newton-Krylov (JFNK) approach with an appropriate preconditioner. The main goal of developing this JFNK-type implicit solver is to provide efficient high-order numerical algorithms and methodology for simulating stiff systems of differential equations on large-scale parallel computer architectures. A large number of natural problems in nonlinear physics involve a wide range of spatial and time scales of interest. A system that encompasses such a wide magnitude of scales is described as "stiff." A stiff system can arise in many different fields of physics, including fluid dynamics/aerodynamics, laboratory/space plasma physics, low Mach number flows, reactive flows, radiation hydrodynamics, and geophysical flows. One of the big challenges in solving such a stiff system using current-day computational resources lies in resolving time and length scales varying by several orders of magnitude. We introduce FLASH's preliminary implementation of a time-accurate JFNK-based implicit solver in the framework of FLASH's unsplit hydro solver.
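The "Jacobian-free" part of a JFNK approach rests on approximating Jacobian-vector products by a finite difference of the residual, so the Jacobian is never formed. The sketch below shows that single ingredient; it is not FLASH code, and the function name, the residual F, and the step-size heuristic are assumptions chosen for illustration.

```python
import math

def jacobian_vector_product(F, u, v, eps=1e-7):
    """Approximate J(u) v by (F(u + h v) - F(u)) / h without forming J.

    F maps a list of floats to a list of floats. The scaling of h is a
    common heuristic, assumed here rather than taken from the source.
    """
    Fu = F(u)
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_v == 0.0:
        return [0.0] * len(Fu)
    norm_u = math.sqrt(sum(c * c for c in u))
    h = eps * (1.0 + norm_u) / norm_v
    F_pert = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(fp - f0) / h for fp, f0 in zip(F_pert, Fu)]

# Hypothetical nonlinear residual with Jacobian [[2*u0, 1], [u1, u0]].
def F(u):
    return [u[0] ** 2 + u[1], u[0] * u[1]]

Jv = jacobian_vector_product(F, [1.0, 2.0], [1.0, 0.0])  # close to [2, 2]
```

A Krylov solver such as GMRES only needs products of this kind, which is why each inner iteration costs roughly one extra residual evaluation.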
A time-implicit numerical method and benchmarks for the relativistic Vlasov–Ampere equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carrie, Michael; Shadwick, B. A.

2016-01-04

Here, we present a time-implicit numerical method to solve the relativistic Vlasov–Ampere system of equations on a two dimensional phase space grid. The time-splitting algorithm we use allows the generalization of the work presented here to higher dimensions keeping the linear aspect of the resulting discrete set of equations. The implicit method is benchmarked against linear theory results for the relativistic Landau damping for which analytical expressions using the Maxwell-Juttner distribution function are derived. We note that, independently from the shape of the distribution function, the relativistic treatment features collective behaviors that do not exist in the nonrelativistic case. The numerical study of the relativistic two-stream instability completes the set of benchmarking tests.
A time-implicit numerical method and benchmarks for the relativistic Vlasov–Ampere equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carrié, Michael, E-mail: mcarrie2@unl.edu; Shadwick, B. A., E-mail: shadwick@mailaps.org

2016-01-15

We present a time-implicit numerical method to solve the relativistic Vlasov–Ampere system of equations on a two dimensional phase space grid. The time-splitting algorithm we use allows the generalization of the work presented here to higher dimensions keeping the linear aspect of the resulting discrete set of equations. The implicit method is benchmarked against linear theory results for the relativistic Landau damping for which analytical expressions using the Maxwell-Jüttner distribution function are derived. We note that, independently from the shape of the distribution function, the relativistic treatment features collective behaviours that do not exist in the nonrelativistic case. The numerical study of the relativistic two-stream instability completes the set of benchmarking tests.
Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue

We present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into a product eigenvalue problem that is self-adjoint with respect to a K-inner product. This product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. However, the other component of the eigenvector can be easily recovered in a postprocessing procedure. Therefore, the algorithms we present here are more efficient than existing algorithms that try to approximate both components of the eigenvectors simultaneously. The efficiency of the new algorithms is demonstrated by numerical examples.
Gram-Schmidt algorithms for covariance propagation

NASA Technical Reports Server (NTRS)

Thornton, C. L.; Bierman, G. J.

1977-01-01

This paper addresses the time propagation of triangular covariance factors. Attention is focused on the square-root free factorization, P = UDU^T, where U is unit upper triangular and D is diagonal. An efficient and reliable algorithm for U-D propagation is derived which employs Gram-Schmidt orthogonalization. Partitioning the state vector to distinguish bias and coloured process noise parameters increases mapping efficiency. Cost comparisons of the U-D, Schmidt square-root covariance and conventional covariance propagation methods are made using weighted arithmetic operation counts. The U-D time update is shown to be less costly than the Schmidt method; and, except in unusual circumstances, it is within 20% of the cost of conventional propagation.
Gram-Schmidt algorithms for covariance propagation

NASA Technical Reports Server (NTRS)

Thornton, C. L.; Bierman, G. J.

1975-01-01

This paper addresses the time propagation of triangular covariance factors. Attention is focused on the square-root free factorization, P = UDU^T, where U is unit upper triangular and D is diagonal. An efficient and reliable algorithm for U-D propagation is derived which employs Gram-Schmidt orthogonalization. Partitioning the state vector to distinguish bias and colored process noise parameters increases mapping efficiency. Cost comparisons of the U-D, Schmidt square-root covariance and conventional covariance propagation methods are made using weighted arithmetic operation counts. The U-D time update is shown to be less costly than the Schmidt method; and, except in unusual circumstances, it is within 20% of the cost of conventional propagation.
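The U-D time update via Gram-Schmidt orthogonalization can be sketched as follows. This is a minimal modified weighted Gram-Schmidt version written from the standard textbook description, not the paper's code: it assumes a transition matrix Phi, a noise input matrix G, and a diagonal process noise Q, and propagates the factors of P = UDU^T without ever forming P. All names and the sample numbers are illustrative assumptions.

```python
def ud_time_update(Phi, U, D, G, Q):
    """Return (U_new, D_new) with U_new diag(D_new) U_new^T = Phi P Phi^T + G diag(Q) G^T.

    U is unit upper triangular, D and Q are lists of positive diagonal entries.
    """
    n, q = len(D), len(Q)
    # W = [Phi*U, G]; its rows are orthogonalized in the inner product weighted by [D, Q]
    W = [[sum(Phi[i][k] * U[k][j] for k in range(n)) for j in range(n)] + list(G[i])
         for i in range(n)]
    Dt = list(D) + list(Q)
    U_new = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D_new = [0.0] * n
    for j in range(n - 1, -1, -1):
        D_new[j] = sum(W[j][k] ** 2 * Dt[k] for k in range(n + q))
        for i in range(j):
            U_new[i][j] = sum(W[i][k] * Dt[k] * W[j][k] for k in range(n + q)) / D_new[j]
            # remove the component of row i along the already-orthogonal row j
            W[i] = [W[i][k] - U_new[i][j] * W[j][k] for k in range(n + q)]
    return U_new, D_new

def reconstruct(U, D):
    """Form U diag(D) U^T as a dense matrix (for checking only)."""
    n = len(D)
    return [[sum(U[i][k] * D[k] * U[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical two-state example.
Phi = [[1.0, 1.0], [0.0, 1.0]]
U = [[1.0, 0.5], [0.0, 1.0]]
D = [2.0, 1.0]
G = [[0.5], [1.0]]
Q = [0.1]
U1, D1 = ud_time_update(Phi, U, D, G, Q)
P1 = reconstruct(U1, D1)
```

Working on the factors keeps the propagated covariance symmetric and positive by construction, which is the numerical motivation for the factorized form.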
Analysis of Modified SMI Method for Adaptive Array Weight Control. M.S. Thesis

NASA Technical Reports Server (NTRS)

Dilsavor, Ronald Louis

1989-01-01

An adaptive array is used to receive a desired signal in the presence of weak interference signals which need to be suppressed. A modified sample matrix inversion (SMI) algorithm controls the array weights. The modification leads to increased interference suppression by subtracting a fraction of the noise power from the diagonal elements of the covariance matrix. The modified algorithm maximizes an intuitive power ratio criterion. The expected values and variances of the array weights, output powers, and power ratios as functions of the fraction and the number of snapshots are found and compared to computer simulation and real experimental array performance. Reduced-rank covariance approximations and errors in the estimated covariance are also described.
Implementation in an FPGA circuit of Edge detection algorithm based on the Discrete Wavelet Transforms

NASA Astrophysics Data System (ADS)

Bouganssa, Issam; Sbihi, Mohamed; Zaim, Mounia

2017-07-01

The 2D Discrete Wavelet Transform (DWT) is a computationally intensive task that is usually implemented on specific architectures in many real-time imaging systems. In this paper, a high throughput edge or contour detection algorithm is proposed based on the discrete wavelet transform. A technique for applying the filters in the three directions (horizontal, vertical and diagonal) of the image is used to reveal as many of the existing contours as possible. The proposed architectures were designed in VHDL and mapped to a Xilinx Spartan-6 FPGA. The synthesis results show that the proposed architecture has a low area cost and can operate at up to 100 MHz, which allows it to perform 2D wavelet analysis for a sequence of images while maintaining the flexibility of the system to support an adaptive algorithm.
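To make the three directional detail bands concrete, here is a software sketch of one level of a 2D Haar transform, the simplest DWT. The abstract does not say which wavelet the hardware uses, so the Haar filter, the function name, and the subband naming convention (a band is named after the edge orientation it responds to) are all assumptions.

```python
def haar_details(image):
    """One-level 2D Haar detail coefficients on non-overlapping 2x2 blocks.

    image is a list of equal-length rows of numbers with even dimensions.
    Returns (horizontal, vertical, diagonal) detail bands.
    """
    horiz, vert, diag = [], [], []
    for r in range(0, len(image) - 1, 2):
        h_row, v_row, d_row = [], [], []
        for c in range(0, len(image[0]) - 1, 2):
            a, b = image[r][c], image[r][c + 1]
            e, f = image[r + 1][c], image[r + 1][c + 1]
            h_row.append((a + b - e - f) / 2)  # top vs bottom: horizontal edges
            v_row.append((a - b + e - f) / 2)  # left vs right: vertical edges
            d_row.append((a - b - e + f) / 2)  # checkerboard: diagonal structure
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return horiz, vert, diag

# A block containing a vertical edge excites only the vertical band.
h, v, d = haar_details([[0, 1], [0, 1]])
```

Thresholding the magnitude of the three bands and combining them is one simple way to turn these coefficients into an edge map.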
Electron-Phonon Systems on a Universal Quantum Computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Macridin, Alexandru; Spentzouris, Panagiotis; Amundson, James

We present an algorithm that extends existing quantum algorithms for simulating fermion systems in quantum chemistry and condensed matter physics to include phonons. The phonon degrees of freedom are represented with exponential accuracy on a truncated Hilbert space with a size that increases linearly with the cutoff of the maximum phonon number. The additional number of qubits required by the presence of phonons scales linearly with the size of the system. The additional circuit depth is constant for systems with finite-range electron-phonon and phonon-phonon interactions and linear for long-range electron-phonon interactions. Our algorithm for a Holstein polaron problem was implemented on an Atos Quantum Learning Machine (QLM) quantum simulator employing the Quantum Phase Estimation method. The energy and the phonon number distribution of the polaron state agree with exact diagonalization results for weak, intermediate and strong electron-phonon coupling regimes.
Implementation of a kappa-epsilon turbulence model to RPLUS3D code

NASA Technical Reports Server (NTRS)

Chitsomboon, Tawit

1992-01-01

The RPLUS3D code has been developed at the NASA Lewis Research Center to support the National Aerospace Plane (NASP) project. The code has the ability to solve three dimensional flowfields with finite rate combustion of hydrogen and air. The combustion process of the hydrogen-air system is simulated by an 18 reaction path, 8 species chemical kinetic mechanism. The code uses a Lower-Upper (LU) decomposition numerical algorithm as its basis, making it a very efficient and robust code. Except for the Jacobian matrix for the implicit chemistry source terms, there is no inversion of a matrix even though a fully implicit numerical algorithm is used. A k-epsilon turbulence model has recently been incorporated into the code. Initial validations have been conducted for a flow over a flat plate. Results of the validation studies are shown. Some difficulties in implementing the k-epsilon equations to the code are also discussed.
Implementation of a kappa-epsilon turbulence model to RPLUS3D code

NASA Astrophysics Data System (ADS)

Chitsomboon, Tawit

1992-02-01

The RPLUS3D code has been developed at the NASA Lewis Research Center to support the National Aerospace Plane (NASP) project. The code has the ability to solve three dimensional flowfields with finite rate combustion of hydrogen and air. The combustion process of the hydrogen-air system is simulated by an 18 reaction path, 8 species chemical kinetic mechanism. The code uses a Lower-Upper (LU) decomposition numerical algorithm as its basis, making it a very efficient and robust code. Except for the Jacobian matrix for the implicit chemistry source terms, there is no inversion of a matrix even though a fully implicit numerical algorithm is used. A k-epsilon turbulence model has recently been incorporated into the code. Initial validations have been conducted for a flow over a flat plate. Results of the validation studies are shown. Some difficulties in implementing the k-epsilon equations to the code are also discussed.
Filtering of non-linear instabilities. [from finite difference solution of fluid dynamics equations

NASA Technical Reports Server (NTRS)

Khosla, P. K.; Rubin, S. G.

1979-01-01

For Courant numbers larger than one and cell Reynolds numbers larger than two, oscillations and in some cases instabilities are typically found with implicit numerical solutions of the fluid dynamics equations. This behavior has sometimes been associated with the loss of diagonal dominance of the coefficient matrix. It is shown here that these problems can in fact be related to the choice of the spatial differences, with the resulting instability related to aliasing or nonlinear interaction. Appropriate 'filtering' can reduce the intensity of these oscillations and in some cases possibly eliminate the instability. These filtering procedures are equivalent to a weighted average of conservation and non-conservation differencing. The entire spectrum of filtered equations retains a three-point character as well as second-order spatial accuracy. Burgers equation has been considered as a model. Several filters are examined in detail, and smooth solutions have been obtained for extremely large cell Reynolds numbers.
Urdu Nasta'liq text recognition using implicit segmentation based on multi-dimensional long short term memory neural networks.

PubMed

Naz, Saeeda; Umar, Arif Iqbal; Ahmed, Riaz; Razzak, Muhammad Imran; Rashid, Sheikh Faisal; Shafait, Faisal

2016-01-01

The recognition of Arabic script and its derivatives such as Urdu, Persian, Pashto etc. is a difficult task due to complexity of this script. Particularly, Urdu text recognition is more difficult due to its Nasta'liq writing style. Nasta'liq writing style inherits complex calligraphic nature, which presents major issues to recognition of Urdu text owing to diagonality in writing, high cursiveness, context sensitivity and overlapping of characters. Therefore, the work done for recognition of Arabic script cannot be directly applied to Urdu recognition. We present Multi-dimensional Long Short Term Memory (MDLSTM) Recurrent Neural Networks with an output layer designed for sequence labeling for recognition of printed Urdu text-lines written in the Nasta'liq writing style. Experiments show that MDLSTM attained a recognition accuracy of 98% for the unconstrained Urdu Nasta'liq printed text, which significantly outperforms the state-of-the-art techniques.
An efficient method for solving the steady Euler equations

NASA Technical Reports Server (NTRS)

Liou, M. S.

1986-01-01

An efficient numerical procedure for solving a set of nonlinear partial differential equations is given, specifically for the steady Euler equations. Solutions of the equations were obtained by Newton's linearization procedure, commonly used to find the roots of nonlinear algebraic equations. In application of the same procedure for solving a set of differential equations we give a theorem showing that a quadratic convergence rate can be achieved. While the domain of quadratic convergence depends on the problems studied and is unknown a priori, we show that first- and second-order derivatives of flux vectors determine whether the condition for quadratic convergence is satisfied. The first derivatives enter as an implicit operator for yielding new iterates and the second derivatives indicate the smoothness of the flows considered. Consequently, flows involving shocks are expected to require a larger number of iterations. First-order upwind discretization in conjunction with the Steger-Warming flux-vector splitting is employed on the implicit operator and a diagonally dominant matrix results. However, the explicit operator is represented by first- and second-order upwind differencings, using both Steger-Warming's and van Leer's splittings. We discuss treatment of boundary conditions and solution procedures for solving the resulting block matrix system. With a set of test problems for one- and two-dimensional flows, we present a detailed study of the efficiency, accuracy, and convergence of the present method.
Multigrid Methods for Fully Implicit Oil Reservoir Simulation

NASA Technical Reports Server (NTRS)

Molenaar, J.

1996-01-01

In this paper we consider the simultaneous flow of oil and water in reservoir rock. This displacement process is modeled by two basic equations: the material balance or continuity equations and the equation of motion (Darcy's law). For the numerical solution of this system of nonlinear partial differential equations there are two approaches: the fully implicit or simultaneous solution method and the sequential solution method. In the sequential solution method the system of partial differential equations is manipulated to give an elliptic pressure equation and a hyperbolic (or parabolic) saturation equation. In the IMPES approach the pressure equation is first solved, using values for the saturation from the previous time level. Next the saturations are updated by some explicit time stepping method; this implies that the method is only conditionally stable. For the numerical solution of the linear, elliptic pressure equation multigrid methods have become an accepted technique. On the other hand, the fully implicit method is unconditionally stable, but it has the disadvantage that in every time step a large system of nonlinear algebraic equations has to be solved. The most time-consuming part of any fully implicit reservoir simulator is the solution of this large system of equations. Usually this is done by Newton's method. The resulting systems of linear equations are then either solved by a direct method or by some conjugate gradient type method. In this paper we consider the possibility of applying multigrid methods for the iterative solution of the systems of nonlinear equations. There are two ways of using multigrid for this job: either we use a nonlinear multigrid method or we use a linear multigrid method to deal with the linear systems that arise in Newton's method. So far only a few authors have reported on the use of multigrid methods for fully implicit simulations. 
A two-level FAS algorithm is presented for the black-oil equations, and linear multigrid for two-phase flow problems with strong heterogeneities and anisotropies is studied. Here we consider both possibilities. Moreover, we present a novel way of constructing the coarse grid correction operator in linear multigrid algorithms. This approach has the advantage that it preserves the sparsity pattern of the fine grid matrix and it can be extended to systems of equations in a straightforward manner. We compare the linear and nonlinear multigrid algorithms by means of a numerical experiment.
Nonuniform grid implicit spatial finite difference method for acoustic wave modeling in tilted transversely isotropic media

NASA Astrophysics Data System (ADS)

Chu, Chunlei; Stoffa, Paul L.

2012-01-01

Discrete earth models are commonly represented by uniform structured grids. In order to ensure accurate numerical description of all wave components propagating through these uniform grids, the grid size must be determined by the slowest velocity of the entire model. Consequently, high velocity areas are always oversampled, which inevitably increases the computational cost. A practical solution to this problem is to use nonuniform grids. We propose a nonuniform grid implicit spatial finite difference method which utilizes nonuniform grids to obtain high efficiency and relies on implicit operators to achieve high accuracy. We present a simple way of deriving implicit finite difference operators of arbitrary stencil widths on general nonuniform grids for the first and second derivatives and, as a demonstration example, apply these operators to the pseudo-acoustic wave equation in tilted transversely isotropic (TTI) media. We propose an efficient gridding algorithm that can be used to convert uniformly sampled models onto vertically nonuniform grids. We use a 2D TTI salt model to demonstrate its effectiveness and show that the nonuniform grid implicit spatial finite difference method can produce highly accurate seismic modeling results with enhanced efficiency, compared to uniform grid explicit finite difference implementations.
A New Cell-Centered Implicit Numerical Scheme for Ions in the 2-D Axisymmetric Code Hall2de

NASA Technical Reports Server (NTRS)

Lopez Ortega, Alejandro; Mikellides, Ioannis G.

2014-01-01

We present a new algorithm in the Hall2De code to simulate the ion hydrodynamics in the acceleration channel and near plume regions of Hall-effect thrusters. This implementation constitutes an upgrade of the capabilities built in the Hall2De code. The equations of mass conservation and momentum for unmagnetized ions are solved using a conservative, finite-volume, cell-centered scheme on a magnetic-field-aligned grid. Major computational savings are achieved by making use of an implicit predictor/multi-corrector algorithm for time evolution. Inaccuracies in the prediction of the motion of low-energy ions in the near plume in hydrodynamics approaches are addressed by implementing a multi-fluid algorithm that tracks ions of different energies separately. A wide range of comparisons with measurements are performed to validate the new ion algorithms. Several numerical experiments with the location and value of the anomalous collision frequency are also presented. Differences in the plasma properties in the near-plume between the single fluid and multi-fluid approaches are discussed. We complete our validation by comparing predicted erosion rates at the channel walls of the thruster with measurements. Erosion rates predicted by the plasma properties obtained from simulations replicate accurately measured rates of erosion within the uncertainty range of the sputtering models employed.
Multi-color incomplete Cholesky conjugate gradient methods for vector computers. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Poole, E. L.

1986-01-01

In this research, we are concerned with the solution on vector computers of linear systems of equations, Ax = b, where A is a large, sparse symmetric positive definite matrix. We solve the system using an iterative method, the incomplete Cholesky conjugate gradient method (ICCG). We apply a multi-color strategy to obtain p-color matrices for which a block-oriented ICCG method is implemented on the CYBER 205. (A p-colored matrix is a matrix which can be partitioned into a p x p block matrix where the diagonal blocks are diagonal matrices). This algorithm, which is based on a no-fill strategy, achieves O(N/p) length vector operations in both the decomposition of A and in the forward and back solves necessary at each iteration of the method. We discuss the natural ordering of the unknowns as an ordering that minimizes the number of diagonals in the matrix and define multi-color orderings in terms of disjoint sets of the unknowns. We give necessary and sufficient conditions to determine which multi-color orderings of the unknowns correspond to p-color matrices. A performance model is given which is used both to predict execution time for ICCG methods and also to compare an ICCG method to conjugate gradient without preconditioning or another ICCG method. Results are given from runs on the CYBER 205 at NASA's Langley Research Center for four model problems.
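The definition of a p-color matrix can be checked directly on a small example. The sketch below assumes the 5-point Laplacian stencil on a rectangular grid, which is not necessarily one of the thesis's model problems, and tests whether a coloring leaves no coupling between unknowns of the same color, which is equivalent to the diagonal blocks being diagonal. The function names are hypothetical.

```python
def five_point_neighbors(nx, ny):
    """Couplings of the 5-point stencil on an nx-by-ny grid, natural ordering."""
    nbrs = {}
    for j in range(ny):
        for i in range(nx):
            candidates = ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
            nbrs[j * nx + i] = [jj * nx + ii for ii, jj in candidates
                                if 0 <= ii < nx and 0 <= jj < ny]
    return nbrs

def red_black_colors(nx, ny):
    """Two-color (red-black) assignment: color = parity of i + j."""
    return {j * nx + i: (i + j) % 2 for j in range(ny) for i in range(nx)}

def is_p_color(nbrs, colors):
    """True if no unknown is coupled to another unknown of the same color."""
    return all(colors[k] != colors[m] for k, ms in nbrs.items() for m in ms)

nbrs = five_point_neighbors(4, 3)
ok = is_p_color(nbrs, red_black_colors(4, 3))              # red-black works
column_colors = {k: (k % 4) % 2 for k in nbrs}             # color by column parity only
bad = is_p_color(nbrs, column_colors)                      # vertical neighbors clash
```

When the check passes, all unknowns of one color can be updated together as a single long vector operation, which is the source of the O(N/p) vector lengths mentioned in the abstract.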
On K-Line and K x K Block Iterative Schemes for a Problem Arising in 3-D Elliptic Difference Equations.

DTIC Science & Technology

1980-01-01

...4b) are obtained from the well known algorithm for solving diagonally dominant tridiagonal systems; see [16, 10]. The monotonicity of the Ej and the
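The record above is only a fragment, so the report's own formulas are not recoverable here. As a hedged illustration of what a "well known algorithm for solving diagonally dominant tridiagonal systems" commonly refers to, here is a minimal sketch of the Thomas algorithm (Gaussian elimination without pivoting). The function name `solve_tridiagonal` and the array layout are assumptions made for this example.

```python
import numpy as np

def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the Thomas algorithm (no pivoting).

    lower[i] is the entry left of diag[i] (lower[0] is unused) and
    upper[i] is the entry right of diag[i] (upper[-1] is unused).
    Diagonal dominance keeps the pivots away from zero.
    """
    n = len(diag)
    c = np.zeros(n)  # modified super-diagonal
    d = np.zeros(n)  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot
    # Back substitution.
    x = np.zeros(n)
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The cost is O(n) operations and O(n) storage, which is why line-iterative schemes that reduce to many independent tridiagonal solves are attractive.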
Implicit gas-kinetic unified algorithm based on multi-block docking grid for multi-body reentry flows covering all flow regimes

NASA Astrophysics Data System (ADS)

Peng, Ao-Ping; Li, Zhi-Hui; Wu, Jun-Lin; Jiang, Xin-Yu

2016-12-01

Building on previous research on the Gas-Kinetic Unified Algorithm (GKUA) for flows ranging from the highly rarefied free-molecule regime through the transition regime to the continuum, a new implicit cell-centered finite volume scheme is presented for directly solving the unified Boltzmann model equation covering various flow regimes. Because high-quality single-block grids are difficult to generate for complex irregular bodies, a multi-block docking grid generation method is designed on the basis of data transmission between blocks, and a data structure is constructed for processing arbitrary connection relations between blocks with high efficiency and reliability. As a result, the gas-kinetic unified algorithm with the implicit scheme and multi-block docking grid has been established for the first time and used to solve reentry flow problems around multiple bodies covering all flow regimes, with Knudsen numbers ranging from 10 to 3.7E-6. The implicit and explicit schemes are both applied to supersonic flows around a circular cylinder in the near-continuum and continuum regimes and are carefully compared with each other. It is shown that the present algorithm and modelling possess much higher computational efficiency and faster convergence. Flow problems including two and three side-by-side cylinders are simulated from highly rarefied to near-continuum flow regimes, and the computed results are found to be in good agreement with the related DSMC simulations and theoretical solutions, which verifies the accuracy and reliability of the present method. It is observed that the smaller the spacing between the bodies, the greater the obstruction at the throat between the cylinders, the more pronounced the asymmetry of the flow field around each single body, and the larger the normal force coefficient. In the near-continuum transitional flow regime of the near-space flight environment, when the spacing between the bodies increases to six times the diameter of a single body, the interference effects between the bodies tend to become negligible. The computing practice has confirmed that it is feasible for the present method to compute the aerodynamics and reveal the flow mechanisms around complex multi-body vehicles covering all flow regimes from the gas-kinetic point of view of solving the unified Boltzmann model velocity distribution function equation.
A parallel domain decomposition-based implicit method for the Cahn–Hilliard–Cook phase-field equation in 3D

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zheng, Xiang; Yang, Chao; State Key Laboratory of Computer Science, Chinese Academy of Sciences, Beijing 100190

2015-03-15

We present a numerical algorithm for simulating the spinodal decomposition described by the three dimensional Cahn–Hilliard–Cook (CHC) equation, which is a fourth-order stochastic partial differential equation with a noise term. The equation is discretized in space and time based on a fully implicit, cell-centered finite difference scheme, with an adaptive time-stepping strategy designed to accelerate the progress to equilibrium. At each time step, a parallel Newton–Krylov–Schwarz algorithm is used to solve the nonlinear system. We discuss various numerical and computational challenges associated with the method. The numerical scheme is validated by a comparison with an explicit scheme of high accuracy (and unreasonably high cost). We present steady state solutions of the CHC equation in two and three dimensions. The effect of the thermal fluctuation on the spinodal decomposition process is studied. We show that the existence of the thermal fluctuation accelerates the spinodal decomposition process and that the final steady morphology is sensitive to the stochastic noise. We also show the evolution of the energies and statistical moments. In terms of the parallel performance, it is found that the implicit domain decomposition approach scales well on supercomputers with a large number of processors.
Fully implicit adaptive mesh refinement algorithm for reduced MHD

NASA Astrophysics Data System (ADS)

Philip, Bobby; Pernice, Michael; Chacon, Luis

2006-10-01

In the macroscopic simulation of plasmas, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. Traditional approaches based on explicit time integration techniques and fixed meshes are not suitable for this challenge, as such approaches prevent the modeler from using realistic plasma parameters to keep the computation feasible. We propose here a novel approach, based on implicit methods and structured adaptive mesh refinement (SAMR). Our emphasis is on both accuracy and scalability with the number of degrees of freedom. As a proof-of-principle, we focus on the reduced resistive MHD model as a basic MHD model paradigm, which is truly multiscale. The approach taken here is to adapt mature physics-based technology to AMR grids, and employ AMR-aware multilevel techniques (such as fast adaptive composite grid --FAC-- algorithms) for scalability. We demonstrate that the concept is indeed feasible, featuring near-optimal scalability under grid refinement. Results of fully-implicit, dynamically-adaptive AMR simulations in challenging dissipation regimes will be presented on a variety of problems that benefit from this capability, including tearing modes, the island coalescence instability, and the tilt mode instability. L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002); B. Philip, M. Pernice, and L. Chacón, Lecture Notes in Computational Science and Engineering, accepted (2006)
Asymmetric color image encryption based on singular value decomposition

NASA Astrophysics Data System (ADS)

Yao, Lili; Yuan, Caojin; Qiang, Junjie; Feng, Shaotong; Nie, Shouping

2017-02-01

A novel asymmetric color image encryption approach by using singular value decomposition (SVD) is proposed. The original color image is encrypted into a ciphertext shown as an indexed image by using the proposed method. The red, green and blue components of the color image are subsequently encoded into a complex function which is then separated into U, S and V parts by SVD. The data matrix of the ciphertext is obtained by multiplying orthogonal matrices U and V while implementing phase-truncation. Diagonal entries of the three diagonal matrices of the SVD results are extracted, scrambled and combined to construct the colormap of the ciphertext. Thus, the encrypted indexed image covers less space than the original image. For decryption, the original color image cannot be recovered without private keys which are obtained from phase-truncation and the orthogonality of V. Computer simulations are presented to evaluate the performance of the proposed algorithm. We also analyze the security of the proposed system.
Direct reconstruction of cardiac PET kinetic parametric images using a preconditioned conjugate gradient approach

PubMed Central

Rakvongthai, Yothin; Ouyang, Jinsong; Guerin, Bastien; Li, Quanzheng; Alpert, Nathaniel M.; El Fakhri, Georges

2013-01-01

Purpose: Our research goal is to develop an algorithm to reconstruct cardiac positron emission tomography (PET) kinetic parametric images directly from sinograms and compare its performance with the conventional indirect approach. Methods: Time activity curves of a NCAT phantom were computed according to a one-tissue compartmental kinetic model with realistic kinetic parameters. The sinograms at each time frame were simulated using the activity distribution for the time frame. The authors reconstructed the parametric images directly from the sinograms by optimizing a cost function, which included the Poisson log-likelihood and a spatial regularization term, using the preconditioned conjugate gradient (PCG) algorithm with the proposed preconditioner. The proposed preconditioner is a diagonal matrix whose diagonal entries are the ratio of the parameter and the sensitivity of the radioactivity associated with the parameter. The authors compared the reconstructed parametric images using the direct approach with those reconstructed using the conventional indirect approach. Results: At the same bias, the direct approach yielded significant relative reduction in standard deviation by 12%–29% and 32%–70% for 50 × 10^6 and 10 × 10^6 detected coincidence counts, respectively. Also, the PCG method effectively reached a constant value after only 10 iterations (with numerical convergence achieved after 40–50 iterations), while more than 500 iterations were needed for CG. Conclusions: The authors have developed a novel approach based on the PCG algorithm to directly reconstruct cardiac PET parametric images from sinograms, and yield better estimation of kinetic parameters than the conventional indirect approach, i.e., curve fitting of reconstructed images. The PCG method increases the convergence rate of reconstruction significantly as compared to the conventional CG method. PMID:24089922
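To make the role of a diagonal preconditioner concrete, here is a minimal sketch of preconditioned conjugate gradients with a diagonal preconditioner. It is not the authors' reconstruction method: it is applied to a generic symmetric positive definite linear system rather than to the Poisson log-likelihood, and the Jacobi choice (inverse of the matrix diagonal) stands in for the parameter-to-sensitivity ratio described in the abstract. The function name `pcg_diagonal` and its arguments are hypothetical.

```python
import numpy as np

def pcg_diagonal(A, b, m_inv_diag, tol=1e-10, max_iter=1000):
    """Preconditioned CG for SPD A; m_inv_diag holds the diagonal of M^{-1}."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    z = m_inv_diag * r          # applying a diagonal preconditioner is elementwise
    p = z.copy()
    rz = r @ z
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        z = m_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Assumed test problem: a badly scaled SPD matrix, where Jacobi scaling helps.
n = 40
B = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
s = np.linspace(1.0, 100.0, n)
A = (s[:, None] * B) * s[None, :]
b = A @ np.ones(n)
x_pcg, it_pcg = pcg_diagonal(A, b, 1.0 / np.diag(A))
x_cg, it_cg = pcg_diagonal(A, b, np.ones(n))  # identity preconditioner = plain CG
print(it_pcg, it_cg)
```

Passing a vector of ones as `m_inv_diag` recovers unpreconditioned CG, so the two iteration counts can be compared directly on the same problem.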
Approximate Joint Diagonalization and Geometric Mean of Symmetric Positive Definite Matrices

PubMed Central

Congedo, Marco; Afsari, Bijan; Barachant, Alexandre; Moakher, Maher

2015-01-01

We explore the connection between two problems that have arisen independently in the signal processing and related fields: the estimation of the geometric mean of a set of symmetric positive definite (SPD) matrices and their approximate joint diagonalization (AJD). Today there is a considerable interest in estimating the geometric mean of a SPD matrix set in the manifold of SPD matrices endowed with the Fisher information metric. The resulting mean has several important invariance properties and has proven very useful in diverse engineering applications such as biomedical and image data processing. While for two SPD matrices the mean has an algebraic closed form solution, for a set of more than two SPD matrices it can only be estimated by iterative algorithms. However, none of the existing iterative algorithms feature at the same time fast convergence, low computational complexity per iteration and guarantee of convergence. For this reason, recently other definitions of geometric mean based on symmetric divergence measures, such as the Bhattacharyya divergence, have been considered. The resulting means, although possibly useful in practice, do not satisfy all desirable invariance properties. In this paper we consider geometric means of covariance matrices estimated on high-dimensional time-series, assuming that the data is generated according to an instantaneous mixing model, which is very common in signal processing. We show that in these circumstances we can approximate the Fisher information geometric mean by employing an efficient AJD algorithm. Our approximation is in general much closer to the Fisher information geometric mean as compared to its competitors and verifies many invariance properties. Furthermore, convergence is guaranteed, the computational complexity is low and the convergence rate is quadratic. The accuracy of this new geometric mean approximation is demonstrated by means of simulations. PMID:25919667
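The closed form mentioned above for two SPD matrices is commonly written as A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}. The sketch below evaluates that expression with symmetric eigendecompositions; the helper names `spd_power` and `geometric_mean_two` are assumptions for illustration, and the code does not attempt the AJD-based approximation for larger sets that the paper proposes.

```python
import numpy as np

def spd_power(M, exponent):
    """Raise a symmetric positive definite matrix to a real power via eigh."""
    w, V = np.linalg.eigh(M)
    return (V * w**exponent) @ V.T

def geometric_mean_two(A, B):
    """Geometric mean of two SPD matrices: A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    A_half = spd_power(A, 0.5)
    A_neg_half = spd_power(A, -0.5)
    middle = spd_power(A_neg_half @ B @ A_neg_half, 0.5)
    return A_half @ middle @ A_half

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
A = X @ X.T + 4.0 * np.eye(4)
B = Y @ Y.T + 4.0 * np.eye(4)
G = geometric_mean_two(A, B)
# G solves the Riccati equation G A^{-1} G = B and is symmetric in its arguments.
print(np.allclose(G @ np.linalg.inv(A) @ G, B), np.allclose(G, geometric_mean_two(B, A)))
```

For commuting matrices the expression reduces to (AB)^{1/2}, the matrix analogue of the scalar geometric mean.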

Direct reconstruction of cardiac PET kinetic parametric images using a preconditioned conjugate gradient approach.

PubMed

Rakvongthai, Yothin; Ouyang, Jinsong; Guerin, Bastien; Li, Quanzheng; Alpert, Nathaniel M; El Fakhri, Georges

2013-10-01

Our research goal is to develop an algorithm to reconstruct cardiac positron emission tomography (PET) kinetic parametric images directly from sinograms and compare its performance with the conventional indirect approach. Time activity curves of a NCAT phantom were computed according to a one-tissue compartmental kinetic model with realistic kinetic parameters. The sinograms at each time frame were simulated using the activity distribution for the time frame. The authors reconstructed the parametric images directly from the sinograms by optimizing a cost function, which included the Poisson log-likelihood and a spatial regularization term, using the preconditioned conjugate gradient (PCG) algorithm with the proposed preconditioner. The proposed preconditioner is a diagonal matrix whose diagonal entries are the ratio of the parameter and the sensitivity of the radioactivity associated with the parameter. The authors compared the reconstructed parametric images using the direct approach with those reconstructed using the conventional indirect approach. At the same bias, the direct approach yielded significant relative reduction in standard deviation by 12%-29% and 32%-70% for 50 × 10^6 and 10 × 10^6 detected coincidence counts, respectively. Also, the PCG method effectively reached a constant value after only 10 iterations (with numerical convergence achieved after 40-50 iterations), while more than 500 iterations were needed for CG. The authors have developed a novel approach based on the PCG algorithm to directly reconstruct cardiac PET parametric images from sinograms, and yield better estimation of kinetic parameters than the conventional indirect approach, i.e., curve fitting of reconstructed images. The PCG method increases the convergence rate of reconstruction significantly as compared to the conventional CG method.
Discrete variable representation in electronic structure theory: quadrature grids for least-squares tensor hypercontraction.

PubMed

Parrish, Robert M; Hohenstein, Edward G; Martínez, Todd J; Sherrill, C David

2013-05-21

We investigate the application of molecular quadratures obtained from either standard Becke-type grids or discrete variable representation (DVR) techniques to the recently developed least-squares tensor hypercontraction (LS-THC) representation of the electron repulsion integral (ERI) tensor. LS-THC uses least-squares fitting to renormalize a two-sided pseudospectral decomposition of the ERI, over a physical-space quadrature grid. While this procedure is technically applicable with any choice of grid, the best efficiency is obtained when the quadrature is tuned to accurately reproduce the overlap metric for quadratic products of the primary orbital basis. Properly selected Becke DFT grids can roughly attain this property. Additionally, we provide algorithms for adopting the DVR techniques of the dynamics community to produce two different classes of grids which approximately attain this property. The simplest algorithm is radial discrete variable representation (R-DVR), which diagonalizes the finite auxiliary-basis representation of the radial coordinate for each atom, and then combines Lebedev-Laikov spherical quadratures and Becke atomic partitioning to produce the full molecular quadrature grid. The other algorithm is full discrete variable representation (F-DVR), which uses approximate simultaneous diagonalization of the finite auxiliary-basis representation of the full position operator to produce non-direct-product quadrature grids. The qualitative features of all three grid classes are discussed, and then the relative efficiencies of these grids are compared in the context of LS-THC-DF-MP2. Coarse Becke grids are found to give essentially the same accuracy and efficiency as R-DVR grids; however, the latter are built from explicit knowledge of the basis set and may guide future development of atom-centered grids. F-DVR is found to provide reasonable accuracy with markedly fewer points than either Becke or R-DVR schemes.
Transient analysis of a thermal storage unit involving a phase change material

NASA Technical Reports Server (NTRS)

Griggs, E. I.; Pitts, D. R.; Humphries, W. R.

1974-01-01

The transient response of a single cell of a typical phase change material type thermal capacitor has been modeled using numerical conductive heat transfer techniques. The cell consists of a base plate, an insulated top, and two vertical walls (fins) forming a two-dimensional cavity filled with a phase change material. Both explicit and implicit numerical formulations are outlined. A mixed explicit-implicit scheme which treats the fin implicitly while treating the phase change material explicitly is discussed. A band algorithmic scheme is used to reduce computer storage requirements for the implicit approach while retaining a relatively fine grid. All formulations are presented in dimensionless form thereby enabling application to geometrically similar problems. Typical parametric results are graphically presented for the case of melting with constant heat input to the base of the cell.
Fast localized orthonormal virtual orbitals which depend smoothly on nuclear coordinates.

PubMed

Subotnik, Joseph E; Dutoi, Anthony D; Head-Gordon, Martin

2005-09-15

We present here an algorithm for computing stable, well-defined localized orthonormal virtual orbitals which depend smoothly on nuclear coordinates. The algorithm is very fast, limited only by diagonalization of two matrices with dimension the size of the number of virtual orbitals. Furthermore, we require no more than quadratic (in the number of electrons) storage. The basic premise behind our algorithm is that one can decompose any given atomic-orbital (AO) vector space as a minimal basis space (which includes the occupied and valence virtual spaces) and a hard-virtual (HV) space (which includes everything else). The valence virtual space localizes easily with standard methods, while the hard-virtual space is constructed to be atom centered and automatically local. The orbitals presented here may be computed almost as quickly as projecting the AO basis onto the virtual space and are almost as local (according to orbital variance), while our orbitals are orthonormal (rather than redundant and nonorthogonal). We expect this algorithm to find use in local-correlation methods.
Discrete Diffusion Monte Carlo for Electron Thermal Transport

NASA Astrophysics Data System (ADS)

Chenhall, Jeffrey; Cao, Duc; Wollaeger, Ryan; Moses, Gregory

2014-10-01

The iSNB (implicit Schurtz Nicolai Busquet) electron thermal transport method of Cao et al. is adapted to a Discrete Diffusion Monte Carlo (DDMC) solution method for eventual inclusion in a hybrid IMC-DDMC (Implicit Monte Carlo) method. The hybrid method will combine the efficiency of a diffusion method in short mean free path regions with the accuracy of a transport method in long mean free path regions. The Monte Carlo nature of the approach allows the algorithm to be massively parallelized. Work to date on the iSNB-DDMC method will be presented. This work was supported by Sandia National Laboratory - Albuquerque.
Modifications of the PCPT method for HJB equations

NASA Astrophysics Data System (ADS)

Kossaczký, I.; Ehrhardt, M.; Günther, M.

2016-10-01

In this paper we will revisit the modification of the piecewise constant policy timestepping (PCPT) method for solving Hamilton-Jacobi-Bellman (HJB) equations. This modification is called the piecewise predicted policy timestepping (PPPT) method and, if properly used, it may be significantly faster. We will quickly recapitulate the algorithms of the PCPT and PPPT methods and of the classical implicit method and apply them to a passport option pricing problem with non-standard payoff. We will present modifications needed to solve this problem effectively with the PPPT method and compare the performance with the PCPT method and the classical implicit method.
Solidification of a binary mixture

NASA Technical Reports Server (NTRS)

Antar, B. N.

1982-01-01

The time dependent concentration and temperature profiles of a finite layer of a binary mixture are investigated during solidification. The coupled time dependent Stefan problem is solved numerically using an implicit finite differencing algorithm with the method of lines. Specifically, the temporal operator is approximated via an implicit finite difference operator resulting in a coupled set of ordinary differential equations for the spatial distribution of the temperature and concentration for each time. Since the resulting set of differential equations forms a boundary value problem with matching conditions at an unknown spatial point, the method of invariant imbedding is used for its solution.
A Sparse Self-Consistent Field Algorithm and Its Parallel Implementation: Application to Density-Functional-Based Tight Binding.

PubMed

Scemama, Anthony; Renon, Nicolas; Rapacioli, Mathias

2014-06-10

We present an algorithm and its parallel implementation for solving a self-consistent problem as encountered in Hartree-Fock or density functional theory. The algorithm takes advantage of the sparsity of matrices through the use of local molecular orbitals. The implementation allows one to exploit efficiently modern symmetric multiprocessing (SMP) computer architectures. As a first application, the algorithm is used within the density-functional-based tight binding method, for which most of the computational time is spent in the linear algebra routines (diagonalization of the Fock/Kohn-Sham matrix). We show that with this algorithm (i) single point calculations on very large systems (millions of atoms) can be performed on large SMP machines, (ii) calculations involving intermediate size systems (1000-100 000 atoms) are also strongly accelerated and can run efficiently on standard servers, and (iii) the error on the total energy due to the use of a cutoff in the molecular orbital coefficients can be controlled such that it remains smaller than the SCF convergence criterion.
Joint Diagonalization Applied to the Detection and Discrimination of Unexploded Ordnance

DTIC Science & Technology

2012-08-01

center (Das et al., 1990; Barrow and Nelson, 2001; Bell et al., 2001; Pasion and Oldenburg, 2001; Zhang et al., 2003; Smith and Morrison, 2004; Tarokh et...matrix for the complete transmitter/receiver array by tiling all the Nr × Nt available samples of expression 5: $S = G_{sc} U_l \dot{\Lambda}_l U_l^T (G_{pr})^T$...L. R., and D. W. Oldenburg, 2001, A discrimination algorithm for UXO using time-domain electromagnetics: Journal of Environmental and Engineering
Wavelet multiresolution analyses adapted for the fast solution of boundary value ordinary differential equations

NASA Technical Reports Server (NTRS)

Jawerth, Bjoern; Sweldens, Wim

1993-01-01

We present ideas on how to use wavelets in the solution of boundary value ordinary differential equations. Rather than using classical wavelets, we adapt their construction so that they become (bi)orthogonal with respect to the inner product defined by the operator. The stiffness matrix in a Galerkin method then becomes diagonal and can thus be trivially inverted. We show how one can construct an O(N) algorithm for various constant and variable coefficient operators.
Parallelization of PANDA discrete ordinates code using spatial decomposition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Humbert, P.

2006-07-01

We present the parallel method, based on spatial domain decomposition, implemented in the 2D and 3D versions of the discrete ordinates code PANDA. The spatial mesh is orthogonal and the spatial domain decomposition is Cartesian. For 3D problems a 3D Cartesian domain topology is created and the parallel method is based on a domain diagonal plane ordered sweep algorithm. The parallel efficiency of the method is improved by directions and octants pipelining. The implementation of the algorithm is straightforward using MPI blocking point to point communications. The efficiency of the method is illustrated by an application to the 3D-Ext C5G7 benchmark of the OECD/NEA. (authors)
Exact diagonalization of quantum lattice models on coprocessors

NASA Astrophysics Data System (ADS)

Siro, T.; Harju, A.

2016-10-01

We implement the Lanczos algorithm on an Intel Xeon Phi coprocessor and compare its performance to a multi-core Intel Xeon CPU and an NVIDIA graphics processor. The Xeon and the Xeon Phi are parallelized with OpenMP and the graphics processor is programmed with CUDA. The performance is evaluated by measuring the execution time of a single step in the Lanczos algorithm. We study two quantum lattice models with different particle numbers, and conclude that for small systems, the multi-core CPU is the fastest platform, while for large systems, the graphics processor is the clear winner, reaching speedups of up to 7.6 compared to the CPU. The Xeon Phi outperforms the CPU with sufficiently large particle number, reaching a speedup of 2.5.
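For orientation, the sketch below shows the structure of a Lanczos iteration in plain NumPy, where each step is dominated by a single matrix-vector product, the operation whose execution time the abstract measures. It is an illustrative CPU version only: the function name `lanczos`, the dense random test matrix, and the use of full reorthogonalization are assumptions for this example and say nothing about the authors' OpenMP or CUDA implementations.

```python
import numpy as np

def lanczos(matvec, v0, m):
    """Run up to m Lanczos steps and return the eigenvalues of the tridiagonal matrix T."""
    n = v0.size
    Q = np.zeros((n, m))
    alpha = np.zeros(m)       # diagonal of T
    beta = np.zeros(m - 1)    # off-diagonal of T
    Q[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = matvec(Q[:, j])   # the costly part of each step
        alpha[j] = Q[:, j] @ w
        # Full reorthogonalization against all Lanczos vectors so far
        # (a robustness choice for this sketch; it also removes alpha[j] * q_j).
        w = w - Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        if j == m - 1:
            break
        beta[j] = np.linalg.norm(w)
        if beta[j] < 1e-12:
            # The Krylov space is exhausted; keep only the steps completed.
            alpha, beta = alpha[: j + 1], beta[:j]
            break
        Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return np.linalg.eigvalsh(T)

rng = np.random.default_rng(1)
X = rng.standard_normal((12, 12))
H = (X + X.T) / 2.0           # stand-in for a (sparse) Hamiltonian matrix
ritz = lanczos(lambda v: H @ v, rng.standard_normal(12), 12)
print(ritz[0], np.linalg.eigvalsh(H)[0])
```

In exact-diagonalization practice the matrix is far too large to store densely, so `matvec` would apply the Hamiltonian on the fly or from a sparse format, and far fewer steps than the matrix dimension are taken.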
A finite element solver for 3-D compressible viscous flows

NASA Technical Reports Server (NTRS)

Reddy, K. C.; Reddy, J. N.; Nayani, S.

1990-01-01

Computation of the flow field inside a space shuttle main engine (SSME) requires the application of state-of-the-art computational fluid dynamic (CFD) technology. Several computer codes are under development to solve 3-D flow through the hot gas manifold. Some algorithms were designed to solve the unsteady compressible Navier-Stokes equations, either by implicit or explicit factorization methods, using several hundred or thousands of time steps to reach a steady state solution. A new iterative algorithm is being developed for the solution of the implicit finite element equations without assembling global matrices. It is an efficient iteration scheme based on a modified nonlinear Gauss-Seidel iteration with symmetric sweeps. The algorithm is analyzed for a model equation and is shown to be unconditionally stable. Results from a series of test problems are presented. The finite element code was tested for Couette flow, which is flow under a pressure gradient between two parallel plates in relative motion. Another problem that was solved is viscous laminar flow over a flat plate. The general 3-D finite element code was used to compute the flow in an axisymmetric turnaround duct at low Mach numbers.
Trust-region based return mapping algorithm for implicit integration of elastic-plastic constitutive models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lester, Brian; Scherzinger, William

2017-01-19

Here, a new method for the solution of the non-linear equations forming the core of constitutive model integration is proposed. Specifically, the trust-region method that has been developed in the numerical optimization community is successfully modified for use in implicit integration of elastic-plastic models. Although attention here is restricted to these rate-independent formulations, the proposed approach holds substantial promise for adoption with models incorporating complex physics, multiple inelastic mechanisms, and/or multiphysics. As a first step, the non-quadratic Hosford yield surface is used as a representative case to investigate computationally challenging constitutive models. The theory and implementation are presented, discussed, and compared to other common integration schemes. Multiple boundary value problems are studied and used to verify the proposed algorithm and demonstrate the capabilities of this approach over more common methodologies. Robustness and speed are then investigated and compared to existing algorithms. Through these efforts, it is shown that the utilization of a trust-region approach leads to superior performance versus a traditional closest-point projection Newton-Raphson method and comparable speed and robustness to a line search augmented scheme.
Trust-region based return mapping algorithm for implicit integration of elastic-plastic constitutive models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lester, Brian T.; Scherzinger, William M.

2017-01-19

A new method for the solution of the non-linear equations forming the core of constitutive model integration is proposed. Specifically, the trust-region method that has been developed in the numerical optimization community is successfully modified for use in implicit integration of elastic-plastic models. Although attention here is restricted to these rate-independent formulations, the proposed approach holds substantial promise for adoption with models incorporating complex physics, multiple inelastic mechanisms, and/or multiphysics. As a first step, the non-quadratic Hosford yield surface is used as a representative case to investigate computationally challenging constitutive models. The theory and implementation are presented, discussed, and compared to other common integration schemes. Multiple boundary value problems are studied and used to verify the proposed algorithm and demonstrate the capabilities of this approach over more common methodologies. Robustness and speed are then investigated and compared to existing algorithms. As a result of these efforts, it is shown that the utilization of a trust-region approach leads to superior performance versus a traditional closest-point projection Newton-Raphson method and comparable speed and robustness to a line search augmented scheme.
A Coulomb collision algorithm for weighted particle simulations

NASA Technical Reports Server (NTRS)

Miller, Ronald H.; Combi, Michael R.

1994-01-01

A binary Coulomb collision algorithm is developed for weighted particle simulations employing Monte Carlo techniques. Charged particles within a given spatial grid cell are pair-wise scattered, explicitly conserving momentum and implicitly conserving energy. A similar algorithm developed by Takizuka and Abe (1977) conserves momentum and energy provided the particles are unweighted (each particle representing equal fractions of the total particle density). If applied as is to simulations incorporating weighted particles, the plasma temperatures equilibrate to an incorrect temperature, as compared to theory. Using the appropriate pairing statistics, a Coulomb collision algorithm is developed for weighted particles. The algorithm conserves energy and momentum and produces the appropriate relaxation time scales as compared to theoretical predictions. Such an algorithm is necessary for future work studying self-consistent multi-species kinetic transport.
A NUMERICAL ALGORITHM FOR MODELING MULTIGROUP NEUTRINO-RADIATION HYDRODYNAMICS IN TWO SPATIAL DIMENSIONS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Swesty, F. Douglas; Myra, Eric S.

It is now generally agreed that multidimensional, multigroup, neutrino-radiation hydrodynamics (RHD) is an indispensable element of any realistic model of stellar-core collapse, core-collapse supernovae, and proto-neutron star instabilities. We have developed a new, two-dimensional, multigroup algorithm that can model neutrino-RHD flows in core-collapse supernovae. Our algorithm uses an approach similar to the ZEUS family of algorithms, originally developed by Stone and Norman. However, this completely new implementation extends that previous work in three significant ways: first, we incorporate multispecies, multigroup RHD in a flux-limited-diffusion approximation. Our approach is capable of modeling pair-coupled neutrino-RHD, and includes effects of Pauli blocking in the collision integrals. Blocking gives rise to nonlinearities in the discretized radiation-transport equations, which we evolve implicitly in time. We employ parallelized Newton-Krylov methods to obtain a solution of these nonlinear, implicit equations. Our second major extension to the ZEUS algorithm is the inclusion of an electron conservation equation that describes the evolution of electron-number density in the hydrodynamic flow. This permits calculating deleptonization of a stellar core. Our third extension modifies the hydrodynamics algorithm to accommodate realistic, complex equations of state, including those having nonconvex behavior. In this paper, we present a description of our complete algorithm, giving sufficient details to allow others to implement, reproduce, and extend our work. Finite-differencing details are presented in appendices. We also discuss implementation of this algorithm on state-of-the-art, parallel-computing architectures. Finally, we present results of verification tests that demonstrate the numerical accuracy of this algorithm on diverse hydrodynamic, gravitational, radiation-transport, and RHD sample problems. We believe our methods to be of general use in a variety of model settings where radiation transport or RHD is important. Extension of this work to three spatial dimensions is straightforward.
Adaptive Numerical Algorithms in Space Weather Modeling

NASA Technical Reports Server (NTRS)

Toth, Gabor; vanderHolst, Bart; Sokolov, Igor V.; DeZeeuw, Darren; Gombosi, Tamas I.; Fang, Fang; Manchester, Ward B.; Meng, Xing; Nakib, Dalal; Powell, Kenneth G.;

2010-01-01

Space weather describes the various processes in the Sun-Earth system that present danger to human health and technology. The goal of space weather forecasting is to provide an opportunity to mitigate these negative effects. Physics-based space weather modeling is characterized by disparate temporal and spatial scales as well as by different physics in different domains. A multi-physics system can be modeled by a software framework comprising several components. Each component corresponds to a physics domain, and each component is represented by one or more numerical models. The publicly available Space Weather Modeling Framework (SWMF) can execute and couple together several components distributed over a parallel machine in a flexible and efficient manner. The framework also allows resolving disparate spatial and temporal scales with independent spatial and temporal discretizations in the various models. Several of the computationally most expensive domains of the framework are modeled by the Block-Adaptive Tree Solar wind Roe Upwind Scheme (BATS-R-US) code that can solve various forms of the magnetohydrodynamics (MHD) equations, including Hall, semi-relativistic, multi-species and multi-fluid MHD, anisotropic pressure, radiative transport and heat conduction. Modeling disparate scales within BATS-R-US is achieved by a block-adaptive mesh both in Cartesian and generalized coordinates. Most recently we have created a new core for BATS-R-US: the Block-Adaptive Tree Library (BATL) that provides a general toolkit for creating, load balancing and message passing in a 1, 2 or 3 dimensional block-adaptive grid. We describe the algorithms of BATL and demonstrate its efficiency and scaling properties for various problems.
BATS-R-US uses several time-integration schemes to address multiple time-scales: explicit time stepping with fixed or local time steps, partially steady-state evolution, point-implicit, semi-implicit, explicit/implicit, and fully implicit numerical schemes. Depending on the application, we find that different time stepping methods are optimal. Several of the time integration schemes exploit the block-based granularity of the grid structure. The framework and the adaptive algorithms enable physics based space weather modeling and even forecasting.

Second derivative time integration methods for discontinuous Galerkin solutions of unsteady compressible flows

NASA Astrophysics Data System (ADS)

Nigro, A.; De Bartolo, C.; Crivellini, A.; Bassi, F.

2017-12-01

In this paper we investigate the possibility of using the high-order accurate A(α)-stable Second Derivative (SD) schemes proposed by Enright for the implicit time integration of the Discontinuous Galerkin (DG) space-discretized Navier-Stokes equations. These multistep schemes are A-stable up to fourth-order, but their use results in a system matrix difficult to compute. Furthermore, the evaluation of the nonlinear function is computationally very demanding. We propose here a Matrix-Free (MF) implementation of Enright schemes that avoids the costs of forming, storing and factorizing the system matrix, is much less computationally expensive than its matrix-explicit counterpart, and performs competitively with other implicit schemes, such as the Modified Extended Backward Differentiation Formulae (MEBDF). The algorithm makes use of the preconditioned GMRES algorithm for solving the linear system of equations. The preconditioner is based on the ILU(0) factorization of an approximated but computationally cheaper form of the system matrix, and it has been reused for several time steps to improve the efficiency of the MF Newton-Krylov solver. We additionally employ a polynomial extrapolation technique to compute an accurate initial guess to the implicit nonlinear system. The stability properties of SD schemes have been analyzed by solving a linear model problem. For the analysis on the Navier-Stokes equations, two-dimensional inviscid and viscous test cases, both with a known analytical solution, are solved to assess the accuracy properties of the proposed time integration method for nonlinear autonomous and non-autonomous systems, respectively. The performance of the SD algorithm is compared with that obtained using an MF-MEBDF solver, in order to evaluate its effectiveness, identify its limitations and suggest possible further improvements.
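The matrix-free idea in the abstract above can be illustrated with a minimal sketch: a Krylov solver such as GMRES only needs Jacobian-vector products, and these can be approximated by differencing the residual instead of forming the Jacobian. The function names, the step-size rule and the small test system below are assumptions made for illustration; they are not taken from the paper.

```python
import math

def jacobian_vector_product(residual, u, v, eps=1e-7):
    """Approximate J(u) @ v by a forward difference, without forming J."""
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    # Scale the perturbation so that its size does not depend on |v|.
    norm_u = math.sqrt(sum(x * x for x in u))
    h = eps * (1.0 + norm_u) / norm_v
    perturbed = [ui + h * vi for ui, vi in zip(u, v)]
    r0 = residual(u)
    r1 = residual(perturbed)
    return [(b - a) / h for a, b in zip(r0, r1)]

def example_residual(u):
    # Hypothetical 2x2 nonlinear system, chosen only for illustration.
    x, y = u
    return [x * x + y - 3.0, x + y * y - 5.0]

# Exact Jacobian at (1, 2) is [[2, 1], [1, 4]], so J @ (1, 1) = (3, 5).
jv = jacobian_vector_product(example_residual, [1.0, 2.0], [1.0, 1.0])
```

Each product costs one extra residual evaluation, which is why the expense of evaluating the nonlinear function matters so much in a matrix-free solver.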
Existence and discrete approximation for optimization problems governed by fractional differential equations

NASA Astrophysics Data System (ADS)

Bai, Yunru; Baleanu, Dumitru; Wu, Guo-Cheng

2018-06-01

We investigate a class of generalized differential optimization problems driven by the Caputo derivative. Existence of a weak Carathéodory solution is proved using the Weierstrass existence theorem, a fixed point theorem and the Filippov implicit function lemma, among others. Then a numerical approximation algorithm is introduced, and a convergence theorem is established. Finally, a nonlinear programming problem constrained by the fractional differential equation is illustrated and the results verify the validity of the algorithm.

Stable computations with flat radial basis functions using vector-valued rational approximations

NASA Astrophysics Data System (ADS)

Wright, Grady B.; Fornberg, Bengt

2017-02-01

One commonly finds in applications of smooth radial basis functions (RBFs) that scaling the kernels so they are 'flat' leads to smaller discretization errors. However, the direct numerical approach for computing with flat RBFs (RBF-Direct) is severely ill-conditioned. We present an algorithm for bypassing this ill-conditioning that is based on a new method for rational approximation (RA) of vector-valued analytic functions with the property that all components of the vector share the same singularities. This new algorithm (RBF-RA) is more accurate, robust, and easier to implement than the Contour-Padé method, which is similarly based on vector-valued rational approximation. In contrast to the stable RBF-QR and RBF-GA algorithms, which are based on finding a better conditioned base in the same RBF-space, the new algorithm can be used with any type of smooth radial kernel, and it is also applicable to a wider range of tasks (including calculating Hermite type implicit RBF-FD stencils). We present a series of numerical experiments demonstrating the effectiveness of this new method for computing RBF interpolants in the flat regime. We also demonstrate the flexibility of the method by using it to compute implicit RBF-FD formulas in the flat regime and then using these for solving Poisson's equation in a 3-D spherical shell.
Application of the implicit MacCormack scheme to the PNS equations

NASA Technical Reports Server (NTRS)

Lawrence, S. L.; Tannehill, J. C.; Chaussee, D. S.

1983-01-01

The two-dimensional parabolized Navier-Stokes equations are solved using MacCormack's (1981) implicit finite-difference scheme. It is shown that this method for solving the parabolized Navier-Stokes equations does not require the inversion of block tridiagonal systems of algebraic equations and allows the original explicit scheme to be employed in those regions where implicit treatment is not needed. The finite-difference algorithm is discussed and the computational results for two laminar test cases are presented. Results obtained using this method for the case of a flat plate boundary layer are compared with those obtained using the conventional Beam-Warming scheme, as well as those obtained from a boundary layer code. The computed results for a more severe test of the method, the hypersonic flow past a 15 deg compression corner, are found to compare favorably with experiment and a numerical solution of the complete Navier-Stokes equations.
A gradient enhanced plasticity-damage microplane model for concrete

NASA Astrophysics Data System (ADS)

Zreid, Imadeddin; Kaliske, Michael

2018-03-01

Computational modeling of concrete poses two main types of challenges. The first is the mathematical description of local response for such a heterogeneous material under all stress states, and the second is the stability and efficiency of the numerical implementation in finite element codes. The paper at hand presents a comprehensive approach addressing both issues. Adopting the microplane theory, a combined plasticity-damage model is formulated and regularized by an implicit gradient enhancement. The plasticity part introduces a new microplane smooth 3-surface cap yield function, which provides a stable numerical solution within an implicit finite element algorithm. The damage part utilizes a split, which can describe the transition of loading between tension and compression. Regularization of the model by the implicit gradient approach eliminates the mesh sensitivity and numerical instabilities. Identification methods for model parameters are proposed and several numerical examples of plain and reinforced concrete are carried out for illustration.
Computational Aerothermodynamics in Aeroassist Applications

NASA Technical Reports Server (NTRS)

Gnoffo, Peter A.

2001-01-01

Aeroassisted planetary entry uses atmospheric drag to decelerate spacecraft from super-orbital to orbital or suborbital velocities. Numerical simulation of flow fields surrounding these spacecraft during hypersonic atmospheric entry is required to define aerothermal loads. The severe compression in the shock layer in front of the vehicle and subsequent, rapid expansion into the wake are characterized by high temperature, thermo-chemical nonequilibrium processes. Implicit algorithms required for efficient, stable computation of the governing equations involving disparate time scales of convection, diffusion, chemical reactions, and thermal relaxation are discussed. Robust point-implicit strategies are utilized in the initialization phase; less robust but more efficient line-implicit strategies are applied in the endgame. Applications to ballutes (balloon-like decelerators) in the atmospheres of Venus, Mars, Titan, Saturn, and Neptune and a Mars Sample Return Orbiter (MSRO) are featured. Examples are discussed where time-accurate simulation is required to achieve a steady-state solution.
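To see why implicit treatment of stiff source terms matters when time scales are disparate, consider a minimal sketch on a scalar relaxation equation du/dt = -rate (u - u_eq). This model equation is a hypothetical stand-in for a chemical or thermal relaxation term, not the formulation used in the report. A point-implicit update treats only the local source term implicitly, so each point is solved in closed form with no global linear system.

```python
def explicit_step(u, dt, rate, u_eq):
    """Forward Euler for du/dt = -rate * (u - u_eq)."""
    return u + dt * (-rate * (u - u_eq))

def point_implicit_step(u, dt, rate, u_eq):
    """Backward Euler on the local source term, solved in closed form."""
    return (u + dt * rate * u_eq) / (1.0 + dt * rate)

def march(step, u0, dt, rate, u_eq, steps):
    u = u0
    for _ in range(steps):
        u = step(u, dt, rate, u_eq)
    return u

# dt * rate = 10: far beyond the explicit stability limit of 2.
u_explicit = march(explicit_step, 1.0, 0.01, 1000.0, 0.0, 20)
u_implicit = march(point_implicit_step, 1.0, 0.01, 1000.0, 0.0, 20)
```

With these values the explicit iterate is multiplied by -9 every step and diverges, while the point-implicit iterate is divided by 11 every step and relaxes toward equilibrium.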
Implicit solvation model for density-functional study of nanocrystal surfaces and reaction pathways

NASA Astrophysics Data System (ADS)

Mathew, Kiran; Sundararaman, Ravishankar; Letchworth-Weaver, Kendra; Arias, T. A.; Hennig, Richard G.

2014-02-01

Solid-liquid interfaces are at the heart of many modern-day technologies and provide a challenge to many materials simulation methods. A realistic first-principles computational study of such systems entails the inclusion of solvent effects. In this work, we implement an implicit solvation model that has a firm theoretical foundation into the widely used density-functional code Vienna ab initio Software Package. The implicit solvation model follows the framework of joint density functional theory. We describe the framework, our algorithm and implementation, and benchmarks for small molecular systems. We apply the solvation model to study the surface energies of different facets of semiconducting and metallic nanocrystals and the SN2 reaction pathway. We find that solvation reduces the surface energies of the nanocrystals, especially for the semiconducting ones and increases the energy barrier of the SN2 reaction.
Progress on a Taylor weak statement finite element algorithm for high-speed aerodynamic flows

NASA Technical Reports Server (NTRS)

Baker, A. J.; Freels, J. D.

1989-01-01

A new finite element numerical Computational Fluid Dynamics (CFD) algorithm has matured to the point of efficiently solving two-dimensional high speed real-gas compressible flow problems in generalized coordinates on modern vector computer systems. The algorithm employs a Taylor Weak Statement classical Galerkin formulation, a variably implicit Newton iteration, and a tensor matrix product factorization of the linear algebra Jacobian under a generalized coordinate transformation. Allowing for a general two-dimensional conservation law system, the algorithm has been exercised on the Euler and laminar forms of the Navier-Stokes equations. Real-gas fluid properties are admitted, and numerical results verify solution accuracy, efficiency, and stability over a range of test problem parameters.
A circular median filter approach for resolving directional ambiguities in wind fields retrieved from spaceborne scatterometer data

NASA Technical Reports Server (NTRS)

Schultz, Howard

1990-01-01

The retrieval algorithm for spaceborne scatterometry proposed by Schultz (1985) is extended. A circular median filter (CMF) method is presented, which operates on wind directions independently of wind speed, removing any implicit wind speed dependence. A cell weighting scheme is included in the algorithm, permitting greater weights to be assigned to more reliable data. The mathematical properties of the ambiguous solutions to the wind retrieval problem are reviewed. The CMF algorithm is tested on twelve simulated data sets. The effects of spatially correlated likelihood assignment errors on the performance of the CMF algorithm are examined. Also, consideration is given to a wind field smoothing technique that uses a CMF.
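A circular median can be sketched as follows. The definition used here (the member of the window that minimizes the weighted sum of angular distances to all other members) is one common choice and is an assumption; the abstract does not spell out the exact form used in the paper. The optional weights play the role of the cell weighting mentioned above.

```python
import math

def angular_distance(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def circular_median(angles_deg, weights=None):
    """Return the member of angles_deg with the smallest weighted sum of angular distances."""
    if weights is None:
        weights = [1.0] * len(angles_deg)
    best, best_cost = None, math.inf
    for candidate in angles_deg:
        cost = sum(w * angular_distance(candidate, a)
                   for a, w in zip(angles_deg, weights))
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best

# Directions straddling north: an ordinary median of 350, 0, 10 would pick 10,
# whereas the circular median picks the direction in the middle of the arc.
window = [350.0, 0.0, 10.0]
middle = circular_median(window)
```

Because only directions enter the distance, the filter output does not depend on wind speed, which matches the property described in the abstract.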
Learning Analytics: Challenges and Limitations

ERIC Educational Resources Information Center

Wilson, Anna; Watson, Cate; Thompson, Terrie Lynn; Drew, Valerie; Doyle, Sarah

2017-01-01

Learning analytic implementations are increasingly being included in learning management systems in higher education. We lay out some concerns with the way learning analytics--both data and algorithms--are often presented within an unproblematized Big Data discourse. We describe some potential problems with the often implicit assumptions about…
Jacobi-Gauss-Lobatto collocation method for the numerical solution of 1+1 nonlinear Schrödinger equations

NASA Astrophysics Data System (ADS)

Doha, E. H.; Bhrawy, A. H.; Abdelkawy, M. A.; Van Gorder, Robert A.

2014-03-01

A Jacobi-Gauss-Lobatto collocation (J-GL-C) method, used in combination with the implicit Runge-Kutta method of fourth order, is proposed as a numerical algorithm for the approximation of solutions to nonlinear Schrödinger equations (NLSE) with initial-boundary data in 1+1 dimensions. Our procedure is implemented in two successive steps. In the first one, the J-GL-C is employed for approximating the functional dependence on the spatial variable, using (N-1) nodes of the Jacobi-Gauss-Lobatto interpolation which depends upon two general Jacobi parameters. The resulting equations together with the two-point boundary conditions induce a system of 2(N-1) first-order ordinary differential equations (ODEs) in time. In the second step, the implicit Runge-Kutta method of fourth order is applied to solve this temporal system. The proposed J-GL-C method, used in combination with the implicit Runge-Kutta method of fourth order, is employed to obtain highly accurate numerical approximations to four types of NLSE, including the attractive and repulsive NLSE and a Gross-Pitaevskii equation with space-periodic potential. The numerical results obtained by this algorithm have been compared with various exact solutions in order to demonstrate the accuracy and efficiency of the proposed method. Indeed, for relatively few nodes used, the absolute error in our numerical solutions is sufficiently small.
Toward an optimal solver for time-spectral fluid-dynamic and aeroelastic solutions on unstructured meshes

NASA Astrophysics Data System (ADS)

Mundis, Nathan L.; Mavriplis, Dimitri J.

2017-09-01

The time-spectral method applied to the Euler and coupled aeroelastic equations theoretically offers significant computational savings for purely periodic problems when compared to standard time-implicit methods. However, attaining superior efficiency with time-spectral methods over traditional time-implicit methods hinges on the ability rapidly to solve the large non-linear system resulting from time-spectral discretizations which become larger and stiffer as more time instances are employed or the period of the flow becomes especially short (i.e. the maximum resolvable wave-number increases). In order to increase the efficiency of these solvers, and to improve robustness, particularly for large numbers of time instances, the Generalized Minimal Residual Method (GMRES) is used to solve the implicit linear system over all coupled time instances. The use of GMRES as the linear solver makes time-spectral methods more robust, allows them to be applied to a far greater subset of time-accurate problems, including those with a broad range of harmonic content, and vastly improves the efficiency of time-spectral methods. In previous work, a wave-number independent preconditioner that mitigates the increased stiffness of the time-spectral method when applied to problems with large resolvable wave numbers has been developed. This preconditioner, however, directly inverts a large matrix whose size increases in proportion to the number of time instances. As a result, the computational time of this method scales as the cube of the number of time instances. In the present work, this preconditioner has been reworked to take advantage of an approximate-factorization approach that effectively decouples the spatial and temporal systems. Once decoupled, the time-spectral matrix can be inverted in frequency space, where it has entries only on the main diagonal and therefore can be inverted quite efficiently. 
This new GMRES/preconditioner combination is shown to be over an order of magnitude more efficient than the previous wave-number independent preconditioner for problems with large numbers of time instances and/or large reduced frequencies.
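The remark that the time-spectral matrix becomes diagonal in frequency space can be illustrated with a scalar model problem. The sketch below solves (shift·I + d/dt) u = rhs for periodic samples by transforming to frequency space, dividing each harmonic by its own diagonal entry, and transforming back. The scalar operator, the function names and the naive O(N²) transform are assumptions for illustration; the paper's preconditioner acts on the full coupled flow system.

```python
import cmath
import math

def dft(values, sign):
    """Naive O(N^2) discrete Fourier transform (sign=-1 forward, +1 inverse, unnormalized)."""
    n = len(values)
    return [sum(values[m] * cmath.exp(sign * 2j * math.pi * m * k / n)
                for m in range(n))
            for k in range(n)]

def solve_time_spectral(rhs, shift, period=2.0 * math.pi):
    """Solve (shift*I + d/dt) u = rhs on equally spaced periodic samples.

    The number of samples is assumed odd, and shift must be nonzero so that
    the mean (k = 0) mode is invertible.
    """
    n = len(rhs)
    omega = 2.0 * math.pi / period
    rhs_hat = dft(rhs, -1)
    u_hat = []
    for k in range(n):
        harmonic = k if k <= n // 2 else k - n  # signed harmonic index
        # In frequency space the operator is diagonal: one division per mode.
        u_hat.append(rhs_hat[k] / (shift + 1j * harmonic * omega))
    return [value.real / n for value in dft(u_hat, +1)]

# (2 + d/dt) sin(t) = 2 sin(t) + cos(t), so the solve should return sin(t).
n_samples = 9
times = [2.0 * math.pi * m / n_samples for m in range(n_samples)]
rhs = [2.0 * math.sin(t) + math.cos(t) for t in times]
u = solve_time_spectral(rhs, shift=2.0)
```

In physical space the same operator is a dense matrix coupling all time instances, which is where the cubic cost of a direct inversion comes from.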
Kinematic Structural Modelling in Bayesian Networks

NASA Astrophysics Data System (ADS)

Schaaf, Alexander; de la Varga, Miguel; Florian Wellmann, J.

2017-04-01

We commonly capture our knowledge about the spatial distribution of distinct geological lithologies in the form of 3-D geological models. Several methods exist to create these models, each with its own strengths and limitations. We present here an approach to combine the functionalities of two modeling approaches - implicit interpolation and kinematic modelling methods - into one framework, while explicitly considering parameter uncertainties and thus model uncertainty. In recent work, we proposed an approach to implement implicit modelling algorithms into Bayesian networks. This was done to address the issues of input data uncertainty and integration of geological information from varying sources in the form of geological likelihood functions. However, one general shortcoming of implicit methods is that they usually do not take any physical constraints into consideration, which can result in unrealistic model outcomes and artifacts. On the other hand, kinematic structural modelling intends to reconstruct the history of a geological system based on physically driven kinematic events. This type of modelling incorporates simplified, physical laws into the model, at the cost of a substantial increment of usable uncertain parameters. In the work presented here, we show an integration of these two different modelling methodologies, taking advantage of the strengths of both of them. First, we treat the two types of models separately, capturing the information contained in the kinematic models and their specific parameters in the form of likelihood functions, in order to use them in the implicit modelling scheme. We then go further and combine the two modelling approaches into one single Bayesian network. This enables the direct flow of information between the parameters of the kinematic modelling step and the implicit modelling step and links the exclusive input data and likelihoods of the two different modelling algorithms into one probabilistic inference framework. 
In addition, we use the capabilities of Noddy to analyze the topology of structural models to demonstrate how topological information, such as the connectivity of two layers across an unconformity, can be used as a likelihood function. In an application to a synthetic case study, we show that our approach leads to a successful combination of the two different modelling concepts. Specifically, we show that we derive ensemble realizations of implicit models that now incorporate the knowledge of the kinematic aspects, representing an important step forward in the integration of knowledge and a corresponding estimation of uncertainties in structural geological models.
Development and application of the GIM code for the Cyber 203 computer

NASA Technical Reports Server (NTRS)

Stainaker, J. F.; Robinson, M. A.; Rawlinson, E. G.; Anderson, P. G.; Mayne, A. W.; Spradley, L. W.

1982-01-01

The GIM computer code for fluid dynamics research was developed. Enhancement of the computer code, implicit algorithm development, turbulence model implementation, chemistry model development, interactive input module coding and wing/body flowfield computation are described. The GIM quasi-parabolic code development was completed, and the code used to compute a number of example cases. Turbulence models, algebraic and differential equations, were added to the basic viscous code. An equilibrium reacting chemistry model and implicit finite difference scheme were also added. Development was completed on the interactive module for generating the input data for GIM. Solutions for inviscid hypersonic flow over a wing/body configuration are also presented.
CAG12 - A CSCM based procedure for flow of an equilibrium chemically reacting gas

NASA Technical Reports Server (NTRS)

Green, M. J.; Davy, W. C.; Lombard, C. K.

1985-01-01

The Conservative Supra Characteristic Method (CSCM), an implicit upwind Navier-Stokes algorithm, is extended to the numerical simulation of flows in chemical equilibrium. The resulting computer code known as Chemistry and Gasdynamics Implicit - Version 2 (CAG12) is described. First-order accurate results are presented for inviscid and viscous Mach 20 flows of air past a hemisphere-cylinder. The solution procedure captures the bow shock in a chemically reacting gas, a technique that is needed for simulating high altitude, rarefied flows. In an initial effort to validate the code, the inviscid results are compared with published gasdynamic and chemistry solutions and satisfactory agreement is obtained.
Adaptive truncation of matrix decompositions and efficient estimation of NMR relaxation distributions

NASA Astrophysics Data System (ADS)

Teal, Paul D.; Eccles, Craig

2015-04-01

The two most successful methods of estimating the distribution of nuclear magnetic resonance relaxation times from two dimensional data are data compression followed by application of the Butler-Reeds-Dawson algorithm, and a primal-dual interior point method using preconditioned conjugate gradient. Both of these methods have previously been presented using a truncated singular value decomposition of matrices representing the exponential kernel. In this paper it is shown that other matrix factorizations are applicable to each of these algorithms, and that these illustrate the different fundamental principles behind the operation of the algorithms. These are the rank-revealing QR (RRQR) factorization and the LDL factorization with diagonal pivoting, also known as the Bunch-Kaufman-Parlett factorization. It is shown that both algorithms can be improved by adaptation of the truncation as the optimization process progresses, improving the accuracy as the optimal value is approached. A variation on the interior point method, viz. the use of a barrier function instead of the primal-dual approach, is found to offer considerable improvement in terms of speed and reliability. A third type of algorithm, related to the fast iterative shrinkage-thresholding algorithm (FISTA), is applied to the problem. This method can be efficiently formulated without the use of a matrix decomposition.
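For readers unfamiliar with the fast iterative shrinkage-thresholding algorithm, the following is a minimal generic sketch for the problem min 0.5·||Ax − b||² + lam·||x||₁. It is not the variant used in the paper, whose regularization and constraints are not given in the abstract; the `lipschitz` parameter is an assumed user-supplied bound on the largest eigenvalue of AᵀA. Only products with A and its transpose are needed, which is consistent with the remark that no matrix decomposition is required.

```python
import math

def soft_threshold(v, tau):
    """Shrink every entry toward zero by tau (proximal map of tau * L1 norm)."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Product of the transpose of A with r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def fista(A, b, lam, lipschitz, iterations=200):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 with Nesterov-type momentum."""
    x = [0.0] * len(A[0])
    y = list(x)
    t = 1.0
    for _ in range(iterations):
        residual = [p - q for p, q in zip(matvec(A, y), b)]
        grad = rmatvec(A, residual)
        step = [yi - gi / lipschitz for yi, gi in zip(y, grad)]
        x_new = soft_threshold(step, lam / lipschitz)
        t_new = 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * t * t))
        # Extrapolate using the previous two iterates.
        y = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x

# With A equal to the identity, the minimizer is the soft-thresholded data.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x_hat = fista(identity, [3.0, -0.5, 1.5], lam=1.0, lipschitz=1.0)
```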
Adaptive control in the presence of unmodeled dynamics. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Rohrs, C. E.

1982-01-01

Stability and robustness properties of a wide class of adaptive control algorithms in the presence of unmodeled dynamics and output disturbances were investigated. The class of adaptive algorithms considered are those commonly referred to as model reference adaptive control algorithms, self-tuning controllers, and dead beat adaptive controllers, developed for both continuous-time systems and discrete-time systems. A unified analytical approach was developed to examine the class of existing adaptive algorithms. It was discovered that all existing algorithms contain an infinite gain operator in the dynamic system that defines command reference errors and parameter errors; it is argued that such an infinite gain operator appears to be generic to all adaptive algorithms, whether they exhibit explicit or implicit parameter identification. It is concluded that none of the adaptive algorithms considered can be used with confidence in a practical control system design, because instability will set in with a high probability.
Aspects and applications of patched grid calculations

NASA Technical Reports Server (NTRS)

Walters, R. W.; Switzer, G. F.; Thomas, J. L.

1986-01-01

Patched grid calculations within the framework of an implicit, flux-vector split upwind/relaxation algorithm for the Euler equations are presented. The effect of a metric-discontinuous interface on the convergence rate of the algorithm is discussed along with the spatial accuracy of the solution and the effect of curvature along an interface. Results are presented and discussed for the free-stream problem, shock reflection problem, supersonic inlet with a 5 degree ramp, aerodynamically choked inlet, and three-dimensional analytic forebody.
Characteristic-based algorithms for flows in thermo-chemical nonequilibrium

NASA Technical Reports Server (NTRS)

Walters, Robert W.; Cinnella, Pasquale; Slack, David C.; Halt, David

1990-01-01

A generalized finite-rate chemistry algorithm with Steger-Warming, Van Leer, and Roe characteristic-based flux splittings is presented in three-dimensional generalized coordinates for the Navier-Stokes equations. Attention is placed on convergence to steady-state solutions with fully coupled chemistry. Time integration schemes including explicit m-stage Runge-Kutta, implicit approximate-factorization, relaxation and LU decomposition are investigated and compared in terms of residual reduction per unit of CPU time. Practical issues such as code vectorization and memory usage on modern supercomputers are discussed.
Fisher's method of scoring in statistical image reconstruction: comparison of Jacobi and Gauss-Seidel iterative schemes.

PubMed

Hudson, H M; Ma, J; Green, P

1994-01-01

Many algorithms for medical image reconstruction adopt versions of the expectation-maximization (EM) algorithm. In this approach, parameter estimates are obtained which maximize a complete data likelihood or penalized likelihood, in each iteration. Implicitly (and sometimes explicitly) penalized algorithms require smoothing of the current reconstruction in the image domain as part of their iteration scheme. In this paper, we discuss alternatives to EM which adapt Fisher's method of scoring (FS) and other methods for direct maximization of the incomplete data likelihood. Jacobi and Gauss-Seidel methods for non-linear optimization provide efficient algorithms applying FS in tomography. One approach uses smoothed projection data in its iterations. We investigate the convergence of Jacobi and Gauss-Seidel algorithms with clinical tomographic projection data.
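The difference between Jacobi and Gauss-Seidel sweeps is easiest to see on a small linear system. The sketch below is a simplification under that assumption: the paper applies the nonlinear-optimization forms of these iterations to a tomographic likelihood, which is not reproduced here. Jacobi updates every component from the previous iterate, while Gauss-Seidel reuses components already updated within the same sweep.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep: all components are computed from the previous iterate."""
    n = len(b)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]

def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep: each component uses values updated earlier in the sweep."""
    x = list(x)
    n = len(b)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x

def iterate(step, A, b, sweeps):
    x = [0.0] * len(b)
    for _ in range(sweeps):
        x = step(A, b, x)
    return x

# Diagonally dominant example whose exact solution is (1, 2, 3).
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
b = [6.0, 12.0, 14.0]
x_jacobi = iterate(jacobi_step, A, b, 50)
x_gauss_seidel = iterate(gauss_seidel_step, A, b, 50)
```

On this example Gauss-Seidel reaches a given accuracy in fewer sweeps, whereas the Jacobi update of each component is independent of the others and is therefore simpler to parallelize.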
The design and implementation of cost-effective algorithms for direct solution of banded linear systems on the vector processor system 32 supercomputer

NASA Technical Reports Server (NTRS)

Samba, A. S.

1985-01-01

The problem of solving banded linear systems by direct (non-iterative) techniques on the Vector Processor System (VPS) 32 supercomputer is considered. Two efficient direct methods for solving banded linear systems on the VPS 32 are described. The vector cyclic reduction (VCR) algorithm is discussed in detail. The performance of the VCR on a three parameter model problem is also illustrated. The VCR is an adaptation of the conventional point cyclic reduction algorithm. The second direct method is the Customized Reduction of Augmented Triangles (CRAT). CRAT has the dominant characteristics of an efficient VPS 32 algorithm. CRAT is tailored to the pipeline architecture of the VPS 32 and as a consequence the algorithm is implicitly vectorizable.
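The conventional point cyclic reduction mentioned above can be sketched for the tridiagonal case. This is a minimal recursive odd-even reduction written for clarity, not the vectorized VCR or CRAT algorithms of the report; the function name and array layout are assumptions, and no pivoting is performed, so a diagonally dominant matrix is assumed.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive odd-even (cyclic) reduction.

    a: sub-diagonal (a[0] ignored), b: diagonal,
    c: super-diagonal (c[-1] ignored), d: right-hand side.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    a = [0.0] + list(a[1:])
    c = list(c[:-1]) + [0.0]
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        # Eliminate the even-indexed neighbours of each odd-indexed equation.
        has_next = i + 1 < n
        nxt = i + 1 if has_next else i  # dummy index when gamma is zero
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[nxt] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - gamma * a[nxt])
        rc.append(-gamma * c[nxt])
        rd.append(d[i] - alpha * d[i - 1] - gamma * d[nxt])
    odd_solution = cyclic_reduction(ra, rb, rc, rd)
    x = [0.0] * n
    for k, i in enumerate(range(1, n, 2)):
        x[i] = odd_solution[k]
    for i in range(0, n, 2):
        # Back-substitute for the eliminated even-indexed unknowns.
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x

# Example whose exact solution is (1, 2, 3, 4, 5).
solution = cyclic_reduction([0.0, 1.0, 1.0, 1.0, 1.0], [4.0] * 5,
                            [1.0, 1.0, 1.0, 1.0, 0.0],
                            [6.0, 12.0, 18.0, 24.0, 24.0])
```

Within each reduction level the updates for different odd indices are independent of one another, which is the property that makes the method attractive on vector hardware.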
Hybrid optimization and Bayesian inference techniques for a non-smooth radiation detection problem

DOE PAGES

Stefanescu, Razvan; Schmidt, Kathleen; Hite, Jason; ...

2016-12-12

In this paper, we propose several algorithms to recover the location and intensity of a radiation source located in a simulated 250 × 180 m block of an urban center based on synthetic measurements. Radioactive decay and detection are Poisson random processes, so we employ likelihood functions based on this distribution. Owing to the domain geometry and the proposed response model, the negative logarithm of the likelihood is only piecewise continuously differentiable, and it has multiple local minima. To address these difficulties, we investigate three hybrid algorithms composed of mixed optimization techniques. For global optimization, we consider simulated annealing, particle swarm, and genetic algorithm, which rely solely on objective function evaluations; that is, they do not evaluate the gradient of the objective function. By employing early stopping criteria for the global optimization methods, a pseudo-optimum point is obtained. This is subsequently utilized as the initial value by the deterministic implicit filtering method, which is able to find local extrema in non-smooth functions, to finish the search in a narrow domain. These new hybrid techniques, combining global optimization and implicit filtering, address difficulties associated with the non-smooth response, and they are shown to significantly decrease the computational time relative to the global optimization methods. To quantify uncertainties associated with the source location and intensity, we employ the delayed rejection adaptive Metropolis and DiffeRential Evolution Adaptive Metropolis algorithms. Finally, marginal densities of the source properties are obtained, and the means of the chains compare accurately with the estimates produced by the hybrid algorithms.
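The objective being minimized in such a problem can be sketched as a Poisson negative log-likelihood. The response model below (inverse-square falloff plus a constant background, with no buildings) is a hypothetical simplification introduced only for illustration; the paper's response model accounts for the domain geometry and is what makes the real objective non-smooth.

```python
import math

def expected_counts(source_xy, intensity, detector_xy, background=1.0):
    """Hypothetical mean count rate: constant background plus inverse-square falloff."""
    dx = source_xy[0] - detector_xy[0]
    dy = source_xy[1] - detector_xy[1]
    # The +1.0 regularizes the singularity at zero distance (an assumption).
    return background + intensity / (dx * dx + dy * dy + 1.0)

def poisson_nll(source_xy, intensity, detectors, counts):
    """Negative log-likelihood of independent Poisson counts at the detectors."""
    total = 0.0
    for detector, n in zip(detectors, counts):
        lam = expected_counts(source_xy, intensity, detector)
        total += lam - n * math.log(lam) + math.lgamma(n + 1)
    return total

# Synthetic, noise-free counts generated from a source at (2, 3).
detectors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
counts = [round(expected_counts((2.0, 3.0), 1000.0, d)) for d in detectors]
nll_true = poisson_nll((2.0, 3.0), 1000.0, detectors, counts)
nll_wrong = poisson_nll((8.0, 7.0), 1000.0, detectors, counts)
```

A derivative-free global method only ever calls a function like `poisson_nll`, which is why such methods remain usable when the gradient does not exist everywhere.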

Coupling compositional liquid gas Darcy and free gas flows at porous and free-flow domains interface

DOE Office of Scientific and Technical Information (OSTI.GOV)

Masson, R., E-mail: roland.masson@unice.fr; Team COFFEE INRIA Sophia Antipolis Méditerranée; Trenty, L., E-mail: laurent.trenty@andra.fr

This paper proposes an efficient splitting algorithm to solve coupled liquid gas Darcy and free gas flows at the interface between a porous medium and a free-flow domain. This model is compared to the reduced model introduced in [6] using a 1D approximation of the gas free flow. For that purpose, the gas molar fraction diffusive flux at the interface in the free-flow domain is approximated by a two point flux approximation based on a low-frequency diagonal approximation of a Steklov–Poincaré type operator. The splitting algorithm and the reduced model are applied in particular to the modelling of the mass exchanges at the interface between the storage and the ventilation galleries in radioactive waste deposits.
Noniterative Multireference Coupled Cluster Methods on Heterogeneous CPU-GPU Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhaskaran-Nair, Kiran; Ma, Wenjing; Krishnamoorthy, Sriram

2013-04-09

A novel parallel algorithm for non-iterative multireference coupled cluster (MRCC) theories, which merges recently introduced reference-level parallelism (RLP) [K. Bhaskaran-Nair, J. Brabec, E. Aprà, H.J.J. van Dam, J. Pittner, K. Kowalski, J. Chem. Phys. 137, 094112 (2012)] with the possibility of accelerating numerical calculations using graphics processing unit (GPU) is presented. We discuss the performance of this algorithm on the example of the MRCCSD(T) method (iterative singles and doubles and perturbative triples), where the corrections due to triples are added to the diagonal elements of the MRCCSD (iterative singles and doubles) effective Hamiltonian matrix. The performance of the combined RLP/GPU algorithm is illustrated on the example of the Brillouin-Wigner (BW) and Mukherjee (Mk) state-specific MRCCSD(T) formulations.
Asymmetric rotor-like probes to polarized fluorescence study of the macroscopically oriented uniaxial media: Model parameters recognition

NASA Astrophysics Data System (ADS)

Buczkowski, M.; Fisz, J. J.

2008-07-01

In this paper the possibility of numerical data modelling in the case of angle- and time-resolved fluorescence spectroscopy is investigated. The asymmetric fluorescence probes are assumed to undergo restricted rotational diffusion in a hosting medium. This process is described quantitatively by the diffusion tensor and the aligning potential. The evolution of the system is expressed in terms of the Smoluchowski equation with an appropriate time-developing operator. A matrix representation of this operator is calculated, then symmetrized and diagonalized. The resulting propagator is used to generate a synthetic noisy data set that imitates the results of experimental measurements. The data set serves as the basis for the χ² optimization, performed by the genetic algorithm followed by the gradient search, in order to recover the model parameters, which are the diagonal elements of the diffusion tensor, the aligning potential expansion coefficients and the directions of the electronic dipole moments. The whole procedure properly identifies the model parameters, showing that the outlined formalism should be taken into account when analysing real experimental data.
Matrix-product-state method with local basis optimization for nonequilibrium electron-phonon systems

NASA Astrophysics Data System (ADS)

Heidrich-Meisner, Fabian; Brockt, Christoph; Dorfner, Florian; Vidmar, Lev; Jeckelmann, Eric

We present a method for simulating the time evolution of quasi-one-dimensional correlated systems with strongly fluctuating bosonic degrees of freedom (e.g., phonons) using matrix product states. For this purpose we combine the time-evolving block decimation (TEBD) algorithm with a local basis optimization (LBO) approach. We discuss the performance of our approach in comparison to TEBD with a bare boson basis, exact diagonalization, and diagonalization in a limited functional space. TEBD with LBO can reduce the computational cost by orders of magnitude when boson fluctuations are large and thus it allows one to investigate problems that are out of reach of other approaches. First, we test our method on the non-equilibrium dynamics of a Holstein polaron and show that it allows us to study the regime of strong electron-phonon coupling. Second, the method is applied to the scattering of an electronic wave packet off a region with electron-phonon coupling. Our study reveals a rich physics including transient self-trapping and dissipation. Supported by Deutsche Forschungsgemeinschaft (DFG) via FOR 1807.
Parallelized traveling cluster approximation to study numerically spin-fermion models on large lattices

NASA Astrophysics Data System (ADS)

Mukherjee, Anamitra; Patel, Niravkumar D.; Bishop, Chris; Dagotto, Elbio

2015-06-01

Lattice spin-fermion models are important to study correlated systems where quantum dynamics allows for a separation between slow and fast degrees of freedom. The fast degrees of freedom are treated quantum mechanically while the slow variables, generically referred to as the "spins," are treated classically. At present, exact diagonalization coupled with classical Monte Carlo (ED + MC) is extensively used to solve numerically a general class of lattice spin-fermion problems. In this common setup, the classical variables (spins) are treated via the standard MC method while the fermion problem is solved by exact diagonalization. The "traveling cluster approximation" (TCA) is a real space variant of the ED + MC method that allows one to solve spin-fermion problems on lattice sizes with up to 10^3 sites. In this publication, we present a novel reorganization of the TCA algorithm in a manner that can be efficiently parallelized. This allows us to solve generic spin-fermion models easily on 10^4 lattice sites and with some effort on 10^5 lattice sites, representing the record lattice sizes studied for this family of models.
Super-resolution algorithm based on sparse representation and wavelet preprocessing for remote sensing imagery

NASA Astrophysics Data System (ADS)

Ren, Ruizhi; Gu, Lingjia; Fu, Haoyang; Sun, Chenglin

2017-04-01

An effective super-resolution (SR) algorithm is proposed for actual spectral remote sensing images based on sparse representation and wavelet preprocessing. The proposed SR algorithm mainly consists of dictionary training and image reconstruction. Wavelet preprocessing is used to establish four subbands, i.e., low frequency, horizontal, vertical, and diagonal high frequency, for an input image. As compared to the traditional approaches involving the direct training of image patches, the proposed approach focuses on the training of features derived from these four subbands. The proposed algorithm is verified using different spectral remote sensing images, e.g., moderate-resolution imaging spectroradiometer (MODIS) images with different bands, and the latest Chinese Jilin-1 satellite images with high spatial resolution. According to the visual experimental results obtained from the MODIS remote sensing data, the SR images using the proposed SR algorithm are superior to those using a conventional bicubic interpolation algorithm or traditional SR algorithms without preprocessing. Fusion algorithms, e.g., standard intensity-hue-saturation, principal component analysis, wavelet transform, and the proposed SR algorithms are utilized to merge the multispectral and panchromatic images acquired by the Jilin-1 satellite. The effectiveness of the proposed SR algorithm is assessed by parameters such as peak signal-to-noise ratio, structural similarity index, correlation coefficient, root-mean-square error, relative dimensionless global error in synthesis, relative average spectral error, spectral angle mapper, and the quality index Q4, and its performance is better than that of the standard image fusion algorithms.
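As an illustration of the four-subband split mentioned in this abstract, the following is a minimal one-level 2D Haar decomposition in NumPy. The Haar wavelet and the averaging normalization are assumptions made for brevity; the abstract does not say which wavelet the authors use. Which of the two mixed subbands is called "horizontal" and which "vertical" depends on convention, so the sketch uses the neutral names `lh` and `hl`.

```python
import numpy as np

def haar_dwt2(image):
    """One-level 2D Haar split of an image with even height and width.

    Returns (ll, lh, hl, hh): the low-frequency approximation, two mixed
    detail subbands, and the diagonal high-frequency subband.
    """
    a = np.asarray(image, dtype=float)
    # Low-pass (average) and high-pass (difference) over adjacent row pairs.
    lo_r = (a[0::2, :] + a[1::2, :]) / 2.0
    hi_r = (a[0::2, :] - a[1::2, :]) / 2.0
    # Same two filters over adjacent column pairs.
    ll = (lo_r[:, 0::2] + lo_r[:, 1::2]) / 2.0  # low in both directions
    lh = (lo_r[:, 0::2] - lo_r[:, 1::2]) / 2.0  # differences between columns
    hl = (hi_r[:, 0::2] + hi_r[:, 1::2]) / 2.0  # differences between rows
    hh = (hi_r[:, 0::2] - hi_r[:, 1::2]) / 2.0  # diagonal detail
    return ll, lh, hl, hh
```

Each subband has half the height and width of the input, so features for dictionary training could be extracted per subband rather than from raw image patches.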
A conservative finite difference algorithm for the unsteady transonic potential equation in generalized coordinates

NASA Technical Reports Server (NTRS)

Bridgeman, J. O.; Steger, J. L.; Caradonna, F. X.

1982-01-01

An implicit, approximate-factorization, finite-difference algorithm has been developed for the computation of unsteady, inviscid transonic flows in two and three dimensions. The computer program solves the full-potential equation in generalized coordinates in conservation-law form in order to properly capture shock-wave position and speed. A body-fitted coordinate system is employed for the simple and accurate treatment of boundary conditions on the body surface. The time-accurate algorithm is modified to a conventional ADI relaxation scheme for steady-state computations. Results from two- and three-dimensional steady and two-dimensional unsteady calculations are compared with existing methods.
Study of the mapping of Navier-Stokes algorithms onto multiple-instruction/multiple-data-stream computers

NASA Technical Reports Server (NTRS)

Eberhardt, D. S.; Baganoff, D.; Stevens, K.

1984-01-01

Implicit approximate-factored algorithms have certain properties that are suitable for parallel processing. A particular computational fluid dynamics (CFD) code, using this algorithm, is mapped onto a multiple-instruction/multiple-data-stream (MIMD) computer architecture. An explanation of this mapping procedure is presented, as well as some of the difficulties encountered when trying to run the code concurrently. Timing results are given for runs on the Ames Research Center's MIMD test facility which consists of two VAX 11/780's with a common MA780 multi-ported memory. Speedups exceeding 1.9 for characteristic CFD runs were indicated by the timing results.
How to estimate the 3D power spectrum of the Lyman-α forest

NASA Astrophysics Data System (ADS)

Font-Ribera, Andreu; McDonald, Patrick; Slosar, Anže

2018-01-01

We derive and numerically implement an algorithm for estimating the 3D power spectrum of the Lyman-α (Lyα) forest flux fluctuations. The algorithm exploits the unique geometry of Lyα forest data to efficiently measure the cross-spectrum between lines of sight as a function of parallel wavenumber, transverse separation and redshift. We start by approximating the global covariance matrix as block-diagonal, where only pixels from the same spectrum are correlated. We then compute the eigenvectors of the derivative of the signal covariance with respect to cross-spectrum parameters, and project the inverse-covariance-weighted spectra onto them. This acts much like a radial Fourier transform over redshift windows. The resulting cross-spectrum inference is then converted into our final product, an approximation of the likelihood for the 3D power spectrum expressed as second order Taylor expansion around a fiducial model. We demonstrate the accuracy and scalability of the algorithm and comment on possible extensions. Our algorithm will allow efficient analysis of the upcoming Dark Energy Spectroscopic Instrument dataset.
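The block-diagonal approximation described here means the inverse-covariance weighting can be done one spectrum at a time instead of inverting one global matrix. A minimal sketch of that single step follows; the function name `weight_spectra` and the list-of-arrays layout are assumptions for illustration, and the later steps of the estimator (eigenvector projection, likelihood expansion) are not shown.

```python
import numpy as np

def weight_spectra(spectra, covariances):
    """Return C_k^{-1} d_k for each line of sight k.

    spectra: list of 1D arrays d_k (pixels of one spectrum).
    covariances: list of matching square covariance blocks C_k.
    Solving per block is equivalent to applying the inverse of the
    block-diagonal global covariance to the concatenated data vector.
    """
    return [np.linalg.solve(c, d) for d, c in zip(spectra, covariances)]
```

The cost then scales with the sum of the cubes of the per-spectrum pixel counts rather than with the cube of the total pixel count.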
How to estimate the 3D power spectrum of the Lyman-α forest

DOE PAGES

Font-Ribera, Andreu; McDonald, Patrick; Slosar, Anže

2018-01-02

Here, we derive and numerically implement an algorithm for estimating the 3D power spectrum of the Lyman-α (Lyα) forest flux fluctuations. The algorithm exploits the unique geometry of Lyα forest data to efficiently measure the cross-spectrum between lines of sight as a function of parallel wavenumber, transverse separation and redshift. We start by approximating the global covariance matrix as block-diagonal, where only pixels from the same spectrum are correlated. We then compute the eigenvectors of the derivative of the signal covariance with respect to cross-spectrum parameters, and project the inverse-covariance-weighted spectra onto them. This acts much like a radial Fourier transform over redshift windows. The resulting cross-spectrum inference is then converted into our final product, an approximation of the likelihood for the 3D power spectrum expressed as second order Taylor expansion around a fiducial model. We demonstrate the accuracy and scalability of the algorithm and comment on possible extensions. Our algorithm will allow efficient analysis of the upcoming Dark Energy Spectroscopic Instrument dataset.
Non-intrusive practitioner pupil detection for unmodified microscope oculars.

PubMed

Fuhl, Wolfgang; Santini, Thiago; Reichert, Carsten; Claus, Daniel; Herkommer, Alois; Bahmani, Hamed; Rifai, Katharina; Wahl, Siegfried; Kasneci, Enkelejda

2016-12-01

Modern microsurgery is a long and complex task requiring the surgeon to handle multiple microscope controls while performing the surgery. Eye tracking provides an additional means of interaction for the surgeon that could be used to alleviate this situation, diminishing surgeon fatigue and surgery time, thus decreasing risks of infection and human error. In this paper, we introduce a novel algorithm for pupil detection tailored for eye images acquired through an unmodified microscope ocular. The proposed approach, the Hough transform, and six state-of-the-art pupil detection algorithms were evaluated on over 4000 hand-labeled images acquired from a digital operating microscope with an integrated non-intrusive monitoring system for the surgeon's eyes. Our results show that the proposed method reaches detection rates up to 71% for an error of ≈3% w.r.t. the input image diagonal; none of the state-of-the-art pupil detection algorithms performed satisfactorily. The algorithm and hand-labeled data set can be downloaded at: www.ti.uni-tuebingen.de/perception. Copyright © 2016 Elsevier Ltd. All rights reserved.
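The reported metric, a detection rate at an error tolerance expressed as a fraction of the image diagonal, can be computed as in the sketch below. The function name, argument layout, and the use of Euclidean distance between predicted and labeled pupil centers are assumptions; the abstract does not give the exact evaluation protocol.

```python
import numpy as np

def detection_rate(pred, truth, width, height, rel_tol=0.03):
    """Fraction of images whose pupil-center error is within rel_tol of the diagonal.

    pred, truth: arrays of shape (n_images, 2) with (x, y) pixel coordinates.
    width, height: image size in pixels.
    """
    diagonal = np.hypot(width, height)
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    error = np.linalg.norm(pred - truth, axis=1)  # Euclidean pixel error
    return float(np.mean(error <= rel_tol * diagonal))
```

For a 400 × 300 image the diagonal is 500 pixels, so a 3% tolerance accepts errors up to 15 pixels.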
How to estimate the 3D power spectrum of the Lyman-α forest

DOE Office of Scientific and Technical Information (OSTI.GOV)

Font-Ribera, Andreu; McDonald, Patrick; Slosar, Anže

Here, we derive and numerically implement an algorithm for estimating the 3D power spectrum of the Lyman-α (Lyα) forest flux fluctuations. The algorithm exploits the unique geometry of Lyα forest data to efficiently measure the cross-spectrum between lines of sight as a function of parallel wavenumber, transverse separation and redshift. We start by approximating the global covariance matrix as block-diagonal, where only pixels from the same spectrum are correlated. We then compute the eigenvectors of the derivative of the signal covariance with respect to cross-spectrum parameters, and project the inverse-covariance-weighted spectra onto them. This acts much like a radial Fourier transform over redshift windows. The resulting cross-spectrum inference is then converted into our final product, an approximation of the likelihood for the 3D power spectrum expressed as second order Taylor expansion around a fiducial model. We demonstrate the accuracy and scalability of the algorithm and comment on possible extensions. Our algorithm will allow efficient analysis of the upcoming Dark Energy Spectroscopic Instrument dataset.
A Fast Estimation Algorithm for Two-Dimensional Gravity Data (GEOFAST),

DTIC Science & Technology

1979-11-15

to a wide class of problems (Refs. 9 and 17). The major inhibitor to the widespread application of optimal gravity data processing is the severe...extends directly to two dimensions. Define the n1n2 × n1n2 diagonal window matrix W as the Kronecker product of two one-dimensional windows W = W1 ⊗ W2 (B...Inversion of Separable Matrices Consider the linear system y = T x (B.3-1) where T is block Toeplitz of dimension n1n2 × n1n2. Its frequency domain
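The separable window construction in this excerpt can be checked numerically: the Kronecker product of two diagonal one-dimensional windows is itself diagonal, and applying it to a flattened two-dimensional field is the same as multiplying the field elementwise by the outer product of the two windows. The window values and sizes below are arbitrary illustrations, not values from the report.

```python
import numpy as np

# Two hypothetical one-dimensional windows of lengths n1 = 3 and n2 = 2.
w1 = np.array([0.5, 1.0, 0.5])
w2 = np.array([0.25, 1.0])

# W = W1 (Kronecker product) W2, an (n1*n2) x (n1*n2) diagonal matrix.
W = np.kron(np.diag(w1), np.diag(w2))

# A 2D field stored row-major, so that entry (i, j) sits at index i*n2 + j.
field = np.arange(6, dtype=float).reshape(3, 2)

# Applying W to the flattened field equals elementwise 2D windowing.
windowed_flat = W @ field.ravel()
windowed_2d = np.outer(w1, w2) * field
```

In practice the dense matrix `W` never needs to be formed; the elementwise product does the same work with n1*n2 multiplications.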
Teaching Structured Design of Network Algorithms in Enhanced Versions of SQL

ERIC Educational Resources Information Center

de Brock, Bert

2004-01-01

From time to time developers of (database) applications will encounter, explicitly or implicitly, structures such as trees, graphs, and networks. Such applications can, for instance, relate to bills of material, organization charts, networks of (rail)roads, networks of conduit pipes (e.g., plumbing, electricity), telecom networks, and data…
Towards developing robust algorithms for solving partial differential equations on MIMD machines

NASA Technical Reports Server (NTRS)

Saltz, Joel H.; Naik, Vijay K.

1988-01-01

Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.
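The model problem named here, a parabolic equation marched implicitly in time with a Jacobi-type iteration for each step, can be sketched for the 1D heat equation as below. This shows only the baseline single-timestep Jacobi solve; the windowed iteration over several timesteps and the processor partitioning studied in the report are not reproduced. The function name and the parameter `r` (diffusivity times dt over dx squared) are illustrative assumptions.

```python
import numpy as np

def jacobi_backward_euler_step(u_old, r, tol=1e-12, max_iter=10_000):
    """One backward Euler step of u_t = u_xx, solved by Jacobi iteration.

    Solves (I - r*L) u_new = u_old, where L is the stencil [1, -2, 1]
    and r = dt / dx**2. The end values u[0] and u[-1] are held fixed.
    """
    u = u_old.copy()
    for _ in range(max_iter):
        nxt = u.copy()
        # Each interior update uses only the previous iterate's neighbours,
        # so all points can be updated independently (and in parallel).
        nxt[1:-1] = (u_old[1:-1] + r * (u[:-2] + u[2:])) / (1.0 + 2.0 * r)
        if np.max(np.abs(nxt - u)) < tol:
            return nxt
        u = nxt
    return u
```

Because the matrix I - r*L is strictly diagonally dominant, the Jacobi iteration converges for any r > 0, although more slowly as r grows.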
Towards developing robust algorithms for solving partial differential equations on MIMD machines

NASA Technical Reports Server (NTRS)

Saltz, J. H.; Naik, V. K.

1985-01-01

Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.
Computational modeling of chemo-electro-mechanical coupling: A novel implicit monolithic finite element approach

PubMed Central

Wong, J.; Göktepe, S.; Kuhl, E.

2014-01-01

Summary Computational modeling of the human heart allows us to predict how chemical, electrical, and mechanical fields interact throughout a cardiac cycle. Pharmacological treatment of cardiac disease has advanced significantly over the past decades, yet it remains unclear how the local biochemistry of an individual heart cell translates into global cardiac function. Here we propose a novel, unified strategy to simulate excitable biological systems across three biological scales. To discretize the governing chemical, electrical, and mechanical equations in space, we propose a monolithic finite element scheme. We apply a highly efficient and inherently modular global-local split, in which the deformation and the transmembrane potential are introduced globally as nodal degrees of freedom, while the chemical state variables are treated locally as internal variables. To ensure unconditional algorithmic stability, we apply an implicit backward Euler finite difference scheme to discretize the resulting system in time. To increase algorithmic robustness and guarantee optimal quadratic convergence, we suggest an incremental iterative Newton-Raphson scheme. The proposed algorithm allows us to simulate the interaction of chemical, electrical, and mechanical fields during a representative cardiac cycle on a patient-specific geometry, robust and stable, with calculation times on the order of four days on a standard desktop computer. PMID:23798328
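The time discretization described here, backward Euler with a Newton-Raphson iteration at each step, can be shown on a scalar ordinary differential equation. This is a minimal sketch under that simplification; the finite element discretization, the global-local split, and the cardiac model itself are not represented, and the function name and signature are assumptions.

```python
def backward_euler_newton(f, dfdy, y0, dt, n_steps, tol=1e-12, max_newton=50):
    """Integrate y' = f(y) with backward Euler; solve each step by Newton-Raphson.

    f: right-hand side, dfdy: its derivative with respect to y.
    Returns the list [y_0, y_1, ..., y_n_steps].
    """
    y = float(y0)
    history = [y]
    for _ in range(n_steps):
        y_new = y  # initial Newton guess: the previous solution
        for _ in range(max_newton):
            # Residual of the implicit update y_new = y + dt * f(y_new).
            residual = y_new - y - dt * f(y_new)
            if abs(residual) < tol:
                break
            jacobian = 1.0 - dt * dfdy(y_new)
            y_new -= residual / jacobian
        y = y_new
        history.append(y)
    return history
```

For the stiff linear test problem y' = -1000 y with dt = 0.1, each step multiplies the solution by 1/101, so the scheme stays stable at a step size far beyond the explicit Euler stability limit.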
Obtaining highly excited eigenstates of the localized XX chain via DMRG-X.

PubMed

Devakul, Trithep; Khemani, Vedika; Pollmann, Frank; Huse, David A; Sondhi, S L

2017-12-13

We benchmark a variant of the recently introduced density matrix renormalization group (DMRG)-X algorithm against exact results for the localized random field XX chain. We find that the eigenstates obtained via DMRG-X exhibit a highly accurate l-bit description for system sizes much bigger than the direct, many-body, exact diagonalization in the spin variables is able to access. We take advantage of the underlying free fermion description of the XX model to accurately test the strengths and limitations of this algorithm for large system sizes. We discuss the theoretical constraints on the performance of the algorithm from the entanglement properties of the eigenstates, and its actual performance at different values of disorder. A small but significant improvement to the algorithm is also presented, which helps significantly with convergence. We find that, at high entanglement, DMRG-X shows a bias towards eigenstates with low entanglement, but can be improved with increased bond dimension. This result suggests that one must be careful when applying the algorithm for interacting many-body localized spin models near a transition. This article is part of the themed issue 'Breakdown of ergodicity in quantum systems: from solids to synthetic matter'. © 2017 The Author(s).
Obtaining highly excited eigenstates of the localized XX chain via DMRG-X

NASA Astrophysics Data System (ADS)

Devakul, Trithep; Khemani, Vedika; Pollmann, Frank; Huse, David A.; Sondhi, S. L.

2017-10-01

We benchmark a variant of the recently introduced density matrix renormalization group (DMRG)-X algorithm against exact results for the localized random field XX chain. We find that the eigenstates obtained via DMRG-X exhibit a highly accurate l-bit description for system sizes much bigger than the direct, many-body, exact diagonalization in the spin variables is able to access. We take advantage of the underlying free fermion description of the XX model to accurately test the strengths and limitations of this algorithm for large system sizes. We discuss the theoretical constraints on the performance of the algorithm from the entanglement properties of the eigenstates, and its actual performance at different values of disorder. A small but significant improvement to the algorithm is also presented, which helps significantly with convergence. We find that, at high entanglement, DMRG-X shows a bias towards eigenstates with low entanglement, but can be improved with increased bond dimension. This result suggests that one must be careful when applying the algorithm for interacting many-body localized spin models near a transition. This article is part of the themed issue 'Breakdown of ergodicity in quantum systems: from solids to synthetic matter'.
A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Garrett, C. Kristopher; Hauck, Cory D.

In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solved efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.

Non-hydrostatic semi-elastic hybrid-coordinate SISL extension of HIRLAM. Part I: numerical scheme

NASA Astrophysics Data System (ADS)

Rõõm, Rein; Männik, Aarne; Luhamaa, Andres

2007-10-01

A two-time-level, semi-implicit, semi-Lagrangian (SISL) scheme is applied to the non-hydrostatic pressure coordinate equations, constituting a modified Miller-Pearce-White model, in a hybrid-coordinate framework. Neutral background is subtracted in the initial continuous dynamics, yielding modified equations for geopotential, temperature and logarithmic surface pressure fluctuation. Implicit Lagrangian marching formulae for a single time-step are derived. A disclosure scheme is presented, which results in an uncoupled diagnostic system, consisting of a 3-D Poisson equation for omega velocity and a 2-D Helmholtz equation for logarithmic pressure fluctuation. The model is discretized to create a non-hydrostatic extension to the numerical weather prediction model HIRLAM. The discretization schemes, trajectory computation algorithms and interpolation routines, as well as the physical parametrization package, are retained from the parent hydrostatic HIRLAM. For stability investigation, the derived SISL model is linearized with respect to the initial, thermally non-equilibrium resting state. Explicit residuals of the linear model prove to be sensitive to the relative departures of temperature and static stability from the reference state. Based on the stability study, the semi-implicit term in the vertical momentum equation is replaced by an implicit term, which increases the stability of the model.
A Fast Solver for Implicit Integration of the Vlasov--Poisson System in the Eulerian Framework

DOE PAGES

Garrett, C. Kristopher; Hauck, Cory D.

2018-04-05

In this paper, we present a domain decomposition algorithm to accelerate the solution of Eulerian-type discretizations of the linear, steady-state Vlasov equation. The steady-state solver then forms a key component in the implementation of fully implicit or nearly fully implicit temporal integrators for the nonlinear Vlasov--Poisson system. The solver relies on a particular decomposition of phase space that enables the use of sweeping techniques commonly used in radiation transport applications. The original linear system for the phase space unknowns is then replaced by a smaller linear system involving only unknowns on the boundary between subdomains, which can then be solved efficiently with Krylov methods such as GMRES. Steady-state solves are combined to form an implicit Runge--Kutta time integrator, and the Vlasov equation is coupled self-consistently to the Poisson equation via a linearized procedure or a nonlinear fixed-point method for the electric field. Finally, numerical results for standard test problems demonstrate the efficiency of the domain decomposition approach when compared to the direct application of an iterative solver to the original linear system.
Acceleration methods for multi-physics compressible flow

NASA Astrophysics Data System (ADS)

Peles, Oren; Turkel, Eli

2018-04-01

In this work we investigate the Runge-Kutta (RK)/Implicit smoother scheme as a convergence accelerator for complex multi-physics flow problems including turbulent, reactive and also two-phase flows. The flows considered are subsonic, transonic and supersonic flows in complex geometries, and can be either steady or unsteady. All of these problems are considered to be very stiff. We then introduce an acceleration method for the compressible Navier-Stokes equations. We start with the multigrid method for pure subsonic flow, including reactive flows. We then add the Rossow-Swanson-Turkel RK/Implicit smoother that enables performing all these complex flow simulations with a reasonable CFL number. We next discuss the RK/Implicit smoother for time dependent problems and also for low Mach numbers. The preconditioner includes an intrinsic low Mach number treatment inside the smoother operator. We also develop a modified Roe scheme with a corresponding flux Jacobian matrix. We then give the extension of the method for real gas and reactive flow. Reactive flows are governed by a system of inhomogeneous Navier-Stokes equations with very stiff source terms. The extension of the RK/Implicit smoother requires an approximation of the source term Jacobian. The properties of the Jacobian are very important for the stability of the method. We discuss what the theory of chemical kinetics tells us about the mathematical properties of the Jacobian matrix. We focus on the implication of Le Chatelier's principle for the sign of the diagonal entries of the Jacobian. We present the implementation of the method for turbulent flow. We use two RANS turbulence models: the one-equation Spalart-Allmaras model and the two-equation k-ω SST model. The last extension is for two-phase flows with a gas as the main phase and an Eulerian representation of a dispersed particle phase (EDP). We present some examples of such flow computations inside a ballistic evaluation rocket motor. The numerical examples in this work include transonic flow about a RAE2822 airfoil, about a M6 Onera wing, a NACA0012 airfoil at very low Mach number, two-phase flow inside a ballistic evaluation motor (BEM), a turbulent reactive shear layer and a time dependent Sod's tube problem.
Final Technical Report [Scalable methods for electronic excitations and optical responses of nanostructures: mathematics to algorithms to observables

DOE Office of Scientific and Technical Information (OSTI.GOV)

Saad, Yousef

2014-03-19

The master project under which this work is funded had as its main objective to develop computational methods for modeling electronic excited-state and optical properties of various nanostructures. The specific goals of the computer science group were primarily to develop effective numerical algorithms in Density Functional Theory (DFT) and Time Dependent Density Functional Theory (TDDFT). There were essentially four distinct stated objectives. The first objective was to study and develop effective numerical algorithms for solving large eigenvalue problems such as those that arise in Density Functional Theory (DFT) methods. The second objective was to explore so-called linear scaling methods or methods that avoid diagonalization. The third was to develop effective approaches for Time-Dependent DFT (TDDFT). Our fourth and final objective was to examine effective solution strategies for other problems in electronic excitations, such as the GW/Bethe-Salpeter method, and quantum transport problems.
Infinite projected entangled-pair state algorithm for ruby and triangle-honeycomb lattices

NASA Astrophysics Data System (ADS)

Jahromi, Saeed S.; Orús, Román; Kargarian, Mehdi; Langari, Abdollah

2018-03-01

The infinite projected entangled-pair state (iPEPS) algorithm is one of the most efficient techniques for studying the ground-state properties of two-dimensional quantum lattice Hamiltonians in the thermodynamic limit. Here, we show how the algorithm can be adapted to explore nearest-neighbor local Hamiltonians on the ruby and triangle-honeycomb lattices, using the corner transfer matrix (CTM) renormalization group for 2D tensor network contraction. Additionally, we show how the CTM method can be used to calculate the ground-state fidelity per lattice site and the boundary density operator and entanglement entropy (EE) on an infinite cylinder. As a benchmark, we apply the iPEPS method to the ruby model with anisotropic interactions and explore the ground-state properties of the system. We further extract the phase diagram of the model in different regimes of the couplings by measuring two-point correlators, ground-state fidelity, and EE on an infinite cylinder. Our phase diagram is in agreement with previous studies of the model by exact diagonalization.
Triangular covariance factorizations for. Ph.D. Thesis. - Calif. Univ.

NASA Technical Reports Server (NTRS)

Thornton, C. L.

1976-01-01

An improved computational form of the discrete Kalman filter is derived using an upper triangular factorization of the error covariance matrix. The covariance P is factored such that P = UDU^T where U is unit upper triangular and D is diagonal. Recursions are developed for propagating the U-D covariance factors together with the corresponding state estimate. The resulting algorithm, referred to as the U-D filter, combines the superior numerical precision of square root filtering techniques with an efficiency comparable to that of Kalman's original formula. Moreover, this method is easily implemented and involves no more computer storage than the Kalman algorithm. These characteristics make the U-D method an attractive realtime filtering technique. A new covariance error analysis technique is obtained from an extension of the U-D filter equations. This evaluation method is flexible and efficient and may provide significantly improved numerical results. Cost comparisons show that for a large class of problems the U-D evaluation algorithm is noticeably less expensive than conventional error analysis methods.
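The factorization that the U-D filter is built on can be computed column by column, starting from the last column. The sketch below covers only this factorization of a given symmetric positive definite covariance; the measurement-update and time-propagation recursions of the thesis are not shown, and the function name `ud_factorize` is an assumption.

```python
import numpy as np

def ud_factorize(P):
    """Factor a symmetric positive definite P as U @ diag(d) @ U.T.

    U is unit upper triangular and d holds the diagonal of D.
    Columns are processed from last to first, so every entry of U and d
    needed on the right-hand side has already been computed.
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    U = np.eye(n)
    d = np.zeros(n)
    for j in range(n - 1, -1, -1):
        # Diagonal entry: P[j, j] = d[j] + sum_{k>j} d[k] * U[j, k]**2
        d[j] = P[j, j] - np.sum(d[j + 1:] * U[j, j + 1:] ** 2)
        for i in range(j):
            # Off-diagonal: P[i, j] = U[i, j]*d[j] + sum_{k>j} d[k]*U[i, k]*U[j, k]
            s = np.sum(d[j + 1:] * U[i, j + 1:] * U[j, j + 1:])
            U[i, j] = (P[i, j] - s) / d[j]
    return U, d
```

No square roots are taken, which is one reason the abstract can describe the method as having efficiency comparable to the conventional covariance form.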
Markov-modulated Markov chains and the covarion process of molecular evolution.

PubMed

Galtier, N; Jean-Marie, A

2004-01-01

The covarion (or site specific rate variation, SSRV) process of biological sequence evolution is a process by which the evolutionary rate of a nucleotide/amino acid/codon position can change in time. In this paper, we introduce time-continuous, space-discrete, Markov-modulated Markov chains as a model for representing SSRV processes, generalizing existing theory to any model of rate change. We propose a fast algorithm for diagonalizing the generator matrix of relevant Markov-modulated Markov processes. This algorithm makes phylogeny likelihood calculation tractable even for a large number of rate classes and a large number of states, so that SSRV models become applicable to amino acid or codon sequence datasets. Using this algorithm, we investigate the accuracy of the discrete approximation to the Gamma distribution of evolutionary rates, widely used in molecular phylogeny. We show that a relatively large number of classes is required to achieve accurate approximation of the exact likelihood when the number of analyzed sequences exceeds 20, both under the SSRV and among site rate variation (ASRV) models.
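The reason diagonalizing the generator matrix speeds up likelihood calculation can be stated in one line. The symbols below (Q for the generator, V for the eigenvector matrix, \lambda_i for the eigenvalues, P(t) for the transition probability matrix over a branch of length t) are notation assumed here, not taken from the abstract, and the identity holds under the assumption that Q is diagonalizable.

```latex
Q = V \Lambda V^{-1}, \qquad
P(t) = e^{Qt} = V \, \mathrm{diag}\!\left(e^{\lambda_1 t}, \dots, e^{\lambda_n t}\right) V^{-1}
```

Once V and the eigenvalues are known, evaluating P(t) for each branch length needs only scalar exponentials and two matrix products rather than a fresh matrix exponential.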
Optimal sensor placement for modal testing on wind turbines

NASA Astrophysics Data System (ADS)

Schulze, Andreas; Zierath, János; Rosenow, Sven-Erik; Bockhahn, Reik; Rachholz, Roman; Woernle, Christoph

2016-09-01

The mechanical design of wind turbines requires a profound understanding of the dynamic behaviour. Even though highly detailed simulation models are already in use to support wind turbine design, modal testing on a real prototype is irreplaceable to identify site-specific conditions such as the stiffness of the tower foundation. Correct identification of the mode shapes of a complex mechanical structure depends strongly on the placement of the sensors. For operational modal analysis of a 3 MW wind turbine with a 120 m rotor on a 100 m tower developed by W2E Wind to Energy, algorithms for optimal placement of acceleration sensors are applied. The mode shapes used for the optimisation are calculated by means of a detailed flexible multibody model of the wind turbine. Among the three algorithms in this study, the genetic algorithm with weighted off-diagonal criterion yields the sensor configuration with the highest quality. The ongoing measurements on the prototype will be the basis for the development of optimised wind turbine designs.
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

PubMed

Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

2013-01-01

The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces the fuzzy support vector machine, a learning algorithm based on a combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that the fuzzy support vector machine, applied in combination with filter or wrapper feature selection methods, develops a robust model with higher accuracy than conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from the fuzzy support vector machine helps extract biological knowledge from microarray data. As a new classification model with high generalization power, robustness, and good interpretability, the fuzzy support vector machine seems to be a promising tool for gene expression microarray classification.
Deriving flow directions for coarse-resolution (1-4 km) gridded hydrologic modeling

NASA Astrophysics Data System (ADS)

Reed, Seann M.

2003-09-01

The National Weather Service Hydrology Laboratory (NWS-HL) is currently testing a grid-based distributed hydrologic model at a resolution (4 km) commensurate with operational, radar-based precipitation products. To implement distributed routing algorithms in this framework, a flow direction must be assigned to each model cell. A new algorithm, referred to as cell outlet tracing with an area threshold (COTAT) has been developed to automatically, accurately, and efficiently assign flow directions to any coarse-resolution grid cells using information from any higher-resolution digital elevation model. Although similar to previously published algorithms, this approach offers some advantages. Use of an area threshold allows more control over the tendency for producing diagonal flow directions. Analyses of results at different output resolutions ranging from 300 m to 4000 m indicate that it is possible to choose an area threshold that will produce minimal differences in average network flow lengths across this range of scales. Flow direction grids at a 4 km resolution have been produced for the conterminous United States.
Three-dimensional multigrid algorithms for the flux-split Euler equations

NASA Technical Reports Server (NTRS)

Anderson, W. Kyle; Thomas, James L.; Whitfield, David L.

1988-01-01

The Full Approximation Scheme (FAS) multigrid method is applied to several implicit flux-split algorithms for solving the three-dimensional Euler equations in a body fitted coordinate system. Each of the splitting algorithms uses a variation of approximate factorization and is implemented in a finite volume formulation. The algorithms are all vectorizable with little or no scalar computation required. The flux vectors are split into upwind components using both the splittings of Steger-Warming and Van Leer. The stability and smoothing rate of each of the schemes are examined using a Fourier analysis of the complete system of equations. Results are presented for three-dimensional subsonic, transonic, and supersonic flows which demonstrate substantially improved convergence rates with the multigrid algorithm. The influence of using both a V-cycle and a W-cycle on the convergence is examined.
A monolithic homotopy continuation algorithm with application to computational fluid dynamics

NASA Astrophysics Data System (ADS)

Brown, David A.; Zingg, David W.

2016-09-01

A new class of homotopy continuation methods is developed suitable for globalizing quasi-Newton methods for large sparse nonlinear systems of equations. The new continuation methods, described as monolithic homotopy continuation, differ from the classical predictor-corrector algorithm in that the predictor and corrector phases are replaced with a single phase which includes both a predictor and corrector component. Conditional convergence and stability are proved analytically. Using a Laplacian-like operator to construct the homotopy, the new algorithm is shown to be more efficient than the predictor-corrector homotopy continuation algorithm as well as an implementation of the widely-used pseudo-transient continuation algorithm for some inviscid and turbulent, subsonic and transonic external aerodynamic flows over the ONERA M6 wing and the NACA 0012 airfoil using a parallel implicit Newton-Krylov finite-difference flow solver.
A split finite element algorithm for the compressible Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Baker, A. J.

1979-01-01

An accurate and efficient numerical solution algorithm is established for solution of the high Reynolds number limit of the Navier-Stokes equations governing the multidimensional flow of a compressible essentially inviscid fluid. Finite element interpolation theory is used within a dissipative formulation established using Galerkin criteria within the Method of Weighted Residuals. An implicit iterative solution algorithm is developed, employing tensor product bases within a fractional steps integration procedure, that significantly enhances solution economy concurrent with sharply reduced computer hardware demands. The algorithm is evaluated for resolution of steep field gradients and coarse grid accuracy using both linear and quadratic tensor product interpolation bases. Numerical solutions for linear and nonlinear, one, two and three dimensional examples confirm and extend the linearized theoretical analyses, and results are compared to competitive finite difference derived algorithms.
Mapping implicit spectral methods to distributed memory architectures

NASA Technical Reports Server (NTRS)

Overman, Andrea L.; Vanrosendale, John

1991-01-01

Spectral methods have proven invaluable in numerical simulation of PDEs (Partial Differential Equations), but the frequent global communication required raises a fundamental barrier to their use on highly parallel architectures. To explore this issue, a 3-D implicit spectral method was implemented on an Intel hypercube. Utilization of about 50 percent was achieved on a 32 node iPSC/860 hypercube, for a 64 x 64 x 64 Fourier-spectral grid; finer grids yield higher utilizations. Chebyshev-spectral grids are more problematic, since plane-relaxation based multigrid is required. However, by using a semicoarsening multigrid algorithm, and by relaxing all multigrid levels concurrently, relatively high utilizations were also achieved in this harder case.
Stability of mixed time integration schemes for transient thermal analysis

NASA Technical Reports Server (NTRS)

Liu, W. K.; Lin, J. I.

1982-01-01

A current research topic in coupled-field problems is the development of effective transient algorithms that permit different time integration methods with different time steps to be used simultaneously in various regions of the problems. The implicit-explicit approach seems to be very successful in structural, fluid, and fluid-structure problems. This paper summarizes this research direction. A family of mixed time integration schemes, with the capabilities mentioned above, is also introduced for transient thermal analysis. A stability analysis and the computer implementation of this technique are also presented. In particular, it is shown that the mixed time implicit-explicit methods provide a natural framework for the further development of efficient, clean, modularized computer codes.
Optimizing the static-dynamic performance of the body-in-white using a modified non-dominated sorting genetic algorithm coupled with grey relational analysis

NASA Astrophysics Data System (ADS)

Wang, Dengfeng; Cai, Kefang

2018-04-01

This article presents a hybrid method combining a modified non-dominated sorting genetic algorithm (MNSGA-II) with grey relational analysis (GRA) to improve the static-dynamic performance of a body-in-white (BIW). First, an implicit parametric model of the BIW was built using SFE-CONCEPT software, and then the validity of the implicit parametric model was verified by physical testing. Eight shape design variables were defined for BIW beam structures based on the implicit parametric technology. Subsequently, MNSGA-II was used to determine the optimal combination of the design parameters that can improve the bending stiffness, torsion stiffness and low-order natural frequencies of the BIW without considerable increase in the mass. A set of non-dominated solutions was then obtained in the multi-objective optimization design. Finally, the grey entropy theory and GRA were applied to rank all non-dominated solutions from best to worst to determine the best trade-off solution. The comparison between the GRA and the technique for order of preference by similarity to ideal solution (TOPSIS) illustrated the reliability and rationality of GRA. Moreover, the effectiveness of the hybrid method was verified by the optimal results such that the bending stiffness, torsion stiffness, first order bending and first order torsion natural frequency were improved by 5.46%, 9.30%, 7.32% and 5.73%, respectively, with the mass of the BIW increasing by 1.30%.
A semi-implicit augmented IIM for Navier–Stokes equations with open, traction, or free boundary conditions

PubMed Central

Li, Zhilin; Xiao, Li; Cai, Qin; Zhao, Hongkai; Luo, Ray

2016-01-01

In this paper, a new Navier–Stokes solver based on a finite difference approximation is proposed to solve incompressible flows on irregular domains with open, traction, and free boundary conditions, which can be applied to simulations of fluid structure interaction, implicit solvent model for biomolecular applications and other free boundary or interface problems. For some problems of this type, the projection method and the augmented immersed interface method (IIM) do not work well or do not work at all. The proposed new Navier–Stokes solver is based on the local pressure boundary method, and a semi-implicit augmented IIM. A fast Poisson solver can be used in our algorithm which gives us the potential for developing fast overall solvers in the future. The time discretization is based on a second order multi-step method. Numerical tests with exact solutions are presented to validate the accuracy of the method. Application to fluid structure interaction between an incompressible fluid and a compressible gas bubble is also presented. PMID:27087702
A semi-implicit augmented IIM for Navier-Stokes equations with open, traction, or free boundary conditions.

PubMed

Li, Zhilin; Xiao, Li; Cai, Qin; Zhao, Hongkai; Luo, Ray

2015-08-15

In this paper, a new Navier-Stokes solver based on a finite difference approximation is proposed to solve incompressible flows on irregular domains with open, traction, and free boundary conditions, which can be applied to simulations of fluid structure interaction, implicit solvent model for biomolecular applications and other free boundary or interface problems. For some problems of this type, the projection method and the augmented immersed interface method (IIM) do not work well or do not work at all. The proposed new Navier-Stokes solver is based on the local pressure boundary method, and a semi-implicit augmented IIM. A fast Poisson solver can be used in our algorithm which gives us the potential for developing fast overall solvers in the future. The time discretization is based on a second order multi-step method. Numerical tests with exact solutions are presented to validate the accuracy of the method. Application to fluid structure interaction between an incompressible fluid and a compressible gas bubble is also presented.
Efficient parallel implicit methods for rotary-wing aerodynamics calculations

NASA Astrophysics Data System (ADS)

Wissink, Andrew M.

Euler/Navier-Stokes Computational Fluid Dynamics (CFD) methods are commonly used for prediction of the aerodynamics and aeroacoustics of modern rotary-wing aircraft. However, their widespread application to large complex problems is limited by a lack of adequate computing power. Parallel processing offers the potential for dramatic increases in computing power, but most conventional implicit solution methods are inefficient in parallel and new techniques must be adopted to realize its potential. This work proposes alternative implicit schemes for Euler/Navier-Stokes rotary-wing calculations which are robust and efficient in parallel. The first part of this work proposes an efficient parallelizable modification of the Lower Upper-Symmetric Gauss Seidel (LU-SGS) implicit operator used in the well-known Transonic Unsteady Rotor Navier Stokes (TURNS) code. The new hybrid LU-SGS scheme couples a point-relaxation approach of the Data Parallel-Lower Upper Relaxation (DP-LUR) algorithm for inter-processor communication with the Symmetric Gauss Seidel algorithm of LU-SGS for on-processor computations. With the modified operator, TURNS is implemented in parallel using Message Passing Interface (MPI) for communication. Numerical performance and parallel efficiency are evaluated on the IBM SP2 and Thinking Machines CM-5 multi-processors for a variety of steady-state and unsteady test cases. The hybrid LU-SGS scheme maintains the numerical performance of the original LU-SGS algorithm in all cases and shows a good degree of parallel efficiency. It exhibits a higher degree of robustness than DP-LUR for third-order upwind solutions. The second part of this work examines use of Krylov subspace iterative solvers for the nonlinear CFD solutions. The hybrid LU-SGS scheme is used as a parallelizable preconditioner. Two iterative methods are tested, Generalized Minimum Residual (GMRES) and Orthogonal s-Step Generalized Conjugate Residual (OSGCR).
The Newton method demonstrates good parallel performance on the IBM SP2, with OSGCR giving slightly better performance than GMRES on large numbers of processors. For steady and quasi-steady calculations, the convergence rate is accelerated but the overall solution time remains about the same as the standard hybrid LU-SGS scheme. For unsteady calculations, however, the Newton method maintains a higher degree of time-accuracy which allows the use of larger timesteps and results in CPU savings of 20-35%.
SAMSAN- MODERN NUMERICAL METHODS FOR CLASSICAL SAMPLED SYSTEM ANALYSIS

NASA Technical Reports Server (NTRS)

Frisch, H. P.

1994-01-01

SAMSAN was developed to aid the control system analyst by providing a self consistent set of computer algorithms that support large order control system design and evaluation studies, with an emphasis placed on sampled system analysis. Control system analysts have access to a vast array of published algorithms to solve an equally large spectrum of controls related computational problems. The analyst usually spends considerable time and effort bringing these published algorithms to an integrated operational status and often finds them less general than desired. SAMSAN reduces the burden on the analyst by providing a set of algorithms that have been well tested and documented, and that can be readily integrated for solving control system problems. Algorithm selection for SAMSAN has been biased toward numerical accuracy for large order systems with computational speed and portability being considered important but not paramount. In addition to containing relevant subroutines from EISPACK for eigen-analysis and from LINPACK for the solution of linear systems and related problems, SAMSAN contains the following not so generally available capabilities: 1) Reduction of a real non-symmetric matrix to block diagonal form via a real similarity transformation matrix which is well conditioned with respect to inversion, 2) Solution of the generalized eigenvalue problem with balancing and grading, 3) Computation of all zeros of the determinant of a matrix of polynomials, 4) Matrix exponentiation and the evaluation of integrals involving the matrix exponential, with option to first block diagonalize, 5) Root locus and frequency response for single variable transfer functions in the S, Z, and W domains, 6) Several methods of computing zeros for linear systems, and 7) The ability to generate documentation "on demand". All matrix operations in the SAMSAN algorithms assume non-symmetric matrices with real double precision elements.
There is no fixed size limit on any matrix in any SAMSAN algorithm; however, it is generally agreed by experienced users, and in the numerical error analysis literature, that computation with non-symmetric matrices of order greater than about 200 should be avoided or treated with extreme care. SAMSAN attempts to support the needs of application oriented analysis by providing: 1) a methodology with unlimited growth potential, 2) a methodology to insure that associated documentation is current and available "on demand", 3) a foundation of basic computational algorithms that most controls analysis procedures are based upon, 4) a set of check out and evaluation programs which demonstrate usage of the algorithms on a series of problems which are structured to expose the limits of each algorithm's applicability, and 5) capabilities which support both a priori and a posteriori error analysis for the computational algorithms provided. The SAMSAN algorithms are coded in FORTRAN 77 for batch or interactive execution and have been implemented on a DEC VAX computer under VMS 4.7. An effort was made to assure that the FORTRAN source code was portable and thus SAMSAN may be adaptable to other machine environments. The documentation is included on the distribution tape or can be purchased separately at the price below. SAMSAN version 2.0 was developed in 1982 and updated to version 3.0 in 1988.

Markov random field model-based edge-directed image interpolation.

PubMed

Li, Min; Nguyen, Truong Q

2008-07-01

This paper presents an edge-directed image interpolation algorithm. In the proposed algorithm, the edge directions are implicitly estimated with a statistical-based approach. In contrast to explicit edge directions, the local edge directions are indicated by length-16 weighting vectors. Implicitly, the weighting vectors are used to formulate a geometric regularity (GR) constraint (smoothness along edges and sharpness across edges), and the GR constraint is imposed on the interpolated image through the Markov random field (MRF) model. Furthermore, under the maximum a posteriori-MRF framework, the desired interpolated image corresponds to the minimal energy state of a 2-D random field given the low-resolution image. Simulated annealing methods are used to search for the minimal energy state from the state space. To lower the computational complexity of MRF, a single-pass implementation is designed, which performs nearly as well as the iterative optimization. Simulation results show that the proposed MRF model-based edge-directed interpolation method produces edges with strong geometric regularity. Compared to traditional methods and other edge-directed interpolation methods, the proposed method improves the subjective quality of the interpolated edges while maintaining a high PSNR level.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bolding, Simon R.; Cleveland, Mathew Allen; Morel, Jim E.

In this paper, we have implemented a new high-order low-order (HOLO) algorithm for solving thermal radiative transfer problems. The low-order (LO) system is based on the spatial and angular moments of the transport equation and a linear-discontinuous finite-element spatial representation, producing equations similar to the standard S2 equations. The LO solver is fully implicit in time and efficiently resolves the nonlinear temperature dependence at each time step. The high-order (HO) solver utilizes exponentially convergent Monte Carlo (ECMC) to give a globally accurate solution for the angular intensity to a fixed-source pure-absorber transport problem. This global solution is used to compute consistency terms, which require the HO and LO solutions to converge toward the same solution. The use of ECMC allows for the efficient reduction of statistical noise in the Monte Carlo solution, reducing inaccuracies introduced through the LO consistency terms. Finally, we compare results with an implicit Monte Carlo code for one-dimensional gray test problems and demonstrate the efficiency of ECMC over standard Monte Carlo in this HOLO algorithm.
Nonlinearly preconditioned semismooth Newton methods for variational inequality solution of two-phase flow in porous media

NASA Astrophysics Data System (ADS)

Yang, Haijian; Sun, Shuyu; Yang, Chao

2017-03-01

Most existing methods for solving two-phase flow problems in porous media do not take the physically feasible saturation fractions between 0 and 1 into account, which often destroys the numerical accuracy and physical interpretability of the simulation. To calculate the solution without the loss of this basic requirement, we introduce a variational inequality formulation of the saturation equilibrium with a box inequality constraint, and use a conservative finite element method for the spatial discretization and a backward differentiation formula with adaptive time stepping for the temporal integration. The resulting variational inequality system at each time step is solved by using a semismooth Newton algorithm. To accelerate the Newton convergence and improve the robustness, we employ a family of adaptive nonlinear elimination methods as a nonlinear preconditioner. Some numerical results are presented to demonstrate the robustness and efficiency of the proposed algorithm. A comparison is also included to show the superiority of the proposed fully implicit approach over the classical IMplicit Pressure-Explicit Saturation (IMPES) method in terms of the time step size and the total execution time measured on a parallel computer.
An exact variational method to calculate vibrational energies of five atom molecules beyond the normal mode approach

DOE PAGES

Yu, Hua-Gen

2002-01-01

We present a full dimensional variational algorithm to calculate vibrational energies of penta-atomic molecules. The quantum mechanical Hamiltonian of the system for J=0 is derived in a set of orthogonal polyspherical coordinates in the body-fixed frame without any dynamical approximation. Moreover, the vibrational Hamiltonian has been obtained in an explicitly Hermitian form. Variational calculations are performed in a direct product discrete variable representation basis set. The sine functions are used for the radial coordinates, whereas the Legendre polynomials are employed for the polar angles. For the azimuthal angles, the symmetrically adapted Fourier–Chebyshev basis functions are utilized. The eigenvalue problem is solved by a Lanczos iterative diagonalization algorithm. The preliminary application to methane is given. Ultimately, we made a comparison with previous results.
Some aspects of algorithm performance and modeling in transient analysis of structures

NASA Technical Reports Server (NTRS)

Adelman, H. M.; Haftka, R. T.; Robinson, J. C.

1981-01-01

The status of an effort to increase the efficiency of calculating transient temperature fields in complex aerospace vehicle structures is described. The advantages and disadvantages of explicit algorithms with variable time steps, known as the GEAR package, are described. Four test problems, used for evaluating and comparing various algorithms, were selected and finite-element models of the configurations are described. These problems include a space shuttle frame component, an insulated cylinder, a metallic panel for a thermal protection system, and a model of the wing of the space shuttle orbiter. Results generally indicate a preference for implicit over explicit algorithms for solution of transient structural heat transfer problems when the governing equations are stiff (typical of many practical problems such as insulated metal structures).
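The preference for implicit algorithms on stiff equations can be seen on a scalar model problem. The sketch below is an illustration only, not one of the abstract's test problems: the equation y' = -lam*y, the rate constant, the step size, and the function names are all assumptions.

```python
def explicit_euler(lam, h, steps, y0=1.0):
    """Forward Euler for y' = -lam * y: y_{n+1} = (1 - h*lam) * y_n."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def implicit_euler(lam, h, steps, y0=1.0):
    """Backward Euler for y' = -lam * y: y_{n+1} = y_n / (1 + h*lam)."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y


# Stiff case: the decay rate is fast compared with the step size (h*lam = 10).
lam, h, steps = 1000.0, 0.01, 20
print("explicit:", explicit_euler(lam, h, steps))  # magnitude grows each step
print("implicit:", implicit_euler(lam, h, steps))  # decays toward zero
```

With h*lam = 10 the explicit update multiplies the solution by -9 every step, so it diverges even though the true solution decays. The implicit update divides by 11 and stays stable, which is why an implicit scheme can take far larger steps on stiff problems.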
Implementation of a partitioned algorithm for simulation of large CSI problems

NASA Technical Reports Server (NTRS)

Alvin, Kenneth F.; Park, K. C.

1991-01-01

The implementation of a partitioned numerical algorithm for determining the dynamic response of coupled structure/controller/estimator finite-dimensional systems is reviewed. The partitioned approach leads to a set of coupled first and second-order linear differential equations which are numerically integrated with extrapolation and implicit step methods. The present software implementation, ACSIS, utilizes parallel processing techniques at various levels to optimize performance on a shared-memory concurrent/vector processing system. A general procedure for the design of controller and filter gains is also implemented, which utilizes the vibration characteristics of the structure to be solved. Also presented are: example problems; a user's guide to the software; the procedures and algorithm scripts; a stability analysis for the algorithm; and the source code for the parallel implementation.
Sparse Bayesian Learning for Nonstationary Data Sources

NASA Astrophysics Data System (ADS)

Fujimaki, Ryohei; Yairi, Takehisa; Machida, Kazuo

This paper proposes an online Sparse Bayesian Learning (SBL) algorithm for modeling nonstationary data sources. Although most learning algorithms implicitly assume that a data source does not change over time (stationary), real-world sources usually do change (nonstationary) because of factors such as dynamically changing environments, device degradation, and sudden failures. The proposed algorithm reduces to stationary online SBL when the time decay parameters are set to zero, so it can be interpreted as a single unified framework for online SBL with both stationary and nonstationary data sources. Tests on four types of benchmark problems and on actual stock price data have shown it to perform well.
A Pseudo-Temporal Multi-Grid Relaxation Scheme for Solving the Parabolized Navier-Stokes Equations

NASA Technical Reports Server (NTRS)

White, J. A.; Morrison, J. H.

1999-01-01

A multi-grid, flux-difference-split, finite-volume code, VULCAN, is presented for solving the elliptic and parabolized form of the equations governing three-dimensional, turbulent, calorically perfect and non-equilibrium chemically reacting flows. The space marching algorithms developed to improve convergence rate and/or reduce computational cost are emphasized. The algorithms presented are extensions to the class of implicit pseudo-time iterative, upwind space-marching schemes. A full approximation storage, full multi-grid scheme is also described which is used to accelerate the convergence of a Gauss-Seidel relaxation method. The multi-grid algorithm is shown to significantly improve convergence on high aspect ratio grids.
Finite element concepts in computational aerodynamics

NASA Technical Reports Server (NTRS)

Baker, A. J.

1978-01-01

Finite element theory was employed to establish an implicit numerical solution algorithm for the time averaged unsteady Navier-Stokes equations. Both the multidimensional and a time-split form of the algorithm were considered, the latter of particular interest for problem specification on a regular mesh. A Newton matrix iteration procedure is outlined for solving the resultant nonlinear algebraic equation systems. Multidimensional discretization procedures are discussed with emphasis on automated generation of specific nonuniform solution grids and accounting of curved surfaces. The time-split algorithm was evaluated with regard to accuracy and convergence properties for hyperbolic equations on rectangular coordinates. An overall assessment of the viability of the finite element concept for computational aerodynamics is made.
Numerically robust and efficient nonlocal electron transport in 2D DRACO simulations

NASA Astrophysics Data System (ADS)

Cao, Duc; Chenhall, Jeff; Moses, Greg; Delettrez, Jacques; Collins, Tim

2013-10-01

An improved implicit algorithm based on the Schurtz, Nicolai and Busquet (SNB) algorithm for nonlocal electron transport is presented. Validation with direct drive shock timing experiments and verification with the Goncharov nonlocal model in 1D LILAC simulations demonstrate the viability of this efficient algorithm for producing 2D Lagrangian radiation hydrodynamics direct drive simulations. Additionally, simulations provide strong incentive to further modify key parameters within the SNB theory, namely the "mean free path." An example 2D polar drive simulation to study 2D effects of the nonlocal flux as well as mean free path modifications will also be presented. This research was supported by the University of Rochester Laboratory for Laser Energetics.
Additive Runge-Kutta Schemes for Convection-Diffusion-Reaction Equations

NASA Technical Reports Server (NTRS)

Kennedy, Christopher A.; Carpenter, Mark H.

2001-01-01

Additive Runge-Kutta (ARK) methods are investigated for application to the spatially discretized one-dimensional convection-diffusion-reaction (CDR) equations. First, accuracy, stability, conservation, and dense output are considered for the general case when N different Runge-Kutta methods are grouped into a single composite method. Then, implicit-explicit, N = 2, additive Runge-Kutta ARK2 methods from third- to fifth-order are presented that allow for integration of stiff terms by an L-stable, stiffly-accurate explicit, singly diagonally implicit Runge-Kutta (ESDIRK) method while the nonstiff terms are integrated with a traditional explicit Runge-Kutta method (ERK). Coupling error terms are of equal order to those of the elemental methods. Derived ARK2 methods have vanishing stability functions for very large values of the stiff scaled eigenvalue, z^[I] goes to infinity, and retain high stability efficiency in the absence of stiffness, z^[I] goes to zero. Extrapolation-type stage-value predictors are provided based on dense-output formulae. Optimized methods minimize both leading order ARK2 error terms and Butcher coefficient magnitudes as well as maximize conservation properties. Numerical tests of the new schemes on a CDR problem show negligible stiffness leakage and near classical order convergence rates. However, tests on three simple singular-perturbation problems reveal generally predictable order reduction. Error control is best managed with a PID-controller. While results for the fifth-order method are disappointing, both the new third- and fourth-order methods are at least as efficient as existing ARK2 methods while offering error control and stage-value predictors.
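The implicit-explicit splitting idea behind these methods can be sketched with the simplest additive scheme, first-order IMEX Euler. This is not one of the ARK2/ESDIRK methods of the report. The model equation y' = -k*y + forcing(t), the function name `imex_euler`, and all parameter values are assumptions chosen for illustration: the stiff linear term is treated implicitly and the nonstiff forcing explicitly.

```python
import math


def imex_euler(k, forcing, h, steps, y0=1.0):
    """First-order IMEX Euler for y' = -k*y + forcing(t).

    The stiff term -k*y is evaluated at the new time level (implicit),
    the nonstiff forcing at the old time level (explicit):
        y_{n+1} = y_n + h*forcing(t_n) - h*k*y_{n+1}
    which can be solved for y_{n+1} in closed form.
    """
    y = y0
    for n in range(steps):
        t = n * h
        y = (y + h * forcing(t)) / (1.0 + h * k)
    return y


# Stiff coefficient with a step size far beyond the explicit stability limit.
y_end = imex_euler(k=1000.0, forcing=math.sin, h=0.1, steps=10)
print("y(1.0) is approximately", y_end)
```

For large k the exact solution quickly relaxes to roughly sin(t)/k, and the IMEX update tracks that behaviour at h*k = 100, a step size at which a fully explicit Euler update would amplify the solution by a factor of 99 per step.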
An analysis of supersonic flows with low-Reynolds number compressible two-equation turbulence models using LU finite volume implicit numerical techniques

NASA Technical Reports Server (NTRS)

Lee, J.

1994-01-01

A generalized flow solver using an implicit lower-upper (LU) diagonal decomposition based numerical technique has been coupled with three low-Reynolds number kappa-epsilon models for analysis of problems with engineering applications. The feasibility of using the LU technique to obtain efficient solutions to supersonic problems using the kappa-epsilon model has been demonstrated. The flow solver is then used to explore limitations and convergence characteristics of several popular two-equation turbulence models. Several changes to the LU solver have been made to improve the efficiency of turbulent flow predictions. In general, the low-Reynolds number kappa-epsilon models are easier to implement than the models with wall functions, but require a much finer near-wall grid to accurately resolve the physics. The three kappa-epsilon models use different approaches to characterize the near-wall regions of the flow. Therefore, the limitations imposed by the near-wall characteristics have been carefully resolved. The convergence characteristics of a particular model using a given numerical technique are also an important, but most often overlooked, aspect of turbulence model predictions. It is found that some convergence characteristics could be sacrificed for more accurate near-wall prediction. However, even this gain in accuracy is not sufficient to model the effects of an external pressure gradient imposed by a shock-wave/boundary-layer interaction. Additional work on turbulence models, especially for compressibility, is required since the solutions obtained with baseline turbulence models are in only reasonable agreement with the experimental data for the viscous interaction problems.
Distilling perfect GHZ states from two copies of non-GHZ-diagonal mixed states

NASA Astrophysics Data System (ADS)

Wang, Xin-Wen; Tang, Shi-Qing; Yuan, Ji-Bing; Zhang, Deng-Yu

2017-06-01

It has been shown that a nearly pure Greenberger-Horne-Zeilinger (GHZ) state could be distilled from a large (even infinite) number of GHZ-diagonal states that can be obtained by depolarizing general multipartite mixed states (non-GHZ-diagonal states) through sequences of (probabilistic) local operations and classical communications. We here demonstrate that perfect GHZ states can be extracted, with certain probabilities, from two copies of non-GHZ-diagonal mixed states when some conditions are satisfied. This result implies that it is not necessary to depolarize these entangled mixed states to the GHZ-diagonal type, and that they are better than GHZ-diagonal states for distillation of pure GHZ states. We find a wide class of multipartite entangled mixed states that fulfill the requirements. Moreover, we show that the obtained result can be applied to practical noisy environments, e.g., amplitude-damping channels. Our findings provide an important theoretical complement to conventional GHZ-state distillation protocols (designed for GHZ-diagonal states), and also have practical applications.
Effective Social Relationship Measurement and Cluster Based Routing in Mobile Opportunistic Networks

PubMed Central

Zeng, Feng; Zhao, Nan; Li, Wenjia

2017-01-01

In mobile opportunistic networks, the social relationship among nodes has an important impact on data transmission efficiency. Motivated by the strong sharing ability of “circles of friends” in communication networks such as Facebook, Twitter, WeChat and so on, we take a real-life example to show that social relationships among nodes consist of explicit and implicit parts. The explicit part comes from direct contact among nodes, and the implicit part can be measured through the “circles of friends”. We define the explicit and implicit social relationships between two nodes, assign adaptive weights to the explicit and implicit parts according to the contact features of the nodes, and design a distributed mechanism to construct the “circles of friends” of nodes, which are used to calculate the implicit part of the social relationship between nodes. Based on this measurement of social relationships, we propose a social-based clustering and routing scheme in which each node selects the nodes with close social relationships to form a local cluster, and a self-control method keeps all cluster members in close relationships with each other. A cluster-based message forwarding mechanism is designed for opportunistic routing, in which each node forwards a copy of the message only to nodes that have the destination node as a member of their local cluster. Simulation results show that the proposed social-based clustering and routing outperforms other classic routing algorithms. PMID:28498309
Effective Social Relationship Measurement and Cluster Based Routing in Mobile Opportunistic Networks.

PubMed

Zeng, Feng; Zhao, Nan; Li, Wenjia

2017-05-12

In mobile opportunistic networks, the social relationship among nodes has an important impact on data transmission efficiency. Motivated by the strong sharing ability of "circles of friends" in communication networks such as Facebook, Twitter, WeChat and so on, we take a real-life example to show that social relationships among nodes consist of explicit and implicit parts. The explicit part comes from direct contact among nodes, and the implicit part can be measured through the "circles of friends". We define the explicit and implicit social relationships between two nodes, assign adaptive weights to the explicit and implicit parts according to the contact features of the nodes, and design a distributed mechanism to construct the "circles of friends" of nodes, which are used to calculate the implicit part of the social relationship between nodes. Based on this measurement of social relationships, we propose a social-based clustering and routing scheme in which each node selects the nodes with close social relationships to form a local cluster, and a self-control method keeps all cluster members in close relationships with each other. A cluster-based message forwarding mechanism is designed for opportunistic routing, in which each node forwards a copy of the message only to nodes that have the destination node as a member of their local cluster. Simulation results show that the proposed social-based clustering and routing outperforms other classic routing algorithms.
An Empirical State Error Covariance Matrix for Batch State Estimation

NASA Technical Reports Server (NTRS)

Frisbee, Joseph H., Jr.

2011-01-01

State estimation techniques serve effectively to provide mean state estimates. However, the state error covariance matrices provided as part of these techniques suffer from some degree of lack of confidence in their ability to adequately describe the uncertainty in the estimated states. A specific problem with the traditional form of state error covariance matrices is that they represent only a mapping of the assumed observation error characteristics into the state space. Any errors that arise from other sources (environment modeling, precision, etc.) are not directly represented in a traditional, theoretical state error covariance matrix. Consider that an actual observation contains only measurement error and that an estimated observation contains all other errors, known and unknown. It then follows that a measurement residual (the difference between expected and observed measurements) contains all errors for that measurement. Therefore, a direct and appropriate inclusion of the actual measurement residuals in the state error covariance matrix will result in an empirical state error covariance matrix. This empirical state error covariance matrix will fully account for the error in the state estimate. By way of a literal reinterpretation of the equations involved in the weighted least squares estimation algorithm, it is possible to arrive at an appropriate, and formally correct, empirical state error covariance matrix. The first specific step of the method is to use the average form of the weighted measurement residual variance performance index rather than its usual total weighted residual form. Next it is helpful to interpret the solution to the normal equations as the average of a collection of sample vectors drawn from a hypothetical parent population. From here, using a standard statistical analysis approach, it directly follows as to how to determine the standard empirical state error covariance matrix. 
This matrix will contain the total uncertainty in the state estimate, regardless of the source of the uncertainty. Also, in its most straightforward form, the technique only requires supplemental calculations to be added to existing batch algorithms. The generation of this direct, empirical form of the state error covariance matrix is independent of the dimensionality of the observations. Mixed degrees of freedom for an observation set are allowed. As is the case with any simple, empirical sample variance problem, the presented approach offers an opportunity (at least in the case of weighted least squares) to investigate confidence interval estimates for the error covariance matrix elements. The diagonal or variance terms of the error covariance matrix have a particularly simple form to associate with either a multiple degree of freedom chi-square distribution (more approximate) or with a gamma distribution (less approximate). The off-diagonal or covariance terms of the matrix are less clear in their statistical behavior. However, the off-diagonal covariance matrix elements still lend themselves to standard confidence interval error analysis. The distributional forms associated with the off-diagonal terms are more varied and, perhaps, more approximate than those associated with the diagonal terms. Using a simple weighted least squares sample problem, results obtained through use of the proposed technique are presented. The example consists of a simple, two-observer triangulation problem with range-only measurements. Variations of this problem reflect an ideal case (perfect knowledge of the range errors) and a mismodeled case (incorrect knowledge of the range errors).
Explicit symplectic algorithms based on generating functions for relativistic charged particle dynamics in time-dependent electromagnetic field

NASA Astrophysics Data System (ADS)

Zhang, Ruili; Wang, Yulei; He, Yang; Xiao, Jianyuan; Liu, Jian; Qin, Hong; Tang, Yifa

2018-02-01

Relativistic dynamics of a charged particle in time-dependent electromagnetic fields has theoretical significance and a wide range of applications. The numerical simulation of relativistic dynamics is often multi-scale and requires accurate long-term numerical simulations. Therefore, explicit symplectic algorithms are preferable to non-symplectic methods and implicit symplectic algorithms. In this paper, we employ the proper time and express the Hamiltonian as the sum of exactly solvable terms and product-separable terms in space-time coordinates. Then, we give the explicit symplectic algorithms based on the generating functions of orders 2 and 3 for relativistic dynamics of a charged particle. The methodology is not new, having already been applied to non-relativistic dynamics of charged particles, but the algorithm for relativistic dynamics is of considerable significance in practical simulations, such as the secular simulation of runaway electrons in tokamaks.
Forward collision warning based on kernelized correlation filters

NASA Astrophysics Data System (ADS)

Pu, Jinchuan; Liu, Jun; Zhao, Yong

2017-07-01

A vehicle detection and tracking system is an indispensable means of reducing the occurrence of traffic accidents. The nearest vehicle is the one most likely to cause harm, so this paper focuses on the nearest vehicle in the region of interest (ROI). For such a system, high accuracy, real-time performance and intelligence are the basic requirements. In this paper, we set up a system that combines the advanced KCF tracking algorithm with the HaarAdaBoost detection algorithm. The KCF algorithm reduces computation time and increases speed through cyclic shifts and diagonalization, and therefore satisfies the real-time requirement. Haar features likewise offer simple operation and high detection speed. The combination of these two algorithms yields a clear improvement in the running rate of the system compared with previous work. The detection result of the HaarAdaBoost classifier provides the initial value for the KCF algorithm, which removes the need for manual marking of the car in the initial phase (a weakness of KCF) and makes the system more intelligent. Haar detection and KCF tracking with Histogram of Oriented Gradient (HOG) features ensure the accuracy of the system. We evaluate the performance of the framework on a self-collected dataset. The experimental results demonstrate that the proposed method is robust and runs in real time. The algorithm adapts effectively to illumination variation; even at night it meets the detection and tracking requirements, which is an improvement over previous work.
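The "cyclic shift and diagonalization" step mentioned in this record rests on a standard fact: a matrix whose rows are cyclic shifts of one vector (a circulant matrix) is diagonalized by the discrete Fourier transform, so a product with it reduces to element-wise multiplication in the Fourier domain. The sketch below only checks that identity on a small example; it is not the tracker of the record, and the function names and sample vectors are assumptions.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (for illustration only)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        out.append(s / n if inverse else s)
    return out

def circulant_matvec_direct(c, v):
    """Multiply by the circulant matrix C[i][j] = c[(i - j) % n], built from cyclic shifts of c."""
    n = len(c)
    return [sum(c[(i - j) % n] * v[j] for j in range(n)) for i in range(n)]

def circulant_matvec_fourier(c, v):
    """Same product through the DFT: C v = IDFT(DFT(c) * DFT(v))."""
    c_hat, v_hat = dft(c), dft(v)
    return dft([a * b for a, b in zip(c_hat, v_hat)], inverse=True)

c = [4.0, 1.0, 0.0, 2.0]
v = [1.0, -1.0, 3.0, 0.5]
direct = circulant_matvec_direct(c, v)
fourier = [z.real for z in circulant_matvec_fourier(c, v)]
```

In practice a fast Fourier transform would replace the naive `dft` above, which is where the speed advantage comes from.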
Frequency-domain beamformers using conjugate gradient techniques for speech enhancement.

PubMed

Zhao, Shengkui; Jones, Douglas L; Khoo, Suiyang; Man, Zhihong

2014-09-01

A multiple-iteration constrained conjugate gradient (MICCG) algorithm and a single-iteration constrained conjugate gradient (SICCG) algorithm are proposed to realize the widely used frequency-domain minimum-variance-distortionless-response (MVDR) beamformers and the resulting algorithms are applied to speech enhancement. The algorithms are derived based on the Lagrange method and the conjugate gradient techniques. The implementations of the algorithms avoid any form of explicit or implicit autocorrelation matrix inversion. Theoretical analysis establishes formal convergence of the algorithms. Specifically, the MICCG algorithm is developed based on a block adaptation approach and it generates a finite sequence of estimates that converge to the MVDR solution. For limited data records, the estimates of the MICCG algorithm are better than the conventional estimators and equivalent to the auxiliary vector algorithms. The SICCG algorithm is developed based on a continuous adaptation approach with a sample-by-sample updating procedure and the estimates asymptotically converge to the MVDR solution. An illustrative example using synthetic data from a uniform linear array is studied and an evaluation on real data recorded by an acoustic vector sensor array is demonstrated. The performance of the MICCG and SICCG algorithms is compared with that of state-of-the-art approaches.
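To make the idea of avoiding matrix inversion concrete, the following minimal sketch computes MVDR weights w = R^{-1} d / (d^H R^{-1} d) by solving R x = d with a plain (unconstrained) conjugate gradient iteration. This is not the MICCG or SICCG algorithm of the record; the 3x3 matrix `R` and steering vector `d` are hypothetical values chosen only so that R is Hermitian positive definite.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def vdot(x, y):
    """Inner product x^H y (conjugating the first argument)."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def conjugate_gradient(A, b, tol=1e-20, max_iter=100):
    """Solve A x = b for Hermitian positive definite A without forming A^{-1}."""
    x = [0j] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)
    rs = vdot(r, r).real             # squared residual norm
    for _ in range(max_iter):
        if rs < tol:
            break
        Ap = matvec(A, p)
        alpha = rs / vdot(p, Ap).real
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = vdot(r, r).real
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def mvdr_weights(R, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d), with R^{-1} d obtained by CG."""
    x = conjugate_gradient(R, d)
    scale = vdot(d, x)
    return [xi / scale for xi in x]

# Hypothetical Hermitian positive definite autocorrelation matrix and steering vector.
R = [[2.0 + 0j, 0.5 - 0.5j, 0j],
     [0.5 + 0.5j, 2.0 + 0j, 0.3j],
     [0j, -0.3j, 1.5 + 0j]]
d = [1.0 + 0j, 1j, -1.0 + 0j]
w = mvdr_weights(R, d)
```

The resulting weights satisfy the distortionless constraint w^H d = 1, which can be checked with `vdot(w, d)`.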
Recovery Discontinuous Galerkin Jacobian-free Newton-Krylov Method for all-speed flows

DOE Office of Scientific and Technical Information (OSTI.GOV)

HyeongKae Park; Robert Nourgaliev; Vincent Mousseau

2008-07-01

There is increasing interest in developing the next generation of simulation tools for advanced nuclear energy systems. These tools will utilize state-of-the-art numerical algorithms and computer science technology in order to maximize predictive capability, support advanced reactor designs, reduce uncertainty and increase safety margins. In analyzing nuclear energy systems, we are interested in compressible low-Mach number, high heat flux flows with a wide range of Re, Ra, and Pr numbers. Under these conditions, the focus is placed on turbulent heat transfer, in contrast to other industries whose main interest is in capturing turbulent mixing. Our objective is to develop single-point turbulence closure models for large-scale engineering CFD codes, using Direct Numerical Simulation (DNS) or Large Eddy Simulation (LES) tools, which requires very accurate and efficient numerical algorithms. The focus of this work is placed on fully-implicit, high-order spatiotemporal discretization based on the discontinuous Galerkin method solving the conservative form of the compressible Navier-Stokes equations. The method utilizes a local reconstruction procedure derived from the weak formulation of the problem, which is inspired by the recovery diffusion flux algorithm of van Leer and Nomura and by the piecewise parabolic reconstruction in the finite volume method. The developed methodology is integrated into the Jacobian-free Newton-Krylov framework to allow a fully-implicit solution of the problem.
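The "Jacobian-free" part of a Newton-Krylov method usually means that the Krylov solver only needs products of the Jacobian with a vector, and that these are approximated by a finite difference of the residual, J(u) v ≈ (F(u + h v) - F(u)) / h. The sketch below illustrates that approximation on a toy problem; the residual function, the step-size heuristic and all names are assumptions, not taken from the record.

```python
import math

def residual(u):
    """Hypothetical two-unknown nonlinear residual F(u), for illustration only."""
    x, y = u
    return [x * x + y - 3.0, math.sin(x) + 2.0 * y * y]

def jacobian_free_matvec(F, u, v, eps=1e-7):
    """Approximate J(u) v by (F(u + h v) - F(u)) / h without assembling J."""
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    h = eps * (1.0 + norm_u) / norm_v      # one common step-size heuristic
    Fu = F(u)
    Fp = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(a - b) / h for a, b in zip(Fp, Fu)]

u = [1.0, 2.0]
v = [0.5, -1.0]
approx = jacobian_free_matvec(residual, u, v)
# Analytic Jacobian of the toy residual is [[2x, 1], [cos(x), 4y]]; used only to check the sketch.
exact = [2.0 * u[0] * v[0] + v[1], math.cos(u[0]) * v[0] + 4.0 * u[1] * v[1]]
```

Each such product costs one extra residual evaluation, which is why the approach pairs naturally with Krylov solvers that never need the matrix entries themselves.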

The Chorus Conflict and Loss of Separation Resolution Algorithms

NASA Technical Reports Server (NTRS)

Butler, Ricky W.; Hagen, George E.; Maddalon, Jeffrey M.

2013-01-01

The Chorus software is designed to investigate near-term, tactical conflict and loss of separation detection and resolution concepts for air traffic management. This software is currently being used in two different problem domains: en-route self-separation and sense and avoid for unmanned aircraft systems. This paper describes the core resolution algorithms that are part of Chorus. The combination of several features of the Chorus program distinguishes this software from other approaches to conflict and loss of separation resolution. First, the program stores a history of state information over time which enables it to handle communication dropouts and take advantage of previous input data. Second, the underlying conflict algorithms find resolutions that solve the most urgent conflict, but also seek to prevent secondary conflicts with the other aircraft. Third, if the program is run on multiple aircraft, and the two aircraft maneuver at the same time, the result will be implicitly coordinated. This implicit coordination property is established by ensuring that a resolution produced by Chorus will comply with a mathematically defined criterion whose correctness has been formally verified. Fourth, the program produces both instantaneous solutions and kinematic solutions, which are based on simple acceleration models. Finally, the program provides resolutions for recovery from loss of separation. Versions of this software are implemented in both Java and C++.
A method for brain 3D surface reconstruction from MR images

NASA Astrophysics Data System (ADS)

Zhao, De-xin

2014-09-01

Because encephalic tissues are highly irregular, three-dimensional (3D) modeling of the brain always leads to complicated computation. In this paper, we explore an efficient method for brain surface reconstruction from magnetic resonance (MR) images of the head, which is helpful for surgery planning and tumor localization. A heuristic algorithm is proposed for surface triangle mesh generation with preserved features, and the diagonal length is used as the heuristic information to optimize the shape of the triangles. The experimental results show that our approach not only reduces the computational complexity, but also achieves 3D visualization of good quality.
A parallel algorithm for 2D visco-acoustic frequency-domain full-waveform inversion: application to a dense OBS data set

NASA Astrophysics Data System (ADS)

Sourbier, F.; Operto, S.; Virieux, J.

2006-12-01

We present a distributed-memory parallel algorithm for 2D visco-acoustic full-waveform inversion of wide-angle seismic data. Our code is written in Fortran 90 and uses MPI for parallelism. The algorithm was applied to a real wide-angle data set recorded by 100 OBSs with a 1-km spacing in the eastern Nankai trough (Japan) to image the deep structure of the subduction zone. Full-waveform inversion is applied sequentially to discrete frequencies by proceeding from the low to the high frequencies. The inverse problem is solved with a classic gradient method. Full-waveform modeling is performed with a frequency-domain finite-difference method. In the frequency domain, solving the wave equation requires the solution of a large unsymmetric system of linear equations. We use the massively parallel direct solver MUMPS (http://www.enseeiht.fr/irit/apo/MUMPS) for distributed-memory computers to solve this system. The MUMPS solver is based on a multifrontal method for the parallel factorization. The MUMPS algorithm is subdivided into 3 main steps. First, a symbolic analysis step re-orders the matrix coefficients to minimize the fill-in of the matrix during the subsequent factorization and estimates the assembly tree of the matrix. Second, the factorization is performed with dynamic scheduling to accommodate numerical pivoting and provides the LU factors distributed over all the processors. Third, the solve phase is performed for multiple sources. To compute the gradient of the cost function, 2 simulations per shot are required (one to compute the forward wavefield and one to back-propagate residuals). The multi-source solves can be performed in parallel with MUMPS. In the end, each processor stores in core a sub-domain of all the solutions. These distributed solutions can be exploited to compute in parallel the gradient of the cost function. 
Since the gradient of the cost function is a weighted stack of the shot and residual solutions of MUMPS, each processor computes the corresponding sub-domain of the gradient. In the end, the gradient is centralized on the master processor using a collective communication. The gradient is scaled by the diagonal elements of the Hessian matrix. This scaling is computed only once per frequency before the first iteration of the inversion. Estimation of the diagonal terms of the Hessian requires performing one simulation per non-redundant shot and receiver position. The same strategy as the one used for the gradient is used to compute the diagonal Hessian in parallel. This algorithm was applied to a dense wide-angle data set recorded by 100 OBSs in the eastern Nankai trough, offshore Japan. Thirteen frequencies ranging from 3 to 15 Hz were inverted. Twenty iterations per frequency were computed, leading to 260 tomographic velocity models of increasing resolution. The velocity model dimensions are 105 km x 25 km, corresponding to a finite-difference grid of 4201 x 1001 points with a 25-m grid interval. The number of shots was 1005 and the number of inverted OBS gathers was 93. The inversion requires 20 days on six 32-bit dual-processor nodes with 4 Gbytes of RAM per node when only the LU factorization is performed in parallel. Preliminary estimates of the time required to perform the inversion with the fully parallelized code are 6 and 4 days using 20 and 50 processors, respectively.
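The grid size and model count quoted in this record follow from simple arithmetic, sketched below. The helper name `grid_points` is hypothetical, and the calculation assumes a uniform grid that includes both end points of each side.

```python
def grid_points(extent_m, spacing_m):
    """Number of grid nodes spanning extent_m at uniform spacing_m, counting both end points."""
    return extent_m // spacing_m + 1

nx = grid_points(105_000, 25)   # 105 km side at a 25 m interval
nz = grid_points(25_000, 25)    # 25 km side at a 25 m interval
n_models = 13 * 20              # 13 frequencies, 20 iterations each
print(nx, nz, n_models)         # prints: 4201 1001 260
```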
Numerical studies of unsteady two dimensional subsonic flows using the ICE method. Ph.D. Thesis - Toledo Univ.

NASA Technical Reports Server (NTRS)

Wieber, P. R.

1973-01-01

A numerical program was developed to compute transient compressible and incompressible laminar flows in two dimensions with multicomponent mixing and chemical reaction. The algorithm used the Los Alamos Scientific Laboratory ICE (Implicit Continuous-Fluid Eulerian) method as its base. The program can compute both high and low speed compressible flows. The numerical program incorporating the stabilization techniques was quite successful in treating both old and new problems. Detailed calculations of coaxial flow very close to the entry plane were possible. The program treated complex flows such as the formation and downstream growth of a recirculation cell. An implicit solution of the species equation predicted mixing and reaction rates which compared favorably with the literature.
Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

NASA Astrophysics Data System (ADS)

Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; Govind, Niranjan; Yang, Chao

2017-12-01

We present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.
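The self-adjointness claim in this record can be checked numerically on a small case: if M and K are symmetric and K is positive definite, then <x, MK y>_K = x^T K M K y = <MK x, y>_K even though MK itself is not symmetric. The 2x2 matrices and vectors below are hypothetical values chosen for illustration, not data from the record.

```python
def matmul(A, B):
    """Dense matrix-matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# Hypothetical symmetric M and symmetric positive definite K.
M = [[2.0, 1.0], [1.0, 3.0]]
K = [[4.0, -1.0], [-1.0, 2.0]]
MK = matmul(M, K)                      # [[7, 0], [1, 5]]: not symmetric

def k_inner(x, y):
    """K-inner product <x, y>_K = x^T K y."""
    return dot(x, matvec(K, y))

x = [1.0, -2.0]
y = [0.5, 3.0]
lhs = k_inner(x, matvec(MK, y))        # <x, MK y>_K
rhs = k_inner(matvec(MK, x), y)        # <MK x, y>_K
```

Because `lhs` and `rhs` agree, symmetric eigensolver machinery can be applied to MK once ordinary inner products are replaced by the K-inner product.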
Simulation of laminar and turbulent compressible internal flows by a finite element method

NASA Astrophysics Data System (ADS)

Rebaine, Ali

1997-08-01

This work concerns the numerical simulation of two-dimensional laminar and turbulent compressible internal flows, with particular attention to flows in supersonic ejectors. The Navier-Stokes equations are written in conservative form and use as independent variables the so-called enthalpic variables, namely the static pressure, the momentum and the specific total enthalpy. A stable variational formulation of the Navier-Stokes equations is used. It is based on the SUPG (Streamline Upwinding Petrov Galerkin) method and uses an operator for capturing strong gradients. A turbulence model for the simulation of flows in ejectors is developed. It separates two distinct regions: a region close to the solid wall, where the Baldwin and Lomax model is used, and another, far from the wall, where a new formulation based on the Schlichting model for jets is proposed. A technique for computing the turbulent viscosity on an unstructured mesh is implemented. The spatial discretization of the variational form is carried out with the finite element method using a mixed approximation: quadratic for the components of the momentum and of the velocity, and linear for the remaining variables. The temporal discretization is carried out by a finite difference method using the implicit Euler scheme. The matrix system resulting from the space-time discretization is solved with the GMRES algorithm using a diagonal preconditioner. Numerical validations were carried out on several types of nozzles and ejectors. The main validation consists of the simulation of the flow in the ejector tested at the NASA Lewis research center. The results obtained compare very well with those of previous work and are clearly superior for turbulent flows in ejectors.
Parallelized traveling cluster approximation to study numerically spin-fermion models on large lattices

DOE PAGES

Mukherjee, Anamitra; Patel, Niravkumar D.; Bishop, Chris; ...

2015-06-08

Lattice spin-fermion models are quite important to study correlated systems where quantum dynamics allows for a separation between slow and fast degrees of freedom. The fast degrees of freedom are treated quantum mechanically while the slow variables, generically referred to as the “spins,” are treated classically. At present, exact diagonalization coupled with classical Monte Carlo (ED + MC) is extensively used to solve numerically a general class of lattice spin-fermion problems. In this common setup, the classical variables (spins) are treated via the standard MC method while the fermion problem is solved by exact diagonalization. The “traveling cluster approximation” (TCA) is a real space variant of the ED + MC method that allows spin-fermion problems to be solved on lattices with up to 10^3 sites. In this paper, we present a novel reorganization of the TCA algorithm in a manner that can be efficiently parallelized. Finally, this allows us to solve generic spin-fermion models easily on 10^4 lattice sites and with some effort on 10^5 lattice sites, representing the record lattice sizes studied for this family of models.
Diffusion-driven self-assembly of rodlike particles: Monte Carlo simulation on a square lattice

NASA Astrophysics Data System (ADS)

Lebovka, Nikolai I.; Tarasevich, Yuri Yu.; Gigiberiya, Volodymyr A.; Vygornitskii, Nikolai V.

2017-05-01

The diffusion-driven self-assembly of rodlike particles was studied by means of Monte Carlo simulation. The rods were represented as linear k-mers (i.e., particles occupying k adjacent sites). In the initial state, they were deposited onto a two-dimensional square lattice of size L × L up to the jamming concentration using a random sequential adsorption algorithm. The size of the lattice, L, was varied from 128 to 2048, and periodic boundary conditions were applied along both x and y axes, while the length of the k-mers (determining the aspect ratio) was varied from 2 to 12. The k-mers oriented along the x and y directions (kx-mers and ky-mers, respectively) were deposited equiprobably. In the course of the simulation, the numbers of intraspecific and interspecific contacts between the same sort and between different sorts of k-mers, respectively, were calculated. Both the shift ratio of the actual number of shifts along the longitudinal or transverse axes of the k-mers and the electrical conductivity of the system were also examined. For the initial random configuration, quite different self-organization behavior was observed for short and long k-mers. For long k-mers (k ≥ 6), three main stages of diffusion-driven spatial segregation (self-assembly) were identified: the initial stage, reflecting destruction of the jamming state; the intermediate stage, reflecting continuous cluster coarsening and labyrinth pattern formation; and the final stage, reflecting the formation of diagonal stripe domains. Additional examination of two artificially constructed initial configurations showed that this pattern of diagonal stripe domains is an attractor, i.e., any spatial distribution of k-mers tends to transform into diagonal stripes. Nevertheless, the time for relaxation to the steady state increases substantially as the lattice size grows.
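The initial deposition step described in this record, random sequential adsorption of k-mers up to jamming on a periodic square lattice, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `rsa_kmers`, the grid encoding and the small lattice size are hypothetical, and the diffusion stage of the study is not modeled.

```python
import random

def rsa_kmers(L, k, seed=0):
    """Random sequential adsorption of linear k-mers on an L x L periodic square lattice.

    Returns the occupancy grid at jamming: 0 = empty, 1 = site of an x-oriented
    k-mer, 2 = site of a y-oriented k-mer.
    """
    rng = random.Random(seed)
    grid = [[0] * L for _ in range(L)]

    def cells(x, y, horizontal):
        # Sites covered by a k-mer starting at (x, y), wrapping periodically.
        if horizontal:
            return [((x + i) % L, y) for i in range(k)]
        return [(x, (y + i) % L) for i in range(k)]

    # Try every (start site, orientation) once in random order. Nothing is ever
    # removed, so a blocked placement stays blocked and one pass reaches jamming.
    candidates = [(x, y, h) for x in range(L) for y in range(L) for h in (True, False)]
    rng.shuffle(candidates)
    for x, y, h in candidates:
        covered = cells(x, y, h)
        if all(grid[cy][cx] == 0 for cx, cy in covered):
            for cx, cy in covered:
                grid[cy][cx] = 1 if h else 2
    return grid

grid = rsa_kmers(L=32, k=4, seed=1)
coverage = sum(1 for row in grid for s in row if s != 0) / (32 * 32)
```

Visiting each candidate placement once in shuffled order is a design choice made here to guarantee termination; repeated random attempts with rejection, the more usual description of the procedure, produce the same kind of jammed state but need a separate stopping test.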
Preconditioned Mixed Spectral Element Methods for Elasticity and Stokes Problems

NASA Technical Reports Server (NTRS)

Pavarino, Luca F.

1996-01-01

Preconditioned iterative methods for the indefinite systems obtained by discretizing the linear elasticity and Stokes problems with mixed spectral elements in three dimensions are introduced and analyzed. The resulting stiffness matrices have the structure of saddle point problems with a penalty term, which is associated with the Poisson ratio for elasticity problems or with stabilization techniques for Stokes problems. The main results of this paper show that the convergence rate of the resulting algorithms is independent of the penalty parameter and of the number of spectral elements Nu, and only mildly dependent on the spectral degree eta via the inf-sup constant. The preconditioners proposed for the whole indefinite system are block-diagonal and block-triangular. Numerical experiments presented in the final section show that these algorithms are a practical and efficient strategy for the iterative solution of the indefinite problems arising from mixed spectral element discretizations of elliptic systems.
Matrix-Product-State Algorithm for Finite Fractional Quantum Hall Systems

NASA Astrophysics Data System (ADS)

Liu, Zhao; Bhatt, R. N.

2015-09-01

Exact diagonalization is a powerful tool to study fractional quantum Hall (FQH) systems. However, its capability is limited by the exponentially increasing computational cost. In order to overcome this difficulty, density-matrix-renormalization-group (DMRG) algorithms were developed for much larger system sizes. Very recently, it was realized that some model FQH states have an exact matrix-product-state (MPS) representation. Motivated by this, here we report an MPS code, which is closely related to, but different from, traditional DMRG language, for finite FQH systems on the cylinder geometry. By representing the many-body Hamiltonian as a matrix-product-operator (MPO) and using single-site update and density matrix correction, we show that our code can efficiently search for the ground state of various FQH systems. We also compare the performance of our code with traditional DMRG. The possible generalization of our code to infinite FQH systems and other physical systems is also discussed.
High speed corner and gap-seal computations using an LU-SGS scheme

NASA Technical Reports Server (NTRS)

Coirier, William J.

1989-01-01

The hybrid Lower-Upper Symmetric Gauss-Seidel (LU-SGS) algorithm was added to a widely used series of 2D/3D Euler/Navier-Stokes solvers and was demonstrated for a particular class of high-speed flows. A limited study was conducted to compare the hybrid LU-SGS for approximate Newton iteration and diagonalized Beam-Warming (DBW) schemes on a work and convergence history basis. The hybrid LU-SGS algorithm is more efficient and easier to implement than the DBW scheme originally present in the code for the cases considered. The code was validated for the hypersonic flow through two mutually perpendicular flat plates and then used to investigate the flow field in and around a simplified scramjet module gap seal configuration. Due to the similarities, the gap seal flow was compared to hypersonic corner flow at the same freestream conditions and Reynolds number.
Trotting, pacing and bounding by a quadruped robot.

PubMed

Raibert, M H

1990-01-01

This paper explores the quadruped running gaits that use the legs in pairs: the trot (diagonal pairs), the pace (lateral pairs), and the bound (front and rear pairs). Rather than study these gaits in quadruped animals, we studied them in a quadruped robot. We found that each of the gaits that use the legs in pairs can be transformed into a common underlying gait, a virtual biped gait. Once transformed, a single set of control algorithms produces all three gaits, with modest parameter variations between them. The control algorithms manipulated rebound height, running speed, and body attitude, while a low-level mechanism coordinated the behavior of the legs in each pair. The approach was tested with laboratory experiments on a four-legged robot. Data are presented that show the details of the running motion for the three gaits and for transitions from one gait to another.
A novel active disturbance rejection based tracking design for laser system with quadrant photodetector

NASA Astrophysics Data System (ADS)

Manojlović, Stojadin M.; Barbarić, Žarko P.; Mitrović, Srđan T.

2015-06-01

A new tracking design for laser systems with different arrangements of a quadrant photodetector, based on the principle of active disturbance rejection control, is suggested. The detailed models of the quadrant photodetector with standard add-subtract, difference-over-sum and diagonal-difference-over-sum algorithms for displacement signals are included in the control loop. Target motion, non-linearity of the photodetector, parameter perturbations and exterior disturbances are treated as a total disturbance. Active disturbance rejection controllers with linear extended state observers for total disturbance estimation and rejection are designed. The proposed methods are analysed in the frequency domain to quantify their stability characteristics and disturbance rejection performance. Simulations show that tracking errors are effectively compensated, keeping the laser spot in the area near the centre of the quadrant photodetector where the mentioned algorithms have the highest sensitivity, which provides tracking of manoeuvring targets with high accuracy.
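The three displacement-signal algorithms named in this abstract can be sketched as follows. The formulas are the commonly used textbook forms, not taken from the paper itself; the quadrant labelling (a = upper right, b = upper left, c = lower left, d = lower right) and the function names are assumptions, and the detector non-linearity that the authors model is ignored.

```python
def add_subtract(a, b, c, d):
    """Raw displacement signals from the four quadrant outputs.

    Assumed layout: a = upper right, b = upper left,
    c = lower left, d = lower right.
    """
    ex = (a + d) - (b + c)  # right half minus left half
    ey = (a + b) - (c + d)  # upper half minus lower half
    return ex, ey


def difference_over_sum(a, b, c, d):
    """Add-subtract signals normalized by the total received power."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("no signal on the detector")
    ex, ey = add_subtract(a, b, c, d)
    return ex / total, ey / total


def diagonal_difference_over_sum(a, b, c, d):
    """Signals along the two diagonal axes (frame rotated by 45 degrees)."""
    if a + c == 0 or b + d == 0:
        raise ValueError("no signal on a diagonal pair")
    eu = (a - c) / (a + c)
    ev = (b - d) / (b + d)
    return eu, ev


# A spot shifted toward the right half of the detector.
print(add_subtract(3.0, 1.0, 1.0, 3.0))
print(difference_over_sum(3.0, 1.0, 1.0, 3.0))
print(diagonal_difference_over_sum(3.0, 1.0, 1.0, 3.0))
```

The normalized variants are insensitive to the overall optical power, which is one reason they are often preferred over the plain add-subtract form.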
A tensor network approach to many-body localization

NASA Astrophysics Data System (ADS)

Yu, Xiongjie; Pekker, David; Clark, Bryan

Understanding the many-body localized phase requires access to eigenstates in the middle of the many-body spectrum. While exact diagonalization is able to access these eigenstates, it is restricted to system sizes of about 22 spins. To overcome this limitation, we develop tensor network algorithms which increase the accessible system size by an order of magnitude. We describe both our new algorithms and the additional physics about MBL we can extract from them. For example, we demonstrate the power of these methods by verifying the breakdown of the Eigenstate Thermalization Hypothesis (ETH) in the many-body localized phase of the random field Heisenberg model, and show the saturation of entanglement in the MBL phase and generate eigenstates that differ by local excitations. Work was supported by AFOSR FA9550-10-1-0524 and FA9550-12-1-0057, the Kaufmann foundation, and SciDAC FG02-12ER46875.
New imaging algorithm in diffusion tomography

NASA Astrophysics Data System (ADS)

Klibanov, Michael V.; Lucas, Thomas R.; Frank, Robert M.

1997-08-01

A novel imaging algorithm for diffusion/optical tomography is presented for the case of the time dependent diffusion equation. Numerical tests are conducted for ranges of parameters realistic for applications to early breast cancer diagnosis using ultrafast laser pulses. This is a perturbation-like method which works for both homogeneous and heterogeneous background media. Its main innovation lies in a new approach for a novel linearized problem (LP). Such an LP is derived and reduced to a boundary value problem for a coupled system of elliptic partial differential equations. As is well known, the solution of such a system amounts to the factorization of well-conditioned, sparse matrices with few non-zero entries clustered along the diagonal, which can be done very rapidly. Thus, the main advantages of this technique are that it is fast and accurate. The authors call this approach the elliptic systems method (ESM). The ESM can be extended for other data collection schemes.
Reinforcement Learning for Constrained Energy Trading Games With Incomplete Information.

PubMed

Wang, Huiwei; Huang, Tingwen; Liao, Xiaofeng; Abu-Rub, Haitham; Chen, Guo

2017-10-01

This paper considers the problem of designing adaptive learning algorithms to seek the Nash equilibrium (NE) of the constrained energy trading game among individually strategic players with incomplete information. In this game, each player uses the learning automaton scheme to generate the action probability distribution based on his/her private information for maximizing his/her own averaged utility. It is shown that if one of the admissible mixed-strategies converges to the NE with probability one, then the averaged utility and trading quantity almost surely converge to their expected ones, respectively. For the given discontinuous pricing function, the utility function has already been proved to be upper semicontinuous and payoff secure, which guarantees the existence of the mixed-strategy NE. By the strict diagonal concavity of the regularized Lagrange function, the uniqueness of the NE is also guaranteed. Finally, an adaptive learning algorithm is provided to generate the strategy probability distribution for seeking the mixed-strategy NE.
Numerical algorithms based on Galerkin methods for the modeling of reactive interfaces in photoelectrochemical (PEC) solar cells

NASA Astrophysics Data System (ADS)

Harmon, Michael; Gamba, Irene M.; Ren, Kui

2016-12-01

This work concerns the numerical solution of a coupled system of self-consistent reaction-drift-diffusion-Poisson equations that describes the macroscopic dynamics of charge transport in photoelectrochemical (PEC) solar cells with reactive semiconductor and electrolyte interfaces. We present three numerical algorithms, mainly based on a mixed finite element and a local discontinuous Galerkin method for spatial discretization, with carefully chosen numerical fluxes, and implicit-explicit time stepping techniques, for solving the time-dependent nonlinear systems of partial differential equations. We perform computational simulations under various model parameters to demonstrate the performance of the proposed numerical algorithms as well as the impact of these parameters on the solution to the model.
Flux-split algorithms for flows with non-equilibrium chemistry and vibrational relaxation

NASA Technical Reports Server (NTRS)

Grossman, B.; Cinnella, P.

1990-01-01

The present consideration of numerical computation methods for gas flows with nonequilibrium chemistry thermodynamics gives attention to an equilibrium model, a general nonequilibrium model, and a simplified model based on vibrational relaxation. Flux-splitting procedures are developed for the fully-coupled inviscid equations encompassing fluid dynamics and both chemical and internal energy-relaxation processes. A fully coupled and implicit large-block structure is presented which embodies novel forms of flux-vector split and flux-difference split algorithms valid for nonequilibrium flow; illustrative high-temperature shock tube and nozzle flow examples are given.
Computation of the shock-wave boundary layer interaction with flow separation

NASA Technical Reports Server (NTRS)

Ardonceau, P.; Alziary, T.; Aymer, D.

1980-01-01

The boundary layer concept is used to describe the flow near the wall. The external flow is approximated by a pressure displacement relationship (tangent wedge in linearized supersonic flow). The boundary layer equations are solved in finite difference form and the question of the existence and uniqueness of the solution is considered for the direct problem (assumed pressure) or converse problem (assumed displacement thickness, friction ratio). The coupling algorithm presented implicitly processes the downstream boundary condition necessary to correctly define the interacting boundary layer problem. The algorithm uses a Newton linearization technique to provide fast convergence.

Machine Learning-based Intelligent Formal Reasoning and Proving System

NASA Astrophysics Data System (ADS)

Chen, Shengqing; Huang, Xiaojian; Fang, Jiaze; Liang, Jia

2018-03-01

Reasoning systems can be used in many fields, and improving reasoning efficiency is the core concern in designing such a system. By combining a formal description of formal proofs with a regular matching algorithm, and then introducing a machine learning algorithm, the intelligent formal reasoning and verification system achieves high efficiency. The experimental results show that the system can verify the correctness of propositional logic reasoning and reuse the propositional logical reasoning results, so as to obtain the implicit knowledge in the knowledge base and provide the basic reasoning model for the construction of intelligent systems.
From differential to difference equations for first order ODEs

NASA Technical Reports Server (NTRS)

Freed, Alan D.; Walker, Kevin P.

1991-01-01

When constructing an algorithm for the numerical integration of a differential equation, one should first convert the known ordinary differential equation (ODE) into an ordinary difference equation. Given this difference equation, one can develop an appropriate numerical algorithm. This technical note describes the derivation of two such ordinary difference equations applicable to a first order ODE. The implicit ordinary difference equation has the same asymptotic expansion as the ODE itself, whereas the explicit ordinary difference equation has an asymptotic expansion that is similar in structure but different in value when compared with that of the ODE.
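To make the explicit/implicit distinction concrete, here is a minimal sketch that uses the simplest pair of difference equations, forward and backward Euler, on the test equation y' = -lam * y. These are stand-ins chosen for illustration and are not the two difference equations derived in the technical note; the names `explicit_step`, `implicit_step` and `integrate` are hypothetical.

```python
def explicit_step(y, lam, h):
    """Forward Euler: y_{n+1} = y_n + h * f(y_n) with f(y) = -lam * y."""
    return y + h * (-lam * y)


def implicit_step(y, lam, h):
    """Backward Euler: y_{n+1} = y_n + h * f(y_{n+1}).

    For the linear test equation the implicit relation can be solved
    in closed form, giving y_{n+1} = y_n / (1 + lam * h).
    """
    return y / (1.0 + lam * h)


def integrate(step, y0, lam, h, n):
    """Apply a one-step difference equation n times."""
    y = y0
    for _ in range(n):
        y = step(y, lam, h)
    return y


# Stiff case: lam * h = 5, well outside the explicit stability limit of 2.
print("explicit:", integrate(explicit_step, 1.0, 50.0, 0.1, 10))
print("implicit:", integrate(implicit_step, 1.0, 50.0, 0.1, 10))
```

With this step size the explicit recurrence grows in magnitude while the implicit one decays toward zero like the exact solution, which is the qualitative difference in long-time behavior that the abstract alludes to.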
DOE Office of Scientific and Technical Information (OSTI.GOV)

Jin Jiasen; Yu Changshui; Song Heshan

We propose a scheme for identifying an unknown Bell diagonal state. In our scheme the measurements are performed on the probe qubits instead of the Bell diagonal state. The distinct advantage is that the quantum state of the evolved Bell diagonal state ensemble plus probe states will still collapse on the original Bell diagonal state ensemble after the measurement on probe states; i.e., our identification is quantum state nondestructive. How to realize our scheme in the framework of cavity electrodynamics is also shown.
Performance of Blind Source Separation Algorithms for FMRI Analysis using a Group ICA Method

PubMed Central

Correa, Nicolle; Adali, Tülay; Calhoun, Vince D.

2007-01-01

Independent component analysis (ICA) is a popular blind source separation (BSS) technique that has proven to be promising for the analysis of functional magnetic resonance imaging (fMRI) data. A number of ICA approaches have been used for fMRI data analysis, and even more ICA algorithms exist, however the impact of using different algorithms on the results is largely unexplored. In this paper, we study the performance of four major classes of algorithms for spatial ICA, namely information maximization, maximization of non-gaussianity, joint diagonalization of cross-cumulant matrices, and second-order correlation based methods when they are applied to fMRI data from subjects performing a visuo-motor task. We use a group ICA method to study the variability among different ICA algorithms and propose several analysis techniques to evaluate their performance. We compare how different ICA algorithms estimate activations in expected neuronal areas. The results demonstrate that the ICA algorithms using higher-order statistical information prove to be quite consistent for fMRI data analysis. Infomax, FastICA, and JADE all yield reliable results; each having their strengths in specific areas. EVD, an algorithm using second-order statistics, does not perform reliably for fMRI data. Additionally, for the iterative ICA algorithms, it is important to investigate the variability of the estimates from different runs. We test the consistency of the iterative algorithms, Infomax and FastICA, by running the algorithm a number of times with different initializations and note that they yield consistent results over these multiple runs. Our results greatly improve our confidence in the consistency of ICA for fMRI data analysis. PMID:17540281
Symmetric encryption algorithms using chaotic and non-chaotic generators: A review

PubMed Central

Radwan, Ahmed G.; AbdElHaleem, Sherif H.; Abd-El-Hafiz, Salwa K.

2015-01-01

This paper summarizes the symmetric image encryption results of 27 different algorithms, which include substitution-only, permutation-only or both phases. The cores of these algorithms are based on several discrete chaotic maps (Arnold’s cat map and a combination of three generalized maps), one continuous chaotic system (Lorenz) and two non-chaotic generators (fractals and chess-based algorithms). Each algorithm has been analyzed by the correlation coefficients between pixels (horizontal, vertical and diagonal), differential attack measures, Mean Square Error (MSE), entropy, sensitivity analyses and the 15 standard tests of the National Institute of Standards and Technology (NIST) SP-800-22 statistical suite. The analyzed algorithms include a set of new image encryption algorithms based on non-chaotic generators, either using substitution only (using fractals) and permutation only (chess-based) or both. Moreover, two different permutation scenarios are presented where the permutation-phase has or does not have a relationship with the input image through an ON/OFF switch. Different encryption-key lengths and complexities are provided, from short to long keys, to resist brute-force attacks. In addition, sensitivities of those different techniques to a one bit change in the input parameters of the substitution key as well as the permutation key are assessed. Finally, a comparative discussion of this work versus much recent research with respect to the used generators, type of encryption, and analyses is presented to highlight the strengths and added contribution of this paper. PMID:26966561
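One of the measures listed in this abstract, the correlation coefficient between adjacent pixels in the horizontal, vertical and diagonal directions, can be sketched as below. This is a generic Pearson-correlation computation over all neighbouring pixel pairs; the review may sample pairs differently, and the function names and the tiny synthetic test images are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def adjacent_correlation(img, dr, dc):
    """Correlate each pixel with its neighbour offset by (dr, dc).

    (0, 1) = horizontal, (1, 0) = vertical, (1, 1) = diagonal.
    """
    rows, cols = len(img), len(img[0])
    xs, ys = [], []
    for r in range(rows - dr):
        for c in range(cols - dc):
            xs.append(img[r][c])
            ys.append(img[r + dr][c + dc])
    return pearson(xs, ys)


# A smooth gradient is highly correlated in every direction; a good cipher
# image would instead give coefficients close to zero.
gradient = [[r + c for c in range(8)] for r in range(8)]
checker = [[(r + c) % 2 for c in range(8)] for r in range(8)]
for name, offset in (("horizontal", (0, 1)), ("vertical", (1, 0)), ("diagonal", (1, 1))):
    print(name, adjacent_correlation(gradient, *offset), adjacent_correlation(checker, *offset))
```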
Semantic concept-enriched dependence model for medical information retrieval.

PubMed

Choi, Sungbin; Choi, Jinwook; Yoo, Sooyoung; Kim, Heechun; Lee, Youngho

2014-02-01

In medical information retrieval research, semantic resources have been mostly used by expanding the original query terms or estimating the concept importance weight. However, implicit term-dependency information contained in semantic concept terms has been overlooked or at least underused in most previous studies. In this study, we incorporate a semantic concept-based term-dependence feature into a formal retrieval model to improve its ranking performance. Standardized medical concept terms used by medical professionals were assumed to have implicit dependency within the same concept. We hypothesized that, by elaborately revising the ranking algorithms to favor documents that preserve those implicit dependencies, the ranking performance could be improved. The implicit dependence features are harvested from the original query using MetaMap. These semantic concept-based dependence features were incorporated into a semantic concept-enriched dependence model (SCDM). We designed four different variants of the model, with each variant having distinct characteristics in the feature formulation method. We performed leave-one-out cross validations on both a clinical document corpus (TREC Medical records track) and a medical literature corpus (OHSUMED), which are representative test collections in medical information retrieval research. Our semantic concept-enriched dependence model consistently outperformed other state-of-the-art retrieval methods. Analysis shows that the performance gain has occurred independently of the concept's explicit importance in the query. By capturing implicit knowledge with regard to the query term relationships and incorporating them into a ranking model, we could build a more robust and effective retrieval model, independent of the concept importance. Copyright © 2013 Elsevier Inc. All rights reserved.
Molecular dynamics simulations of biological membranes and membrane proteins using enhanced conformational sampling algorithms.

PubMed

Mori, Takaharu; Miyashita, Naoyuki; Im, Wonpil; Feig, Michael; Sugita, Yuji

2016-07-01

This paper reviews various enhanced conformational sampling methods and explicit/implicit solvent/membrane models, as well as their recent applications to the exploration of the structure and dynamics of membranes and membrane proteins. Molecular dynamics simulations have become an essential tool to investigate biological problems, and their success relies on proper molecular models together with efficient conformational sampling methods. The implicit representation of solvent/membrane environments is a reasonable approximation to the explicit all-atom models, considering the balance between computational cost and simulation accuracy. Implicit models can be easily combined with replica-exchange molecular dynamics methods to explore a wider conformational space of a protein. Other molecular models and enhanced conformational sampling methods are also briefly discussed. As application examples, we introduce recent simulation studies of glycophorin A, phospholamban, amyloid precursor protein, and mixed lipid bilayers and discuss the accuracy and efficiency of each simulation model and method. This article is part of a Special Issue entitled: Membrane Proteins edited by J.C. Gumbart and Sergei Noskov. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Face verification with balanced thresholds.

PubMed

Yan, Shuicheng; Xu, Dong; Tang, Xiaoou

2007-01-01

The process of face verification is guided by a pre-learned global threshold, which, however, is often inconsistent with class-specific optimal thresholds. It is, hence, beneficial to pursue a balance of the class-specific thresholds in the model-learning stage. In this paper, we present a new dimensionality reduction algorithm tailored to the verification task that ensures threshold balance. This is achieved by the following aspects. First, feasibility is guaranteed by employing an affine transformation matrix, instead of the conventional projection matrix, for dimensionality reduction, and, hence, we call the proposed algorithm threshold balanced transformation (TBT). Then, the affine transformation matrix, constrained as the product of an orthogonal matrix and a diagonal matrix, is optimized to improve the threshold balance and classification capability in an iterative manner. Unlike most algorithms for face verification which are directly transplanted from face identification literature, TBT is specifically designed for face verification and clarifies the intrinsic distinction between these two tasks. Experiments on three benchmark face databases demonstrate that TBT significantly outperforms the state-of-the-art subspace techniques for face verification.
An Intelligent Architecture Based on Field Programmable Gate Arrays Designed to Detect Moving Objects by Using Principal Component Analysis

PubMed Central

Bravo, Ignacio; Mazo, Manuel; Lázaro, José L.; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

2010-01-01

This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices. PMID:22163406
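The matrix diagonalization stage mentioned in this abstract uses the Jacobi method. A minimal software sketch of a cyclic Jacobi eigenvalue iteration for a small symmetric matrix is given below; it says nothing about the authors' FPGA design, and the function name `jacobi_eigen`, the sweep limit and the tolerance are assumptions.

```python
import math


def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Diagonalize a symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, V) where the columns of V are eigenvectors.
    """
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Choose the smaller rotation angle that zeroes a[p][q].
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.hypot(1.0, tau))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = t * c
                for k in range(n):  # A <- A J (update columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # A <- J^T A (update rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):  # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


values, vectors = jacobi_eigen([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
print(sorted(values))
```

Each rotation touches only two rows and two columns, and rotations on disjoint index pairs are independent of one another, which is a commonly cited reason the method suits parallel hardware.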
An intelligent architecture based on Field Programmable Gate Arrays designed to detect moving objects by using Principal Component Analysis.

PubMed

Bravo, Ignacio; Mazo, Manuel; Lázaro, José L; Gardel, Alfredo; Jiménez, Pedro; Pizarro, Daniel

2010-01-01

This paper presents a complete implementation of the Principal Component Analysis (PCA) algorithm in Field Programmable Gate Array (FPGA) devices applied to high rate background segmentation of images. The classical sequential execution of different parts of the PCA algorithm has been parallelized. This parallelization has led to the specific development and implementation in hardware of the different stages of PCA, such as computation of the correlation matrix, matrix diagonalization using the Jacobi method and subspace projections of images. On the application side, the paper presents a motion detection algorithm, also entirely implemented on the FPGA, and based on the developed PCA core. This consists of dynamically thresholding the differences between the input image and the one obtained by expressing the input image using the PCA linear subspace previously obtained as a background model. The proposal achieves a high ratio of processed images (up to 120 frames per second) and high quality segmentation results, with a completely embedded and reliable hardware architecture based on commercial CMOS sensors and FPGA devices.
Sparse Gaussian elimination with controlled fill-in on a shared memory multiprocessor

NASA Technical Reports Server (NTRS)

Alaghband, Gita; Jordan, Harry F.

1989-01-01

It is shown that in sparse matrices arising from electronic circuits, it is possible to do computations on many diagonal elements simultaneously. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. The ordering is based on the Markowitz number of the pivot candidates. This technique generates a set of compatible pivots with the property of generating few fills. A novel heuristic algorithm is presented that combines the idea of an order-compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. An elimination set for reducing the matrix is generated and selected on the basis of a minimum Markowitz sum number. The parallel pivoting technique presented is a stepwise algorithm and can be applied to any submatrix of the original matrix. Thus, it is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices using the HEP multiprocessor (Kowalik, 1985) are presented and analyzed.
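The ordering idea in this abstract can be sketched with the standard Markowitz count, (r - 1)(c - 1) for a pivot whose row and column contain r and c nonzeros. The greedy selection below is a simple stand-in for the authors' heuristic with limited binary tree search, and it assumes that two diagonal pivots i and j are compatible when both a[i][j] and a[j][i] are zero; the function names are hypothetical.

```python
def markowitz_numbers(pattern):
    """Markowitz count (r_i - 1) * (c_i - 1) for each nonzero diagonal entry.

    `pattern` is a square 0/1 matrix marking the nonzero positions.
    """
    n = len(pattern)
    row_counts = [sum(1 for x in row if x) for row in pattern]
    col_counts = [sum(1 for i in range(n) if pattern[i][j]) for j in range(n)]
    return {
        i: (row_counts[i] - 1) * (col_counts[i] - 1)
        for i in range(n)
        if pattern[i][i]
    }


def greedy_compatible_pivots(pattern):
    """Pick diagonal pivots in order of increasing Markowitz number.

    Assumed compatibility test: pivots i and j can be eliminated together
    when the off-diagonal entries (i, j) and (j, i) are both zero.
    """
    counts = markowitz_numbers(pattern)
    chosen = []
    for i in sorted(counts, key=lambda idx: (counts[idx], idx)):
        if all(not pattern[i][j] and not pattern[j][i] for j in chosen):
            chosen.append(i)
    return chosen


pattern = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
print(markowitz_numbers(pattern))
print(greedy_compatible_pivots(pattern))
```

In this example pivot 1 is rejected because it is coupled to pivot 0, so the remaining three pivots form one set that could be processed simultaneously.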
Pressure profiles in detonation cells with rectangular and diagonal structures

NASA Astrophysics Data System (ADS)

Hanana, M.; Lefebvre, M. H.

Experimental results presented in this work enable us to classify the three-dimensional structure of the detonation into two fundamental types: a rectangular structure and a diagonal structure. The rectangular structure is well documented in the literature and consists of orthogonal waves travelling independently of one another. The soot record in this case shows the classical diamond detonation cell exhibiting `slapping waves'. The experiments indicate that the diagonal structure is a structure with the triple point intersections moving along the diagonal line of the tube cross section. The axes of the transverse waves are canted at 45 degrees to the wall, accounting for the lack of slapping waves. It is possible to reproduce these diagonal structures by appropriately controlling the experimental ignition procedure. The characteristics of the diagonal structure show some similarities with the detonation structure in a round tube. Pressure measurements recorded along the central axis of the cellular structure show a series of pressure peaks, depending on the type of structure and the position inside the detonation cell. Pressure profiles measured for the whole length of the two types of detonation cells show that the intensity of the shock front is higher and the length of the detonation cell is shorter for the diagonal structures.
2. VIEW OF CENTRAL BEND OF LOWER DIAGONAL NO. 1 ...

Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

2. VIEW OF CENTRAL BEND OF LOWER DIAGONAL NO. 1 DRAIN, LOOKING 2932 EAST OF NORTH. - Truckee-Carson Irrigation District, Lower Diagonal No. 1 Drain, Bounded by West Gate Road & Weapons Delivery Road, Naval Air Station Fallon, Fallon, Churchill County, NV
Hopf Bifurcation in Viscous, Low Speed Flows About an Airfoil with Structural Coupling

DTIC Science & Technology

1993-03-01

8 2.1 Equations of Motion ...... ..................... 8 2.2 Coordinate Transformation ....................... 13 2.3 Aerodynamic...a-frame) f - Apparent body forces applied in noninertial system fL - Explicit fourth-order numerical damping term Ai - Implicit fourth-order...resulting airfoil motion . The equations describing the airfoil motion are integrated in time using a fourth-order Runge-Kutta algorithm. The
DOE Office of Scientific and Technical Information (OSTI.GOV)

Hewett, D.W.; Yu-Jiuan Chen

The authors describe how they hold onto orthogonal mesh discretization when dealing with curved boundaries. Special difference operators were constructed to approximate numerical zones split by the domain boundary; the operators are particularly simple for this rectangular mesh. The authors demonstrated that this simple numerical approach, termed Dynamic Alternating Direction Implicit, turned out to be considerably more efficient than more complex grid-adaptive algorithms that were tried previously.
Clique Relaxations in Biological and Social Network Analysis Foundations and Algorithms

DTIC Science & Technology

2015-10-26

study of clique relaxation models arising in biological and social networks. This project examines the elementary clique-defining properties... elementary clique-defining properties inherently exploited in the available clique relaxation models and proposes a taxonomic framework that not...analyzes the elementary clique-defining properties implicitly exploited in the available clique relaxation models and proposes a taxonomic framework that
Efficient temporal and interlayer parameter prediction for weighted prediction in scalable high efficiency video coding

NASA Astrophysics Data System (ADS)

Tsang, Sik-Ho; Chan, Yui-Lam; Siu, Wan-Chi

2017-01-01

Weighted prediction (WP) is an efficient video coding tool that has been available since the establishment of the H.264/AVC video coding standard for compensating temporal illumination change in motion estimation and compensation. WP parameters, including a multiplicative weight and an additive offset for each reference frame, must be estimated and transmitted to the decoder in the slice header. These parameters cost extra bits in the coded video bitstream. High efficiency video coding (HEVC) provides WP parameter prediction to reduce the overhead. Therefore, WP parameter prediction is crucial to research and applications related to WP. Prior art has sought to further improve WP parameter prediction through implicit prediction of image characteristics and derivation of parameters. By exploiting both temporal and interlayer redundancies, we propose three WP parameter prediction algorithms, enhanced implicit WP parameter, enhanced direct WP parameter derivation, and interlayer WP parameter, to further improve the coding efficiency of HEVC. Results show that our proposed algorithms can achieve up to 5.83% and 5.23% bitrate reduction compared to the conventional scalable HEVC in the base layer for SNR scalability and 2× spatial scalability, respectively.
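The role of the multiplicative weight and additive offset can be illustrated with a small sketch of explicit uni-directional weighted prediction in the integer form used by H.264/AVC-style codecs: the reference sample is scaled by a fixed-point weight, rounded, shifted, offset and clipped. The function name and default parameters are assumptions, `log_wd` must be at least 1 in this form, and the parameter prediction algorithms proposed in the paper are not shown.

```python
def weighted_prediction(ref_block, weight, offset, log_wd=6, bit_depth=8):
    """Apply pred = clip(((ref * weight + round) >> log_wd) + offset).

    `weight` is a fixed-point value with log_wd fractional bits, so
    weight = 1 << log_wd corresponds to a gain of 1.0.
    """
    max_val = (1 << bit_depth) - 1
    rounding = 1 << (log_wd - 1)
    return [
        [
            min(max_val, max(0, ((p * weight + rounding) >> log_wd) + offset))
            for p in row
        ]
        for row in ref_block
    ]


ref = [[100, 200], [50, 255]]
# Unit gain plus a brightness offset of 10, as for a slight fade-in.
print(weighted_prediction(ref, weight=64, offset=10))
# Gain of 0.5 with no offset, as for a fade toward black.
print(weighted_prediction(ref, weight=32, offset=0))
```

Because a weight and an offset are signalled per reference frame, predicting them from already decoded information instead of transmitting them is where the bitrate savings discussed in the abstract come from.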
Accuracy of an unstructured-grid upwind-Euler algorithm for the ONERA M6 wing

NASA Technical Reports Server (NTRS)

Batina, John T.

1991-01-01

Improved algorithms for the solution of the three-dimensional, time-dependent Euler equations are presented for aerodynamic analysis involving unstructured dynamic meshes. The improvements were recently made to the spatial and temporal discretizations used by unstructured-grid flow solvers. The spatial discretization involves a flux-split approach that is naturally dissipative and captures shock waves sharply with at most one grid point within the shock structure. The temporal discretization involves either an explicit time-integration scheme using a multistage Runge-Kutta procedure or an implicit time-integration scheme using a Gauss-Seidel relaxation procedure, which is computationally efficient for either steady or unsteady flow problems. With the implicit Gauss-Seidel procedure, very large time steps may be used for rapid convergence to steady state, and the step size for unsteady cases may be selected for temporal accuracy rather than for numerical stability. Steady flow results are presented for both the NACA 0012 airfoil and the Office National d'Etudes et de Recherches Aerospatiales (ONERA) M6 wing to demonstrate applications of the new Euler solvers. The paper presents a description of the Euler solvers along with results and comparisons that assess their capability.
A fully implicit numerical integration of the relativistic particle equation of motion

NASA Astrophysics Data System (ADS)

Pétri, J.

2017-04-01

Relativistic strongly magnetized plasmas are produced in laboratories thanks to state-of-the-art laser technology but can naturally be found around compact objects such as neutron stars and black holes. Detailed studies of the behaviour of relativistic plasmas require accurate computations able to catch the full spatial and temporal dynamics of the system. Numerical simulations of ultra-relativistic plasmas face severe restrictions due to limitations in the maximum possible Lorentz factors that current algorithms can reproduce to good accuracy. In order to circumvent this flaw and push the limit to $\gamma \approx 10^9$, we design a new fully implicit scheme to solve the relativistic particle equation of motion in an external electromagnetic field using a three-dimensional Cartesian geometry. We show some examples of numerical integrations in constant electromagnetic fields to prove the efficiency of our algorithm. The code is also able to follow the electric drift motion for high Lorentz factors. In the most general case of spatially and temporally varying electromagnetic fields, the code performs extremely well, as shown by comparison with exact analytical solutions for the relativistic electrostatic Kepler problem as well as for linearly and circularly polarized plane waves.
Multiobjective Multifactorial Optimization in Evolutionary Multitasking.

PubMed

Gupta, Abhishek; Ong, Yew-Soon; Feng, Liang; Tan, Kay Chen

2016-05-03

In recent decades, the field of multiobjective optimization has attracted considerable interest among evolutionary computation researchers. One of the main features that makes evolutionary methods particularly appealing for multiobjective problems is the implicit parallelism offered by a population, which enables simultaneous convergence toward the entire Pareto front. While a plethora of related algorithms have been proposed till date, a common attribute among them is that they focus on efficiently solving only a single optimization problem at a time. Despite the known power of implicit parallelism, seldom has an attempt been made to multitask, i.e., to solve multiple optimization problems simultaneously. It is contended that the notion of evolutionary multitasking leads to the possibility of automated transfer of information across different optimization exercises that may share underlying similarities, thereby facilitating improved convergence characteristics. In particular, the potential for automated transfer is deemed invaluable from the standpoint of engineering design exercises where manual knowledge adaptation and reuse are routine. Accordingly, in this paper, we present a realization of the evolutionary multitasking paradigm within the domain of multiobjective optimization. The efficacy of the associated evolutionary algorithm is demonstrated on some benchmark test functions as well as on a real-world manufacturing process design problem from the composites industry.

Joint Prior Learning for Visual Sensor Network Noisy Image Super-Resolution

PubMed Central

Yue, Bo; Wang, Shuang; Liang, Xuefeng; Jiao, Licheng; Xu, Caijin

2016-01-01

The visual sensor network (VSN), a new type of wireless sensor network composed of low-cost wireless camera nodes, is being applied for numerous complex visual analyses in wild environments, such as visual surveillance, object recognition, etc. However, the captured images/videos are often low resolution with noise. Such visual data cannot be passed directly to advanced visual analysis. In this paper, we propose a joint-prior image super-resolution (JPISR) method using the expectation maximization (EM) algorithm to improve VSN image quality. Unlike conventional methods that focus only on upscaling images, JPISR alternately solves upscaling mapping and denoising in the E-step and M-step. To meet the requirement of the M-step, we introduce a novel non-local group-sparsity image filtering method to learn the explicit prior and induce the geometric duality between images to learn the implicit prior. The EM algorithm inherently combines the explicit prior and implicit prior by joint learning. Moreover, JPISR does not rely on large external datasets for training, which is much more practical in a VSN. Extensive experiments show that JPISR outperforms five state-of-the-art methods in terms of PSNR, SSIM, and visual perception. PMID:26927114
An efficient mode-splitting method for a curvilinear nearshore circulation model

USGS Publications Warehouse

Shi, Fengyan; Kirby, James T.; Hanes, Daniel M.

2007-01-01

A mode-splitting method is applied to the quasi-3D nearshore circulation equations in generalized curvilinear coordinates. The gravity wave mode and the vorticity wave mode of the equations are derived using the two-step projection method. Using an implicit algorithm for the gravity mode and an explicit algorithm for the vorticity mode, we combine the two modes to derive a mixed difference–differential equation with respect to surface elevation. McKee et al.'s [McKee, S., Wall, D.P., and Wilson, S.K., 1996. An alternating direction implicit scheme for parabolic equations with mixed derivative and convective terms. J. Comput. Phys., 126, 64–76.] ADI scheme is then used to solve the parabolic-type equation in dealing with the mixed derivative and convective terms from the curvilinear coordinate transformation. Good convergence rates are found in two typical cases which represent respectively the motions dominated by the gravity mode and the vorticity mode. Time step limitations imposed by the vorticity convective Courant number in vorticity-mode-dominant cases are discussed. Model efficiency and accuracy are verified in model application to tidal current simulations in San Francisco Bight.
Spacecraft charging analysis with the implicit particle-in-cell code iPic3D

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deca, J.; Lapenta, G.; Marchand, R.

2013-10-15

We present the first results on the analysis of spacecraft charging with the implicit particle-in-cell code iPic3D, designed for running on massively parallel supercomputers. The numerical algorithm is presented, highlighting the implementation of the electrostatic solver and the immersed boundary algorithm, the latter of which makes it possible to handle complex spacecraft geometries. As a first step in the verification process, a comparison is made between the floating potential obtained with iPic3D and with Orbital Motion Limited theory for a spherical particle in a uniform stationary plasma. Second, the numerical model is verified for a CubeSat benchmark by comparing simulation results with those of PTetra for space environment conditions with increasing levels of complexity. In particular, we consider spacecraft charging from plasma particle collection, photoelectron and secondary electron emission. The influence of a background magnetic field on the floating potential profile near the spacecraft is also considered. Although the numerical approaches in iPic3D and PTetra are rather different, good agreement is found between the two models, raising the level of confidence in both codes to predict and evaluate the complex plasma environment around spacecraft.
Application of adaptive gridding to magnetohydrodynamic flows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schnack, D.D.; Lotatti, I.; Satyanarayana, P.

1996-12-31

The numerical simulation of the primitive, three-dimensional, time-dependent, resistive MHD equations on an unstructured, adaptive poloidal mesh using the TRIM code has been reported previously. The toroidal coordinate is approximated pseudo-spectrally with finite Fourier series and Fast-Fourier Transforms. The finite-volume algorithm preserves the magnetic field as solenoidal to round-off error, and also conserves mass, energy, and magnetic flux exactly. A semi-implicit method is used to allow for large time steps on the unstructured mesh. This is important for tokamak calculations where the relevant time scale is determined by the poloidal Alfven time. This also allows the viscosity to be treated implicitly. A conjugate-gradient method with pre-conditioning is used for matrix inversion. Applications to the growth and saturation of ideal instabilities in several toroidal fusion systems have been demonstrated. Recently we have concentrated on the details of the mesh adaption algorithm used in TRIM. We present several two-dimensional results relating to the use of grid adaptivity to track the evolution of hydrodynamic and MHD structures. Examples of plasma guns, opening switches, and supersonic flow over a magnetized sphere are presented. Issues relating to mesh adaption criteria are discussed.
Data Processing for a High Resolution Preclinical PET Detector Based on Philips DPC Digital SiPMs

NASA Astrophysics Data System (ADS)

Schug, David; Wehner, Jakob; Goldschmidt, Benjamin; Lerche, Christoph; Dueppenbecker, Peter Michael; Hallen, Patrick; Weissler, Bjoern; Gebhardt, Pierre; Kiessling, Fabian; Schulz, Volkmar

2015-06-01

In positron emission tomography (PET) systems, light sharing techniques are commonly used to read out scintillator arrays consisting of scintillation elements that are smaller than the optical sensors. The scintillating element is then identified by evaluating the signal heights in the readout channels using statistical algorithms, the center of gravity (COG) algorithm being the simplest and most commonly used. We propose a COG algorithm with a fixed number of input channels in order to guarantee a stable calculation of the position. The algorithm is implemented and tested with the raw detector data obtained with the Hyperion-II D preclinical PET insert, which uses Philips Digital Photon Counting's (PDPC) digital SiPMs. The gamma detectors use LYSO scintillator arrays with 30 × 30 crystals of 1 × 1 × 12 mm³ in size coupled to 4 × 4 PDPC DPC 3200-22 sensors (DPC) via a 2-mm-thick light guide. These self-triggering sensors are made up of 2 × 2 pixels, resulting in a total of 64 readout channels. We restrict the COG calculation to a main pixel, which captures most of the scintillation light from a crystal, and its (direct and diagonal) neighboring pixels, and reject single events in which this data is not fully available. This results in stable COG positions for a crystal element and enables high spatial image resolution. Due to the sensor layout, for some crystals it is very likely that a single diagonal neighbor pixel is missing as a result of the low light level on the corresponding DPC. This leads to a loss of sensitivity if these events are rejected. An enhancement of the COG algorithm is proposed which handles the potentially missing pixel separately, both for the crystal identification and the energy calculation. Using this advancement, we show that the sensitivity of the Hyperion-II D insert using the described scintillator configuration can be improved by 20-100% for practically useful readout thresholds of a single DPC pixel ranging from 17-52 photons. Furthermore, we show that, for all readout thresholds, the energy resolution of the scanner is better by 0-1.6% (relative difference) when singles with a single missing pixel are accepted and correctly handled than with the COG method that accepts only singles with all neighbors present. The presented methods can not only be applied to gamma detectors employing DPC sensors, but can also be generalized to other similarly structured and self-triggering detectors that use light sharing techniques.
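The fixed-input COG calculation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the pixel grid is represented as a dictionary keyed by hypothetical integer (column, row) indices, the function name is invented, and only the basic variant (main pixel plus all eight neighbors, event rejected otherwise) is shown. The authors' enhancement for a missing diagonal neighbor is not reproduced here.

```python
def cog_position(pixels, main):
    """Center of gravity over a main pixel and its 8 neighbours.

    pixels: dict mapping (col, row) -> photon count for pixels with data.
    main:   (col, row) index of the main pixel.
    Returns (x, y, energy), or None when any of the nine pixels is missing,
    so that every accepted event uses the same fixed number of inputs.
    """
    mx, my = main
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            key = (mx + dx, my + dy)
            if key not in pixels:
                return None  # incomplete neighbourhood: reject the event
            count = pixels[key]
            total += count
            sum_x += count * key[0]
            sum_y += count * key[1]
    if total == 0:
        return None
    return sum_x / total, sum_y / total, total


# A symmetric light distribution centred on pixel (1, 1).
event = {(c, r): 10 for c in range(3) for r in range(3)}
event[(1, 1)] = 100
position = cog_position(event, main=(1, 1))
```

Rejecting incomplete neighborhoods keeps the position estimate stable, but as the abstract notes it costs sensitivity whenever a low-light diagonal pixel fails to trigger.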
Tradeoffs between oscillator strength and lifetime in terahertz quantum cascade lasers

DOE PAGES

Chan, Chun Wang I.; Albo, Asaf; Hu, Qing; ...

2016-11-14

Contemporary research into diagonal active region terahertz quantum cascade lasers for high temperature operation has yielded little success. We present evidence that the failure of high diagonality alone as a design strategy is due to a fundamental trade-off between large optical oscillator strength and long upper-level lifetime. Here, we hypothesize that diagonality needs to be paired with increased doping in order to succeed, and present evidence that highly diagonal designs can benefit from much higher doping than normally found in terahertz quantum cascade lasers. In assuming the benefits of high diagonality paired with high doping, we also highlight important challenges that need to be overcome, specifically the increased importance of carrier induced band-bending and impurity scattering.
6. VIEW OF WEST GATE ROAD CULVERT OF LOWER DIAGONAL ...

Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

6. VIEW OF WEST GATE ROAD CULVERT OF LOWER DIAGONAL NO. 1 DRAIN, LOOKING 2502 EAST OF NORTH. - Truckee-Carson Irrigation District, Lower Diagonal No. 1 Drain, Bounded by West Gate Road & Weapons Delivery Road, Naval Air Station Fallon, Fallon, Churchill County, NV
7. VIEW OF WEAPONS DELIVERY ROAD CULVERT OF LOWER DIAGONAL ...

Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

7. VIEW OF WEAPONS DELIVERY ROAD CULVERT OF LOWER DIAGONAL NO. 1 DRAIN, LOOKING 522 EAST OF NORTH. - Truckee-Carson Irrigation District, Lower Diagonal No. 1 Drain, Bounded by West Gate Road & Weapons Delivery Road, Naval Air Station Fallon, Fallon, Churchill County, NV
5. VIEW OF WEST GATE ROAD CULVERT OF LOWER DIAGONAL ...

Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

5. VIEW OF WEST GATE ROAD CULVERT OF LOWER DIAGONAL NO. 1 DRAIN, LOOKING 323' EAST OF NORTH. - Truckee-Carson Irrigation District, Lower Diagonal No. 1 Drain, Bounded by West Gate Road & Weapons Delivery Road, Naval Air Station Fallon, Fallon, Churchill County, NV
Collaborative filtering recommendation model based on fuzzy clustering algorithm

NASA Astrophysics Data System (ADS)

Yang, Ye; Zhang, Yunhua

2018-05-01

Collaborative filtering, one of the most widely used algorithms in recommender systems, faces two serious problems: data sparsity and poor recommendation quality in big-data environments. In traditional clustering analysis, each object is strictly assigned to one of several classes, and the boundaries of this division are sharp. However, most real-world objects have no strict definition of their form or class attributes. To address these problems, this paper proposes improving the traditional collaborative filtering model through a hybrid optimization that combines an implicit semantic algorithm and a fuzzy clustering algorithm with collaborative filtering. The fuzzy clustering algorithm is applied to the project attribute information, so that each project belongs to different project categories with different membership degrees. This increases the density of the data, effectively reduces its sparsity, and addresses the low accuracy that results from inaccurate similarity calculation. Finally, the paper carries out an empirical analysis on the MovieLens dataset and compares the proposed algorithm with the traditional user-based collaborative filtering algorithm. The proposed algorithm greatly improves recommendation accuracy.
Implicit approximate-factorization schemes for the low-frequency transonic equation

NASA Technical Reports Server (NTRS)

Ballhaus, W. F.; Steger, J. L.

1975-01-01

Two- and three-level implicit finite-difference algorithms for the low-frequency transonic small-disturbance equation are constructed using approximate factorization techniques. The schemes are unconditionally stable for the model linear problem. For nonlinear mixed flows, the schemes use conservatively switched difference operators, for which stability is maintained only if shock propagation is restricted to less than one spatial grid point per time step. The shock-capturing properties of the schemes were studied for various shock motions that might be encountered in problems of engineering interest. Computed results for a model airfoil problem that produces a flow field similar to that about a helicopter rotor in forward flight show the development of a shock wave and its subsequent propagation upstream off the front of the airfoil.
Nonlinear study of the parallel velocity/tearing instability using an implicit, nonlinear resistive MHD solver

NASA Astrophysics Data System (ADS)

Chacon, L.; Finn, J. M.; Knoll, D. A.

2000-10-01

Recently, a new parallel velocity instability has been found [J. M. Finn, Phys. Plasmas, 2, 12 (1995)]. This mode is a tearing mode driven unstable by curvature effects and sound wave coupling in the presence of parallel velocity shear. Under such conditions, linear theory predicts that tearing instabilities will grow even in situations in which the classical tearing mode is stable. This could then be a viable seed mechanism for the neoclassical tearing mode, and hence a non-linear study is of interest. Here, the linear and non-linear stages of this instability are explored using a fully implicit, fully nonlinear 2D reduced resistive MHD code [L. Chacon et al., "Implicit, Jacobian-free Newton-Krylov 2D reduced resistive MHD nonlinear solver," submitted to J. Comput. Phys. (2000)], including viscosity and particle transport effects. The nonlinear implicit time integration is performed using the Newton-Raphson iterative algorithm. Krylov iterative techniques are employed for the required algebraic matrix inversions, implemented Jacobian-free (i.e., without ever forming and storing the Jacobian matrix), and preconditioned with a "physics-based" preconditioner. Nonlinear results indicate that, for large total plasma beta and large parallel velocity shear, the instability results in the generation of large poloidal shear flows and large magnetic islands even in regimes when the classical tearing mode is absolutely stable. For small viscosity, the time asymptotic state can be turbulent.
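The "Jacobian-free" idea mentioned in this abstract rests on the fact that Krylov solvers need only products of the Jacobian with a vector, which can be approximated by a finite difference of the residual function. The sketch below shows that single building block. The function name, the fixed perturbation `eps`, and the toy residual are assumptions for illustration; production codes typically scale the perturbation by the norms of the state and direction vectors, and nothing here is taken from the authors' solver.

```python
def jacobian_vector_product(residual, u, v, eps=1e-7):
    """Approximate J(u) @ v without forming the Jacobian J.

    Uses the first-order finite difference
        J(u) v ~ (residual(u + eps * v) - residual(u)) / eps
    where residual maps a list of floats to a list of floats.
    """
    base = residual(u)
    perturbed = residual([ui + eps * vi for ui, vi in zip(u, v)])
    return [(p - b) / eps for p, b in zip(perturbed, base)]


def toy_residual(u):
    """Hypothetical 2-equation nonlinear system used only as a demo."""
    return [u[0] ** 2 + u[1], u[0] * u[1]]


# The exact Jacobian at u = (1, 2) is [[2, 1], [2, 1]], so J @ (1, 0) = (2, 2).
jv = jacobian_vector_product(toy_residual, [1.0, 2.0], [1.0, 0.0])
```

Each product costs one extra residual evaluation, which is why the approach avoids the storage and assembly cost of the full Jacobian matrix.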
A GPU-accelerated semi-implicit fractional step method for numerical solutions of incompressible Navier-Stokes equations

NASA Astrophysics Data System (ADS)

Ha, Sanghyun; Park, Junshin; You, Donghyun

2017-11-01

The utility of the computational power of modern Graphics Processing Units (GPUs) is examined for solutions of the incompressible Navier-Stokes equations, which are integrated using a semi-implicit fractional-step method. Due to its serial and bandwidth-bound nature, the present choice of numerical methods is considered a good candidate for evaluating the potential of GPUs for solving the Navier-Stokes equations using non-explicit time integration. An efficient algorithm is presented for GPU acceleration of the Alternating Direction Implicit (ADI) and the Fourier-transform-based direct solution method used in the semi-implicit fractional-step method. OpenMP is employed for concurrent collection of turbulence statistics on a CPU while the Navier-Stokes equations are computed on a GPU. Extension to multiple NVIDIA GPUs is implemented using NVLink supported by the Pascal architecture. The performance of the present method is evaluated on multiple Tesla P100 GPUs and compared with a single-core Xeon E5-2650 v4 CPU in simulations of boundary-layer flow over a flat plate. Supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (Ministry of Science, ICT and Future Planning NRF-2016R1E1A2A01939553, NRF-2014R1A2A1A11049599, and Ministry of Trade, Industry and Energy 201611101000230).
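Each sweep of an ADI scheme reduces to many independent tridiagonal solves along grid lines, and the serial nature the abstract refers to is visible in the classic Thomas algorithm sketched below. This is a plain CPU baseline written for illustration only; the function name and the array layout are assumptions, and it is not the GPU algorithm of the paper, which the abstract does not describe in detail.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (illustrative sketch).

    Row i reads: lower[i] * x[i-1] + diag[i] * x[i] + upper[i] * x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is performed, so the
    matrix is assumed to be diagonally dominant or otherwise well behaved.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper-diagonal coefficients
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: each step depends on the previous one.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution, again strictly sequential.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# A 3x3 system with diagonal 2 and off-diagonals -1, whose solution is (1, 2, 3).
solution = solve_tridiagonal(
    lower=[0.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, 0.0],
    rhs=[0.0, 0.0, 4.0],
)
```

Both loops carry a dependence from one row to the next, so parallelism in an ADI sweep has to come from solving many such lines at once or from a different tridiagonal algorithm.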
Accelerating NLTE radiative transfer by means of the Forth-and-Back Implicit Lambda Iteration: A two-level atom line formation in 2D Cartesian coordinates

NASA Astrophysics Data System (ADS)

Milić, Ivan; Atanacković, Olga

2014-10-01

State-of-the-art methods in multidimensional NLTE radiative transfer are based on the use of a local approximate lambda operator within either Jacobi or Gauss-Seidel iterative schemes. Here we propose another approach to the solution of 2D NLTE RT problems, Forth-and-Back Implicit Lambda Iteration (FBILI), developed earlier for 1D geometry. In order to present the method and examine its convergence properties we use the well-known instance of the two-level atom line formation with complete frequency redistribution. In the formal solution of the RT equation we employ short characteristics with a two-point algorithm. Using an implicit representation of the source function in the computation of the specific intensities, we compute and store the coefficients of the linear relations J=a+bS between the mean intensity J and the corresponding source function S. The use of iteration factors in the ‘local’ coefficients of these implicit relations in two ‘inward’ sweeps of the 2D grid, along with the update of the source function in the other two ‘outward’ sweeps, leads to a solution four times faster than the Jacobi scheme. Moreover, the update made in all four consecutive sweeps of the grid leads to an acceleration by a factor of 6-7 compared to the Jacobi iterative scheme.
PIXIE3D: A Parallel, Implicit, eXtended MHD 3D Code

NASA Astrophysics Data System (ADS)

Chacon, Luis

2006-10-01

We report on the development of PIXIE3D, a 3D parallel, fully implicit Newton-Krylov extended MHD code in general curvilinear geometry. PIXIE3D employs a second-order, finite-volume-based spatial discretization that satisfies remarkable properties such as being conservative, solenoidal in the magnetic field to machine precision, non-dissipative, and linearly and nonlinearly stable in the absence of physical dissipation. PIXIE3D employs fully-implicit Newton-Krylov methods for the time advance. Currently, second-order implicit schemes such as Crank-Nicolson and BDF2 (second-order backward differentiation formula) are available. PIXIE3D is fully parallel (employs PETSc for parallelism), and exhibits excellent parallel scalability. A parallel, scalable, MG preconditioning strategy, based on physics-based preconditioning ideas, has been developed for resistive MHD, and is currently being extended to Hall MHD. In this poster, we will report on progress in the algorithmic formulation for extended MHD, as well as the serial and parallel performance of PIXIE3D in a variety of problems and geometries. L. Chacón, Comput. Phys. Comm., 163 (3), 143-171 (2004); L. Chacón et al., J. Comput. Phys., 178 (1), 15-36 (2002); J. Comput. Phys., 188 (2), 573-592 (2003); L. Chacón, 32nd EPS Conf. Plasma Physics, Tarragona, Spain, 2005; L. Chacón et al., 33rd EPS Conf. Plasma Physics, Rome, Italy, 2006.
The diagonalization of cubic matrices

NASA Astrophysics Data System (ADS)

Cocolicchio, D.; Viggiano, M.

2000-08-01

This paper is devoted to analysing the problem of the diagonalization of cubic matrices. We extend the familiar algebraic approach which is based on the Cardano formulae. We rewrite the complex roots of the associated resolvent secular equation in terms of transcendental functions and we derive the diagonalizing matrix.
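For the special case of a real symmetric 3 × 3 matrix, where all three roots of the secular equation are real, the Cardano-type solution can be written with trigonometric functions, as in the sketch below. This is a well-known closed-form recipe shown only as an illustration of the idea; the function name is an assumption, and the paper's treatment of general matrices with complex roots and its construction of the diagonalizing matrix are not reproduced.

```python
import math


def eigenvalues_symmetric_3x3(a):
    """Eigenvalues of a real symmetric 3x3 matrix, in ascending order.

    Uses the trigonometric form of the solution of the characteristic
    cubic. `a` is a list of three rows of three floats, assumed symmetric.
    """
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0  # mean of the eigenvalues
    if off == 0.0:
        # Already diagonal: the eigenvalues are the diagonal entries.
        return sorted([a[0][0], a[1][1], a[2][2]])
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off
    p = math.sqrt(p2 / 6.0)
    # b = (a - q*I) / p has zero trace, so its cubic is in depressed form.
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # guard against round-off
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest  # the trace fixes the third root
    return [smallest, middle, largest]


# This matrix has eigenvalues 1, 3 and 5.
eigs = eigenvalues_symmetric_3x3([[2.0, 1.0, 0.0],
                                  [1.0, 2.0, 0.0],
                                  [0.0, 0.0, 5.0]])
```

The inverse cosine here plays the role of the transcendental functions that the abstract mentions as a way of rewriting the roots.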
Chaos in non-diagonal spatially homogeneous cosmological models in spacetime dimensions <=10

NASA Astrophysics Data System (ADS)

Demaret, Jacques; de Rop, Yves; Henneaux, Marc

1988-08-01

It is shown that the chaotic oscillatory behaviour, absent in diagonal homogeneous cosmological models in spacetime dimensions between 5 and 10, can be reestablished when off-diagonal terms are included. Also at Centro de Estudios Cientificos de Santiago, Casilla 16443, Santiago 9, Chile
Regularized finite element modeling of progressive failure in soils within nonlocal softening plasticity

NASA Astrophysics Data System (ADS)

Huang, Maosong; Qu, Xie; Lü, Xilin

2017-11-01

By solving a nonlinear complementarity problem for the consistency condition, an improved implicit stress return iterative algorithm for a generalized over-nonlocal strain softening plasticity was proposed, and the consistent tangent matrix was obtained. The proposed algorithm was incorporated into existing finite element codes, and it enables the nonlocal regularization of ill-posed boundary value problems caused by pressure-independent and pressure-dependent strain softening plasticity. The algorithm was verified by the numerical modeling of strain localization in a plane strain compression test. The results showed that fast convergence can be achieved and the mesh dependency caused by strain softening can be effectively eliminated. The influences of hardening modulus and material characteristic length on the simulation were obtained. The proposed algorithm was further used in simulations of the bearing capacity of a strip footing; the results are mesh-independent, and the progressive failure process of the soil was well captured.
Power-on performance predictions for a complete generic hypersonic vehicle configuration

NASA Technical Reports Server (NTRS)

Bennett, Bradford C.

1991-01-01

The Compressible Navier-Stokes (CNS) code was developed to compute external hypersonic flow fields. It has been applied to various hypersonic external flow applications. Here, the CNS code was modified to compute hypersonic internal flow fields. Calculations were performed on a Mach 18 sidewall compression inlet and on the Lewis Mach 5 inlet. The use of the ARC3D diagonal algorithm was evaluated for internal flows on the Mach 5 inlet flow. The initial modifications to the CNS code involved generalization of the boundary conditions and the addition of viscous terms in the second crossflow direction and modifications to the Baldwin-Lomax turbulence model for corner flows.
Sparse polynomial space approach to dissipative quantum systems: application to the sub-ohmic spin-boson model.

PubMed

Alvermann, A; Fehske, H

2009-04-17

We propose a general numerical approach to open quantum systems with a coupling to bath degrees of freedom. The technique combines the methodology of polynomial expansions of spectral functions with the sparse grid concept from interpolation theory. Thereby we construct a Hilbert space of moderate dimension to represent the bath degrees of freedom, which allows us to perform highly accurate and efficient calculations of static, spectral, and dynamic quantities using standard exact diagonalization algorithms. The strength of the approach is demonstrated for the phase transition, critical behavior, and dissipative spin dynamics in the spin-boson model.
