A Polynomial Time, Numerically Stable Integer Relation Algorithm
NASA Technical Reports Server (NTRS)
Ferguson, Helaman R. P.; Bailey, Daivd H.; Kutler, Paul (Technical Monitor)
1998-01-01
Let x = (x1, x2...,xn be a vector of real numbers. X is said to possess an integer relation if there exist integers a(sub i) not all zero such that a1x1 + a2x2 + ... a(sub n)Xn = 0. Beginning in 1977 several algorithms (with proofs) have been discovered to recover the a(sub i) given x. The most efficient of these existing integer relation algorithms (in terms of run time and the precision required of the input) has the drawback of being very unstable numerically. It often requires a numeric precision level in the thousands of digits to reliably recover relations in modest-sized test problems. We present here a new algorithm for finding integer relations, which we have named the "PSLQ" algorithm. It is proved in this paper that the PSLQ algorithm terminates with a relation in a number of iterations that is bounded by a polynomial in it. Because this algorithm employs a numerically stable matrix reduction procedure, it is free from the numerical difficulties, that plague other integer relation algorithms. Furthermore, its stability admits an efficient implementation with lower run times oil average than other algorithms currently in Use. Finally, this stability can be used to prove that relation bounds obtained from computer runs using this algorithm are numerically accurate.
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows
NASA Technical Reports Server (NTRS)
Moitra, Stuti; Gatski, Thomas B.
1997-01-01
A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
Doha, E.H.; Abd-Elhameed, W.M.; Youssri, Y.H.
2014-01-01
Two families of certain nonsymmetric generalized Jacobi polynomials with negative integer indexes are employed for solving third- and fifth-order two point boundary value problems governed by homogeneous and nonhomogeneous boundary conditions using a dual Petrov–Galerkin method. The idea behind our method is to use trial functions satisfying the underlying boundary conditions of the differential equations and the test functions satisfying the dual boundary conditions. The resulting linear systems from the application of our method are specially structured and they can be efficiently inverted. The use of generalized Jacobi polynomials simplify the theoretical and numerical analysis of the method and also leads to accurate and efficient numerical algorithms. The presented numerical results indicate that the proposed numerical algorithms are reliable and very efficient. PMID:26425358
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures
2017-10-04
Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures . These
Multi-fidelity stochastic collocation method for computation of statistical moments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, Xueyu, E-mail: xueyu-zhu@uiowa.edu; Linebarger, Erin M., E-mail: aerinline@sci.utah.edu; Xiu, Dongbin, E-mail: xiu.16@osu.edu
We present an efficient numerical algorithm to approximate the statistical moments of stochastic problems, in the presence of models with different fidelities. The method extends the multi-fidelity approximation method developed in . By combining the efficiency of low-fidelity models and the accuracy of high-fidelity models, our method exhibits fast convergence with a limited number of high-fidelity simulations. We establish an error bound of the method and present several numerical examples to demonstrate the efficiency and applicability of the multi-fidelity algorithm.
On the efficient and reliable numerical solution of rate-and-state friction problems
NASA Astrophysics Data System (ADS)
Pipping, Elias; Kornhuber, Ralf; Rosenau, Matthias; Oncken, Onno
2016-03-01
We present a mathematically consistent numerical algorithm for the simulation of earthquake rupture with rate-and-state friction. Its main features are adaptive time stepping, a novel algebraic solution algorithm involving nonlinear multigrid and a fixed point iteration for the rate-and-state decoupling. The algorithm is applied to a laboratory scale subduction zone which allows us to compare our simulations with experimental results. Using physical parameters from the experiment, we find a good fit of recurrence time of slip events as well as their rupture width and peak slip. Computations in 3-D confirm efficiency and robustness of our algorithm.
Improving the Numerical Stability of Fast Matrix Multiplication
Ballard, Grey; Benson, Austin R.; Druinsky, Alex; ...
2016-10-04
Fast algorithms for matrix multiplication, namely those that perform asymptotically fewer scalar operations than the classical algorithm, have been considered primarily of theoretical interest. Apart from Strassen's original algorithm, few fast algorithms have been efficiently implemented or used in practical applications. However, there exist many practical alternatives to Strassen's algorithm with varying performance and numerical properties. Fast algorithms are known to be numerically stable, but because their error bounds are slightly weaker than the classical algorithm, they are not used even in cases where they provide a performance benefit. We argue in this study that the numerical sacrifice of fastmore » algorithms, particularly for the typical use cases of practical algorithms, is not prohibitive, and we explore ways to improve the accuracy both theoretically and empirically. The numerical accuracy of fast matrix multiplication depends on properties of the algorithm and of the input matrices, and we consider both contributions independently. We generalize and tighten previous error analyses of fast algorithms and compare their properties. We discuss algorithmic techniques for improving the error guarantees from two perspectives: manipulating the algorithms, and reducing input anomalies by various forms of diagonal scaling. In conclusion, we benchmark performance and demonstrate our improved numerical accuracy.« less
Computationally efficient multibody simulations
NASA Technical Reports Server (NTRS)
Ramakrishnan, Jayant; Kumar, Manoj
1994-01-01
Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and implementational standpoint. Order(n) approaches provide a new formulation of the equations of motion eliminating the assembly and numerical inversion of a system mass matrix as required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.
NASA Astrophysics Data System (ADS)
Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.
2017-12-01
This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
A numerical comparison of discrete Kalman filtering algorithms: An orbit determination case study
NASA Technical Reports Server (NTRS)
Thornton, C. L.; Bierman, G. J.
1976-01-01
The numerical stability and accuracy of various Kalman filter algorithms are thoroughly studied. Numerical results and conclusions are based on a realistic planetary approach orbit determination study. The case study results of this report highlight the numerical instability of the conventional and stabilized Kalman algorithms. Numerical errors associated with these algorithms can be so large as to obscure important mismodeling effects and thus give misleading estimates of filter accuracy. The positive result of this study is that the Bierman-Thornton U-D covariance factorization algorithm is computationally efficient, with CPU costs that differ negligibly from the conventional Kalman costs. In addition, accuracy of the U-D filter using single-precision arithmetic consistently matches the double-precision reference results. Numerical stability of the U-D filter is further demonstrated by its insensitivity of variations in the a priori statistics.
Progress on a Taylor weak statement finite element algorithm for high-speed aerodynamic flows
NASA Technical Reports Server (NTRS)
Baker, A. J.; Freels, J. D.
1989-01-01
A new finite element numerical Computational Fluid Dynamics (CFD) algorithm has matured to the point of efficiently solving two-dimensional high speed real-gas compressible flow problems in generalized coordinates on modern vector computer systems. The algorithm employs a Taylor Weak Statement classical Galerkin formulation, a variably implicit Newton iteration, and a tensor matrix product factorization of the linear algebra Jacobian under a generalized coordinate transformation. Allowing for a general two-dimensional conservation law system, the algorithm has been exercised on the Euler and laminar forms of the Navier-Stokes equations. Real-gas fluid properties are admitted, and numerical results verify solution accuracy, efficiency, and stability over a range of test problem parameters.
Optimization methods and silicon solar cell numerical models
NASA Technical Reports Server (NTRS)
Girardini, K.; Jacobsen, S. E.
1986-01-01
An optimization algorithm for use with numerical silicon solar cell models was developed. By coupling an optimization algorithm with a solar cell model, it is possible to simultaneously vary design variables such as impurity concentrations, front junction depth, back junction depth, and cell thickness to maximize the predicted cell efficiency. An optimization algorithm was developed and interfaced with the Solar Cell Analysis Program in 1 Dimension (SCAP1D). SCAP1D uses finite difference methods to solve the differential equations which, along with several relations from the physics of semiconductors, describe mathematically the performance of a solar cell. A major obstacle is that the numerical methods used in SCAP1D require a significant amount of computer time, and during an optimization the model is called iteratively until the design variables converge to the values associated with the maximum efficiency. This problem was alleviated by designing an optimization code specifically for use with numerically intensive simulations, to reduce the number of times the efficiency has to be calculated to achieve convergence to the optimal solution.
Andrianov, Alexey; Szabo, Aron; Sergeev, Alexander; Kim, Arkady; Chvykov, Vladimir; Kalashnikov, Mikhail
2016-11-14
We developed an improved approach to calculate the Fourier transform of signals with arbitrary large quadratic phase which can be efficiently implemented in numerical simulations utilizing Fast Fourier transform. The proposed algorithm significantly reduces the computational cost of Fourier transform of a highly chirped and stretched pulse by splitting it into two separate transforms of almost transform limited pulses, thereby reducing the required grid size roughly by a factor of the pulse stretching. The application of our improved Fourier transform algorithm in the split-step method for numerical modeling of CPA and OPCPA shows excellent agreement with standard algorithms.
NASA Astrophysics Data System (ADS)
Mahlmann, J. F.; Cerdá-Durán, P.; Aloy, M. A.
2018-07-01
The study of the electrodynamics of static, axisymmetric, and force-free Kerr magnetospheres relies vastly on solutions of the so-called relativistic Grad-Shafranov equation (GSE). Different numerical approaches to the solution of the GSE have been introduced in the literature, but none of them has been fully assessed from the numerical point of view in terms of efficiency and quality of the solutions found. We present a generalization of these algorithms and give a detailed background on the algorithmic implementation. We assess the numerical stability of the implemented algorithms and quantify the convergence of the presented methodology for the most established set-ups (split-monopole, paraboloidal, BH disc, uniform).
NASA Astrophysics Data System (ADS)
Mahlmann, J. F.; Cerdá-Durán, P.; Aloy, M. A.
2018-04-01
The study of the electrodynamics of static, axisymmetric and force-free Kerr magnetospheres relies vastly on solutions of the so called relativistic Grad-Shafranov equation (GSE). Different numerical approaches to the solution of the GSE have been introduced in the literature, but none of them has been fully assessed from the numerical point of view in terms of efficiency and quality of the solutions found. We present a generalization of these algorithms and give detailed background on the algorithmic implementation. We assess the numerical stability of the implemented algorithms and quantify the convergence of the presented methodology for the most established setups (split-monopole, paraboloidal, BH-disk, uniform).
The admissible portfolio selection problem with transaction costs and an improved PSO algorithm
NASA Astrophysics Data System (ADS)
Chen, Wei; Zhang, Wei-Guo
2010-05-01
In this paper, we discuss the portfolio selection problem with transaction costs under the assumption that there exist admissible errors on expected returns and risks of assets. We propose a new admissible efficient portfolio selection model and design an improved particle swarm optimization (PSO) algorithm because traditional optimization algorithms fail to work efficiently for our proposed problem. Finally, we offer a numerical example to illustrate the proposed effective approaches and compare the admissible portfolio efficient frontiers under different constraints.
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Chiou, Jin-Chern
1990-01-01
Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAE's) viewpoint. Constraint violations during the time integration process are minimized and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion, a two-stage staggered explicit-implicit numerical algorithm, are treated which takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, parallel implementation of the present constraint treatment techniques, the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAE's and the constraint treatment techniques were transformed into arrowhead matrices to which Schur complement form was derived. By fully exploiting the sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the systems equations written in Schur complement form. A software testbed was designed and implemented in both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speed up of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
Reconstructing householder vectors from Tall-Skinny QR
Ballard, Grey Malone; Demmel, James; Grigori, Laura; ...
2015-08-05
The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the standard Householder algorithm for QR decomposition of matrices with many more rows than columns. However, TSQR produces a different representation of the orthogonal factor and therefore requires more software development to support the new representation. Further, implicitly applying the orthogonal factor to the trailing matrix in the context of factoring a square matrix is more complicated and costly than with the Householder representation. We show how to perform TSQR and then reconstruct the Householder vector representation with the same asymptotic communication efficiency and little extra computational cost. We demonstratemore » the high performance and numerical stability of this algorithm both theoretically and empirically. The new Householder reconstruction algorithm allows us to design more efficient parallel QR algorithms, with significantly lower latency cost compared to Householder QR and lower bandwidth and latency costs compared with Communication-Avoiding QR (CAQR) algorithm. Experiments on supercomputers demonstrate the benefits of the communication cost improvements: in particular, our experiments show substantial improvements over tuned library implementations for tall-and-skinny matrices. Furthermore, we also provide algorithmic improvements to the Householder QR and CAQR algorithms, and we investigate several alternatives to the Householder reconstruction algorithm that sacrifice guarantees on numerical stability in some cases in order to obtain higher performance.« less
Hu, Shaoxing; Xu, Shike; Wang, Duhu; Zhang, Aiwu
2015-11-11
Aiming at addressing the problem of high computational cost of the traditional Kalman filter in SINS/GPS, a practical optimization algorithm with offline-derivation and parallel processing methods based on the numerical characteristics of the system is presented in this paper. The algorithm exploits the sparseness and/or symmetry of matrices to simplify the computational procedure. Thus plenty of invalid operations can be avoided by offline derivation using a block matrix technique. For enhanced efficiency, a new parallel computational mechanism is established by subdividing and restructuring calculation processes after analyzing the extracted "useful" data. As a result, the algorithm saves about 90% of the CPU processing time and 66% of the memory usage needed in a classical Kalman filter. Meanwhile, the method as a numerical approach needs no precise-loss transformation/approximation of system modules and the accuracy suffers little in comparison with the filter before computational optimization. Furthermore, since no complicated matrix theories are needed, the algorithm can be easily transplanted into other modified filters as a secondary optimization method to achieve further efficiency.
Improved transition path sampling methods for simulation of rare events
NASA Astrophysics Data System (ADS)
Chopra, Manan; Malshe, Rohit; Reddy, Allam S.; de Pablo, J. J.
2008-04-01
The free energy surfaces of a wide variety of systems encountered in physics, chemistry, and biology are characterized by the existence of deep minima separated by numerous barriers. One of the central aims of recent research in computational chemistry and physics has been to determine how transitions occur between deep local minima on rugged free energy landscapes, and transition path sampling (TPS) Monte-Carlo methods have emerged as an effective means for numerical investigation of such transitions. Many of the shortcomings of TPS-like approaches generally stem from their high computational demands. Two new algorithms are presented in this work that improve the efficiency of TPS simulations. The first algorithm uses biased shooting moves to render the sampling of reactive trajectories more efficient. The second algorithm is shown to substantially improve the accuracy of the transition state ensemble by introducing a subset of local transition path simulations in the transition state. The system considered in this work consists of a two-dimensional rough energy surface that is representative of numerous systems encountered in applications. When taken together, these algorithms provide gains in efficiency of over two orders of magnitude when compared to traditional TPS simulations.
NASA Technical Reports Server (NTRS)
Fiske, David R.
2004-01-01
In an earlier paper, Misner (2004, Class. Quant. Grav., 21, S243) presented a novel algorithm for computing the spherical harmonic components of data represented on a cubic grid. I extend Misner s original analysis by making detailed error estimates of the numerical errors accrued by the algorithm, by using symmetry arguments to suggest a more efficient implementation scheme, and by explaining how the algorithm can be applied efficiently on data with explicit reflection symmetries.
Haidar, Azzam; Jagode, Heike; Vaccaro, Phil; ...
2018-03-22
The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore howmore » different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haidar, Azzam; Jagode, Heike; Vaccaro, Phil
The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore howmore » different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.« less
Fast algorithm for bilinear transforms in optics
NASA Astrophysics Data System (ADS)
Ostrovsky, Andrey S.; Martinez-Niconoff, Gabriel C.; Ramos Romero, Obdulio; Cortes, Liliana
2000-10-01
The fast algorithm for calculating the bilinear transform in the optical system is proposed. This algorithm is based on the coherent-mode representation of the cross-spectral density function of the illumination. The algorithm is computationally efficient when the illumination is partially coherent. Numerical examples are studied and compared with the theoretical results.
Nash equilibrium and multi criterion aerodynamic optimization
NASA Astrophysics Data System (ADS)
Tang, Zhili; Zhang, Lianhe
2016-06-01
Game theory and its particular Nash Equilibrium (NE) are gaining importance in solving Multi Criterion Optimization (MCO) in engineering problems over the past decade. The solution of a MCO problem can be viewed as a NE under the concept of competitive games. This paper surveyed/proposed four efficient algorithms for calculating a NE of a MCO problem. Existence and equivalence of the solution are analyzed and proved in the paper based on fixed point theorem. Specific virtual symmetric Nash game is also presented to set up an optimization strategy for single objective optimization problems. Two numerical examples are presented to verify proposed algorithms. One is mathematical functions' optimization to illustrate detailed numerical procedures of algorithms, the other is aerodynamic drag reduction of civil transport wing fuselage configuration by using virtual game. The successful application validates efficiency of algorithms in solving complex aerodynamic optimization problem.
NASA Technical Reports Server (NTRS)
Steger, J. L.; Caradonna, F. X.
1980-01-01
An implicit finite difference procedure is developed to solve the unsteady full potential equation in conservation law form. Computational efficiency is maintained by use of approximate factorization techniques. The numerical algorithm is first order in time and second order in space. A circulation model and difference equations are developed for lifting airfoils in unsteady flow; however, thin airfoil body boundary conditions have been used with stretching functions to simplify the development of the numerical algorithm.
Density-matrix-based algorithm for solving eigenvalue problems
NASA Astrophysics Data System (ADS)
Polizzi, Eric
2009-03-01
A fast and stable numerical algorithm for solving the symmetric eigenvalue problem is presented. The technique deviates fundamentally from the traditional Krylov subspace iteration based techniques (Arnoldi and Lanczos algorithms) or other Davidson-Jacobi techniques and takes its inspiration from the contour integration and density-matrix representation in quantum mechanics. It will be shown that this algorithm—named FEAST—exhibits high efficiency, robustness, accuracy, and scalability on parallel architectures. Examples from electronic structure calculations of carbon nanotubes are presented, and numerical performances and capabilities are discussed.
Chaudhry, Jehanzeb Hameed; Estep, Don; Tavener, Simon; Carey, Varis; Sandelin, Jeff
2016-01-01
We consider numerical methods for initial value problems that employ a two stage approach consisting of solution on a relatively coarse discretization followed by solution on a relatively fine discretization. Examples include adaptive error control, parallel-in-time solution schemes, and efficient solution of adjoint problems for computing a posteriori error estimates. We describe a general formulation of two stage computations then perform a general a posteriori error analysis based on computable residuals and solution of an adjoint problem. The analysis accommodates various variations in the two stage computation and in formulation of the adjoint problems. We apply the analysis to compute "dual-weighted" a posteriori error estimates, to develop novel algorithms for efficient solution that take into account cancellation of error, and to the Parareal Algorithm. We test the various results using several numerical examples.
A frequency dependent preconditioned wavelet method for atmospheric tomography
NASA Astrophysics Data System (ADS)
Yudytskiy, Mykhaylo; Helin, Tapio; Ramlau, Ronny
2013-12-01
Atmospheric tomography, i.e. the reconstruction of the turbulence in the atmosphere, is a main task for the adaptive optics systems of the next generation telescopes. For extremely large telescopes, such as the European Extremely Large Telescope, this problem becomes overly complex and an efficient algorithm is needed to reduce numerical costs. Recently, a conjugate gradient method based on wavelet parametrization of turbulence layers was introduced [5]. An iterative algorithm can only be numerically efficient when the number of iterations required for a sufficient reconstruction is low. A way to achieve this is to design an efficient preconditioner. In this paper we propose a new frequency-dependent preconditioner for the wavelet method. In the context of a multi conjugate adaptive optics (MCAO) system simulated on the official end-to-end simulation tool OCTOPUS of the European Southern Observatory we demonstrate robustness and speed of the preconditioned algorithm. We show that three iterations are sufficient for a good reconstruction.
UDU/T/ covariance factorization for Kalman filtering
NASA Technical Reports Server (NTRS)
Thornton, C. L.; Bierman, G. J.
1980-01-01
There has been strong motivation to produce numerically stable formulations of the Kalman filter algorithms because it has long been known that the original discrete-time Kalman formulas are numerically unreliable. Numerical instability can be avoided by propagating certain factors of the estimate error covariance matrix rather than the covariance matrix itself. This paper documents filter algorithms that correspond to the covariance factorization P = UDU(T), where U is a unit upper triangular matrix and D is diagonal. Emphasis is on computational efficiency and numerical stability, since these properties are of key importance in real-time filter applications. The history of square-root and U-D covariance filters is reviewed. Simple examples are given to illustrate the numerical inadequacy of the Kalman covariance filter algorithms; these examples show how factorization techniques can give improved computational reliability.
Research on the control of large space structures
NASA Technical Reports Server (NTRS)
Denman, E. D.
1983-01-01
The research effort on the control of large space structures at the University of Houston has concentrated on the mathematical theory of finite-element models; identification of the mass, damping, and stiffness matrix; assignment of damping to structures; and decoupling of structure dynamics. The objective of the work has been and will continue to be the development of efficient numerical algorithms for analysis, control, and identification of large space structures. The major consideration in the development of the algorithms has been the large number of equations that must be handled by the algorithm as well as sensitivity of the algorithms to numerical errors.
Wang, Peng; Zhu, Zhouquan; Huang, Shuai
2013-01-01
This paper presents a novel biologically inspired metaheuristic algorithm called seven-spot ladybird optimization (SLO). The SLO is inspired by recent discoveries on the foraging behavior of a seven-spot ladybird. In this paper, the performance of the SLO is compared with that of the genetic algorithm, particle swarm optimization, and artificial bee colony algorithms by using five numerical benchmark functions with multimodality. The results show that SLO has the ability to find the best solution with a comparatively small population size and is suitable for solving optimization problems with lower dimensions.
Zhu, Zhouquan
2013-01-01
This paper presents a novel biologically inspired metaheuristic algorithm called seven-spot ladybird optimization (SLO). The SLO is inspired by recent discoveries on the foraging behavior of a seven-spot ladybird. In this paper, the performance of the SLO is compared with that of the genetic algorithm, particle swarm optimization, and artificial bee colony algorithms by using five numerical benchmark functions with multimodality. The results show that SLO has the ability to find the best solution with a comparatively small population size and is suitable for solving optimization problems with lower dimensions. PMID:24385879
Numerical Algorithms for Acoustic Integrals - The Devil is in the Details
NASA Technical Reports Server (NTRS)
Brentner, Kenneth S.
1996-01-01
The accurate prediction of the aeroacoustic field generated by aerospace vehicles or nonaerospace machinery is necessary for designers to control and reduce source noise. Powerful computational aeroacoustic methods, based on various acoustic analogies (primarily the Lighthill acoustic analogy) and Kirchhoff methods, have been developed for prediction of noise from complicated sources, such as rotating blades. Both methods ultimately predict the noise through a numerical evaluation of an integral formulation. In this paper, we consider three generic acoustic formulations and several numerical algorithms that have been used to compute the solutions to these formulations. Algorithms for retarded-time formulations are the most efficient and robust, but they are difficult to implement for supersonic-source motion. Collapsing-sphere and emission-surface formulations are good alternatives when supersonic-source motion is present, but the numerical implementations of these formulations are more computationally demanding. New algorithms - which utilize solution adaptation to provide a specified error level - are needed.
On Efficient Multigrid Methods for Materials Processing Flows with Small Particles
NASA Technical Reports Server (NTRS)
Thomas, James (Technical Monitor); Diskin, Boris; Harik, VasylMichael
2004-01-01
Multiscale modeling of materials requires simulations of multiple levels of structural hierarchy. The computational efficiency of numerical methods becomes a critical factor for simulating large physical systems with highly desperate length scales. Multigrid methods are known for their superior efficiency in representing/resolving different levels of physical details. The efficiency is achieved by employing interactively different discretizations on different scales (grids). To assist optimization of manufacturing conditions for materials processing with numerous particles (e.g., dispersion of particles, controlling flow viscosity and clusters), a new multigrid algorithm has been developed for a case of multiscale modeling of flows with small particles that have various length scales. The optimal efficiency of the algorithm is crucial for accurate predictions of the effect of processing conditions (e.g., pressure and velocity gradients) on the local flow fields that control the formation of various microstructures or clusters.
Abd-Elhameed, Waleed M.; Doha, Eid H.; Bassuony, Mahmoud A.
2014-01-01
Two numerical algorithms based on dual-Petrov-Galerkin method are developed for solving the integrated forms of high odd-order boundary value problems (BVPs) governed by homogeneous and nonhomogeneous boundary conditions. Two different choices of trial functions and test functions which satisfy the underlying boundary conditions of the differential equations and the dual boundary conditions are used for this purpose. These choices lead to linear systems with specially structured matrices that can be efficiently inverted, hence greatly reducing the cost. The various matrix systems resulting from these discretizations are carefully investigated, especially their complexities and their condition numbers. Numerical results are given to illustrate the efficiency of the proposed algorithms, and some comparisons with some other methods are made. PMID:24616620
NASA Technical Reports Server (NTRS)
Pflaum, Christoph
1996-01-01
A multilevel algorithm is presented that solves general second order elliptic partial differential equations on adaptive sparse grids. The multilevel algorithm consists of several V-cycles. Suitable discretizations provide that the discrete equation system can be solved in an efficient way. Numerical experiments show a convergence rate of order Omicron(1) for the multilevel algorithm.
Algorithms for the Fractional Calculus: A Selection of Numerical Methods
NASA Technical Reports Server (NTRS)
Diethelm, K.; Ford, N. J.; Freed, A. D.; Luchko, Yu.
2003-01-01
Many recently developed models in areas like viscoelasticity, electrochemistry, diffusion processes, etc. are formulated in terms of derivatives (and integrals) of fractional (non-integer) order. In this paper we present a collection of numerical algorithms for the solution of the various problems arising in this context. We believe that this will give the engineer the necessary tools required to work with fractional models in an efficient way.
NASA Astrophysics Data System (ADS)
Doha, E. H.; Abd-Elhameed, W. M.; Bassuony, M. A.
2013-03-01
This paper is concerned with spectral Galerkin algorithms for solving high even-order two point boundary value problems in one dimension subject to homogeneous and nonhomogeneous boundary conditions. The proposed algorithms are extended to solve two-dimensional high even-order differential equations. The key to the efficiency of these algorithms is to construct compact combinations of Chebyshev polynomials of the third and fourth kinds as basis functions. The algorithms lead to linear systems with specially structured matrices that can be efficiently inverted. Numerical examples are included to demonstrate the validity and applicability of the proposed algorithms, and some comparisons with some other methods are made.
NASA Astrophysics Data System (ADS)
Schwarz, Karsten; Rieger, Heiko
2013-03-01
We present an efficient Monte Carlo method to simulate reaction-diffusion processes with spatially varying particle annihilation or transformation rates as it occurs for instance in the context of motor-driven intracellular transport. Like Green's function reaction dynamics and first-passage time methods, our algorithm avoids small diffusive hops by propagating sufficiently distant particles in large hops to the boundaries of protective domains. Since for spatially varying annihilation or transformation rates the single particle diffusion propagator is not known analytically, we present an algorithm that generates efficiently either particle displacements or annihilations with the correct statistics, as we prove rigorously. The numerical efficiency of the algorithm is demonstrated with an illustrative example.
NASA Astrophysics Data System (ADS)
Bu, Sunyoung; Huang, Jingfang; Boyer, Treavor H.; Miller, Cass T.
2010-07-01
The focus of this work is on the modeling of an ion exchange process that occurs in drinking water treatment applications. The model formulation consists of a two-scale model in which a set of microscale diffusion equations representing ion exchange resin particles that vary in size and age are coupled through a boundary condition with a macroscopic ordinary differential equation (ODE), which represents the concentration of a species in a well-mixed reactor. We introduce a new age-averaged model (AAM) that averages all ion exchange particle ages for a given size particle to avoid the expensive Monte-Carlo simulation associated with previous modeling applications. We discuss two different numerical schemes to approximate both the original Monte-Carlo algorithm and the new AAM for this two-scale problem. The first scheme is based on the finite element formulation in space coupled with an existing backward difference formula-based ODE solver in time. The second scheme uses an integral equation based Krylov deferred correction (KDC) method and a fast elliptic solver (FES) for the resulting elliptic equations. Numerical results are presented to validate the new AAM algorithm, which is also shown to be more computationally efficient than the original Monte-Carlo algorithm. We also demonstrate that the higher order KDC scheme is more efficient than the traditional finite element solution approach and this advantage becomes increasingly important as the desired accuracy of the solution increases. We also discuss issues of smoothness, which affect the efficiency of the KDC-FES approach, and outline additional algorithmic changes that would further improve the efficiency of these developing methods for a wide range of applications.
A multistage time-stepping scheme for the Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Swanson, R. C.; Turkel, E.
1985-01-01
A class of explicit multistage time-stepping schemes is used to construct an algorithm for solving the compressible Navier-Stokes equations. Flexibility in treating arbitrary geometries is obtained with a finite-volume formulation. Numerical efficiency is achieved by employing techniques for accelerating convergence to steady state. Computer processing is enhanced through vectorization of the algorithm. The scheme is evaluated by solving laminar and turbulent flows over a flat plate and an NACA 0012 airfoil. Numerical results are compared with theoretical solutions or other numerical solutions and/or experimental data.
Coverage-maximization in networks under resource constraints.
Nandi, Subrata; Brusch, Lutz; Deutsch, Andreas; Ganguly, Niloy
2010-06-01
Efficient coverage algorithms are essential for information search or dispersal in all kinds of networks. We define an extended coverage problem which accounts for constrained resources of consumed bandwidth B and time T . Our solution to the network challenge is here studied for regular grids only. Using methods from statistical mechanics, we develop a coverage algorithm with proliferating message packets and temporally modulated proliferation rate. The algorithm performs as efficiently as a single random walker but O(B(d-2)/d) times faster, resulting in significant service speed-up on a regular grid of dimension d . The algorithm is numerically compared to a class of generalized proliferating random walk strategies and on regular grids shown to perform best in terms of the product metric of speed and efficiency.
Simultaneous and semi-alternating projection algorithms for solving split equality problems.
Dong, Qiao-Li; Jiang, Dan
2018-01-01
In this article, we first introduce two simultaneous projection algorithms for solving the split equality problem by using a new choice of the stepsize, and then propose two semi-alternating projection algorithms. The weak convergence of the proposed algorithms is analyzed under standard conditions. As applications, we extend the results to solve the split feasibility problem. Finally, a numerical example is presented to illustrate the efficiency and advantage of the proposed algorithms.
Algorithm Diversity for Resilent Systems
2016-06-27
data structures. 15. SUBJECT TERMS computer security, software diversity, program transformation 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF 18...systematic method for transforming Datalog rules with general universal and existential quantification into efficient algorithms with precise complexity...worst case in the size of the ground rules. There are numerous choices during the transformation that lead to diverse algorithms and different
NASA Astrophysics Data System (ADS)
Leal, Allan M. M.; Kulik, Dmitrii A.; Kosakowski, Georg
2016-02-01
We present a numerical method for multiphase chemical equilibrium calculations based on a Gibbs energy minimization approach. The method can accurately and efficiently determine the stable phase assemblage at equilibrium independently of the type of phases and species that constitute the chemical system. We have successfully applied our chemical equilibrium algorithm in reactive transport simulations to demonstrate its effective use in computationally intensive applications. We used FEniCS to solve the governing partial differential equations of mass transport in porous media using finite element methods in unstructured meshes. Our equilibrium calculations were benchmarked with GEMS3K, the numerical kernel of the geochemical package GEMS. This allowed us to compare our results with a well-established Gibbs energy minimization algorithm, as well as their performance on every mesh node, at every time step of the transport simulation. The benchmark shows that our novel chemical equilibrium algorithm is accurate, robust, and efficient for reactive transport applications, and it is an improvement over the Gibbs energy minimization algorithm used in GEMS3K. The proposed chemical equilibrium method has been implemented in Reaktoro, a unified framework for modeling chemically reactive systems, which is now used as an alternative numerical kernel of GEMS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue
We present two efficient iterative algorithms for solving the linear response eigen- value problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into a product eigenvalue problem that is self-adjoint with respect to a K-inner product. This product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-innermore » product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. However, the other component of the eigenvector can be easily recovered in a postprocessing procedure. Therefore, the algorithms we present here are more efficient than existing algorithms that try to approximate both components of the eigenvectors simultaneously. The efficiency of the new algorithms is demonstrated by numerical examples.« less
e-DMDAV: A new privacy preserving algorithm for wearable enterprise information systems
NASA Astrophysics Data System (ADS)
Zhang, Zhenjiang; Wang, Xiaoni; Uden, Lorna; Zhang, Peng; Zhao, Yingsi
2018-04-01
Wearable devices have been widely used in many fields to improve the quality of people's lives. More and more data on individuals and businesses are collected by statistical organizations though those devices. Almost all of this data holds confidential information. Statistical Disclosure Control (SDC) seeks to protect statistical data in such a way that it can be released without giving away confidential information that can be linked to specific individuals or entities. The MDAV (Maximum Distance to Average Vector) algorithm is an efficient micro-aggregation algorithm belonging to SDC. However, the MDAV algorithm cannot survive homogeneity and background knowledge attacks because it was designed for static numerical data. This paper proposes a systematic dynamic-updating anonymity algorithm based on MDAV called the e-DMDAV algorithm. This algorithm introduces a new parameter and a table to ensure that the k records in one cluster with the range of the distinct values in each cluster is no less than e for numerical and non-numerical datasets. This new algorithm has been evaluated and compared with the MDAV algorithm. The simulation results show that the new algorithm outperforms MDAV in terms of minimizing distortion and disclosure risk with a similar computational cost.
Improvement of Speckle Contrast Image Processing by an Efficient Algorithm.
Steimers, A; Farnung, W; Kohl-Bareis, M
2016-01-01
We demonstrate an efficient algorithm for the temporal and spatial based calculation of speckle contrast for the imaging of blood flow by laser speckle contrast analysis (LASCA). It reduces the numerical complexity of necessary calculations, facilitates a multi-core and many-core implementation of the speckle analysis and enables an independence of temporal or spatial resolution and SNR. The new algorithm was evaluated for both spatial and temporal based analysis of speckle patterns with different image sizes and amounts of recruited pixels as sequential, multi-core and many-core code.
Parareal in time 3D numerical solver for the LWR Benchmark neutron diffusion transient model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baudron, Anne-Marie, E-mail: anne-marie.baudron@cea.fr; CEA-DRN/DMT/SERMA, CEN-Saclay, 91191 Gif sur Yvette Cedex; Lautard, Jean-Jacques, E-mail: jean-jacques.lautard@cea.fr
2014-12-15
In this paper we present a time-parallel algorithm for the 3D neutrons calculation of a transient model in a nuclear reactor core. The neutrons calculation consists in numerically solving the time dependent diffusion approximation equation, which is a simplified transport equation. The numerical resolution is done with finite elements method based on a tetrahedral meshing of the computational domain, representing the reactor core, and time discretization is achieved using a θ-scheme. The transient model presents moving control rods during the time of the reaction. Therefore, cross-sections (piecewise constants) are taken into account by interpolations with respect to the velocity ofmore » the control rods. The parallelism across the time is achieved by an adequate use of the parareal in time algorithm to the handled problem. This parallel method is a predictor corrector scheme that iteratively combines the use of two kinds of numerical propagators, one coarse and one fine. Our method is made efficient by means of a coarse solver defined with large time step and fixed position control rods model, while the fine propagator is assumed to be a high order numerical approximation of the full model. The parallel implementation of our method provides a good scalability of the algorithm. Numerical results show the efficiency of the parareal method on large light water reactor transient model corresponding to the Langenbuch–Maurer–Werner benchmark.« less
Compression of multispectral Landsat imagery using the Embedded Zerotree Wavelet (EZW) algorithm
NASA Technical Reports Server (NTRS)
Shapiro, Jerome M.; Martucci, Stephen A.; Czigler, Martin
1994-01-01
The Embedded Zerotree Wavelet (EZW) algorithm has proven to be an extremely efficient and flexible compression algorithm for low bit rate image coding. The embedding algorithm attempts to order the bits in the bit stream in numerical importance and thus a given code contains all lower rate encodings of the same algorithm. Therefore, precise bit rate control is achievable and a target rate or distortion metric can be met exactly. Furthermore, the technique is fully image adaptive. An algorithm for multispectral image compression which combines the spectral redundancy removal properties of the image-dependent Karhunen-Loeve Transform (KLT) with the efficiency, controllability, and adaptivity of the embedded zerotree wavelet algorithm is presented. Results are shown which illustrate the advantage of jointly encoding spectral components using the KLT and EZW.
A proximity algorithm accelerated by Gauss-Seidel iterations for L1/TV denoising models
NASA Astrophysics Data System (ADS)
Li, Qia; Micchelli, Charles A.; Shen, Lixin; Xu, Yuesheng
2012-09-01
Our goal in this paper is to improve the computational performance of the proximity algorithms for the L1/TV denoising model. This leads us to a new characterization of all solutions to the L1/TV model via fixed-point equations expressed in terms of the proximity operators. Based upon this observation we develop an algorithm for solving the model and establish its convergence. Furthermore, we demonstrate that the proposed algorithm can be accelerated through the use of the componentwise Gauss-Seidel iteration so that the CPU time consumed is significantly reduced. Numerical experiments using the proposed algorithm for impulsive noise removal are included, with a comparison to three recently developed algorithms. The numerical results show that while the proposed algorithm enjoys a high quality of the restored images, as the other three known algorithms do, it performs significantly better in terms of computational efficiency measured in the CPU time consumed.
The minimal residual QR-factorization algorithm for reliably solving subset regression problems
NASA Technical Reports Server (NTRS)
Verhaegen, M. H.
1987-01-01
A new algorithm to solve test subset regression problems is described, called the minimal residual QR factorization algorithm (MRQR). This scheme performs a QR factorization with a new column pivoting strategy. Basically, this strategy is based on the change in the residual of the least squares problem. Furthermore, it is demonstrated that this basic scheme might be extended in a numerically efficient way to combine the advantages of existing numerical procedures, such as the singular value decomposition, with those of more classical statistical procedures, such as stepwise regression. This extension is presented as an advisory expert system that guides the user in solving the subset regression problem. The advantages of the new procedure are highlighted by a numerical example.
Xing, Z F; Greenberg, J M
1994-08-20
The analyticity of the complex extinction efficiency is examined numerically in the size-parameter domain for homogeneous prolate and oblate spheroids and finite cylinders. The T-matrix code, which is the most efficient program available to date, is employed to calculate the individual particle-extinction efficiencies. Because of its computational limitations in the size-parameter range, a slightly modified Hilbert-transform algorithm is required to establish the analyticity numerically. The findings concerning analyticity that we reported for spheres (Astrophys. J. 399, 164-175, 1992) apply equally to these nonspherical particles.
NASA Astrophysics Data System (ADS)
Karami, Fahd; Ziad, Lamia; Sadik, Khadija
2017-12-01
In this paper, we focus on a numerical method of a problem called the Perona-Malik inequality which we use for image denoising. This model is obtained as the limit of the Perona-Malik model and the p-Laplacian operator with p→ ∞. In Atlas et al., (Nonlinear Anal. Real World Appl 18:57-68, 2014), the authors have proved the existence and uniqueness of the solution of the proposed model. However, in their work, they used the explicit numerical scheme for approximated problem which is strongly dependent to the parameter p. To overcome this, we use in this work an efficient algorithm which is a combination of the classical additive operator splitting and a nonlinear relaxation algorithm. At last, we have presented the experimental results in image filtering show, which demonstrate the efficiency and effectiveness of our algorithm and finally, we have compared it with the previous scheme presented in Atlas et al., (Nonlinear Anal. Real World Appl 18:57-68, 2014).
Downdating a time-varying square root information filter
NASA Technical Reports Server (NTRS)
Muellerschoen, Ronald J.
1990-01-01
A new method to efficiently downdate an estimate and covariance generated by a discrete time Square Root Information Filter (SRIF) is presented. The method combines the QR factor downdating algorithm of Gill and the decentralized SRIF algorithm of Bierman. Efficient removal of either measurements or a priori information is possible without loss of numerical integrity. Moreover, the method includes features for detecting potential numerical degradation. Performance on a 300 parameter system with 5800 data points shows that the method can be used in real time and hence is a promising tool for interactive data analysis. Additionally, updating a time-varying SRIF filter with either additional measurements or a priori information proceeds analogously.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, Youngjoon, E-mail: hongy@uic.edu; Nicholls, David P., E-mail: davidn@uic.edu
The accurate numerical simulation of linear waves interacting with periodic layered media is a crucial capability in engineering applications. In this contribution we study the stable and high-order accurate numerical simulation of the interaction of linear, time-harmonic waves with a periodic, triply layered medium with irregular interfaces. In contrast with volumetric approaches, High-Order Perturbation of Surfaces (HOPS) algorithms are inexpensive interfacial methods which rapidly and recursively estimate scattering returns by perturbation of the interface shape. In comparison with Boundary Integral/Element Methods, the stable HOPS algorithm we describe here does not require specialized quadrature rules, periodization strategies, or the solution ofmore » dense non-symmetric positive definite linear systems. In addition, the algorithm is provably stable as opposed to other classical HOPS approaches. With numerical experiments we show the remarkable efficiency, fidelity, and accuracy one can achieve with an implementation of this algorithm.« less
Computing Project, Marc develops high-fidelity turbulence models to enhance simulation accuracy and efficient numerical algorithms for future high performance computing hardware architectures. Research Interests High performance computing High order numerical methods for computational fluid dynamics Fluid
An improved conjugate gradient scheme to the solution of least squares SVM.
Chu, Wei; Ong, Chong Jin; Keerthi, S Sathiya
2005-03-01
The least square support vector machines (LS-SVM) formulation corresponds to the solution of a linear system of equations. Several approaches to its numerical solutions have been proposed in the literature. In this letter, we propose an improved method to the numerical solution of LS-SVM and show that the problem can be solved using one reduced system of linear equations. Compared with the existing algorithm for LS-SVM, the approach used in this letter is about twice as efficient. Numerical results using the proposed method are provided for comparisons with other existing algorithms.
PolyPole-1: An accurate numerical algorithm for intra-granular fission gas release
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pizzocri, D.; Rabiti, C.; Luzzi, L.
2016-09-01
This paper describes the development of a new numerical algorithm (called PolyPole-1) to efficiently solve the equation for intra-granular fission gas release in nuclear fuel. The work was carried out in collaboration with Politecnico di Milano and Institute for Transuranium Elements. The PolyPole-1 algorithms is being implemented in INL's fuels code BISON code as part of BISON's fission gas release model. The transport of fission gas from within the fuel grains to the grain boundaries (intra-granular fission gas release) is a fundamental controlling mechanism of fission gas release and gaseous swelling in nuclear fuel. Hence, accurate numerical solution of themore » corresponding mathematical problem needs to be included in fission gas behaviour models used in fuel performance codes. Under the assumption of equilibrium between trapping and resolution, the process can be described mathematically by a single diffusion equation for the gas atom concentration in a grain. In this work, we propose a new numerical algorithm (PolyPole-1) to efficiently solve the fission gas diffusion equation in time-varying conditions. The PolyPole-1 algorithm is based on the analytic modal solution of the diffusion equation for constant conditions, with the addition of polynomial corrective terms that embody the information on the deviation from constant conditions. The new algorithm is verified by comparing the results to a finite difference solution over a large number of randomly generated operation histories. Furthermore, comparison to state-of-the-art algorithms used in fuel performance codes demonstrates that the accuracy of the PolyPole-1 solution is superior to other algorithms, with similar computational effort. Finally, the concept of PolyPole-1 may be extended to the solution of the general problem of intra-granular fission gas diffusion during non-equilibrium trapping and resolution, which will be the subject of future work.« less
NASA Astrophysics Data System (ADS)
Stellmach, Stephan; Hansen, Ulrich
2008-05-01
Numerical simulations of the process of convection and magnetic field generation in planetary cores still fail to reach geophysically realistic control parameter values. Future progress in this field depends crucially on efficient numerical algorithms which are able to take advantage of the newest generation of parallel computers. Desirable features of simulation algorithms include (1) spectral accuracy, (2) an operation count per time step that is small and roughly proportional to the number of grid points, (3) memory requirements that scale linear with resolution, (4) an implicit treatment of all linear terms including the Coriolis force, (5) the ability to treat all kinds of common boundary conditions, and (6) reasonable efficiency on massively parallel machines with tens of thousands of processors. So far, algorithms for fully self-consistent dynamo simulations in spherical shells do not achieve all these criteria simultaneously, resulting in strong restrictions on the possible resolutions. In this paper, we demonstrate that local dynamo models in which the process of convection and magnetic field generation is only simulated for a small part of a planetary core in Cartesian geometry can achieve the above goal. We propose an algorithm that fulfills the first five of the above criteria and demonstrate that a model implementation of our method on an IBM Blue Gene/L system scales impressively well for up to O(104) processors. This allows for numerical simulations at rather extreme parameter values.
NASA Astrophysics Data System (ADS)
Kazemzadeh Azad, Saeid
2018-01-01
In spite of considerable research work on the development of efficient algorithms for discrete sizing optimization of steel truss structures, only a few studies have addressed non-algorithmic issues affecting the general performance of algorithms. For instance, an important question is whether starting the design optimization from a feasible solution is fruitful or not. This study is an attempt to investigate the effect of seeding the initial population with feasible solutions on the general performance of metaheuristic techniques. To this end, the sensitivity of recently proposed metaheuristic algorithms to the feasibility of initial candidate designs is evaluated through practical discrete sizing of real-size steel truss structures. The numerical experiments indicate that seeding the initial population with feasible solutions can improve the computational efficiency of metaheuristic structural optimization algorithms, especially in the early stages of the optimization. This paves the way for efficient metaheuristic optimization of large-scale structural systems.
Optimal Energy Efficiency Fairness of Nodes in Wireless Powered Communication Networks.
Zhang, Jing; Zhou, Qingjie; Ng, Derrick Wing Kwan; Jo, Minho
2017-09-15
In wireless powered communication networks (WPCNs), it is essential to research energy efficiency fairness in order to evaluate the balance of nodes for receiving information and harvesting energy. In this paper, we propose an efficient iterative algorithm for optimal energy efficiency proportional fairness in WPCN. The main idea is to use stochastic geometry to derive the mean proportionally fairness utility function with respect to user association probability and receive threshold. Subsequently, we prove that the relaxed proportionally fairness utility function is a concave function for user association probability and receive threshold, respectively. At the same time, a sub-optimal algorithm by exploiting alternating optimization approach is proposed. Through numerical simulations, we demonstrate that our sub-optimal algorithm can obtain a result close to optimal energy efficiency proportional fairness with significant reduction of computational complexity.
Optimal Energy Efficiency Fairness of Nodes in Wireless Powered Communication Networks
Zhou, Qingjie; Ng, Derrick Wing Kwan; Jo, Minho
2017-01-01
In wireless powered communication networks (WPCNs), it is essential to research energy efficiency fairness in order to evaluate the balance of nodes for receiving information and harvesting energy. In this paper, we propose an efficient iterative algorithm for optimal energy efficiency proportional fairness in WPCN. The main idea is to use stochastic geometry to derive the mean proportionally fairness utility function with respect to user association probability and receive threshold. Subsequently, we prove that the relaxed proportionally fairness utility function is a concave function for user association probability and receive threshold, respectively. At the same time, a sub-optimal algorithm by exploiting alternating optimization approach is proposed. Through numerical simulations, we demonstrate that our sub-optimal algorithm can obtain a result close to optimal energy efficiency proportional fairness with significant reduction of computational complexity. PMID:28914818
NASA Astrophysics Data System (ADS)
Wichert, Viktoria; Arkenberg, Mario; Hauschildt, Peter H.
2016-10-01
Highly resolved state-of-the-art 3D atmosphere simulations will remain computationally extremely expensive for years to come. In addition to the need for more computing power, rethinking coding practices is necessary. We take a dual approach by introducing especially adapted, parallel numerical methods and correspondingly parallelizing critical code passages. In the following, we present our respective work on PHOENIX/3D. With new parallel numerical algorithms, there is a big opportunity for improvement when iteratively solving the system of equations emerging from the operator splitting of the radiative transfer equation J = ΛS. The narrow-banded approximate Λ-operator Λ* , which is used in PHOENIX/3D, occurs in each iteration step. By implementing a numerical algorithm which takes advantage of its characteristic traits, the parallel code's efficiency is further increased and a speed-up in computational time can be achieved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pastore, Giovanni; Rabiti, Cristian; Pizzocri, Davide
PolyPole is a numerical algorithm for the calculation of intra-granular fission gas release. In particular, the algorithm solves the gas diffusion problem in a fuel grain in time-varying conditions. The program has been extensively tested. PolyPole combines a high accuracy with a high computational efficiency and is ideally suited for application in fuel performance codes.
New Parallel Algorithms for Structural Analysis and Design of Aerospace Structures
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.
1998-01-01
Subspace and Lanczos iterations have been developed, well documented, and widely accepted as efficient methods for obtaining p-lowest eigen-pair solutions of large-scale, practical engineering problems. The focus of this paper is to incorporate recent developments in vectorized sparse technologies in conjunction with Subspace and Lanczos iterative algorithms for computational enhancements. Numerical performance, in terms of accuracy and efficiency of the proposed sparse strategies for Subspace and Lanczos algorithm, is demonstrated by solving for the lowest frequencies and mode shapes of structural problems on the IBM-R6000/590 and SunSparc 20 workstations.
Pernice, W H; Payne, F P; Gallagher, D F
2007-09-03
We present a novel numerical scheme for the simulation of the field enhancement by metal nano-particles in the time domain. The algorithm is based on a combination of the finite-difference time-domain method and the pseudo-spectral time-domain method for dispersive materials. The hybrid solver leads to an efficient subgridding algorithm that does not suffer from spurious field spikes as do FDTD schemes. Simulation of the field enhancement by gold particles shows the expected exponential field profile. The enhancement factors are computed for single particles and particle arrays. Due to the geometry conforming mesh the algorithm is stable for long integration times and thus suitable for the simulation of resonance phenomena in coupled nano-particle structures.
Adaptive recurrence quantum entanglement distillation for two-Kraus-operator channels
NASA Astrophysics Data System (ADS)
Ruan, Liangzhong; Dai, Wenhan; Win, Moe Z.
2018-05-01
Quantum entanglement serves as a valuable resource for many important quantum operations. A pair of entangled qubits can be shared between two agents by first preparing a maximally entangled qubit pair at one agent, and then sending one of the qubits to the other agent through a quantum channel. In this process, the deterioration of entanglement is inevitable since the noise inherent in the channel contaminates the qubit. To address this challenge, various quantum entanglement distillation (QED) algorithms have been developed. Among them, recurrence algorithms have advantages in terms of implementability and robustness. However, the efficiency of recurrence QED algorithms has not been investigated thoroughly in the literature. This paper puts forth two recurrence QED algorithms that adapt to the quantum channel to tackle the efficiency issue. The proposed algorithms have guaranteed convergence for quantum channels with two Kraus operators, which include phase-damping and amplitude-damping channels. Analytical results show that the convergence speed of these algorithms is improved from linear to quadratic and one of the algorithms achieves the optimal speed. Numerical results confirm that the proposed algorithms significantly improve the efficiency of QED.
A fast marching algorithm for the factored eikonal equation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Treister, Eran, E-mail: erantreister@gmail.com; Haber, Eldad, E-mail: haber@math.ubc.ca; Department of Mathematics, The University of British Columbia, Vancouver, BC
The eikonal equation is instrumental in many applications in several fields ranging from computer vision to geoscience. This equation can be efficiently solved using the iterative Fast Sweeping (FS) methods and the direct Fast Marching (FM) methods. However, when used for a point source, the original eikonal equation is known to yield inaccurate numerical solutions, because of a singularity at the source. In this case, the factored eikonal equation is often preferred, and is known to yield a more accurate numerical solution. One application that requires the solution of the eikonal equation for point sources is travel time tomography. Thismore » inverse problem may be formulated using the eikonal equation as a forward problem. While this problem has been solved using FS in the past, the more recent choice for applying it involves FM methods because of the efficiency in which sensitivities can be obtained using them. However, while several FS methods are available for solving the factored equation, the FM method is available only for the original eikonal equation. In this paper we develop a Fast Marching algorithm for the factored eikonal equation, using both first and second order finite-difference schemes. Our algorithm follows the same lines as the original FM algorithm and requires the same computational effort. In addition, we show how to obtain sensitivities using this FM method and apply travel time tomography, formulated as an inverse factored eikonal equation. Numerical results in two and three dimensions show that our algorithm solves the factored eikonal equation efficiently, and demonstrate the achieved accuracy for computing the travel time. We also demonstrate a recovery of a 2D and 3D heterogeneous medium by travel time tomography using the eikonal equation for forward modeling and inversion by Gauss–Newton.« less
Efficient calculation of higher-order optical waveguide dispersion.
Mores, J A; Malheiros-Silveira, G N; Fragnito, H L; Hernández-Figueroa, H E
2010-09-13
An efficient numerical strategy to compute the higher-order dispersion parameters of optical waveguides is presented. For the first time to our knowledge, a systematic study of the errors involved in the higher-order dispersions' numerical calculation process is made, showing that the present strategy can accurately model those parameters. Such strategy combines a full-vectorial finite element modal solver and a proper finite difference differentiation algorithm. Its performance has been carefully assessed through the analysis of several key geometries. In addition, the optimization of those higher-order dispersion parameters can also be carried out by coupling to the present scheme a genetic algorithm, as shown here through the design of a photonic crystal fiber suitable for parametric amplification applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Havu, V.; Fritz Haber Institute of the Max Planck Society, Berlin; Blum, V.
2009-12-01
We consider the problem of developing O(N) scaling grid-based operations needed in many central operations when performing electronic structure calculations with numeric atom-centered orbitals as basis functions. We outline the overall formulation of localized algorithms, and specifically the creation of localized grid batches. The choice of the grid partitioning scheme plays an important role in the performance and memory consumption of the grid-based operations. Three different top-down partitioning methods are investigated, and compared with formally more rigorous yet much more expensive bottom-up algorithms. We show that a conceptually simple top-down grid partitioning scheme achieves essentially the same efficiency as themore » more rigorous bottom-up approaches.« less
Dynamical analysis of the avian-human influenza epidemic model using the semi-analytical method
NASA Astrophysics Data System (ADS)
Jabbari, Azizeh; Kheiri, Hossein; Bekir, Ahmet
2015-03-01
In this work, we present a dynamic behavior of the avian-human influenza epidemic model by using efficient computational algorithm, namely the multistage differential transform method(MsDTM). The MsDTM is used here as an algorithm for approximating the solutions of the avian-human influenza epidemic model in a sequence of time intervals. In order to show the efficiency of the method, the obtained numerical results are compared with the fourth-order Runge-Kutta method (RK4M) and differential transform method(DTM) solutions. It is shown that the MsDTM has the advantage of giving an analytical form of the solution within each time interval which is not possible in purely numerical techniques like RK4M.
Efficient field-theoretic simulation of polymer solutions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Villet, Michael C.; Fredrickson, Glenn H., E-mail: ghf@mrl.ucsb.edu; Department of Materials, University of California, Santa Barbara, California 93106
2014-12-14
We present several developments that facilitate the efficient field-theoretic simulation of polymers by complex Langevin sampling. A regularization scheme using finite Gaussian excluded volume interactions is used to derive a polymer solution model that appears free of ultraviolet divergences and hence is well-suited for lattice-discretized field theoretic simulation. We show that such models can exhibit ultraviolet sensitivity, a numerical pathology that dramatically increases sampling error in the continuum lattice limit, and further show that this pathology can be eliminated by appropriate model reformulation by variable transformation. We present an exponential time differencing algorithm for integrating complex Langevin equations for fieldmore » theoretic simulation, and show that the algorithm exhibits excellent accuracy and stability properties for our regularized polymer model. These developments collectively enable substantially more efficient field-theoretic simulation of polymers, and illustrate the importance of simultaneously addressing analytical and numerical pathologies when implementing such computations.« less
AN ACCURATE AND EFFICIENT ALGORITHM FOR NUMERICAL SIMULATION OF CONDUCTION-TYPE PROBLEMS. (R824801)
A modification of the finite analytic numerical method for conduction-type (diffusion) problems is presented. The finite analytic discretization scheme is derived by means of the Fourier series expansion for the most general case of nonuniform grid and variabl...
Wavelet-based Adaptive Mesh Refinement Method for Global Atmospheric Chemical Transport Modeling
NASA Astrophysics Data System (ADS)
Rastigejev, Y.
2011-12-01
Numerical modeling of global atmospheric chemical transport presents enormous computational difficulties, associated with simulating a wide range of time and spatial scales. The described difficulties are exacerbated by the fact that hundreds of chemical species and thousands of chemical reactions typically are used for chemical kinetic mechanism description. These computational requirements very often forces researches to use relatively crude quasi-uniform numerical grids with inadequate spatial resolution that introduces significant numerical diffusion into the system. It was shown that this spurious diffusion significantly distorts the pollutant mixing and transport dynamics for typically used grid resolution. The described numerical difficulties have to be systematically addressed considering that the demand for fast, high-resolution chemical transport models will be exacerbated over the next decade by the need to interpret satellite observations of tropospheric ozone and related species. In this study we offer dynamically adaptive multilevel Wavelet-based Adaptive Mesh Refinement (WAMR) method for numerical modeling of atmospheric chemical evolution equations. The adaptive mesh refinement is performed by adding and removing finer levels of resolution in the locations of fine scale development and in the locations of smooth solution behavior accordingly. The algorithm is based on the mathematically well established wavelet theory. This allows us to provide error estimates of the solution that are used in conjunction with an appropriate threshold criteria to adapt the non-uniform grid. Other essential features of the numerical algorithm include: an efficient wavelet spatial discretization that allows to minimize the number of degrees of freedom for a prescribed accuracy, a fast algorithm for computing wavelet amplitudes, and efficient and accurate derivative approximations on an irregular grid. The method has been tested for a variety of benchmark problems including numerical simulation of transpacific traveling pollution plumes. The generated pollution plumes are diluted due to turbulent mixing as they are advected downwind. Despite this dilution, it was recently discovered that pollution plumes in the remote troposphere can preserve their identity as well-defined structures for two weeks or more as they circle the globe. Present Global Chemical Transport Models (CTMs) implemented for quasi-uniform grids are completely incapable of reproducing these layered structures due to high numerical plume dilution caused by numerical diffusion combined with non-uniformity of atmospheric flow. It is shown that WAMR algorithm solutions of comparable accuracy as conventional numerical techniques are obtained with more than an order of magnitude reduction in number of grid points, therefore the adaptive algorithm is capable to produce accurate results at a relatively low computational cost. The numerical simulations demonstrate that WAMR algorithm applied the traveling plume problem accurately reproduces the plume dynamics unlike conventional numerical methods that utilizes quasi-uniform numerical grids.
Convergence and Applications of a Gossip-Based Gauss-Newton Algorithm
NASA Astrophysics Data System (ADS)
Li, Xiao; Scaglione, Anna
2013-11-01
The Gauss-Newton algorithm is a popular and efficient centralized method for solving non-linear least squares problems. In this paper, we propose a multi-agent distributed version of this algorithm, named Gossip-based Gauss-Newton (GGN) algorithm, which can be applied in general problems with non-convex objectives. Furthermore, we analyze and present sufficient conditions for its convergence and show numerically that the GGN algorithm achieves performance comparable to the centralized algorithm, with graceful degradation in case of network failures. More importantly, the GGN algorithm provides significant performance gains compared to other distributed first order methods.
Applications and development of communication models for the touchstone GAMMA and DELTA prototypes
NASA Technical Reports Server (NTRS)
Seidel, Steven R.
1993-01-01
The goal of this project was to develop models of the interconnection networks of the Intel iPSC/860 and DELTA multicomputers to guide the design of efficient algorithms for interprocessor communication in problems that commonly occur in CFD codes and other applications. Interprocessor communication costs of codes for message-passing architectures such as the iPSC/860 and DELTA significantly affect the level of performance that can be obtained from those machines. This project addressed several specific problems in the achievement of efficient communication on the Intel iPSC/860 hypercube and DELTA mesh. In particular, an efficient global processor synchronization algorithm was developed for the iPSC/860 and numerous broadcast algorithms were designed for the DELTA.
A split finite element algorithm for the compressible Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Baker, A. J.
1979-01-01
An accurate and efficient numerical solution algorithm is established for solution of the high Reynolds number limit of the Navier-Stokes equations governing the multidimensional flow of a compressible essentially inviscid fluid. Finite element interpolation theory is used within a dissipative formulation established using Galerkin criteria within the Method of Weighted Residuals. An implicit iterative solution algorithm is developed, employing tensor product bases within a fractional steps integration procedure, that significantly enhances solution economy concurrent with sharply reduced computer hardware demands. The algorithm is evaluated for resolution of steep field gradients and coarse grid accuracy using both linear and quadratic tensor product interpolation bases. Numerical solutions for linear and nonlinear, one, two and three dimensional examples confirm and extend the linearized theoretical analyses, and results are compared to competitive finite difference derived algorithms.
NASA Astrophysics Data System (ADS)
Chen, Guangye; Chacon, Luis
2015-11-01
We discuss a new, conservative, fully implicit 2D3V Vlasov-Darwin particle-in-cell algorithm in curvilinear geometry for non-radiative, electromagnetic kinetic plasma simulations. Unlike standard explicit PIC schemes, fully implicit PIC algorithms are unconditionally stable and allow exact discrete energy and charge conservation. Here, we extend these algorithms to curvilinear geometry. The algorithm retains its exact conservation properties in curvilinear grids. The nonlinear iteration is effectively accelerated with a fluid preconditioner for weakly to modestly magnetized plasmas, which allows efficient use of large timesteps, O (√{mi/me}c/veT) larger than the explicit CFL. In this presentation, we will introduce the main algorithmic components of the approach, and demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 1D (slow shock) and 2D (island coalescense).
Numerical implementation of the S-matrix algorithm for modeling of relief diffraction gratings
NASA Astrophysics Data System (ADS)
Yaremchuk, Iryna; Tamulevičius, Tomas; Fitio, Volodymyr; Gražulevičiūte, Ieva; Bobitski, Yaroslav; Tamulevičius, Sigitas
2013-11-01
A new numerical implementation is developed to calculate the diffraction efficiency of relief diffraction gratings. In the new formulation, vectors containing the expansion coefficients of electric and magnetic fields on boundaries of the grating layer are expressed by additional constants. An S-matrix algorithm has been systematically described in detail and adapted to a simple matrix form. This implementation is suitable for the study of optical characteristics of periodic structures by using modern object-oriented programming languages and different standard mathematical software. The modeling program has been developed on the basis of this numerical implementation and tested by comparison with other commercially available programs and experimental data. Numerical examples are given to show the usefulness of the new implementation.
An intercomparison study of TSM, SEBS, and SEBAL using high-resolution imagery and lysimetric data
USDA-ARS?s Scientific Manuscript database
Over the past three decades, numerous remote sensing based ET mapping algorithms were developed. These algorithms provided a robust, economical, and efficient tool for ET estimations at field and regional scales. The Two Source Model (TSM), Surface Energy Balance System (SEBS), and Surface Energy Ba...
Automatic computation and solution of generalized harmonic balance equations
NASA Astrophysics Data System (ADS)
Peyton Jones, J. C.; Yaser, K. S. A.; Stevenson, J.
2018-02-01
Generalized methods are presented for generating and solving the harmonic balance equations for a broad class of nonlinear differential or difference equations and for a general set of harmonics chosen by the user. In particular, a new algorithm for automatically generating the Jacobian of the balance equations enables efficient solution of these equations using continuation methods. Efficient numeric validation techniques are also presented, and the combined algorithm is applied to the analysis of dc, fundamental, second and third harmonic response of a nonlinear automotive damper.
Loading relativistic Maxwell distributions in particle simulations
NASA Astrophysics Data System (ADS)
Zenitani, Seiji
2015-04-01
Numerical algorithms to load relativistic Maxwell distributions in particle-in-cell (PIC) and Monte-Carlo simulations are presented. For stationary relativistic Maxwellian, the inverse transform method and the Sobol algorithm are reviewed. To boost particles to obtain relativistic shifted-Maxwellian, two rejection methods are proposed in a physically transparent manner. Their acceptance efficiencies are ≈50 % for generic cases and 100% for symmetric distributions. They can be combined with arbitrary base algorithms.
NASA Astrophysics Data System (ADS)
Wang, Jinting; Lu, Liqiao; Zhu, Fei
2018-01-01
Finite element (FE) is a powerful tool and has been applied by investigators to real-time hybrid simulations (RTHSs). This study focuses on the computational efficiency, including the computational time and accuracy, of numerical integrations in solving FE numerical substructure in RTHSs. First, sparse matrix storage schemes are adopted to decrease the computational time of FE numerical substructure. In this way, the task execution time (TET) decreases such that the scale of the numerical substructure model increases. Subsequently, several commonly used explicit numerical integration algorithms, including the central difference method (CDM), the Newmark explicit method, the Chang method and the Gui-λ method, are comprehensively compared to evaluate their computational time in solving FE numerical substructure. CDM is better than the other explicit integration algorithms when the damping matrix is diagonal, while the Gui-λ (λ = 4) method is advantageous when the damping matrix is non-diagonal. Finally, the effect of time delay on the computational accuracy of RTHSs is investigated by simulating structure-foundation systems. Simulation results show that the influences of time delay on the displacement response become obvious with the mass ratio increasing, and delay compensation methods may reduce the relative error of the displacement peak value to less than 5% even under the large time-step and large time delay.
A second order derivative scheme based on Bregman algorithm class
NASA Astrophysics Data System (ADS)
Campagna, Rosanna; Crisci, Serena; Cuomo, Salvatore; Galletti, Ardelio; Marcellino, Livia
2016-10-01
The algorithms based on the Bregman iterative regularization are known for efficiently solving convex constraint optimization problems. In this paper, we introduce a second order derivative scheme for the class of Bregman algorithms. Its properties of convergence and stability are investigated by means of numerical evidences. Moreover, we apply the proposed scheme to an isotropic Total Variation (TV) problem arising out of the Magnetic Resonance Image (MRI) denoising. Experimental results confirm that our algorithm has good performance in terms of denoising quality, effectiveness and robustness.
NASA Astrophysics Data System (ADS)
Kiss, Gellért Zsolt; Borbély, Sándor; Nagy, Ladislau
2017-12-01
We have presented here an efficient numerical approach for the ab initio numerical solution of the time-dependent Schrödinger Equation describing diatomic molecules, which interact with ultrafast laser pulses. During the construction of the model we have assumed a frozen nuclear configuration and a single active electron. In order to increase efficiency our system was described using prolate spheroidal coordinates, where the wave function was discretized using the finite-element discrete variable representation (FE-DVR) method. The discretized wave functions were efficiently propagated in time using the short-iterative Lanczos algorithm. As a first test we have studied here how the laser induced bound state dynamics in H2+ is influenced by the strength of the driving laser field.
A Model-Free No-arbitrage Price Bound for Variance Options
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bonnans, J. Frederic, E-mail: frederic.bonnans@inria.fr; Tan Xiaolu, E-mail: xiaolu.tan@polytechnique.edu
2013-08-01
We suggest a numerical approximation for an optimization problem, motivated by its applications in finance to find the model-free no-arbitrage bound of variance options given the marginal distributions of the underlying asset. A first approximation restricts the computation to a bounded domain. Then we propose a gradient projection algorithm together with the finite difference scheme to solve the optimization problem. We prove the general convergence, and derive some convergence rate estimates. Finally, we give some numerical examples to test the efficiency of the algorithm.
An integrated algorithm for hypersonic fluid-thermal-structural numerical simulation
NASA Astrophysics Data System (ADS)
Li, Jia-Wei; Wang, Jiang-Feng
2018-05-01
In this paper, a fluid-structural-thermal integrated method is presented based on finite volume method. A unified integral equations system is developed as the control equations for physical process of aero-heating and structural heat transfer. The whole physical field is discretized by using an up-wind finite volume method. To demonstrate its capability, the numerical simulation of Mach 6.47 flow over stainless steel cylinder shows a good agreement with measured values, and this method dynamically simulates the objective physical processes. Thus, the integrated algorithm proves to be efficient and reliable.
Wang, Hua; Liu, Feng; Xia, Ling; Crozier, Stuart
2008-11-21
This paper presents a stabilized Bi-conjugate gradient algorithm (BiCGstab) that can significantly improve the performance of the impedance method, which has been widely applied to model low-frequency field induction phenomena in voxel phantoms. The improved impedance method offers remarkable computational advantages in terms of convergence performance and memory consumption over the conventional, successive over-relaxation (SOR)-based algorithm. The scheme has been validated against other numerical/analytical solutions on a lossy, multilayered sphere phantom excited by an ideal coil loop. To demonstrate the computational performance and application capability of the developed algorithm, the induced fields inside a human phantom due to a low-frequency hyperthermia device is evaluated. The simulation results show the numerical accuracy and superior performance of the method.
NASA Astrophysics Data System (ADS)
Penenko, Alexey; Penenko, Vladimir; Nuterman, Roman; Baklanov, Alexander; Mahura, Alexander
2015-11-01
Atmospheric chemistry dynamics is studied with convection-diffusion-reaction model. The numerical Data Assimilation algorithm presented is based on the additive-averaged splitting schemes. It carries out ''fine-grained'' variational data assimilation on the separate splitting stages with respect to spatial dimensions and processes i.e. the same measurement data is assimilated to different parts of the split model. This design has efficient implementation due to the direct data assimilation algorithms of the transport process along coordinate lines. Results of numerical experiments with chemical data assimilation algorithm of in situ concentration measurements on real data scenario have been presented. In order to construct the scenario, meteorological data has been taken from EnviroHIRLAM model output, initial conditions from MOZART model output and measurements from Airbase database.
NASA Astrophysics Data System (ADS)
Titeux, Isabelle; Li, Yuming M.; Debray, Karl; Guo, Ying-Qiao
2004-11-01
This Note deals with an efficient algorithm to carry out the plastic integration and compute the stresses due to large strains for materials satisfying the Hill's anisotropic yield criterion. The classical algorithm of plastic integration such as 'Return Mapping Method' is largely used for nonlinear analyses of structures and numerical simulations of forming processes, but it requires an iterative schema and may have convergence problems. A new direct algorithm based on a scalar method is developed which allows us to directly obtain the plastic multiplier without an iteration procedure; thus the computation time is largely reduced and the numerical problems are avoided. To cite this article: I. Titeux et al., C. R. Mecanique 332 (2004).
Hasani, Mojtaba H; Gharibzadeh, Shahriar; Farjami, Yaghoub; Tavakkoli, Jahan
2013-09-01
Various numerical algorithms have been developed to solve the Khokhlov-Kuznetsov-Zabolotskaya (KZK) parabolic nonlinear wave equation. In this work, a generalized time-domain numerical algorithm is proposed to solve the diffraction term of the KZK equation. This algorithm solves the transverse Laplacian operator of the KZK equation in three-dimensional (3D) Cartesian coordinates using a finite-difference method based on the five-point implicit backward finite difference and the five-point Crank-Nicolson finite difference discretization techniques. This leads to a more uniform discretization of the Laplacian operator which in turn results in fewer calculation gridding nodes without compromising accuracy in the diffraction term. In addition, a new empirical algorithm based on the LU decomposition technique is proposed to solve the system of linear equations obtained from this discretization. The proposed empirical algorithm improves the calculation speed and memory usage, while the order of computational complexity remains linear in calculation of the diffraction term in the KZK equation. For evaluating the accuracy of the proposed algorithm, two previously published algorithms are used as comparison references: the conventional 2D Texas code and its generalization for 3D geometries. The results show that the accuracy/efficiency performance of the proposed algorithm is comparable with the established time-domain methods.
A Parallel Compact Multi-Dimensional Numerical Algorithm with Aeroacoustics Applications
NASA Technical Reports Server (NTRS)
Povitsky, Alex; Morris, Philip J.
1999-01-01
In this study we propose a novel method to parallelize high-order compact numerical algorithms for the solution of three-dimensional PDEs (Partial Differential Equations) in a space-time domain. For this numerical integration most of the computer time is spent in computation of spatial derivatives at each stage of the Runge-Kutta temporal update. The most efficient direct method to compute spatial derivatives on a serial computer is a version of Gaussian elimination for narrow linear banded systems known as the Thomas algorithm. In a straightforward pipelined implementation of the Thomas algorithm processors are idle due to the forward and backward recurrences of the Thomas algorithm. To utilize processors during this time, we propose to use them for either non-local data independent computations, solving lines in the next spatial direction, or local data-dependent computations by the Runge-Kutta method. To achieve this goal, control of processor communication and computations by a static schedule is adopted. Thus, our parallel code is driven by a communication and computation schedule instead of the usual "creative, programming" approach. The obtained parallelization speed-up of the novel algorithm is about twice as much as that for the standard pipelined algorithm and close to that for the explicit DRP algorithm.
Towards developing robust algorithms for solving partial differential equations on MIMD machines
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Naik, Vijay K.
1988-01-01
Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.
Super-Nyquist shaping and processing technologies for high-spectral-efficiency optical systems
NASA Astrophysics Data System (ADS)
Jia, Zhensheng; Chien, Hung-Chang; Zhang, Junwen; Dong, Ze; Cai, Yi; Yu, Jianjun
2013-12-01
The implementations of super-Nyquist pulse generation, both in a digital field using a digital-to-analog converter (DAC) or an optical filter at transmitter side, are introduced. Three corresponding signal processing algorithms at receiver are presented and compared for high spectral-efficiency (SE) optical systems employing the spectral prefiltering. Those algorithms are designed for the mitigation towards inter-symbol-interference (ISI) and inter-channel-interference (ICI) impairments by the bandwidth constraint, including 1-tap constant modulus algorithm (CMA) and 3-tap maximum likelihood sequence estimation (MLSE), regular CMA and digital filter with 2-tap MLSE, and constant multi-modulus algorithm (CMMA) with 2-tap MLSE. The principles and prefiltering tolerance are given through numerical and experimental results.
Towards developing robust algorithms for solving partial differential equations on MIMD machines
NASA Technical Reports Server (NTRS)
Saltz, J. H.; Naik, V. K.
1985-01-01
Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
NASA Technical Reports Server (NTRS)
Sun, Xian-He
1997-01-01
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
Efficient least angle regression for identification of linear-in-the-parameters models
Beach, Thomas H.; Rezgui, Yacine
2017-01-01
Least angle regression, as a promising model selection method, differentiates itself from conventional stepwise and stagewise methods, in that it is neither too greedy nor too slow. It is closely related to L1 norm optimization, which has the advantage of low prediction variance through sacrificing part of model bias property in order to enhance model generalization capability. In this paper, we propose an efficient least angle regression algorithm for model selection for a large class of linear-in-the-parameters models with the purpose of accelerating the model selection process. The entire algorithm works completely in a recursive manner, where the correlations between model terms and residuals, the evolving directions and other pertinent variables are derived explicitly and updated successively at every subset selection step. The model coefficients are only computed when the algorithm finishes. The direct involvement of matrix inversions is thereby relieved. A detailed computational complexity analysis indicates that the proposed algorithm possesses significant computational efficiency, compared with the original approach where the well-known efficient Cholesky decomposition is involved in solving least angle regression. Three artificial and real-world examples are employed to demonstrate the effectiveness, efficiency and numerical stability of the proposed algorithm. PMID:28293140
A parallel computing engine for a class of time critical processes.
Nabhan, T M; Zomaya, A Y
1997-01-01
This paper focuses on the efficient parallel implementation of systems of numerically intensive nature over loosely coupled multiprocessor architectures. These analytical models are of significant importance to many real-time systems that have to meet severe time constants. A parallel computing engine (PCE) has been developed in this work for the efficient simplification and the near optimal scheduling of numerical models over the different cooperating processors of the parallel computer. First, the analytical system is efficiently coded in its general form. The model is then simplified by using any available information (e.g., constant parameters). A task graph representing the interconnections among the different components (or equations) is generated. The graph can then be compressed to control the computation/communication requirements. The task scheduler employs a graph-based iterative scheme, based on the simulated annealing algorithm, to map the vertices of the task graph onto a Multiple-Instruction-stream Multiple-Data-stream (MIMD) type of architecture. The algorithm uses a nonanalytical cost function that properly considers the computation capability of the processors, the network topology, the communication time, and congestion possibilities. Moreover, the proposed technique is simple, flexible, and computationally viable. The efficiency of the algorithm is demonstrated by two case studies with good results.
Triangular covariance factorizations for. Ph.D. Thesis. - Calif. Univ.
NASA Technical Reports Server (NTRS)
Thornton, C. L.
1976-01-01
An improved computational form of the discrete Kalman filter is derived using an upper triangular factorization of the error covariance matrix. The covariance P is factored such that P = UDUT where U is unit upper triangular and D is diagonal. Recursions are developed for propagating the U-D covariance factors together with the corresponding state estimate. The resulting algorithm, referred to as the U-D filter, combines the superior numerical precision of square root filtering techniques with an efficiency comparable to that of Kalman's original formula. Moreover, this method is easily implemented and involves no more computer storage than the Kalman algorithm. These characteristics make the U-D method an attractive realtime filtering technique. A new covariance error analysis technique is obtained from an extension of the U-D filter equations. This evaluation method is flexible and efficient and may provide significantly improved numerical results. Cost comparisons show that for a large class of problems the U-D evaluation algorithm is noticeably less expensive than conventional error analysis methods.
Ovtchinnikov, Evgueni E.; Xanthis, Leonidas S.
2000-01-01
We present a methodology for the efficient numerical solution of eigenvalue problems of full three-dimensional elasticity for thin elastic structures, such as shells, plates and rods of arbitrary geometry, discretized by the finite element method. Such problems are solved by iterative methods, which, however, are known to suffer from slow convergence or even convergence failure, when the thickness is small. In this paper we show an effective way of resolving this difficulty by invoking a special preconditioning technique associated with the effective dimensional reduction algorithm (EDRA). As an example, we present an algorithm for computing the minimal eigenvalue of a thin elastic plate and we show both theoretically and numerically that it is robust with respect to both the thickness and discretization parameters, i.e. the convergence does not deteriorate with diminishing thickness or mesh refinement. This robustness is sine qua non for the efficient computation of large-scale eigenvalue problems for thin elastic structures. PMID:10655469
The instanton method and its numerical implementation in fluid mechanics
NASA Astrophysics Data System (ADS)
Grafke, Tobias; Grauer, Rainer; Schäfer, Tobias
2015-08-01
A precise characterization of structures occurring in turbulent fluid flows at high Reynolds numbers is one of the last open problems of classical physics. In this review we discuss recent developments related to the application of instanton methods to turbulence. Instantons are saddle point configurations of the underlying path integrals. They are equivalent to minimizers of the related Freidlin-Wentzell action and known to be able to characterize rare events in such systems. While there is an impressive body of work concerning their analytical description, this review focuses on the question on how to compute these minimizers numerically. In a short introduction we present the relevant mathematical and physical background before we discuss the stochastic Burgers equation in detail. We present algorithms to compute instantons numerically by an efficient solution of the corresponding Euler-Lagrange equations. A second focus is the discussion of a recently developed numerical filtering technique that allows to extract instantons from direct numerical simulations. In the following we present modifications of the algorithms to make them efficient when applied to two- or three-dimensional (2D or 3D) fluid dynamical problems. We illustrate these ideas using the 2D Burgers equation and the 3D Navier-Stokes equations.
NASA Astrophysics Data System (ADS)
Toufik, Mekkaoui; Atangana, Abdon
2017-10-01
Recently a new concept of fractional differentiation with non-local and non-singular kernel was introduced in order to extend the limitations of the conventional Riemann-Liouville and Caputo fractional derivatives. A new numerical scheme has been developed, in this paper, for the newly established fractional differentiation. We present in general the error analysis. The new numerical scheme was applied to solve linear and non-linear fractional differential equations. We do not need a predictor-corrector to have an efficient algorithm, in this method. The comparison of approximate and exact solutions leaves no doubt believing that, the new numerical scheme is very efficient and converges toward exact solution very rapidly.
Solution of quadratic matrix equations for free vibration analysis of structures.
NASA Technical Reports Server (NTRS)
Gupta, K. K.
1973-01-01
An efficient digital computer procedure and the related numerical algorithm are presented herein for the solution of quadratic matrix equations associated with free vibration analysis of structures. Such a procedure enables accurate and economical analysis of natural frequencies and associated modes of discretized structures. The numerically stable algorithm is based on the Sturm sequence method, which fully exploits the banded form of associated stiffness and mass matrices. The related computer program written in FORTRAN V for the JPL UNIVAC 1108 computer proves to be substantially more accurate and economical than other existing procedures of such analysis. Numerical examples are presented for two structures - a cantilever beam and a semicircular arch.
Efficient and Flexible Computation of Many-Electron Wave Function Overlaps.
Plasser, Felix; Ruckenbauer, Matthias; Mai, Sebastian; Oppel, Markus; Marquetand, Philipp; González, Leticia
2016-03-08
A new algorithm for the computation of the overlap between many-electron wave functions is described. This algorithm allows for the extensive use of recurring intermediates and thus provides high computational efficiency. Because of the general formalism employed, overlaps can be computed for varying wave function types, molecular orbitals, basis sets, and molecular geometries. This paves the way for efficiently computing nonadiabatic interaction terms for dynamics simulations. In addition, other application areas can be envisaged, such as the comparison of wave functions constructed at different levels of theory. Aside from explaining the algorithm and evaluating the performance, a detailed analysis of the numerical stability of wave function overlaps is carried out, and strategies for overcoming potential severe pitfalls due to displaced atoms and truncated wave functions are presented.
Efficient Computational Prototyping of Mixed Technology Microfluidic Components and Systems
2002-08-01
AFRL-IF-RS-TR-2002-190 Final Technical Report August 2002 EFFICIENT COMPUTATIONAL PROTOTYPING OF MIXED TECHNOLOGY MICROFLUIDIC...SUBTITLE EFFICIENT COMPUTATIONAL PROTOTYPING OF MIXED TECHNOLOGY MICROFLUIDIC COMPONENTS AND SYSTEMS 6. AUTHOR(S) Narayan R. Aluru, Jacob White...Aided Design (CAD) tools for microfluidic components and systems were developed in this effort. Innovative numerical methods and algorithms for mixed
Explicit symplectic algorithms based on generating functions for charged particle dynamics.
Zhang, Ruili; Qin, Hong; Tang, Yifa; Liu, Jian; He, Yang; Xiao, Jianyuan
2016-07-01
Dynamics of a charged particle in the canonical coordinates is a Hamiltonian system, and the well-known symplectic algorithm has been regarded as the de facto method for numerical integration of Hamiltonian systems due to its long-term accuracy and fidelity. For long-term simulations with high efficiency, explicit symplectic algorithms are desirable. However, it is generally believed that explicit symplectic algorithms are only available for sum-separable Hamiltonians, and this restriction limits the application of explicit symplectic algorithms to charged particle dynamics. To overcome this difficulty, we combine the familiar sum-split method and a generating function method to construct second- and third-order explicit symplectic algorithms for dynamics of charged particle. The generating function method is designed to generate explicit symplectic algorithms for product-separable Hamiltonian with form of H(x,p)=p_{i}f(x) or H(x,p)=x_{i}g(p). Applied to the simulations of charged particle dynamics, the explicit symplectic algorithms based on generating functions demonstrate superiorities in conservation and efficiency.
Explicit symplectic algorithms based on generating functions for charged particle dynamics
NASA Astrophysics Data System (ADS)
Zhang, Ruili; Qin, Hong; Tang, Yifa; Liu, Jian; He, Yang; Xiao, Jianyuan
2016-07-01
Dynamics of a charged particle in the canonical coordinates is a Hamiltonian system, and the well-known symplectic algorithm has been regarded as the de facto method for numerical integration of Hamiltonian systems due to its long-term accuracy and fidelity. For long-term simulations with high efficiency, explicit symplectic algorithms are desirable. However, it is generally believed that explicit symplectic algorithms are only available for sum-separable Hamiltonians, and this restriction limits the application of explicit symplectic algorithms to charged particle dynamics. To overcome this difficulty, we combine the familiar sum-split method and a generating function method to construct second- and third-order explicit symplectic algorithms for dynamics of charged particle. The generating function method is designed to generate explicit symplectic algorithms for product-separable Hamiltonian with form of H (x ,p ) =pif (x ) or H (x ,p ) =xig (p ) . Applied to the simulations of charged particle dynamics, the explicit symplectic algorithms based on generating functions demonstrate superiorities in conservation and efficiency.
An efficient and accurate 3D displacements tracking strategy for digital volume correlation
NASA Astrophysics Data System (ADS)
Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles
2014-07-01
Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need of updating Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure the 3D IC-GN algorithm that converges accurately and rapidly and avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms are first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.
A numerical algorithm of tooth profile of non-circular cylindrical gear
NASA Astrophysics Data System (ADS)
Wang, Xuan
2017-08-01
Non-circular cylindrical gear (NCCG) is a common form of non-circular gear. Different from the circular gear, the tooth profile equation of NCCG cannot be obtained. So it is necessary to use a numerical algorithm to calculate the tooth profile of NCCG. For this reason, this paper presents a simple and highly efficient numerical algorithm to obtain the tooth profile of NCCG. Firstly, the mathematical model of tooth profile envelope of NCCG is established based on the principle of gear shaping, and the tooth profile envelope of NCCG is obtained. Secondly, the polar radius and polar angle of shaper cutter tooth profile are chosen as the criterions, by which the points of NCCG tooth cogging can be screened out. Finally, the boundary of tooth cogging points is extracted by a distance criterion and correspondingly the tooth profile of NCCG is obtained.
Fast algorithms for Quadrature by Expansion I: Globally valid expansions
NASA Astrophysics Data System (ADS)
Rachh, Manas; Klöckner, Andreas; O'Neil, Michael
2017-09-01
The use of integral equation methods for the efficient numerical solution of PDE boundary value problems requires two main tools: quadrature rules for the evaluation of layer potential integral operators with singular kernels, and fast algorithms for solving the resulting dense linear systems. Classically, these tools were developed separately. In this work, we present a unified numerical scheme based on coupling Quadrature by Expansion, a recent quadrature method, to a customized Fast Multipole Method (FMM) for the Helmholtz equation in two dimensions. The method allows the evaluation of layer potentials in linear-time complexity, anywhere in space, with a uniform, user-chosen level of accuracy as a black-box computational method. Providing this capability requires geometric and algorithmic considerations beyond the needs of standard FMMs as well as careful consideration of the accuracy of multipole translations. We illustrate the speed and accuracy of our method with various numerical examples.
A three dimensional finite element formulation for thermoviscoelastic orthotropic media
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zocher, M.A.
1997-12-31
A numerical algorithm for the efficient solution of the uncoupled quasistatic initial/boundary value problem involving orthotropic linear viscoelastic media undergoing thermal and/or mechanical deformation is briefly outlined.
Input-output-controlled nonlinear equation solvers
NASA Technical Reports Server (NTRS)
Padovan, Joseph
1988-01-01
To upgrade the efficiency and stability of the successive substitution (SS) and Newton-Raphson (NR) schemes, the concept of input-output-controlled solvers (IOCS) is introduced. By employing the formal properties of the constrained version of the SS and NR schemes, the IOCS algorithm can handle indefiniteness of the system Jacobian, can maintain iterate monotonicity, and provide for separate control of load incrementation and iterate excursions, as well as having other features. To illustrate the algorithmic properties, the results for several benchmark examples are presented. These define the associated numerical efficiency and stability of the IOCS.
Wang, Jun; Zhou, Bi-hua; Zhou, Shu-dao; Sheng, Zheng
2015-01-01
The paper proposes a novel function expression method to forecast chaotic time series, using an improved genetic-simulated annealing (IGSA) algorithm to establish the optimum function expression that describes the behavior of time series. In order to deal with the weakness associated with the genetic algorithm, the proposed algorithm incorporates the simulated annealing operation which has the strong local search ability into the genetic algorithm to enhance the performance of optimization; besides, the fitness function and genetic operators are also improved. Finally, the method is applied to the chaotic time series of Quadratic and Rossler maps for validation. The effect of noise in the chaotic time series is also studied numerically. The numerical results verify that the method can forecast chaotic time series with high precision and effectiveness, and the forecasting precision with certain noise is also satisfactory. It can be concluded that the IGSA algorithm is energy-efficient and superior. PMID:26000011
Wang, Jun; Zhou, Bi-hua; Zhou, Shu-dao; Sheng, Zheng
2015-01-01
The paper proposes a novel function expression method to forecast chaotic time series, using an improved genetic-simulated annealing (IGSA) algorithm to establish the optimum function expression that describes the behavior of time series. In order to deal with the weakness associated with the genetic algorithm, the proposed algorithm incorporates the simulated annealing operation which has the strong local search ability into the genetic algorithm to enhance the performance of optimization; besides, the fitness function and genetic operators are also improved. Finally, the method is applied to the chaotic time series of Quadratic and Rossler maps for validation. The effect of noise in the chaotic time series is also studied numerically. The numerical results verify that the method can forecast chaotic time series with high precision and effectiveness, and the forecasting precision with certain noise is also satisfactory. It can be concluded that the IGSA algorithm is energy-efficient and superior.
Computation of Reacting Flows in Combustion Processes
NASA Technical Reports Server (NTRS)
Keith, Theo G., Jr.; Chen, K.-H.
2001-01-01
The objective of this research is to develop an efficient numerical algorithm with unstructured grids for the computation of three-dimensional chemical reacting flows that are known to occur in combustion components of propulsion systems. During the grant period (1996 to 1999), two companion codes have been developed and various numerical and physical models were implemented into the two codes.
Numerical Algorithm for Delta of Asian Option
Zhang, Boxiang; Yu, Yang; Wang, Weiguo
2015-01-01
We study the numerical solution of the Greeks of Asian options. In particular, we derive a close form solution of Δ of Asian geometric option and use this analytical form as a control to numerically calculate Δ of Asian arithmetic option, which is known to have no explicit close form solution. We implement our proposed numerical method and compare the standard error with other classical variance reduction methods. Our method provides an efficient solution to the hedging strategy with Asian options. PMID:26266271
An efficient numerical algorithm for transverse impact problems
NASA Technical Reports Server (NTRS)
Sankar, B. V.; Sun, C. T.
1985-01-01
Transverse impact problems in which the elastic and plastic indentation effects are considered, involve a nonlinear integral equation for the contact force, which, in practice, is usually solved by an iterative scheme with small increments in time. In this paper, a numerical method is proposed wherein the iterations of the nonlinear problem are separated from the structural response computations. This makes the numerical procedures much simpler and also efficient. The proposed method is applied to some impact problems for which solutions are available, and they are found to be in good agreement. The effect of the magnitude of time increment on the results is also discussed.
Efficient numerical method for solving Cauchy problem for the Gamma equation
NASA Astrophysics Data System (ADS)
Koleva, Miglena N.
2011-12-01
In this work we consider Cauchy problem for the so called Gamma equation, derived by transforming the fully nonlinear Black-Scholes equation for option price into a quasilinear parabolic equation for the second derivative (Greek) Γ = VSS of the option price V. We develop an efficient numerical method for solving the model problem concerning different volatility terms. Using suitable change of variables the problem is transformed on finite interval, keeping original behavior of the solution at the infinity. Then we construct Picard-Newton algorithm with adaptive mesh step in time, which can be applied also in the case of non-differentiable functions. Results of numerical simulations are given.
Finding all solutions of nonlinear equations using the dual simplex method
NASA Astrophysics Data System (ADS)
Yamamura, Kiyotaka; Fujioka, Tsuyoshi
2003-03-01
Recently, an efficient algorithm has been proposed for finding all solutions of systems of nonlinear equations using linear programming. This algorithm is based on a simple test (termed the LP test) for nonexistence of a solution to a system of nonlinear equations using the dual simplex method. In this letter, an improved version of the LP test algorithm is proposed. By numerical examples, it is shown that the proposed algorithm could find all solutions of a system of 300 nonlinear equations in practical computation time.
Qi, Hong; Qiao, Yao-Bin; Ren, Ya-Tao; Shi, Jing-Wen; Zhang, Ze-Yu; Ruan, Li-Ming
2016-10-17
Sequential quadratic programming (SQP) is used as an optimization algorithm to reconstruct the optical parameters based on the time-domain radiative transfer equation (TD-RTE). Numerous time-resolved measurement signals are obtained using the TD-RTE as forward model. For a high computational efficiency, the gradient of objective function is calculated using an adjoint equation technique. SQP algorithm is employed to solve the inverse problem and the regularization term based on the generalized Gaussian Markov random field (GGMRF) model is used to overcome the ill-posed problem. Simulated results show that the proposed reconstruction scheme performs efficiently and accurately.
Numerically robust and efficient nonlocal electron transport in 2D DRACO simulations
NASA Astrophysics Data System (ADS)
Cao, Duc; Chenhall, Jeff; Moses, Greg; Delettrez, Jacques; Collins, Tim
2013-10-01
An improved implicit algorithm based on Schurtz, Nicolai and Busquet (SNB) algorithm for nonlocal electron transport is presented. Validation with direct drive shock timing experiments and verification with the Goncharov nonlocal model in 1D LILAC simulations demonstrate the viability of this efficient algorithm for producing 2D lagrangian radiation hydrodynamics direct drive simulations. Additionally, simulations provide strong incentive to further modify key parameters within the SNB theory, namely the ``mean free path.'' An example 2D polar drive simulation to study 2D effects of the nonlocal flux as well as mean free path modifications will also be presented. This research was supported by the University of Rochester Laboratory for Laser Energetics.
A Strassen-Newton algorithm for high-speed parallelizable matrix inversion
NASA Technical Reports Server (NTRS)
Bailey, David H.; Ferguson, Helaman R. P.
1988-01-01
Techniques are described for computing matrix inverses by algorithms that are highly suited to massively parallel computation. The techniques are based on an algorithm suggested by Strassen (1969). Variations of this scheme use matrix Newton iterations and other methods to improve the numerical stability while at the same time preserving a very high level of parallelism. One-processor Cray-2 implementations of these schemes range from one that is up to 55 percent faster than a conventional library routine to one that is slower than a library routine but achieves excellent numerical stability. The problem of computing the solution to a single set of linear equations is discussed, and it is shown that this problem can also be solved efficiently using these techniques.
Computing rank-revealing QR factorizations of dense matrices.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bischof, C. H.; Quintana-Orti, G.; Mathematics and Computer Science
1998-06-01
We develop algorithms and implementations for computing rank-revealing QR (RRQR) factorizations of dense matrices. First, we develop an efficient block algorithm for approximating an RRQR factorization, employing a windowed version of the commonly used Golub pivoting strategy, aided by incremental condition estimation. Second, we develop efficiently implementable variants of guaranteed reliable RRQR algorithms for triangular matrices originally suggested by Chandrasekaran and Ipsen and by Pan and Tang. We suggest algorithmic improvements with respect to condition estimation, termination criteria, and Givens updating. By combining the block algorithm with one of the triangular postprocessing steps, we arrive at an efficient and reliablemore » algorithm for computing an RRQR factorization of a dense matrix. Experimental results on IBM RS/6000 SGI R8000 platforms show that this approach performs up to three times faster that the less reliable QR factorization with column pivoting as it is currently implemented in LAPACK, and comes within 15% of the performance of the LAPACK block algorithm for computing a QR factorization without any column exchanges. Thus, we expect this routine to be useful in may circumstances where numerical rank deficiency cannot be ruled out, but currently has been ignored because of the computational cost of dealing with it.« less
A finite element algorithm for high-lying eigenvalues with Neumann and Dirichlet boundary conditions
NASA Astrophysics Data System (ADS)
Báez, G.; Méndez-Sánchez, R. A.; Leyvraz, F.; Seligman, T. H.
2014-01-01
We present a finite element algorithm that computes eigenvalues and eigenfunctions of the Laplace operator for two-dimensional problems with homogeneous Neumann or Dirichlet boundary conditions, or combinations of either for different parts of the boundary. We use an inverse power plus Gauss-Seidel algorithm to solve the generalized eigenvalue problem. For Neumann boundary conditions the method is much more efficient than the equivalent finite difference algorithm. We checked the algorithm by comparing the cumulative level density of the spectrum obtained numerically with the theoretical prediction given by the Weyl formula. We found a systematic deviation due to the discretization, not to the algorithm itself.
An Implicit Algorithm for the Numerical Simulation of Shape-Memory Alloys
DOE Office of Scientific and Technical Information (OSTI.GOV)
Becker, R; Stolken, J; Jannetti, C
Shape-memory alloys (SMA) have the potential to be used in a variety of interesting applications due to their unique properties of pseudoelasticity and the shape-memory effect. However, in order to design SMA devices efficiently, a physics-based constitutive model is required to accurately simulate the behavior of shape-memory alloys. The scope of this work is to extend the numerical capabilities of the SMA constitutive model developed by Jannetti et. al. (2003), to handle large-scale polycrystalline simulations. The constitutive model is implemented within the finite-element software ABAQUS/Standard using a user defined material subroutine, or UMAT. To improve the efficiency of the numericalmore » simulations, so that polycrystalline specimens of shape-memory alloys can be modeled, a fully implicit algorithm has been implemented to integrate the constitutive equations. Using an implicit integration scheme increases the efficiency of the UMAT over the previously implemented explicit integration method by a factor of more than 100 for single crystal simulations.« less
Hypersonic research at Stanford University
NASA Technical Reports Server (NTRS)
Candler, Graham; Maccormack, Robert
1988-01-01
The status of the hypersonic research program at Stanford University is discussed and recent results are highlighted. The main areas of interest in the program are the numerical simulation of radiating, reacting and thermally excited flows, the investigation and numerical solution of hypersonic shock wave physics, the extension of the continuum fluid dynamic equations to the transition regime between continuum and free-molecule flow, and the development of novel numerical algorithms for efficient particulate simulations of flowfields.
Algorithms and Application of Sparse Matrix Assembly and Equation Solvers for Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, W. R.; Nguyen, D. T.; Reddy, C. J.; Vatsa, V. N.; Tang, W. H.
2001-01-01
An algorithm for symmetric sparse equation solutions on an unstructured grid is described. Efficient, sequential sparse algorithms for degree-of-freedom reordering, supernodes, symbolic/numerical factorization, and forward backward solution phases are reviewed. Three sparse algorithms for the generation and assembly of symmetric systems of matrix equations are presented. The accuracy and numerical performance of the sequential version of the sparse algorithms are evaluated over the frequency range of interest in a three-dimensional aeroacoustics application. Results show that the solver solutions are accurate using a discretization of 12 points per wavelength. Results also show that the first assembly algorithm is impractical for high-frequency noise calculations. The second and third assembly algorithms have nearly equal performance at low values of source frequencies, but at higher values of source frequencies the third algorithm saves CPU time and RAM. The CPU time and the RAM required by the second and third assembly algorithms are two orders of magnitude smaller than that required by the sparse equation solver. A sequential version of these sparse algorithms can, therefore, be conveniently incorporated into a substructuring for domain decomposition formulation to achieve parallel computation, where different substructures are handles by different parallel processors.
The efficiency of geophysical adjoint codes generated by automatic differentiation tools
NASA Astrophysics Data System (ADS)
Vlasenko, A. V.; Köhl, A.; Stammer, D.
2016-02-01
The accuracy of numerical models that describe complex physical or chemical processes depends on the choice of model parameters. Estimating an optimal set of parameters by optimization algorithms requires knowledge of the sensitivity of the process of interest to model parameters. Typically the sensitivity computation involves differentiation of the model, which can be performed by applying algorithmic differentiation (AD) tools to the underlying numerical code. However, existing AD tools differ substantially in design, legibility and computational efficiency. In this study we show that, for geophysical data assimilation problems of varying complexity, the performance of adjoint codes generated by the existing AD tools (i) Open_AD, (ii) Tapenade, (iii) NAGWare and (iv) Transformation of Algorithms in Fortran (TAF) can be vastly different. Based on simple test problems, we evaluate the efficiency of each AD tool with respect to computational speed, accuracy of the adjoint, the efficiency of memory usage, and the capability of each AD tool to handle modern FORTRAN 90-95 elements such as structures and pointers, which are new elements that either combine groups of variables or provide aliases to memory addresses, respectively. We show that, while operator overloading tools are the only ones suitable for modern codes written in object-oriented programming languages, their computational efficiency lags behind source transformation by orders of magnitude, rendering the application of these modern tools to practical assimilation problems prohibitive. In contrast, the application of source transformation tools appears to be the most efficient choice, allowing handling even large geophysical data assimilation problems. However, they can only be applied to numerical models written in earlier generations of programming languages. Our study indicates that applying existing AD tools to realistic geophysical problems faces limitations that urgently need to be solved to allow the continuous use of AD tools for solving geophysical problems on modern computer architectures.
NASA Technical Reports Server (NTRS)
Chaderjian, Neal M.
1991-01-01
Computations from two Navier-Stokes codes, NSS and F3D, are presented for a tangent-ogive-cylinder body at high angle of attack. Features of this steady flow include a pair of primary vortices on the leeward side of the body as well as secondary vortices. The topological and physical plausibility of this vortical structure is discussed. The accuracy of these codes are assessed by comparison of the numerical solutions with experimental data. The effects of turbulence model, numerical dissipation, and grid refinement are presented. The overall efficiency of these codes are also assessed by examining their convergence rates, computational time per time step, and maximum allowable time step for time-accurate computations. Overall, the numerical results from both codes compared equally well with experimental data, however, the NSS code was found to be significantly more efficient than the F3D code.
NASA Astrophysics Data System (ADS)
Chen, Guangye; Chacón, Luis; CoCoMans Team
2014-10-01
For decades, the Vlasov-Darwin model has been recognized to be attractive for PIC simulations (to avoid radiative noise issues) in non-radiative electromagnetic regimes. However, the Darwin model results in elliptic field equations that renders explicit time integration unconditionally unstable. Improving on linearly implicit schemes, fully implicit PIC algorithms for both electrostatic and electromagnetic regimes, with exact discrete energy and charge conservation properties, have been recently developed in 1D. This study builds on these recent algorithms to develop an implicit, orbit-averaged, time-space-centered finite difference scheme for the particle-field equations in multiple dimensions. The algorithm conserves energy, charge, and canonical-momentum exactly, even with grid packing. A simple fluid preconditioner allows efficient use of large timesteps, O (√{mi/me}c/veT) larger than the explicit CFL. We demonstrate the accuracy and efficiency properties of the of the algorithm with various numerical experiments in 2D3V.
Wang, Jun; Zhou, Bihua; Zhou, Shudao
2016-01-01
This paper proposes an improved cuckoo search (ICS) algorithm to establish the parameters of chaotic systems. In order to improve the optimization capability of the basic cuckoo search (CS) algorithm, the orthogonal design and simulated annealing operation are incorporated in the CS algorithm to enhance the exploitation search ability. Then the proposed algorithm is used to establish parameters of the Lorenz chaotic system and Chen chaotic system under the noiseless and noise condition, respectively. The numerical results demonstrate that the algorithm can estimate parameters with high accuracy and reliability. Finally, the results are compared with the CS algorithm, genetic algorithm, and particle swarm optimization algorithm, and the compared results demonstrate the method is energy-efficient and superior. PMID:26880874
NASA Astrophysics Data System (ADS)
Boyko, Oleksiy; Zheleznyak, Mark
2015-04-01
The original numerical code TOPKAPI-IMMS of the distributed rainfall-runoff model TOPKAPI ( Todini et al, 1996-2014) is developed and implemented in Ukraine. The parallel version of the code has been developed recently to be used on multiprocessors systems - multicore/processors PC and clusters. Algorithm is based on binary-tree decomposition of the watershed for the balancing of the amount of computation for all processors/cores. Message passing interface (MPI) protocol is used as a parallel computing framework. The numerical efficiency of the parallelization algorithms is demonstrated for the case studies for the flood predictions of the mountain watersheds of the Ukrainian Carpathian regions. The modeling results is compared with the predictions based on the lumped parameters models.
Numerical approach of the quantum circuit theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Silva, J.J.B., E-mail: jaedsonfisica@hotmail.com; Duarte-Filho, G.C.; Almeida, F.A.G.
2017-03-15
In this paper we develop a numerical method based on the quantum circuit theory to approach the coherent electronic transport in a network of quantum dots connected with arbitrary topology. The algorithm was employed in a circuit formed by quantum dots connected each other in a shape of a linear chain (associations in series), and of a ring (associations in series, and in parallel). For both systems we compute two current observables: conductance and shot noise power. We find an excellent agreement between our numerical results and the ones found in the literature. Moreover, we analyze the algorithm efficiency formore » a chain of quantum dots, where the mean processing time exhibits a linear dependence with the number of quantum dots in the array.« less
Adaptive Wavelet Modeling of Geophysical Data
NASA Astrophysics Data System (ADS)
Plattner, A.; Maurer, H.; Dahmen, W.; Vorloeper, J.
2009-12-01
Despite the ever-increasing power of modern computers, realistic modeling of complex three-dimensional Earth models is still a challenging task and requires substantial computing resources. The overwhelming majority of current geophysical modeling approaches includes either finite difference or non-adaptive finite element algorithms, and variants thereof. These numerical methods usually require the subsurface to be discretized with a fine mesh to accurately capture the behavior of the physical fields. However, this may result in excessive memory consumption and computing times. A common feature of most of these algorithms is that the modeled data discretizations are independent of the model complexity, which may be wasteful when there are only minor to moderate spatial variations in the subsurface parameters. Recent developments in the theory of adaptive numerical solvers have the potential to overcome this problem. Here, we consider an adaptive wavelet based approach that is applicable to a large scope of problems, also including nonlinear problems. To the best of our knowledge such algorithms have not yet been applied in geophysics. Adaptive wavelet algorithms offer several attractive features: (i) for a given subsurface model, they allow the forward modeling domain to be discretized with a quasi minimal number of degrees of freedom, (ii) sparsity of the associated system matrices is guaranteed, which makes the algorithm memory efficient, and (iii) the modeling accuracy scales linearly with computing time. We have implemented the adaptive wavelet algorithm for solving three-dimensional geoelectric problems. To test its performance, numerical experiments were conducted with a series of conductivity models exhibiting varying degrees of structural complexity. Results were compared with a non-adaptive finite element algorithm, which incorporates an unstructured mesh to best fit subsurface boundaries. Such algorithms represent the current state-of-the-art in geoelectrical modeling. An analysis of the numerical accuracy as a function of the number of degrees of freedom revealed that the adaptive wavelet algorithm outperforms the finite element solver for simple and moderately complex models, whereas the results become comparable for models with spatially highly variable electrical conductivities. The linear dependency of the modeling error and the computing time proved to be model-independent. This feature will allow very efficient computations using large-scale models as soon as our experimental code is optimized in terms of its implementation.
Semi-implicit finite difference methods for three-dimensional shallow water flow
Casulli, Vincenzo; Cheng, Ralph T.
1992-01-01
A semi-implicit finite difference method for the numerical solution of three-dimensional shallow water flows is presented and discussed. The governing equations are the primitive three-dimensional turbulent mean flow equations where the pressure distribution in the vertical has been assumed to be hydrostatic. In the method of solution a minimal degree of implicitness has been adopted in such a fashion that the resulting algorithm is stable and gives a maximal computational efficiency at a minimal computational cost. At each time step the numerical method requires the solution of one large linear system which can be formally decomposed into a set of small three-diagonal systems coupled with one five-diagonal system. All these linear systems are symmetric and positive definite. Thus the existence and uniquencess of the numerical solution are assured. When only one vertical layer is specified, this method reduces as a special case to a semi-implicit scheme for solving the corresponding two-dimensional shallow water equations. The resulting two- and three-dimensional algorithm has been shown to be fast, accurate and mass-conservative and can also be applied to simulate flooding and drying of tidal mud-flats in conjunction with three-dimensional flows. Furthermore, the resulting algorithm is fully vectorizable for an efficient implementation on modern vector computers.
Research on allocation efficiency of the daisy chain allocation algorithm
NASA Astrophysics Data System (ADS)
Shi, Jingping; Zhang, Weiguo
2013-03-01
With the improvement of the aircraft performance in reliability, maneuverability and survivability, the number of the control effectors increases a lot. How to distribute the three-axis moments into the control surfaces reasonably becomes an important problem. Daisy chain method is simple and easy to be carried out in the design of the allocation system. But it can not solve the allocation problem for entire attainable moment subset. For the lateral-directional allocation problem, the allocation efficiency of the daisy chain can be directly measured by the area of its subset of attainable moments. Because of the non-linear allocation characteristic, the subset of attainable moments of daisy-chain method is a complex non-convex polygon, and it is difficult to solve directly. By analyzing the two-dimensional allocation problems with a "micro-element" idea, a numerical calculation algorithm is proposed to compute the area of the non-convex polygon. In order to improve the allocation efficiency of the algorithm, a genetic algorithm with the allocation efficiency chosen as the fitness function is proposed to find the best pseudo-inverse matrix.
Mathematical modelling of risk reduction in reinsurance
NASA Astrophysics Data System (ADS)
Balashov, R. B.; Kryanev, A. V.; Sliva, D. E.
2017-01-01
The paper presents a mathematical model of efficient portfolio formation in the reinsurance markets. The presented approach provides the optimal ratio between the expected value of return and the risk of yield values below a certain level. The uncertainty in the return values is conditioned by use of expert evaluations and preliminary calculations, which result in expected return values and the corresponding risk levels. The proposed method allows for implementation of computationally simple schemes and algorithms for numerical calculation of the numerical structure of the efficient portfolios of reinsurance contracts of a given insurance company.
Experience with a vectorized general circulation weather model on Star-100
NASA Technical Reports Server (NTRS)
Soll, D. B.; Habra, N. R.; Russell, G. L.
1977-01-01
A version of an atmospheric general circulation model was vectorized to run on a CDC STAR 100. The numerical model was coded and run in two different vector languages, CDC and LRLTRAN. A factor of 10 speed improvement over an IBM 360/95 was realized. Efficient use of the STAR machine required some redesigning of algorithms and logic. This precludes the application of vectorizing compilers on the original scalar code to achieve the same results. Vector languages permit a more natural and efficient formulation for such numerical codes.
New Parallel Algorithms for Landscape Evolution Model
NASA Astrophysics Data System (ADS)
Jin, Y.; Zhang, H.; Shi, Y.
2017-12-01
Most landscape evolution models (LEM) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize due to the computation of drainage area for each node, which needs huge amount of communication if run in parallel. In order to overcome this difficulty, we developed two parallel algorithms for LEM with a stream net. One algorithm handles the partition of grid with traditional methods and applies an efficient global reduction algorithm to do the computation of drainage areas and transport rates for the stream net; the other algorithm is based on a new partition algorithm, which partitions the nodes in catchments between processes first, and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take the advantage of massive computing techniques, and numerical experiments show that they are both adequate to handle large scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hewett, D.W.; Yu-Jiuan Chen
The authors describe how they hold onto orthogonal mesh discretization when dealing with curved boundaries. Special difference operators were constructed to approximate numerical zones split by the domain boundary; the operators are particularly simple for this rectangular mesh. The authors demonstrated that this simple numerical approach, termed Dynamic Alternating Direction Implicit, turned out to be considerably more efficient than more complex grid-adaptive algorithms that were tried previously.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fischetti, Sebastian; Cadonati, Laura; Mohapatra, Satyanarayan R. P.
Recent years have witnessed tremendous progress in numerical relativity and an ever improving performance of ground-based interferometric gravitational wave detectors. In preparation for the Advanced Laser Interferometer Gravitational Wave Observatory (Advanced LIGO) and a new era in gravitational wave astronomy, the numerical relativity and gravitational wave data analysis communities are collaborating to ascertain the most useful role for numerical relativity waveforms in the detection and characterization of binary black hole coalescences. In this paper, we explore the detectability of equal mass, merging black hole binaries with precessing spins and total mass M{sub T}(set-membership sign)[80,350]M{sub {center_dot}}, using numerical relativity waveforms andmore » templateless search algorithms designed for gravitational wave bursts. In particular, we present a systematic study using waveforms produced by the MayaKranc code that are added to colored, Gaussian noise and analyzed with the Omega burst search algorithm. Detection efficiency is weighed against the orientation of one of the black-hole's spin axes. We find a strong correlation between the detection efficiency and the radiated energy and angular momentum, and that the inclusion of the l=2, m={+-}1, 0 modes, at a minimum, is necessary to account for the full dynamics of precessing systems.« less
Tang, Liang; Zhu, Yongfeng; Fu, Qiang
2017-01-01
Waveform sets with good correlation and/or stopband properties have received extensive attention and been widely used in multiple-input multiple-output (MIMO) radar. In this paper, we aim at designing unimodular waveform sets with good correlation and stopband properties. To formulate the problem, we construct two criteria to measure the correlation and stopband properties and then establish an unconstrained problem in the frequency domain. After deducing the phase gradient and the step size, an efficient gradient-based algorithm with monotonicity is proposed to minimize the objective function directly. For the design problem without considering the correlation weights, we develop a simplified algorithm, which only requires a few fast Fourier transform (FFT) operations and is more efficient. Because both of the algorithms can be implemented via the FFT operations and the Hadamard product, they are computationally efficient and can be used to design waveform sets with a large waveform number and waveform length. Numerical experiments show that the proposed algorithms can provide better performance than the state-of-the-art algorithms in terms of the computational complexity. PMID:28468308
Tang, Liang; Zhu, Yongfeng; Fu, Qiang
2017-05-01
Waveform sets with good correlation and/or stopband properties have received extensive attention and been widely used in multiple-input multiple-output (MIMO) radar. In this paper, we aim at designing unimodular waveform sets with good correlation and stopband properties. To formulate the problem, we construct two criteria to measure the correlation and stopband properties and then establish an unconstrained problem in the frequency domain. After deducing the phase gradient and the step size, an efficient gradient-based algorithm with monotonicity is proposed to minimize the objective function directly. For the design problem without considering the correlation weights, we develop a simplified algorithm, which only requires a few fast Fourier transform (FFT) operations and is more efficient. Because both of the algorithms can be implemented via the FFT operations and the Hadamard product, they are computationally efficient and can be used to design waveform sets with a large waveform number and waveform length. Numerical experiments show that the proposed algorithms can provide better performance than the state-of-the-art algorithms in terms of the computational complexity.
NASA Astrophysics Data System (ADS)
Shayanfar, Mohsen Ali; Barkhordari, Mohammad Ali; Roudak, Mohammad Amin
2017-06-01
Monte Carlo simulation (MCS) is a useful tool for computation of probability of failure in reliability analysis. However, the large number of required random samples makes it time-consuming. Response surface method (RSM) is another common method in reliability analysis. Although RSM is widely used for its simplicity, it cannot be trusted in highly nonlinear problems due to its linear nature. In this paper, a new efficient algorithm, employing the combination of importance sampling, as a class of MCS, and RSM is proposed. In the proposed algorithm, analysis starts with importance sampling concepts and using a represented two-step updating rule of design point. This part finishes after a small number of samples are generated. Then RSM starts to work using Bucher experimental design, with the last design point and a represented effective length as the center point and radius of Bucher's approach, respectively. Through illustrative numerical examples, simplicity and efficiency of the proposed algorithm and the effectiveness of the represented rules are shown.
Preconditioned Alternating Projection Algorithms for Maximum a Posteriori ECT Reconstruction
Krol, Andrzej; Li, Si; Shen, Lixin; Xu, Yuesheng
2012-01-01
We propose a preconditioned alternating projection algorithm (PAPA) for solving the maximum a posteriori (MAP) emission computed tomography (ECT) reconstruction problem. Specifically, we formulate the reconstruction problem as a constrained convex optimization problem with the total variation (TV) regularization. We then characterize the solution of the constrained convex optimization problem and show that it satisfies a system of fixed-point equations defined in terms of two proximity operators raised from the convex functions that define the TV-norm and the constrain involved in the problem. The characterization (of the solution) via the proximity operators that define two projection operators naturally leads to an alternating projection algorithm for finding the solution. For efficient numerical computation, we introduce to the alternating projection algorithm a preconditioning matrix (the EM-preconditioner) for the dense system matrix involved in the optimization problem. We prove theoretically convergence of the preconditioned alternating projection algorithm. In numerical experiments, performance of our algorithms, with an appropriately selected preconditioning matrix, is compared with performance of the conventional MAP expectation-maximization (MAP-EM) algorithm with TV regularizer (EM-TV) and that of the recently developed nested EM-TV algorithm for ECT reconstruction. Based on the numerical experiments performed in this work, we observe that the alternating projection algorithm with the EM-preconditioner outperforms significantly the EM-TV in all aspects including the convergence speed, the noise in the reconstructed images and the image quality. It also outperforms the nested EM-TV in the convergence speed while providing comparable image quality. PMID:23271835
NASA Astrophysics Data System (ADS)
Noble, J. H.; Lubasch, M.; Stevens, J.; Jentschura, U. D.
2017-12-01
We describe a matrix diagonalization algorithm for complex symmetric (not Hermitian) matrices, A ̲ =A̲T, which is based on a two-step algorithm involving generalized Householder reflections based on the indefinite inner product 〈 u ̲ , v ̲ 〉 ∗ =∑iuivi. This inner product is linear in both arguments and avoids complex conjugation. The complex symmetric input matrix is transformed to tridiagonal form using generalized Householder transformations (first step). An iterative, generalized QL decomposition of the tridiagonal matrix employing an implicit shift converges toward diagonal form (second step). The QL algorithm employs iterative deflation techniques when a machine-precision zero is encountered "prematurely" on the super-/sub-diagonal. The algorithm allows for a reliable and computationally efficient computation of resonance and antiresonance energies which emerge from complex-scaled Hamiltonians, and for the numerical determination of the real energy eigenvalues of pseudo-Hermitian and PT-symmetric Hamilton matrices. Numerical reference values are provided.
Advanced Dynamically Adaptive Algorithms for Stochastic Simulations on Extreme Scales
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xiu, Dongbin
2017-03-03
The focus of the project is the development of mathematical methods and high-performance computational tools for stochastic simulations, with a particular emphasis on computations on extreme scales. The core of the project revolves around the design of highly efficient and scalable numerical algorithms that can adaptively and accurately, in high dimensional spaces, resolve stochastic problems with limited smoothness, even containing discontinuities.
Self-learning Monte Carlo method
Liu, Junwei; Qi, Yang; Meng, Zi Yang; ...
2017-01-04
Monte Carlo simulation is an unbiased numerical tool for studying classical and quantum many-body systems. One of its bottlenecks is the lack of a general and efficient update algorithm for large size systems close to the phase transition, for which local updates perform badly. In this Rapid Communication, we propose a general-purpose Monte Carlo method, dubbed self-learning Monte Carlo (SLMC), in which an efficient update algorithm is first learned from the training data generated in trial simulations and then used to speed up the actual simulation. Lastly, we demonstrate the efficiency of SLMC in a spin model at the phasemore » transition point, achieving a 10–20 times speedup.« less
Simulation of 3-D Nonequilibrium Seeded Air Flow in the NASA-Ames MHD Channel
NASA Technical Reports Server (NTRS)
Gupta, Sumeet; Tannehill, John C.; Mehta, Unmeel B.
2004-01-01
The 3-D nonequilibrium seeded air flow in the NASA-Ames experimental MHD channel has been numerically simulated. The channel contains a nozzle section, a center section, and an accelerator section where magnetic and electric fields can be imposed on the flow. In recent tests, velocity increases of up to 40% have been achieved in the accelerator section. The flow in the channel is numerically computed us ing a 3-D parabolized Navier-Stokes (PNS) algorithm that has been developed to efficiently compute MHD flows in the low magnetic Reynolds number regime: The MHD effects are modeled by introducing source terms into the PNS equations which can then be solved in a very efficient manner. The algorithm has been extended in the present study to account for nonequilibrium seeded air flows. The electrical conductivity of the flow is determined using the program of Park. The new algorithm has been used to compute two test cases that match the experimental conditions. In both cases, magnetic and electric fields are applied to the seeded flow. The computed results are in good agreement with the experimental data.
Three-dimensional near-field MIMO array imaging using range migration techniques.
Zhuge, Xiaodong; Yarovoy, Alexander G
2012-06-01
This paper presents a 3-D near-field imaging algorithm that is formulated for 2-D wideband multiple-input-multiple-output (MIMO) imaging array topology. The proposed MIMO range migration technique performs the image reconstruction procedure in the frequency-wavenumber domain. The algorithm is able to completely compensate the curvature of the wavefront in the near-field through a specifically defined interpolation process and provides extremely high computational efficiency by the application of the fast Fourier transform. The implementation aspects of the algorithm and the sampling criteria of a MIMO aperture are discussed. The image reconstruction performance and computational efficiency of the algorithm are demonstrated both with numerical simulations and measurements using 2-D MIMO arrays. Real-time 3-D near-field imaging can be achieved with a real-aperture array by applying the proposed MIMO range migration techniques.
A novel hybrid algorithm for the design of the phase diffractive optical elements for beam shaping
NASA Astrophysics Data System (ADS)
Jiang, Wenbo; Wang, Jun; Dong, Xiucheng
2013-02-01
In this paper, a novel hybrid algorithm for the design of a phase diffractive optical elements (PDOE) is proposed. It combines the genetic algorithm (GA) with the transformable scale BFGS (Broyden, Fletcher, Goldfarb, Shanno) algorithm, the penalty function was used in the cost function definition. The novel hybrid algorithm has the global merits of the genetic algorithm as well as the local improvement capabilities of the transformable scale BFGS algorithm. We designed the PDOE using the conventional simulated annealing algorithm and the novel hybrid algorithm. To compare the performance of two algorithms, three indexes of the diffractive efficiency, uniformity error and the signal-to-noise ratio are considered in numerical simulation. The results show that the novel hybrid algorithm has good convergence property and good stability. As an application example, the PDOE was used for the Gaussian beam shaping; high diffractive efficiency, low uniformity error and high signal-to-noise were obtained. The PDOE can be used for high quality beam shaping such as inertial confinement fusion (ICF), excimer laser lithography, fiber coupling laser diode array, laser welding, etc. It shows wide application value.
Efficient algorithms and implementations of entropy-based moment closures for rarefied gases
NASA Astrophysics Data System (ADS)
Schaerer, Roman Pascal; Bansal, Pratyuksh; Torrilhon, Manuel
2017-07-01
We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following a similar approach as Garrett et al. (2015) [13], we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by exploiting its inherent fine-grained parallelism with the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.
Optimization methods and silicon solar cell numerical models
NASA Technical Reports Server (NTRS)
Girardini, K.
1986-01-01
The goal of this project is the development of an optimization algorithm for use with a solar cell model. It is possible to simultaneously vary design variables such as impurity concentrations, front junction depth, back junctions depth, and cell thickness to maximize the predicted cell efficiency. An optimization algorithm has been developed and interfaced with the Solar Cell Analysis Program in 1 Dimension (SCAPID). SCAPID uses finite difference methods to solve the differential equations which, along with several relations from the physics of semiconductors, describe mathematically the operation of a solar cell. A major obstacle is that the numerical methods used in SCAPID require a significant amount of computer time, and during an optimization the model is called iteratively until the design variables converge to the value associated with the maximum efficiency. This problem has been alleviated by designing an optimization code specifically for use with numerically intensive simulations, to reduce the number of times the efficiency has to be calculated to achieve convergence to the optimal solution. Adapting SCAPID so that it could be called iteratively by the optimization code provided another means of reducing the cpu time required to complete an optimization. Instead of calculating the entire I-V curve, as is usually done in SCAPID, only the efficiency is calculated (maximum power voltage and current) and the solution from previous calculations is used to initiate the next solution.
Das, Swagatam; Mukhopadhyay, Arpan; Roy, Anwit; Abraham, Ajith; Panigrahi, Bijaya K
2011-02-01
The theoretical analysis of evolutionary algorithms is believed to be very important for understanding their internal search mechanism and thus to develop more efficient algorithms. This paper presents a simple mathematical analysis of the explorative search behavior of a recently developed metaheuristic algorithm called harmony search (HS). HS is a derivative-free real parameter optimization algorithm, and it draws inspiration from the musical improvisation process of searching for a perfect state of harmony. This paper analyzes the evolution of the population-variance over successive generations in HS and thereby draws some important conclusions regarding the explorative power of HS. A simple but very useful modification to the classical HS has been proposed in light of the mathematical analysis undertaken here. A comparison with the most recently published variants of HS and four other state-of-the-art optimization algorithms over 15 unconstrained and five constrained benchmark functions reflects the efficiency of the modified HS in terms of final accuracy, convergence speed, and robustness.
Adaptive Load-Balancing Algorithms using Symmetric Broadcast Networks
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak; Biegel, Bryan A. (Technical Monitor)
2002-01-01
In a distributed computing environment, it is important to ensure that the processor workloads are adequately balanced, Among numerous load-balancing algorithms, a unique approach due to Das and Prasad defines a symmetric broadcast network (SBN) that provides a robust communication pattern among the processors in a topology-independent manner. In this paper, we propose and analyze three efficient SBN-based dynamic load-balancing algorithms, and implement them on an SGI Origin2000. A thorough experimental study with Poisson distributed synthetic loads demonstrates that our algorithms are effective in balancing system load. By optimizing completion time and idle time, the proposed algorithms are shown to compare favorably with several existing approaches.
Image restoration by minimizing zero norm of wavelet frame coefficients
NASA Astrophysics Data System (ADS)
Bao, Chenglong; Dong, Bin; Hou, Likun; Shen, Zuowei; Zhang, Xiaoqun; Zhang, Xue
2016-11-01
In this paper, we propose two algorithms, namely the extrapolated proximal iterative hard thresholding (EPIHT) algorithm and the EPIHT algorithm with line-search, for solving the {{\\ell }}0-norm regularized wavelet frame balanced approach for image restoration. Under the theoretical framework of Kurdyka-Łojasiewicz property, we show that the sequences generated by the two algorithms converge to a local minimizer with linear convergence rate. Moreover, extensive numerical experiments on sparse signal reconstruction and wavelet frame based image restoration problems including CT reconstruction, image deblur, demonstrate the improvement of {{\\ell }}0-norm based regularization models over some prevailing ones, as well as the computational efficiency of the proposed algorithms.
A globally well-posed finite element algorithm for aerodynamics applications
NASA Technical Reports Server (NTRS)
Iannelli, G. S.; Baker, A. J.
1991-01-01
A finite element CFD algorithm is developed for Euler and Navier-Stokes aerodynamic applications. For the linear basis, the resultant approximation is at least second-order-accurate in time and space for synergistic use of three procedures: (1) a Taylor weak statement, which provides for derivation of companion conservation law systems with embedded dispersion-error control mechanisms; (2) a stiffly stable second-order-accurate implicit Rosenbrock-Runge-Kutta temporal algorithm; and (3) a matrix tensor product factorization that permits efficient numerical linear algebra handling of the terminal large-matrix statement. Thorough analyses are presented regarding well-posed boundary conditions for inviscid and viscous flow specifications. Numerical solutions are generated and compared for critical evaluation of quasi-one- and two-dimensional Euler and Navier-Stokes benchmark test problems.
An Efficient Moving Target Detection Algorithm Based on Sparsity-Aware Spectrum Estimation
Shen, Mingwei; Wang, Jie; Wu, Di; Zhu, Daiyin
2014-01-01
In this paper, an efficient direct data domain space-time adaptive processing (STAP) algorithm for moving targets detection is proposed, which is achieved based on the distinct spectrum features of clutter and target signals in the angle-Doppler domain. To reduce the computational complexity, the high-resolution angle-Doppler spectrum is obtained by finding the sparsest coefficients in the angle domain using the reduced-dimension data within each Doppler bin. Moreover, we will then present a knowledge-aided block-size detection algorithm that can discriminate between the moving targets and the clutter based on the extracted spectrum features. The feasibility and effectiveness of the proposed method are validated through both numerical simulations and raw data processing results. PMID:25222035
An efficient variable projection formulation for separable nonlinear least squares problems.
Gan, Min; Li, Han-Xiong
2014-05-01
We consider in this paper a class of nonlinear least squares problems in which the model can be represented as a linear combination of nonlinear functions. The variable projection algorithm projects the linear parameters out of the problem, leaving the nonlinear least squares problems involving only the nonlinear parameters. To implement the variable projection algorithm more efficiently, we propose a new variable projection functional based on matrix decomposition. The advantage of the proposed formulation is that the size of the decomposed matrix may be much smaller than those of previous ones. The Levenberg-Marquardt algorithm using finite difference method is then applied to minimize the new criterion. Numerical results show that the proposed approach achieves significant reduction in computing time.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hayami, Masao; Seino, Junji; Nakai, Hiromi, E-mail: nakai@waseda.jp
An efficient algorithm for the rapid evaluation of electron repulsion integrals is proposed. The present method, denoted by accompanying coordinate expansion and transferred recurrence relation (ACE-TRR), is constructed using a transfer relation scheme based on the accompanying coordinate expansion and recurrence relation method. Furthermore, the ACE-TRR algorithm is extended for the general-contraction basis sets. Numerical assessments clarify the efficiency of the ACE-TRR method for the systems including heavy elements, whose orbitals have long contractions and high angular momenta, such as f- and g-orbitals.
NASA Astrophysics Data System (ADS)
Peng, Heng; Liu, Yinghua; Chen, Haofeng
2018-05-01
In this paper, a novel direct method called the stress compensation method (SCM) is proposed for limit and shakedown analysis of large-scale elastoplastic structures. Without needing to solve the specific mathematical programming problem, the SCM is a two-level iterative procedure based on a sequence of linear elastic finite element solutions where the global stiffness matrix is decomposed only once. In the inner loop, the static admissible residual stress field for shakedown analysis is constructed. In the outer loop, a series of decreasing load multipliers are updated to approach to the shakedown limit multiplier by using an efficient and robust iteration control technique, where the static shakedown theorem is adopted. Three numerical examples up to about 140,000 finite element nodes confirm the applicability and efficiency of this method for two-dimensional and three-dimensional elastoplastic structures, with detailed discussions on the convergence and the accuracy of the proposed algorithm.
Analytical and numerical analysis of frictional damage in quasi brittle materials
NASA Astrophysics Data System (ADS)
Zhu, Q. Z.; Zhao, L. Y.; Shao, J. F.
2016-07-01
Frictional sliding and crack growth are two main dissipation processes in quasi brittle materials. The frictional sliding along closed cracks is the origin of macroscopic plastic deformation while the crack growth induces a material damage. The main difficulty of modeling is to consider the inherent coupling between these two processes. Various models and associated numerical algorithms have been proposed. But there are so far no analytical solutions even for simple loading paths for the validation of such algorithms. In this paper, we first present a micro-mechanical model taking into account the damage-friction coupling for a large class of quasi brittle materials. The model is formulated by combining a linear homogenization procedure with the Mori-Tanaka scheme and the irreversible thermodynamics framework. As an original contribution, a series of analytical solutions of stress-strain relations are developed for various loading paths. Based on the micro-mechanical model, two numerical integration algorithms are exploited. The first one involves a coupled friction/damage correction scheme, which is consistent with the coupling nature of the constitutive model. The second one contains a friction/damage decoupling scheme with two consecutive steps: the friction correction followed by the damage correction. With the analytical solutions as reference results, the two algorithms are assessed through a series of numerical tests. It is found that the decoupling correction scheme is efficient to guarantee a systematic numerical convergence.
Ion flux through membrane channels--an enhanced algorithm for the Poisson-Nernst-Planck model.
Dyrka, Witold; Augousti, Andy T; Kotulska, Malgorzata
2008-09-01
A novel algorithmic scheme for numerical solution of the 3D Poisson-Nernst-Planck model is proposed. The algorithmic improvements are universal and independent of the detailed physical model. They include three major steps: an adjustable gradient-based step value, an adjustable relaxation coefficient, and an optimized segmentation of the modeled space. The enhanced algorithm significantly accelerates the speed of computation and reduces the computational demands. The theoretical model was tested on a regular artificial channel and validated on a real protein channel-alpha-hemolysin, proving its efficiency. (c) 2008 Wiley Periodicals, Inc.
An efficient iteration strategy for the solution of the Euler equations
NASA Technical Reports Server (NTRS)
Walters, R. W.; Dwoyer, D. L.
1985-01-01
A line Gauss-Seidel (LGS) relaxation algorithm in conjunction with a one-parameter family of upwind discretizations of the Euler equations in two-dimensions is described. The basic algorithm has the property that convergence to the steady-state is quadratic for fully supersonic flows and linear otherwise. This is in contrast to the block ADI methods (either central or upwind differenced) and the upwind biased relaxation schemes, all of which converge linearly, independent of the flow regime. Moreover, the algorithm presented here is easily enhanced to detect regions of subsonic flow embedded in supersonic flow. This allows marching by lines in the supersonic regions, converging each line quadratically, and iterating in the subsonic regions, thus yielding a very efficient iteration strategy. Numerical results are presented for two-dimensional supersonic and transonic flows containing both oblique and normal shock waves which confirm the efficiency of the iteration strategy.
Efficient solutions to the Euler equations for supersonic flow with embedded subsonic regions
NASA Technical Reports Server (NTRS)
Walters, Robert W.; Dwoyer, Douglas L.
1987-01-01
A line Gauss-Seidel (LGS) relaxation algorithm in conjunction with a one-parameter family of upwind discretizations of the Euler equations in two dimensions is described. Convergence of the basic algorithm to the steady state is quadratic for fully supersonic flows and is linear for other flows. This is in contrast to the block alternating direction implicit methods (either central or upwind differenced) and the upwind biased relaxation schemes, all of which converge linearly, independent of the flow regime. Moreover, the algorithm presented herein is easily coupled with methods to detect regions of subsonic flow embedded in supersonic flow. This allows marching by lines in the supersonic regions, converging each line quadratically, and iterating in the subsonic regions, and yields a very efficient iteration strategy. Numerical results are presented for two-dimensional supersonic and transonic flows containing oblique and normal shock waves which confirm the efficiency of the iteration strategy.
Gradient-based Optimization for Poroelastic and Viscoelastic MR Elastography
Tan, Likun; McGarry, Matthew D.J.; Van Houten, Elijah E.W.; Ji, Ming; Solamen, Ligin; Weaver, John B.
2017-01-01
We describe an efficient gradient computation for solving inverse problems arising in magnetic resonance elastography (MRE). The algorithm can be considered as a generalized ‘adjoint method’ based on a Lagrangian formulation. One requirement for the classic adjoint method is assurance of the self-adjoint property of the stiffness matrix in the elasticity problem. In this paper, we show this property is no longer a necessary condition in our algorithm, but the computational performance can be as efficient as the classic method, which involves only two forward solutions and is independent of the number of parameters to be estimated. The algorithm is developed and implemented in material property reconstructions using poroelastic and viscoelastic modeling. Various gradient- and Hessian-based optimization techniques have been tested on simulation, phantom and in vivo brain data. The numerical results show the feasibility and the efficiency of the proposed scheme for gradient calculation. PMID:27608454
A method for data handling numerical results in parallel OpenFOAM simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anton, Alin; Muntean, Sebastian
Parallel computational fluid dynamics simulations produce vast amount of numerical result data. This paper introduces a method for reducing the size of the data by replaying the interprocessor traffic. The results are recovered only in certain regions of interest configured by the user. A known test case is used for several mesh partitioning scenarios using the OpenFOAM toolkit{sup ®}[1]. The space savings obtained with classic algorithms remain constant for more than 60 Gb of floating point data. Our method is most efficient on large simulation meshes and is much better suited for compressing large scale simulation results than the regular algorithms.
2017-01-01
Computational scientists have designed many useful algorithms by exploring a biological process or imitating natural evolution. These algorithms can be used to solve engineering optimization problems. Inspired by the change of matter state, we proposed a novel optimization algorithm called differential cloud particles evolution algorithm based on data-driven mechanism (CPDD). In the proposed algorithm, the optimization process is divided into two stages, namely, fluid stage and solid stage. The algorithm carries out the strategy of integrating global exploration with local exploitation in fluid stage. Furthermore, local exploitation is carried out mainly in solid stage. The quality of the solution and the efficiency of the search are influenced greatly by the control parameters. Therefore, the data-driven mechanism is designed for obtaining better control parameters to ensure good performance on numerical benchmark problems. In order to verify the effectiveness of CPDD, numerical experiments are carried out on all the CEC2014 contest benchmark functions. Finally, two application problems of artificial neural network are examined. The experimental results show that CPDD is competitive with respect to other eight state-of-the-art intelligent optimization algorithms. PMID:28761438
An Efficient Optimization Method for Solving Unsupervised Data Classification Problems.
Shabanzadeh, Parvaneh; Yusof, Rubiyah
2015-01-01
Unsupervised data classification (or clustering) analysis is one of the most useful tools and a descriptive task in data mining that seeks to classify homogeneous groups of objects based on similarity and is used in many medical disciplines and various applications. In general, there is no single algorithm that is suitable for all types of data, conditions, and applications. Each algorithm has its own advantages, limitations, and deficiencies. Hence, research for novel and effective approaches for unsupervised data classification is still active. In this paper a heuristic algorithm, Biogeography-Based Optimization (BBO) algorithm, was adapted for data clustering problems by modifying the main operators of BBO algorithm, which is inspired from the natural biogeography distribution of different species. Similar to other population-based algorithms, BBO algorithm starts with an initial population of candidate solutions to an optimization problem and an objective function that is calculated for them. To evaluate the performance of the proposed algorithm assessment was carried on six medical and real life datasets and was compared with eight well known and recent unsupervised data classification algorithms. Numerical results demonstrate that the proposed evolutionary optimization algorithm is efficient for unsupervised data classification.
Preconditioned alternating projection algorithms for maximum a posteriori ECT reconstruction
NASA Astrophysics Data System (ADS)
Krol, Andrzej; Li, Si; Shen, Lixin; Xu, Yuesheng
2012-11-01
We propose a preconditioned alternating projection algorithm (PAPA) for solving the maximum a posteriori (MAP) emission computed tomography (ECT) reconstruction problem. Specifically, we formulate the reconstruction problem as a constrained convex optimization problem with the total variation (TV) regularization. We then characterize the solution of the constrained convex optimization problem and show that it satisfies a system of fixed-point equations defined in terms of two proximity operators raised from the convex functions that define the TV-norm and the constraint involved in the problem. The characterization (of the solution) via the proximity operators that define two projection operators naturally leads to an alternating projection algorithm for finding the solution. For efficient numerical computation, we introduce to the alternating projection algorithm a preconditioning matrix (the EM-preconditioner) for the dense system matrix involved in the optimization problem. We prove theoretically convergence of the PAPA. In numerical experiments, performance of our algorithms, with an appropriately selected preconditioning matrix, is compared with performance of the conventional MAP expectation-maximization (MAP-EM) algorithm with TV regularizer (EM-TV) and that of the recently developed nested EM-TV algorithm for ECT reconstruction. Based on the numerical experiments performed in this work, we observe that the alternating projection algorithm with the EM-preconditioner outperforms significantly the EM-TV in all aspects including the convergence speed, the noise in the reconstructed images and the image quality. It also outperforms the nested EM-TV in the convergence speed while providing comparable image quality.
Ren, Qiang; Nagar, Jogender; Kang, Lei; Bian, Yusheng; Werner, Ping; Werner, Douglas H
2017-05-18
A highly efficient numerical approach for simulating the wideband optical response of nano-architectures comprised of Drude-Critical Points (DCP) media (e.g., gold and silver) is proposed and validated through comparing with commercial computational software. The kernel of this algorithm is the subdomain level discontinuous Galerkin time domain (DGTD) method, which can be viewed as a hybrid of the spectral-element time-domain method (SETD) and the finite-element time-domain (FETD) method. An hp-refinement technique is applied to decrease the Degrees-of-Freedom (DoFs) and computational requirements. The collocated E-J scheme facilitates solving the auxiliary equations by converting the inversions of matrices to simpler vector manipulations. A new hybrid time stepping approach, which couples the Runge-Kutta and Newmark methods, is proposed to solve the temporal auxiliary differential equations (ADEs) with a high degree of efficiency. The advantages of this new approach, in terms of computational resource overhead and accuracy, are validated through comparison with well-known commercial software for three diverse cases, which cover both near-field and far-field properties with plane wave and lumped port sources. The presented work provides the missing link between DCP dispersive models and FETD and/or SETD based algorithms. It is a competitive candidate for numerically studying the wideband plasmonic properties of DCP media.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Allgor, R.J.; Feehery, W.F.; Tolsma, J.E.
The batch process development problem serves as good candidate to guide the development of process modeling environments. It demonstrates that very robust numerical techniques are required within an environment that can collect, organize, and maintain the data and models required to address the batch process development problem. This paper focuses on improving the robustness and efficiency of the numerical algorithms required in such a modeling environment through the development of hybrid numerical and symbolic strategies.
Acoustic simulation in architecture with parallel algorithm
NASA Astrophysics Data System (ADS)
Li, Xiaohong; Zhang, Xinrong; Li, Dan
2004-03-01
In allusion to complexity of architecture environment and Real-time simulation of architecture acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in scene is solved with this method. And then the impulse response between sources and receivers at frequency segment, which are calculated with multi-process, are combined into whole frequency response. The numerical experiment shows that parallel arithmetic can improve the acoustic simulating efficiency of complex scene.
Research on an augmented Lagrangian penalty function algorithm for nonlinear programming
NASA Technical Reports Server (NTRS)
Frair, L.
1978-01-01
The augmented Lagrangian (ALAG) Penalty Function Algorithm for optimizing nonlinear mathematical models is discussed. The mathematical models of interest are deterministic in nature and finite dimensional optimization is assumed. A detailed review of penalty function techniques in general and the ALAG technique in particular is presented. Numerical experiments are conducted utilizing a number of nonlinear optimization problems to identify an efficient ALAG Penalty Function Technique for computer implementation.
Random sequential adsorption of cubes
NASA Astrophysics Data System (ADS)
Cieśla, Michał; Kubala, Piotr
2018-01-01
Random packings built of cubes are studied numerically using a random sequential adsorption algorithm. To compare the obtained results with previous reports, three different models of cube orientation sampling were used. Also, three different cube-cube intersection algorithms were tested to find the most efficient one. The study focuses on the mean saturated packing fraction as well as kinetics of packing growth. Microstructural properties of packings were analyzed using density autocorrelation function.
Bayesian block-diagonal variable selection and model averaging
Papaspiliopoulos, O.; Rossell, D.
2018-01-01
Summary We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501
Development and application of unified algorithms for problems in computational science
NASA Technical Reports Server (NTRS)
Shankar, Vijaya; Chakravarthy, Sukumar
1987-01-01
A framework is presented for developing computationally unified numerical algorithms for solving nonlinear equations that arise in modeling various problems in mathematical physics. The concept of computational unification is an attempt to encompass efficient solution procedures for computing various nonlinear phenomena that may occur in a given problem. For example, in Computational Fluid Dynamics (CFD), a unified algorithm will be one that allows for solutions to subsonic (elliptic), transonic (mixed elliptic-hyperbolic), and supersonic (hyperbolic) flows for both steady and unsteady problems. The objectives are: development of superior unified algorithms emphasizing accuracy and efficiency aspects; development of codes based on selected algorithms leading to validation; application of mature codes to realistic problems; and extension/application of CFD-based algorithms to problems in other areas of mathematical physics. The ultimate objective is to achieve integration of multidisciplinary technologies to enhance synergism in the design process through computational simulation. Specific unified algorithms for a hierarchy of gas dynamics equations and their applications to two other areas: electromagnetic scattering, and laser-materials interaction accounting for melting.
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo
McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai; ...
2017-11-07
Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with applicationmore » of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi- core CPUs and GPUs.« less
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai
Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with applicationmore » of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi- core CPUs and GPUs.« less
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo.
McDaniel, T; D'Azevedo, E F; Li, Y W; Wong, K; Kent, P R C
2017-11-07
Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.
Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo
NASA Astrophysics Data System (ADS)
McDaniel, T.; D'Azevedo, E. F.; Li, Y. W.; Wong, K.; Kent, P. R. C.
2017-11-01
Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.
A heuristic approach using multiple criteria for environmentally benign 3PLs selection
NASA Astrophysics Data System (ADS)
Kongar, Elif
2005-11-01
Maintaining competitiveness in an environment where price and quality differences between competing products are disappearing depends on the company's ability to reduce costs and supply time. Timely responses to rapidly changing market conditions require an efficient Supply Chain Management (SCM). Outsourcing logistics to third-party logistics service providers (3PLs) is one commonly used way of increasing the efficiency of logistics operations, while creating a more "core competency focused" business environment. However, this alone may not be sufficient. Due to recent environmental regulations and growing public awareness regarding environmental issues, 3PLs need to be not only efficient but also environmentally benign to maintain companies' competitiveness. Even though an efficient and environmentally benign combination of 3PLs can theoretically be obtained using exhaustive search algorithms, heuristics approaches to the selection process may be superior in terms of the computational complexity. In this paper, a hybrid approach that combines a multiple criteria Genetic Algorithm (GA) with Linear Physical Weighting Algorithm (LPPW) to be used in efficient and environmentally benign 3PLs is proposed. A numerical example is also provided to illustrate the method and the analyses.
NASA Astrophysics Data System (ADS)
Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; Govind, Niranjan; Yang, Chao
2017-12-01
We present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.
A study of metaheuristic algorithms for high dimensional feature selection on microarray data
NASA Astrophysics Data System (ADS)
Dankolo, Muhammad Nasiru; Radzi, Nor Haizan Mohamed; Sallehuddin, Roselina; Mustaffa, Noorfa Haszlinna
2017-11-01
Microarray systems enable experts to examine gene profile at molecular level using machine learning algorithms. It increases the potentials of classification and diagnosis of many diseases at gene expression level. Though, numerous difficulties may affect the efficiency of machine learning algorithms which includes vast number of genes features comprised in the original data. Many of these features may be unrelated to the intended analysis. Therefore, feature selection is necessary to be performed in the data pre-processing. Many feature selection algorithms are developed and applied on microarray which including the metaheuristic optimization algorithms. This paper discusses the application of the metaheuristics algorithms for feature selection in microarray dataset. This study reveals that, the algorithms have yield an interesting result with limited resources thereby saving computational expenses of machine learning algorithms.
Physiology driven adaptivity for the numerical solution of the bidomain equations.
Whiteley, Jonathan P
2007-09-01
Previous work [Whiteley, J. P. IEEE Trans. Biomed. Eng. 53:2139-2147, 2006] derived a stable, semi-implicit numerical scheme for solving the bidomain equations. This scheme allows the timestep used when solving the bidomain equations numerically to be chosen by accuracy considerations rather than stability considerations. In this study we modify this scheme to allow an adaptive numerical solution in both time and space. The spatial mesh size is determined by the gradient of the transmembrane and extracellular potentials while the timestep is determined by the values of: (i) the fast sodium current; and (ii) the calcium release from junctional sarcoplasmic reticulum to myoplasm current. For two-dimensional simulations presented here, combining the numerical algorithm in the paper cited above with the adaptive algorithm presented here leads to an increase in computational efficiency by a factor of around 250 over previous work, together with significantly less computational memory being required. The speedup for three-dimensional simulations is likely to be more impressive.
Unconventional Hamilton-type variational principle in phase space and symplectic algorithm
NASA Astrophysics Data System (ADS)
Luo, En; Huang, Weijiang; Zhang, Hexin
2003-06-01
By a novel approach proposed by Luo, the unconventional Hamilton-type variational principle in phase space for elastodynamics of multidegree-of-freedom system is established in this paper. It not only can fully characterize the initial-value problem of this dynamic, but also has a natural symplectic structure. Based on this variational principle, a symplectic algorithm which is called a symplectic time-subdomain method is proposed. A non-difference scheme is constructed by applying Lagrange interpolation polynomial to the time subdomain. Furthermore, it is also proved that the presented symplectic algorithm is an unconditionally stable one. From the results of the two numerical examples of different types, it can be seen that the accuracy and the computational efficiency of the new method excel obviously those of widely used Wilson-θ and Newmark-β methods. Therefore, this new algorithm is a highly efficient one with better computational performance.
Implicit flux-split schemes for the Euler equations
NASA Technical Reports Server (NTRS)
Thomas, J. L.; Walters, R. W.; Van Leer, B.
1985-01-01
Recent progress in the development of implicit algorithms for the Euler equations using the flux-vector splitting method is described. Comparisons of the relative efficiency of relaxation and spatially-split approximately factored methods on a vector processor for two-dimensional flows are made. For transonic flows, the higher convergence rate per iteration of the Gauss-Seidel relaxation algorithms, which are only partially vectorizable, is amply compensated for by the faster computational rate per iteration of the approximately factored algorithm. For supersonic flows, the fully-upwind line-relaxation method is more efficient since the numerical domain of dependence is more closely matched to the physical domain of dependence. A hybrid three-dimensional algorithm using relaxation in one coordinate direction and approximate factorization in the cross-flow plane is developed and applied to a forebody shape at supersonic speeds and a swept, tapered wing at transonic speeds.
A time-spectral approach to numerical weather prediction
NASA Astrophysics Data System (ADS)
Scheffel, Jan; Lindvall, Kristoffer; Yik, Hiu Fai
2018-05-01
Finite difference methods are traditionally used for modelling the time domain in numerical weather prediction (NWP). Time-spectral solution is an attractive alternative for reasons of accuracy and efficiency and because time step limitations associated with causal CFL-like criteria, typical for explicit finite difference methods, are avoided. In this work, the Lorenz 1984 chaotic equations are solved using the time-spectral algorithm GWRM (Generalized Weighted Residual Method). Comparisons of accuracy and efficiency are carried out for both explicit and implicit time-stepping algorithms. It is found that the efficiency of the GWRM compares well with these methods, in particular at high accuracy. For perturbative scenarios, the GWRM was found to be as much as four times faster than the finite difference methods. A primary reason is that the GWRM time intervals typically are two orders of magnitude larger than those of the finite difference methods. The GWRM has the additional advantage to produce analytical solutions in the form of Chebyshev series expansions. The results are encouraging for pursuing further studies, including spatial dependence, of the relevance of time-spectral methods for NWP modelling.
NASA Technical Reports Server (NTRS)
Kao, M. H.; Bodenheimer, R. E.
1976-01-01
The tse computer's capability of achieving image congruence between temporal and multiple images with misregistration due to rotational differences is reported. The coordinate transformations are obtained and a general algorithms is devised to perform image rotation using tse operations very efficiently. The details of this algorithm as well as its theoretical implications are presented. Step by step procedures of image registration are described in detail. Numerous examples are also employed to demonstrate the correctness and the effectiveness of the algorithms and conclusions and recommendations are made.
Efficient iterative image reconstruction algorithm for dedicated breast CT
NASA Astrophysics Data System (ADS)
Antropova, Natalia; Sanchez, Adrian; Reiser, Ingrid S.; Sidky, Emil Y.; Boone, John; Pan, Xiaochuan
2016-03-01
Dedicated breast computed tomography (bCT) is currently being studied as a potential screening method for breast cancer. The X-ray exposure is set low to achieve an average glandular dose comparable to that of mammography, yielding projection data that contains high levels of noise. Iterative image reconstruction (IIR) algorithms may be well-suited for the system since they potentially reduce the effects of noise in the reconstructed images. However, IIR outcomes can be difficult to control since the algorithm parameters do not directly correspond to the image properties. Also, IIR algorithms are computationally demanding and have optimal parameter settings that depend on the size and shape of the breast and positioning of the patient. In this work, we design an efficient IIR algorithm with meaningful parameter specifications and that can be used on a large, diverse sample of bCT cases. The flexibility and efficiency of this method comes from having the final image produced by a linear combination of two separately reconstructed images - one containing gray level information and the other with enhanced high frequency components. Both of the images result from few iterations of separate IIR algorithms. The proposed algorithm depends on two parameters both of which have a well-defined impact on image quality. The algorithm is applied to numerous bCT cases from a dedicated bCT prototype system developed at University of California, Davis.
Viewing Angle Classification of Cryo-Electron Microscopy Images Using Eigenvectors
Singer, A.; Zhao, Z.; Shkolnisky, Y.; Hadani, R.
2012-01-01
The cryo-electron microscopy (cryo-EM) reconstruction problem is to find the three-dimensional structure of a macromolecule given noisy versions of its two-dimensional projection images at unknown random directions. We introduce a new algorithm for identifying noisy cryo-EM images of nearby viewing angles. This identification is an important first step in three-dimensional structure determination of macromolecules from cryo-EM, because once identified, these images can be rotationally aligned and averaged to produce “class averages” of better quality. The main advantage of our algorithm is its extreme robustness to noise. The algorithm is also very efficient in terms of running time and memory requirements, because it is based on the computation of the top few eigenvectors of a specially designed sparse Hermitian matrix. These advantages are demonstrated in numerous numerical experiments. PMID:22506089
NASA Astrophysics Data System (ADS)
Doha, Eid H.; Bhrawy, Ali H.; Abdelkawy, Mohammed A.
2014-09-01
In this paper, we propose an efficient spectral collocation algorithm to solve numerically wave type equations subject to initial, boundary and non-local conservation conditions. The shifted Jacobi pseudospectral approximation is investigated for the discretization of the spatial variable of such equations. It possesses spectral accuracy in the spatial variable. The shifted Jacobi-Gauss-Lobatto (SJ-GL) quadrature rule is established for treating the non-local conservation conditions, and then the problem with its initial and non-local boundary conditions are reduced to a system of second-order ordinary differential equations in temporal variable. This system is solved by two-stage forth-order A-stable implicit RK scheme. Five numerical examples with comparisons are given. The computational results demonstrate that the proposed algorithm is more accurate than finite difference method, method of lines and spline collocation approach
Utilizing Visual Effects Software for Efficient and Flexible Isostatic Adjustment Modelling
NASA Astrophysics Data System (ADS)
Meldgaard, A.; Nielsen, L.; Iaffaldano, G.
2017-12-01
The isostatic adjustment signal generated by transient ice sheet loading is an important indicator of past ice sheet extent and the rheological constitution of the interior of the Earth. Finite element modelling has proved to be a very useful tool in these studies. We present a simple numerical model for 3D visco elastic Earth deformation and a new approach to the design of such models utilizing visual effects software designed for the film and game industry. The software package Houdini offers an assortment of optimized tools and libraries which greatly facilitate the creation of efficient numerical algorithms. In particular, we make use of Houdini's procedural work flow, the SIMD programming language VEX, Houdini's sparse matrix creation and inversion libraries, an inbuilt tetrahedralizer for grid creation, and the user interface, which facilitates effortless manipulation of 3D geometry. We mitigate many of the time consuming steps associated with the authoring of efficient algorithms from scratch while still keeping the flexibility that may be lost with the use of commercial dedicated finite element programs. We test the efficiency of the algorithm by comparing simulation times with off-the-shelf solutions from the Abaqus software package. The algorithm is tailored for the study of local isostatic adjustment patterns, in close vicinity to present ice sheet margins. In particular, we wish to examine possible causes for the considerable spatial differences in the uplift magnitude which are apparent from field observations in these areas. Such features, with spatial scales of tens of kilometres, are not resolvable with current global isostatic adjustment models, and may require the inclusion of local topographic features. We use the presented algorithm to study a near field area where field observations are abundant, namely, Disko Bay in West Greenland with the intention of constraining Earth parameters and ice thickness. In addition, we assess how local topographic features may influence the differential isostatic uplift in the area.
A curvilinear, fully implicit, conservative electromagnetic PIC algorithm in multiple dimensions
Chacon, L.; Chen, G.
2016-04-19
Here, we extend a recently proposed fully implicit PIC algorithm for the Vlasov–Darwin model in multiple dimensions (Chen and Chacón (2015) [1]) to curvilinear geometry. As in the Cartesian case, the approach is based on a potential formulation (Φ, A), and overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. Conservation theorems for local charge and global energy are derived in curvilinear representation, and then enforced discretely by a careful choice of the discretization of field and particle equations. Additionally, the algorithm conserves canonical-momentum in any ignorable direction, and preserves the Coulomb gauge ∇ • A = 0 exactly. Anmore » asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. We demonstrate the accuracy and efficiency properties of the algorithm with numerical experiments in mapped meshes in 1D-3V and 2D-3V.« less
A curvilinear, fully implicit, conservative electromagnetic PIC algorithm in multiple dimensions
NASA Astrophysics Data System (ADS)
Chacón, L.; Chen, G.
2016-07-01
We extend a recently proposed fully implicit PIC algorithm for the Vlasov-Darwin model in multiple dimensions (Chen and Chacón (2015) [1]) to curvilinear geometry. As in the Cartesian case, the approach is based on a potential formulation (ϕ, A), and overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. Conservation theorems for local charge and global energy are derived in curvilinear representation, and then enforced discretely by a careful choice of the discretization of field and particle equations. Additionally, the algorithm conserves canonical-momentum in any ignorable direction, and preserves the Coulomb gauge ∇ ṡ A = 0 exactly. An asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. We demonstrate the accuracy and efficiency properties of the algorithm with numerical experiments in mapped meshes in 1D-3V and 2D-3V.
A curvilinear, fully implicit, conservative electromagnetic PIC algorithm in multiple dimensions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chacon, L.; Chen, G.
Here, we extend a recently proposed fully implicit PIC algorithm for the Vlasov–Darwin model in multiple dimensions (Chen and Chacón (2015) [1]) to curvilinear geometry. As in the Cartesian case, the approach is based on a potential formulation (Φ, A), and overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. Conservation theorems for local charge and global energy are derived in curvilinear representation, and then enforced discretely by a careful choice of the discretization of field and particle equations. Additionally, the algorithm conserves canonical-momentum in any ignorable direction, and preserves the Coulomb gauge ∇ • A = 0 exactly. Anmore » asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. We demonstrate the accuracy and efficiency properties of the algorithm with numerical experiments in mapped meshes in 1D-3V and 2D-3V.« less
NASA Astrophysics Data System (ADS)
Plattner, A.; Maurer, H. R.; Vorloeper, J.; Dahmen, W.
2010-08-01
Despite the ever-increasing power of modern computers, realistic modelling of complex 3-D earth models is still a challenging task and requires substantial computing resources. The overwhelming majority of current geophysical modelling approaches includes either finite difference or non-adaptive finite element algorithms and variants thereof. These numerical methods usually require the subsurface to be discretized with a fine mesh to accurately capture the behaviour of the physical fields. However, this may result in excessive memory consumption and computing times. A common feature of most of these algorithms is that the modelled data discretizations are independent of the model complexity, which may be wasteful when there are only minor to moderate spatial variations in the subsurface parameters. Recent developments in the theory of adaptive numerical solvers have the potential to overcome this problem. Here, we consider an adaptive wavelet-based approach that is applicable to a large range of problems, also including nonlinear problems. In comparison with earlier applications of adaptive solvers to geophysical problems we employ here a new adaptive scheme whose core ingredients arose from a rigorous analysis of the overall asymptotically optimal computational complexity, including in particular, an optimal work/accuracy rate. Our adaptive wavelet algorithm offers several attractive features: (i) for a given subsurface model, it allows the forward modelling domain to be discretized with a quasi minimal number of degrees of freedom, (ii) sparsity of the associated system matrices is guaranteed, which makes the algorithm memory efficient and (iii) the modelling accuracy scales linearly with computing time. We have implemented the adaptive wavelet algorithm for solving 3-D geoelectric problems. To test its performance, numerical experiments were conducted with a series of conductivity models exhibiting varying degrees of structural complexity. Results were compared with a non-adaptive finite element algorithm, which incorporates an unstructured mesh to best-fitting subsurface boundaries. Such algorithms represent the current state-of-the-art in geoelectric modelling. An analysis of the numerical accuracy as a function of the number of degrees of freedom revealed that the adaptive wavelet algorithm outperforms the finite element solver for simple and moderately complex models, whereas the results become comparable for models with high spatial variability of electrical conductivities. The linear dependence of the modelling error and the computing time proved to be model-independent. This feature will allow very efficient computations using large-scale models as soon as our experimental code is optimized in terms of its implementation.
A generalized Condat's algorithm of 1D total variation regularization
NASA Astrophysics Data System (ADS)
Makovetskii, Artyom; Voronin, Sergei; Kober, Vitaly
2017-09-01
A common way for solving the denosing problem is to utilize the total variation (TV) regularization. Many efficient numerical algorithms have been developed for solving the TV regularization problem. Condat described a fast direct algorithm to compute the processed 1D signal. Also there exists a direct algorithm with a linear time for 1D TV denoising referred to as the taut string algorithm. The Condat's algorithm is based on a dual problem to the 1D TV regularization. In this paper, we propose a variant of the Condat's algorithm based on the direct 1D TV regularization problem. The usage of the Condat's algorithm with the taut string approach leads to a clear geometric description of the extremal function. Computer simulation results are provided to illustrate the performance of the proposed algorithm for restoration of degraded signals.
Chen, G.; Chacón, L.
2015-08-11
For decades, the Vlasov–Darwin model has been recognized to be attractive for particle-in-cell (PIC) kinetic plasma simulations in non-radiative electromagnetic regimes, to avoid radiative noise issues and gain computational efficiency. However, the Darwin model results in an elliptic set of field equations that renders conventional explicit time integration unconditionally unstable. We explore a fully implicit PIC algorithm for the Vlasov–Darwin model in multiple dimensions, which overcomes many difficulties of traditional semi-implicit Darwin PIC algorithms. The finite-difference scheme for Darwin field equations and particle equations of motion is space–time-centered, employing particle sub-cycling and orbit-averaging. This algorithm conserves total energy, local charge,more » canonical-momentum in the ignorable direction, and preserves the Coulomb gauge exactly. An asymptotically well-posed fluid preconditioner allows efficient use of large cell sizes, which are determined by accuracy considerations, not stability, and can be orders of magnitude larger than required in a standard explicit electromagnetic PIC simulation. Finally, we demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 2D–3V.« less
A model reduction approach to numerical inversion for a parabolic partial differential equation
NASA Astrophysics Data System (ADS)
Borcea, Liliana; Druskin, Vladimir; Mamonov, Alexander V.; Zaslavsky, Mikhail
2014-12-01
We propose a novel numerical inversion algorithm for the coefficients of parabolic partial differential equations, based on model reduction. The study is motivated by the application of controlled source electromagnetic exploration, where the unknown is the subsurface electrical resistivity and the data are time resolved surface measurements of the magnetic field. The algorithm presented in this paper considers inversion in one and two dimensions. The reduced model is obtained with rational interpolation in the frequency (Laplace) domain and a rational Krylov subspace projection method. It amounts to a nonlinear mapping from the function space of the unknown resistivity to the small dimensional space of the parameters of the reduced model. We use this mapping as a nonlinear preconditioner for the Gauss-Newton iterative solution of the inverse problem. The advantage of the inversion algorithm is twofold. First, the nonlinear preconditioner resolves most of the nonlinearity of the problem. Thus the iterations are less likely to get stuck in local minima and the convergence is fast. Second, the inversion is computationally efficient because it avoids repeated accurate simulations of the time-domain response. We study the stability of the inversion algorithm for various rational Krylov subspaces, and assess its performance with numerical experiments.
On the Solution of Elliptic Partial Differential Equations on Regions with Corners
2015-07-09
In this report we investigate the solution of boundary value problems on polygonal domains for elliptic partial differential equations . We observe...that when the problems are formulated as the boundary integral equations of classical potential theory, the solutions are representable by series of...efficient numerical algorithms. The results are illustrated by a number of numerical examples. On the solution of elliptic partial differential equations on
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sheng, Zheng, E-mail: 19994035@sina.com; Wang, Jun; Zhou, Bihua
2014-03-15
This paper introduces a novel hybrid optimization algorithm to establish the parameters of chaotic systems. In order to deal with the weaknesses of the traditional cuckoo search algorithm, the proposed adaptive cuckoo search with simulated annealing algorithm is presented, which incorporates the adaptive parameters adjusting operation and the simulated annealing operation in the cuckoo search algorithm. Normally, the parameters of the cuckoo search algorithm are kept constant that may result in decreasing the efficiency of the algorithm. For the purpose of balancing and enhancing the accuracy and convergence rate of the cuckoo search algorithm, the adaptive operation is presented tomore » tune the parameters properly. Besides, the local search capability of cuckoo search algorithm is relatively weak that may decrease the quality of optimization. So the simulated annealing operation is merged into the cuckoo search algorithm to enhance the local search ability and improve the accuracy and reliability of the results. The functionality of the proposed hybrid algorithm is investigated through the Lorenz chaotic system under the noiseless and noise condition, respectively. The numerical results demonstrate that the method can estimate parameters efficiently and accurately in the noiseless and noise condition. Finally, the results are compared with the traditional cuckoo search algorithm, genetic algorithm, and particle swarm optimization algorithm. Simulation results demonstrate the effectiveness and superior performance of the proposed algorithm.« less
Recent advances in numerical PDEs
NASA Astrophysics Data System (ADS)
Zuev, Julia Michelle
In this thesis, we investigate four neighboring topics, all in the general area of numerical methods for solving Partial Differential Equations (PDEs). Topic 1. Radial Basis Functions (RBF) are widely used for multi-dimensional interpolation of scattered data. This methodology offers smooth and accurate interpolants, which can be further refined, if necessary, by clustering nodes in select areas. We show, however, that local refinements with RBF (in a constant shape parameter [varepsilon] regime) may lead to the oscillatory errors associated with the Runge phenomenon (RP). RP is best known in the case of high-order polynomial interpolation, where its effects can be accurately predicted via Lebesgue constant L (which is based solely on the node distribution). We study the RP and the applicability of Lebesgue constant (as well as other error measures) in RBF interpolation. Mainly, we allow for a spatially variable shape parameter, and demonstrate how it can be used to suppress RP-like edge effects and to improve the overall stability and accuracy. Topic 2. Although not as versatile as RBFs, cubic splines are useful for interpolating grid-based data. In 2-D, we consider a patch representation via Hermite basis functions s i,j ( u, v ) = [Special characters omitted.] h mn H m ( u ) H n ( v ), as opposed to the standard bicubic representation. Stitching requirements for the rectangular non-equispaced grid yield a 2-D tridiagonal linear system AX = B, where X represents the unknown first derivatives. We discover that the standard methods for solving this NxM system do not take advantage of the spline-specific format of the matrix B. We develop an alternative approach using this specialization of the RHS, which allows us to pre-compute coefficients only once, instead of N times. MATLAB implementation of our fast 2-D cubic spline algorithm is provided. We confirm analytically and numerically that for large N ( N > 200), our method is at least 3 times faster than the standard algorithm and is just as accurate. Topic 3. The well-known ADI-FDTD method for solving Maxwell's curl equations is second-order accurate in space/time, unconditionally stable, and computationally efficient. We research Richardson extrapolation -based techniques to improve time discretization accuracy for spatially oversampled ADI-FDTD. A careful analysis of temporal accuracy, computational efficiency, and the algorithm's overall stability is presented. Given the context of wave- type PDEs, we find that only a limited number of extrapolations to the ADI-FDTD method are beneficial, if its unconditional stability is to be preserved. We propose a practical approach for choosing the size of a time step that can be used to improve the efficiency of the ADI-FDTD algorithm, while maintaining its accuracy and stability. Topic 4. Shock waves and their energy dissipation properties are critical to understanding the dynamics controlling the MHD turbulence. Numerical advection algorithms used in MHD solvers (e.g. the ZEUS package) introduce undesirable numerical viscosity. To counteract its effects and to resolve shocks numerically, Richtmyer and von Neumann's artificial viscosity is commonly added to the model. We study shock power by analyzing the influence of both artificial and numerical viscosity on energy decay rates. Also, we analytically characterize the numerical diffusivity of various advection algorithms by quantifying their diffusion coefficients e.
Efficient algorithms and implementations of entropy-based moment closures for rarefied gases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schaerer, Roman Pascal, E-mail: schaerer@mathcces.rwth-aachen.de; Bansal, Pratyuksh; Torrilhon, Manuel
We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following a similar approach as Garrett et al. (2015) , we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropymore » distribution by exploiting its inherent fine-grained parallelism with the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.« less
NASA Astrophysics Data System (ADS)
Luo, Li; Wang, Xiao-Ping; Cai, Xiao-Chuan
2017-11-01
We study numerically the dynamics of a three-dimensional droplet spreading on a rough solid surface using a phase-field model consisting of the coupled Cahn-Hilliard and Navier-Stokes equations with a generalized Navier boundary condition (GNBC). An efficient finite element method on unstructured meshes is introduced to cope with the complex geometry of the solid surfaces. We extend the GNBC to surfaces with complex geometry by including its weak form along different normal and tangential directions in the finite element formulation. The semi-implicit time discretization scheme results in a decoupled system for the phase function, the velocity, and the pressure. In addition, a mass compensation algorithm is introduced to preserve the mass of the droplet. To efficiently solve the decoupled systems, we present a highly parallel solution strategy based on domain decomposition techniques. We validate the newly developed solution method through extensive numerical experiments, particularly for those phenomena that can not be achieved by two-dimensional simulations. On a surface with circular posts, we study how wettability of the rough surface depends on the geometry of the posts. The contact line motion for a droplet spreading over some periodic rough surfaces are also efficiently computed. Moreover, we study the spreading process of an impacting droplet on a microstructured surface, a qualitative agreement is achieved between the numerical and experimental results. The parallel performance suggests that the proposed solution algorithm is scalable with over 4,000 processors cores with tens of millions of unknowns.
Modified GMDH-NN algorithm and its application for global sensitivity analysis
NASA Astrophysics Data System (ADS)
Song, Shufang; Wang, Lu
2017-11-01
Global sensitivity analysis (GSA) is a very useful tool to evaluate the influence of input variables in the whole distribution range. Sobol' method is the most commonly used among variance-based methods, which are efficient and popular GSA techniques. High dimensional model representation (HDMR) is a popular way to compute Sobol' indices, however, its drawbacks cannot be ignored. We show that modified GMDH-NN algorithm can calculate coefficients of metamodel efficiently, so this paper aims at combining it with HDMR and proposes GMDH-HDMR method. The new method shows higher precision and faster convergent rate. Several numerical and engineering examples are used to confirm its advantages.
NASA Astrophysics Data System (ADS)
Kaveh, A.; Zolghadr, A.
2017-08-01
Structural optimization with frequency constraints is seen as a challenging problem because it is associated with highly nonlinear, discontinuous and non-convex search spaces consisting of several local optima. Therefore, competent optimization algorithms are essential for addressing these problems. In this article, a newly developed metaheuristic method called the cyclical parthenogenesis algorithm (CPA) is used for layout optimization of truss structures subjected to frequency constraints. CPA is a nature-inspired, population-based metaheuristic algorithm, which imitates the reproductive and social behaviour of some animal species such as aphids, which alternate between sexual and asexual reproduction. The efficiency of the CPA is validated using four numerical examples.
An O(Nm(sup 2)) Plane Solver for the Compressible Navier-Stokes Equations
NASA Technical Reports Server (NTRS)
Thomas, J. L.; Bonhaus, D. L.; Anderson, W. K.; Rumsey, C. L.; Biedron, R. T.
1999-01-01
A hierarchical multigrid algorithm for efficient steady solutions to the two-dimensional compressible Navier-Stokes equations is developed and demonstrated. The algorithm applies multigrid in two ways: a Full Approximation Scheme (FAS) for a nonlinear residual equation and a Correction Scheme (CS) for a linearized defect correction implicit equation. Multigrid analyses which include the effect of boundary conditions in one direction are used to estimate the convergence rate of the algorithm for a model convection equation. Three alternating-line- implicit algorithms are compared in terms of efficiency. The analyses indicate that full multigrid efficiency is not attained in the general case; the number of cycles to attain convergence is dependent on the mesh density for high-frequency cross-stream variations. However, the dependence is reasonably small and fast convergence is eventually attained for any given frequency with either the FAS or the CS scheme alone. The paper summarizes numerical computations for which convergence has been attained to within truncation error in a few multigrid cycles for both inviscid and viscous ow simulations on highly stretched meshes.
An efficient HZETRN (a galactic cosmic ray transport code)
NASA Technical Reports Server (NTRS)
Shinn, Judy L.; Wilson, John W.
1992-01-01
An accurate and efficient engineering code for analyzing the shielding requirements against the high-energy galactic heavy ions is needed. The HZETRN is a deterministic code developed at Langley Research Center that is constantly under improvement both in physics and numerical computation and is targeted for such use. One problem area connected with the space-marching technique used in this code is the propagation of the local truncation error. By improving the numerical algorithms for interpolation, integration, and grid distribution formula, the efficiency of the code is increased by a factor of eight as the number of energy grid points is reduced. The numerical accuracy of better than 2 percent for a shield thickness of 150 g/cm(exp 2) is found when a 45 point energy grid is used. The propagating step size, which is related to the perturbation theory, is also reevaluated.
Visual Tracking via Sparse and Local Linear Coding.
Wang, Guofeng; Qin, Xueying; Zhong, Fan; Liu, Yue; Li, Hongbo; Peng, Qunsheng; Yang, Ming-Hsuan
2015-11-01
The state search is an important component of any object tracking algorithm. Numerous algorithms have been proposed, but stochastic sampling methods (e.g., particle filters) are arguably one of the most effective approaches. However, the discretization of the state space complicates the search for the precise object location. In this paper, we propose a novel tracking algorithm that extends the state space of particle observations from discrete to continuous. The solution is determined accurately via iterative linear coding between two convex hulls. The algorithm is modeled by an optimal function, which can be efficiently solved by either convex sparse coding or locality constrained linear coding. The algorithm is also very flexible and can be combined with many generic object representations. Thus, we first use sparse representation to achieve an efficient searching mechanism of the algorithm and demonstrate its accuracy. Next, two other object representation models, i.e., least soft-threshold squares and adaptive structural local sparse appearance, are implemented with improved accuracy to demonstrate the flexibility of our algorithm. Qualitative and quantitative experimental results demonstrate that the proposed tracking algorithm performs favorably against the state-of-the-art methods in dynamic scenes.
NASA Astrophysics Data System (ADS)
Jiang, Daijun; Li, Zhiyuan; Liu, Yikan; Yamamoto, Masahiro
2017-05-01
In this paper, we first establish a weak unique continuation property for time-fractional diffusion-advection equations. The proof is mainly based on the Laplace transform and the unique continuation properties for elliptic and parabolic equations. The result is weaker than its parabolic counterpart in the sense that we additionally impose the homogeneous boundary condition. As a direct application, we prove the uniqueness for an inverse problem on determining the spatial component in the source term by interior measurements. Numerically, we reformulate our inverse source problem as an optimization problem, and propose an iterative thresholding algorithm. Finally, several numerical experiments are presented to show the accuracy and efficiency of the algorithm.
A time-accurate algorithm for chemical non-equilibrium viscous flows at all speeds
NASA Technical Reports Server (NTRS)
Shuen, J.-S.; Chen, K.-H.; Choi, Y.
1992-01-01
A time-accurate, coupled solution procedure is described for the chemical nonequilibrium Navier-Stokes equations over a wide range of Mach numbers. This method employs the strong conservation form of the governing equations, but uses primitive variables as unknowns. Real gas properties and equilibrium chemistry are considered. Numerical tests include steady convergent-divergent nozzle flows with air dissociation/recombination chemistry, dump combustor flows with n-pentane-air chemistry, nonreacting flow in a model double annular combustor, and nonreacting unsteady driven cavity flows. Numerical results for both the steady and unsteady flows demonstrate the efficiency and robustness of the present algorithm for Mach numbers ranging from the incompressible limit to supersonic speeds.
Parallel solution of high-order numerical schemes for solving incompressible flows
NASA Technical Reports Server (NTRS)
Milner, Edward J.; Lin, Avi; Liou, May-Fun; Blech, Richard A.
1993-01-01
A new parallel numerical scheme for solving incompressible steady-state flows is presented. The algorithm uses a finite-difference approach to solving the Navier-Stokes equations. The algorithms are scalable and expandable. They may be used with only two processors or with as many processors as are available. The code is general and expandable. Any size grid may be used. Four processors of the NASA LeRC Hypercluster were used to solve for steady-state flow in a driven square cavity. The Hypercluster was configured in a distributed-memory, hypercube-like architecture. By using a 50-by-50 finite-difference solution grid, an efficiency of 74 percent (a speedup of 2.96) was obtained.
A mixed-mode traffic assignment model with new time-flow impedance function
NASA Astrophysics Data System (ADS)
Lin, Gui-Hua; Hu, Yu; Zou, Yuan-Yang
2018-01-01
Recently, with the wide adoption of electric vehicles, transportation network has shown different characteristics and been further developed. In this paper, we present a new time-flow impedance function, which may be more realistic than the existing time-flow impedance functions. Based on this new impedance function, we present an optimization model for a mixed-mode traffic network in which battery electric vehicles (BEVs) and gasoline vehicles (GVs) are chosen. We suggest two approaches to handle the model: One is to use the interior point (IP) algorithm and the other is to employ the sequential quadratic programming (SQP) algorithm. Three numerical examples are presented to illustrate the efficiency of these approaches. In particular, our numerical results show that more travelers prefer to choosing BEVs when the distance limit of BEVs is long enough and the unit operating cost of GVs is higher than that of BEVs, and the SQP algorithm is faster than the IP algorithm.
Algorithm for computing descriptive statistics for very large data sets and the exa-scale era
NASA Astrophysics Data System (ADS)
Beekman, Izaak
2017-11-01
An algorithm for Single-point, Parallel, Online, Converging Statistics (SPOCS) is presented. It is suited for in situ analysis that traditionally would be relegated to post-processing, and can be used to monitor the statistical convergence and estimate the error/residual in the quantity-useful for uncertainty quantification too. Today, data may be generated at an overwhelming rate by numerical simulations and proliferating sensing apparatuses in experiments and engineering applications. Monitoring descriptive statistics in real time lets costly computations and experiments be gracefully aborted if an error has occurred, and monitoring the level of statistical convergence allows them to be run for the shortest amount of time required to obtain good results. This algorithm extends work by Pébay (Sandia Report SAND2008-6212). Pébay's algorithms are recast into a converging delta formulation, with provably favorable properties. The mean, variance, covariances and arbitrary higher order statistical moments are computed in one pass. The algorithm is tested using Sillero, Jiménez, & Moser's (2013, 2014) publicly available UPM high Reynolds number turbulent boundary layer data set, demonstrating numerical robustness, efficiency and other favorable properties.
An efficient shooting algorithm for Evans function calculations in large systems
NASA Astrophysics Data System (ADS)
Humpherys, Jeffrey; Zumbrun, Kevin
2006-08-01
In Evans function computations of the spectra of asymptotically constant-coefficient linear operators, a basic issue is the efficient and numerically stable computation of subspaces evolving according to the associated eigenvalue ODE. For small systems, a fast, shooting algorithm may be obtained by representing subspaces as single exterior products [J.C. Alexander, R. Sachs, Linear instability of solitary waves of a Boussinesq-type equation: A computer assisted computation, Nonlinear World 2 (4) (1995) 471-507; L.Q. Brin, Numerical testing of the stability of viscous shock waves, Ph.D. Thesis, Indiana University, Bloomington, 1998; L.Q. Brin, Numerical testing of the stability of viscous shock waves, Math. Comp. 70 (235) (2001) 1071-1088; L.Q. Brin, K. Zumbrun, Analytically varying eigenvectors and the stability of viscous shock waves, in: Seventh Workshop on Partial Differential Equations, Part I, 2001, Rio de Janeiro, Mat. Contemp. 22 (2002) 19-32; T.J. Bridges, G. Derks, G. Gottwald, Stability and instability of solitary waves of the fifth-order KdV equation: A numerical framework, Physica D 172 (1-4) (2002) 190-216]. For large systems, however, the dimension of the exterior-product space quickly becomes prohibitive, growing as (n/k), where n is the dimension of the system written as a first-order ODE and k (typically ˜n/2) is the dimension of the subspace. We resolve this difficulty by the introduction of a simple polar coordinate algorithm representing “pure” (monomial) products as scalar multiples of orthonormal bases, for which the angular equation is a numerically optimized version of the continuous orthogonalization method of Drury-Davey [A. Davey, An automatic orthonormalization method for solving stiff boundary value problems, J. Comput. Phys. 51 (2) (1983) 343-356; L.O. Drury, Numerical solution of Orr-Sommerfeld-type equations, J. Comput. Phys. 37 (1) (1980) 133-139] and the radial equation is evaluable by quadrature. Notably, the polar-coordinate method preserves the important property of analyticity with respect to parameters.
Structure-preserving and rank-revealing QR-factorizations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bischof, C.H.; Hansen, P.C.
1991-11-01
The rank-revealing QR-factorization (RRQR-factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular, for sparse matrices. A sparse RRQR-algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture safely the numerical rank. To this end, the paper proposes to compute an initial QR-factorization using amore » restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR factorization will exploit the fact that certain column exchanges do not change the sparsity structure, and compute a sparse QR-factorization that is a good approximation of the sought-after RRQR-factorization. Due to quantities produced by ICE, the Chan/Foster RRQR algorithm can be implemented very cheaply, thus verifying that the sought-after RRQR-factorization has indeed been computed. Experimental results on a model problem show that the initial QR-factorization is indeed very likely to produce RRQR-factorization.« less
An algorithm for simulating fracture of cohesive-frictional materials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nukala, Phani K; Sampath, Rahul S; Barai, Pallab
Fracture of disordered frictional granular materials is dominated by interfacial failure response that is characterized by de-cohesion followed by frictional sliding response. To capture such an interfacial failure response, we introduce a cohesive-friction random fuse model (CFRFM), wherein the cohesive response of the interface is represented by a linear stress-strain response until a failure threshold, which is then followed by a constant response at a threshold lower than the initial failure threshold to represent the interfacial frictional sliding mechanism. This paper presents an efficient algorithm for simulating fracture of such disordered frictional granular materials using the CFRFM. We note that,more » when applied to perfectly plastic disordered materials, our algorithm is both theoretically and numerically equivalent to the traditional tangent algorithm (Roux and Hansen 1992 J. Physique II 2 1007) used for such simulations. However, the algorithm is general and is capable of modeling discontinuous interfacial response. Our numerical simulations using the algorithm indicate that the local and global roughness exponents ({zeta}{sub loc} and {zeta}, respectively) of the fracture surface are equal to each other, and the two-dimensional crack roughness exponent is estimated to be {zeta}{sub loc} = {zeta} = 0.69 {+-} 0.03.« less
Computing Generalized Matrix Inverse on Spiking Neural Substrate.
Shukla, Rohit; Khoram, Soroosh; Jorgensen, Erik; Li, Jing; Lipasti, Mikko; Wright, Stephen
2018-01-01
Emerging neural hardware substrates, such as IBM's TrueNorth Neurosynaptic System, can provide an appealing platform for deploying numerical algorithms. For example, a recurrent Hopfield neural network can be used to find the Moore-Penrose generalized inverse of a matrix, thus enabling a broad class of linear optimizations to be solved efficiently, at low energy cost. However, deploying numerical algorithms on hardware platforms that severely limit the range and precision of representation for numeric quantities can be quite challenging. This paper discusses these challenges and proposes a rigorous mathematical framework for reasoning about range and precision on such substrates. The paper derives techniques for normalizing inputs and properly quantizing synaptic weights originating from arbitrary systems of linear equations, so that solvers for those systems can be implemented in a provably correct manner on hardware-constrained neural substrates. The analytical model is empirically validated on the IBM TrueNorth platform, and results show that the guarantees provided by the framework for range and precision hold under experimental conditions. Experiments with optical flow demonstrate the energy benefits of deploying a reduced-precision and energy-efficient generalized matrix inverse engine on the IBM TrueNorth platform, reflecting 10× to 100× improvement over FPGA and ARM core baselines.
Self-Learning Monte Carlo Method
NASA Astrophysics Data System (ADS)
Liu, Junwei; Qi, Yang; Meng, Zi Yang; Fu, Liang
Monte Carlo simulation is an unbiased numerical tool for studying classical and quantum many-body systems. One of its bottlenecks is the lack of general and efficient update algorithm for large size systems close to phase transition or with strong frustrations, for which local updates perform badly. In this work, we propose a new general-purpose Monte Carlo method, dubbed self-learning Monte Carlo (SLMC), in which an efficient update algorithm is first learned from the training data generated in trial simulations and then used to speed up the actual simulation. We demonstrate the efficiency of SLMC in a spin model at the phase transition point, achieving a 10-20 times speedup. This work is supported by the DOE Office of Basic Energy Sciences, Division of Materials Sciences and Engineering under Award DE-SC0010526.
Continuum topology optimization considering uncertainties in load locations based on the cloud model
NASA Astrophysics Data System (ADS)
Liu, Jie; Wen, Guilin
2018-06-01
Few researchers have paid attention to designing structures in consideration of uncertainties in the loading locations, which may significantly influence the structural performance. In this work, cloud models are employed to depict the uncertainties in the loading locations. A robust algorithm is developed in the context of minimizing the expectation of the structural compliance, while conforming to a material volume constraint. To guarantee optimal solutions, sufficient cloud drops are used, which in turn leads to low efficiency. An innovative strategy is then implemented to enormously improve the computational efficiency. A modified soft-kill bi-directional evolutionary structural optimization method using derived sensitivity numbers is used to output the robust novel configurations. Several numerical examples are presented to demonstrate the effectiveness and efficiency of the proposed algorithm.
Multiple quay cranes scheduling for double cycling in container terminals
Chu, Yanling; Zhang, Xiaoju; Yang, Zhongzhen
2017-01-01
Double cycling is an efficient tool to increase the efficiency of quay crane (QC) in container terminals. In this paper, an optimization model for double cycling is developed to optimize the operation sequence of multiple QCs. The objective is to minimize the makespan of the ship handling operation considering the ship balance constraint. To solve the model, an algorithm based on Lagrangian relaxation is designed. Finally, we compare the efficiency of the Lagrangian relaxation based heuristic with the branch-and-bound method and a genetic algorithm using instances of different sizes. The results of numerical experiments indicate that the proposed model can effectively reduce the unloading and loading times of QCs. The effects of the ship balance constraint are more notable when the number of QCs is high. PMID:28692699
Multiple quay cranes scheduling for double cycling in container terminals.
Chu, Yanling; Zhang, Xiaoju; Yang, Zhongzhen
2017-01-01
Double cycling is an efficient tool to increase the efficiency of quay crane (QC) in container terminals. In this paper, an optimization model for double cycling is developed to optimize the operation sequence of multiple QCs. The objective is to minimize the makespan of the ship handling operation considering the ship balance constraint. To solve the model, an algorithm based on Lagrangian relaxation is designed. Finally, we compare the efficiency of the Lagrangian relaxation based heuristic with the branch-and-bound method and a genetic algorithm using instances of different sizes. The results of numerical experiments indicate that the proposed model can effectively reduce the unloading and loading times of QCs. The effects of the ship balance constraint are more notable when the number of QCs is high.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue
Within this paper, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. Additionally, the solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue
In this article, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less
Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; ...
2017-12-01
In this article, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less
Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; ...
2017-08-24
Within this paper, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. Additionally, the solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less
Inverse transport calculations in optical imaging with subspace optimization algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ding, Tian, E-mail: tding@math.utexas.edu; Ren, Kui, E-mail: ren@math.utexas.edu
2014-09-15
Inverse boundary value problems for the radiative transport equation play an important role in optics-based medical imaging techniques such as diffuse optical tomography (DOT) and fluorescence optical tomography (FOT). Despite the rapid progress in the mathematical theory and numerical computation of these inverse problems in recent years, developing robust and efficient reconstruction algorithms remains a challenging task and an active research topic. We propose here a robust reconstruction method that is based on subspace minimization techniques. The method splits the unknown transport solution (or a functional of it) into low-frequency and high-frequency components, and uses singular value decomposition to analyticallymore » recover part of low-frequency information. Minimization is then applied to recover part of the high-frequency components of the unknowns. We present some numerical simulations with synthetic data to demonstrate the performance of the proposed algorithm.« less
Simulation of two-dimensional turbulent flows in a rotating annulus
NASA Astrophysics Data System (ADS)
Storey, Brian D.
2004-05-01
Rotating water tank experiments have been used to study fundamental processes of atmospheric and geophysical turbulence in a controlled laboratory setting. When these tanks are undergoing strong rotation the forced turbulent flow becomes highly two dimensional along the axis of rotation. An efficient numerical method has been developed for simulating the forced quasi-geostrophic equations in an annular geometry to model current laboratory experiments. The algorithm employs a spectral method with Fourier series and Chebyshev polynomials as basis functions. The algorithm has been implemented on a parallel architecture to allow modelling of a wide range of spatial scales over long integration times. This paper describes the derivation of the model equations, numerical method, testing and performance of the algorithm. Results provide reasonable agreement with the experimental data, indicating that such computations can be used as a predictive tool to design future experiments.
Formation Flying Design and Applications in Weak Stability Boundary Regions
NASA Technical Reports Server (NTRS)
Folta, David
2003-01-01
Weak Stability regions serve as superior locations for interferometric scientific investigations. These regions are often selected to minimize environmental disturbances and maximize observing efficiency. Design of formations in these regions are becoming ever more challenging as more complex missions are envisioned. The development of algorithms to enable the capability for formation design must be further enabled to incorporate better understanding of WSB solution space. This development will improve the efficiency and expand the capabilities of current approaches. The Goddard Space Flight Center (GSFC) is currently supporting multiple formation missions in WSB regions. This end-to-end support consists of mission operations, trajectory design, and control. It also includes both algorithm and software development. The Constellation-X, Maxim, and Stellar Imager missions are examples of the use of improved numerical methods for attaining constrained formation geometries and controlling their dynamical evolution. This paper presents a survey of formation missions in the WSB regions and a brief description of the formation design using numerical and dynamical techniques.
Formation flying design and applications in weak stability boundary regions.
Folta, David
2004-05-01
Weak stability regions serve as superior locations for interferomertric scientific investigations. These regions are often selected to minimize environmental disturbances and maximize observation efficiency. Designs of formations in these regions are becoming ever more challenging as more complex missions are envisioned. The development of algorithms to enable the capability for formation design must be further enabled to incorporate better understanding of weak stability boundary solution space. This development will improve the efficiency and expand the capabilities of current approaches. The Goddard Space Flight Center (GSFC) is currently supporting multiple formation missions in weak stability boundary regions. This end-to-end support consists of mission operations, trajectory design, and control. It also includes both algorithm and software development. The Constellation-X, Maxim, and Stellar Imager missions are examples of the use of improved numeric methods to attain constrained formation geometries and control their dynamical evolution. This paper presents a survey of formation missions in the weak stability boundary regions and a brief description of formation design using numerical and dynamical techniques.
Trajectory Segmentation Map-Matching Approach for Large-Scale, High-Resolution GPS Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhu, Lei; Holden, Jacob R.; Gonder, Jeffrey D.
With the development of smartphones and portable GPS devices, large-scale, high-resolution GPS data can be collected. Map matching is a critical step in studying vehicle driving activity and recognizing network traffic conditions from the data. A new trajectory segmentation map-matching algorithm is proposed to deal accurately and efficiently with large-scale, high-resolution GPS trajectory data. The new algorithm separated the GPS trajectory into segments. It found the shortest path for each segment in a scientific manner and ultimately generated a best-matched path for the entire trajectory. The similarity of a trajectory segment and its matched path is described by a similaritymore » score system based on the longest common subsequence. The numerical experiment indicated that the proposed map-matching algorithm was very promising in relation to accuracy and computational efficiency. Large-scale data set applications verified that the proposed method is robust and capable of dealing with real-world, large-scale GPS data in a computationally efficient and accurate manner.« less
Trajectory Segmentation Map-Matching Approach for Large-Scale, High-Resolution GPS Data
Zhu, Lei; Holden, Jacob R.; Gonder, Jeffrey D.
2017-01-01
With the development of smartphones and portable GPS devices, large-scale, high-resolution GPS data can be collected. Map matching is a critical step in studying vehicle driving activity and recognizing network traffic conditions from the data. A new trajectory segmentation map-matching algorithm is proposed to deal accurately and efficiently with large-scale, high-resolution GPS trajectory data. The new algorithm separated the GPS trajectory into segments. It found the shortest path for each segment in a scientific manner and ultimately generated a best-matched path for the entire trajectory. The similarity of a trajectory segment and its matched path is described by a similaritymore » score system based on the longest common subsequence. The numerical experiment indicated that the proposed map-matching algorithm was very promising in relation to accuracy and computational efficiency. Large-scale data set applications verified that the proposed method is robust and capable of dealing with real-world, large-scale GPS data in a computationally efficient and accurate manner.« less
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation
NASA Astrophysics Data System (ADS)
Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.
2011-03-01
A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine modelizations are difficult to treat for our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian based domain decomposition method brings to a poor parallel efficiency because of an increase in the power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
Quantum speedup of Monte Carlo methods.
Montanaro, Ashley
2015-09-08
Monte Carlo methods use random sampling to estimate numerical quantities which are hard to compute deterministically. One important example is the use in statistical physics of rapidly mixing Markov chains to approximately compute partition functions. In this work, we describe a quantum algorithm which can accelerate Monte Carlo methods in a very general setting. The algorithm estimates the expected output value of an arbitrary randomized or quantum subroutine with bounded variance, achieving a near-quadratic speedup over the best possible classical algorithm. Combining the algorithm with the use of quantum walks gives a quantum speedup of the fastest known classical algorithms with rigorous performance bounds for computing partition functions, which use multiple-stage Markov chain Monte Carlo techniques. The quantum algorithm can also be used to estimate the total variation distance between probability distributions efficiently.
Discrete size optimization of steel trusses using a refined big bang-big crunch algorithm
NASA Astrophysics Data System (ADS)
Hasançebi, O.; Kazemzadeh Azad, S.
2014-01-01
This article presents a methodology that provides a method for design optimization of steel truss structures based on a refined big bang-big crunch (BB-BC) algorithm. It is shown that a standard formulation of the BB-BC algorithm occasionally falls short of producing acceptable solutions to problems from discrete size optimum design of steel trusses. A reformulation of the algorithm is proposed and implemented for design optimization of various discrete truss structures according to American Institute of Steel Construction Allowable Stress Design (AISC-ASD) specifications. Furthermore, the performance of the proposed BB-BC algorithm is compared to its standard version as well as other well-known metaheuristic techniques. The numerical results confirm the efficiency of the proposed algorithm in practical design optimization of truss structures.
Quantum speedup of Monte Carlo methods
Montanaro, Ashley
2015-01-01
Monte Carlo methods use random sampling to estimate numerical quantities which are hard to compute deterministically. One important example is the use in statistical physics of rapidly mixing Markov chains to approximately compute partition functions. In this work, we describe a quantum algorithm which can accelerate Monte Carlo methods in a very general setting. The algorithm estimates the expected output value of an arbitrary randomized or quantum subroutine with bounded variance, achieving a near-quadratic speedup over the best possible classical algorithm. Combining the algorithm with the use of quantum walks gives a quantum speedup of the fastest known classical algorithms with rigorous performance bounds for computing partition functions, which use multiple-stage Markov chain Monte Carlo techniques. The quantum algorithm can also be used to estimate the total variation distance between probability distributions efficiently. PMID:26528079
Fidelity-Based Ant Colony Algorithm with Q-learning of Quantum System
NASA Astrophysics Data System (ADS)
Liao, Qin; Guo, Ying; Tu, Yifeng; Zhang, Hang
2018-03-01
Quantum ant colony algorithm (ACA) has potential applications in quantum information processing, such as solutions of traveling salesman problem, zero-one knapsack problem, robot route planning problem, and so on. To shorten the search time of the ACA, we suggest the fidelity-based ant colony algorithm (FACA) for the control of quantum system. Motivated by structure of the Q-learning algorithm, we demonstrate the combination of a FACA with the Q-learning algorithm and suggest the design of a fidelity-based ant colony algorithm with the Q-learning to improve the performance of the FACA in a spin-1/2 quantum system. The numeric simulation results show that the FACA with the Q-learning can efficiently avoid trapping into local optimal policies and increase the speed of convergence process of quantum system.
Fidelity-Based Ant Colony Algorithm with Q-learning of Quantum System
NASA Astrophysics Data System (ADS)
Liao, Qin; Guo, Ying; Tu, Yifeng; Zhang, Hang
2017-12-01
Quantum ant colony algorithm (ACA) has potential applications in quantum information processing, such as solutions of traveling salesman problem, zero-one knapsack problem, robot route planning problem, and so on. To shorten the search time of the ACA, we suggest the fidelity-based ant colony algorithm (FACA) for the control of quantum system. Motivated by structure of the Q-learning algorithm, we demonstrate the combination of a FACA with the Q-learning algorithm and suggest the design of a fidelity-based ant colony algorithm with the Q-learning to improve the performance of the FACA in a spin-1/2 quantum system. The numeric simulation results show that the FACA with the Q-learning can efficiently avoid trapping into local optimal policies and increase the speed of convergence process of quantum system.
A direct method for nonlinear ill-posed problems
NASA Astrophysics Data System (ADS)
Lakhal, A.
2018-02-01
We propose a direct method for solving nonlinear ill-posed problems in Banach-spaces. The method is based on a stable inversion formula we explicitly compute by applying techniques for analytic functions. Furthermore, we investigate the convergence and stability of the method and prove that the derived noniterative algorithm is a regularization. The inversion formula provides a systematic sensitivity analysis. The approach is applicable to a wide range of nonlinear ill-posed problems. We test the algorithm on a nonlinear problem of travel-time inversion in seismic tomography. Numerical results illustrate the robustness and efficiency of the algorithm.
A gradient based algorithm to solve inverse plane bimodular problems of identification
NASA Astrophysics Data System (ADS)
Ran, Chunjiang; Yang, Haitian; Zhang, Guoqing
2018-02-01
This paper presents a gradient based algorithm to solve inverse plane bimodular problems of identifying constitutive parameters, including tensile/compressive moduli and tensile/compressive Poisson's ratios. For the forward bimodular problem, a FE tangent stiffness matrix is derived facilitating the implementation of gradient based algorithms, for the inverse bimodular problem of identification, a two-level sensitivity analysis based strategy is proposed. Numerical verification in term of accuracy and efficiency is provided, and the impacts of initial guess, number of measurement points, regional inhomogeneity, and noisy data on the identification are taken into accounts.
Homotopy Algorithm for Fixed Order Mixed H2/H(infinity) Design
NASA Technical Reports Server (NTRS)
Whorton, Mark; Buschek, Harald; Calise, Anthony J.
1996-01-01
Recent developments in the field of robust multivariable control have merged the theories of H-infinity and H-2 control. This mixed H-2/H-infinity compensator formulation allows design for nominal performance by H-2 norm minimization while guaranteeing robust stability to unstructured uncertainties by constraining the H-infinity norm. A key difficulty associated with mixed H-2/H-infinity compensation is compensator synthesis. A homotopy algorithm is presented for synthesis of fixed order mixed H-2/H-infinity compensators. Numerical results are presented for a four disk flexible structure to evaluate the efficiency of the algorithm.
Superiorization-based multi-energy CT image reconstruction
Yang, Q; Cong, W; Wang, G
2017-01-01
The recently-developed superiorization approach is efficient and robust for solving various constrained optimization problems. This methodology can be applied to multi-energy CT image reconstruction with the regularization in terms of the prior rank, intensity and sparsity model (PRISM). In this paper, we propose a superiorized version of the simultaneous algebraic reconstruction technique (SART) based on the PRISM model. Then, we compare the proposed superiorized algorithm with the Split-Bregman algorithm in numerical experiments. The results show that both the Superiorized-SART and the Split-Bregman algorithms generate good results with weak noise and reduced artefacts. PMID:28983142
Development of homotopy algorithms for fixed-order mixed H2/H(infinity) controller synthesis
NASA Technical Reports Server (NTRS)
Whorton, M.; Buschek, H.; Calise, A. J.
1994-01-01
A major difficulty associated with H-infinity and mu-synthesis methods is the order of the resulting compensator. Whereas model and/or controller reduction techniques are sometimes applied, performance and robustness properties are not preserved. By directly constraining compensator order during the optimization process, these properties are better preserved, albeit at the expense of computational complexity. This paper presents a novel homotopy algorithm to synthesize fixed-order mixed H2/H-infinity compensators. Numerical results are presented for a four-disk flexible structure to evaluate the efficiency of the algorithm.
Adaptive Grid Refinement for Atmospheric Boundary Layer Simulations
NASA Astrophysics Data System (ADS)
van Hooft, Antoon; van Heerwaarden, Chiel; Popinet, Stephane; van der linden, Steven; de Roode, Stephan; van de Wiel, Bas
2017-04-01
We validate and benchmark an adaptive mesh refinement (AMR) algorithm for numerical simulations of the atmospheric boundary layer (ABL). The AMR technique aims to distribute the computational resources efficiently over a domain by refining and coarsening the numerical grid locally and in time. This can be beneficial for studying cases in which length scales vary significantly in time and space. We present the results for a case describing the growth and decay of a convective boundary layer. The AMR results are benchmarked against two runs using a fixed, fine meshed grid. First, with the same numerical formulation as the AMR-code and second, with a code dedicated to ABL studies. Compared to the fixed and isotropic grid runs, the AMR algorithm can coarsen and refine the grid such that accurate results are obtained whilst using only a fraction of the grid cells. Performance wise, the AMR run was cheaper than the fixed and isotropic grid run with similar numerical formulations. However, for this specific case, the dedicated code outperformed both aforementioned runs.
Yu, Yi; Wu, Yonggang; Hu, Binqi; Liu, Xinglong
2018-01-01
The dispatching of hydro-thermal system is a nonlinear programming problem with multiple constraints and high dimensions and the solution techniques of the model have been a hotspot in research. Based on the advantage of that the artificial bee colony algorithm (ABC) can efficiently solve the high-dimensional problem, an improved artificial bee colony algorithm has been proposed to solve DHTS problem in this paper. The improvements of the proposed algorithm include two aspects. On one hand, local search can be guided in efficiency by the information of the global optimal solution and its gradient in each generation. The global optimal solution improves the search efficiency of the algorithm but loses diversity, while the gradient can weaken the loss of diversity caused by the global optimal solution. On the other hand, inspired by genetic algorithm, the nectar resource which has not been updated in limit generation is transformed to a new one by using selection, crossover and mutation, which can ensure individual diversity and make full use of prior information for improving the global search ability of the algorithm. The two improvements of ABC algorithm are proved to be effective via a classical numeral example at last. Among which the genetic operator for the promotion of the ABC algorithm's performance is significant. The results are also compared with those of other state-of-the-art algorithms, the enhanced ABC algorithm has general advantages in minimum cost, average cost and maximum cost which shows its usability and effectiveness. The achievements in this paper provide a new method for solving the DHTS problems, and also offer a novel reference for the improvement of mechanism and the application of algorithms.
Eigenvalue routines in NASTRAN: A comparison with the Block Lanczos method
NASA Technical Reports Server (NTRS)
Tischler, V. A.; Venkayya, Vipperla B.
1993-01-01
The NASA STRuctural ANalysis (NASTRAN) program is one of the most extensively used engineering applications software in the world. It contains a wealth of matrix operations and numerical solution techniques, and they were used to construct efficient eigenvalue routines. The purpose of this paper is to examine the current eigenvalue routines in NASTRAN and to make efficiency comparisons with a more recent implementation of the Block Lanczos algorithm by Boeing Computer Services (BCS). This eigenvalue routine is now available in the BCS mathematics library as well as in several commercial versions of NASTRAN. In addition, CRAY maintains a modified version of this routine on their network. Several example problems, with a varying number of degrees of freedom, were selected primarily for efficiency bench-marking. Accuracy is not an issue, because they all gave comparable results. The Block Lanczos algorithm was found to be extremely efficient, in particular, for very large size problems.
Efficient spares matrix multiplication scheme for the CYBER 203
NASA Technical Reports Server (NTRS)
Lambiotte, J. J., Jr.
1984-01-01
This work has been directed toward the development of an efficient algorithm for performing this computation on the CYBER-203. The desire to provide software which gives the user the choice between the often conflicting goals of minimizing central processing (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of three types of storage is selected for each diagonal. For each storage type, an initialization sub-routine estimates the CPU and storage requirements based upon results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the resources. The three storage types employed were chosen to be efficient on the CYBER-203 for diagonals which are sparse, moderately sparse, or dense; however, for many densities, no diagonal type is most efficient with respect to both resource requirements. The user-supplied weights dictate the choice.
On-the-fly Numerical Surface Integration for Finite-Difference Poisson-Boltzmann Methods.
Cai, Qin; Ye, Xiang; Wang, Jun; Luo, Ray
2011-11-01
Most implicit solvation models require the definition of a molecular surface as the interface that separates the solute in atomic detail from the solvent approximated as a continuous medium. Commonly used surface definitions include the solvent accessible surface (SAS), the solvent excluded surface (SES), and the van der Waals surface. In this study, we present an efficient numerical algorithm to compute the SES and SAS areas to facilitate the applications of finite-difference Poisson-Boltzmann methods in biomolecular simulations. Different from previous numerical approaches, our algorithm is physics-inspired and intimately coupled to the finite-difference Poisson-Boltzmann methods to fully take advantage of its existing data structures. Our analysis shows that the algorithm can achieve very good agreement with the analytical method in the calculation of the SES and SAS areas. Specifically, in our comprehensive test of 1,555 molecules, the average unsigned relative error is 0.27% in the SES area calculations and 1.05% in the SAS area calculations at the grid spacing of 1/2Å. In addition, a systematic correction analysis can be used to improve the accuracy for the coarse-grid SES area calculations, with the average unsigned relative error in the SES areas reduced to 0.13%. These validation studies indicate that the proposed algorithm can be applied to biomolecules over a broad range of sizes and structures. Finally, the numerical algorithm can also be adapted to evaluate the surface integral of either a vector field or a scalar field defined on the molecular surface for additional solvation energetics and force calculations.
Numerical operator calculus in higher dimensions.
Beylkin, Gregory; Mohlenkamp, Martin J
2002-08-06
When an algorithm in dimension one is extended to dimension d, in nearly every case its computational cost is taken to the power d. This fundamental difficulty is the single greatest impediment to solving many important problems and has been dubbed the curse of dimensionality. For numerical analysis in dimension d, we propose to use a representation for vectors and matrices that generalizes separation of variables while allowing controlled accuracy. Basic linear algebra operations can be performed in this representation using one-dimensional operations, thus bypassing the exponential scaling with respect to the dimension. Although not all operators and algorithms may be compatible with this representation, we believe that many of the most important ones are. We prove that the multiparticle Schrödinger operator, as well as the inverse Laplacian, can be represented very efficiently in this form. We give numerical evidence to support the conjecture that eigenfunctions inherit this property by computing the ground-state eigenfunction for a simplified Schrödinger operator with 30 particles. We conjecture and provide numerical evidence that functions of operators inherit this property, in which case numerical operator calculus in higher dimensions becomes feasible.
A memetic optimization algorithm for multi-constrained multicast routing in ad hoc networks
Hammad, Karim; El Bakly, Ahmed M.
2018-01-01
A mobile ad hoc network is a conventional self-configuring network where the routing optimization problem—subject to various Quality-of-Service (QoS) constraints—represents a major challenge. Unlike previously proposed solutions, in this paper, we propose a memetic algorithm (MA) employing an adaptive mutation parameter, to solve the multicast routing problem with higher search ability and computational efficiency. The proposed algorithm utilizes an updated scheme, based on statistical analysis, to estimate the best values for all MA parameters and enhance MA performance. The numerical results show that the proposed MA improved the delay and jitter of the network, while reducing computational complexity as compared to existing algorithms. PMID:29509760
A memetic optimization algorithm for multi-constrained multicast routing in ad hoc networks.
Ramadan, Rahab M; Gasser, Safa M; El-Mahallawy, Mohamed S; Hammad, Karim; El Bakly, Ahmed M
2018-01-01
A mobile ad hoc network is a conventional self-configuring network where the routing optimization problem-subject to various Quality-of-Service (QoS) constraints-represents a major challenge. Unlike previously proposed solutions, in this paper, we propose a memetic algorithm (MA) employing an adaptive mutation parameter, to solve the multicast routing problem with higher search ability and computational efficiency. The proposed algorithm utilizes an updated scheme, based on statistical analysis, to estimate the best values for all MA parameters and enhance MA performance. The numerical results show that the proposed MA improved the delay and jitter of the network, while reducing computational complexity as compared to existing algorithms.
Loading relativistic Maxwell distributions in particle simulations
NASA Astrophysics Data System (ADS)
Zenitani, S.
2015-12-01
In order to study energetic plasma phenomena by using particle-in-cell (PIC) and Monte-Carlo simulations, we need to deal with relativistic velocity distributions in these simulations. However, numerical algorithms to deal with relativistic distributions are not well known. In this contribution, we overview basic algorithms to load relativistic Maxwell distributions in PIC and Monte-Carlo simulations. For stationary relativistic Maxwellian, the inverse transform method and the Sobol algorithm are reviewed. To boost particles to obtain relativistic shifted-Maxwellian, two rejection methods are newly proposed in a physically transparent manner. Their acceptance efficiencies are 50% for generic cases and 100% for symmetric distributions. They can be combined with arbitrary base algorithms.
Ravishankar, Saiprasad; Nadakuditi, Raj Rao; Fessler, Jeffrey A
2017-12-01
The sparsity of signals in a transform domain or dictionary has been exploited in applications such as compression, denoising and inverse problems. More recently, data-driven adaptation of synthesis dictionaries has shown promise compared to analytical dictionary models. However, dictionary learning problems are typically non-convex and NP-hard, and the usual alternating minimization approaches for these problems are often computationally expensive, with the computations dominated by the NP-hard synthesis sparse coding step. This paper exploits the ideas that drive algorithms such as K-SVD, and investigates in detail efficient methods for aggregate sparsity penalized dictionary learning by first approximating the data with a sum of sparse rank-one matrices (outer products) and then using a block coordinate descent approach to estimate the unknowns. The resulting block coordinate descent algorithms involve efficient closed-form solutions. Furthermore, we consider the problem of dictionary-blind image reconstruction, and propose novel and efficient algorithms for adaptive image reconstruction using block coordinate descent and sum of outer products methodologies. We provide a convergence study of the algorithms for dictionary learning and dictionary-blind image reconstruction. Our numerical experiments show the promising performance and speedups provided by the proposed methods over previous schemes in sparse data representation and compressed sensing-based image reconstruction.
Ravishankar, Saiprasad; Nadakuditi, Raj Rao; Fessler, Jeffrey A.
2017-01-01
The sparsity of signals in a transform domain or dictionary has been exploited in applications such as compression, denoising and inverse problems. More recently, data-driven adaptation of synthesis dictionaries has shown promise compared to analytical dictionary models. However, dictionary learning problems are typically non-convex and NP-hard, and the usual alternating minimization approaches for these problems are often computationally expensive, with the computations dominated by the NP-hard synthesis sparse coding step. This paper exploits the ideas that drive algorithms such as K-SVD, and investigates in detail efficient methods for aggregate sparsity penalized dictionary learning by first approximating the data with a sum of sparse rank-one matrices (outer products) and then using a block coordinate descent approach to estimate the unknowns. The resulting block coordinate descent algorithms involve efficient closed-form solutions. Furthermore, we consider the problem of dictionary-blind image reconstruction, and propose novel and efficient algorithms for adaptive image reconstruction using block coordinate descent and sum of outer products methodologies. We provide a convergence study of the algorithms for dictionary learning and dictionary-blind image reconstruction. Our numerical experiments show the promising performance and speedups provided by the proposed methods over previous schemes in sparse data representation and compressed sensing-based image reconstruction. PMID:29376111
Efficient evaluation of three-center Coulomb integrals
Samu, Gyula
2017-01-01
In this study we pursue the most efficient paths for the evaluation of three-center electron repulsion integrals (ERIs) over solid harmonic Gaussian functions of various angular momenta. First, the adaptation of the well-established techniques developed for four-center ERIs, such as the Obara–Saika, McMurchie–Davidson, Gill–Head-Gordon–Pople, and Rys quadrature schemes, and the combinations thereof for three-center ERIs is discussed. Several algorithmic aspects, such as the order of the various operations and primitive loops as well as prescreening strategies, are analyzed. Second, the number of floating point operations (FLOPs) is estimated for the various algorithms derived, and based on these results the most promising ones are selected. We report the efficient implementation of the latter algorithms invoking automated programming techniques and also evaluate their practical performance. We conclude that the simplified Obara–Saika scheme of Ahlrichs is the most cost-effective one in the majority of cases, but the modified Gill–Head-Gordon–Pople and Rys algorithms proposed herein are preferred for particular shell triplets. Our numerical experiments also show that even though the solid harmonic transformation and the horizontal recurrence require significantly fewer FLOPs if performed at the contracted level, this approach does not improve the efficiency in practical cases. Instead, it is more advantageous to carry out these operations at the primitive level, which allows for more efficient integral prescreening and memory layout. PMID:28571354
Efficient evaluation of three-center Coulomb integrals.
Samu, Gyula; Kállay, Mihály
2017-05-28
In this study we pursue the most efficient paths for the evaluation of three-center electron repulsion integrals (ERIs) over solid harmonic Gaussian functions of various angular momenta. First, the adaptation of the well-established techniques developed for four-center ERIs, such as the Obara-Saika, McMurchie-Davidson, Gill-Head-Gordon-Pople, and Rys quadrature schemes, and the combinations thereof for three-center ERIs is discussed. Several algorithmic aspects, such as the order of the various operations and primitive loops as well as prescreening strategies, are analyzed. Second, the number of floating point operations (FLOPs) is estimated for the various algorithms derived, and based on these results the most promising ones are selected. We report the efficient implementation of the latter algorithms invoking automated programming techniques and also evaluate their practical performance. We conclude that the simplified Obara-Saika scheme of Ahlrichs is the most cost-effective one in the majority of cases, but the modified Gill-Head-Gordon-Pople and Rys algorithms proposed herein are preferred for particular shell triplets. Our numerical experiments also show that even though the solid harmonic transformation and the horizontal recurrence require significantly fewer FLOPs if performed at the contracted level, this approach does not improve the efficiency in practical cases. Instead, it is more advantageous to carry out these operations at the primitive level, which allows for more efficient integral prescreening and memory layout.
Non-adiabatic molecular dynamics by accelerated semiclassical Monte Carlo
White, Alexander J.; Gorshkov, Vyacheslav N.; Tretiak, Sergei; ...
2015-07-07
Non-adiabatic dynamics, where systems non-radiatively transition between electronic states, plays a crucial role in many photo-physical processes, such as fluorescence, phosphorescence, and photoisomerization. Methods for the simulation of non-adiabatic dynamics are typically either numerically impractical, highly complex, or based on approximations which can result in failure for even simple systems. Recently, the Semiclassical Monte Carlo (SCMC) approach was developed in an attempt to combine the accuracy of rigorous semiclassical methods with the efficiency and simplicity of widely used surface hopping methods. However, while SCMC was found to be more efficient than other semiclassical methods, it is not yet as efficientmore » as is needed to be used for large molecular systems. Here, we have developed two new methods: the accelerated-SCMC and the accelerated-SCMC with re-Gaussianization, which reduce the cost of the SCMC algorithm up to two orders of magnitude for certain systems. In many cases shown here, the new procedures are nearly as efficient as the commonly used surface hopping schemes, with little to no loss of accuracy. This implies that these modified SCMC algorithms will be of practical numerical solutions for simulating non-adiabatic dynamics in realistic molecular systems.« less
Liu, S X; Zou, M S
2018-03-01
The radiation loading on a vibratory finite cylindrical shell is conventionally evaluated through the direct numerical integration (DNI) method. An alternative strategy via the fast Fourier transform algorithm is put forward in this work based on the general expression of radiation impedance. To check the feasibility and efficiency of the proposed method, a comparison with DNI is presented through numerical cases. The results obtained using the present method agree well with those calculated by DNI. More importantly, the proposed calculating strategy can significantly save the time cost compared with the conventional approach of straightforward numerical integration.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goodwin, D. L.; Kuprov, Ilya, E-mail: i.kuprov@soton.ac.uk
Quadratic convergence throughout the active space is achieved for the gradient ascent pulse engineering (GRAPE) family of quantum optimal control algorithms. We demonstrate in this communication that the Hessian of the GRAPE fidelity functional is unusually cheap, having the same asymptotic complexity scaling as the functional itself. This leads to the possibility of using very efficient numerical optimization techniques. In particular, the Newton-Raphson method with a rational function optimization (RFO) regularized Hessian is shown in this work to require fewer system trajectory evaluations than any other algorithm in the GRAPE family. This communication describes algebraic and numerical implementation aspects (matrixmore » exponential recycling, Hessian regularization, etc.) for the RFO Newton-Raphson version of GRAPE and reports benchmarks for common spin state control problems in magnetic resonance spectroscopy.« less
NASA Technical Reports Server (NTRS)
Banks, H. T.; Ito, K.
1991-01-01
A hybrid method for computing the feedback gains in linear quadratic regulator problem is proposed. The method, which combines use of a Chandrasekhar type system with an iteration of the Newton-Kleinman form with variable acceleration parameter Smith schemes, is formulated to efficiently compute directly the feedback gains rather than solutions of an associated Riccati equation. The hybrid method is particularly appropriate when used with large dimensional systems such as those arising in approximating infinite-dimensional (distributed parameter) control systems (e.g., those governed by delay-differential and partial differential equations). Computational advantages of the proposed algorithm over the standard eigenvector (Potter, Laub-Schur) based techniques are discussed, and numerical evidence of the efficacy of these ideas is presented.
Self-learning Monte Carlo method and cumulative update in fermion systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Junwei; Shen, Huitao; Qi, Yang
2017-06-07
In this study, we develop the self-learning Monte Carlo (SLMC) method, a general-purpose numerical method recently introduced to simulate many-body systems, for studying interacting fermion systems. Our method uses a highly efficient update algorithm, which we design and dub “cumulative update”, to generate new candidate configurations in the Markov chain based on a self-learned bosonic effective model. From a general analysis and a numerical study of the double exchange model as an example, we find that the SLMC with cumulative update drastically reduces the computational cost of the simulation, while remaining statistically exact. Remarkably, its computational complexity is far lessmore » than the conventional algorithm with local updates.« less
Complex amplitude reconstruction by iterative amplitude-phase retrieval algorithm with reference
NASA Astrophysics Data System (ADS)
Shen, Cheng; Guo, Cheng; Tan, Jiubin; Liu, Shutian; Liu, Zhengjun
2018-06-01
Multi-image iterative phase retrieval methods have been successfully applied in plenty of research fields due to their simple but efficient implementation. However, there is a mismatch between the measurement of the first long imaging distance and the sequential interval. In this paper, an amplitude-phase retrieval algorithm with reference is put forward without additional measurements or priori knowledge. It gets rid of measuring the first imaging distance. With a designed update formula, it significantly raises the convergence speed and the reconstruction fidelity, especially in phase retrieval. Its superiority over the original amplitude-phase retrieval (APR) method is validated by numerical analysis and experiments. Furthermore, it provides a conceptual design of a compact holographic image sensor, which can achieve numerical refocusing easily.
A numerical algorithm for optimal feedback gains in high dimensional LQR problems
NASA Technical Reports Server (NTRS)
Banks, H. T.; Ito, K.
1986-01-01
A hybrid method for computing the feedback gains in linear quadratic regulator problems is proposed. The method, which combines the use of a Chandrasekhar type system with an iteration of the Newton-Kleinman form with variable acceleration parameter Smith schemes, is formulated so as to efficiently compute directly the feedback gains rather than solutions of an associated Riccati equation. The hybrid method is particularly appropriate when used with large dimensional systems such as those arising in approximating infinite dimensional (distributed parameter) control systems (e.g., those governed by delay-differential and partial differential equations). Computational advantage of the proposed algorithm over the standard eigenvector (Potter, Laub-Schur) based techniques are discussed and numerical evidence of the efficacy of our ideas presented.
Yu, Yi; Hu, Binqi; Liu, Xinglong
2018-01-01
The dispatching of hydro-thermal system is a nonlinear programming problem with multiple constraints and high dimensions and the solution techniques of the model have been a hotspot in research. Based on the advantage of that the artificial bee colony algorithm (ABC) can efficiently solve the high-dimensional problem, an improved artificial bee colony algorithm has been proposed to solve DHTS problem in this paper. The improvements of the proposed algorithm include two aspects. On one hand, local search can be guided in efficiency by the information of the global optimal solution and its gradient in each generation. The global optimal solution improves the search efficiency of the algorithm but loses diversity, while the gradient can weaken the loss of diversity caused by the global optimal solution. On the other hand, inspired by genetic algorithm, the nectar resource which has not been updated in limit generation is transformed to a new one by using selection, crossover and mutation, which can ensure individual diversity and make full use of prior information for improving the global search ability of the algorithm. The two improvements of ABC algorithm are proved to be effective via a classical numeral example at last. Among which the genetic operator for the promotion of the ABC algorithm’s performance is significant. The results are also compared with those of other state-of-the-art algorithms, the enhanced ABC algorithm has general advantages in minimum cost, average cost and maximum cost which shows its usability and effectiveness. The achievements in this paper provide a new method for solving the DHTS problems, and also offer a novel reference for the improvement of mechanism and the application of algorithms. PMID:29324743
Bilinear Inverse Problems: Theory, Algorithms, and Applications
NASA Astrophysics Data System (ADS)
Ling, Shuyang
We will discuss how several important real-world signal processing problems, such as self-calibration and blind deconvolution, can be modeled as bilinear inverse problems and solved by convex and nonconvex optimization approaches. In Chapter 2, we bring together three seemingly unrelated concepts, self-calibration, compressive sensing and biconvex optimization. We show how several self-calibration problems can be treated efficiently within the framework of biconvex compressive sensing via a new method called SparseLift. More specifically, we consider a linear system of equations y = DAx, where the diagonal matrix D (which models the calibration error) is unknown and x is an unknown sparse signal. By "lifting" this biconvex inverse problem and exploiting sparsity in this model, we derive explicit theoretical guarantees under which both x and D can be recovered exactly, robustly, and numerically efficiently. In Chapter 3, we study the question of the joint blind deconvolution and blind demixing, i.e., extracting a sequence of functions [special characters omitted] from observing only the sum of their convolutions [special characters omitted]. In particular, for the special case s = 1, it becomes the well-known blind deconvolution problem. We present a non-convex algorithm which guarantees exact recovery under conditions that are competitive with convex optimization methods, with the additional advantage of being computationally much more efficient. We discuss several applications of the proposed framework in image processing and wireless communications in connection with the Internet-of-Things. In Chapter 4, we consider three different self-calibration models of practical relevance. We show how their corresponding bilinear inverse problems can be solved by both the simple linear least squares approach and the SVD-based approach. As a consequence, the proposed algorithms are numerically extremely efficient, thus allowing for real-time deployment. Explicit theoretical guarantees and stability theory are derived and the number of sampling complexity is nearly optimal (up to a poly-log factor). Applications in imaging sciences and signal processing are discussed and numerical simulations are presented to demonstrate the effectiveness and efficiency of our approach.
Efficient sequential and parallel algorithms for record linkage.
Mamun, Abdullah-Al; Mi, Tian; Aseltine, Robert; Rajasekaran, Sanguthevar
2014-01-01
Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Our sequential and parallel algorithms have been tested on a real dataset of 1,083,878 records and synthetic datasets ranging in size from 50,000 to 9,000,000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm.
Silva, Adão; Gameiro, Atílio
2014-01-01
We present in this work a low-complexity algorithm to solve the sum rate maximization problem in multiuser MIMO broadcast channels with downlink beamforming. Our approach decouples the user selection problem from the resource allocation problem and its main goal is to create a set of quasiorthogonal users. The proposed algorithm exploits physical metrics of the wireless channels that can be easily computed in such a way that a null space projection power can be approximated efficiently. Based on the derived metrics we present a mathematical model that describes the dynamics of the user selection process which renders the user selection problem into an integer linear program. Numerical results show that our approach is highly efficient to form groups of quasiorthogonal users when compared to previously proposed algorithms in the literature. Our user selection algorithm achieves a large portion of the optimum user selection sum rate (90%) for a moderate number of active users. PMID:24574928
Using the Multilayer Free-Surface Flow Model to Solve Wave Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prokof’ev, V. A., E-mail: ProkofyevVA@vniig.ru
2017-01-15
A method is presented for changing over from a single-layer shallow-water model to a multilayer model with hydrostatic pressure profile and, then, to a multilayer model with nonhydrostatic pressure profile. The method does not require complex procedures for solving the discrete Poisson’s equation and features high computation efficiency. The results of validating the algorithm against experimental data critical for the numerical dissipation of the numerical scheme are presented. Examples are considered.
On the solution of the Helmholtz equation on regions with corners.
Serkh, Kirill; Rokhlin, Vladimir
2016-08-16
In this paper we solve several boundary value problems for the Helmholtz equation on polygonal domains. We observe that when the problems are formulated as the boundary integral equations of potential theory, the solutions are representable by series of appropriately chosen Bessel functions. In addition to being analytically perspicuous, the resulting expressions lend themselves to the construction of accurate and efficient numerical algorithms. The results are illustrated by a number of numerical examples.
On the solution of the Helmholtz equation on regions with corners
Serkh, Kirill; Rokhlin, Vladimir
2016-01-01
In this paper we solve several boundary value problems for the Helmholtz equation on polygonal domains. We observe that when the problems are formulated as the boundary integral equations of potential theory, the solutions are representable by series of appropriately chosen Bessel functions. In addition to being analytically perspicuous, the resulting expressions lend themselves to the construction of accurate and efficient numerical algorithms. The results are illustrated by a number of numerical examples. PMID:27482110
Adaptive firefly algorithm: parameter analysis and its application.
Cheung, Ngaam J; Ding, Xue-Ming; Shen, Hong-Bin
2014-01-01
As a nature-inspired search algorithm, firefly algorithm (FA) has several control parameters, which may have great effects on its performance. In this study, we investigate the parameter selection and adaptation strategies in a modified firefly algorithm - adaptive firefly algorithm (AdaFa). There are three strategies in AdaFa including (1) a distance-based light absorption coefficient; (2) a gray coefficient enhancing fireflies to share difference information from attractive ones efficiently; and (3) five different dynamic strategies for the randomization parameter. Promising selections of parameters in the strategies are analyzed to guarantee the efficient performance of AdaFa. AdaFa is validated over widely used benchmark functions, and the numerical experiments and statistical tests yield useful conclusions on the strategies and the parameter selections affecting the performance of AdaFa. When applied to the real-world problem - protein tertiary structure prediction, the results demonstrated improved variants can rebuild the tertiary structure with the average root mean square deviation less than 0.4Å and 1.5Å from the native constrains with noise free and 10% Gaussian white noise.
Adaptive Firefly Algorithm: Parameter Analysis and its Application
Shen, Hong-Bin
2014-01-01
As a nature-inspired search algorithm, firefly algorithm (FA) has several control parameters, which may have great effects on its performance. In this study, we investigate the parameter selection and adaptation strategies in a modified firefly algorithm — adaptive firefly algorithm (AdaFa). There are three strategies in AdaFa including (1) a distance-based light absorption coefficient; (2) a gray coefficient enhancing fireflies to share difference information from attractive ones efficiently; and (3) five different dynamic strategies for the randomization parameter. Promising selections of parameters in the strategies are analyzed to guarantee the efficient performance of AdaFa. AdaFa is validated over widely used benchmark functions, and the numerical experiments and statistical tests yield useful conclusions on the strategies and the parameter selections affecting the performance of AdaFa. When applied to the real-world problem — protein tertiary structure prediction, the results demonstrated improved variants can rebuild the tertiary structure with the average root mean square deviation less than 0.4Å and 1.5Å from the native constrains with noise free and 10% Gaussian white noise. PMID:25397812
EMILiO: a fast algorithm for genome-scale strain design.
Yang, Laurence; Cluett, William R; Mahadevan, Radhakrishnan
2011-05-01
Systems-level design of cell metabolism is becoming increasingly important for renewable production of fuels, chemicals, and drugs. Computational models are improving in the accuracy and scope of predictions, but are also growing in complexity. Consequently, efficient and scalable algorithms are increasingly important for strain design. Previous algorithms helped to consolidate the utility of computational modeling in this field. To meet intensifying demands for high-performance strains, both the number and variety of genetic manipulations involved in strain construction are increasing. Existing algorithms have experienced combinatorial increases in computational complexity when applied toward the design of such complex strains. Here, we present EMILiO, a new algorithm that increases the scope of strain design to include reactions with individually optimized fluxes. Unlike existing approaches that would experience an explosion in complexity to solve this problem, we efficiently generated numerous alternate strain designs producing succinate, l-glutamate and l-serine. This was enabled by successive linear programming, a technique new to the area of computational strain design. Copyright © 2011 Elsevier Inc. All rights reserved.
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs
NASA Astrophysics Data System (ADS)
Yokota, Rio; Barba, L. A.; Narumi, Tetsu; Yasuoka, Kenji
2013-03-01
This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 40963 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as the numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
Recent update of the RPLUS2D/3D codes
NASA Technical Reports Server (NTRS)
Tsai, Y.-L. Peter
1991-01-01
The development of the RPLUS2D/3D codes is summarized. These codes utilize LU algorithms to solve chemical non-equilibrium flows in a body-fitted coordinate system. The motivation behind the development of these codes is the need to numerically predict chemical non-equilibrium flows for the National AeroSpace Plane Program. Recent improvements include vectorization method, blocking algorithms for geometric flexibility, out-of-core storage for large-size problems, and an LU-SW/UP combination for CPU-time efficiency and solution quality.
Implementation of a block Lanczos algorithm for Eigenproblem solution of gyroscopic systems
NASA Technical Reports Server (NTRS)
Gupta, Kajal K.; Lawson, Charles L.
1987-01-01
The details of implementation of a general numerical procedure developed for the accurate and economical computation of natural frequencies and associated modes of any elastic structure rotating along an arbitrary axis are described. A block version of the Lanczos algorithm is derived for the solution that fully exploits associated matrix sparsity and employs only real numbers in all relevant computations. It is also capable of determining multiple roots and proves to be most efficient when compared to other, similar, exisiting techniques.
Arbitrary-level hanging nodes for adaptive hphp-FEM approximations in 3D
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pavel Kus; Pavel Solin; David Andrs
2014-11-01
In this paper we discuss constrained approximation with arbitrary-level hanging nodes in adaptive higher-order finite element methods (hphp-FEM) for three-dimensional problems. This technique enables using highly irregular meshes, and it greatly simplifies the design of adaptive algorithms as it prevents refinements from propagating recursively through the finite element mesh. The technique makes it possible to design efficient adaptive algorithms for purely hexahedral meshes. We present a detailed mathematical description of the method and illustrate it with numerical examples.
An Effective Hybrid Evolutionary Algorithm for Solving the Numerical Optimization Problems
NASA Astrophysics Data System (ADS)
Qian, Xiaohong; Wang, Xumei; Su, Yonghong; He, Liu
2018-04-01
There are many different algorithms for solving complex optimization problems. Each algorithm has been applied successfully in solving some optimization problems, but not efficiently in other problems. In this paper the Cauchy mutation and the multi-parent hybrid operator are combined to propose a hybrid evolutionary algorithm based on the communication (Mixed Evolutionary Algorithm based on Communication), hereinafter referred to as CMEA. The basic idea of the CMEA algorithm is that the initial population is divided into two subpopulations. Cauchy mutation operators and multiple paternal crossover operators are used to perform two subpopulations parallelly to evolve recursively until the downtime conditions are met. While subpopulation is reorganized, the individual is exchanged together with information. The algorithm flow is given and the performance of the algorithm is compared using a number of standard test functions. Simulation results have shown that this algorithm converges significantly faster than FEP (Fast Evolutionary Programming) algorithm, has good performance in global convergence and stability and is superior to other compared algorithms.
NASA Astrophysics Data System (ADS)
Han, Byeongho; Seol, Soon Jee; Byun, Joongmoo
2012-04-01
To simulate wave propagation in a tilted transversely isotropic (TTI) medium with a tilting symmetry-axis of anisotropy, we develop a 2D elastic forward modelling algorithm. In this algorithm, we use the staggered-grid finite-difference method which has fourth-order accuracy in space and second-order accuracy in time. Since velocity-stress formulations are defined for staggered grids, we include auxiliary grid points in the z-direction to meet the free surface boundary conditions for shear stress. Through comparisons of displacements obtained from our algorithm, not only with analytical solutions but also with finite element solutions, we are able to validate that the free surface conditions operate appropriately and elastic waves propagate correctly. In order to handle the artificial boundary reflections efficiently, we also implement convolutional perfectly matched layer (CPML) absorbing boundaries in our algorithm. The CPML sufficiently attenuates energy at the grazing incidence by modifying the damping profile of the PML boundary. Numerical experiments indicate that the algorithm accurately expresses elastic wave propagation in the TTI medium. At the free surface, the numerical results show good agreement with analytical solutions not only for body waves but also for the Rayleigh wave which has strong amplitude along the surface. In addition, we demonstrate the efficiency of CPML for a homogeneous TI medium and a dipping layered model. Only using 10 grid points to the CPML regions, the artificial reflections are successfully suppressed and the energy of the boundary reflection back into the effective modelling area is significantly decayed.
Exact consideration of data redundancies for spiral cone-beam CT
NASA Astrophysics Data System (ADS)
Lauritsch, Guenter; Katsevich, Alexander; Hirsch, Michael
2004-05-01
In multi-slice spiral computed tomography (CT) there is an obvious trend in adding more and more detector rows. The goals are numerous: volume coverage, isotropic spatial resolution, and speed. Consequently, there will be a variety of scan protocols optimizing clinical applications. Flexibility in table feed requires consideration of data redundancies to ensure efficient detector usage. Until recently this was achieved by approximate reconstruction algorithms only. However, due to the increasing cone angles there is a need of exact treatment of the cone beam geometry. A new, exact and efficient 3-PI algorithm for considering three-fold data redundancies was derived from a general, theoretical framework based on 3D Radon inversion using Grangeat's formula. The 3-PI algorithm possesses a simple and efficient structure as the 1-PI method for non-redundant data previously proposed. Filtering is one-dimensional, performed along lines with variable tilt on the detector. This talk deals with a thorough evaluation of the performance of the 3-PI algorithm in comparison to the 1-PI method. Image quality of the 3-PI algorithm is superior. The prominent spiral artifacts and other discretization artifacts are significantly reduced due to averaging effects when taking into account redundant data. Certainly signal-to-noise ratio is increased. The computational expense is comparable even to that of approximate algorithms. The 3-PI algorithm proves its practicability for applications in medical imaging. Other exact n-PI methods for n-fold data redundancies (n odd) can be deduced from the general, theoretical framework.
Parallel discontinuous Galerkin FEM for computing hyperbolic conservation law on unstructured grids
NASA Astrophysics Data System (ADS)
Ma, Xinrong; Duan, Zhijian
2018-04-01
High-order resolution Discontinuous Galerkin finite element methods (DGFEM) has been known as a good method for solving Euler equations and Navier-Stokes equations on unstructured grid, but it costs too much computational resources. An efficient parallel algorithm was presented for solving the compressible Euler equations. Moreover, the multigrid strategy based on three-stage three-order TVD Runge-Kutta scheme was used in order to improve the computational efficiency of DGFEM and accelerate the convergence of the solution of unsteady compressible Euler equations. In order to make each processor maintain load balancing, the domain decomposition method was employed. Numerical experiment performed for the inviscid transonic flow fluid problems around NACA0012 airfoil and M6 wing. The results indicated that our parallel algorithm can improve acceleration and efficiency significantly, which is suitable for calculating the complex flow fluid.
Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N
2015-10-13
We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [ Niklasson, A. M. N. Phys. Rev. B 2002 , 66 , 155115 ] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
Enhancements on the Convex Programming Based Powered Descent Guidance Algorithm for Mars Landing
NASA Technical Reports Server (NTRS)
Acikmese, Behcet; Blackmore, Lars; Scharf, Daniel P.; Wolf, Aron
2008-01-01
In this paper, we present enhancements on the powered descent guidance algorithm developed for Mars pinpoint landing. The guidance algorithm solves the powered descent minimum fuel trajectory optimization problem via a direct numerical method. Our main contribution is to formulate the trajectory optimization problem, which has nonconvex control constraints, as a finite dimensional convex optimization problem, specifically as a finite dimensional second order cone programming (SOCP) problem. SOCP is a subclass of convex programming, and there are efficient SOCP solvers with deterministic convergence properties. Hence, the resulting guidance algorithm can potentially be implemented onboard a spacecraft for real-time applications. Particularly, this paper discusses the algorithmic improvements obtained by: (i) Using an efficient approach to choose the optimal time-of-flight; (ii) Using a computationally inexpensive way to detect the feasibility/ infeasibility of the problem due to the thrust-to-weight constraint; (iii) Incorporating the rotation rate of the planet into the problem formulation; (iv) Developing additional constraints on the position and velocity to guarantee no-subsurface flight between the time samples of the temporal discretization; (v) Developing a fuel-limited targeting algorithm; (vi) Initial result on developing an onboard table lookup method to obtain almost fuel optimal solutions in real-time.
Further optimization of SeDDaRA blind image deconvolution algorithm and its DSP implementation
NASA Astrophysics Data System (ADS)
Wen, Bo; Zhang, Qiheng; Zhang, Jianlin
2011-11-01
Efficient algorithm for blind image deconvolution and its high-speed implementation is of great value in practice. Further optimization of SeDDaRA is developed, from algorithm structure to numerical calculation methods. The main optimization covers that, the structure's modularization for good implementation feasibility, reducing the data computation and dependency of 2D-FFT/IFFT, and acceleration of power operation by segmented look-up table. Then the Fast SeDDaRA is proposed and specialized for low complexity. As the final implementation, a hardware system of image restoration is conducted by using the multi-DSP parallel processing. Experimental results show that, the processing time and memory demand of Fast SeDDaRA decreases 50% at least; the data throughput of image restoration system is over 7.8Msps. The optimization is proved efficient and feasible, and the Fast SeDDaRA is able to support the real-time application.
NASA Technical Reports Server (NTRS)
Cain, Michael D.
1999-01-01
The goal of this thesis is to develop an efficient and robust locally preconditioned semi-coarsening multigrid algorithm for the two-dimensional Navier-Stokes equations. This thesis examines the performance of the multigrid algorithm with local preconditioning for an upwind-discretization of the Navier-Stokes equations. A block Jacobi iterative scheme is used because of its high frequency error mode damping ability. At low Mach numbers, the performance of a flux preconditioner is investigated. The flux preconditioner utilizes a new limiting technique based on local information that was developed by Siu. Full-coarsening and-semi-coarsening are examined as well as the multigrid V-cycle and full multigrid. The numerical tests were performed on a NACA 0012 airfoil at a range of Mach numbers. The tests show that semi-coarsening with flux preconditioning is the most efficient and robust combination of coarsening strategy, and iterative scheme - especially at low Mach numbers.
Communication-avoiding symmetric-indefinite factorization
Ballard, Grey Malone; Becker, Dulcenia; Demmel, James; ...
2014-11-13
We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular tridiagonalization. It factors a dense symmetric matrix A as the product A=PLTL TP T where P is a permutation matrix, L is lower triangular, and T is block tridiagonal and banded. The algorithm is the first symmetric-indefinite communication-avoiding factorization: it performs an asymptotically optimal amount of communication in a two-level memory hierarchy for almost any cache-line size. Adaptations of the algorithm to parallel computers are likely to be communication efficient as well; one such adaptation has been recently published. Asmore » a result, the current paper describes the algorithm, proves that it is numerically stable, and proves that it is communication optimal.« less
An Optimization Study of Hot Stamping Operation
NASA Astrophysics Data System (ADS)
Ghoo, Bonyoung; Umezu, Yasuyoshi; Watanabe, Yuko; Ma, Ninshu; Averill, Ron
2010-06-01
In the present study, 3-dimensional finite element analyses for hot-stamping processes of Audi B-pillar product are conducted using JSTAMP/NV and HEEDS. Special attention is paid to the optimization of simulation technology coupling with thermal-mechanical formulations. Numerical simulation based on FEM technology and optimization design using the hybrid adaptive SHERPA algorithm are applied to hot stamping operation to improve productivity. The robustness of the SHERPA algorithm is found through the results of the benchmark example. The SHERPA algorithm is shown to be far superior to the GA (Genetic Algorithm) in terms of efficiency, whose calculation time is about 7 times faster than that of the GA. The SHERPA algorithm could show high performance in a large scale problem having complicated design space and long calculation time.
The high performance parallel algorithm for Unified Gas-Kinetic Scheme
NASA Astrophysics Data System (ADS)
Li, Shiyi; Li, Qibing; Fu, Song; Xu, Jinxiu
2016-11-01
A high performance parallel algorithm for UGKS is developed to simulate three-dimensional flows internal and external on arbitrary grid system. The physical domain and velocity domain are divided into different blocks and distributed according to the two-dimensional Cartesian topology with intra-communicators in physical domain for data exchange and other intra-communicators in velocity domain for sum reduction to moment integrals. Numerical results of three-dimensional cavity flow and flow past a sphere agree well with the results from the existing studies and validate the applicability of the algorithm. The scalability of the algorithm is tested both on small (1-16) and large (729-5832) scale processors. The tested speed-up ratio is near linear ashind thus the efficiency is around 1, which reveals the good scalability of the present algorithm.
Communication-avoiding symmetric-indefinite factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ballard, Grey Malone; Becker, Dulcenia; Demmel, James
We describe and analyze a novel symmetric triangular factorization algorithm. The algorithm is essentially a block version of Aasen's triangular tridiagonalization. It factors a dense symmetric matrix A as the product A=PLTL TP T where P is a permutation matrix, L is lower triangular, and T is block tridiagonal and banded. The algorithm is the first symmetric-indefinite communication-avoiding factorization: it performs an asymptotically optimal amount of communication in a two-level memory hierarchy for almost any cache-line size. Adaptations of the algorithm to parallel computers are likely to be communication efficient as well; one such adaptation has been recently published. Asmore » a result, the current paper describes the algorithm, proves that it is numerically stable, and proves that it is communication optimal.« less
NASA Astrophysics Data System (ADS)
Hu, Zixi; Yao, Zhewei; Li, Jinglai
2017-03-01
Many scientific and engineering problems require to perform Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrary slow under the mesh refinement, which is referred to as being dimension dependent. To this end, a family of dimensional independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, were proposed to sample the infinite dimensional parameters. In this work we develop an adaptive version of the pCN algorithm, where the covariance operator of the proposal distribution is adjusted based on sampling history to improve the simulation efficiency. We show that the proposed algorithm satisfies an important ergodicity condition under some mild assumptions. Finally we provide numerical examples to demonstrate the performance of the proposed method.
Mizutani, Eiji; Demmel, James W
2003-01-01
This paper briefly introduces our numerical linear algebra approaches for solving structured nonlinear least squares problems arising from 'multiple-output' neural-network (NN) models. Our algorithms feature trust-region regularization, and exploit sparsity of either the 'block-angular' residual Jacobian matrix or the 'block-arrow' Gauss-Newton Hessian (or Fisher information matrix in statistical sense) depending on problem scale so as to render a large class of NN-learning algorithms 'efficient' in both memory and operation costs. Using a relatively large real-world nonlinear regression application, we shall explain algorithmic strengths and weaknesses, analyzing simulation results obtained by both direct and iterative trust-region algorithms with two distinct NN models: 'multilayer perceptrons' (MLP) and 'complementary mixtures of MLP-experts' (or neuro-fuzzy modular networks).
NASA Astrophysics Data System (ADS)
Clark, Martyn P.; Kavetski, Dmitri
2010-10-01
A major neglected weakness of many current hydrological models is the numerical method used to solve the governing model equations. This paper thoroughly evaluates several classes of time stepping schemes in terms of numerical reliability and computational efficiency in the context of conceptual hydrological modeling. Numerical experiments are carried out using 8 distinct time stepping algorithms and 6 different conceptual rainfall-runoff models, applied in a densely gauged experimental catchment, as well as in 12 basins with diverse physical and hydroclimatic characteristics. Results show that, over vast regions of the parameter space, the numerical errors of fixed-step explicit schemes commonly used in hydrology routinely dwarf the structural errors of the model conceptualization. This substantially degrades model predictions, but also, disturbingly, generates fortuitously adequate performance for parameter sets where numerical errors compensate for model structural errors. Simply running fixed-step explicit schemes with shorter time steps provides a poor balance between accuracy and efficiency: in some cases daily-step adaptive explicit schemes with moderate error tolerances achieved comparable or higher accuracy than 15 min fixed-step explicit approximations but were nearly 10 times more efficient. From the range of simple time stepping schemes investigated in this work, the fixed-step implicit Euler method and the adaptive explicit Heun method emerge as good practical choices for the majority of simulation scenarios. In combination with the companion paper, where impacts on model analysis, interpretation, and prediction are assessed, this two-part study vividly highlights the impact of numerical errors on critical performance aspects of conceptual hydrological models and provides practical guidelines for robust numerical implementation.
Automatic red eye correction and its quality metric
NASA Astrophysics Data System (ADS)
Safonov, Ilia V.; Rychagov, Michael N.; Kang, KiMin; Kim, Sang Ho
2008-01-01
The red eye artifacts are troublesome defect of amateur photos. Correction of red eyes during printing without user intervention and making photos more pleasant for an observer are important tasks. The novel efficient technique of automatic correction of red eyes aimed for photo printers is proposed. This algorithm is independent from face orientation and capable to detect paired red eyes as well as single red eyes. The approach is based on application of 3D tables with typicalness levels for red eyes and human skin tones and directional edge detection filters for processing of redness image. Machine learning is applied for feature selection. For classification of red eye regions a cascade of classifiers including Gentle AdaBoost committee from Classification and Regression Trees (CART) is applied. Retouching stage includes desaturation, darkening and blending with initial image. Several versions of approach implementation using trade-off between detection and correction quality, processing time, memory volume are possible. The numeric quality criterion of automatic red eye correction is proposed. This quality metric is constructed by applying Analytic Hierarchy Process (AHP) for consumer opinions about correction outcomes. Proposed numeric metric helped to choose algorithm parameters via optimization procedure. Experimental results demonstrate high accuracy and efficiency of the proposed algorithm in comparison with existing solutions.
NASA Astrophysics Data System (ADS)
Dimov, I.; Georgieva, R.; Todorov, V.; Ostromsky, Tz.
2017-10-01
Reliability of large-scale mathematical models is an important issue when such models are used to support decision makers. Sensitivity analysis of model outputs to variation or natural uncertainties of model inputs is crucial for improving the reliability of mathematical models. A comprehensive experimental study of Monte Carlo algorithms based on Sobol sequences for multidimensional numerical integration has been done. A comparison with Latin hypercube sampling and a particular quasi-Monte Carlo lattice rule based on generalized Fibonacci numbers has been presented. The algorithms have been successfully applied to compute global Sobol sensitivity measures corresponding to the influence of several input parameters (six chemical reactions rates and four different groups of pollutants) on the concentrations of important air pollutants. The concentration values have been generated by the Unified Danish Eulerian Model. The sensitivity study has been done for the areas of several European cities with different geographical locations. The numerical tests show that the stochastic algorithms under consideration are efficient for multidimensional integration and especially for computing small by value sensitivity indices. It is a crucial element since even small indices may be important to be estimated in order to achieve a more accurate distribution of inputs influence and a more reliable interpretation of the mathematical model results.
A Novel Quantum-Behaved Bat Algorithm with Mean Best Position Directed for Numerical Optimization
Zhu, Wenyong; Liu, Zijuan; Duan, Qingyan; Cao, Long
2016-01-01
This paper proposes a novel quantum-behaved bat algorithm with the direction of mean best position (QMBA). In QMBA, the position of each bat is mainly updated by the current optimal solution in the early stage of searching and in the late search it also depends on the mean best position which can enhance the convergence speed of the algorithm. During the process of searching, quantum behavior of bats is introduced which is beneficial to jump out of local optimal solution and make the quantum-behaved bats not easily fall into local optimal solution, and it has better ability to adapt complex environment. Meanwhile, QMBA makes good use of statistical information of best position which bats had experienced to generate better quality solutions. This approach not only inherits the characteristic of quick convergence, simplicity, and easy implementation of original bat algorithm, but also increases the diversity of population and improves the accuracy of solution. Twenty-four benchmark test functions are tested and compared with other variant bat algorithms for numerical optimization the simulation results show that this approach is simple and efficient and can achieve a more accurate solution. PMID:27293424
Efficient Numerical Diagonalization of Hermitian 3 × 3 Matrices
NASA Astrophysics Data System (ADS)
Kopp, Joachim
A very common problem in science is the numerical diagonalization of symmetric or hermitian 3 × 3 matrices. Since standard "black box" packages may be too inefficient if the number of matrices is large, we study several alternatives. We consider optimized implementations of the Jacobi, QL, and Cuppen algorithms and compare them with an alytical method relying on Cardano's formula for the eigenvalues and on vector cross products for the eigenvectors. Jacobi is the most accurate, but also the slowest method, while QL and Cuppen are good general purpose algorithms. The analytical algorithm outperforms the others by more than a factor of 2, but becomes inaccurate or may even fail completely if the matrix entries differ greatly in magnitude. This can mostly be circumvented by using a hybrid method, which falls back to QL if conditions are such that the analytical calculation might become too inaccurate. For all algorithms, we give an overview of the underlying mathematical ideas, and present detailed benchmark results. C and Fortran implementations of our code are available for download from .
Vectorization of transport and diffusion computations on the CDC Cyber 205
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abu-Shumays, I.K.
1986-01-01
The development and testing of alternative numerical methods and computational algorithms specifically designed for the vectorization of transport and diffusion computations on a Control Data Corporation (CDC) Cyber 205 vector computer are described. Two solution methods for the discrete ordinates approximation to the transport equation are summarized and compared. Factors of 4 to 7 reduction in run times for certain large transport problems were achieved on a Cyber 205 as compared with run times on a CDC-7600. The solution of tridiagonal systems of linear equations, central to several efficient numerical methods for multidimensional diffusion computations and essential for fluid flowmore » and other physics and engineering problems, is also dealt with. Among the methods tested, a combined odd-even cyclic reduction and modified Cholesky factorization algorithm for solving linear symmetric positive definite tridiagonal systems is found to be the most effective for these systems on a Cyber 205. For large tridiagonal systems, computation with this algorithm is an order of magnitude faster on a Cyber 205 than computation with the best algorithm for tridiagonal systems on a CDC-7600.« less
Merritt, M.L.
1993-01-01
The simulation of the transport of injected freshwater in a thin brackish aquifer, overlain and underlain by confining layers containing more saline water, is shown to be influenced by the choice of the finite-difference approximation method, the algorithm for representing vertical advective and dispersive fluxes, and the values assigned to parametric coefficients that specify the degree of vertical dispersion and molecular diffusion that occurs. Computed potable water recovery efficiencies will differ depending upon the choice of algorithm and approximation method, as will dispersion coefficients estimated based on the calibration of simulations to match measured data. A comparison of centered and backward finite-difference approximation methods shows that substantially different transition zones between injected and native waters are depicted by the different methods, and computed recovery efficiencies vary greatly. Standard and experimental algorithms and a variety of values for molecular diffusivity, transverse dispersivity, and vertical scaling factor were compared in simulations of freshwater storage in a thin brackish aquifer. Computed recovery efficiencies vary considerably, and appreciable differences are observed in the distribution of injected freshwater in the various cases tested. The results demonstrate both a qualitatively different description of transport using the experimental algorithms and the interrelated influences of molecular diffusion and transverse dispersion on simulated recovery efficiency. When simulating natural aquifer flow in cross-section, flushing of the aquifer occurred for all tested coefficient choices using both standard and experimental algorithms. ?? 1993.
Zhang, Kaihua; Zhang, Lei; Yang, Ming-Hsuan
2014-10-01
It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. Despite much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there does not exist sufficient amount of data for online algorithms to learn at the outset. Second, online tracking algorithms often encounter the drift problems. As a result of self-taught learning, misaligned samples are likely to be added and degrade the appearance models. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from a multiscale image feature space with data-independent basis. The proposed appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is constructed to efficiently extract the features for the appearance model. We compress sample images of the foreground target and the background using the same sparse measurement matrix. The tracking task is formulated as a binary classification via a naive Bayes classifier with online update in the compressed domain. A coarse-to-fine search strategy is adopted to further reduce the computational complexity in the detection procedure. The proposed compressive tracking algorithm runs in real-time and performs favorably against state-of-the-art methods on challenging sequences in terms of efficiency, accuracy and robustness.
Artificial bee colony algorithm for constrained possibilistic portfolio optimization problem
NASA Astrophysics Data System (ADS)
Chen, Wei
2015-07-01
In this paper, we discuss the portfolio optimization problem with real-world constraints under the assumption that the returns of risky assets are fuzzy numbers. A new possibilistic mean-semiabsolute deviation model is proposed, in which transaction costs, cardinality and quantity constraints are considered. Due to such constraints the proposed model becomes a mixed integer nonlinear programming problem and traditional optimization methods fail to find the optimal solution efficiently. Thus, a modified artificial bee colony (MABC) algorithm is developed to solve the corresponding optimization problem. Finally, a numerical example is given to illustrate the effectiveness of the proposed model and the corresponding algorithm.
Matrix preconditioning: a robust operation for optical linear algebra processors.
Ghosh, A; Paparao, P
1987-07-15
Analog electrooptical processors are best suited for applications demanding high computational throughput with tolerance for inaccuracies. Matrix preconditioning is one such application. Matrix preconditioning is a preprocessing step for reducing the condition number of a matrix and is used extensively with gradient algorithms for increasing the rate of convergence and improving the accuracy of the solution. In this paper, we describe a simple parallel algorithm for matrix preconditioning, which can be implemented efficiently on a pipelined optical linear algebra processor. From the results of our numerical experiments we show that the efficacy of the preconditioning algorithm is affected very little by the errors of the optical system.
A Multiuser Detector Based on Artificial Bee Colony Algorithm for DS-UWB Systems
Liu, Xiaohui
2013-01-01
Artificial Bee Colony (ABC) algorithm is an optimization algorithm based on the intelligent behavior of honey bee swarm. The ABC algorithm was developed to solve optimizing numerical problems and revealed premising results in processing time and solution quality. In ABC, a colony of artificial bees search for rich artificial food sources; the optimizing numerical problems are converted to the problem of finding the best parameter which minimizes an objective function. Then, the artificial bees randomly discover a population of initial solutions and then iteratively improve them by employing the behavior: moving towards better solutions by means of a neighbor search mechanism while abandoning poor solutions. In this paper, an efficient multiuser detector based on a suboptimal code mapping multiuser detector and artificial bee colony algorithm (SCM-ABC-MUD) is proposed and implemented in direct-sequence ultra-wideband (DS-UWB) systems under the additive white Gaussian noise (AWGN) channel. The simulation results demonstrate that the BER and the near-far effect resistance performances of this proposed algorithm are quite close to those of the optimum multiuser detector (OMD) while its computational complexity is much lower than that of OMD. Furthermore, the BER performance of SCM-ABC-MUD is not sensitive to the number of active users and can obtain a large system capacity. PMID:23983638
ANNIT - An Efficient Inversion Algorithm based on Prediction Principles
NASA Astrophysics Data System (ADS)
Růžek, B.; Kolář, P.
2009-04-01
Solution of inverse problems represents meaningful job in geophysics. The amount of data is continuously increasing, methods of modeling are being improved and the computer facilities are also advancing great technical progress. Therefore the development of new and efficient algorithms and computer codes for both forward and inverse modeling is still up to date. ANNIT is contributing to this stream since it is a tool for efficient solution of a set of non-linear equations. Typical geophysical problems are based on parametric approach. The system is characterized by a vector of parameters p, the response of the system is characterized by a vector of data d. The forward problem is usually represented by unique mapping F(p)=d. The inverse problem is much more complex and the inverse mapping p=G(d) is available in an analytical or closed form only exceptionally and generally it may not exist at all. Technically, both forward and inverse mapping F and G are sets of non-linear equations. ANNIT solves such situation as follows: (i) joint subspaces {pD, pM} of original data and model spaces D, M, resp. are searched for, within which the forward mapping F is sufficiently smooth that the inverse mapping G does exist, (ii) numerical approximation of G in subspaces {pD, pM} is found, (iii) candidate solution is predicted by using this numerical approximation. ANNIT is working in an iterative way in cycles. The subspaces {pD, pM} are searched for by generating suitable populations of individuals (models) covering data and model spaces. The approximation of the inverse mapping is made by using three methods: (a) linear regression, (b) Radial Basis Function Network technique, (c) linear prediction (also known as "Kriging"). The ANNIT algorithm has built in also an archive of already evaluated models. Archive models are re-used in a suitable way and thus the number of forward evaluations is minimized. ANNIT is now implemented both in MATLAB and SCILAB. Numerical tests show good performance of the algorithm. Both versions and documentation are available on Internet and anybody can download them. The goal of this presentation is to offer the algorithm and computer codes for anybody interested in the solution to inverse problems.
Regularization iteration imaging algorithm for electrical capacitance tomography
NASA Astrophysics Data System (ADS)
Tong, Guowei; Liu, Shi; Chen, Hongyan; Wang, Xueyao
2018-03-01
The image reconstruction method plays a crucial role in real-world applications of the electrical capacitance tomography technique. In this study, a new cost function that simultaneously considers the sparsity and low-rank properties of the imaging targets is proposed to improve the quality of the reconstruction images, in which the image reconstruction task is converted into an optimization problem. Within the framework of the split Bregman algorithm, an iterative scheme that splits a complicated optimization problem into several simpler sub-tasks is developed to solve the proposed cost function efficiently, in which the fast-iterative shrinkage thresholding algorithm is introduced to accelerate the convergence. Numerical experiment results verify the effectiveness of the proposed algorithm in improving the reconstruction precision and robustness.
Traffic Flow Management Using Aggregate Flow Models and the Development of Disaggregation Methods
NASA Technical Reports Server (NTRS)
Sun, Dengfeng; Sridhar, Banavar; Grabbe, Shon
2010-01-01
A linear time-varying aggregate traffic flow model can be used to develop Traffic Flow Management (tfm) strategies based on optimization algorithms. However, there are no methods available in the literature to translate these aggregate solutions into actions involving individual aircraft. This paper describes and implements a computationally efficient disaggregation algorithm, which converts an aggregate (flow-based) solution to a flight-specific control action. Numerical results generated by the optimization method and the disaggregation algorithm are presented and illustrated by applying them to generate TFM schedules for a typical day in the U.S. National Airspace System. The results show that the disaggregation algorithm generates control actions for individual flights while keeping the air traffic behavior very close to the optimal solution.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran
We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by everymore » processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.« less
NASA Technical Reports Server (NTRS)
Goodrich, John W.
1995-01-01
Two methods for developing high order single step explicit algorithms on symmetric stencils with data on only one time level are presented. Examples are given for the convection and linearized Euler equations with up to the eighth order accuracy in both space and time in one space dimension, and up to the sixth in two space dimensions. The method of characteristics is generalized to nondiagonalizable hyperbolic systems by using exact local polynominal solutions of the system, and the resulting exact propagator methods automatically incorporate the correct multidimensional wave propagation dynamics. Multivariate Taylor or Cauchy-Kowaleskaya expansions are also used to develop algorithms. Both of these methods can be applied to obtain algorithms of arbitrarily high order for hyperbolic systems in multiple space dimensions. Cross derivatives are included in the local approximations used to develop the algorithms in this paper in order to obtain high order accuracy, and improved isotropy and stability. Efficiency in meeting global error bounds is an important criterion for evaluating algorithms, and the higher order algorithms are shown to be up to several orders of magnitude more efficient even though they are more complex. Stable high order boundary conditions for the linearized Euler equations are developed in one space dimension, and demonstrated in two space dimensions.
Group implicit concurrent algorithms in nonlinear structural dynamics
NASA Technical Reports Server (NTRS)
Ortiz, M.; Sotelino, E. D.
1989-01-01
During the 70's and 80's, considerable effort was devoted to developing efficient and reliable time stepping procedures for transient structural analysis. Mathematically, the equations governing this type of problems are generally stiff, i.e., they exhibit a wide spectrum in the linear range. The algorithms best suited to this type of applications are those which accurately integrate the low frequency content of the response without necessitating the resolution of the high frequency modes. This means that the algorithms must be unconditionally stable, which in turn rules out explicit integration. The most exciting possibility in the algorithms development area in recent years has been the advent of parallel computers with multiprocessing capabilities. So, this work is mainly concerned with the development of parallel algorithms in the area of structural dynamics. A primary objective is to devise unconditionally stable and accurate time stepping procedures which lend themselves to an efficient implementation in concurrent machines. Some features of the new computer architecture are summarized. A brief survey of current efforts in the area is presented. A new class of concurrent procedures, or Group Implicit algorithms is introduced and analyzed. The numerical simulation shows that GI algorithms hold considerable promise for application in coarse grain as well as medium grain parallel computers.
Computing Generalized Matrix Inverse on Spiking Neural Substrate
Shukla, Rohit; Khoram, Soroosh; Jorgensen, Erik; Li, Jing; Lipasti, Mikko; Wright, Stephen
2018-01-01
Emerging neural hardware substrates, such as IBM's TrueNorth Neurosynaptic System, can provide an appealing platform for deploying numerical algorithms. For example, a recurrent Hopfield neural network can be used to find the Moore-Penrose generalized inverse of a matrix, thus enabling a broad class of linear optimizations to be solved efficiently, at low energy cost. However, deploying numerical algorithms on hardware platforms that severely limit the range and precision of representation for numeric quantities can be quite challenging. This paper discusses these challenges and proposes a rigorous mathematical framework for reasoning about range and precision on such substrates. The paper derives techniques for normalizing inputs and properly quantizing synaptic weights originating from arbitrary systems of linear equations, so that solvers for those systems can be implemented in a provably correct manner on hardware-constrained neural substrates. The analytical model is empirically validated on the IBM TrueNorth platform, and results show that the guarantees provided by the framework for range and precision hold under experimental conditions. Experiments with optical flow demonstrate the energy benefits of deploying a reduced-precision and energy-efficient generalized matrix inverse engine on the IBM TrueNorth platform, reflecting 10× to 100× improvement over FPGA and ARM core baselines. PMID:29593483
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tavakoli, Rouhollah, E-mail: rtavakoli@sharif.ir
An unconditionally energy stable time stepping scheme is introduced to solve Cahn–Morral-like equations in the present study. It is constructed based on the combination of David Eyre's time stepping scheme and Schur complement approach. Although the presented method is general and independent of the choice of homogeneous free energy density function term, logarithmic and polynomial energy functions are specifically considered in this paper. The method is applied to study the spinodal decomposition in multi-component systems and optimal space tiling problems. A penalization strategy is developed, in the case of later problem, to avoid trivial solutions. Extensive numerical experiments demonstrate themore » success and performance of the presented method. According to the numerical results, the method is convergent and energy stable, independent of the choice of time stepsize. Its MATLAB implementation is included in the appendix for the numerical evaluation of algorithm and reproduction of the presented results. -- Highlights: •Extension of Eyre's convex–concave splitting scheme to multiphase systems. •Efficient solution of spinodal decomposition in multi-component systems. •Efficient solution of least perimeter periodic space partitioning problem. •Developing a penalization strategy to avoid trivial solutions. •Presentation of MATLAB implementation of the introduced algorithm.« less
Fully implicit moving mesh adaptive algorithm
NASA Astrophysics Data System (ADS)
Chacon, Luis
2005-10-01
In many problems of interest, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. The former is best dealt with with fully implicit methods, which are able to step over fast frequencies to resolve the dynamical time scale of interest. The latter requires grid adaptivity for efficiency. Moving-mesh grid adaptive methods are attractive because they can be designed to minimize the numerical error for a given resolution. However, the required grid governing equations are typically very nonlinear and stiff, and of considerably difficult numerical treatment. Not surprisingly, fully coupled, implicit approaches where the grid and the physics equations are solved simultaneously are rare in the literature, and circumscribed to 1D geometries. In this study, we present a fully implicit algorithm for moving mesh methods that is feasible for multidimensional geometries. A crucial element is the development of an effective multilevel treatment of the grid equation.ootnotetextL. Chac'on, G. Lapenta, A fully implicit, nonlinear adaptive grid strategy, J. Comput. Phys., accepted (2005) We will show that such an approach is competitive vs. uniform grids both from the accuracy (due to adaptivity) and the efficiency standpoints. Results for a variety of models 1D and 2D geometries, including nonlinear diffusion, radiation-diffusion, Burgers equation, and gas dynamics will be presented.
NOTE: A BPF-type algorithm for CT with a curved PI detector
NASA Astrophysics Data System (ADS)
Tang, Jie; Zhang, Li; Chen, Zhiqiang; Xing, Yuxiang; Cheng, Jianping
2006-08-01
Helical cone-beam CT is used widely nowadays because of its rapid scan speed and efficient utilization of x-ray dose. Recently, an exact reconstruction algorithm for helical cone-beam CT was proposed (Zou and Pan 2004a Phys. Med. Biol. 49 941 59). The algorithm is referred to as a backprojection-filtering (BPF) algorithm. This BPF algorithm for a helical cone-beam CT with a flat-panel detector (FPD-HCBCT) requires minimum data within the Tam Danielsson window and can naturally address the problem of ROI reconstruction from data truncated in both longitudinal and transversal directions. In practical CT systems, detectors are expensive and always take a very important position in the total cost. Hence, we work on an exact reconstruction algorithm for a CT system with a detector of the smallest size, i.e., a curved PI detector fitting the Tam Danielsson window. The reconstruction algorithm is derived following the framework of the BPF algorithm. Numerical simulations are done to validate our algorithm in this study.
A BPF-type algorithm for CT with a curved PI detector.
Tang, Jie; Zhang, Li; Chen, Zhiqiang; Xing, Yuxiang; Cheng, Jianping
2006-08-21
Helical cone-beam CT is used widely nowadays because of its rapid scan speed and efficient utilization of x-ray dose. Recently, an exact reconstruction algorithm for helical cone-beam CT was proposed (Zou and Pan 2004a Phys. Med. Biol. 49 941-59). The algorithm is referred to as a backprojection-filtering (BPF) algorithm. This BPF algorithm for a helical cone-beam CT with a flat-panel detector (FPD-HCBCT) requires minimum data within the Tam-Danielsson window and can naturally address the problem of ROI reconstruction from data truncated in both longitudinal and transversal directions. In practical CT systems, detectors are expensive and always take a very important position in the total cost. Hence, we work on an exact reconstruction algorithm for a CT system with a detector of the smallest size, i.e., a curved PI detector fitting the Tam-Danielsson window. The reconstruction algorithm is derived following the framework of the BPF algorithm. Numerical simulations are done to validate our algorithm in this study.
Fast Multilevel Solvers for a Class of Discrete Fourth Order Parabolic Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zheng, Bin; Chen, Luoping; Hu, Xiaozhe
2016-03-05
In this paper, we study fast iterative solvers for the solution of fourth order parabolic equations discretized by mixed finite element methods. We propose to use consistent mass matrix in the discretization and use lumped mass matrix to construct efficient preconditioners. We provide eigenvalue analysis for the preconditioned system and estimate the convergence rate of the preconditioned GMRes method. Furthermore, we show that these preconditioners only need to be solved inexactly by optimal multigrid algorithms. Our numerical examples indicate that the proposed preconditioners are very efficient and robust with respect to both discretization parameters and diffusion coefficients. We also investigatemore » the performance of multigrid algorithms with either collective smoothers or distributive smoothers when solving the preconditioner systems.« less
Modified Newton-Raphson GRAPE methods for optimal control of spin systems
NASA Astrophysics Data System (ADS)
Goodwin, D. L.; Kuprov, Ilya
2016-05-01
Quadratic convergence throughout the active space is achieved for the gradient ascent pulse engineering (GRAPE) family of quantum optimal control algorithms. We demonstrate in this communication that the Hessian of the GRAPE fidelity functional is unusually cheap, having the same asymptotic complexity scaling as the functional itself. This leads to the possibility of using very efficient numerical optimization techniques. In particular, the Newton-Raphson method with a rational function optimization (RFO) regularized Hessian is shown in this work to require fewer system trajectory evaluations than any other algorithm in the GRAPE family. This communication describes algebraic and numerical implementation aspects (matrix exponential recycling, Hessian regularization, etc.) for the RFO Newton-Raphson version of GRAPE and reports benchmarks for common spin state control problems in magnetic resonance spectroscopy.
Implicit methods for the Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Yoon, S.; Kwak, D.
1990-01-01
Numerical solutions of the Navier-Stokes equations using explicit schemes can be obtained at the expense of efficiency. Conventional implicit methods which often achieve fast convergence rates suffer high cost per iteration. A new implicit scheme based on lower-upper factorization and symmetric Gauss-Seidel relaxation offers very low cost per iteration as well as fast convergence. High efficiency is achieved by accomplishing the complete vectorizability of the algorithm on oblique planes of sweep in three dimensions.
Numerical Simulation of 3-D Supersonic Viscous Flow in an Experimental MHD Channel
NASA Technical Reports Server (NTRS)
Kato, Hiromasa; Tannehill, John C.; Gupta, Sumeet; Mehta, Unmeel B.
2004-01-01
The 3-D supersonic viscous flow in an experimental MHD channel has been numerically simulated. The experimental MHD channel is currently in operation at NASA Ames Research Center. The channel contains a nozzle section, a center section, and an accelerator section where magnetic and electric fields can be imposed on the flow. In recent tests, velocity increases of up to 40% have been achieved in the accelerator section. The flow in the channel is numerically computed using a new 3-D parabolized Navier-Stokes (PNS) algorithm that has been developed to efficiently compute MHD flows in the low magnetic Reynolds number regime. The MHD effects are modeled by introducing source terms into the PNS equations which can then be solved in a very e5uent manner. To account for upstream (elliptic) effects, the flowfield can be computed using multiple streamwise sweeps with an iterated PNS algorithm. The new algorithm has been used to compute two test cases that match the experimental conditions. In both cases, magnetic and electric fields are applied to the flow. The computed results are in good agreement with the available experimental data.
Numerical Analysis of Ginzburg-Landau Models for Superconductivity.
NASA Astrophysics Data System (ADS)
Coskun, Erhan
Thin film conventional, as well as High T _{c} superconductors of various geometric shapes placed under both uniform and variable strength magnetic field are studied using the universially accepted macroscopic Ginzburg-Landau model. A series of new theoretical results concerning the properties of solution is presented using the semi -discrete time-dependent Ginzburg-Landau equations, staggered grid setup and natural boundary conditions. Efficient serial algorithms including a novel adaptive algorithm is developed and successfully implemented for solving the governing highly nonlinear parabolic system of equations. Refinement technique used in the adaptive algorithm is based on modified forward Euler method which was also developed by us to ease the restriction on time step size for stability considerations. Stability and convergence properties of forward and modified forward Euler schemes are studied. Numerical simulations of various recent physical experiments of technological importance such as vortes motion and pinning are performed. The numerical code for solving time-dependent Ginzburg-Landau equations is parallelized using BlockComm -Chameleon and PCN. The parallel code was run on the distributed memory multiprocessors intel iPSC/860, IBM-SP1 and cluster of Sun Sparc workstations, all located at Mathematics and Computer Science Division, Argonne National Laboratory.
Efficient sequential and parallel algorithms for record linkage
Mamun, Abdullah-Al; Mi, Tian; Aseltine, Robert; Rajasekaran, Sanguthevar
2014-01-01
Background and objective Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Methods Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Results Our sequential and parallel algorithms have been tested on a real dataset of 1 083 878 records and synthetic datasets ranging in size from 50 000 to 9 000 000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). Conclusions We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm. PMID:24154837
NAS Applications and Advanced Algorithms
NASA Technical Reports Server (NTRS)
Bailey, David H.; Biswas, Rupak; VanDerWijngaart, Rob; Kutler, Paul (Technical Monitor)
1997-01-01
This paper examines the applications most commonly run on the supercomputers at the Numerical Aerospace Simulation (NAS) facility. It analyzes the extent to which such applications are fundamentally oriented to vector computers, and whether or not they can be efficiently implemented on hierarchical memory machines, such as systems with cache memories and highly parallel, distributed memory systems.
Bayesian cloud detection for MERIS, AATSR, and their combination
NASA Astrophysics Data System (ADS)
Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.
2014-11-01
A broad range of different of Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud masks were designed to be numerically efficient and suited for the processing of large amounts of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient amounts of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and on setting up new classification schemes based on manually classified scenes.
Bayesian cloud detection for MERIS, AATSR, and their combination
NASA Astrophysics Data System (ADS)
Hollstein, A.; Fischer, J.; Carbajal Henken, C.; Preusker, R.
2015-04-01
A broad range of different of Bayesian cloud detection schemes is applied to measurements from the Medium Resolution Imaging Spectrometer (MERIS), the Advanced Along-Track Scanning Radiometer (AATSR), and their combination. The cloud detection schemes were designed to be numerically efficient and suited for the processing of large numbers of data. Results from the classical and naive approach to Bayesian cloud masking are discussed for MERIS and AATSR as well as for their combination. A sensitivity study on the resolution of multidimensional histograms, which were post-processed by Gaussian smoothing, shows how theoretically insufficient numbers of truth data can be used to set up accurate classical Bayesian cloud masks. Sets of exploited features from single and derived channels are numerically optimized and results for naive and classical Bayesian cloud masks are presented. The application of the Bayesian approach is discussed in terms of reproducing existing algorithms, enhancing existing algorithms, increasing the robustness of existing algorithms, and on setting up new classification schemes based on manually classified scenes.
Optimal design of solidification processes
NASA Technical Reports Server (NTRS)
Dantzig, Jonathan A.; Tortorelli, Daniel A.
1991-01-01
An optimal design algorithm is presented for the analysis of general solidification processes, and is demonstrated for the growth of GaAs crystals in a Bridgman furnace. The system is optimal in the sense that the prespecified temperature distribution in the solidifying materials is obtained to maximize product quality. The optimization uses traditional numerical programming techniques which require the evaluation of cost and constraint functions and their sensitivities. The finite element method is incorporated to analyze the crystal solidification problem, evaluate the cost and constraint functions, and compute the sensitivities. These techniques are demonstrated in the crystal growth application by determining an optimal furnace wall temperature distribution to obtain the desired temperature profile in the crystal, and hence to maximize the crystal's quality. Several numerical optimization algorithms are studied to determine the proper convergence criteria, effective 1-D search strategies, appropriate forms of the cost and constraint functions, etc. In particular, we incorporate the conjugate gradient and quasi-Newton methods for unconstrained problems. The efficiency and effectiveness of each algorithm is presented in the example problem.
Computing row and column counts for sparse QR and LU factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, John R.; Li, Xiaoye S.; Ng, Esmond G.
2001-01-01
We present algorithms to determine the number of nonzeros in each row and column of the factors of a sparse matrix, for both the QR factorization and the LU factorization with partial pivoting. The algorithms use only the nonzero structure of the input matrix, and run in time nearly linear in the number of nonzeros in that matrix. They may be used to set up data structures or schedule parallel operations in advance of the numerical factorization. The row and column counts we compute are upper bounds on the actual counts. If the input matrix is strong Hall and theremore » is no coincidental numerical cancellation, the counts are exact for QR factorization and are the tightest bounds possible for LU factorization. These algorithms are based on our earlier work on computing row and column counts for sparse Cholesky factorization, plus an efficient method to compute the column elimination tree of a sparse matrix without explicitly forming the product of the matrix and its transpose.« less
An algorithm for the solution of dynamic linear programs
NASA Technical Reports Server (NTRS)
Psiaki, Mark L.
1989-01-01
The algorithm's objective is to efficiently solve Dynamic Linear Programs (DLP) by taking advantage of their special staircase structure. This algorithm constitutes a stepping stone to an improved algorithm for solving Dynamic Quadratic Programs, which, in turn, would make the nonlinear programming method of Successive Quadratic Programs more practical for solving trajectory optimization problems. The ultimate goal is to being trajectory optimization solution speeds into the realm of real-time control. The algorithm exploits the staircase nature of the large constraint matrix of the equality-constrained DLPs encountered when solving inequality-constrained DLPs by an active set approach. A numerically-stable, staircase QL factorization of the staircase constraint matrix is carried out starting from its last rows and columns. The resulting recursion is like the time-varying Riccati equation from multi-stage LQR theory. The resulting factorization increases the efficiency of all of the typical LP solution operations over that of a dense matrix LP code. At the same time numerical stability is ensured. The algorithm also takes advantage of dynamic programming ideas about the cost-to-go by relaxing active pseudo constraints in a backwards sweeping process. This further decreases the cost per update of the LP rank-1 updating procedure, although it may result in more changes of the active set that if pseudo constraints were relaxed in a non-stagewise fashion. The usual stability of closed-loop Linear/Quadratic optimally-controlled systems, if it carries over to strictly linear cost functions, implies that the saving due to reduced factor update effort may outweigh the cost of an increased number of updates. An aerospace example is presented in which a ground-to-ground rocket's distance is maximized. This example demonstrates the applicability of this class of algorithms to aerospace guidance. It also sheds light on the efficacy of the proposed pseudo constraint relaxation scheme.
Portfolio optimization by using linear programing models based on genetic algorithm
NASA Astrophysics Data System (ADS)
Sukono; Hidayat, Y.; Lesmana, E.; Putra, A. S.; Napitupulu, H.; Supian, S.
2018-01-01
In this paper, we discussed the investment portfolio optimization using linear programming model based on genetic algorithms. It is assumed that the portfolio risk is measured by absolute standard deviation, and each investor has a risk tolerance on the investment portfolio. To complete the investment portfolio optimization problem, the issue is arranged into a linear programming model. Furthermore, determination of the optimum solution for linear programming is done by using a genetic algorithm. As a numerical illustration, we analyze some of the stocks traded on the capital market in Indonesia. Based on the analysis, it is shown that the portfolio optimization performed by genetic algorithm approach produces more optimal efficient portfolio, compared to the portfolio optimization performed by a linear programming algorithm approach. Therefore, genetic algorithms can be considered as an alternative on determining the investment portfolio optimization, particularly using linear programming models.
An efficient algorithm for global periodic orbits generation near irregular-shaped asteroids
NASA Astrophysics Data System (ADS)
Shang, Haibin; Wu, Xiaoyu; Ren, Yuan; Shan, Jinjun
2017-07-01
Periodic orbits (POs) play an important role in understanding dynamical behaviors around natural celestial bodies. In this study, an efficient algorithm was presented to generate the global POs around irregular-shaped uniformly rotating asteroids. The algorithm was performed in three steps, namely global search, local refinement, and model continuation. First, a mascon model with a low number of particles and optimized mass distribution was constructed to remodel the exterior gravitational potential of the asteroid. Using this model, a multi-start differential evolution enhanced with a deflection strategy with strong global exploration and bypassing abilities was adopted. This algorithm can be regarded as a search engine to find multiple globally optimal regions in which potential POs were located. This was followed by applying a differential correction to locally refine global search solutions and generate the accurate POs in the mascon model in which an analytical Jacobian matrix was derived to improve convergence. Finally, the concept of numerical model continuation was introduced and used to convert the POs from the mascon model into a high-fidelity polyhedron model by sequentially correcting the initial states. The efficiency of the proposed algorithm was substantiated by computing the global POs around an elongated shoe-shaped asteroid 433 Eros. Various global POs with different topological structures in the configuration space were successfully located. Specifically, the proposed algorithm was generic and could be conveniently extended to explore periodic motions in other gravitational systems.
Random element method for numerical modeling of diffusional processes
NASA Technical Reports Server (NTRS)
Ghoniem, A. F.; Oppenheim, A. K.
1982-01-01
The random element method is a generalization of the random vortex method that was developed for the numerical modeling of momentum transport processes as expressed in terms of the Navier-Stokes equations. The method is based on the concept that random walk, as exemplified by Brownian motion, is the stochastic manifestation of diffusional processes. The algorithm based on this method is grid-free and does not require the diffusion equation to be discritized over a mesh, it is thus devoid of numerical diffusion associated with finite difference methods. Moreover, the algorithm is self-adaptive in space and explicit in time, resulting in an improved numerical resolution of gradients as well as a simple and efficient computational procedure. The method is applied here to an assortment of problems of diffusion of momentum and energy in one-dimension as well as heat conduction in two-dimensions in order to assess its validity and accuracy. The numerical solutions obtained are found to be in good agreement with exact solution except for a statistical error introduced by using a finite number of elements, the error can be reduced by increasing the number of elements or by using ensemble averaging over a number of solutions.
Numerical operator calculus in higher dimensions
Beylkin, Gregory; Mohlenkamp, Martin J.
2002-01-01
When an algorithm in dimension one is extended to dimension d, in nearly every case its computational cost is taken to the power d. This fundamental difficulty is the single greatest impediment to solving many important problems and has been dubbed the curse of dimensionality. For numerical analysis in dimension d, we propose to use a representation for vectors and matrices that generalizes separation of variables while allowing controlled accuracy. Basic linear algebra operations can be performed in this representation using one-dimensional operations, thus bypassing the exponential scaling with respect to the dimension. Although not all operators and algorithms may be compatible with this representation, we believe that many of the most important ones are. We prove that the multiparticle Schrödinger operator, as well as the inverse Laplacian, can be represented very efficiently in this form. We give numerical evidence to support the conjecture that eigenfunctions inherit this property by computing the ground-state eigenfunction for a simplified Schrödinger operator with 30 particles. We conjecture and provide numerical evidence that functions of operators inherit this property, in which case numerical operator calculus in higher dimensions becomes feasible. PMID:12140360
Li, X Y; Yang, G W; Zheng, D S; Guo, W S; Hung, W N N
2015-04-28
Genetic regulatory networks are the key to understanding biochemical systems. One condition of the genetic regulatory network under different living environments can be modeled as a synchronous Boolean network. The attractors of these Boolean networks will help biologists to identify determinant and stable factors. Existing methods identify attractors based on a random initial state or the entire state simultaneously. They cannot identify the fixed length attractors directly. The complexity of including time increases exponentially with respect to the attractor number and length of attractors. This study used the bounded model checking to quickly locate fixed length attractors. Based on the SAT solver, we propose a new algorithm for efficiently computing the fixed length attractors, which is more suitable for large Boolean networks and numerous attractors' networks. After comparison using the tool BooleNet, empirical experiments involving biochemical systems demonstrated the feasibility and efficiency of our approach.
NASA Astrophysics Data System (ADS)
Liao, Wei-Cheng; Hong, Mingyi; Liu, Ya-Feng; Luo, Zhi-Quan
2014-08-01
In a densely deployed heterogeneous network (HetNet), the number of pico/micro base stations (BS) can be comparable with the number of the users. To reduce the operational overhead of the HetNet, proper identification of the set of serving BSs becomes an important design issue. In this work, we show that by jointly optimizing the transceivers and determining the active set of BSs, high system resource utilization can be achieved with only a small number of BSs. In particular, we provide formulations and efficient algorithms for such joint optimization problem, under the following two common design criteria: i) minimization of the total power consumption at the BSs, and ii) maximization of the system spectrum efficiency. In both cases, we introduce a nonsmooth regularizer to facilitate the activation of the most appropriate BSs. We illustrate the efficiency and the efficacy of the proposed algorithms via extensive numerical simulations.
Magnetotelluric inversion via reverse time migration algorithm of seismic data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ha, Taeyoung; Shin, Changsoo
2007-07-01
We propose a new algorithm for two-dimensional magnetotelluric (MT) inversion. Our algorithm is an MT inversion based on the steepest descent method, borrowed from the backpropagation technique of seismic inversion or reverse time migration, introduced in the middle 1980s by Lailly and Tarantola. The steepest descent direction can be calculated efficiently by using the symmetry of numerical Green's function derived from a mixed finite element method proposed by Nedelec for Maxwell's equation, without calculating the Jacobian matrix explicitly. We construct three different objective functions by taking the logarithm of the complex apparent resistivity as introduced in the recent waveform inversionmore » algorithm by Shin and Min. These objective functions can be naturally separated into amplitude inversion, phase inversion and simultaneous inversion. We demonstrate our algorithm by showing three inversion results for synthetic data.« less
An efficient hybrid method for stochastic reaction-diffusion biochemical systems with delay
NASA Astrophysics Data System (ADS)
Sayyidmousavi, Alireza; Ilie, Silvana
2017-12-01
Many chemical reactions, such as gene transcription and translation in living cells, need a certain time to finish once they are initiated. Simulating stochastic models of reaction-diffusion systems with delay can be computationally expensive. In the present paper, a novel hybrid algorithm is proposed to accelerate the stochastic simulation of delayed reaction-diffusion systems. The delayed reactions may be of consuming or non-consuming delay type. The algorithm is designed for moderately stiff systems in which the events can be partitioned into slow and fast subsets according to their propensities. The proposed algorithm is applied to three benchmark problems and the results are compared with those of the delayed Inhomogeneous Stochastic Simulation Algorithm. The numerical results show that the new hybrid algorithm achieves considerable speed-up in the run time and very good accuracy.
Jacobi spectral Galerkin method for elliptic Neumann problems
NASA Astrophysics Data System (ADS)
Doha, E.; Bhrawy, A.; Abd-Elhameed, W.
2009-01-01
This paper is concerned with fast spectral-Galerkin Jacobi algorithms for solving one- and two-dimensional elliptic equations with homogeneous and nonhomogeneous Neumann boundary conditions. The paper extends the algorithms proposed by Shen (SIAM J Sci Comput 15:1489-1505, 1994) and Auteri et al. (J Comput Phys 185:427-444, 2003), based on Legendre polynomials, to Jacobi polynomials with arbitrary α and β. The key to the efficiency of our algorithms is to construct appropriate basis functions with zero slope at the endpoints, which lead to systems with sparse matrices for the discrete variational formulations. The direct solution algorithm developed for the homogeneous Neumann problem in two-dimensions relies upon a tensor product process. Nonhomogeneous Neumann data are accounted for by means of a lifting. Numerical results indicating the high accuracy and effectiveness of these algorithms are presented.
Hardware architecture design of image restoration based on time-frequency domain computation
NASA Astrophysics Data System (ADS)
Wen, Bo; Zhang, Jing; Jiao, Zipeng
2013-10-01
The image restoration algorithms based on time-frequency domain computation is high maturity and applied widely in engineering. To solve the high-speed implementation of these algorithms, the TFDC hardware architecture is proposed. Firstly, the main module is designed, by analyzing the common processing and numerical calculation. Then, to improve the commonality, the iteration control module is planed for iterative algorithms. In addition, to reduce the computational cost and memory requirements, the necessary optimizations are suggested for the time-consuming module, which include two-dimensional FFT/IFFT and the plural calculation. Eventually, the TFDC hardware architecture is adopted for hardware design of real-time image restoration system. The result proves that, the TFDC hardware architecture and its optimizations can be applied to image restoration algorithms based on TFDC, with good algorithm commonality, hardware realizability and high efficiency.
Computing return times or return periods with rare event algorithms
NASA Astrophysics Data System (ADS)
Lestang, Thibault; Ragone, Francesco; Bréhier, Charles-Edouard; Herbert, Corentin; Bouchet, Freddy
2018-04-01
The average time between two occurrences of the same event, referred to as its return time (or return period), is a useful statistical concept for practical applications. For instance insurances or public agencies may be interested by the return time of a 10 m flood of the Seine river in Paris. However, due to their scarcity, reliably estimating return times for rare events is very difficult using either observational data or direct numerical simulations. For rare events, an estimator for return times can be built from the extrema of the observable on trajectory blocks. Here, we show that this estimator can be improved to remain accurate for return times of the order of the block size. More importantly, we show that this approach can be generalised to estimate return times from numerical algorithms specifically designed to sample rare events. So far those algorithms often compute probabilities, rather than return times. The approach we propose provides a computationally extremely efficient way to estimate numerically the return times of rare events for a dynamical system, gaining several orders of magnitude of computational costs. We illustrate the method on two kinds of observables, instantaneous and time-averaged, using two different rare event algorithms, for a simple stochastic process, the Ornstein–Uhlenbeck process. As an example of realistic applications to complex systems, we finally discuss extreme values of the drag on an object in a turbulent flow.
NASA Astrophysics Data System (ADS)
Radev, Dimitar; Lokshina, Izabella
2010-11-01
The paper examines self-similar (or fractal) properties of real communication network traffic data over a wide range of time scales. These self-similar properties are very different from the properties of traditional models based on Poisson and Markov-modulated Poisson processes. Advanced fractal models of sequentional generators and fixed-length sequence generators, and efficient algorithms that are used to simulate self-similar behavior of IP network traffic data are developed and applied. Numerical examples are provided; and simulation results are obtained and analyzed.
Faster Heavy Ion Transport for HZETRN
NASA Technical Reports Server (NTRS)
Slaba, Tony C.
2013-01-01
The deterministic particle transport code HZETRN was developed to enable fast and accurate space radiation transport through materials. As more complex transport solutions are implemented for neutrons, light ions (Z < 2), mesons, and leptons, it is important to maintain overall computational efficiency. In this work, the heavy ion (Z > 2) transport algorithm in HZETRN is reviewed, and a simple modification is shown to provide an approximate 5x decrease in execution time for galactic cosmic ray transport. Convergence tests and other comparisons are carried out to verify that numerical accuracy is maintained in the new algorithm.
Markov chain sampling of the O(n) loop models on the infinite plane
NASA Astrophysics Data System (ADS)
Herdeiro, Victor
2017-07-01
A numerical method was recently proposed in Herdeiro and Doyon [Phys. Rev. E 94, 043322 (2016), 10.1103/PhysRevE.94.043322] showing a precise sampling of the infinite plane two-dimensional critical Ising model for finite lattice subsections. The present note extends the method to a larger class of models, namely the O(n) loop gas models for n ∈(1 ,2 ] . We argue that even though the Gibbs measure is nonlocal, it is factorizable on finite subsections when sufficient information on the loops touching the boundaries is stored. Our results attempt to show that provided an efficient Markov chain mixing algorithm and an improved discrete lattice dilation procedure the planar limit of the O(n) models can be numerically studied with efficiency similar to the Ising case. This confirms that scale invariance is the only requirement for the present numerical method to work.
NASA Technical Reports Server (NTRS)
Carpenter, William C.
1991-01-01
Engineering optimization problems involve minimizing some function subject to constraints. In areas such as aircraft optimization, the constraint equations may be from numerous disciplines such as transfer of information between these disciplines and the optimization algorithm. They are also suited to problems which may require numerous re-optimizations such as in multi-objective function optimization or to problems where the design space contains numerous local minima, thus requiring repeated optimizations from different initial designs. Their use has been limited, however, by the fact that development of response surfaces randomly selected or preselected points in the design space. Thus, they have been thought to be inefficient compared to algorithms to the optimum solution. A development has taken place in the last several years which may effect the desirability of using response surfaces. It may be possible that artificial neural nets are more efficient in developing response surfaces than polynomial approximations which have been used in the past. This development is the concern of the work.
Tensor methodology and computational geometry in direct computational experiments in fluid mechanics
NASA Astrophysics Data System (ADS)
Degtyarev, Alexander; Khramushin, Vasily; Shichkina, Julia
2017-07-01
The paper considers a generalized functional and algorithmic construction of direct computational experiments in fluid dynamics. Notation of tensor mathematics is naturally embedded in the finite - element operation in the construction of numerical schemes. Large fluid particle, which have a finite size, its own weight, internal displacement and deformation is considered as an elementary computing object. Tensor representation of computational objects becomes strait linear and uniquely approximation of elementary volumes and fluid particles inside them. The proposed approach allows the use of explicit numerical scheme, which is an important condition for increasing the efficiency of the algorithms developed by numerical procedures with natural parallelism. It is shown that advantages of the proposed approach are achieved among them by considering representation of large particles of a continuous medium motion in dual coordinate systems and computing operations in the projections of these two coordinate systems with direct and inverse transformations. So new method for mathematical representation and synthesis of computational experiment based on large particle method is proposed.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing
Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...
2017-09-21
Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolutionmore » in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.« less
Scalable domain decomposition solvers for stochastic PDEs in high performance computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Desai, Ajit; Khalil, Mohammad; Pettit, Chris
Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolutionmore » in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.« less
An improved 3D MoF method based on analytical partial derivatives
NASA Astrophysics Data System (ADS)
Chen, Xiang; Zhang, Xiong
2016-12-01
MoF (Moment of Fluid) method is one of the most accurate approaches among various surface reconstruction algorithms. As other second order methods, MoF method needs to solve an implicit optimization problem to obtain the optimal approximate surface. Therefore, the partial derivatives of the objective function have to be involved during the iteration for efficiency and accuracy. However, to the best of our knowledge, the derivatives are currently estimated numerically by finite difference approximation because it is very difficult to obtain the analytical derivatives of the object function for an implicit optimization problem. Employing numerical derivatives in an iteration not only increase the computational cost, but also deteriorate the convergence rate and robustness of the iteration due to their numerical error. In this paper, the analytical first order partial derivatives of the objective function are deduced for 3D problems. The analytical derivatives can be calculated accurately, so they are incorporated into the MoF method to improve its accuracy, efficiency and robustness. Numerical studies show that by using the analytical derivatives the iterations are converged in all mixed cells with the efficiency improvement of 3 to 4 times.
Parallel Algorithms for Computational Models of Geophysical Systems
NASA Astrophysics Data System (ADS)
Carrillo Ledesma, A.; Herrera, I.; de la Cruz, L. M.; Hernández, G.; Grupo de Modelacion Matematica y Computacional
2013-05-01
Mathematical models of many systems of interest, including very important continuous systems of Earth Sciences and Engineering, lead to a great variety of partial differential equations (PDEs) whose solution methods are based on the computational processing of large-scale algebraic systems. Furthermore, the incredible expansion experienced by the existing computational hardware and software has made amenable to effective treatment problems of an ever increasing diversity and complexity, posed by scientific and engineering applications. Parallel computing is outstanding among the new computational tools and, in order to effectively use the most advanced computers available today, massively parallel software is required. Domain decomposition methods (DDMs) have been developed precisely for effectively treating PDEs in paralle. Ideally, the main objective of domain decomposition research is to produce algorithms capable of 'obtaining the global solution by exclusively solving local problems', but up-to-now this has only been an aspiration; that is, a strong desire for achieving such a property and so we call it 'the DDM-paradigm'. In recent times, numerically competitive DDM-algorithms are non-overlapping, preconditioned and necessarily incorporate constraints which pose an additional challenge for achieving the DDM-paradigm. Recently a group of four algorithms, referred to as the 'DVS-algorithms', which fulfill the DDM-paradigm, was developed. To derive them a new discretization method, which uses a non-overlapping system of nodes (the derived-nodes), was introduced. This discretization procedure can be applied to any boundary-value problem, or system of such equations. In turn, the resulting system of discrete equations can be treated using any available DDM-algorithm. In particular, two of the four DVS-algorithms mentioned above were obtained by application of the well-known and very effective algorithms BDDC and FETI-DP; these will be referred to as the DVS-BDDC and DVS-FETI-DP algorithms. The other two, which will be referred to as the DVS-PRIMAL and DVS-DUAL algorithms, were obtained by application of two new algorithms that had not been previously reported in the literature. As said before, the four DVS-algorithms constitute a group of preconditioned and constrained algorithms that, for the first time, fulfill the DDM-paradigm. Both, BDDC and FETI-DP, are very well-known; and both are highly efficient. Recently, it was established that these two methods are closely related and its numerical performance is quite similar. On the other hand, through numerical experiments, we have established that the numerical performances of each one of the members of DVS-algorithms group (DVS-BDDC, DVS-FETI-DP, DVS-PRIMAL and DVS-DUAL) are very similar too. Furthermore, we have carried out comparisons of the performances of the standard versions of BDDC and FETI-DP with DVS-BDDC and DVS-FETI-DP, and in all such numerical experiments the DVS algorithms have performed significantly better.
Numeric Design and Performance Analysis of Solid Oxide Fuel Cell -- Gas Turbine Hybrids on Aircraft
NASA Astrophysics Data System (ADS)
Hovakimyan, Gevorg
The aircraft industry benefits greatly from small improvements in aircraft component design. One possible area of improvement is in the Auxiliary Power Unit (APU). Modern aircraft APUs are gas turbines located in the tail section of the aircraft that generate additional power when needed. Unfortunately the efficiency of modern aircraft APUs is low. Solid Oxide Fuel Cell/Gas Turbine (SOFC/GT) hybrids are one possible alternative for replacing modern gas turbine APUs. This thesis investigates the feasibility of replacing conventional gas turbine APUs with SOFC/GT APUs on aircraft. An SOFC/GT design algorithm was created in order to determine the specifications of an SOFC/GT APU. The design algorithm is comprised of several integrated modules which together model the characteristics of each component of the SOFC/GT system. Given certain overall inputs, through numerical analysis, the algorithm produces an SOFC/GT APU, optimized for specific power and efficiency, capable of performing to the required specifications. The SOFC/GT design is then input into a previously developed quasi-dynamic SOFC/GT model to determine its load following capabilities over an aircraft flight cycle. Finally an aircraft range study is conducted to determine the feasibility of the SOFC/GT APU as a replacement for the conventional gas turbine APU. The design results show that SOFC/GT APUs have lower specific power than GT systems, but have much higher efficiencies. Moreover, the dynamic simulation results show that SOFC/GT APUs are capable of following modern flight loads. Finally, the range study determined that SOFC/GT APUs are more attractive over conventional APUs for longer range aircraft.
Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics
NASA Technical Reports Server (NTRS)
Goodrich, John W.; Dyson, Rodger W.
1999-01-01
The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems is being done to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new, algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that have resulted from this work. A review of computational aeroacoustics has recently been given by Lele.
Topography Modeling in Atmospheric Flows Using the Immersed Boundary Method
NASA Technical Reports Server (NTRS)
Ackerman, A. S.; Senocak, I.; Mansour, N. N.; Stevens, D. E.
2004-01-01
Numerical simulation of flow over complex geometry needs accurate and efficient computational methods. Different techniques are available to handle complex geometry. The unstructured grid and multi-block body-fitted grid techniques have been widely adopted for complex geometry in engineering applications. In atmospheric applications, terrain fitted single grid techniques have found common use. Although these are very effective techniques, their implementation, coupling with the flow algorithm, and efficient parallelization of the complete method are more involved than a Cartesian grid method. The grid generation can be tedious and one needs to pay special attention in numerics to handle skewed cells for conservation purposes. Researchers have long sought for alternative methods to ease the effort involved in simulating flow over complex geometry.
How to estimate the 3D power spectrum of the Lyman-α forest
NASA Astrophysics Data System (ADS)
Font-Ribera, Andreu; McDonald, Patrick; Slosar, Anže
2018-01-01
We derive and numerically implement an algorithm for estimating the 3D power spectrum of the Lyman-α (Lyα) forest flux fluctuations. The algorithm exploits the unique geometry of Lyα forest data to efficiently measure the cross-spectrum between lines of sight as a function of parallel wavenumber, transverse separation and redshift. We start by approximating the global covariance matrix as block-diagonal, where only pixels from the same spectrum are correlated. We then compute the eigenvectors of the derivative of the signal covariance with respect to cross-spectrum parameters, and project the inverse-covariance-weighted spectra onto them. This acts much like a radial Fourier transform over redshift windows. The resulting cross-spectrum inference is then converted into our final product, an approximation of the likelihood for the 3D power spectrum expressed as second order Taylor expansion around a fiducial model. We demonstrate the accuracy and scalability of the algorithm and comment on possible extensions. Our algorithm will allow efficient analysis of the upcoming Dark Energy Spectroscopic Instrument dataset.
How to estimate the 3D power spectrum of the Lyman-α forest
Font-Ribera, Andreu; McDonald, Patrick; Slosar, Anže
2018-01-02
Here, we derive and numerically implement an algorithm for estimating the 3D power spectrum of the Lyman-α (Lyα) forest flux fluctuations. The algorithm exploits the unique geometry of Lyα forest data to efficiently measure the cross-spectrum between lines of sight as a function of parallel wavenumber, transverse separation and redshift. We start by approximating the global covariance matrix as block-diagonal, where only pixels from the same spectrum are correlated. We then compute the eigenvectors of the derivative of the signal covariance with respect to cross-spectrum parameters, and project the inverse-covariance-weighted spectra onto them. This acts much like a radial Fouriermore » transform over redshift windows. The resulting cross-spectrum inference is then converted into our final product, an approximation of the likelihood for the 3D power spectrum expressed as second order Taylor expansion around a fiducial model. We demonstrate the accuracy and scalability of the algorithm and comment on possible extensions. Our algorithm will allow efficient analysis of the upcoming Dark Energy Spectroscopic Instrument dataset.« less
Biyikli, Emre; To, Albert C.
2015-01-01
A new topology optimization method called the Proportional Topology Optimization (PTO) is presented. As a non-sensitivity method, PTO is simple to understand, easy to implement, and is also efficient and accurate at the same time. It is implemented into two MATLAB programs to solve the stress constrained and minimum compliance problems. Descriptions of the algorithm and computer programs are provided in detail. The method is applied to solve three numerical examples for both types of problems. The method shows comparable efficiency and accuracy with an existing optimality criteria method which computes sensitivities. Also, the PTO stress constrained algorithm and minimum compliance algorithm are compared by feeding output from one algorithm to the other in an alternative manner, where the former yields lower maximum stress and volume fraction but higher compliance compared to the latter. Advantages and disadvantages of the proposed method and future works are discussed. The computer programs are self-contained and publicly shared in the website www.ptomethod.org. PMID:26678849
NASA Astrophysics Data System (ADS)
Pan, Xiao-Min; Wei, Jian-Gong; Peng, Zhen; Sheng, Xin-Qing
2012-02-01
The interpolative decomposition (ID) is combined with the multilevel fast multipole algorithm (MLFMA), denoted by ID-MLFMA, to handle multiscale problems. The ID-MLFMA first generates ID levels by recursively dividing the boxes at the finest MLFMA level into smaller boxes. It is specifically shown that near-field interactions with respect to the MLFMA, in the form of the matrix vector multiplication (MVM), are efficiently approximated at the ID levels. Meanwhile, computations on far-field interactions at the MLFMA levels remain unchanged. Only a small portion of matrix entries are required to approximate coupling among well-separated boxes at the ID levels, and these submatrices can be filled without computing the complete original coupling matrix. It follows that the matrix filling in the ID-MLFMA becomes much less expensive. The memory consumed is thus greatly reduced and the MVM is accelerated as well. Several factors that may influence the accuracy, efficiency and reliability of the proposed ID-MLFMA are investigated by numerical experiments. Complex targets are calculated to demonstrate the capability of the ID-MLFMA algorithm.
How to estimate the 3D power spectrum of the Lyman-α forest
DOE Office of Scientific and Technical Information (OSTI.GOV)
Font-Ribera, Andreu; McDonald, Patrick; Slosar, Anže
Here, we derive and numerically implement an algorithm for estimating the 3D power spectrum of the Lyman-α (Lyα) forest flux fluctuations. The algorithm exploits the unique geometry of Lyα forest data to efficiently measure the cross-spectrum between lines of sight as a function of parallel wavenumber, transverse separation and redshift. We start by approximating the global covariance matrix as block-diagonal, where only pixels from the same spectrum are correlated. We then compute the eigenvectors of the derivative of the signal covariance with respect to cross-spectrum parameters, and project the inverse-covariance-weighted spectra onto them. This acts much like a radial Fouriermore » transform over redshift windows. The resulting cross-spectrum inference is then converted into our final product, an approximation of the likelihood for the 3D power spectrum expressed as second order Taylor expansion around a fiducial model. We demonstrate the accuracy and scalability of the algorithm and comment on possible extensions. Our algorithm will allow efficient analysis of the upcoming Dark Energy Spectroscopic Instrument dataset.« less
Experience with a Genetic Algorithm Implemented on a Multiprocessor Computer
NASA Technical Reports Server (NTRS)
Plassman, Gerald E.; Sobieszczanski-Sobieski, Jaroslaw
2000-01-01
Numerical experiments were conducted to find out the extent to which a Genetic Algorithm (GA) may benefit from a multiprocessor implementation, considering, on one hand, that analyses of individual designs in a population are independent of each other so that they may be executed concurrently on separate processors, and, on the other hand, that there are some operations in a GA that cannot be so distributed. The algorithm experimented with was based on a gaussian distribution rather than bit exchange in the GA reproductive mechanism, and the test case was a hub frame structure of up to 1080 design variables. The experimentation engaging up to 128 processors confirmed expectations of radical elapsed time reductions comparing to a conventional single processor implementation. It also demonstrated that the time spent in the non-distributable parts of the algorithm and the attendant cross-processor communication may have a very detrimental effect on the efficient utilization of the multiprocessor machine and on the number of processors that can be used effectively in a concurrent manner. Three techniques were devised and tested to mitigate that effect, resulting in efficiency increasing to exceed 99 percent.
Evaluation of on-line pulse control for vibration suppression in flexible spacecraft
NASA Technical Reports Server (NTRS)
Masri, Sami F.
1987-01-01
A numerical simulation was performed, by means of a large-scale finite element code capable of handling large deformations and/or nonlinear behavior, to investigate the suitability of the nonlinear pulse-control algorithm to suppress the vibrations induced in the Spacecraft Control Laboratory Experiment (SCOLE) components under realistic maneuvers. Among the topics investigated were the effects of various control parameters on the efficiency and robustness of the vibration control algorithm. Advanced nonlinear control techniques were applied to an idealized model of some of the SCOLE components to develop an efficient algorithm to determine the optimal locations of point actuators, considering the hardware on the SCOLE project as distributed in nature. The control was obtained from a quadratic optimization criterion, given in terms of the state variables of the distributed system. An experimental investigation was performed on a model flexible structure resembling the essential features of the SCOLE components, and electrodynamic and electrohydraulic actuators were used to investigate the applicability of the control algorithm with such devices in addition to mass-ejection pulse generators using compressed air.
Numerical Simulation of the Flow over a Segment-Conical Body on the Basis of Reynolds Equations
NASA Astrophysics Data System (ADS)
Egorov, I. V.; Novikov, A. V.; Palchekovskaya, N. V.
2018-01-01
Numerical simulation was used to study the 3D supersonic flow over a segment-conical body similar in shape to the ExoMars space vehicle. The nonmonotone behavior of the normal force acting on the body placed in a supersonic gas flow was analyzed depending on the angle of attack. The simulation was based on the numerical solution of the unsteady Reynolds-averaged Navier-Stokes equations with a two-parameter differential turbulence model. The solution of the problem was obtained using the in-house solver HSFlow with an efficient parallel algorithm intended for multiprocessor super computers.
A hybrid framework for coupling arbitrary summation-by-parts schemes on general meshes
NASA Astrophysics Data System (ADS)
Lundquist, Tomas; Malan, Arnaud; Nordström, Jan
2018-06-01
We develop a general interface procedure to couple both structured and unstructured parts of a hybrid mesh in a non-collocated, multi-block fashion. The target is to gain optimal computational efficiency in fluid dynamics simulations involving complex geometries. While guaranteeing stability, the proposed procedure is optimized for accuracy and requires minimal algorithmic modifications to already existing schemes. Initial numerical investigations confirm considerable efficiency gains compared to non-hybrid calculations of up to an order of magnitude.
An efficient transport solver for tokamak plasmas
Park, Jin Myung; Murakami, Masanori; St. John, H. E.; ...
2017-01-03
A simple approach to efficiently solve a coupled set of 1-D diffusion-type transport equations with a stiff transport model for tokamak plasmas is presented based on the 4th order accurate Interpolated Differential Operator scheme along with a nonlinear iteration method derived from a root-finding algorithm. Here, numerical tests using the Trapped Gyro-Landau-Fluid model show that the presented high order method provides an accurate transport solution using a small number of grid points with robust nonlinear convergence.
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Youngblood, John N.; Saha, Aindam
1987-01-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carroll, C.C.; Youngblood, J.N.; Saha, A.
1987-12-01
Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processingmore » elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.« less
Multidimensional, fully implicit, exactly conserving electromagnetic particle-in-cell simulations
NASA Astrophysics Data System (ADS)
Chacon, Luis
2015-09-01
We discuss a new, conservative, fully implicit 2D-3V particle-in-cell algorithm for non-radiative, electromagnetic kinetic plasma simulations, based on the Vlasov-Darwin model. Unlike earlier linearly implicit PIC schemes and standard explicit PIC schemes, fully implicit PIC algorithms are unconditionally stable and allow exact discrete energy and charge conservation. This has been demonstrated in 1D electrostatic and electromagnetic contexts. In this study, we build on these recent algorithms to develop an implicit, orbit-averaged, time-space-centered finite difference scheme for the Darwin field and particle orbit equations for multiple species in multiple dimensions. The Vlasov-Darwin model is very attractive for PIC simulations because it avoids radiative noise issues in non-radiative electromagnetic regimes. The algorithm conserves global energy, local charge, and particle canonical-momentum exactly, even with grid packing. The nonlinear iteration is effectively accelerated with a fluid preconditioner, which allows efficient use of large timesteps, O(√{mi/me}c/veT) larger than the explicit CFL. In this presentation, we will introduce the main algorithmic components of the approach, and demonstrate the accuracy and efficiency properties of the algorithm with various numerical experiments in 1D and 2D. Support from the LANL LDRD program and the DOE-SC ASCR office.
NASA Astrophysics Data System (ADS)
Qyyum, Muhammad Abdul; Long, Nguyen Van Duc; Minh, Le Quang; Lee, Moonyong
2018-01-01
Design optimization of the single mixed refrigerant (SMR) natural gas liquefaction (LNG) process involves highly non-linear interactions between decision variables, constraints, and the objective function. These non-linear interactions lead to an irreversibility, which deteriorates the energy efficiency of the LNG process. In this study, a simple and highly efficient hybrid modified coordinate descent (HMCD) algorithm was proposed to cope with the optimization of the natural gas liquefaction process. The single mixed refrigerant process was modeled in Aspen Hysys® and then connected to a Microsoft Visual Studio environment. The proposed optimization algorithm provided an improved result compared to the other existing methodologies to find the optimal condition of the complex mixed refrigerant natural gas liquefaction process. By applying the proposed optimization algorithm, the SMR process can be designed with the 0.2555 kW specific compression power which is equivalent to 44.3% energy saving as compared to the base case. Furthermore, in terms of coefficient of performance (COP), it can be enhanced up to 34.7% as compared to the base case. The proposed optimization algorithm provides a deep understanding of the optimization of the liquefaction process in both technical and numerical perspectives. In addition, the HMCD algorithm can be employed to any mixed refrigerant based liquefaction process in the natural gas industry.
Leong, Kah Huo; Abdul-Rahman, Hamzah; Wang, Chen; Onn, Chiu Chuen
2016-01-01
Railway and metro transport systems (RS) are becoming one of the popular choices of transportation among people, especially those who live in urban cities. Urbanization and increasing population due to rapid development of economy in many cities are leading to a bigger demand for urban rail transit. Despite being a popular variant of Traveling Salesman Problem (TSP), it appears that the universal formula or techniques to solve the problem are yet to be found. This paper aims to develop an optimization algorithm for optimum route selection to multiple destinations in RS before returning to the starting point. Bee foraging behaviour is examined to generate a reliable algorithm in railway TSP. The algorithm is then verified by comparing the results with the exact solutions in 10 test cases, and a numerical case study is designed to demonstrate the application with large size sample. It is tested to be efficient and effective in railway route planning as the tour can be completed within a certain period of time by using minimal resources. The findings further support the reliability of the algorithm and capability to solve the problems with different complexity. This algorithm can be used as a method to assist business practitioners making better decision in route planning. PMID:27930659
Leong, Kah Huo; Abdul-Rahman, Hamzah; Wang, Chen; Onn, Chiu Chuen; Loo, Siaw-Chuing
2016-01-01
Railway and metro transport systems (RS) are becoming one of the popular choices of transportation among people, especially those who live in urban cities. Urbanization and increasing population due to rapid development of economy in many cities are leading to a bigger demand for urban rail transit. Despite being a popular variant of Traveling Salesman Problem (TSP), it appears that the universal formula or techniques to solve the problem are yet to be found. This paper aims to develop an optimization algorithm for optimum route selection to multiple destinations in RS before returning to the starting point. Bee foraging behaviour is examined to generate a reliable algorithm in railway TSP. The algorithm is then verified by comparing the results with the exact solutions in 10 test cases, and a numerical case study is designed to demonstrate the application with large size sample. It is tested to be efficient and effective in railway route planning as the tour can be completed within a certain period of time by using minimal resources. The findings further support the reliability of the algorithm and capability to solve the problems with different complexity. This algorithm can be used as a method to assist business practitioners making better decision in route planning.
Algorithm of composing the schedule of construction and installation works
NASA Astrophysics Data System (ADS)
Nehaj, Rustam; Molotkov, Georgij; Rudchenko, Ivan; Grinev, Anatolij; Sekisov, Aleksandr
2017-10-01
An algorithm for scheduling works is developed, in which the priority of the work corresponds to the total weight of the subordinate works, the vertices of the graph, and it is proved that for graphs of the tree type the algorithm is optimal. An algorithm is synthesized to reduce the search for solutions when drawing up schedules of construction and installation works, allocating a subset with the optimal solution of the problem of the minimum power, which is determined by the structure of its initial data and numerical values. An algorithm for scheduling construction and installation work is developed, taking into account the schedule for the movement of brigades, which is characterized by the possibility to efficiently calculate the values of minimizing the time of work performance by the parameters of organizational and technological reliability through the use of the branch and boundary method. The program of the computational algorithm was compiled in the MatLAB-2008 program. For the initial data of the matrix, random numbers were taken, uniformly distributed in the range from 1 to 100. It takes 0.5; 2.5; 7.5; 27 minutes to solve the problem. Thus, the proposed method for estimating the lower boundary of the solution is sufficiently accurate and allows efficient solution of the minimax task of scheduling construction and installation works.
NASA Astrophysics Data System (ADS)
Quy Muoi, Pham; Nho Hào, Dinh; Sahoo, Sujit Kumar; Tang, Dongliang; Cong, Nguyen Huu; Dang, Cuong
2018-05-01
In this paper, we study a gradient-type method and a semismooth Newton method for minimization problems in regularizing inverse problems with nonnegative and sparse solutions. We propose a special penalty functional forcing the minimizers of regularized minimization problems to be nonnegative and sparse, and then we apply the proposed algorithms in a practical the problem. The strong convergence of the gradient-type method and the local superlinear convergence of the semismooth Newton method are proven. Then, we use these algorithms for the phase retrieval problem and illustrate their efficiency in numerical examples, particularly in the practical problem of optical imaging through scattering media where all the noises from experiment are presented.
Fuzzy multi-objective chance-constrained programming model for hazardous materials transportation
NASA Astrophysics Data System (ADS)
Du, Jiaoman; Yu, Lean; Li, Xiang
2016-04-01
Hazardous materials transportation is an important and hot issue of public safety. Based on the shortest path model, this paper presents a fuzzy multi-objective programming model that minimizes the transportation risk to life, travel time and fuel consumption. First, we present the risk model, travel time model and fuel consumption model. Furthermore, we formulate a chance-constrained programming model within the framework of credibility theory, in which the lengths of arcs in the transportation network are assumed to be fuzzy variables. A hybrid intelligent algorithm integrating fuzzy simulation and genetic algorithm is designed for finding a satisfactory solution. Finally, some numerical examples are given to demonstrate the efficiency of the proposed model and algorithm.
Optimized random phase only holograms.
Zea, Alejandro Velez; Barrera Ramirez, John Fredy; Torroba, Roberto
2018-02-15
We propose a simple and efficient technique capable of generating Fourier phase only holograms with a reconstruction quality similar to the results obtained with the Gerchberg-Saxton (G-S) algorithm. Our proposal is to use the traditional G-S algorithm to optimize a random phase pattern for the resolution, pixel size, and target size of the general optical system without any specific amplitude data. This produces an optimized random phase (ORAP), which is used for fast generation of phase only holograms of arbitrary amplitude targets. This ORAP needs to be generated only once for a given optical system, avoiding the need for costly iterative algorithms for each new target. We show numerical and experimental results confirming the validity of the proposal.
Multigrid solutions to quasi-elliptic schemes
NASA Technical Reports Server (NTRS)
Brandt, A.; Taasan, S.
1985-01-01
Quasi-elliptic schemes arise from central differencing or finite element discretization of elliptic systems with odd order derivatives on non-staggered grids. They are somewhat unstable and less accurate then corresponding staggered-grid schemes. When usual multigrid solvers are applied to them, the asymptotic algebraic convergence is necessarily slow. Nevertheless, it is shown by mode analyses and numerical experiments that the usual FMG algorithm is very efficient in solving quasi-elliptic equations to the level of truncation errors. Also, a new type of multigrid algorithm is presented, mode analyzed and tested, for which even the asymptotic algebraic convergence is fast. The essence of that algorithm is applicable to other kinds of problems, including highly indefinite ones.
Multigrid solutions to quasi-elliptic schemes
NASA Technical Reports Server (NTRS)
Brandt, A.; Taasan, S.
1985-01-01
Quasi-elliptic schemes arise from central differencing or finite element discretization of elliptic systems with odd order derivatives on non-staggered grids. They are somewhat unstable and less accurate than corresponding staggered-grid schemes. When usual multigrid solvers are applied to them, the asymptotic algebraic convergence is necessarily slow. Nevertheless, it is shown by mode analyses and numerical experiments that the usual FMG algorithm is very efficient in solving quasi-elliptic equations to the level of truncation errors. Also, a new type of multigrid algorithm is presented, mode analyzed and tested, for which even the asymptotic algebraic convergence is fast. The essence of that algorithm is applicable to other kinds of problems, including highly indefinite ones.
Time-frequency analysis of acoustic scattering from elastic objects
NASA Astrophysics Data System (ADS)
Yen, Nai-Chyuan; Dragonette, Louis R.; Numrich, Susan K.
1990-06-01
A time-frequency analysis of acoustic scattering from elastic objects was carried out using the time-frequency representation based on a modified version of the Wigner distribution function (WDF) algorithm. A simple and efficient processing algorithm was developed, which provides meaningful interpretation of the scattering physics. The time and frequency representation derived from the WDF algorithm was further reduced to a display which is a skeleton plot, called a vein diagram, that depicts the essential features of the form function. The physical parameters of the scatterer are then extracted from this diagram with the proper interpretation of the scattering phenomena. Several examples, based on data obtained from numerically simulated models and laboratory measurements for elastic spheres and shells, are used to illustrate the capability and proficiency of the algorithm.
Huang, Yu; Guo, Feng; Li, Yongling; Liu, Yufeng
2015-01-01
Parameter estimation for fractional-order chaotic systems is an important issue in fractional-order chaotic control and synchronization and could be essentially formulated as a multidimensional optimization problem. A novel algorithm called quantum parallel particle swarm optimization (QPPSO) is proposed to solve the parameter estimation for fractional-order chaotic systems. The parallel characteristic of quantum computing is used in QPPSO. This characteristic increases the calculation of each generation exponentially. The behavior of particles in quantum space is restrained by the quantum evolution equation, which consists of the current rotation angle, individual optimal quantum rotation angle, and global optimal quantum rotation angle. Numerical simulation based on several typical fractional-order systems and comparisons with some typical existing algorithms show the effectiveness and efficiency of the proposed algorithm. PMID:25603158
A soft computing-based approach to optimise queuing-inventory control problem
NASA Astrophysics Data System (ADS)
Alaghebandha, Mohammad; Hajipour, Vahid
2015-04-01
In this paper, a multi-product continuous review inventory control problem within batch arrival queuing approach (MQr/M/1) is developed to find the optimal quantities of maximum inventory. The objective function is to minimise summation of ordering, holding and shortage costs under warehouse space, service level and expected lost-sales shortage cost constraints from retailer and warehouse viewpoints. Since the proposed model is Non-deterministic Polynomial-time hard, an efficient imperialist competitive algorithm (ICA) is proposed to solve the model. To justify proposed ICA, both ganetic algorithm and simulated annealing algorithm are utilised. In order to determine the best value of algorithm parameters that result in a better solution, a fine-tuning procedure is executed. Finally, the performance of the proposed ICA is analysed using some numerical illustrations.
Variational data assimilation for the initial-value dynamo problem.
Li, Kuan; Jackson, Andrew; Livermore, Philip W
2011-11-01
The secular variation of the geomagnetic field as observed at the Earth's surface results from the complex magnetohydrodynamics taking place in the fluid core of the Earth. One way to analyze this system is to use the data in concert with an underlying dynamical model of the system through the technique of variational data assimilation, in much the same way as is employed in meteorology and oceanography. The aim is to discover an optimal initial condition that leads to a trajectory of the system in agreement with observations. Taking the Earth's core to be an electrically conducting fluid sphere in which convection takes place, we develop the continuous adjoint forms of the magnetohydrodynamic equations that govern the dynamical system together with the corresponding numerical algorithms appropriate for a fully spectral method. These adjoint equations enable a computationally fast iterative improvement of the initial condition that determines the system evolution. The initial condition depends on the three dimensional form of quantities such as the magnetic field in the entire sphere. For the magnetic field, conservation of the divergence-free condition for the adjoint magnetic field requires the introduction of an adjoint pressure term satisfying a zero boundary condition. We thus find that solving the forward and adjoint dynamo system requires different numerical algorithms. In this paper, an efficient algorithm for numerically solving this problem is developed and tested for two illustrative problems in a whole sphere: one is a kinematic problem with prescribed velocity field, and the second is associated with the Hall-effect dynamo, exhibiting considerable nonlinearity. The algorithm exhibits reliable numerical accuracy and stability. Using both the analytical and the numerical techniques of this paper, the adjoint dynamo system can be solved directly with the same order of computational complexity as that required to solve the forward problem. These numerical techniques form a foundation for ultimate application to observations of the geomagnetic field over the time scale of centuries.
NASA Astrophysics Data System (ADS)
Feskov, Serguei V.; Ivanov, Anatoly I.
2018-03-01
An approach to the construction of diabatic free energy surfaces (FESs) for ultrafast electron transfer (ET) in a supramolecule with an arbitrary number of electron localization centers (redox sites) is developed, supposing that the reorganization energies for the charge transfers and shifts between all these centers are known. Dimensionality of the coordinate space required for the description of multistage ET in this supramolecular system is shown to be equal to N - 1, where N is the number of the molecular centers involved in the reaction. The proposed algorithm of FES construction employs metric properties of the coordinate space, namely, relation between the solvent reorganization energy and the distance between the two FES minima. In this space, the ET reaction coordinate zn n' associated with electron transfer between the nth and n'th centers is calculated through the projection to the direction, connecting the FES minima. The energy-gap reaction coordinates zn n' corresponding to different ET processes are not in general orthogonal so that ET between two molecular centers can create nonequilibrium distribution, not only along its own reaction coordinate but along other reaction coordinates too. This results in the influence of the preceding ET steps on the kinetics of the ensuing ET. It is important for the ensuing reaction to be ultrafast to proceed in parallel with relaxation along the ET reaction coordinates. Efficient algorithms for numerical simulation of multistage ET within the stochastic point-transition model are developed. The algorithms are based on the Brownian simulation technique with the recrossing-event detection procedure. The main advantages of the numerical method are (i) its computational complexity is linear with respect to the number of electronic states involved and (ii) calculations can be naturally parallelized up to the level of individual trajectories. The efficiency of the proposed approach is demonstrated for a model supramolecular system involving four redox centers.
An efficient numerical model for multicomponent compressible flow in fractured porous media
NASA Astrophysics Data System (ADS)
Zidane, Ali; Firoozabadi, Abbas
2014-12-01
An efficient and accurate numerical model for multicomponent compressible single-phase flow in fractured media is presented. The discrete-fracture approach is used to model the fractures where the fracture entities are described explicitly in the computational domain. We use the concept of cross flow equilibrium in the fractures. This will allow large matrix elements in the neighborhood of the fractures and considerable speed up of the algorithm. We use an implicit finite volume (FV) scheme to solve the species mass balance equation in the fractures. This step avoids the use of Courant-Freidricks-Levy (CFL) condition and contributes to significant speed up of the code. The hybrid mixed finite element method (MFE) is used to solve for the velocity in both the matrix and the fractures coupled with the discontinuous Galerkin (DG) method to solve the species transport equations in the matrix. Four numerical examples are presented to demonstrate the robustness and efficiency of the proposed model. We show that the combination of the fracture cross-flow equilibrium and the implicit composition calculation in the fractures increase the computational speed 20-130 times in 2D. In 3D, one may expect even a higher computational efficiency.
Event and Apparent Horizon Finders for 3 + 1 Numerical Relativity.
Thornburg, Jonathan
2007-01-01
Event and apparent horizons are key diagnostics for the presence and properties of black holes. In this article I review numerical algorithms and codes for finding event and apparent horizons in numerically-computed spacetimes, focusing on calculations done using the 3 + 1 ADM formalism. The event horizon of an asymptotically-flat spacetime is the boundary between those events from which a future-pointing null geodesic can reach future null infinity and those events from which no such geodesic exists. The event horizon is a (continuous) null surface in spacetime. The event horizon is defined nonlocally in time : it is a global property of the entire spacetime and must be found in a separate post-processing phase after all (or at least the nonstationary part) of spacetime has been numerically computed. There are three basic algorithms for finding event horizons, based on integrating null geodesics forwards in time, integrating null geodesics backwards in time, and integrating null surfaces backwards in time. The last of these is generally the most efficient and accurate. In contrast to an event horizon, an apparent horizon is defined locally in time in a spacelike slice and depends only on data in that slice, so it can be (and usually is) found during the numerical computation of a spacetime. A marginally outer trapped surface (MOTS) in a slice is a smooth closed 2-surface whose future-pointing outgoing null geodesics have zero expansion Θ. An apparent horizon is then defined as a MOTS not contained in any other MOTS. The MOTS condition is a nonlinear elliptic partial differential equation (PDE) for the surface shape, containing the ADM 3-metric, its spatial derivatives, and the extrinsic curvature as coefficients. Most "apparent horizon" finders actually find MOTSs. There are a large number of apparent horizon finding algorithms, with differing trade-offs between speed, robustness, accuracy, and ease of programming. In axisymmetry, shooting algorithms work well and are fairly easy to program. In slices with no continuous symmetries, spectral integral-iteration algorithms and elliptic-PDE algorithms are fast and accurate, but require good initial guesses to converge. In many cases, Schnetter's "pretracking" algorithm can greatly improve an elliptic-PDE algorithm's robustness. Flow algorithms are generally quite slow but can be very robust in their convergence. Minimization methods are slow and relatively inaccurate in the context of a finite differencing simulation, but in a spectral code they can be relatively faster and more robust.
Numerical Study of Boundary Layer Interaction with Shocks: Method Improvement and Test Computation
NASA Technical Reports Server (NTRS)
Adams, N. A.
1995-01-01
The objective is the development of a high-order and high-resolution method for the direct numerical simulation of shock turbulent-boundary-layer interaction. Details concerning the spatial discretization of the convective terms can be found in Adams and Shariff (1995). The computer code based on this method as introduced in Adams (1994) was formulated in Cartesian coordinates and thus has been limited to simple rectangular domains. For more general two-dimensional geometries, as a compression corner, an extension to generalized coordinates is necessary. To keep the requirements or limitations for grid generation low, the extended formulation should allow for non-orthogonal grids. Still, for simplicity and cost efficiency, periodicity can be assumed in one cross-flow direction. For easy vectorization, the compact-ENO coupling algorithm as used in Adams (1994) treated whole planes normal to the derivative direction with the ENO scheme whenever at least one point of this plane satisfied the detection criterion. This is apparently too restrictive for more general geometries and more complex shock patterns. Here we introduce a localized compact-ENO coupling algorithm, which is efficient as long as the overall number of grid points treated by the ENO scheme is small compared to the total number of grid points. Validation and test computations with the final code are performed to assess the efficiency and suitability of the computer code for the problems of interest. We define a set of parameters where a direct numerical simulation of a turbulent boundary layer along a compression corner with reasonably fine resolution is affordable.
Efficient numerical method of freeform lens design for arbitrary irradiance shaping
NASA Astrophysics Data System (ADS)
Wojtanowski, Jacek
2018-05-01
A computational method to design a lens with a flat entrance surface and a freeform exit surface that can transform a collimated, generally non-uniform input beam into a beam with a desired irradiance distribution of arbitrary shape is presented. The methodology is based on non-linear elliptic partial differential equations, known as Monge-Ampère PDEs. This paper describes an original numerical algorithm to solve this problem by applying the Gauss-Seidel method with simplified boundary conditions. A joint MATLAB-ZEMAX environment is used to implement and verify the method. To prove the efficiency of the proposed approach, an exemplary study where the designed lens is faced with the challenging illumination task is shown. An analysis of solution stability, iteration-to-iteration ray mapping evolution (attached in video format), depth of focus and non-zero étendue efficiency is performed.
NASA Astrophysics Data System (ADS)
Wu, Lianglong; Fu, Xiquan; Guo, Xing
2013-03-01
In this paper, we propose a modified adaptive algorithm (MAA) of dealing with the high chirp to efficiently simulate the propagation of chirped pulses along an optical fiber for the propagation distance shorter than the "temporal focal length". The basis of the MAA is that the chirp term of initial pulse is treated as the rapidly varying part by means of the idea of the slowly varying envelope approximation (SVEA). Numerical simulations show that the performance of the MAA is validated, and that the proposed method can decrease the number of sampling points by orders of magnitude. In addition, the computational efficiency of the MAA compared with the time-domain beam propagation method (BPM) can be enhanced with the increase of the chirp of initial pulse.
Efficient scheme for parametric fitting of data in arbitrary dimensions.
Pang, Ning-Ning; Tzeng, Wen-Jer; Kao, Hisen-Ching
2008-07-01
We propose an efficient scheme for parametric fitting expressed in terms of the Legendre polynomials. For continuous systems, our scheme is exact and the derived explicit expression is very helpful for further analytical studies. For discrete systems, our scheme is almost as accurate as the method of singular value decomposition. Through a few numerical examples, we show that our algorithm costs much less CPU time and memory space than the method of singular value decomposition. Thus, our algorithm is very suitable for a large amount of data fitting. In addition, the proposed scheme can also be used to extract the global structure of fluctuating systems. We then derive the exact relation between the correlation function and the detrended variance function of fluctuating systems in arbitrary dimensions and give a general scaling analysis.
NASA Astrophysics Data System (ADS)
Beyhaghi, Pooriya
2016-11-01
This work considers the problem of the efficient minimization of the infinite time average of a stationary ergodic process in the space of a handful of independent parameters which affect it. Problems of this class, derived from physical or numerical experiments which are sometimes expensive to perform, are ubiquitous in turbulence research. In such problems, any given function evaluation, determined with finite sampling, is associated with a quantifiable amount of uncertainty, which may be reduced via additional sampling. This work proposes the first algorithm of this type. Our algorithm remarkably reduces the overall cost of the optimization process for problems of this class. Further, under certain well-defined conditions, rigorous proof of convergence is established to the global minimum of the problem considered.
NASA Astrophysics Data System (ADS)
Shang, J. S.; Andrienko, D. A.; Huang, P. G.; Surzhikov, S. T.
2014-06-01
An efficient computational capability for nonequilibrium radiation simulation via the ray tracing technique has been accomplished. The radiative rate equation is iteratively coupled with the aerodynamic conservation laws including nonequilibrium chemical and chemical-physical kinetic models. The spectral properties along tracing rays are determined by a space partition algorithm of the nearest neighbor search process, and the numerical accuracy is further enhanced by a local resolution refinement using the Gauss-Lobatto polynomial. The interdisciplinary governing equations are solved by an implicit delta formulation through the diminishing residual approach. The axisymmetric radiating flow fields over the reentry RAM-CII probe have been simulated and verified with flight data and previous solutions by traditional methods. A computational efficiency gain nearly forty times is realized over that of the existing simulation procedures.
Phase transition in the countdown problem
NASA Astrophysics Data System (ADS)
Lacasa, Lucas; Luque, Bartolo
2012-07-01
We present a combinatorial decision problem, inspired by the celebrated quiz show called Countdown, that involves the computation of a given target number T from a set of k randomly chosen integers along with a set of arithmetic operations. We find that the probability of winning the game evidences a threshold phenomenon that can be understood in the terms of an algorithmic phase transition as a function of the set size k. Numerical simulations show that such probability sharply transitions from zero to one at some critical value of the control parameter, hence separating the algorithm's parameter space in different phases. We also find that the system is maximally efficient close to the critical point. We derive analytical expressions that match the numerical results for finite size and permit us to extrapolate the behavior in the thermodynamic limit.
NASA Astrophysics Data System (ADS)
Nemoto, Takahiro; Jack, Robert L.; Lecomte, Vivien
2017-03-01
We analyze large deviations of the time-averaged activity in the one-dimensional Fredrickson-Andersen model, both numerically and analytically. The model exhibits a dynamical phase transition, which appears as a singularity in the large deviation function. We analyze the finite-size scaling of this phase transition numerically, by generalizing an existing cloning algorithm to include a multicanonical feedback control: this significantly improves the computational efficiency. Motivated by these numerical results, we formulate an effective theory for the model in the vicinity of the phase transition, which accounts quantitatively for the observed behavior. We discuss potential applications of the numerical method and the effective theory in a range of more general contexts.
A numerical study of axisymmetric compressible non-isothermal and reactive swirling flow
NASA Astrophysics Data System (ADS)
Tavernetti, William E.; Hafez, Mohamed M.
2017-09-01
Non-linear dynamical phenomena in combustion processes is an active area of experimental and theoretical research. This is in large part due to increasingly strict environmental pressures to make gas turbine engines and industrial burners more efficient. Using numerical methods, for steady and unsteady confined and unconfined compressible flow, this study examines the modeling influence of compressibility for axisymmetric swirling flow. The compressible reactive Navier-Stokes equations in terms of stream function, vorticity, circulation are used. Results, details of the numerical algorithms, as well as numerical verification techniques and validation with sources from the literature will be presented. Understanding how vortex breakdown phenomena are affected by modeling reactant consumption with compressibility effect is the main goal of this study.
Operation of the Institute for Computer Applications in Science and Engineering
NASA Technical Reports Server (NTRS)
1975-01-01
The ICASE research program is described in detail; it consists of four major categories: (1) efficient use of vector and parallel computers, with particular emphasis on the CDC STAR-100; (2) numerical analysis, with particular emphasis on the development and analysis of basic numerical algorithms; (3) analysis and planning of large-scale software systems; and (4) computational research in engineering and the natural sciences, with particular emphasis on fluid dynamics. The work in each of these areas is described in detail; other activities are discussed, a prognosis of future activities are included.
Numerical parametric studies of spray combustion instability
NASA Technical Reports Server (NTRS)
Pindera, M. Z.
1993-01-01
A coupled numerical algorithm has been developed for studies of combustion instabilities in spray-driven liquid rocket engines. The model couples gas and liquid phase physics using the method of fractional steps. Also introduced is a novel, efficient methodology for accounting for spray formation through direct solution of liquid phase equations. Preliminary parametric studies show marked sensitivity of spray penetration and geometry to droplet diameter, considerations of liquid core, and acoustic interactions. Less sensitivity was shown to the combustion model type although more rigorous (multi-step) formulations may be needed for the differences to become apparent.
Factorization and reduction methods for optimal control of distributed parameter systems
NASA Technical Reports Server (NTRS)
Burns, J. A.; Powers, R. K.
1985-01-01
A Chandrasekhar-type factorization method is applied to the linear-quadratic optimal control problem for distributed parameter systems. An aeroelastic control problem is used as a model example to demonstrate that if computationally efficient algorithms, such as those of Chandrasekhar-type, are combined with the special structure often available to a particular problem, then an abstract approximation theory developed for distributed parameter control theory becomes a viable method of solution. A numerical scheme based on averaging approximations is applied to hereditary control problems. Numerical examples are given.
Ab initio molecular simulations with numeric atom-centered orbitals
NASA Astrophysics Data System (ADS)
Blum, Volker; Gehrke, Ralf; Hanke, Felix; Havu, Paula; Havu, Ville; Ren, Xinguo; Reuter, Karsten; Scheffler, Matthias
2009-11-01
We describe a complete set of algorithms for ab initio molecular simulations based on numerically tabulated atom-centered orbitals (NAOs) to capture a wide range of molecular and materials properties from quantum-mechanical first principles. The full algorithmic framework described here is embodied in the Fritz Haber Institute "ab initio molecular simulations" (FHI-aims) computer program package. Its comprehensive description should be relevant to any other first-principles implementation based on NAOs. The focus here is on density-functional theory (DFT) in the local and semilocal (generalized gradient) approximations, but an extension to hybrid functionals, Hartree-Fock theory, and MP2/GW electron self-energies for total energies and excited states is possible within the same underlying algorithms. An all-electron/full-potential treatment that is both computationally efficient and accurate is achieved for periodic and cluster geometries on equal footing, including relaxation and ab initio molecular dynamics. We demonstrate the construction of transferable, hierarchical basis sets, allowing the calculation to range from qualitative tight-binding like accuracy to meV-level total energy convergence with the basis set. Since all basis functions are strictly localized, the otherwise computationally dominant grid-based operations scale as O(N) with system size N. Together with a scalar-relativistic treatment, the basis sets provide access to all elements from light to heavy. Both low-communication parallelization of all real-space grid based algorithms and a ScaLapack-based, customized handling of the linear algebra for all matrix operations are possible, guaranteeing efficient scaling (CPU time and memory) up to massively parallel computer systems with thousands of CPUs.
Lipinski, Doug; Mohseni, Kamran
2010-03-01
A ridge tracking algorithm for the computation and extraction of Lagrangian coherent structures (LCS) is developed. This algorithm takes advantage of the spatial coherence of LCS by tracking the ridges which form LCS to avoid unnecessary computations away from the ridges. We also make use of the temporal coherence of LCS by approximating the time dependent motion of the LCS with passive tracer particles. To justify this approximation, we provide an estimate of the difference between the motion of the LCS and that of tracer particles which begin on the LCS. In addition to the speedup in computational time, the ridge tracking algorithm uses less memory and results in smaller output files than the standard LCS algorithm. Finally, we apply our ridge tracking algorithm to two test cases, an analytically defined double gyre as well as the more complicated example of the numerical simulation of a swimming jellyfish. In our test cases, we find up to a 35 times speedup when compared with the standard LCS algorithm.
NASA Astrophysics Data System (ADS)
Ramírez-López, A.; Romero-Romo, M. A.; Muñoz-Negron, D.; López-Ramírez, S.; Escarela-Pérez, R.; Duran-Valencia, C.
2012-10-01
Computational models are developed to create grain structures using mathematical algorithms based on the chaos theory such as cellular automaton, geometrical models, fractals, and stochastic methods. Because of the chaotic nature of grain structures, some of the most popular routines are based on the Monte Carlo method, statistical distributions, and random walk methods, which can be easily programmed and included in nested loops. Nevertheless, grain structures are not well defined as the results of computational errors and numerical inconsistencies on mathematical methods. Due to the finite definition of numbers or the numerical restrictions during the simulation of solidification, damaged images appear on the screen. These images must be repaired to obtain a good measurement of grain geometrical properties. Some mathematical algorithms were developed to repair, measure, and characterize grain structures obtained from cellular automata in the present work. An appropriate measurement of grain size and the corrected identification of interfaces and length are very important topics in materials science because they are the representation and validation of mathematical models with real samples. As a result, the developed algorithms are tested and proved to be appropriate and efficient to eliminate the errors and characterize the grain structures.
A fully implicit Hall MHD algorithm based on the ion Ohm's law
NASA Astrophysics Data System (ADS)
Chacón, Luis
2010-11-01
Hall MHD is characterized by extreme hyperbolic numerical stiffness stemming from fast dispersive waves. Implicit algorithms are potentially advantageous, but of very difficult efficient implementation due to the condition numbers of associated matrices. Here, we explore the extension of a successful fully implicit, fully nonlinear algorithm for resistive MHD,ootnotetextL. Chac'on, Phys. Plasmas, 15 (2008) based on Jacobian-free Newton-Krylov methods with physics-based preconditioning, to Hall MHD. Traditionally, Hall MHD has been formulated using the electron equation of motion (EOM) to determine the electric field in the plasma (the so-called Ohm's law). However, given that the center-of-mass EOM, the ion EOM, and the electron EOM are linearly dependent, one could equivalently employ the ion EOM as the Ohm's law for a Hall MHD formulation. While, from a physical standpoint, there is no a priori advantage for using one Ohm's law vs. the other, we argue in this poster that there is an algorithmic one. We will show that, while the electron Ohm's law prevents the extension of the resistive MHD preconditioning strategy to Hall MHD, an ion Ohm's law allows it trivially. Verification and performance numerical results on relevant problems will be presented.
Using Bitmap Indexing Technology for Combined Numerical and TextQueries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stockinger, Kurt; Cieslewicz, John; Wu, Kesheng
2006-10-16
In this paper, we describe a strategy of using compressedbitmap indices to speed up queries on both numerical data and textdocuments. By using an efficient compression algorithm, these compressedbitmap indices are compact even for indices with millions of distinctterms. Moreover, bitmap indices can be used very efficiently to answerBoolean queries over text documents involving multiple query terms.Existing inverted indices for text searches are usually inefficient forcorpora with a very large number of terms as well as for queriesinvolving a large number of hits. We demonstrate that our compressedbitmap index technology overcomes both of those short-comings. In aperformance comparison against amore » commonly used database system, ourindices answer queries 30 times faster on average. To provide full SQLsupport, we integrated our indexing software, called FastBit, withMonetDB. The integrated system MonetDB/FastBit provides not onlyefficient searches on a single table as FastBit does, but also answersjoin queries efficiently. Furthermore, MonetDB/FastBit also provides avery efficient retrieval mechanism of result records.« less
The development of efficient numerical time-domain modeling methods for geophysical wave propagation
NASA Astrophysics Data System (ADS)
Zhu, Lieyuan
This Ph.D. dissertation focuses on the numerical simulation of geophysical wave propagation in the time domain including elastic waves in solid media, the acoustic waves in fluid media, and the electromagnetic waves in dielectric media. This thesis shows that a linear system model can describe accurately the physical processes of those geophysical waves' propagation and can be used as a sound basis for modeling geophysical wave propagation phenomena. The generalized stability condition for numerical modeling of wave propagation is therefore discussed in the context of linear system theory. The efficiency of a series of different numerical algorithms in the time-domain for modeling geophysical wave propagation are discussed and compared. These algorithms include the finite-difference time-domain method, pseudospectral time domain method, alternating directional implicit (ADI) finite-difference time domain method. The advantages and disadvantages of these numerical methods are discussed and the specific stability condition for each modeling scheme is carefully derived in the context of the linear system theory. Based on the review and discussion of these existing approaches, the split step, ADI pseudospectral time domain (SS-ADI-PSTD) method is developed and tested for several cases. Moreover, the state-of-the-art stretched-coordinate perfect matched layer (SCPML) has also been implemented in SS-ADI-PSTD algorithm as the absorbing boundary condition for truncating the computational domain and absorbing the artificial reflection from the domain boundaries. After algorithmic development, a few case studies serve as the real-world examples to verify the capacities of the numerical algorithms and understand the capabilities and limitations of geophysical methods for detection of subsurface contamination. The first case is a study using ground penetrating radar (GPR) amplitude variation with offset (AVO) for subsurface non-aqueous-liquid (NAPL) contamination. The numerical AVO study reveals that the normalized residual polarization (NRP) variation with offset does not respond to subsurface NAPL existence when the offset is close to or larger than its critical value (which corresponds to critical incident angle) because the air and head waves dominate the recorded wave field and severely interfere with reflected waves in the TEz wave field. Thus it can be concluded that the NRP AVO/GPR method is invalid when source-receiver angle offset is close to or greater than its critical value due to incomplete and severely distorted reflection information. In other words, AVO is not a promising technique for detection of the subsurface NAPL, as claimed by some researchers. In addition, the robustness of the newly developed numerical algorithms is also verified by the AVO study for randomly-arranged layered media. Meanwhile, this case study also demonstrates again that the full-wave numerical modeling algorithms are superior to ray tracing method. The second case study focuses on the effect of the existence of a near-surface fault on the vertically incident P- and S- plane waves. The modeling results show that both P-wave vertical incidence and S-wave vertical incidence cases are qualified fault indicators. For the plane S-wave vertical incidence case, the horizontal location of the upper tip of the fault (the footwall side) can be identified without much effort, because all the recorded parameters on the surface including the maximum velocities and the maximum accelerations, and even their ratios H/V, have shown dramatic changes when crossing the upper tip of the fault. The centers of the transition zone of the all the curves of parameters are almost directly above the fault tip (roughly the horizontal center of the model). Compared with the case of the vertically incident P-wave source, it has been found that the S-wave vertical source is a better indicator for fault location, because the horizontal location of the tip of that fault cannot be clearly identified with the ratio of the horizontal to vertical velocity for the P-wave incident case.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Salajegheh, Nima; Abedrabbo, Nader; Pourboghrat, Farhang
An efficient integration algorithm for continuum damage based elastoplastic constitutive equations is implemented in LS-DYNA. The isotropic damage parameter is defined as the ratio of the damaged surface area over the total cross section area of the representative volume element. This parameter is incorporated into the integration algorithm as an internal variable. The developed damage model is then implemented in the FEM code LS-DYNA as user material subroutine (UMAT). Pure stretch experiments of a hemispherical punch are carried out for copper sheets and the results are compared against the predictions of the implemented damage model. Evaluation of damage parameters ismore » carried out and the optimized values that correctly predicted the failure in the sheet are reported. Prediction of failure in the numerical analysis is performed through element deletion using the critical damage value. The set of failure parameters which accurately predict the failure behavior in copper sheets compared to experimental data is reported as well.« less
Efficient Compressed Sensing Based MRI Reconstruction using Nonconvex Total Variation Penalties
NASA Astrophysics Data System (ADS)
Lazzaro, D.; Loli Piccolomini, E.; Zama, F.
2016-10-01
This work addresses the problem of Magnetic Resonance Image Reconstruction from highly sub-sampled measurements in the Fourier domain. It is modeled as a constrained minimization problem, where the objective function is a non-convex function of the gradient of the unknown image and the constraints are given by the data fidelity term. We propose an algorithm, Fast Non Convex Reweighted (FNCR), where the constrained problem is solved by a reweighting scheme, as a strategy to overcome the non-convexity of the objective function, with an adaptive adjustment of the penalization parameter. We propose a fast iterative algorithm and we can prove that it converges to a local minimum because the constrained problem satisfies the Kurdyka-Lojasiewicz property. Moreover the adaptation of non convex l0 approximation and penalization parameters, by means of a continuation technique, allows us to obtain good quality solutions, avoiding to get stuck in unwanted local minima. Some numerical experiments performed on MRI sub-sampled data show the efficiency of the algorithm and the accuracy of the solution.
Zhang, Pan; Moore, Cristopher
2014-01-01
Modularity is a popular measure of community structure. However, maximizing the modularity can lead to many competing partitions, with almost the same modularity, that are poorly correlated with each other. It can also produce illusory ‘‘communities’’ in random graphs where none exist. We address this problem by using the modularity as a Hamiltonian at finite temperature and using an efficient belief propagation algorithm to obtain the consensus of many partitions with high modularity, rather than looking for a single partition that maximizes it. We show analytically and numerically that the proposed algorithm works all of the way down to the detectability transition in networks generated by the stochastic block model. It also performs well on real-world networks, revealing large communities in some networks where previous work has claimed no communities exist. Finally we show that by applying our algorithm recursively, subdividing communities until no statistically significant subcommunities can be found, we can detect hierarchical structure in real-world networks more efficiently than previous methods. PMID:25489096
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hurley, R. C.; Vorobiev, O. Y.; Ezzedine, S. M.
Here, we present a numerical method for modeling the mechanical effects of nonlinearly-compliant joints in elasto-plastic media. The method uses a series of strain-rate and stress update algorithms to determine joint closure, slip, and solid stress within computational cells containing multiple “embedded” joints. This work facilitates efficient modeling of nonlinear wave propagation in large spatial domains containing a large number of joints that affect bulk mechanical properties. We implement the method within the massively parallel Lagrangian code GEODYN-L and provide verification and examples. We highlight the ability of our algorithms to capture joint interactions and multiple weakness planes within individualmore » computational cells, as well as its computational efficiency. We also discuss the motivation for developing the proposed technique: to simulate large-scale wave propagation during the Source Physics Experiments (SPE), a series of underground explosions conducted at the Nevada National Security Site (NNSS).« less
Hurley, R. C.; Vorobiev, O. Y.; Ezzedine, S. M.
2017-04-06
Here, we present a numerical method for modeling the mechanical effects of nonlinearly-compliant joints in elasto-plastic media. The method uses a series of strain-rate and stress update algorithms to determine joint closure, slip, and solid stress within computational cells containing multiple “embedded” joints. This work facilitates efficient modeling of nonlinear wave propagation in large spatial domains containing a large number of joints that affect bulk mechanical properties. We implement the method within the massively parallel Lagrangian code GEODYN-L and provide verification and examples. We highlight the ability of our algorithms to capture joint interactions and multiple weakness planes within individualmore » computational cells, as well as its computational efficiency. We also discuss the motivation for developing the proposed technique: to simulate large-scale wave propagation during the Source Physics Experiments (SPE), a series of underground explosions conducted at the Nevada National Security Site (NNSS).« less
Multidisciplinary Multiobjective Optimal Design for Turbomachinery Using Evolutionary Algorithm
NASA Technical Reports Server (NTRS)
2005-01-01
This report summarizes Dr. Lian s efforts toward developing a robust and efficient tool for multidisciplinary and multi-objective optimal design for turbomachinery using evolutionary algorithms. This work consisted of two stages. The first stage (from July 2003 to June 2004) Dr. Lian focused on building essential capabilities required for the project. More specifically, Dr. Lian worked on two subjects: an enhanced genetic algorithm (GA) and an integrated optimization system with a GA and a surrogate model. The second stage (from July 2004 to February 2005) Dr. Lian formulated aerodynamic optimization and structural optimization into a multi-objective optimization problem and performed multidisciplinary and multi-objective optimizations on a transonic compressor blade based on the proposed model. Dr. Lian s numerical results showed that the proposed approach can effectively reduce the blade weight and increase the stage pressure ratio in an efficient manner. In addition, the new design was structurally safer than the original design. Five conference papers and three journal papers were published on this topic by Dr. Lian.
A Shifted Block Lanczos Algorithm 1: The Block Recurrence
NASA Technical Reports Server (NTRS)
Grimes, Roger G.; Lewis, John G.; Simon, Horst D.
1990-01-01
In this paper we describe a block Lanczos algorithm that is used as the key building block of a software package for the extraction of eigenvalues and eigenvectors of large sparse symmetric generalized eigenproblems. The software package comprises: a version of the block Lanczos algorithm specialized for spectrally transformed eigenproblems; an adaptive strategy for choosing shifts, and efficient codes for factoring large sparse symmetric indefinite matrices. This paper describes the algorithmic details of our block Lanczos recurrence. This uses a novel combination of block generalizations of several features that have only been investigated independently in the past. In particular new forms of partial reorthogonalization, selective reorthogonalization and local reorthogonalization are used, as is a new algorithm for obtaining the M-orthogonal factorization of a matrix. The heuristic shifting strategy, the integration with sparse linear equation solvers and numerical experience with the code are described in a companion paper.
Worst-Case Energy Efficiency Maximization in a 5G Massive MIMO-NOMA System.
Chinnadurai, Sunil; Selvaprabhu, Poongundran; Jeong, Yongchae; Jiang, Xueqin; Lee, Moon Ho
2017-09-18
In this paper, we examine the robust beamforming design to tackle the energy efficiency (EE) maximization problem in a 5G massive multiple-input multiple-output (MIMO)-non-orthogonal multiple access (NOMA) downlink system with imperfect channel state information (CSI) at the base station. A novel joint user pairing and dynamic power allocation (JUPDPA) algorithm is proposed to minimize the inter user interference and also to enhance the fairness between the users. This work assumes imperfect CSI by adding uncertainties to channel matrices with worst-case model, i.e., ellipsoidal uncertainty model (EUM). A fractional non-convex optimization problem is formulated to maximize the EE subject to the transmit power constraints and the minimum rate requirement for the cell edge user. The designed problem is difficult to solve due to its nonlinear fractional objective function. We firstly employ the properties of fractional programming to transform the non-convex problem into its equivalent parametric form. Then, an efficient iterative algorithm is proposed established on the constrained concave-convex procedure (CCCP) that solves and achieves convergence to a stationary point of the above problem. Finally, Dinkelbach's algorithm is employed to determine the maximum energy efficiency. Comprehensive numerical results illustrate that the proposed scheme attains higher worst-case energy efficiency as compared with the existing NOMA schemes and the conventional orthogonal multiple access (OMA) scheme.
Worst-Case Energy Efficiency Maximization in a 5G Massive MIMO-NOMA System
Jeong, Yongchae; Jiang, Xueqin; Lee, Moon Ho
2017-01-01
In this paper, we examine the robust beamforming design to tackle the energy efficiency (EE) maximization problem in a 5G massive multiple-input multiple-output (MIMO)-non-orthogonal multiple access (NOMA) downlink system with imperfect channel state information (CSI) at the base station. A novel joint user pairing and dynamic power allocation (JUPDPA) algorithm is proposed to minimize the inter user interference and also to enhance the fairness between the users. This work assumes imperfect CSI by adding uncertainties to channel matrices with worst-case model, i.e., ellipsoidal uncertainty model (EUM). A fractional non-convex optimization problem is formulated to maximize the EE subject to the transmit power constraints and the minimum rate requirement for the cell edge user. The designed problem is difficult to solve due to its nonlinear fractional objective function. We firstly employ the properties of fractional programming to transform the non-convex problem into its equivalent parametric form. Then, an efficient iterative algorithm is proposed established on the constrained concave-convex procedure (CCCP) that solves and achieves convergence to a stationary point of the above problem. Finally, Dinkelbach’s algorithm is employed to determine the maximum energy efficiency. Comprehensive numerical results illustrate that the proposed scheme attains higher worst-case energy efficiency as compared with the existing NOMA schemes and the conventional orthogonal multiple access (OMA) scheme. PMID:28927019
Fully implicit moving mesh adaptive algorithm
NASA Astrophysics Data System (ADS)
Serazio, C.; Chacon, L.; Lapenta, G.
2006-10-01
In many problems of interest, the numerical modeler is faced with the challenge of dealing with multiple time and length scales. The former is best dealt with with fully implicit methods, which are able to step over fast frequencies to resolve the dynamical time scale of interest. The latter requires grid adaptivity for efficiency. Moving-mesh grid adaptive methods are attractive because they can be designed to minimize the numerical error for a given resolution. However, the required grid governing equations are typically very nonlinear and stiff, and of considerably difficult numerical treatment. Not surprisingly, fully coupled, implicit approaches where the grid and the physics equations are solved simultaneously are rare in the literature, and circumscribed to 1D geometries. In this study, we present a fully implicit algorithm for moving mesh methods that is feasible for multidimensional geometries. Crucial elements are the development of an effective multilevel treatment of the grid equation, and a robust, rigorous error estimator. For the latter, we explore the effectiveness of a coarse grid correction error estimator, which faithfully reproduces spatial truncation errors for conservative equations. We will show that the moving mesh approach is competitive vs. uniform grids both in accuracy (due to adaptivity) and efficiency. Results for a variety of models 1D and 2D geometries will be presented. L. Chac'on, G. Lapenta, J. Comput. Phys., 212 (2), 703 (2006) G. Lapenta, L. Chac'on, J. Comput. Phys., accepted (2006)
A heuristic re-mapping algorithm reducing inter-level communication in SAMR applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steensland, Johan; Ray, Jaideep
2003-07-01
This paper aims at decreasing execution time for large-scale structured adaptive mesh refinement (SAMR) applications by proposing a new heuristic re-mapping algorithm and experimentally showing its effectiveness in reducing inter-level communication. Tests were done for five different SAMR applications. The overall goal is to engineer a dynamically adaptive meta-partitioner capable of selecting and configuring the most appropriate partitioning strategy at run-time based on current system and application state. Such a metapartitioner can significantly reduce execution times for general SAMR applications. Computer simulations of physical phenomena are becoming increasingly popular as they constitute an important complement to real-life testing. In manymore » cases, such simulations are based on solving partial differential equations by numerical methods. Adaptive methods are crucial to efficiently utilize computer resources such as memory and CPU. But even with adaption, the simulations are computationally demanding and yield huge data sets. Thus parallelization and the efficient partitioning of data become issues of utmost importance. Adaption causes the workload to change dynamically, calling for dynamic (re-) partitioning to maintain efficient resource utilization. The proposed heuristic algorithm reduced inter-level communication substantially. Since the complexity of the proposed algorithm is low, this decrease comes at a relatively low cost. As a consequence, we draw the conclusion that the proposed re-mapping algorithm would be useful to lower overall execution times for many large SAMR applications. Due to its usefulness and its parameterization, the proposed algorithm would constitute a natural and important component of the meta-partitioner.« less
NASA Astrophysics Data System (ADS)
Rastogi, Richa; Londhe, Ashutosh; Srivastava, Abhishek; Sirasala, Kirannmayi M.; Khonde, Kiran
2017-03-01
In this article, a new scalable 3D Kirchhoff depth migration algorithm is presented on state of the art multicore CPU based cluster. Parallelization of 3D Kirchhoff depth migration is challenging due to its high demand of compute time, memory, storage and I/O along with the need of their effective management. The most resource intensive modules of the algorithm are traveltime calculations and migration summation which exhibit an inherent trade off between compute time and other resources. The parallelization strategy of the algorithm largely depends on the storage of calculated traveltimes and its feeding mechanism to the migration process. The presented work is an extension of our previous work, wherein a 3D Kirchhoff depth migration application for multicore CPU based parallel system had been developed. Recently, we have worked on improving parallel performance of this application by re-designing the parallelization approach. The new algorithm is capable to efficiently migrate both prestack and poststack 3D data. It exhibits flexibility for migrating large number of traces within the available node memory and with minimal requirement of storage, I/O and inter-node communication. The resultant application is tested using 3D Overthrust data on PARAM Yuva II, which is a Xeon E5-2670 based multicore CPU cluster with 16 cores/node and 64 GB shared memory. Parallel performance of the algorithm is studied using different numerical experiments and the scalability results show striking improvement over its previous version. An impressive 49.05X speedup with 76.64% efficiency is achieved for 3D prestack data and 32.00X speedup with 50.00% efficiency for 3D poststack data, using 64 nodes. The results also demonstrate the effectiveness and robustness of the improved algorithm with high scalability and efficiency on a multicore CPU cluster.
Adaptive Numerical Dissipation Control in High Order Schemes for Multi-D Non-Ideal MHD
NASA Technical Reports Server (NTRS)
Yee, H. C.; Sjoegreen, B.
2005-01-01
The required type and amount of numerical dissipation/filter to accurately resolve all relevant multiscales of complex MHD unsteady high-speed shock/shear/turbulence/combustion problems are not only physical problem dependent, but also vary from one flow region to another. In addition, proper and efficient control of the divergence of the magnetic field (Div(B)) numerical error for high order shock-capturing methods poses extra requirements for the considered type of CPU intensive computations. The goal is to extend our adaptive numerical dissipation control in high order filter schemes and our new divergence-free methods for ideal MHD to non-ideal MHD that include viscosity and resistivity. The key idea consists of automatic detection of different flow features as distinct sensors to signal the appropriate type and amount of numerical dissipation/filter where needed and leave the rest of the region free from numerical dissipation contamination. These scheme-independent detectors are capable of distinguishing shocks/shears, flame sheets, turbulent fluctuations and spurious high-frequency oscillations. The detection algorithm is based on an artificial compression method (ACM) (for shocks/shears), and redundant multiresolution wavelets (WAV) (for the above types of flow feature). These filters also provide a natural and efficient way for the minimization of Div(B) numerical error.
Interference graph-based dynamic frequency reuse in optical attocell networks
NASA Astrophysics Data System (ADS)
Liu, Huanlin; Xia, Peijie; Chen, Yong; Wu, Lan
2017-11-01
Indoor optical attocell network may achieve higher capacity than radio frequency (RF) or Infrared (IR)-based wireless systems. It is proposed as a special type of visible light communication (VLC) system using Light Emitting Diodes (LEDs). However, the system spectral efficiency may be severely degraded owing to the inter-cell interference (ICI), particularly for dense deployment scenarios. To address these issues, we construct the spectral interference graph for indoor optical attocell network, and propose the Dynamic Frequency Reuse (DFR) and Weighted Dynamic Frequency Reuse (W-DFR) algorithms to decrease ICI and improve the spectral efficiency performance. The interference graph makes LEDs can transmit data without interference and select the minimum sub-bands needed for frequency reuse. Then, DFR algorithm reuses the system frequency equally across service-providing cells to mitigate spectrum interference. While W-DFR algorithm can reuse the system frequency by using the bandwidth weight (BW), which is defined based on the number of service users. Numerical results show that both of the proposed schemes can effectively improve the average spectral efficiency (ASE) of the system. Additionally, improvement of the user data rate is also obtained by analyzing its cumulative distribution function (CDF).
Computationally efficient method for optical simulation of solar cells and their applications
NASA Astrophysics Data System (ADS)
Semenikhin, I.; Zanuccoli, M.; Fiegna, C.; Vyurkov, V.; Sangiorgi, E.
2013-01-01
This paper presents two novel implementations of the Differential method to solve the Maxwell equations in nanostructured optoelectronic solid state devices. The first proposed implementation is based on an improved and computationally efficient T-matrix formulation that adopts multiple-precision arithmetic to tackle the numerical instability problem which arises due to evanescent modes. The second implementation adopts the iterative approach that allows to achieve low computational complexity O(N logN) or better. The proposed algorithms may work with structures with arbitrary spatial variation of the permittivity. The developed two-dimensional numerical simulator is applied to analyze the dependence of the absorption characteristics of a thin silicon slab on the morphology of the front interface and on the angle of incidence of the radiation with respect to the device surface.
Rheological Models in the Time-Domain Modeling of Seismic Motion
NASA Astrophysics Data System (ADS)
Moczo, P.; Kristek, J.
2004-12-01
The time-domain stress-strain relation in a viscoelastic medium has a form of the convolutory integral which is numerically intractable. This was the reason for the oversimplified models of attenuation in the time-domain seismic wave propagation and earthquake motion modeling. In their pioneering work, Day and Minster (1984) showed the way how to convert the integral into numerically tractable differential form in the case of a general viscoelastic modulus. In response to the work by Day and Minster, Emmerich and Korn (1987) suggested using the rheology of their generalized Maxwell body (GMB) while Carcione et al. (1988) suggested using the generalized Zener body (GZB). The viscoelastic moduli of both rheological models have a form of the rational function and thus the differential form of the stress-strain relation is rather easy to obtain. After the papers by Emmerich and Korn and Carcione et al. numerical modelers decided either for the GMB or GZB rheology and developed 'non-communicating' algorithms. In the many following papers the authors using the GMB never commented the GZB rheology and the corresponding algorithms, and the authors using the GZB never related their methods to the GMB rheology and algorithms. We analyze and compare both rheologies and the corresponding incorporations of the realistic attenuation into the time-domain computations. We then focus on the most recent staggered-grid finite-difference modeling, mainly on accounting for the material heterogeneity in the viscoelastic media, and the computational efficiency of the finite-difference algorithms.
Efficient computer algebra algorithms for polynomial matrices in control design
NASA Technical Reports Server (NTRS)
Baras, J. S.; Macenany, D. C.; Munach, R.
1989-01-01
The theory of polynomial matrices plays a key role in the design and analysis of multi-input multi-output control and communications systems using frequency domain methods. Examples include coprime factorizations of transfer functions, cannonical realizations from matrix fraction descriptions, and the transfer function design of feedback compensators. Typically, such problems abstract in a natural way to the need to solve systems of Diophantine equations or systems of linear equations over polynomials. These and other problems involving polynomial matrices can in turn be reduced to polynomial matrix triangularization procedures, a result which is not surprising given the importance of matrix triangularization techniques in numerical linear algebra. Matrices with entries from a field and Gaussian elimination play a fundamental role in understanding the triangularization process. In the case of polynomial matrices, matrices with entries from a ring for which Gaussian elimination is not defined and triangularization is accomplished by what is quite properly called Euclidean elimination. Unfortunately, the numerical stability and sensitivity issues which accompany floating point approaches to Euclidean elimination are not very well understood. New algorithms are presented which circumvent entirely such numerical issues through the use of exact, symbolic methods in computer algebra. The use of such error-free algorithms guarantees that the results are accurate to within the precision of the model data--the best that can be hoped for. Care must be taken in the design of such algorithms due to the phenomenon of intermediate expressions swell.
Monte Carlo sampling in diffusive dynamical systems
NASA Astrophysics Data System (ADS)
Tapias, Diego; Sanders, David P.; Altmann, Eduardo G.
2018-05-01
We introduce a Monte Carlo algorithm to efficiently compute transport properties of chaotic dynamical systems. Our method exploits the importance sampling technique that favors trajectories in the tail of the distribution of displacements, where deviations from a diffusive process are most prominent. We search for initial conditions using a proposal that correlates states in the Markov chain constructed via a Metropolis-Hastings algorithm. We show that our method outperforms the direct sampling method and also Metropolis-Hastings methods with alternative proposals. We test our general method through numerical simulations in 1D (box-map) and 2D (Lorentz gas) systems.
NASA Technical Reports Server (NTRS)
Toon, Owen B.; Mckay, C. P.; Ackerman, T. P.; Santhanam, K.
1989-01-01
The solution of the generalized two-stream approximation for radiative transfer in homogeneous multiple scattering atmospheres is extended to vertically inhomogeneous atmospheres in a manner which is numerically stable and computationally efficient. It is shown that solar energy deposition rates, photolysis rates, and infrared cooling rates all may be calculated with the simple modifications of a single algorithm. The accuracy of the algorithm is generally better than 10 percent, so that other uncertainties, such as in absorption coefficients, may often dominate the error in calculation of the quantities of interest to atmospheric studies.
NASA Technical Reports Server (NTRS)
Koenig, Herbert A.; Chan, Kwai S.; Cassenti, Brice N.; Weber, Richard
1988-01-01
A unified numerical method for the integration of stiff time dependent constitutive equations is presented. The solution process is directly applied to a constitutive model proposed by Bodner. The theory confronts time dependent inelastic behavior coupled with both isotropic hardening and directional hardening behaviors. Predicted stress-strain responses from this model are compared to experimental data from cyclic tests on uniaxial specimens. An algorithm is developed for the efficient integration of the Bodner flow equation. A comparison is made with the Euler integration method. An analysis of computational time is presented for the three algorithms.
Parameter-tolerant design of high contrast gratings
NASA Astrophysics Data System (ADS)
Chevallier, Christyves; Fressengeas, Nicolas; Jacquet, Joel; Almuneau, Guilhem; Laaroussi, Youness; Gauthier-Lafaye, Olivier; Cerutti, Laurent; Genty, Frédéric
2015-02-01
This work is devoted to the design of high contrast grating mirrors taking into account the technological constraints and tolerance of fabrication. First, a global optimization algorithm has been combined to a numerical analysis of grating structures (RCWA) to automatically design HCG mirrors. Then, the tolerances of the grating dimensions have been precisely studied to develop a robust optimization algorithm with which high contrast gratings, exhibiting not only a high efficiency but also large tolerance values, could be designed. Finally, several structures integrating previously designed HCGs has been simulated to validate and illustrate the interest of such gratings.
Random search optimization based on genetic algorithm and discriminant function
NASA Technical Reports Server (NTRS)
Kiciman, M. O.; Akgul, M.; Erarslanoglu, G.
1990-01-01
The general problem of optimization with arbitrary merit and constraint functions, which could be convex, concave, monotonic, or non-monotonic, is treated using stochastic methods. To improve the efficiency of the random search methods, a genetic algorithm for the search phase and a discriminant function for the constraint-control phase were utilized. The validity of the technique is demonstrated by comparing the results to published test problem results. Numerical experimentation indicated that for cases where a quick near optimum solution is desired, a general, user-friendly optimization code can be developed without serious penalties in both total computer time and accuracy.
Streamline integration as a method for two-dimensional elliptic grid generation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wiesenberger, M., E-mail: Matthias.Wiesenberger@uibk.ac.at; Held, M.; Einkemmer, L.
We propose a new numerical algorithm to construct a structured numerical elliptic grid of a doubly connected domain. Our method is applicable to domains with boundaries defined by two contour lines of a two-dimensional function. Furthermore, we can adapt any analytically given boundary aligned structured grid, which specifically includes polar and Cartesian grids. The resulting coordinate lines are orthogonal to the boundary. Grid points as well as the elements of the Jacobian matrix can be computed efficiently and up to machine precision. In the simplest case we construct conformal grids, yet with the help of weight functions and monitor metricsmore » we can control the distribution of cells across the domain. Our algorithm is parallelizable and easy to implement with elementary numerical methods. We assess the quality of grids by considering both the distribution of cell sizes and the accuracy of the solution to elliptic problems. Among the tested grids these key properties are best fulfilled by the grid constructed with the monitor metric approach. - Graphical abstract: - Highlights: • Construct structured, elliptic numerical grids with elementary numerical methods. • Align coordinate lines with or make them orthogonal to the domain boundary. • Compute grid points and metric elements up to machine precision. • Control cell distribution by adaption functions or monitor metrics.« less
Annual Research Briefs - 2000: Center for Turbulence Research
NASA Technical Reports Server (NTRS)
2000-01-01
This report contains the 2000 annual progress reports of the postdoctoral Fellows and visiting scholars of the Center for Turbulence Research (CTR). It summarizes the research efforts undertaken under the core CTR program. Last year, CTR sponsored sixteen resident Postdoctoral Fellows, nine Research Associates, and two Senior Research Fellows, hosted seven short term visitors, and supported four doctoral students. The Research Associates are supported by the Departments of Defense and Energy. The reports in this volume are divided into five groups. The first group largely consists of the new areas of interest at CTR. It includes efficient algorithms for molecular dynamics, stability in protoplanetary disks, and experimental and numerical applications of evolutionary optimization algorithms for jet flow control. The next group of reports is in experimental, theoretical, and numerical modeling efforts in turbulent combustion. As more challenging computations are attempted, the need for additional theoretical and experimental studies in combustion has emerged. A pacing item for computation of nonpremixed combustion is the prediction of extinction and re-ignition phenomena, which is currently being addressed at CTR. The third group of reports is in the development of accurate and efficient numerical methods, which has always been an important part of CTR's work. This is the tool development part of the program which supports our high fidelity numerical simulations in such areas as turbulence in complex geometries, hypersonics, and acoustics. The final two groups of reports are concerned with LES and RANS prediction methods. There has been significant progress in wall modeling for LES of high Reynolds number turbulence and in validation of the v(exp 2) - f model for industrial applications.
Baczewski, Andrew D; Bond, Stephen D
2013-07-28
Generalized Langevin dynamics (GLD) arise in the modeling of a number of systems, ranging from structured fluids that exhibit a viscoelastic mechanical response, to biological systems, and other media that exhibit anomalous diffusive phenomena. Molecular dynamics (MD) simulations that include GLD in conjunction with external and/or pairwise forces require the development of numerical integrators that are efficient, stable, and have known convergence properties. In this article, we derive a family of extended variable integrators for the Generalized Langevin equation with a positive Prony series memory kernel. Using stability and error analysis, we identify a superlative choice of parameters and implement the corresponding numerical algorithm in the LAMMPS MD software package. Salient features of the algorithm include exact conservation of the first and second moments of the equilibrium velocity distribution in some important cases, stable behavior in the limit of conventional Langevin dynamics, and the use of a convolution-free formalism that obviates the need for explicit storage of the time history of particle velocities. Capability is demonstrated with respect to accuracy in numerous canonical examples, stability in certain limits, and an exemplary application in which the effect of a harmonic confining potential is mapped onto a memory kernel.
A piloted simulator evaluation of a ground-based 4-D descent advisor algorithm
NASA Technical Reports Server (NTRS)
Davis, Thomas J.; Green, Steven M.; Erzberger, Heinz
1990-01-01
A ground-based, four dimensional (4D) descent-advisor algorithm is under development at NASA-Ames. The algorithm combines detailed aerodynamic, propulsive, and atmospheric models with an efficient numerical integration scheme to generate 4D descent advisories. The ability is investigated of the 4D descent advisor algorithm to provide adequate control of arrival time for aircraft not equipped with on-board 4D guidance systems. A piloted simulation was conducted to determine the precision with which the descent advisor could predict the 4D trajectories of typical straight-in descents flown by airline pilots under different wind conditions. The effects of errors in the estimation of wind and initial aircraft weight were also studied. A description of the descent advisor as well as the result of the simulation studies are presented.
Scattering properties of electromagnetic waves from metal object in the lower terahertz region
NASA Astrophysics Data System (ADS)
Chen, Gang; Dang, H. X.; Hu, T. Y.; Su, Xiang; Lv, R. C.; Li, Hao; Tan, X. M.; Cui, T. J.
2018-01-01
An efficient hybrid algorithm is proposed to analyze the electromagnetic scattering properties of metal objects in the lower terahertz (THz) frequency. The metal object can be viewed as perfectly electrical conducting object with a slightly rough surface in the lower THz region. Hence the THz scattered field from metal object can be divided into coherent and incoherent parts. The physical optics and truncated-wedge incremental-length diffraction coefficients methods are combined to compute the coherent part; while the small perturbation method is used for the incoherent part. With the MonteCarlo method, the radar cross section of the rough metal surface is computed by the multilevel fast multipole algorithm and the proposed hybrid algorithm, respectively. The numerical results show that the proposed algorithm has good accuracy to simulate the scattering properties rapidly in the lower THz region.
Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem
Shao, Meiyue; da Jornada, Felipe H.; Yang, Chao; ...
2015-10-02
The Bethe–Salpeter eigenvalue problem is a dense structured eigenvalue problem arising from discretized Bethe–Salpeter equation in the context of computing exciton energies and states. A computational challenge is that at least half of the eigenvalues and the associated eigenvectors are desired in practice. In this paper, we establish the equivalence between Bethe–Salpeter eigenvalue problems and real Hamiltonian eigenvalue problems. Based on theoretical analysis, structure preserving algorithms for a class of Bethe–Salpeter eigenvalue problems are proposed. We also show that for this class of problems all eigenvalues obtained from the Tamm–Dancoff approximation are overestimated. In order to solve large scale problemsmore » of practical interest, we discuss parallel implementations of our algorithms targeting distributed memory systems. Finally, several numerical examples are presented to demonstrate the efficiency and accuracy of our algorithms.« less
Efficient algorithms for computing a strong rank-revealing QR factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gu, M.; Eisenstat, S.C.
1996-07-01
Given an m x n matrix M with m {ge} n, it is shown that there exists a permutation {Pi} and an integer k such that the QR factorization given by equation (1) reveals the numerical rank of M: the k x k upper-triangular matrix A{sub k} is well conditioned, norm of (C{sub k}){sub 2} is small, and B{sub k} is linearly dependent on A{sub k} with coefficients bounded by a low-degree polynomial in n. Existing rank-revealing QR (RRQR) algorithms are related to such factorizations and two algorithms are presented for computing them. The new algorithms are nearly as efficientmore » as QR with column pivoting for most problems and take O(mn{sup 2}) floating-point operations in the worst case.« less
NASA Technical Reports Server (NTRS)
Nicolaides, R. A.
1979-01-01
A description and explanation of a simple multigrid algorithm for solving finite element systems is given. Numerical results for an implementation are reported for a number of elliptic equations, including cases with singular coefficients and indefinite equations. The method shows the high efficiency, essentially independent of the grid spacing, predicted by the theory.
Graph cuts for curvature based image denoising.
Bae, Egil; Shi, Juan; Tai, Xue-Cheng
2011-05-01
Minimization of total variation (TV) is a well-known method for image denoising. Recently, the relationship between TV minimization problems and binary MRF models has been much explored. This has resulted in some very efficient combinatorial optimization algorithms for the TV minimization problem in the discrete setting via graph cuts. To overcome limitations, such as staircasing effects, of the relatively simple TV model, variational models based upon higher order derivatives have been proposed. The Euler's elastica model is one such higher order model of central importance, which minimizes the curvature of all level lines in the image. Traditional numerical methods for minimizing the energy in such higher order models are complicated and computationally complex. In this paper, we will present an efficient minimization algorithm based upon graph cuts for minimizing the energy in the Euler's elastica model, by simplifying the problem to that of solving a sequence of easy graph representable problems. This sequence has connections to the gradient flow of the energy function, and converges to a minimum point. The numerical experiments show that our new approach is more effective in maintaining smooth visual results while preserving sharp features better than TV models.
NASA Astrophysics Data System (ADS)
Ji, Liang-Bo; Chen, Fang
2017-07-01
Numerical simulation and intelligent optimization technology were adopted for rolling and extrusion of zincked sheet. By response surface methodology (RSM), genetic algorithm (GA) and data processing technology, an efficient optimization of process parameters for rolling of zincked sheet was investigated. The influence trend of roller gap, rolling speed and friction factor effects on reduction rate and plate shortening rate were analyzed firstly. Then a predictive response surface model for comprehensive quality index of part was created using RSM. Simulated and predicted values were compared. Through genetic algorithm method, the optimal process parameters for the forming of rolling were solved. They were verified and the optimum process parameters of rolling were obtained. It is feasible and effective.
ɛ-subgradient algorithms for bilevel convex optimization
NASA Astrophysics Data System (ADS)
Helou, Elias S.; Simões, Lucas E. A.
2017-05-01
This paper introduces and studies the convergence properties of a new class of explicit ɛ-subgradient methods for the task of minimizing a convex function over a set of minimizers of another convex minimization problem. The general algorithm specializes to some important cases, such as first-order methods applied to a varying objective function, which have computationally cheap iterations. We present numerical experimentation concerning certain applications where the theoretical framework encompasses efficient algorithmic techniques, enabling the use of the resulting methods to solve very large practical problems arising in tomographic image reconstruction. ES Helou was supported by FAPESP grants 2013/07375-0 and 2013/16508-3 and CNPq grant 311476/2014-7. LEA Simões was supported by FAPESP grants 2011/02219-4 and 2013/14615-7.
Rapid solution of large-scale systems of equations
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1994-01-01
The analysis and design of complex aerospace structures requires the rapid solution of large systems of linear and nonlinear equations, eigenvalue extraction for buckling, vibration and flutter modes, structural optimization and design sensitivity calculation. Computers with multiple processors and vector capabilities can offer substantial computational advantages over traditional scalar computer for these analyses. These computers fall into two categories: shared memory computers and distributed memory computers. This presentation covers general-purpose, highly efficient algorithms for generation/assembly or element matrices, solution of systems of linear and nonlinear equations, eigenvalue and design sensitivity analysis and optimization. All algorithms are coded in FORTRAN for shared memory computers and many are adapted to distributed memory computers. The capability and numerical performance of these algorithms will be addressed.
Fast principal component analysis for stacking seismic data
NASA Astrophysics Data System (ADS)
Wu, Juan; Bai, Min
2018-04-01
Stacking seismic data plays an indispensable role in many steps of the seismic data processing and imaging workflow. Optimal stacking of seismic data can help mitigate seismic noise and enhance the principal components to a great extent. Traditional average-based seismic stacking methods cannot obtain optimal performance when the ambient noise is extremely strong. We propose a principal component analysis (PCA) algorithm for stacking seismic data without being sensitive to noise level. Considering the computational bottleneck of the classic PCA algorithm in processing massive seismic data, we propose an efficient PCA algorithm to make the proposed method readily applicable for industrial applications. Two numerically designed examples and one real seismic data are used to demonstrate the performance of the presented method.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spotz, William F.
PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of themore » underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.« less
Fast Optimization for Aircraft Descent and Approach Trajectory
NASA Technical Reports Server (NTRS)
Luchinsky, Dmitry G.; Schuet, Stefan; Brenton, J.; Timucin, Dogan; Smith, David; Kaneshige, John
2017-01-01
We address problem of on-line scheduling of the aircraft descent and approach trajectory. We formulate a general multiphase optimal control problem for optimization of the descent trajectory and review available methods of its solution. We develop a fast algorithm for solution of this problem using two key components: (i) fast inference of the dynamical and control variables of the descending trajectory from the low dimensional flight profile data and (ii) efficient local search for the resulting reduced dimensionality non-linear optimization problem. We compare the performance of the proposed algorithm with numerical solution obtained using optimal control toolbox General Pseudospectral Optimal Control Software. We present results of the solution of the scheduling problem for aircraft descent using novel fast algorithm and discuss its future applications.
F--Ray: A new algorithm for efficient transport of ionizing radiation
NASA Astrophysics Data System (ADS)
Mao, Yi; Zhang, J.; Wandelt, B. D.; Shapiro, P. R.; Iliev, I. T.
2014-04-01
We present a new algorithm for the 3D transport of ionizing radiation, called F
NASA Astrophysics Data System (ADS)
Gilbert, B. K.; Robb, R. A.; Chu, A.; Kenue, S. K.; Lent, A. H.; Swartzlander, E. E., Jr.
1981-02-01
Rapid advances during the past ten years of several forms of computer-assisted tomography (CT) have resulted in the development of numerous algorithms to convert raw projection data into cross-sectional images. These reconstruction algorithms are either 'iterative,' in which a large matrix algebraic equation is solved by successive approximation techniques; or 'closed form'. Continuing evolution of the closed form algorithms has allowed the newest versions to produce excellent reconstructed images in most applications. This paper will review several computer software and special-purpose digital hardware implementations of closed form algorithms, either proposed during the past several years by a number of workers or actually implemented in commercial or research CT scanners. The discussion will also cover a number of recently investigated algorithmic modifications which reduce the amount of computation required to execute the reconstruction process, as well as several new special-purpose digital hardware implementations under development in laboratories at the Mayo Clinic.
2013-01-01
Background Next generation sequencing technologies have greatly advanced many research areas of the biomedical sciences through their capability to generate massive amounts of genetic information at unprecedented rates. The advent of next generation sequencing has led to the development of numerous computational tools to analyze and assemble the millions to billions of short sequencing reads produced by these technologies. While these tools filled an important gap, current approaches for storing, processing, and analyzing short read datasets generally have remained simple and lack the complexity needed to efficiently model the produced reads and assemble them correctly. Results Previously, we presented an overlap graph coarsening scheme for modeling read overlap relationships on multiple levels. Most current read assembly and analysis approaches use a single graph or set of clusters to represent the relationships among a read dataset. Instead, we use a series of graphs to represent the reads and their overlap relationships across a spectrum of information granularity. At each information level our algorithm is capable of generating clusters of reads from the reduced graph, forming an integrated graph modeling and clustering approach for read analysis and assembly. Previously we applied our algorithm to simulated and real 454 datasets to assess its ability to efficiently model and cluster next generation sequencing data. In this paper we extend our algorithm to large simulated and real Illumina datasets to demonstrate that our algorithm is practical for both sequencing technologies. Conclusions Our overlap graph theoretic algorithm is able to model next generation sequencing reads at various levels of granularity through the process of graph coarsening. Additionally, our model allows for efficient representation of the read overlap relationships, is scalable for large datasets, and is practical for both Illumina and 454 sequencing technologies. PMID:24564333
Scalable Domain Decomposed Monte Carlo Particle Transport
NASA Astrophysics Data System (ADS)
O'Brien, Matthew Joseph
In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation. The main algorithms we consider are: • Domain decomposition of constructive solid geometry: enables extremely large calculations in which the background geometry is too large to fit in the memory of a single computational node. • Load Balancing: keeps the workload per processor as even as possible so the calculation runs efficiently. • Global Particle Find: if particles are on the wrong processor, globally resolve their locations to the correct processor based on particle coordinate and background domain. • Visualizing constructive solid geometry, sourcing particles, deciding that particle streaming communication is completed and spatial redecomposition. These algorithms are some of the most important parallel algorithms required for domain decomposed Monte Carlo particle transport. We demonstrate that our previous algorithms were not scalable, prove that our new algorithms are scalable, and run some of the algorithms up to 2 million MPI processes on the Sequoia supercomputer.
NASA Astrophysics Data System (ADS)
Sun, Xiuqiao; Wang, Jian
2018-07-01
Freeway service patrol (FSP), is considered to be an effective method for incident management and can help transportation agency decision-makers alter existing route coverage and fleet allocation. This paper investigates the FSP problem of patrol routing design and fleet allocation, with the objective of minimizing the overall average incident response time. While the simulated annealing (SA) algorithm and its improvements have been applied to solve this problem, they often become trapped in local optimal solution. Moreover, the issue of searching efficiency remains to be further addressed. In this paper, we employ the genetic algorithm (GA) and SA to solve the FSP problem. To maintain population diversity and avoid premature convergence, niche strategy is incorporated into the traditional genetic algorithm. We also employ elitist strategy to speed up the convergence. Numerical experiments have been conducted with the help of the Sioux Falls network. Results show that the GA slightly outperforms the dual-based greedy (DBG) algorithm, the very large-scale neighborhood searching (VLNS) algorithm, the SA algorithm and the scenario algorithm.
NASA Astrophysics Data System (ADS)
Chirico, G. B.; Medina, H.; Romano, N.
2014-07-01
This paper examines the potential of different algorithms, based on the Kalman filtering approach, for assimilating near-surface observations into a one-dimensional Richards equation governing soil water flow in soil. Our specific objectives are: (i) to compare the efficiency of different Kalman filter algorithms in retrieving matric pressure head profiles when they are implemented with different numerical schemes of the Richards equation; (ii) to evaluate the performance of these algorithms when nonlinearities arise from the nonlinearity of the observation equation, i.e. when surface soil water content observations are assimilated to retrieve matric pressure head values. The study is based on a synthetic simulation of an evaporation process from a homogeneous soil column. Our first objective is achieved by implementing a Standard Kalman Filter (SKF) algorithm with both an explicit finite difference scheme (EX) and a Crank-Nicolson (CN) linear finite difference scheme of the Richards equation. The Unscented (UKF) and Ensemble Kalman Filters (EnKF) are applied to handle the nonlinearity of a backward Euler finite difference scheme. To accomplish the second objective, an analogous framework is applied, with the exception of replacing SKF with the Extended Kalman Filter (EKF) in combination with a CN numerical scheme, so as to handle the nonlinearity of the observation equation. While the EX scheme is computationally too inefficient to be implemented in an operational assimilation scheme, the retrieval algorithm implemented with a CN scheme is found to be computationally more feasible and accurate than those implemented with the backward Euler scheme, at least for the examined one-dimensional problem. The UKF appears to be as feasible as the EnKF when one has to handle nonlinear numerical schemes or additional nonlinearities arising from the observation equation, at least for systems of small dimensionality as the one examined in this study.
NASA Astrophysics Data System (ADS)
Lin, Zhi; Zhang, Qinghai
2017-09-01
We propose high-order finite-volume schemes for numerically solving the steady-state advection-diffusion equation with nonlinear Robin boundary conditions. Although the original motivation comes from a mathematical model of blood clotting, the nonlinear boundary conditions may also apply to other scientific problems. The main contribution of this work is a generic algorithm for generating third-order, fourth-order, and even higher-order explicit ghost-filling formulas to enforce nonlinear Robin boundary conditions in multiple dimensions. Under the framework of finite volume methods, this appears to be the first algorithm of its kind. Numerical experiments on boundary value problems show that the proposed fourth-order formula can be much more accurate and efficient than a simple second-order formula. Furthermore, the proposed ghost-filling formulas may also be useful for solving other partial differential equations.
PNS predictions for supersonic/hypersonic flows over finned missile configurations
NASA Technical Reports Server (NTRS)
Bhutta, Bilal A.; Lewis, Clark H.
1992-01-01
Finned missile design entails accurate and computationally fast numerical techniques for predicting viscous flows over complex lifting configurations at small to moderate angles of attack and over Mach 3 to 15; these flows are often characterized by strong embedded shocks, so that numerical algorithms are also required to capture embedded shocks. The recent real-gas Flux Vector Splitting technique is here extended to investigate the Mach 3 flow over a typical finned missile configuration with/without side fin deflections. Elliptic grid-generation techniques for Mach 15 flows are shown to be inadequate for Mach 3 flows over finned configurations and need to be modified. Fin-deflection studies indicate that even small amounts of missile fin deflection can substantially modify vehicle aerodynamics. This 3D parabolized Navier-Stokes scheme is also extended into an efficient embedded algorithm for studying small axially separated flow regions due to strong fin and control surface deflections.
Implementation of a kappa-epsilon turbulence model to RPLUS3D code
NASA Technical Reports Server (NTRS)
Chitsomboon, Tawit
1992-01-01
The RPLUS3D code has been developed at the NASA Lewis Research Center to support the National Aerospace Plane (NASP) project. The code has the ability to solve three dimensional flowfields with finite rate combustion of hydrogen and air. The combustion process of the hydrogen-air system are simulated by an 18 reaction path, 8 species chemical kinetic mechanism. The code uses a Lower-Upper (LU) decomposition numerical algorithm as its basis, making it a very efficient and robust code. Except for the Jacobian matrix for the implicit chemistry source terms, there is no inversion of a matrix even though a fully implicit numerical algorithm is used. A k-epsilon turbulence model has recently been incorporated into the code. Initial validations have been conducted for a flow over a flat plate. Results of the validation studies are shown. Some difficulties in implementing the k-epsilon equations to the code are also discussed.
Implementation of a kappa-epsilon turbulence model to RPLUS3D code
NASA Astrophysics Data System (ADS)
Chitsomboon, Tawit
1992-02-01
The RPLUS3D code has been developed at the NASA Lewis Research Center to support the National Aerospace Plane (NASP) project. The code has the ability to solve three dimensional flowfields with finite rate combustion of hydrogen and air. The combustion process of the hydrogen-air system are simulated by an 18 reaction path, 8 species chemical kinetic mechanism. The code uses a Lower-Upper (LU) decomposition numerical algorithm as its basis, making it a very efficient and robust code. Except for the Jacobian matrix for the implicit chemistry source terms, there is no inversion of a matrix even though a fully implicit numerical algorithm is used. A k-epsilon turbulence model has recently been incorporated into the code. Initial validations have been conducted for a flow over a flat plate. Results of the validation studies are shown. Some difficulties in implementing the k-epsilon equations to the code are also discussed.
Computational singular perturbation analysis of stochastic chemical systems with stiffness
NASA Astrophysics Data System (ADS)
Wang, Lijin; Han, Xiaoying; Cao, Yanzhao; Najm, Habib N.
2017-04-01
Computational singular perturbation (CSP) is a useful method for analysis, reduction, and time integration of stiff ordinary differential equation systems. It has found dominant utility, in particular, in chemical reaction systems with a large range of time scales at continuum and deterministic level. On the other hand, CSP is not directly applicable to chemical reaction systems at micro or meso-scale, where stochasticity plays an non-negligible role and thus has to be taken into account. In this work we develop a novel stochastic computational singular perturbation (SCSP) analysis and time integration framework, and associated algorithm, that can be used to not only construct accurately and efficiently the numerical solutions to stiff stochastic chemical reaction systems, but also analyze the dynamics of the reduced stochastic reaction systems. The algorithm is illustrated by an application to a benchmark stochastic differential equation model, and numerical experiments are carried out to demonstrate the effectiveness of the construction.
Exploring Neutrino Oscillation Parameter Space with a Monte Carlo Algorithm
NASA Astrophysics Data System (ADS)
Espejel, Hugo; Ernst, David; Cogswell, Bernadette; Latimer, David
2015-04-01
The χ2 (or likelihood) function for a global analysis of neutrino oscillation data is first calculated as a function of the neutrino mixing parameters. A computational challenge is to obtain the minima or the allowed regions for the mixing parameters. The conventional approach is to calculate the χ2 (or likelihood) function on a grid for a large number of points, and then marginalize over the likelihood function. As the number of parameters increases with the number of neutrinos, making the calculation numerically efficient becomes necessary. We implement a new Monte Carlo algorithm (D. Foreman-Mackey, D. W. Hogg, D. Lang and J. Goodman, Publications of the Astronomical Society of the Pacific, 125 306 (2013)) to determine its computational efficiency at finding the minima and allowed regions. We examine a realistic example to compare the historical and the new methods.
Evaluation of a Multigrid Scheme for the Incompressible Navier-Stokes Equations
NASA Technical Reports Server (NTRS)
Swanson, R. C.
2004-01-01
A fast multigrid solver for the steady, incompressible Navier-Stokes equations is presented. The multigrid solver is based upon a factorizable discrete scheme for the velocity-pressure form of the Navier-Stokes equations. This scheme correctly distinguishes between the advection-diffusion and elliptic parts of the operator, allowing efficient smoothers to be constructed. To evaluate the multigrid algorithm, solutions are computed for flow over a flat plate, parabola, and a Karman-Trefftz airfoil. Both nonlifting and lifting airfoil flows are considered, with a Reynolds number range of 200 to 800. Convergence and accuracy of the algorithm are discussed. Using Gauss-Seidel line relaxation in alternating directions, multigrid convergence behavior approaching that of O(N) methods is achieved. The computational efficiency of the numerical scheme is compared with that of Runge-Kutta and implicit upwind based multigrid methods.
Optical properties reconstruction using the adjoint method based on the radiative transfer equation
NASA Astrophysics Data System (ADS)
Addoum, Ahmad; Farges, Olivier; Asllanaj, Fatmir
2018-01-01
An efficient algorithm is proposed to reconstruct the spatial distribution of optical properties in heterogeneous media like biological tissues. The light transport through such media is accurately described by the radiative transfer equation in the frequency-domain. The adjoint method is used to efficiently compute the objective function gradient with respect to optical parameters. Numerical tests show that the algorithm is accurate and robust to retrieve simultaneously the absorption μa and scattering μs coefficients for lowly and highly absorbing medium. Moreover, the simultaneous reconstruction of μs and the anisotropy factor g of the Henyey-Greenstein phase function is achieved with a reasonable accuracy. The main novelty in this work is the reconstruction of g which might open the possibility to image this parameter in tissues as an additional contrast agent in optical tomography.
Generalized Gilat-Raubenheimer method for density-of-states calculation in photonic crystals
NASA Astrophysics Data System (ADS)
Liu, Boyuan; Johnson, Steven G.; Joannopoulos, John D.; Lu, Ling
2018-04-01
An efficient numerical algorithm is the key for accurate evaluation of density of states (DOS) in band theory. The Gilat-Raubenheimer (GR) method proposed in 1966 is an efficient linear extrapolation method which was limited in specific lattices. Here, using an affine transformation, we provide a new generalization of the original GR method to any Bravais lattices and show that it is superior to the tetrahedron method and the adaptive Gaussian broadening method. Finally, we apply our generalized GR method to compute DOS of various gyroid photonic crystals of topological degeneracies.
Computing generalized Langevin equations and generalized Fokker-Planck equations.
Darve, Eric; Solomon, Jose; Kia, Amirali
2009-07-07
The Mori-Zwanzig formalism is an effective tool to derive differential equations describing the evolution of a small number of resolved variables. In this paper we present its application to the derivation of generalized Langevin equations and generalized non-Markovian Fokker-Planck equations. We show how long time scales rates and metastable basins can be extracted from these equations. Numerical algorithms are proposed to discretize these equations. An important aspect is the numerical solution of the orthogonal dynamics equation which is a partial differential equation in a high dimensional space. We propose efficient numerical methods to solve this orthogonal dynamics equation. In addition, we present a projection formalism of the Mori-Zwanzig type that is applicable to discrete maps. Numerical applications are presented from the field of Hamiltonian systems.
NASA Astrophysics Data System (ADS)
Bhrawy, A. H.; Doha, E. H.; Baleanu, D.; Ezz-Eldien, S. S.
2015-07-01
In this paper, an efficient and accurate spectral numerical method is presented for solving second-, fourth-order fractional diffusion-wave equations and fractional wave equations with damping. The proposed method is based on Jacobi tau spectral procedure together with the Jacobi operational matrix for fractional integrals, described in the Riemann-Liouville sense. The main characteristic behind this approach is to reduce such problems to those of solving systems of algebraic equations in the unknown expansion coefficients of the sought-for spectral approximations. The validity and effectiveness of the method are demonstrated by solving five numerical examples. Numerical examples are presented in the form of tables and graphs to make comparisons with the results obtained by other methods and with the exact solutions more easier.
On recursive least-squares filtering algorithms and implementations. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Hsieh, Shih-Fu
1990-01-01
In many real-time signal processing applications, fast and numerically stable algorithms for solving least-squares problems are necessary and important. In particular, under non-stationary conditions, these algorithms must be able to adapt themselves to reflect the changes in the system and take appropriate adjustments to achieve optimum performances. Among existing algorithms, the QR-decomposition (QRD)-based recursive least-squares (RLS) methods have been shown to be useful and effective for adaptive signal processing. In order to increase the speed of processing and achieve high throughput rate, many algorithms are being vectorized and/or pipelined to facilitate high degrees of parallelism. A time-recursive formulation of RLS filtering employing block QRD will be considered first. Several methods, including a new non-continuous windowing scheme based on selectively rejecting contaminated data, were investigated for adaptive processing. Based on systolic triarrays, many other forms of systolic arrays are shown to be capable of implementing different algorithms. Various updating and downdating systolic algorithms and architectures for RLS filtering are examined and compared in details, which include Householder reflector, Gram-Schmidt procedure, and Givens rotation. A unified approach encompassing existing square-root-free algorithms is also proposed. For the sinusoidal spectrum estimation problem, a judicious method of separating the noise from the signal is of great interest. Various truncated QR methods are proposed for this purpose and compared to the truncated SVD method. Computer simulations provided for detailed comparisons show the effectiveness of these methods. This thesis deals with fundamental issues of numerical stability, computational efficiency, adaptivity, and VLSI implementation for the RLS filtering problems. In all, various new and modified algorithms and architectures are proposed and analyzed; the significance of any of the new method depends crucially on specific application.
NASA Astrophysics Data System (ADS)
Li, Jia; Wang, Qiang; Yan, Wenjie; Shen, Yi
2015-12-01
Cooperative spectrum sensing exploits the spatial diversity to improve the detection of occupied channels in cognitive radio networks (CRNs). Cooperative compressive spectrum sensing (CCSS) utilizing the sparsity of channel occupancy further improves the efficiency by reducing the number of reports without degrading detection performance. In this paper, we firstly and mainly propose the referred multi-candidate orthogonal matrix matching pursuit (MOMMP) algorithms to efficiently and effectively detect occupied channels at fusion center (FC), where multi-candidate identification and orthogonal projection are utilized to respectively reduce the number of required iterations and improve the probability of exact identification. Secondly, two common but different approaches based on threshold and Gaussian distribution are introduced to realize the multi-candidate identification. Moreover, to improve the detection accuracy and energy efficiency, we propose the matrix construction based on shrinkage and gradient descent (MCSGD) algorithm to provide a deterministic filter coefficient matrix of low t-average coherence. Finally, several numerical simulations validate that our proposals provide satisfactory performance with higher probability of detection, lower probability of false alarm and less detection time.
A manpower scheduling heuristic for aircraft maintenance application
NASA Astrophysics Data System (ADS)
Sze, San-Nah; Sze, Jeeu-Fong; Chiew, Kang-Leng
2012-09-01
This research studies a manpower scheduling for aircraft maintenance, focusing on in-flight food loading operation. A group of loading teams with flexible shifts is required to deliver and upload packaged meals from the ground kitchen to aircrafts in multiple trips. All aircrafts must be served within predefined time windows. The scheduling process takes into account of various constraints such as meal break allocation, multi-trip traveling and food exposure time limit. Considering the aircrafts movement and predefined maximum working hours for each loading team, the main objective of this study is to form an efficient roster by assigning a minimum number of loading teams to the aircrafts. We proposed an insertion based heuristic to generate the solutions in a short period of time for large instances. This proposed algorithm is implemented in various stages for constructing trips due to the presence of numerous constraints. The robustness and efficiency of the algorithm is demonstrated in computational results. The results show that the insertion heuristic more efficiently outperforms the company's current practice.
NASA Astrophysics Data System (ADS)
Babbush, Ryan; Berry, Dominic W.; Sanders, Yuval R.; Kivlichan, Ian D.; Scherer, Artur; Wei, Annie Y.; Love, Peter J.; Aspuru-Guzik, Alán
2018-01-01
We present a quantum algorithm for the simulation of molecular systems that is asymptotically more efficient than all previous algorithms in the literature in terms of the main problem parameters. As in Babbush et al (2016 New Journal of Physics 18, 033032), we employ a recently developed technique for simulating Hamiltonian evolution using a truncated Taylor series to obtain logarithmic scaling with the inverse of the desired precision. The algorithm of this paper involves simulation under an oracle for the sparse, first-quantized representation of the molecular Hamiltonian known as the configuration interaction (CI) matrix. We construct and query the CI matrix oracle to allow for on-the-fly computation of molecular integrals in a way that is exponentially more efficient than classical numerical methods. Whereas second-quantized representations of the wavefunction require \\widetilde{{ O }}(N) qubits, where N is the number of single-particle spin-orbitals, the CI matrix representation requires \\widetilde{{ O }}(η ) qubits, where η \\ll N is the number of electrons in the molecule of interest. We show that the gate count of our algorithm scales at most as \\widetilde{{ O }}({η }2{N}3t).
Quantum computation and analysis of Wigner and Husimi functions: toward a quantum image treatment.
Terraneo, M; Georgeot, B; Shepelyansky, D L
2005-06-01
We study the efficiency of quantum algorithms which aim at obtaining phase-space distribution functions of quantum systems. Wigner and Husimi functions are considered. Different quantum algorithms are envisioned to build these functions, and compared with the classical computation. Different procedures to extract more efficiently information from the final wave function of these algorithms are studied, including coarse-grained measurements, amplitude amplification, and measure of wavelet-transformed wave function. The algorithms are analyzed and numerically tested on a complex quantum system showing different behavior depending on parameters: namely, the kicked rotator. The results for the Wigner function show in particular that the use of the quantum wavelet transform gives a polynomial gain over classical computation. For the Husimi distribution, the gain is much larger than for the Wigner function and is larger with the help of amplitude amplification and wavelet transforms. We discuss the generalization of these results to the simulation of other quantum systems. We also apply the same set of techniques to the analysis of real images. The results show that the use of the quantum wavelet transform allows one to lower dramatically the number of measurements needed, but at the cost of a large loss of information.
Efficient experimental design of high-fidelity three-qubit quantum gates via genetic programming
NASA Astrophysics Data System (ADS)
Devra, Amit; Prabhu, Prithviraj; Singh, Harpreet; Arvind; Dorai, Kavita
2018-03-01
We have designed efficient quantum circuits for the three-qubit Toffoli (controlled-controlled-NOT) and the Fredkin (controlled-SWAP) gate, optimized via genetic programming methods. The gates thus obtained were experimentally implemented on a three-qubit NMR quantum information processor, with a high fidelity. Toffoli and Fredkin gates in conjunction with the single-qubit Hadamard gates form a universal gate set for quantum computing and are an essential component of several quantum algorithms. Genetic algorithms are stochastic search algorithms based on the logic of natural selection and biological genetics and have been widely used for quantum information processing applications. We devised a new selection mechanism within the genetic algorithm framework to select individuals from a population. We call this mechanism the "Luck-Choose" mechanism and were able to achieve faster convergence to a solution using this mechanism, as compared to existing selection mechanisms. The optimization was performed under the constraint that the experimentally implemented pulses are of short duration and can be implemented with high fidelity. We demonstrate the advantage of our pulse sequences by comparing our results with existing experimental schemes and other numerical optimization methods.
NASA Astrophysics Data System (ADS)
Zhang, Jiangjiang; Lin, Guang; Li, Weixuan; Wu, Laosheng; Zeng, Lingzao
2018-03-01
Ensemble smoother (ES) has been widely used in inverse modeling of hydrologic systems. However, for problems where the distribution of model parameters is multimodal, using ES directly would be problematic. One popular solution is to use a clustering algorithm to identify each mode and update the clusters with ES separately. However, this strategy may not be very efficient when the dimension of parameter space is high or the number of modes is large. Alternatively, we propose in this paper a very simple and efficient algorithm, i.e., the iterative local updating ensemble smoother (ILUES), to explore multimodal distributions of model parameters in nonlinear hydrologic systems. The ILUES algorithm works by updating local ensembles of each sample with ES to explore possible multimodal distributions. To achieve satisfactory data matches in nonlinear problems, we adopt an iterative form of ES to assimilate the measurements multiple times. Numerical cases involving nonlinearity and multimodality are tested to illustrate the performance of the proposed method. It is shown that overall the ILUES algorithm can well quantify the parametric uncertainties of complex hydrologic models, no matter whether the multimodal distribution exists.
2012-01-01
Background Chaos Game Representation (CGR) is an iterated function that bijectively maps discrete sequences into a continuous domain. As a result, discrete sequences can be object of statistical and topological analyses otherwise reserved to numerical systems. Characteristically, CGR coordinates of substrings sharing an L-long suffix will be located within 2-L distance of each other. In the two decades since its original proposal, CGR has been generalized beyond its original focus on genomic sequences and has been successfully applied to a wide range of problems in bioinformatics. This report explores the possibility that it can be further extended to approach algorithms that rely on discrete, graph-based representations. Results The exploratory analysis described here consisted of selecting foundational string problems and refactoring them using CGR-based algorithms. We found that CGR can take the role of suffix trees and emulate sophisticated string algorithms, efficiently solving exact and approximate string matching problems such as finding all palindromes and tandem repeats, and matching with mismatches. The common feature of these problems is that they use longest common extension (LCE) queries as subtasks of their procedures, which we show to have a constant time solution with CGR. Additionally, we show that CGR can be used as a rolling hash function within the Rabin-Karp algorithm. Conclusions The analysis of biological sequences relies on algorithmic foundations facing mounting challenges, both logistic (performance) and analytical (lack of unifying mathematical framework). CGR is found to provide the latter and to promise the former: graph-based data structures for sequence analysis operations are entailed by numerical-based data structures produced by CGR maps, providing a unifying analytical framework for a diversity of pattern matching problems. PMID:22551152
Advancing MODFLOW Applying the Derived Vector Space Method
NASA Astrophysics Data System (ADS)
Herrera, G. S.; Herrera, I.; Lemus-García, M.; Hernandez-Garcia, G. D.
2015-12-01
The most effective domain decomposition methods (DDM) are non-overlapping DDMs. Recently a new approach, the DVS-framework, based on an innovative discretization method that uses a non-overlapping system of nodes (the derived-nodes), was introduced and developed by I. Herrera et al. [1, 2]. Using the DVS-approach a group of four algorithms, referred to as the 'DVS-algorithms', which fulfill the DDM-paradigm (i.e. the solution of global problems is obtained by resolution of local problems exclusively) has been derived. Such procedures are applicable to any boundary-value problem, or system of such equations, for which a standard discretization method is available and then software with a high degree of parallelization can be constructed. In a parallel talk, in this AGU Fall Meeting, Ismael Herrera will introduce the general DVS methodology. The application of the DVS-algorithms has been demonstrated in the solution of several boundary values problems of interest in Geophysics. Numerical examples for a single-equation, for the cases of symmetric, non-symmetric and indefinite problems were demonstrated before [1,2]. For these problems DVS-algorithms exhibited significantly improved numerical performance with respect to standard versions of DDM algorithms. In view of these results our research group is in the process of applying the DVS method to a widely used simulator for the first time, here we present the advances of the application of this method for the parallelization of MODFLOW. Efficiency results for a group of tests will be presented. References [1] I. Herrera, L.M. de la Cruz and A. Rosas-Medina. Non overlapping discretization methods for partial differential equations, Numer Meth Part D E, (2013). [2] Herrera, I., & Contreras Iván "An Innovative Tool for Effectively Applying Highly Parallelized Software To Problems of Elasticity". Geofísica Internacional, 2015 (In press)
NASA Technical Reports Server (NTRS)
Korkin, S.; Lyapustin, A.
2012-01-01
The Levenberg-Marquardt algorithm [1, 2] provides a numerical iterative solution to the problem of minimization of a function over a space of its parameters. In our work, the Levenberg-Marquardt algorithm retrieves optical parameters of a thin (single scattering) plane parallel atmosphere irradiated by collimated infinitely wide monochromatic beam of light. Black ground surface is assumed. Computational accuracy, sensitivity to the initial guess and the presence of noise in the signal, and other properties of the algorithm are investigated in scalar (using intensity only) and vector (including polarization) modes. We consider an atmosphere that contains a mixture of coarse and fine fractions. Following [3], the fractions are simulated using Henyey-Greenstein model. Though not realistic, this assumption is very convenient for tests [4, p.354]. In our case it yields analytical evaluation of Jacobian matrix. Assuming the MISR geometry of observation [5] as an example, the average scattering cosines and the ratio of coarse and fine fractions, the atmosphere optical depth, and the single scattering albedo, are the five parameters to be determined numerically. In our implementation of the algorithm, the system of five linear equations is solved using the fast Cramer s rule [6]. A simple subroutine developed by the authors, makes the algorithm independent from external libraries. All Fortran 90/95 codes discussed in the presentation will be available immediately after the meeting from sergey.v.korkin@nasa.gov by request.
NASA Astrophysics Data System (ADS)
Taitano, W. T.; Chacón, L.; Simakov, A. N.; Molvig, K.
2015-09-01
In this study, we demonstrate a fully implicit algorithm for the multi-species, multidimensional Rosenbluth-Fokker-Planck equation which is exactly mass-, momentum-, and energy-conserving, and which preserves positivity. Unlike most earlier studies, we base our development on the Rosenbluth (rather than Landau) form of the Fokker-Planck collision operator, which reduces complexity while allowing for an optimal fully implicit treatment. Our discrete conservation strategy employs nonlinear constraints that force the continuum symmetries of the collision operator to be satisfied upon discretization. We converge the resulting nonlinear system iteratively using Jacobian-free Newton-Krylov methods, effectively preconditioned with multigrid methods for efficiency. Single- and multi-species numerical examples demonstrate the advertised accuracy properties of the scheme, and the superior algorithmic performance of our approach. In particular, the discretization approach is numerically shown to be second-order accurate in time and velocity space and to exhibit manifestly positive entropy production. That is, H-theorem behavior is indicated for all the examples we have tested. The solution approach is demonstrated to scale optimally with respect to grid refinement (with CPU time growing linearly with the number of mesh points), and timestep (showing very weak dependence of CPU time with time-step size). As a result, the proposed algorithm delivers several orders-of-magnitude speedup vs. explicit algorithms.
Improved diffusion Monte Carlo propagators for bosonic systems using Itô calculus
NASA Astrophysics Data System (ADS)
Hâkansson, P.; Mella, M.; Bressanini, Dario; Morosi, Gabriele; Patrone, Marta
2006-11-01
The construction of importance sampled diffusion Monte Carlo (DMC) schemes accurate to second order in the time step is discussed. A central aspect in obtaining efficient second order schemes is the numerical solution of the stochastic differential equation (SDE) associated with the Fokker-Plank equation responsible for the importance sampling procedure. In this work, stochastic predictor-corrector schemes solving the SDE and consistent with Itô calculus are used in DMC simulations of helium clusters. These schemes are numerically compared with alternative algorithms obtained by splitting the Fokker-Plank operator, an approach that we analyze using the analytical tools provided by Itô calculus. The numerical results show that predictor-corrector methods are indeed accurate to second order in the time step and that they present a smaller time step bias and a better efficiency than second order split-operator derived schemes when computing ensemble averages for bosonic systems. The possible extension of the predictor-corrector methods to higher orders is also discussed.
Efficient Algorithms for Estimating the Absorption Spectrum within Linear Response TDDFT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brabec, Jiri; Lin, Lin; Shao, Meiyue
We present a special symmetric Lanczos algorithm and a kernel polynomial method (KPM) for approximating the absorption spectrum of molecules within the linear response time-dependent density functional theory (TDDFT) framework in the product form. In contrast to existing algorithms, the new algorithms are based on reformulating the original non-Hermitian eigenvalue problem as a product eigenvalue problem and the observation that the product eigenvalue problem is self-adjoint with respect to an appropriately chosen inner product. This allows a simple symmetric Lanczos algorithm to be used to compute the desired absorption spectrum. The use of a symmetric Lanczos algorithm only requires halfmore » of the memory compared with the nonsymmetric variant of the Lanczos algorithm. The symmetric Lanczos algorithm is also numerically more stable than the nonsymmetric version. The KPM algorithm is also presented as a low-memory alternative to the Lanczos approach, but the algorithm may require more matrix-vector multiplications in practice. We discuss the pros and cons of these methods in terms of their accuracy as well as their computational and storage cost. Applications to a set of small and medium-sized molecules are also presented.« less
Efficient Algorithms for Estimating the Absorption Spectrum within Linear Response TDDFT
Brabec, Jiri; Lin, Lin; Shao, Meiyue; ...
2015-10-06
We present a special symmetric Lanczos algorithm and a kernel polynomial method (KPM) for approximating the absorption spectrum of molecules within the linear response time-dependent density functional theory (TDDFT) framework in the product form. In contrast to existing algorithms, the new algorithms are based on reformulating the original non-Hermitian eigenvalue problem as a product eigenvalue problem and the observation that the product eigenvalue problem is self-adjoint with respect to an appropriately chosen inner product. This allows a simple symmetric Lanczos algorithm to be used to compute the desired absorption spectrum. The use of a symmetric Lanczos algorithm only requires halfmore » of the memory compared with the nonsymmetric variant of the Lanczos algorithm. The symmetric Lanczos algorithm is also numerically more stable than the nonsymmetric version. The KPM algorithm is also presented as a low-memory alternative to the Lanczos approach, but the algorithm may require more matrix-vector multiplications in practice. We discuss the pros and cons of these methods in terms of their accuracy as well as their computational and storage cost. Applications to a set of small and medium-sized molecules are also presented.« less
NASA Astrophysics Data System (ADS)
Ezz-Eldien, S. S.; Doha, E. H.; Bhrawy, A. H.; El-Kalaawy, A. A.; Machado, J. A. T.
2018-04-01
In this paper, we propose a new accurate and robust numerical technique to approximate the solutions of fractional variational problems (FVPs) depending on indefinite integrals with a type of fixed Riemann-Liouville fractional integral. The proposed technique is based on the shifted Chebyshev polynomials as basis functions for the fractional integral operational matrix (FIOM). Together with the Lagrange multiplier method, these problems are then reduced to a system of algebraic equations, which greatly simplifies the solution process. Numerical examples are carried out to confirm the accuracy, efficiency and applicability of the proposed algorithm
An efficient sparse matrix multiplication scheme for the CYBER 205 computer
NASA Technical Reports Server (NTRS)
Lambiotte, Jules J., Jr.
1988-01-01
This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
NASA Astrophysics Data System (ADS)
Zhuang, Wei; Mountrakis, Giorgos
2014-09-01
Large footprint waveform LiDAR sensors have been widely used for numerous airborne studies. Ground peak identification in a large footprint waveform is a significant bottleneck in exploring full usage of the waveform datasets. In the current study, an accurate and computationally efficient algorithm was developed for ground peak identification, called Filtering and Clustering Algorithm (FICA). The method was evaluated on Land, Vegetation, and Ice Sensor (LVIS) waveform datasets acquired over Central NY. FICA incorporates a set of multi-scale second derivative filters and a k-means clustering algorithm in order to avoid detecting false ground peaks. FICA was tested in five different land cover types (deciduous trees, coniferous trees, shrub, grass and developed area) and showed more accurate results when compared to existing algorithms. More specifically, compared with Gaussian decomposition, the RMSE ground peak identification by FICA was 2.82 m (5.29 m for GD) in deciduous plots, 3.25 m (4.57 m for GD) in coniferous plots, 2.63 m (2.83 m for GD) in shrub plots, 0.82 m (0.93 m for GD) in grass plots, and 0.70 m (0.51 m for GD) in plots of developed areas. FICA performance was also relatively consistent under various slope and canopy coverage (CC) conditions. In addition, FICA showed better computational efficiency compared to existing methods. FICA's major computational and accuracy advantage is a result of the adopted multi-scale signal processing procedures that concentrate on local portions of the signal as opposed to the Gaussian decomposition that uses a curve-fitting strategy applied in the entire signal. The FICA algorithm is a good candidate for large-scale implementation on future space-borne waveform LiDAR sensors.
Sensitivity analysis of dynamic biological systems with time-delays.
Wu, Wu Hsiung; Wang, Feng Sheng; Chang, Maw Shang
2010-10-15
Mathematical modeling has been applied to the study and analysis of complex biological systems for a long time. Some processes in biological systems, such as the gene expression and feedback control in signal transduction networks, involve a time delay. These systems are represented as delay differential equation (DDE) models. Numerical sensitivity analysis of a DDE model by the direct method requires the solutions of model and sensitivity equations with time-delays. The major effort is the computation of Jacobian matrix when computing the solution of sensitivity equations. The computation of partial derivatives of complex equations either by the analytic method or by symbolic manipulation is time consuming, inconvenient, and prone to introduce human errors. To address this problem, an automatic approach to obtain the derivatives of complex functions efficiently and accurately is necessary. We have proposed an efficient algorithm with an adaptive step size control to compute the solution and dynamic sensitivities of biological systems described by ordinal differential equations (ODEs). The adaptive direct-decoupled algorithm is extended to solve the solution and dynamic sensitivities of time-delay systems describing by DDEs. To save the human effort and avoid the human errors in the computation of partial derivatives, an automatic differentiation technique is embedded in the extended algorithm to evaluate the Jacobian matrix. The extended algorithm is implemented and applied to two realistic models with time-delays: the cardiovascular control system and the TNF-α signal transduction network. The results show that the extended algorithm is a good tool for dynamic sensitivity analysis on DDE models with less user intervention. By comparing with direct-coupled methods in theory, the extended algorithm is efficient, accurate, and easy to use for end users without programming background to do dynamic sensitivity analysis on complex biological systems with time-delays.
NASA Astrophysics Data System (ADS)
Brakensiek, Joshua; Ragozzine, D.
2012-10-01
The transit method for discovering extra-solar planets relies on detecting regular diminutions of light from stars due to the shadows of planets passing in between the star and the observer. NASA's Kepler Mission has successfully discovered thousands of exoplanet candidates using this technique, including hundreds of stars with multiple transiting planets. In order to estimate the frequency of these valuable systems, our research concerns the efficient calculation of geometric probabilities for detecting multiple transiting extrasolar planets around the same parent star. In order to improve on previous studies that used numerical methods (e.g., Ragozzine & Holman 2010, Tremaine & Dong 2011), we have constructed an efficient, analytical algorithm which, given a collection of conjectured exoplanets orbiting a star, computes the probability that any particular group of exoplanets are transiting. The algorithm applies theorems of elementary differential geometry to compute the areas bounded by circular curves on the surface of a sphere (see Ragozzine & Holman 2010). The implemented algorithm is more accurate and orders of magnitude faster than previous algorithms, based on comparison with Monte Carlo simulations. Expanding this work, we have also developed semi-analytical methods for determining the frequency of exoplanet mutual events, i.e., the geometric probability two planets will transit each other (Planet-Planet Occultation) and the probability that this transit occurs simultaneously as they transit their star (Overlapping Double Transits; see Ragozzine & Holman 2010). The latter algorithm can also be applied to calculating the probability of observing transiting circumbinary planets (Doyle et al. 2011, Welsh et al. 2012). All of these algorithms have been coded in C and will be made publicly available. We will present and advertise these codes and illustrate their value for studying exoplanetary systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Guangye; Chacon, Luis; Barnes, Daniel C
2012-01-01
Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been developed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230, 18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver and is capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle orbit integrations from the field solver, while remaining fully self-consistent. This provides great flexibility, and dramatically improves the solver efficiency by reducing the degrees of freedom of the associated nonlinear system. However, it requires a particle push per nonlinearmore » residual evaluation, which makes the particle push the most time-consuming operation in the algorithm. This paper describes a very efficient mixed-precision, hybrid CPU-GPU implementation of the implicit PIC algorithm. The JFNK solver is kept on the CPU (in double precision), while the inherent data parallelism of the particle mover is exploited by implementing it in single-precision on a graphics processing unit (GPU) using CUDA. Performance-oriented optimizations, with the aid of an analytical performance model, the roofline model, are employed. Despite being highly dynamic, the adaptive, charge-conserving particle mover algorithm achieves up to 300 400 GOp/s (including single-precision floating-point, integer, and logic operations) on a Nvidia GeForce GTX580, corresponding to 20 25% absolute GPU efficiency (against the peak theoretical performance) and 50-70% intrinsic efficiency (against the algorithm s maximum operational throughput, which neglects all latencies). This is about 200-300 times faster than an equivalent serial CPU implementation. When the single-precision GPU particle mover is combined with a double-precision CPU JFNK field solver, overall performance gains 100 vs. the double-precision CPU-only serial version are obtained, with no apparent loss of robustness or accuracy when applied to a challenging long-time scale ion acoustic wave simulation.« less
7Be and hydrological model for more efficient implementation of erosion control measure
NASA Astrophysics Data System (ADS)
Al-Barri, Bashar; Bode, Samuel; Blake, William; Ryken, Nick; Cornelis, Wim; Boeckx, Pascal
2014-05-01
Increased concern about the on-site and off-site impacts of soil erosion in agricultural and forested areas has endorsed interest in innovative methods to assess in an unbiased way spatial and temporal soil erosion rates and redistribution patterns. Hence, interest in precisely estimating the magnitude of the problem and therefore applying erosion control measures (ECM) more efficiently. The latest generation of physically-based hydrological models, which fully couple overland flow and subsurface flow in three dimensions, permit implementing ECM in small and large scales more effectively if coupled with a sediment transport algorithm. While many studies focused on integrating empirical or numerical models based on traditional erosion budget measurements into 3D hydrological models, few studies evaluated the efficiency of ECM on watershed scale and very little attention is given to the potentials of environmental Fallout Radio-Nuclides (FRNs) in such applications. The use of FRN tracer 7Be in soil erosion/deposition research proved to overcome many (if not all) of the problems associated with the conventional approaches providing reliable data for efficient land use management. This poster will underline the pros and cones of using conventional methods and 7Be tracers to evaluate the efficiency of coconuts dams installed as ECM in experimental field in Belgium. It will also outline the potentials of 7Be in providing valuable inputs for evolving the numerical sediment transport algorithm needed for the hydrological model on field scale leading to assess the possibility of using this short-lived tracer as a validation tool for the upgraded hydrological model on watershed scale in further steps. Keywords: FRN, erosion control measures, hydrological modes
Theoretical and numerical studies of chaotic mixing
NASA Astrophysics Data System (ADS)
Kim, Ho Jun
Theoretical and numerical studies of chaotic mixing are performed to circumvent the difficulties of efficient mixing, which come from the lack of turbulence in microfluidic devices. In order to carry out efficient and accurate parametric studies and to identify a fully chaotic state, a spectral element algorithm for solution of the incompressible Navier-Stokes and species transport equations is developed. Using Taylor series expansions in time marching, the new algorithm employs an algebraic factorization scheme on multi-dimensional staggered spectral element grids, and extends classical conforming Galerkin formulations to nonconforming spectral elements. Lagrangian particle tracking methods are utilized to study particle dispersion in the mixing device using spectral element and fourth order Runge-Kutta discretizations in space and time, respectively. Comparative studies of five different techniques commonly employed to identify the chaotic strength and mixing efficiency in microfluidic systems are presented to demonstrate the competitive advantages and shortcomings of each method. These are the stirring index based on the box counting method, Poincare sections, finite time Lyapunov exponents, the probability density function of the stretching field, and mixing index inverse, based on the standard deviation of scalar species distribution. Series of numerical simulations are performed by varying the Peclet number (Pe) at fixed kinematic conditions. The mixing length (lm) is characterized as function of the Pe number, and lm ∝ ln(Pe) scaling is demonstrated for fully chaotic cases. Employing the aforementioned techniques, optimum kinematic conditions and the actuation frequency of the stirrer that result in the highest mixing/stirring efficiency are identified in a zeta potential patterned straight micro channel, where a continuous flow is generated by superposition of a steady pressure driven flow and time periodic electroosmotic flow induced by a stream-wise AC electric field. Finally, it is shown that the invariant manifold of hyperbolic periodic point determines the geometry of fast mixing zones in oscillatory flows in two-dimensional cavity.
Reliability enhancement of Navier-Stokes codes through convergence acceleration
NASA Technical Reports Server (NTRS)
Merkle, Charles L.; Dulikravich, George S.
1995-01-01
Methods for enhancing the reliability of Navier-Stokes computer codes through improving convergence characteristics are presented. The improving of these characteristics decreases the likelihood of code unreliability and user interventions in a design environment. The problem referred to as a 'stiffness' in the governing equations for propulsion-related flowfields is investigated, particularly in regard to common sources of equation stiffness that lead to convergence degradation of CFD algorithms. Von Neumann stability theory is employed as a tool to study the convergence difficulties involved. Based on the stability results, improved algorithms are devised to ensure efficient convergence in different situations. A number of test cases are considered to confirm a correlation between stability theory and numerical convergence. The examples of turbulent and reacting flow are presented, and a generalized form of the preconditioning matrix is derived to handle these problems, i.e., the problems involving additional differential equations for describing the transport of turbulent kinetic energy, dissipation rate and chemical species. Algorithms for unsteady computations are considered. The extension of the preconditioning techniques and algorithms derived for Navier-Stokes computations to three-dimensional flow problems is discussed. New methods to accelerate the convergence of iterative schemes for the numerical integration of systems of partial differential equtions are developed, with a special emphasis on the acceleration of convergence on highly clustered grids.
Dang, C; Xu, L
2001-03-01
In this paper a globally convergent Lagrange and barrier function iterative algorithm is proposed for approximating a solution of the traveling salesman problem. The algorithm employs an entropy-type barrier function to deal with nonnegativity constraints and Lagrange multipliers to handle linear equality constraints, and attempts to produce a solution of high quality by generating a minimum point of a barrier problem for a sequence of descending values of the barrier parameter. For any given value of the barrier parameter, the algorithm searches for a minimum point of the barrier problem in a feasible descent direction, which has a desired property that the nonnegativity constraints are always satisfied automatically if the step length is a number between zero and one. At each iteration the feasible descent direction is found by updating Lagrange multipliers with a globally convergent iterative procedure. For any given value of the barrier parameter, the algorithm converges to a stationary point of the barrier problem without any condition on the objective function. Theoretical and numerical results show that the algorithm seems more effective and efficient than the softassign algorithm.
NASA Astrophysics Data System (ADS)
Zakirov, Andrey; Belousov, Sergei; Valuev, Ilya; Levchenko, Vadim; Perepelkina, Anastasia; Zempo, Yasunari
2017-10-01
We demonstrate an efficient approach to numerical modeling of optical properties of large-scale structures with typical dimensions much greater than the wavelength of light. For this purpose, we use the finite-difference time-domain (FDTD) method enhanced with a memory efficient Locally Recursive non-Locally Asynchronous (LRnLA) algorithm called DiamondTorre and implemented for General Purpose Graphical Processing Units (GPGPU) architecture. We apply our approach to simulation of optical properties of organic light emitting diodes (OLEDs), which is an essential step in the process of designing OLEDs with improved efficiency. Specifically, we consider a problem of excitation and propagation of surface plasmon polaritons (SPPs) in a typical OLED, which is a challenging task given that SPP decay length can be about two orders of magnitude greater than the wavelength of excitation. We show that with our approach it is possible to extend the simulated volume size sufficiently so that SPP decay dynamics is accounted for. We further consider an OLED with periodically corrugated metallic cathode and show how the SPP decay length can be greatly reduced due to scattering off the corrugation. Ultimately, we compare the performance of our algorithm to the conventional FDTD and demonstrate that our approach can efficiently be used for large-scale FDTD simulations with the use of only a single GPGPU-powered workstation, which is not practically feasible with the conventional FDTD.
Numerical calculation of the Fresnel transform.
Kelly, Damien P
2014-04-01
In this paper, we address the problem of calculating Fresnel diffraction integrals using a finite number of uniformly spaced samples. General and simple sampling rules of thumb are derived that allow the user to calculate the distribution for any propagation distance. It is shown how these rules can be extended to fast-Fourier-transform-based algorithms to increase calculation efficiency. A comparison with other theoretical approaches is made.
Generalized Redistribute-to-the-Right Algorithm: Application to the Analysis of Censored Cost Data
CHEN, SHUAI; ZHAO, HONGWEI
2013-01-01
Medical cost estimation is a challenging task when censoring of data is present. Although researchers have proposed methods for estimating mean costs, these are often derived from theory and are not always easy to understand. We provide an alternative method, based on a replace-from-the-right algorithm, for estimating mean costs more efficiently. We show that our estimator is equivalent to an existing one that is based on the inverse probability weighting principle and semiparametric efficiency theory. We also propose an alternative method for estimating the survival function of costs, based on the redistribute-to-the-right algorithm, that was originally used for explaining the Kaplan–Meier estimator. We show that this second proposed estimator is equivalent to a simple weighted survival estimator of costs. Finally, we develop a more efficient survival estimator of costs, using the same redistribute-to-the-right principle. This estimator is naturally monotone, more efficient than some existing survival estimators, and has a quite small bias in many realistic settings. We conduct numerical studies to examine the finite sample property of the survival estimators for costs, and show that our new estimator has small mean squared errors when the sample size is not too large. We apply both existing and new estimators to a data example from a randomized cardiovascular clinical trial. PMID:24403869
Parameter identification for structural dynamics based on interval analysis algorithm
NASA Astrophysics Data System (ADS)
Yang, Chen; Lu, Zixing; Yang, Zhenyu; Liang, Ke
2018-04-01
A parameter identification method using interval analysis algorithm for structural dynamics is presented in this paper. The proposed uncertain identification method is investigated by using central difference method and ARMA system. With the help of the fixed memory least square method and matrix inverse lemma, a set-membership identification technology is applied to obtain the best estimation of the identified parameters in a tight and accurate region. To overcome the lack of insufficient statistical description of the uncertain parameters, this paper treats uncertainties as non-probabilistic intervals. As long as we know the bounds of uncertainties, this algorithm can obtain not only the center estimations of parameters, but also the bounds of errors. To improve the efficiency of the proposed method, a time-saving algorithm is presented by recursive formula. At last, to verify the accuracy of the proposed method, two numerical examples are applied and evaluated by three identification criteria respectively.
A Decentralized Eigenvalue Computation Method for Spectrum Sensing Based on Average Consensus
NASA Astrophysics Data System (ADS)
Mohammadi, Jafar; Limmer, Steffen; Stańczak, Sławomir
2016-07-01
This paper considers eigenvalue estimation for the decentralized inference problem for spectrum sensing. We propose a decentralized eigenvalue computation algorithm based on the power method, which is referred to as generalized power method GPM; it is capable of estimating the eigenvalues of a given covariance matrix under certain conditions. Furthermore, we have developed a decentralized implementation of GPM by splitting the iterative operations into local and global computation tasks. The global tasks require data exchange to be performed among the nodes. For this task, we apply an average consensus algorithm to efficiently perform the global computations. As a special case, we consider a structured graph that is a tree with clusters of nodes at its leaves. For an accelerated distributed implementation, we propose to use computation over multiple access channel (CoMAC) as a building block of the algorithm. Numerical simulations are provided to illustrate the performance of the two algorithms.
Algorithm to determine the percolation largest component in interconnected networks.
Schneider, Christian M; Araújo, Nuno A M; Herrmann, Hans J
2013-04-01
Interconnected networks have been shown to be much more vulnerable to random and targeted failures than isolated ones, raising several interesting questions regarding the identification and mitigation of their risk. The paradigm to address these questions is the percolation model, where the resilience of the system is quantified by the dependence of the size of the largest cluster on the number of failures. Numerically, the major challenge is the identification of this cluster and the calculation of its size. Here, we propose an efficient algorithm to tackle this problem. We show that the algorithm scales as O(NlogN), where N is the number of nodes in the network, a significant improvement compared to O(N(2)) for a greedy algorithm, which permits studying much larger networks. Our new strategy can be applied to any network topology and distribution of interdependencies, as well as any sequence of failures.
Chen, Yuantao; Xu, Weihong; Kuang, Fangjun; Gao, Shangbing
2013-01-01
The efficient target tracking algorithm researches have become current research focus of intelligent robots. The main problems of target tracking process in mobile robot face environmental uncertainty. They are very difficult to estimate the target states, illumination change, target shape changes, complex backgrounds, and other factors and all affect the occlusion in tracking robustness. To further improve the target tracking's accuracy and reliability, we present a novel target tracking algorithm to use visual saliency and adaptive support vector machine (ASVM). Furthermore, the paper's algorithm has been based on the mixture saliency of image features. These features include color, brightness, and sport feature. The execution process used visual saliency features and those common characteristics have been expressed as the target's saliency. Numerous experiments demonstrate the effectiveness and timeliness of the proposed target tracking algorithm in video sequences where the target objects undergo large changes in pose, scale, and illumination.
A k-Vector Approach to Sampling, Interpolation, and Approximation
NASA Astrophysics Data System (ADS)
Mortari, Daniele; Rogers, Jonathan
2013-12-01
The k-vector search technique is a method designed to perform extremely fast range searching of large databases at computational cost independent of the size of the database. k-vector search algorithms have historically found application in satellite star-tracker navigation systems which index very large star catalogues repeatedly in the process of attitude estimation. Recently, the k-vector search algorithm has been applied to numerous other problem areas including non-uniform random variate sampling, interpolation of 1-D or 2-D tables, nonlinear function inversion, and solution of systems of nonlinear equations. This paper presents algorithms in which the k-vector search technique is used to solve each of these problems in a computationally-efficient manner. In instances where these tasks must be performed repeatedly on a static (or nearly-static) data set, the proposed k-vector-based algorithms offer an extremely fast solution technique that outperforms standard methods.
NASA Technical Reports Server (NTRS)
Venter, Gerhard; Sobieszczanski-Sobieski Jaroslaw
2002-01-01
The purpose of this paper is to show how the search algorithm known as particle swarm optimization performs. Here, particle swarm optimization is applied to structural design problems, but the method has a much wider range of possible applications. The paper's new contributions are improvements to the particle swarm optimization algorithm and conclusions and recommendations as to the utility of the algorithm, Results of numerical experiments for both continuous and discrete applications are presented in the paper. The results indicate that the particle swarm optimization algorithm does locate the constrained minimum design in continuous applications with very good precision, albeit at a much higher computational cost than that of a typical gradient based optimizer. However, the true potential of particle swarm optimization is primarily in applications with discrete and/or discontinuous functions and variables. Additionally, particle swarm optimization has the potential of efficient computation with very large numbers of concurrently operating processors.
Biological production models as elements of coupled, atmosphere-ocean models for climate research
NASA Technical Reports Server (NTRS)
Platt, Trevor; Sathyendranath, Shubha
1991-01-01
Process models of phytoplankton production are discussed with respect to their suitability for incorporation into global-scale numerical ocean circulation models. Exact solutions are given for integrals over the mixed layer and the day of analytic, wavelength-independent models of primary production. Within this class of model, the bias incurred by using a triangular approximation (rather than a sinusoidal one) to the variation of surface irradiance through the day is computed. Efficient computation algorithms are given for the nonspectral models. More exact calculations require a spectrally sensitive treatment. Such models exist but must be integrated numerically over depth and time. For these integrations, resolution in wavelength, depth, and time are considered and recommendations made for efficient computation. The extrapolation of the one-(spatial)-dimension treatment to large horizontal scale is discussed.
High performance computation of radiative transfer equation using the finite element method
NASA Astrophysics Data System (ADS)
Badri, M. A.; Jolivet, P.; Rousseau, B.; Favennec, Y.
2018-05-01
This article deals with an efficient strategy for numerically simulating radiative transfer phenomena using distributed computing. The finite element method alongside the discrete ordinate method is used for spatio-angular discretization of the monochromatic steady-state radiative transfer equation in an anisotropically scattering media. Two very different methods of parallelization, angular and spatial decomposition methods, are presented. To do so, the finite element method is used in a vectorial way. A detailed comparison of scalability, performance, and efficiency on thousands of processors is established for two- and three-dimensional heterogeneous test cases. Timings show that both algorithms scale well when using proper preconditioners. It is also observed that our angular decomposition scheme outperforms our domain decomposition method. Overall, we perform numerical simulations at scales that were previously unattainable by standard radiative transfer equation solvers.
On the use of reverse Brownian motion to accelerate hybrid simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bakarji, Joseph; Tartakovsky, Daniel M., E-mail: tartakovsky@stanford.edu
Multiscale and multiphysics simulations are two rapidly developing fields of scientific computing. Efficient coupling of continuum (deterministic or stochastic) constitutive solvers with their discrete (stochastic, particle-based) counterparts is a common challenge in both kinds of simulations. We focus on interfacial, tightly coupled simulations of diffusion that combine continuum and particle-based solvers. The latter employs the reverse Brownian motion (rBm), a Monte Carlo approach that allows one to enforce inhomogeneous Dirichlet, Neumann, or Robin boundary conditions and is trivially parallelizable. We discuss numerical approaches for improving the accuracy of rBm in the presence of inhomogeneous Neumann boundary conditions and alternative strategiesmore » for coupling the rBm solver with its continuum counterpart. Numerical experiments are used to investigate the convergence, stability, and computational efficiency of the proposed hybrid algorithm.« less
Performance advantages of CPML over UPML absorbing boundary conditions in FDTD algorithm
NASA Astrophysics Data System (ADS)
Gvozdic, Branko D.; Djurdjevic, Dusan Z.
2017-01-01
Implementation of absorbing boundary condition (ABC) has a very important role in simulation performance and accuracy in finite difference time domain (FDTD) method. The perfectly matched layer (PML) is the most efficient type of ABC. The aim of this paper is to give detailed insight in and discussion of boundary conditions and hence to simplify the choice of PML used for termination of computational domain in FDTD method. In particular, we demonstrate that using the convolutional PML (CPML) has significant advantages in terms of implementation in FDTD method and reducing computer resources than using uniaxial PML (UPML). An extensive number of numerical experiments has been performed and results have shown that CPML is more efficient in electromagnetic waves absorption. Numerical code is prepared, several problems are analyzed and relative error is calculated and presented.
Scheduled Relaxation Jacobi method: Improvements and applications
NASA Astrophysics Data System (ADS)
Adsuara, J. E.; Cordero-Carrión, I.; Cerdá-Durán, P.; Aloy, M. A.
2016-09-01
Elliptic partial differential equations (ePDEs) appear in a wide variety of areas of mathematics, physics and engineering. Typically, ePDEs must be solved numerically, which sets an ever growing demand for efficient and highly parallel algorithms to tackle their computational solution. The Scheduled Relaxation Jacobi (SRJ) is a promising class of methods, atypical for combining simplicity and efficiency, that has been recently introduced for solving linear Poisson-like ePDEs. The SRJ methodology relies on computing the appropriate parameters of a multilevel approach with the goal of minimizing the number of iterations needed to cut down the residuals below specified tolerances. The efficiency in the reduction of the residual increases with the number of levels employed in the algorithm. Applying the original methodology to compute the algorithm parameters with more than 5 levels notably hinders obtaining optimal SRJ schemes, as the mixed (non-linear) algebraic-differential system of equations from which they result becomes notably stiff. Here we present a new methodology for obtaining the parameters of SRJ schemes that overcomes the limitations of the original algorithm and provide parameters for SRJ schemes with up to 15 levels and resolutions of up to 215 points per dimension, allowing for acceleration factors larger than several hundreds with respect to the Jacobi method for typical resolutions and, in some high resolution cases, close to 1000. Most of the success in finding SRJ optimal schemes with more than 10 levels is based on an analytic reduction of the complexity of the previously mentioned system of equations. Furthermore, we extend the original algorithm to apply it to certain systems of non-linear ePDEs.
NASA Technical Reports Server (NTRS)
Radhakrishnan, K.
1984-01-01
The efficiency and accuracy of several algorithms recently developed for the efficient numerical integration of stiff ordinary differential equations are compared. The methods examined include two general-purpose codes, EPISODE and LSODE, and three codes (CHEMEQ, CREK1D, and GCKP84) developed specifically to integrate chemical kinetic rate equations. The codes are applied to two test problems drawn from combustion kinetics. The comparisons show that LSODE is the fastest code currently available for the integration of combustion kinetic rate equations. An important finding is that an interactive solution of the algebraic energy conservation equation to compute the temperature does not result in significant errors. In addition, this method is more efficient than evaluating the temperature by integrating its time derivative. Significant reductions in computational work are realized by updating the rate constants (k = at(supra N) N exp(-E/RT) only when the temperature change exceeds an amount delta T that is problem dependent. An approximate expression for the automatic evaluation of delta T is derived and is shown to result in increased efficiency.
Neural system for heartbeats recognition using genetically integrated ensemble of classifiers.
Osowski, Stanislaw; Siwek, Krzysztof; Siroic, Robert
2011-03-01
This paper presents the application of genetic algorithm for the integration of neural classifiers combined in the ensemble for the accurate recognition of heartbeat types on the basis of ECG registration. The idea presented in this paper is that using many classifiers arranged in the form of ensemble leads to the increased accuracy of the recognition. In such ensemble the important problem is the integration of all classifiers into one effective classification system. This paper proposes the use of genetic algorithm. It was shown that application of the genetic algorithm is very efficient and allows to reduce significantly the total error of heartbeat recognition. This was confirmed by the numerical experiments performed on the MIT BIH Arrhythmia Database. Copyright © 2011 Elsevier Ltd. All rights reserved.
Algorithm 971: An Implementation of a Randomized Algorithm for Principal Component Analysis
LI, HUAMIN; LINDERMAN, GEORGE C.; SZLAM, ARTHUR; STANTON, KELLY P.; KLUGER, YUVAL; TYGERT, MARK
2017-01-01
Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis and the calculation of truncated singular value decompositions. The present article presents an essentially black-box, foolproof implementation for Mathworks’ MATLAB, a popular software platform for numerical computation. As illustrated via several tests, the randomized algorithms for low-rank approximation outperform or at least match the classical deterministic techniques (such as Lanczos iterations run to convergence) in basically all respects: accuracy, computational efficiency (both speed and memory usage), ease-of-use, parallelizability, and reliability. However, the classical procedures remain the methods of choice for estimating spectral norms and are far superior for calculating the least singular values and corresponding singular vectors (or singular subspaces). PMID:28983138
Harmony search optimization algorithm for a novel transportation problem in a consolidation network
NASA Astrophysics Data System (ADS)
Davod Hosseini, Seyed; Akbarpour Shirazi, Mohsen; Taghi Fatemi Ghomi, Seyed Mohammad
2014-11-01
This article presents a new harmony search optimization algorithm to solve a novel integer programming model developed for a consolidation network. In this network, a set of vehicles is used to transport goods from suppliers to their corresponding customers via two transportation systems: direct shipment and milk run logistics. The objective of this problem is to minimize the total shipping cost in the network, so it tries to reduce the number of required vehicles using an efficient vehicle routing strategy in the solution approach. Solving several numerical examples confirms that the proposed solution approach based on the harmony search algorithm performs much better than CPLEX in reducing both the shipping cost in the network and computational time requirement, especially for realistic size problem instances.
Parareal algorithms with local time-integrators for time fractional differential equations
NASA Astrophysics Data System (ADS)
Wu, Shu-Lin; Zhou, Tao
2018-04-01
It is challenge work to design parareal algorithms for time-fractional differential equations due to the historical effect of the fractional operator. A direct extension of the classical parareal method to such equations will lead to unbalance computational time in each process. In this work, we present an efficient parareal iteration scheme to overcome this issue, by adopting two recently developed local time-integrators for time fractional operators. In both approaches, one introduces auxiliary variables to localized the fractional operator. To this end, we propose a new strategy to perform the coarse grid correction so that the auxiliary variables and the solution variable are corrected separately in a mixed pattern. It is shown that the proposed parareal algorithm admits robust rate of convergence. Numerical examples are presented to support our conclusions.
Development of iterative techniques for the solution of unsteady compressible viscous flows
NASA Technical Reports Server (NTRS)
Sankar, Lakshmi N.; Hixon, Duane
1992-01-01
The development of efficient iterative solution methods for the numerical solution of two- and three-dimensional compressible Navier-Stokes equations is discussed. Iterative time marching methods have several advantages over classical multi-step explicit time marching schemes, and non-iterative implicit time marching schemes. Iterative schemes have better stability characteristics than non-iterative explicit and implicit schemes. In this work, another approach based on the classical conjugate gradient method, known as the Generalized Minimum Residual (GMRES) algorithm is investigated. The GMRES algorithm has been used in the past by a number of researchers for solving steady viscous and inviscid flow problems. Here, we investigate the suitability of this algorithm for solving the system of non-linear equations that arise in unsteady Navier-Stokes solvers at each time step.
Recovery Discontinuous Galerkin Jacobian-free Newton-Krylov Method for all-speed flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
HyeongKae Park; Robert Nourgaliev; Vincent Mousseau
2008-07-01
There is an increasing interest to develop the next generation simulation tools for the advanced nuclear energy systems. These tools will utilize the state-of-art numerical algorithms and computer science technology in order to maximize the predictive capability, support advanced reactor designs, reduce uncertainty and increase safety margins. In analyzing nuclear energy systems, we are interested in compressible low-Mach number, high heat flux flows with a wide range of Re, Ra, and Pr numbers. Under these conditions, the focus is placed on turbulent heat transfer, in contrast to other industries whose main interest is in capturing turbulent mixing. Our objective ismore » to develop singlepoint turbulence closure models for large-scale engineering CFD code, using Direct Numerical Simulation (DNS) or Large Eddy Simulation (LES) tools, requireing very accurate and efficient numerical algorithms. The focus of this work is placed on fully-implicit, high-order spatiotemporal discretization based on the discontinuous Galerkin method solving the conservative form of the compressible Navier-Stokes equations. The method utilizes a local reconstruction procedure derived from weak formulation of the problem, which is inspired by the recovery diffusion flux algorithm of van Leer and Nomura [?] and by the piecewise parabolic reconstruction [?] in the finite volume method. The developed methodology is integrated into the Jacobianfree Newton-Krylov framework [?] to allow a fully-implicit solution of the problem.« less
Vector Observation-Aided/Attitude-Rate Estimation Using Global Positioning System Signals
NASA Technical Reports Server (NTRS)
Oshman, Yaakov; Markley, F. Landis
1997-01-01
A sequential filtering algorithm is presented for attitude and attitude-rate estimation from Global Positioning System (GPS) differential carrier phase measurements. A third-order, minimal-parameter method for solving the attitude matrix kinematic equation is used to parameterize the filter's state, which renders the resulting estimator computationally efficient. Borrowing from tracking theory concepts, the angular acceleration is modeled as an exponentially autocorrelated stochastic process, thus avoiding the use of the uncertain spacecraft dynamic model. The new formulation facilitates the use of aiding vector observations in a unified filtering algorithm, which can enhance the method's robustness and accuracy. Numerical examples are used to demonstrate the performance of the method.
Inversion of particle-size distribution from angular light-scattering data with genetic algorithms.
Ye, M; Wang, S; Lu, Y; Hu, T; Zhu, Z; Xu, Y
1999-04-20
A stochastic inverse technique based on a genetic algorithm (GA) to invert particle-size distribution from angular light-scattering data is developed. This inverse technique is independent of any given a priori information of particle-size distribution. Numerical tests show that this technique can be successfully applied to inverse problems with high stability in the presence of random noise and low susceptibility to the shape of distributions. It has also been shown that the GA-based inverse technique is more efficient in use of computing time than the inverse Monte Carlo method recently developed by Ligon et al. [Appl. Opt. 35, 4297 (1996)].
Auction dynamics: A volume constrained MBO scheme
NASA Astrophysics Data System (ADS)
Jacobs, Matt; Merkurjev, Ekaterina; Esedoǧlu, Selim
2018-02-01
We show how auction algorithms, originally developed for the assignment problem, can be utilized in Merriman, Bence, and Osher's threshold dynamics scheme to simulate multi-phase motion by mean curvature in the presence of equality and inequality volume constraints on the individual phases. The resulting algorithms are highly efficient and robust, and can be used in simulations ranging from minimal partition problems in Euclidean space to semi-supervised machine learning via clustering on graphs. In the case of the latter application, numerous experimental results on benchmark machine learning datasets show that our approach exceeds the performance of current state-of-the-art methods, while requiring a fraction of the computation time.