efficient numerical implementation: Topics by Science.gov

Sample records for efficient numerical implementation

Computationally efficient multibody simulations

NASA Technical Reports Server (NTRS)

Ramakrishnan, Jayant; Kumar, Manoj

1994-01-01

Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and implementational standpoint. Order(n) approaches provide a new formulation of the equations of motion eliminating the assembly and numerical inversion of a system mass matrix as required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.
Efficiency Analysis of the Parallel Implementation of the SIMPLE Algorithm on Multiprocessor Computers

NASA Astrophysics Data System (ADS)

Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.

2017-12-01

This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
Numerical implementation of the S-matrix algorithm for modeling of relief diffraction gratings

NASA Astrophysics Data System (ADS)

Yaremchuk, Iryna; Tamulevičius, Tomas; Fitio, Volodymyr; Gražulevičiūte, Ieva; Bobitski, Yaroslav; Tamulevičius, Sigitas

2013-11-01

A new numerical implementation is developed to calculate the diffraction efficiency of relief diffraction gratings. In the new formulation, vectors containing the expansion coefficients of electric and magnetic fields on boundaries of the grating layer are expressed by additional constants. An S-matrix algorithm has been systematically described in detail and adapted to a simple matrix form. This implementation is suitable for the study of optical characteristics of periodic structures by using modern object-oriented programming languages and different standard mathematical software. The modeling program has been developed on the basis of this numerical implementation and tested by comparison with other commercially available programs and experimental data. Numerical examples are given to show the usefulness of the new implementation.
On finite element implementation and computational techniques for constitutive modeling of high temperature composites

NASA Technical Reports Server (NTRS)

Saleeb, A. F.; Chang, T. Y. P.; Wilt, T.; Iskovitz, I.

1989-01-01

The research work performed during the past year on finite element implementation and computational techniques pertaining to high temperature composites is outlined. In the present research, two main issues are addressed: efficient geometric modeling of composite structures and expedient numerical integration techniques dealing with constitutive rate equations. In the first issue, mixed finite elements for modeling laminated plates and shells were examined in terms of numerical accuracy, locking property and computational efficiency. Element applications include (currently available) linearly elastic analysis and future extension to material nonlinearity for damage predictions and large deformations. On the material level, various integration methods to integrate nonlinear constitutive rate equations for finite element implementation were studied. These include explicit, implicit and automatic subincrementing schemes. In all cases, examples are included to illustrate the numerical characteristics of various methods that were considered.
An Efficient Numerical Method for Computing Synthetic Seismograms for a Layered Half-space with Sources and Receivers at Close or Same Depths

NASA Astrophysics Data System (ADS)

Zhang, H.-m.; Chen, X.-f.; Chang, S.

It is difficult to compute synthetic seismograms for a layered half-space with sources and receivers at close to or the same depths using the generalized R/T coefficient method (Kennett, 1983; Luco and Apsel, 1983; Yao and Harkrider, 1983; Chen, 1993), because the wavenumber integration converges very slowly. A semi-analytic method for accelerating the convergence, in which part of the integration is implemented analytically, was adopted by some authors (Apsel and Luco, 1983; Hisada, 1994, 1995). In this study, based on the principle of the Repeated Averaging Method (Dahlquist and Björck, 1974; Chang, 1988), we propose an alternative, efficient, numerical method, the peak-trough averaging method (PTAM), to overcome the difficulty mentioned above. Compared with the semi-analytic method, PTAM is not only much simpler mathematically and easier to implement in practice, but also more efficient. Using numerical examples, we illustrate the validity, accuracy and efficiency of the new method.
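The repeated averaging principle cited in this abstract can be illustrated on a simpler problem. The sketch below is not PTAM itself, which averages the peaks and troughs of an oscillating wavenumber integral; it only applies repeated averaging to the partial sums of a slowly converging alternating series. The test series (for ln 2) and the function names `partial_sums` and `repeated_average` are assumptions chosen for illustration.

```python
import math


def partial_sums(terms):
    """Return the running partial sums of a sequence of terms."""
    sums, total = [], 0.0
    for term in terms:
        total += term
        sums.append(total)
    return sums


def repeated_average(values):
    """Average neighbouring values repeatedly until a single value remains."""
    values = list(values)
    while len(values) > 1:
        values = [0.5 * (a + b) for a, b in zip(values, values[1:])]
    return values[0]


# Illustrative test problem: 1 - 1/2 + 1/3 - ... = ln 2, truncated at 12 terms.
terms = [(-1) ** (k + 1) / k for k in range(1, 13)]
sums = partial_sums(terms)

plain_estimate = sums[-1]                  # last partial sum, oscillates slowly
averaged_estimate = repeated_average(sums) # repeated averaging of all 12 sums

plain_error = abs(plain_estimate - math.log(2.0))
averaged_error = abs(averaged_estimate - math.log(2.0))
```

With the same twelve terms, the averaged estimate is several orders of magnitude closer to ln 2 than the last partial sum, which is the kind of gain that motivates averaging an oscillating, slowly converging quantity instead of integrating further.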
Efficient algorithms and implementations of entropy-based moment closures for rarefied gases

NASA Astrophysics Data System (ADS)

Schaerer, Roman Pascal; Bansal, Pratyuksh; Torrilhon, Manuel

2017-07-01

We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following a similar approach as Garrett et al. (2015) [13], we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by exploiting its inherent fine-grained parallelism with the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.
Parallel implementation of geometrical shock dynamics for two dimensional converging shock waves

NASA Astrophysics Data System (ADS)

Qiu, Shi; Liu, Kuang; Eliasson, Veronica

2016-10-01

Geometrical shock dynamics (GSD) theory is an appealing method to predict the shock motion in the sense that it is more computationally efficient than solving the traditional Euler equations, especially for converging shock waves. However, to solve and optimize large scale configurations, the main bottleneck is the computational cost. Among the existing numerical GSD schemes, there is only one that has been implemented on parallel computers, for the purpose of analyzing detonation waves. To extend the computational advantage of the GSD theory to more general applications such as converging shock waves, a numerical implementation using a spatial decomposition method has been coupled with a front tracking approach on parallel computers. In addition, an efficient tridiagonal system solver for massively parallel computers has been applied to resolve the most expensive function in this implementation, resulting in an efficiency of 0.93 while using 32 HPCC cores. Moreover, symmetric boundary conditions have been developed to further reduce the computational cost, achieving a speedup of 19.26 for a 12-sided polygonal converging shock.
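For readers unfamiliar with tridiagonal systems, the following is a minimal serial sketch of the classical Thomas algorithm. It is not the massively parallel solver referred to in the abstract, whose details are not given here; it only shows the baseline operation that such a solver distributes across processors. The function name `thomas_solve` and the storage convention (one list per diagonal, with `lower[0]` and `upper[-1]` unused) are assumptions.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[-1] are ignored. No pivoting is performed, so the
    matrix is assumed to be diagonally dominant or otherwise well behaved.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Example: the (-1, 2, -1) matrix with known solution [1, 2, 3, 4].
solution = thomas_solve(
    lower=[0.0, -1.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, -1.0, 0.0],
    rhs=[0.0, 0.0, 0.0, 5.0],
)
```

The forward sweep is inherently sequential, which is why parallel implementations typically rely on a different decomposition of the system rather than on this loop directly.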
Efficient analytical implementation of the DOT Riemann solver for the de Saint Venant-Exner morphodynamic model

NASA Astrophysics Data System (ADS)

Carraro, F.; Valiani, A.; Caleffi, V.

2018-03-01

Within the framework of the de Saint Venant equations coupled with the Exner equation for morphodynamic evolution, this work presents a new efficient implementation of the Dumbser-Osher-Toro (DOT) scheme for non-conservative problems. The DOT path-conservative scheme is a robust upwind method based on a complete Riemann solver, but it has the drawback of requiring expensive numerical computations. Indeed, to compute the non-linear time evolution in each time step, the DOT scheme requires numerical computation of the flux matrix eigenstructure (the totality of eigenvalues and eigenvectors) several times at each cell edge. In this work, an analytical and compact formulation of the eigenstructure for the de Saint Venant-Exner (dSVE) model is introduced and tested in terms of numerical efficiency and stability. Using the original DOT and PRICE-C (a very efficient FORCE-type method) as reference methods, we present a convergence analysis (error against CPU time) to study the performance of the DOT method with our new analytical implementation of eigenstructure calculations (A-DOT). In particular, the numerical performance of the three methods is tested in three test cases: a movable bed Riemann problem with analytical solution; a problem with smooth analytical solution; a test in which the water flow is characterised by subcritical and supercritical regions. For a given target error, the A-DOT method is always the most efficient choice. Finally, two experimental data sets and different transport formulae are considered to test the A-DOT model in more practical case studies.
Information processing using a single dynamical node as complex system

PubMed Central

Appeltant, L.; Soriano, M.C.; Van der Sande, G.; Danckaert, J.; Massar, S.; Dambre, J.; Schrauwen, B.; Mirasso, C.R.; Fischer, I.

2011-01-01

Novel methods for information processing are highly desired in our information-driven society. Inspired by the brain's ability to process information, the recently introduced paradigm known as 'reservoir computing' shows that complex networks can efficiently perform computation. Here we introduce a novel architecture that reduces the usually required large number of elements to a single nonlinear node with delayed feedback. Through an electronic implementation, we experimentally and numerically demonstrate excellent performance in a speech recognition benchmark. Complementary numerical studies also show excellent performance for a time series prediction benchmark. These results prove that delay-dynamical systems, even in their simplest manifestation, can perform efficient information processing. This finding paves the way to feasible and resource-efficient technological implementations of reservoir computing. PMID:21915110
Efficient algorithms and implementations of entropy-based moment closures for rarefied gases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schaerer, Roman Pascal, E-mail: schaerer@mathcces.rwth-aachen.de; Bansal, Pratyuksh; Torrilhon, Manuel

We present efficient algorithms and implementations of the 35-moment system equipped with the maximum-entropy closure in the context of rarefied gases. While closures based on the principle of entropy maximization have been shown to yield very promising results for moderately rarefied gas flows, the computational cost of these closures is in general much higher than for closure theories with explicit closed-form expressions of the closing fluxes, such as Grad's classical closure. Following a similar approach as Garrett et al. (2015), we investigate efficient implementations of the computationally expensive numerical quadrature method used for the moment evaluations of the maximum-entropy distribution by exploiting its inherent fine-grained parallelism with the parallelism offered by multi-core processors and graphics cards. We show that using a single graphics card as an accelerator allows speed-ups of two orders of magnitude when compared to a serial CPU implementation. To accelerate the time-to-solution for steady-state problems, we propose a new semi-implicit time discretization scheme. The resulting nonlinear system of equations is solved with a Newton type method in the Lagrange multipliers of the dual optimization problem in order to reduce the computational cost. Additionally, fully explicit time-stepping schemes of first and second order accuracy are presented. We investigate the accuracy and efficiency of the numerical schemes for several numerical test cases, including a steady-state shock-structure problem.
Computationally efficient method for optical simulation of solar cells and their applications

NASA Astrophysics Data System (ADS)

Semenikhin, I.; Zanuccoli, M.; Fiegna, C.; Vyurkov, V.; Sangiorgi, E.

2013-01-01

This paper presents two novel implementations of the Differential method to solve the Maxwell equations in nanostructured optoelectronic solid state devices. The first proposed implementation is based on an improved and computationally efficient T-matrix formulation that adopts multiple-precision arithmetic to tackle the numerical instability problem which arises due to evanescent modes. The second implementation adopts an iterative approach that achieves a low computational complexity of O(N log N) or better. The proposed algorithms can handle structures with arbitrary spatial variation of the permittivity. The developed two-dimensional numerical simulator is applied to analyze the dependence of the absorption characteristics of a thin silicon slab on the morphology of the front interface and on the angle of incidence of the radiation with respect to the device surface.
Numerically solving the relativistic Grad-Shafranov equation in Kerr spacetimes: numerical techniques

NASA Astrophysics Data System (ADS)

Mahlmann, J. F.; Cerdá-Durán, P.; Aloy, M. A.

2018-07-01

The study of the electrodynamics of static, axisymmetric, and force-free Kerr magnetospheres relies vastly on solutions of the so-called relativistic Grad-Shafranov equation (GSE). Different numerical approaches to the solution of the GSE have been introduced in the literature, but none of them has been fully assessed from the numerical point of view in terms of efficiency and quality of the solutions found. We present a generalization of these algorithms and give a detailed background on the algorithmic implementation. We assess the numerical stability of the implemented algorithms and quantify the convergence of the presented methodology for the most established set-ups (split-monopole, paraboloidal, BH disc, uniform).
Numerically solving the relativistic Grad-Shafranov equation in Kerr spacetimes: Numerical techniques

NASA Astrophysics Data System (ADS)

Mahlmann, J. F.; Cerdá-Durán, P.; Aloy, M. A.

2018-04-01

The study of the electrodynamics of static, axisymmetric and force-free Kerr magnetospheres relies vastly on solutions of the so-called relativistic Grad-Shafranov equation (GSE). Different numerical approaches to the solution of the GSE have been introduced in the literature, but none of them has been fully assessed from the numerical point of view in terms of efficiency and quality of the solutions found. We present a generalization of these algorithms and give detailed background on the algorithmic implementation. We assess the numerical stability of the implemented algorithms and quantify the convergence of the presented methodology for the most established setups (split-monopole, paraboloidal, BH-disk, uniform).
An efficient nonlinear finite-difference approach in the computational modeling of the dynamics of a nonlinear diffusion-reaction equation in microbial ecology.

PubMed

Macías-Díaz, J E; Macías, Siegfried; Medina-Ramírez, I E

2013-12-01

In this manuscript, we present a computational model to approximate the solutions of a partial differential equation which describes the growth dynamics of microbial films. The numerical technique reported in this work is an explicit, nonlinear finite-difference methodology which is computationally implemented using Newton's method. Our scheme is compared numerically against an implicit, linear finite-difference discretization of the same partial differential equation, whose computer coding requires an implementation of the stabilized bi-conjugate gradient method. Our numerical results show that the nonlinear approach yields a more efficient approximation to the solutions of the biofilm model considered, and demands less computer memory. Moreover, the positivity of initial profiles is preserved in practice by the proposed nonlinear scheme. Copyright © 2013 Elsevier Ltd. All rights reserved.
Tempest - Efficient Computation of Atmospheric Flows Using High-Order Local Discretization Methods

NASA Astrophysics Data System (ADS)

Ullrich, P. A.; Guerra, J. E.

2014-12-01

The Tempest Framework composes several compact numerical methods to easily facilitate intercomparison of atmospheric flow calculations on the sphere and in rectangular domains. This framework includes the implementations of Spectral Elements, Discontinuous Galerkin, Flux Reconstruction, and Hybrid Finite Element methods with the goal of achieving optimal accuracy in the solution of atmospheric problems. Several advantages of this approach are discussed such as: improved pressure gradient calculation, numerical stability by vertical/horizontal splitting, arbitrary order of accuracy, etc. The local numerical discretization allows for high performance parallel computation and efficient inclusion of parameterizations. These techniques are used in conjunction with a non-conformal, locally refined, cubed-sphere grid for global simulations and standard Cartesian grids for simulations at the mesoscale. A complete implementation of the methods described is demonstrated in a non-hydrostatic setting.
An Implicit Algorithm for the Numerical Simulation of Shape-Memory Alloys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Becker, R; Stolken, J; Jannetti, C

Shape-memory alloys (SMA) have the potential to be used in a variety of interesting applications due to their unique properties of pseudoelasticity and the shape-memory effect. However, in order to design SMA devices efficiently, a physics-based constitutive model is required to accurately simulate the behavior of shape-memory alloys. The scope of this work is to extend the numerical capabilities of the SMA constitutive model developed by Jannetti et al. (2003), to handle large-scale polycrystalline simulations. The constitutive model is implemented within the finite-element software ABAQUS/Standard using a user defined material subroutine, or UMAT. To improve the efficiency of the numerical simulations, so that polycrystalline specimens of shape-memory alloys can be modeled, a fully implicit algorithm has been implemented to integrate the constitutive equations. Using an implicit integration scheme increases the efficiency of the UMAT over the previously implemented explicit integration method by a factor of more than 100 for single crystal simulations.
Evaluation of Proteus as a Tool for the Rapid Development of Models of Hydrologic Systems

NASA Astrophysics Data System (ADS)

Weigand, T. M.; Farthing, M. W.; Kees, C. E.; Miller, C. T.

2013-12-01

Models of modern hydrologic systems can be complex and involve a variety of operators with varying character. The goal is to implement approximations of such models that are both efficient for the developer and computationally efficient, which is a set of naturally competing objectives. Proteus is a Python-based toolbox that supports prototyping of model formulations as well as a wide variety of modern numerical methods and parallel computing. We used Proteus to develop numerical approximations for three models: Richards' equation, a brine flow model derived using the Thermodynamically Constrained Averaging Theory (TCAT), and a multiphase TCAT-based tumor growth model. For Richards' equation, we investigated discontinuous Galerkin solutions with higher order time integration based on the backward difference formulas. The TCAT brine flow model was implemented using Proteus and a variety of numerical methods were compared to hand coded solutions. Finally, an existing tumor growth model was implemented in Proteus to introduce more advanced numerics and allow the code to be run in parallel. From these three example models, Proteus was found to be an attractive open-source option for rapidly developing high quality code for solving existing and evolving computational science models.
Reply to Comment by Lu et al. on "An Efficient and Stable Hydrodynamic Model With Novel Source Term Discretization Schemes for Overland Flow and Flood Simulations"

NASA Astrophysics Data System (ADS)

Xia, Xilin; Liang, Qiuhua; Ming, Xiaodong; Hou, Jingming

2018-01-01

This document addresses the comments raised by Lu et al. (2017). Lu et al. (2017) proposed an alternative numerical treatment for implementing the fully implicit friction discretization in Xia et al. (2017). The method by Lu et al. (2017) is also effective, but not necessarily easier to implement or more efficient. The numerical wiggles observed by Lu et al. (2017) do not affect the overall solution accuracy of the surface reconstruction method (SRM). SRM introduces an antidiffusion effect, which may also lead to more accurate numerical predictions than hydrostatic reconstruction (HR) but may be the cause of the numerical wiggles. As suggested by Lu et al. (2017), HR may perform equally well if fine enough grids are used, which has been investigated and recognized in the literature. However, the use of refined meshes in simulations will inevitably increase computational cost and the grid sizes as suggested are too small for real-world applications.
Corruption of accuracy and efficiency of Markov chain Monte Carlo simulation by inaccurate numerical implementation of conceptual hydrologic models

NASA Astrophysics Data System (ADS)

Schoups, G.; Vrugt, J. A.; Fenicia, F.; van de Giesen, N. C.

2010-10-01

Conceptual rainfall-runoff models have traditionally been applied without paying much attention to numerical errors induced by temporal integration of water balance dynamics. Reliance on first-order, explicit, fixed-step integration methods leads to computationally cheap simulation models that are easy to implement. Computational speed is especially desirable for estimating parameter and predictive uncertainty using Markov chain Monte Carlo (MCMC) methods. Confirming earlier work of Kavetski et al. (2003), we show here that the computational speed of first-order, explicit, fixed-step integration methods comes at a cost: for a case study with a spatially lumped conceptual rainfall-runoff model, it introduces artificial bimodality in the marginal posterior parameter distributions, which is not present in numerically accurate implementations of the same model. The resulting effects on MCMC simulation include (1) inconsistent estimates of posterior parameter and predictive distributions, (2) poor performance and slow convergence of the MCMC algorithm, and (3) unreliable convergence diagnosis using the Gelman-Rubin statistic. We studied several alternative numerical implementations to remedy these problems, including various adaptive-step finite difference schemes and an operator splitting method. Our results show that adaptive-step, second-order methods, based on either explicit finite differencing or operator splitting with analytical integration, provide the best alternative for accurate and efficient MCMC simulation. Fixed-step or adaptive-step implicit methods may also be used for increased accuracy, but they cannot match the efficiency of adaptive-step explicit finite differencing or operator splitting. Of the latter two, explicit finite differencing is more generally applicable and is preferred if the individual hydrologic flux laws cannot be integrated analytically, as the splitting method then loses its advantage.
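The contrast this abstract draws between fixed-step first-order integration and adaptive-step second-order explicit integration can be sketched on a toy problem. The example below is an assumption-laden illustration, not the models or schemes of the paper: it uses a single linear reservoir dS/dt = P - k*S with made-up parameters, a fixed-step explicit Euler scheme, and an adaptive Heun scheme that uses the embedded Euler step as its error estimate. All names (`euler_fixed`, `heun_adaptive`, `P`, `K`) are hypothetical.

```python
import math

P = 2.0   # constant inflow (illustrative value)
K = 1.5   # outflow rate constant (illustrative value)


def rhs(storage):
    """Water balance of a linear reservoir: dS/dt = P - K*S."""
    return P - K * storage


def exact(t, s0):
    """Analytical solution of the linear reservoir."""
    return P / K + (s0 - P / K) * math.exp(-K * t)


def euler_fixed(s0, t_end, dt):
    """First-order explicit Euler with a fixed time step."""
    s = s0
    for _ in range(int(round(t_end / dt))):
        s += dt * rhs(s)
    return s


def heun_adaptive(s0, t_end, tol=1e-4, dt=1.0):
    """Second-order explicit Heun scheme with simple step-size control."""
    s, t = s0, 0.0
    while t < t_end:
        dt = min(dt, t_end - t)
        k1 = rhs(s)
        s_euler = s + dt * k1                 # first-order prediction
        k2 = rhs(s_euler)
        s_heun = s + 0.5 * dt * (k1 + k2)     # second-order result
        err = abs(s_heun - s_euler)           # local error estimate
        if err <= tol:                        # accept the step
            s, t = s_heun, t + dt
        # Grow or shrink the step; the estimate scales with dt**2.
        factor = 0.9 * math.sqrt(tol / max(err, 1e-15))
        dt *= min(2.0, max(0.2, factor))
    return s


s0, t_end = 10.0, 2.0
fixed_error = abs(euler_fixed(s0, t_end, dt=1.0) - exact(t_end, s0))
adaptive_error = abs(heun_adaptive(s0, t_end) - exact(t_end, s0))
```

With a unit time step, the Euler solution even passes through a negative storage after the first step, a small-scale analogue of the numerical artefacts the abstract warns about; the adaptive scheme avoids this by shrinking the step while the state changes quickly.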
Strategies for efficient numerical implementation of hybrid multi-scale agent-based models to describe biological systems

PubMed Central

Cilfone, Nicholas A.; Kirschner, Denise E.; Linderman, Jennifer J.

2015-01-01

Biologically related processes operate across multiple spatiotemporal scales. For computational modeling methodologies to mimic this biological complexity, individual scale models must be linked in ways that allow for dynamic exchange of information across scales. A powerful methodology is to combine a discrete modeling approach, agent-based models (ABMs), with continuum models to form hybrid models. Hybrid multi-scale ABMs have been used to simulate emergent responses of biological systems. Here, we review two aspects of hybrid multi-scale ABMs: linking individual scale models and efficiently solving the resulting model. We discuss the computational choices associated with aspects of linking individual scale models while simultaneously maintaining model tractability. We demonstrate implementations of existing numerical methods in the context of hybrid multi-scale ABMs. Using an example model describing Mycobacterium tuberculosis infection, we show relative computational speeds of various combinations of numerical methods. Efficient linking and solution of hybrid multi-scale ABMs is key to model portability, modularity, and their use in understanding biological phenomena at a systems level. PMID:26366228

Efficient Non-Hydrostatic Modeling of Rotational, Turbulent, Dispersive, and Variable-Density Flows in the Vicinity of River Mouths and Inlets: Development and Field Support

DTIC Science & Technology

2013-09-30

numerical efforts undertaken here implement established aspects of Boussinesq-type modeling, developed by the PI and other researchers. These aspects...the Boussinesq-type framework, and then implement in a numerical model. Once this comprehensive model is developed and tested against established...phenomena that might be observed at New River. WORK COMPLETED In FY13 we have continued the development of a Boussinesq-type formulation that
Fast Fourier transform-based solution of 2D and 3D magnetization problems in type-II superconductivity

NASA Astrophysics Data System (ADS)

Prigozhin, Leonid; Sokolovsky, Vladimir

2018-05-01

We consider the fast Fourier transform (FFT) based numerical method for thin film magnetization problems (Vestgården and Johansen 2012 Supercond. Sci. Technol. 25 104001), compare it with the finite element methods, and evaluate its accuracy. The proposed modifications to the implementation of this method ensure stable convergence of iterations and enhance its efficiency. A new method, also based on the FFT, is developed for 3D bulk magnetization problems. This method is based on a magnetic field formulation, different from the popular h-formulation of eddy current problems typically employed with the edge finite elements. The method is simple, easy to implement, and can be used with a general current–voltage relation; its efficiency is illustrated by numerical simulations.
An efficient multi-dimensional implementation of VSIAM3 and its applications to free surface flows

NASA Astrophysics Data System (ADS)

Yokoi, Kensuke; Furuichi, Mikito; Sakai, Mikio

2017-12-01

We propose an efficient multidimensional implementation of VSIAM3 (volume/surface integrated average-based multi-moment method). Although VSIAM3 is a highly capable fluid solver based on a multi-moment concept and has been used for a wide variety of fluid problems, VSIAM3 could not simulate some simple benchmark problems well (for instance, lid-driven cavity flows) due to relatively high numerical viscosity. In this paper, we resolve the issue by using the efficient multidimensional approach. The proposed VSIAM3 is shown to capture lid-driven cavity flows at Reynolds numbers up to Re = 7500 with a Cartesian grid of 128 × 128, which the original VSIAM3 could not do. We also tested the proposed framework in free surface flow problems (droplet collision and separation at We = 40 and droplet splashing on a superhydrophobic substrate). The numerical results of the proposed VSIAM3 showed reasonable agreement with these experiments. The proposed VSIAM3 could capture droplet collision and separation at We = 40 with a low numerical resolution (8 meshes for the initial diameter of droplets). We also simulated free surface flows including particles toward non-Newtonian flow applications. These numerical results have shown that the proposed VSIAM3 can robustly simulate interactions among air, particles (solid), and liquid.
CBES--An Efficient Implementation of the Coursewriter Language.

ERIC Educational Resources Information Center

Franks, Edward W.

An extensive computer based education system (CBES) built around the IBM Coursewriter III program product at Ohio State University is described. In this system, numerous extensions have been added to the Coursewriter III language to provide capabilities needed to implement sophisticated instructional strategies. CBES design goals include lower CPU…
DEVELOPMENTS IN GRworkbench

NASA Astrophysics Data System (ADS)

Moylan, Andrew; Scott, Susan M.; Searle, Anthony C.

2006-02-01

The software tool GRworkbench is an ongoing project in visual, numerical General Relativity at The Australian National University. Recently, GRworkbench has been significantly extended to facilitate numerical experimentation in analytically-defined space-times. The numerical differential geometric engine has been rewritten using functional programming techniques, enabling objects which are normally defined as functions in the formalism of differential geometry and General Relativity to be directly represented as function variables in the C++ code of GRworkbench. The new functional differential geometric engine allows for more accurate and efficient visualisation of objects in space-times and makes new, efficient computational techniques available. Motivated by the desire to investigate a recent scientific claim using GRworkbench, new tools for numerical experimentation have been implemented, allowing for the simulation of complex physical situations.
Numerical Algorithm for Delta of Asian Option

PubMed Central

Zhang, Boxiang; Yu, Yang; Wang, Weiguo

2015-01-01

We study the numerical solution of the Greeks of Asian options. In particular, we derive a closed-form solution for the Δ of the geometric Asian option and use this analytical form as a control to numerically calculate the Δ of the arithmetic Asian option, which is known to have no explicit closed-form solution. We implement our proposed numerical method and compare the standard error with other classical variance reduction methods. Our method provides an efficient solution to the hedging strategy with Asian options. PMID:26266271
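The control-variate idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the pathwise Δ estimators, the discrete monitoring at equally spaced dates, the closed-form Δ for the discretely monitored geometric average, and every parameter value are assumptions chosen for the example.

```python
import math
import numpy as np


def geometric_asian_delta(s0, k, r, sigma, t, n):
    """Closed-form delta of a discretely monitored geometric Asian call (assumed setup)."""
    # Mean and variance of log(geometric average) under Black-Scholes dynamics.
    m = math.log(s0) + (r - 0.5 * sigma**2) * t * (n + 1) / (2 * n)
    v = sigma**2 * t * (n + 1) * (2 * n + 1) / (6 * n**2)
    d1 = (m - math.log(k) + v) / math.sqrt(v)
    cdf = 0.5 * (1.0 + math.erf(d1 / math.sqrt(2.0)))
    return math.exp(-r * t) * math.exp(m + 0.5 * v) * cdf / s0


def asian_delta_control_variate(s0=100.0, k=100.0, r=0.05, sigma=0.2,
                                t=1.0, n=50, paths=20000, seed=0):
    """Return per-path delta samples: plain and control-variate adjusted."""
    rng = np.random.default_rng(seed)
    dt = t / n
    z = rng.standard_normal((paths, n))
    log_s = math.log(s0) + np.cumsum(
        (r - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z, axis=1)
    arith = np.exp(log_s).mean(axis=1)      # arithmetic average price
    geo = np.exp(log_s.mean(axis=1))        # geometric average price
    disc = math.exp(-r * t)
    # Pathwise delta estimators for the two call payoffs.
    y = disc * (arith > k) * arith / s0
    x = disc * (geo > k) * geo / s0
    cov = np.cov(y, x)
    beta = cov[0, 1] / cov[1, 1]            # in-sample optimal coefficient
    exact_geo = geometric_asian_delta(s0, k, r, sigma, t, n)
    return y, y - beta * (x - exact_geo)


plain, controlled = asian_delta_control_variate()
print("plain   :", plain.mean(), plain.std() / math.sqrt(len(plain)))
print("control :", controlled.mean(), controlled.std() / math.sqrt(len(controlled)))
```

The two printed standard errors give a feel for how much the geometric control reduces the sampling noise in this toy setting.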
Low cost and efficient kurtosis-based deflationary ICA method: application to MRS sources separation problem.

PubMed

Saleh, M; Karfoul, A; Kachenoura, A; Senhadji, L; Albera, L

2016-08-01

Improving the execution time and the numerical complexity of the well-known kurtosis-based maximization method, RobustICA, is investigated in this paper. A Newton-based scheme is proposed and compared to the conventional RobustICA method. A new implementation using the nonlinear conjugate gradient method is also investigated. Regarding the Newton approach, an exact computation of the Hessian of the considered cost function is provided. The proposed approaches and the considered implementations inherit the global plane search of the initial RobustICA method, for which a better convergence speed for a given direction is still guaranteed. Numerical results on Magnetic Resonance Spectroscopy (MRS) source separation show the efficiency of the proposed approaches, notably the quasi-Newton one using the BFGS method.
TRIADS: A phase-resolving model for nonlinear shoaling of directional wave spectra

NASA Astrophysics Data System (ADS)

Sheremet, Alex; Davis, Justin R.; Tian, Miao; Hanson, Jeffrey L.; Hathaway, Kent K.

2016-03-01

We investigate the performance of TRIADS, a numerical implementation of a phase-resolving, nonlinear, spectral model describing directional wave evolution in intermediate and shallow water. TRIADS simulations of shoaling waves generated by Hurricane Bill, 2009 are compared to directional spectral estimates based on observations collected at the Field Research Facility of the US Army Corps Of Engineers, at Duck, NC. Both the ability of the model to capture the processes essential to the nonlinear wave evolution, and the efficiency of the numerical implementations are analyzed and discussed.
A numerically efficient finite element hydroelastic analysis. Volume 2: Implementation in NASTRAN, part 1

NASA Technical Reports Server (NTRS)

Coppolino, R. N.

1974-01-01

Details are presented of the implementation of the new formulation into NASTRAN including descriptions of the DMAP statements required for conversion of the program and details pertaining to problem definition and bulk data considerations. Details of the current 1/8-scale space shuttle external tank mathematical model, numerical results and analysis/test comparisons are also presented. The appendices include a description and listing of a FORTRAN program used to develop harmonic transformation bulk data (multipoint constraint statements) and sample bulk data information for a number of hydroelastic problems.
Computation of Reacting Flows in Combustion Processes

NASA Technical Reports Server (NTRS)

Keith, Theo G., Jr.; Chen, K.-H.

2001-01-01

The objective of this research is to develop an efficient numerical algorithm with unstructured grids for the computation of three-dimensional chemical reacting flows that are known to occur in combustion components of propulsion systems. During the grant period (1996 to 1999), two companion codes have been developed and various numerical and physical models were implemented into the two codes.
An efficient implementation of semi-numerical computation of the Hartree-Fock exchange on the Intel Phi processor

NASA Astrophysics Data System (ADS)

Liu, Fenglai; Kong, Jing

2018-07-01

Unique technical challenges and their solutions for implementing semi-numerical Hartree-Fock exchange on the Phi processor are discussed, especially concerning the single-instruction-multiple-data type of processing and small cache size. Benchmark calculations on a series of buckyball molecules with various Gaussian basis sets on a Phi processor and a six-core CPU show that the Phi processor provides as much as a 12-fold speedup with large basis sets compared with the conventional four-center electron repulsion integration approach performed on the CPU. The accuracy of the semi-numerical scheme is also evaluated and found to be comparable to that of the resolution-of-identity approach.
Mathematical modelling of risk reduction in reinsurance

NASA Astrophysics Data System (ADS)

Balashov, R. B.; Kryanev, A. V.; Sliva, D. E.

2017-01-01

The paper presents a mathematical model of efficient portfolio formation in the reinsurance markets. The presented approach provides the optimal ratio between the expected value of return and the risk of yield values below a certain level. The uncertainty in the return values is conditioned by use of expert evaluations and preliminary calculations, which result in expected return values and the corresponding risk levels. The proposed method allows for implementation of computationally simple schemes and algorithms for numerical calculation of the numerical structure of the efficient portfolios of reinsurance contracts of a given insurance company.
Meshfree and efficient modeling of swimming cells

NASA Astrophysics Data System (ADS)

Gallagher, Meurig T.; Smith, David J.

2018-05-01

Locomotion in Stokes flow is an intensively studied problem because it describes important biological phenomena such as the motility of many species' sperm, bacteria, algae, and protozoa. Numerical computations can be challenging, particularly in three dimensions, due to the presence of moving boundaries and complex geometries; methods which combine ease of implementation and computational efficiency are therefore needed. A recently proposed method to discretize the regularized Stokeslet boundary integral equation without the need for a connected mesh is applied to the inertialess locomotion problem in Stokes flow. The mathematical formulation and key aspects of the computational implementation in matlab® or GNU Octave are described, followed by numerical experiments with biflagellate algae and multiple uniflagellate sperm swimming between no-slip surfaces, for which both swimming trajectories and flow fields are calculated. These computational experiments required minutes of time on modest hardware; an extensible implementation is provided in a GitHub repository. The nearest-neighbor discretization dramatically improves convergence and robustness, a key challenge in extending the regularized Stokeslet method to complicated three-dimensional biological fluid problems.
Performance advantages of CPML over UPML absorbing boundary conditions in FDTD algorithm

NASA Astrophysics Data System (ADS)

Gvozdic, Branko D.; Djurdjevic, Dusan Z.

2017-01-01

Implementation of the absorbing boundary condition (ABC) has a very important role in simulation performance and accuracy in the finite difference time domain (FDTD) method. The perfectly matched layer (PML) is the most efficient type of ABC. The aim of this paper is to give detailed insight into and discussion of boundary conditions and hence to simplify the choice of PML used for termination of the computational domain in the FDTD method. In particular, we demonstrate that the convolutional PML (CPML) has significant advantages over the uniaxial PML (UPML) in terms of implementation in the FDTD method and reduced computer resources. An extensive number of numerical experiments has been performed, and the results have shown that CPML is more efficient in absorbing electromagnetic waves. Numerical code is prepared, several problems are analyzed, and the relative error is calculated and presented.
A 3D staggered-grid finite difference scheme for poroelastic wave equation

NASA Astrophysics Data System (ADS)

Zhang, Yijie; Gao, Jinghuai

2014-10-01

Three dimensional numerical modeling has been a viable tool for understanding wave propagation in real media. The poroelastic media can better describe the phenomena of hydrocarbon reservoirs than acoustic and elastic media. However, the numerical modeling in 3D poroelastic media demands significantly more computational capacity, including both computational time and memory. In this paper, we present a 3D poroelastic staggered-grid finite difference (SFD) scheme. During the procedure, parallel computing is implemented to reduce the computational time. Parallelization is based on domain decomposition, and communication between processors is performed using message passing interface (MPI). Parallel analysis shows that the parallelized SFD scheme significantly improves the simulation efficiency and 3D decomposition in domain is the most efficient. We also analyze the numerical dispersion and stability condition of the 3D poroelastic SFD method. Numerical results show that the 3D numerical simulation can provide a real description of wave propagation.
Engine dynamic analysis with general nonlinear finite element codes. Part 2: Bearing element implementation overall numerical characteristics and benchmarking

NASA Technical Reports Server (NTRS)

Padovan, J.; Adams, M.; Fertis, J.; Zeid, I.; Lam, P.

1982-01-01

Finite element codes are used in modelling the rotor-bearing-stator structures common to the turbine industry. Engine dynamic simulation is made possible by developing strategies which enable the use of available finite element codes. The elements developed are benchmarked by incorporation into a general purpose code (ADINA); the numerical characteristics of finite element type rotor-bearing-stator simulations are evaluated through the use of various types of explicit/implicit numerical integration operators. The overall numerical efficiency of the procedure is improved.
Technical note: Avoiding the direct inversion of the numerator relationship matrix for genotyped animals in single-step genomic best linear unbiased prediction solved with the preconditioned conjugate gradient.

PubMed

Masuda, Y; Misztal, I; Legarra, A; Tsuruta, S; Lourenco, D A L; Fragomeni, B O; Aguilar, I

2017-01-01

This paper evaluates an efficient implementation to multiply the inverse of a numerator relationship matrix for genotyped animals (A22^-1) by a vector (q). The computation is required for solving mixed model equations in single-step genomic BLUP (ssGBLUP) with the preconditioned conjugate gradient (PCG). The inverse can be decomposed into sparse matrices that are blocks of the sparse inverse of a numerator relationship matrix (A^-1) including genotyped animals and their ancestors. The elements of A^-1 were rapidly calculated with Henderson's rule and stored as sparse matrices in memory. Implementation of A22^-1 q was by a series of sparse matrix-vector multiplications. Diagonal elements of A22^-1, which were required as preconditioners in PCG, were approximated with a Monte Carlo method using 1,000 samples. The efficient implementation of A22^-1 q was compared with explicit inversion of A22 with 3 data sets including about 15,000, 81,000, and 570,000 genotyped animals selected from populations with 213,000, 8.2 million, and 10.7 million pedigree animals, respectively. The explicit inversion required 1.8 GB, 49 GB, and 2,415 GB (estimated) of memory, respectively, and 42 s, 56 min, and 13.5 d (estimated), respectively, for the computations. The efficient implementation required <1 MB, 2.9 GB, and 2.3 GB of memory, respectively, and <1 s, 3 min, and 5 min, respectively, for setting up. Less than 1 s was required for the multiplication in each PCG iteration for any of the data sets. When the equations in ssGBLUP are solved with the PCG algorithm, A22^-1 is no longer a limiting factor in the computations.
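One standard identity consistent with the decomposition described in this abstract is the Schur complement: the inverse of the genotyped-animal block of a matrix equals a combination of blocks of the full inverse. The sketch below checks that identity on a small dense stand-in matrix. The function name, the ordering (non-genotyped animals first), and the random test matrix are assumptions for illustration; a real implementation would use sparse storage and a sparse solver instead of dense NumPy arrays.

```python
import numpy as np


def genotyped_inverse_times_vector(full_inv, n_other, q):
    """Return inv(A22) @ q using only blocks of inv(A), never forming inv(A22).

    full_inv : inverse of the whole matrix A, ordered so that the first
               n_other rows/columns are non-genotyped animals (assumed layout).
    """
    i11 = full_inv[:n_other, :n_other]
    i12 = full_inv[:n_other, n_other:]
    i21 = full_inv[n_other:, :n_other]
    i22 = full_inv[n_other:, n_other:]
    # Schur complement: inv(A22) = i22 - i21 @ inv(i11) @ i12,
    # applied to q as a chain of matrix-vector products and one solve.
    t = np.linalg.solve(i11, i12 @ q)
    return i22 @ q - i21 @ t


rng = np.random.default_rng(0)
m = rng.standard_normal((8, 8))
a = m @ m.T + 8.0 * np.eye(8)        # symmetric positive definite stand-in for A
q = rng.standard_normal(5)

via_blocks = genotyped_inverse_times_vector(np.linalg.inv(a), 3, q)
direct = np.linalg.solve(a[3:, 3:], q)   # explicit route, for comparison only
print(np.max(np.abs(via_blocks - direct)))
```

The point of the block route is that every operand can stay sparse, whereas the explicit inverse of the genotyped block is dense.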
Lebedev acceleration and comparison of different photometric models in the inversion of lightcurves for asteroids

NASA Astrophysics Data System (ADS)

Lu, Xiao-Ping; Huang, Xiang-Jie; Ip, Wing-Huen; Hsia, Chi-Hao

2018-04-01

In the lightcurve inversion process where asteroid's physical parameters such as rotational period, pole orientation and overall shape are searched, the numerical calculations of the synthetic photometric brightness based on different shape models are frequently implemented. Lebedev quadrature is an efficient method to numerically calculate the surface integral on the unit sphere. By transforming the surface integral on the Cellinoid shape model to that on the unit sphere, the lightcurve inversion process based on the Cellinoid shape model can be remarkably accelerated. Furthermore, Matlab codes of the lightcurve inversion process based on the Cellinoid shape model are available on Github for free downloading. The photometric models, i.e., the scattering laws, also play an important role in the lightcurve inversion process, although the shape variations of asteroids dominate the morphologies of the lightcurves. Derived from the radiative transfer theory, the Hapke model can describe the light reflectance behaviors from the viewpoint of physics, while there are also many empirical models in numerical applications. Numerical simulations are implemented for the comparison of the Hapke model with the other three numerical models, including the Lommel-Seeliger, Minnaert, and Kaasalainen models. The results show that the numerical models with simple function expressions can fit well with the synthetic lightcurves generated based on the Hapke model; this good fit implies that they can be adopted in the lightcurve inversion process for asteroids to improve the numerical efficiency and derive similar results to those of the Hapke model.
Efficient Numerical Methods for Nonlinear-Facilitated Transport and Exchange in a Blood-Tissue Exchange Unit

PubMed Central

Poulain, Christophe A.; Finlayson, Bruce A.; Bassingthwaighte, James B.

2010-01-01

The analysis of experimental data obtained by the multiple-indicator method requires complex mathematical models for which capillary blood-tissue exchange (BTEX) units are the building blocks. This study presents a new, nonlinear, two-region, axially distributed, single capillary, BTEX model. A facilitated transporter model is used to describe mass transfer between plasma and intracellular spaces. To provide fast and accurate solutions, numerical techniques suited to nonlinear convection-dominated problems are implemented. These techniques are the random choice method, an explicit Euler-Lagrange scheme, and the MacCormack method with and without flux correction. The accuracy of the numerical techniques is demonstrated, and their efficiencies are compared. The random choice, Euler-Lagrange and plain MacCormack method are the best numerical techniques for BTEX modeling. However, the random choice and Euler-Lagrange methods are preferred over the MacCormack method because they allow for the derivation of a heuristic criterion that makes the numerical methods stable without degrading their efficiency. Numerical solutions are also used to illustrate some nonlinear behaviors of the model and to show how the new BTEX model can be used to estimate parameters from experimental data. PMID:9146808
Acceleration of Linear Finite-Difference Poisson-Boltzmann Methods on Graphics Processing Units.

PubMed

Qi, Ruxi; Botello-Smith, Wesley M; Luo, Ray

2017-07-11

Electrostatic interactions play crucial roles in biophysical processes such as protein folding and molecular recognition. Poisson-Boltzmann equation (PBE)-based models have emerged as widely used in modeling these important processes. Though great efforts have been put into developing efficient PBE numerical models, challenges still remain due to the high dimensionality of typical biomolecular systems. In this study, we implemented and analyzed commonly used linear PBE solvers for the ever-improving graphics processing units (GPU) for biomolecular simulations, including both standard and preconditioned conjugate gradient (CG) solvers with several alternative preconditioners. Our implementation utilizes the standard Nvidia CUDA libraries cuSPARSE, cuBLAS, and CUSP. Extensive tests show that good numerical accuracy can be achieved given that the single precision is often used for numerical applications on GPU platforms. The optimal GPU performance was observed with the Jacobi-preconditioned CG solver, with a significant speedup over standard CG solver on CPU in our diversified test cases. Our analysis further shows that different matrix storage formats also considerably affect the efficiency of different linear PBE solvers on GPU, with the diagonal format best suited for our standard finite-difference linear systems. Further efficiency may be possible with matrix-free operations and integrated grid stencil setup specifically tailored for the banded matrices in PBE-specific linear systems.
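For readers unfamiliar with the solver this abstract singles out, here is a minimal CPU sketch of a Jacobi-preconditioned conjugate gradient iteration with a matrix-free operator. It is written in NumPy rather than CUDA, and the 1D model operator (a three-point stencil plus a varying diagonal term), the problem size, and the tolerance are assumptions picked for the example, not the authors' test systems.

```python
import numpy as np


def jacobi_pcg(apply_a, diag_a, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A with Jacobi preconditioning."""
    x = np.zeros_like(b)
    r = b - apply_a(x)
    z = r / diag_a                      # Jacobi preconditioner: divide by diag(A)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        ap = apply_a(p)
        alpha = rz / (p @ ap)
        x = x + alpha * p
        r = r - alpha * ap
        z = r / diag_a
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


n = 200
diag = 2.0 + np.linspace(0.1, 5.0, n)   # hypothetical screening term on the diagonal


def apply_a(x):
    """Matrix-free product with tridiag(-1, diag, -1); no matrix is stored."""
    y = diag * x
    y[1:] -= x[:-1]
    y[:-1] -= x[1:]
    return y


b = np.ones(n)
x, iterations = jacobi_pcg(apply_a, diag, b)
print(iterations, np.linalg.norm(apply_a(x) - b))
```

Because the operator is applied as a stencil, the only stored vectors are the diagonal and the iterates, which is the kind of matrix-free operation the abstract suggests could bring further gains.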

Alternative stable qP wave equations in TTI media with their applications for reverse time migration

NASA Astrophysics Data System (ADS)

Zhou, Yang; Wang, Huazhong; Liu, Wenqing

2015-10-01

Numerical instabilities may arise if the spatial variation of symmetry axis is handled improperly when implementing P-wave modeling and reverse time migration in heterogeneous tilted transversely isotropic (TTI) media, especially in the cases where fast changes exist in TTI symmetry axis’ directions. Based on the pseudo-acoustic approximation to anisotropic elastic wave equations in Cartesian coordinates, alternative second order qP (quasi-P) wave equations in TTI media are derived in this paper. Compared with conventional stable qP wave equations, the proposed equations written in stress components contain only spatial derivatives of wavefield variables (stress components) and are free from spatial derivatives involving media parameters. These lead to an easy and efficient implementation for stable P-wave modeling and imaging. Numerical experiments demonstrate the stability and computational efficiency of the presented equations in complex TTI media.
Numerical Algorithms for Acoustic Integrals - The Devil is in the Details

NASA Technical Reports Server (NTRS)

Brentner, Kenneth S.

1996-01-01

The accurate prediction of the aeroacoustic field generated by aerospace vehicles or nonaerospace machinery is necessary for designers to control and reduce source noise. Powerful computational aeroacoustic methods, based on various acoustic analogies (primarily the Lighthill acoustic analogy) and Kirchhoff methods, have been developed for prediction of noise from complicated sources, such as rotating blades. Both methods ultimately predict the noise through a numerical evaluation of an integral formulation. In this paper, we consider three generic acoustic formulations and several numerical algorithms that have been used to compute the solutions to these formulations. Algorithms for retarded-time formulations are the most efficient and robust, but they are difficult to implement for supersonic-source motion. Collapsing-sphere and emission-surface formulations are good alternatives when supersonic-source motion is present, but the numerical implementations of these formulations are more computationally demanding. New algorithms - which utilize solution adaptation to provide a specified error level - are needed.
A 3D finite-difference BiCG iterative solver with the Fourier-Jacobi preconditioner for the anisotropic EIT/EEG forward problem.

PubMed

Turovets, Sergei; Volkov, Vasily; Zherdetsky, Aleksej; Prakonina, Alena; Malony, Allen D

2014-01-01

The Electrical Impedance Tomography (EIT) and electroencephalography (EEG) forward problems in anisotropic inhomogeneous media like the human head belong to the class of three-dimensional boundary value problems for elliptic equations with mixed derivatives. We introduce and explore the performance of several new promising numerical techniques, which seem to be more suitable for solving these problems. The proposed numerical schemes combine the fictitious domain approach with the finite-difference method and the optimally preconditioned Conjugate Gradient- (CG-) type iterative method for treatment of the discrete model. The numerical scheme includes the standard operations of summation and multiplication of sparse matrices and vectors, as well as FFT, making it easy to implement and suitable for effective parallel implementation. Some typical use cases for the EIT/EEG problems are considered, demonstrating the high efficiency of the proposed numerical technique.
Parallelization of elliptic solver for solving 1D Boussinesq model

NASA Astrophysics Data System (ADS)

Tarwidi, D.; Adytia, D.

2018-03-01

In this paper, a parallel implementation of an elliptic solver for the 1D Boussinesq model is presented. The numerical solution of the Boussinesq model is obtained by applying a staggered grid scheme to the continuity, momentum, and elliptic equations of the model. The tridiagonal system emerging from the numerical scheme of the elliptic equation is solved by the cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grid points is varied from 2^8 to 2^14. Two numerical test cases, i.e. propagation of a solitary wave and of a standing wave, are proposed to evaluate the parallel program. The numerical results are verified against the analytical solutions for the solitary and standing waves. The best speedup for the solitary and standing wave test cases is about 2.07 with 2^14 grid points and 1.86 with 2^13 grid points, respectively, obtained using 8 threads. Moreover, the best efficiency of the parallel program is 76.2% and 73.5% for the solitary and standing wave test cases, respectively.
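The cyclic reduction algorithm mentioned in this abstract can be sketched as follows. This is a serial NumPy illustration under stated assumptions, not the authors' OpenMP code: the recursive formulation, the function name, and the random diagonally dominant test system are all chosen for the example. At each level the odd-indexed equations absorb their even-indexed neighbours, which halves the system; the updates within a level are independent of one another, which is what makes the method attractive for threads.

```python
import numpy as np


def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i], with a[0] = c[-1] = 0.

    All inputs are 1D float NumPy arrays of equal length.
    """
    n = len(b)
    if n == 1:
        return d / b
    odd = np.arange(1, n, 2)
    even = np.arange(0, n, 2)
    # Pad with one identity row so that index odd + 1 is always valid.
    a_p, b_p = np.append(a, 0.0), np.append(b, 1.0)
    c_p, d_p = np.append(c, 0.0), np.append(d, 0.0)
    # Forward reduction: eliminate even-indexed unknowns from odd equations.
    alpha = a[odd] / b[odd - 1]
    gamma = c[odd] / b_p[odd + 1]
    x_odd = cyclic_reduction(
        -alpha * a[odd - 1],
        b[odd] - alpha * c[odd - 1] - gamma * a_p[odd + 1],
        -gamma * c_p[odd + 1],
        d[odd] - alpha * d[odd - 1] - gamma * d_p[odd + 1],
    )
    # Back substitution: x[j + 1] holds unknown j; both ends are zero ghosts.
    x = np.zeros(n + 2)
    x[odd + 1] = x_odd
    x[even + 1] = (d[even] - a[even] * x[even] - c[even] * x[even + 2]) / b[even]
    return x[1:-1]


n = 2**8
rng = np.random.default_rng(0)
lower = -rng.random(n)
upper = -rng.random(n)
lower[0] = 0.0
upper[-1] = 0.0
main = 4.0 + rng.random(n)              # diagonally dominant, so no pivoting needed
rhs = rng.standard_normal(n)

solution = cyclic_reduction(lower, main, upper, rhs)
dense = np.diag(main) + np.diag(lower[1:], -1) + np.diag(upper[:-1], 1)
print(np.max(np.abs(dense @ solution - rhs)))
```

The vectorized operations over `odd` and `even` play the role that parallel loops would play in a threaded implementation.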
Ancient numerical daemons of conceptual hydrological modeling: 1. Fidelity and efficiency of time stepping schemes

NASA Astrophysics Data System (ADS)

Clark, Martyn P.; Kavetski, Dmitri

2010-10-01

A major neglected weakness of many current hydrological models is the numerical method used to solve the governing model equations. This paper thoroughly evaluates several classes of time stepping schemes in terms of numerical reliability and computational efficiency in the context of conceptual hydrological modeling. Numerical experiments are carried out using 8 distinct time stepping algorithms and 6 different conceptual rainfall-runoff models, applied in a densely gauged experimental catchment, as well as in 12 basins with diverse physical and hydroclimatic characteristics. Results show that, over vast regions of the parameter space, the numerical errors of fixed-step explicit schemes commonly used in hydrology routinely dwarf the structural errors of the model conceptualization. This substantially degrades model predictions, but also, disturbingly, generates fortuitously adequate performance for parameter sets where numerical errors compensate for model structural errors. Simply running fixed-step explicit schemes with shorter time steps provides a poor balance between accuracy and efficiency: in some cases daily-step adaptive explicit schemes with moderate error tolerances achieved comparable or higher accuracy than 15 min fixed-step explicit approximations but were nearly 10 times more efficient. From the range of simple time stepping schemes investigated in this work, the fixed-step implicit Euler method and the adaptive explicit Heun method emerge as good practical choices for the majority of simulation scenarios. In combination with the companion paper, where impacts on model analysis, interpretation, and prediction are assessed, this two-part study vividly highlights the impact of numerical errors on critical performance aspects of conceptual hydrological models and provides practical guidelines for robust numerical implementation.
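The contrast drawn in this abstract between fixed-step explicit schemes and the two recommended alternatives can be illustrated on a toy problem. The sketch below uses a single linear reservoir, dS/dt = P - K*S, which is an assumption made for the example and not one of the six models in the study; the parameter values, the error tolerance, and the step-size controller are likewise hypothetical.

```python
import math

K = 2.0   # hypothetical recession constant [1/day]
P = 1.0   # hypothetical constant rainfall [mm/day]


def flux(s):
    """Right-hand side of the linear reservoir dS/dt = P - K*S."""
    return P - K * s


def explicit_euler(s0, t_end, dt):
    s = s0
    for _ in range(round(t_end / dt)):
        s += dt * flux(s)
    return s


def implicit_euler(s0, t_end, dt):
    s = s0
    for _ in range(round(t_end / dt)):
        s = (s + dt * P) / (1.0 + K * dt)   # closed-form implicit update (linear case)
    return s


def adaptive_heun(s0, t_end, tol=1e-4, dt=1.0):
    """Explicit Heun with an embedded Euler error estimate and step control."""
    s, t, steps = s0, 0.0, 0
    while t < t_end:
        dt = min(dt, t_end - t)
        k1 = flux(s)
        euler = s + dt * k1
        heun = s + 0.5 * dt * (k1 + flux(euler))
        err = abs(heun - euler)
        if err <= tol:                      # accept the step
            s, t, steps = heun, t + dt, steps + 1
        # Grow or shrink the step, limited to a factor between 0.2 and 2.
        dt *= min(2.0, max(0.2, 0.9 * math.sqrt(tol / max(err, 1e-15))))
    return s, steps


def exact(s0, t):
    return P / K + (s0 - P / K) * math.exp(-K * t)


s_exact = exact(0.0, 10.0)
s_explicit = explicit_euler(0.0, 10.0, 1.0)   # daily fixed step
s_implicit = implicit_euler(0.0, 10.0, 1.0)   # daily fixed step
s_heun, n_steps = adaptive_heun(0.0, 10.0)
print(s_exact, s_explicit, s_implicit, s_heun, n_steps)
```

With these particular values the daily explicit Euler step sits on its stability limit and the storage oscillates instead of settling, while the implicit and adaptive schemes approach the steady state P/K.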
Human exposure assessment in the near field of GSM base-station antennas using a hybrid finite element/method of moments technique.

PubMed

Meyer, Frans J C; Davidson, David B; Jakobus, Ulrich; Stuchly, Maria A

2003-02-01

A hybrid finite-element method (FEM)/method of moments (MoM) technique is employed for specific absorption rate (SAR) calculations in a human phantom in the near field of a typical group special mobile (GSM) base-station antenna. The MoM is used to model the metallic surfaces and wires of the base-station antenna, and the FEM is used to model the heterogeneous human phantom. The advantages of each of these frequency domain techniques are, thus, exploited, leading to a highly efficient and robust numerical method for addressing this type of bioelectromagnetic problem. The basic mathematical formulation of the hybrid technique is presented. This is followed by a discussion of important implementation details, in particular the linear algebra routines for sparse, complex FEM matrices combined with dense MoM matrices. The implementation is validated by comparing results to MoM (surface equivalence principle implementation) and finite-difference time-domain (FDTD) solutions of human exposure problems. A comparison of the computational efficiency of the different techniques is presented. The FEM/MoM implementation is then used for whole-body and critical-organ SAR calculations in a phantom at different positions in the near field of a base-station antenna. This problem cannot, in general, be solved using the MoM or FDTD due to computational limitations. This paper shows that the specific hybrid FEM/MoM implementation is an efficient numerical tool for accurate assessment of human exposure in the near field of base-station antennas.
Efficient, massively parallel eigenvalue computation

NASA Technical Reports Server (NTRS)

Huo, Yan; Schreiber, Robert

1993-01-01

In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
Computationally efficient method for Fourier transform of highly chirped pulses for laser and parametric amplifier modeling.

PubMed

Andrianov, Alexey; Szabo, Aron; Sergeev, Alexander; Kim, Arkady; Chvykov, Vladimir; Kalashnikov, Mikhail

2016-11-14

We developed an improved approach to calculate the Fourier transform of signals with arbitrarily large quadratic phase which can be efficiently implemented in numerical simulations utilizing the fast Fourier transform. The proposed algorithm significantly reduces the computational cost of the Fourier transform of a highly chirped and stretched pulse by splitting it into two separate transforms of almost transform-limited pulses, thereby reducing the required grid size roughly by a factor of the pulse stretching. The application of our improved Fourier transform algorithm in the split-step method for numerical modeling of CPA and OPCPA shows excellent agreement with standard algorithms.
Optimally analyzing and implementing of bolt fittings in steel structure based on ANSYS

NASA Astrophysics Data System (ADS)

Han, Na; Song, Shuangyang; Cui, Yan; Wu, Yongchun

2018-03-01

Owing to its excellent performance, the ANSYS simulation software has become an outstanding member of the computer-aided engineering (CAE) family; it is committed to innovation in engineering simulation to help users shorten the design process. First, a typical procedure for implementing CAE was designed, and a framework for structural numerical analysis based on ANSYS was proposed. Then, an optimized analysis and implementation of bolt fittings in a beam-column joint of a steel structure was carried out in ANSYS, which displayed contour plots (cloud charts) of the XY shear stress, the YZ shear stress, and the Y component of stress. Finally, the ANSYS simulation results were compared with the results measured in the experiment. The results of the ANSYS simulation and analysis are reliable, efficient and optimal. Through this process, a numerical simulation and analysis model of structural performance was explored for the practice of engineering enterprises.
Lax-Friedrichs sweeping scheme for static Hamilton-Jacobi equations

NASA Astrophysics Data System (ADS)

Kao, Chiu Yen; Osher, Stanley; Qian, Jianliang

2004-05-01

We propose a simple, fast sweeping method based on the Lax-Friedrichs monotone numerical Hamiltonian to approximate viscosity solutions of arbitrary static Hamilton-Jacobi equations in any number of spatial dimensions. By using the Lax-Friedrichs numerical Hamiltonian, we can easily obtain the solution at a specific grid point in terms of its neighbors, so that a Gauss-Seidel type nonlinear iterative method can be utilized. Furthermore, by incorporating a group-wise causality principle into the Gauss-Seidel iteration by following a finite group of characteristics, we have an easy-to-implement, sweeping-type, and fast convergent numerical method. However, unlike other methods based on the Godunov numerical Hamiltonian, some computational boundary conditions are needed in the implementation. We give a simple recipe which enforces a version of discrete min-max principle. Some convergence analysis is done for the one-dimensional eikonal equation. Extensive 2-D and 3-D numerical examples illustrate the efficiency and accuracy of the new approach. To our knowledge, this is the first fast numerical method based on discretizing the Hamilton-Jacobi equation directly without assuming convexity and/or homogeneity of the Hamiltonian.
Composite SAR imaging using sequential joint sparsity

NASA Astrophysics Data System (ADS)

Sanders, Toby; Gelb, Anne; Platte, Rodrigo B.

2017-06-01

This paper investigates accurate and efficient ℓ1 regularization methods for generating synthetic aperture radar (SAR) images. Although ℓ1 regularization algorithms are already employed in SAR imaging, practical and efficient implementation in terms of real time imaging remains a challenge. Here we demonstrate that fast numerical operators can be used to robustly implement ℓ1 regularization methods that are as or more efficient than traditional approaches such as back projection, while providing superior image quality. In particular, we develop a sequential joint sparsity model for composite SAR imaging which naturally combines the joint sparsity methodology with composite SAR. Our technique, which can be implemented using standard, fractional, or higher order total variation regularization, is able to reduce the effects of speckle and other noisy artifacts with little additional computational cost. Finally we show that generalizing total variation regularization to non-integer and higher orders provides improved flexibility and robustness for SAR imaging.
Hierarchical matrices implemented into the boundary integral approaches for gravity field modelling

NASA Astrophysics Data System (ADS)

Čunderlík, Róbert; Vipiana, Francesca

2017-04-01

Boundary integral approaches applied for gravity field modelling have been recently developed to solve the geodetic boundary value problems numerically, or to process satellite observations, e.g. from the GOCE satellite mission. In order to obtain numerical solutions of "cm-level" accuracy, such approaches require a very refined level of discretization or resolution. This leads to enormous memory requirements that need to be reduced. An implementation of the Hierarchical Matrices (H-matrices) can significantly reduce the numerical complexity of these approaches. The main idea of the H-matrices is based on an approximation of the entire system matrix that is split into a family of submatrices. Large submatrices are stored in factorized representation, while small submatrices are stored in standard representation. This allows reducing memory requirements significantly while improving the efficiency. The poster presents our preliminary results of implementations of the H-matrices into the existing boundary integral approaches based on the boundary element method or the method of fundamental solutions.
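The factorized storage of large submatrices described in this abstract can be illustrated with a single block. The sketch below compresses one interaction block between two well-separated point clusters by a truncated SVD. The 1/r kernel, the cluster geometry, the tolerance, and the use of an SVD (rather than whichever compression scheme the authors employ) are all assumptions made for the example.

```python
import numpy as np


def low_rank_factors(block, rel_tol=1e-6):
    """Return (left, right) with left @ right approximating block (truncated SVD)."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    rank = max(1, int(np.sum(s > rel_tol * s[0])))
    return u[:, :rank] * s[:rank], vt[:rank]


rng = np.random.default_rng(1)
sources = rng.random((300, 3))                               # points in the unit cube
targets = rng.random((300, 3)) + np.array([5.0, 0.0, 0.0])   # shifted, well separated

# Dense interaction block for the kernel 1 / distance.
dist = np.linalg.norm(targets[:, None, :] - sources[None, :, :], axis=2)
block = 1.0 / dist

left, right = low_rank_factors(block)
stored = left.size + right.size
rel_err = np.linalg.norm(left @ right - block) / np.linalg.norm(block)
print(left.shape[1], stored, block.size, rel_err)
```

The printed numbers show the retained rank, the entries stored in factorized form against the entries of the dense block, and the relative approximation error. Blocks coupling nearby clusters do not compress this way, which is why an H-matrix keeps small near-field submatrices in standard form.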
Rapid Prediction of Unsteady Three-Dimensional Viscous Flows in Turbopump Geometries

NASA Technical Reports Server (NTRS)

Dorney, Daniel J.

1998-01-01

A program is underway to improve the efficiency of a three-dimensional Navier-Stokes code and generalize it for nozzle and turbopump geometries. Code modifications have included the implementation of parallel processing software, incorporation of new physical models and generalization of the multiblock capability. The final report contains details of code modifications, numerical results for several nozzle and turbopump geometries, and the implementation of the parallelization software.
Implementation of Preconditioned Dual-Time Procedures in OVERFLOW

NASA Technical Reports Server (NTRS)

Pandya, Shishir A.; Venkateswaran, Sankaran; Pulliam, Thomas H.; Kwak, Dochan (Technical Monitor)

2003-01-01

Preconditioning methods have become the method of choice for the solution of flowfields involving the simultaneous presence of low Mach and transonic regions. It is well known that these methods are important for ensuring accurate numerical discretization as well as convergence efficiency over various operating conditions such as low Mach number, low Reynolds number and high Strouhal numbers. For unsteady problems, the preconditioning is introduced within a dual-time framework wherein the physical time-derivatives are used to march the unsteady equations and the preconditioned time-derivatives are used for purposes of numerical discretization and iterative solution. In this paper, we describe the implementation of the preconditioned dual-time methodology in the OVERFLOW code. To demonstrate the performance of the method, we employ both simple and practical unsteady flowfields, including vortex propagation in a low Mach number flow, flowfield of an impulsively started plate (Stokes' first problem) and a cylindrical jet in a low Mach number crossflow with ground effect. All the results demonstrate that the preconditioning algorithm is responsible for improvements to both numerical accuracy and convergence efficiency and, thereby, enables low Mach number unsteady computations to be performed at a fraction of the cost of traditional time-marching methods.
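The dual-time idea can be shown on a scalar model problem. The sketch below is only an illustration under simplifying assumptions: the equation du/dt = -lam*u, a backward-Euler physical time step, and an explicit pseudo-time iteration with no preconditioning. The function name and parameter values are hypothetical and are not taken from OVERFLOW.

```python
def dual_time_step(u_n, dt, lam, dtau, n_inner=200, tol=1e-12):
    """Advance du/dt = -lam*u by one backward-Euler step of size dt,
    converging the implicit equation by explicit pseudo-time iteration."""
    u = u_n
    for _ in range(n_inner):
        # Unsteady residual: physical time derivative minus the right-hand side.
        residual = (u - u_n) / dt + lam * u
        if abs(residual) < tol:
            break
        u -= dtau * residual       # march in pseudo-time tau
    return u

u0, dt, lam = 1.0, 0.1, 5.0
u1 = dual_time_step(u0, dt, lam, dtau=0.05)
exact_backward_euler = u0 / (1.0 + lam * dt)
print(u1, exact_backward_euler)
```

The inner loop converges here because dtau*(1/dt + lam) is below 2. In a flow solver the pseudo-time term is where a preconditioning matrix would enter, which changes the convergence rate of the inner iteration but not the converged physical-time solution.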
Development of the Tensoral Computer Language

NASA Technical Reports Server (NTRS)

Ferziger, Joel; Dresselhaus, Eliot

1996-01-01

The research scientist or engineer wishing to perform large scale simulations or to extract useful information from existing databases is required to have expertise in the details of the particular database, the numerical methods and the computer architecture to be used. This poses a significant practical barrier to the use of simulation data. The goal of this research was to develop a high-level computer language called Tensoral, designed to remove this barrier. The Tensoral language provides a framework in which efficient generic data manipulations can be easily coded and implemented. First of all, Tensoral is general. The fundamental objects in Tensoral represent tensor fields and the operators that act on them. The numerical implementation of these tensors and operators is completely and flexibly programmable. New mathematical constructs and operators can be easily added to the Tensoral system. Tensoral is compatible with existing languages. Tensoral tensor operations co-exist in a natural way with a host language, which may be any sufficiently powerful computer language such as Fortran, C, or Vectoral. Tensoral is very-high-level. Tensor operations in Tensoral typically act on entire databases (i.e., arrays) at one time and may, therefore, correspond to many lines of code in a conventional language. Tensoral is efficient. Tensoral is a compiled language. Database manipulations are simplified, optimized, and scheduled by the compiler, eventually resulting in efficient machine code to implement them.
A shallow water model for the propagation of tsunami via Lattice Boltzmann method

NASA Astrophysics Data System (ADS)

Zergani, Sara; Aziz, Z. A.; Viswanathan, K. K.

2015-01-01

An efficient implementation of the lattice Boltzmann method (LBM) for the numerical simulation of the propagation of long ocean waves (e.g. tsunami), based on the nonlinear shallow water (NSW) wave equation, is presented. The LBM is an alternative numerical procedure for the description of incompressible hydrodynamics and has the potential to serve as an efficient solver for incompressible flows in complex geometries. This work uses the NSW equations for irrotational surface waves in the case of complex bottom elevation. Shallow water equations are currently the norm in tsunami modelling, including estimation of the propagation zone. Several test cases are presented to verify our model. Some implications for tsunami wave modelling are also discussed. Numerical results are found to be in excellent agreement with theory.
Efficient implementation of a 3-dimensional ADI method on the iPSC/860

DOE Office of Scientific and Technical Information (OSTI.GOV)

Van der Wijngaart, R.F.

1993-12-31

A comparison is made between several domain decomposition strategies for the solution of three-dimensional partial differential equations on a MIMD distributed memory parallel computer. The grids used are structured, and the numerical algorithm is ADI. Important implementation issues regarding load balancing, storage requirements, network latency, and overlap of computations and communications are discussed. Results of the solution of the three-dimensional heat equation on the Intel iPSC/860 are presented for the three most viable methods. It is found that the Bruno-Cappello decomposition delivers optimal computational speed through an almost complete elimination of processor idle time, while providing good memory efficiency.
A novel unsplit perfectly matched layer for the second-order acoustic wave equation.

PubMed

Ma, Youneng; Yu, Jinhua; Wang, Yuanyuan

2014-08-01

When solving acoustic field equations by numerical approximation techniques, absorbing boundary conditions (ABCs) are widely used to truncate the simulation to a finite space. The perfectly matched layer (PML) technique has exhibited excellent absorbing efficiency as an ABC for the acoustic wave equation formulated as a first-order system. However, as the PML was originally designed for the first-order equation system, it cannot be applied to the second-order equation system directly. In this article, we aim to extend the unsplit PML to the second-order equation system. We developed an efficient unsplit implementation of PML for the second-order acoustic wave equation based on an auxiliary-differential-equation (ADE) scheme. The proposed method can benefit the use of PML in simulations based on second-order equations. Compared with the existing PMLs, it has a simpler implementation and requires less extra storage. Numerical results from finite-difference time-domain models are provided to illustrate the validity of the approach. Copyright © 2014 Elsevier B.V. All rights reserved.
Quantum Monte Carlo for large chemical systems: implementing efficient strategies for petascale platforms and beyond.

PubMed

Scemama, Anthony; Caffarel, Michel; Oseret, Emmanuel; Jalby, William

2013-04-30

Various strategies to implement efficiently quantum Monte Carlo (QMC) simulations for large chemical systems are presented. These include: (i) the introduction of an efficient algorithm to calculate the computationally expensive Slater matrices. This novel scheme is based on the use of the highly localized character of atomic Gaussian basis functions (not the molecular orbitals as usually done), (ii) the possibility of keeping the memory footprint minimal, (iii) the important enhancement of single-core performance when efficient optimization tools are used, and (iv) the definition of a universal, dynamic, fault-tolerant, and load-balanced framework adapted to all kinds of computational platforms (massively parallel machines, clusters, or distributed grids). These strategies have been implemented in the QMC=Chem code developed at Toulouse and illustrated with numerical applications on small peptides of increasing sizes (158, 434, 1056, and 1731 electrons). Using 10-80 k computing cores of the Curie machine (GENCI-TGCC-CEA, France), QMC=Chem has been shown to be capable of running at the petascale level, thus demonstrating that for this machine a large part of the peak performance can be achieved. Implementation of large-scale QMC simulations for future exascale platforms with a comparable level of efficiency is expected to be feasible. Copyright © 2013 Wiley Periodicals, Inc.
A spectral boundary integral equation method for the 2-D Helmholtz equation

NASA Technical Reports Server (NTRS)

Hu, Fang Q.

1994-01-01

In this paper, we present a new numerical formulation of solving the boundary integral equations reformulated from the Helmholtz equation. The boundaries of the problems are assumed to be smooth closed contours. The solution on the boundary is treated as a periodic function, which is in turn approximated by a truncated Fourier series. A Fourier collocation method is followed in which the boundary integral equation is transformed into a system of algebraic equations. It is shown that in order to achieve spectral accuracy for the numerical formulation, the nonsmoothness of the integral kernels, associated with the Helmholtz equation, must be carefully removed. The emphasis of the paper is on investigating the essential elements of removing the nonsmoothness of the integral kernels in the spectral implementation. The present method is robust for a general boundary contour. Aspects of efficient implementation of the method using FFT are also discussed. A numerical example of wave scattering is given in which the exponential accuracy of the present numerical method is demonstrated.
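The exponential (spectral) accuracy obtained for smooth periodic data can be seen in a much simpler setting than the boundary integral equation itself. The sketch below differentiates a smooth 2π-periodic test function with the FFT; the test function exp(sin θ) and the grid sizes are arbitrary choices made for this illustration and do not come from the paper.

```python
import numpy as np

def spectral_derivative(f_vals):
    """Differentiate samples of a 2*pi-periodic function using the FFT."""
    n = f_vals.size
    k = np.fft.fftfreq(n, d=1.0 / n)       # integer wavenumbers
    if n % 2 == 0:
        k[n // 2] = 0.0                    # drop the unmatched Nyquist mode
    return np.real(np.fft.ifft(1j * k * np.fft.fft(f_vals)))

errors = {}
for n in (8, 16, 32):
    theta = 2.0 * np.pi * np.arange(n) / n
    f = np.exp(np.sin(theta))
    exact = np.cos(theta) * f
    errors[n] = np.max(np.abs(spectral_derivative(f) - exact))
    print(n, errors[n])
```

Doubling the number of points reduces the error by many orders of magnitude until rounding error dominates, in contrast to the fixed algebraic rate of a finite-difference formula. This rapid decay depends on the sampled function being smooth, which is why nonsmooth kernel contributions need separate treatment.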

Improved locality of the phase-field lattice-Boltzmann model for immiscible fluids at high density ratios

NASA Astrophysics Data System (ADS)

Fakhari, Abbas; Mitchell, Travis; Leonardi, Christopher; Bolster, Diogo

2017-11-01

Based on phase-field theory, we introduce a robust lattice-Boltzmann equation for modeling immiscible multiphase flows at large density and viscosity contrasts. Our approach is built by modifying the method proposed by Zu and He [Phys. Rev. E 87, 043301 (2013), 10.1103/PhysRevE.87.043301] in such a way as to improve efficiency and numerical stability. In particular, we employ a different interface-tracking equation based on the so-called conservative phase-field model, a simplified equilibrium distribution that decouples pressure and velocity calculations, and a local scheme based on the hydrodynamic distribution functions for calculation of the stress tensor. In addition to two distribution functions for interface tracking and recovery of hydrodynamic properties, the only nonlocal variable in the proposed model is the phase field. Moreover, within our framework there is no need to use biased or mixed difference stencils for numerical stability and accuracy at high density ratios. This not only simplifies the implementation and efficiency of the model, but also leads to a model that is better suited to parallel implementation on distributed-memory machines. Several benchmark cases are considered to assess the efficacy of the proposed model, including the layered Poiseuille flow in a rectangular channel, Rayleigh-Taylor instability, and the rise of a Taylor bubble in a duct. The numerical results are in good agreement with available numerical and experimental data.
Further optimization of SeDDaRA blind image deconvolution algorithm and its DSP implementation

NASA Astrophysics Data System (ADS)

Wen, Bo; Zhang, Qiheng; Zhang, Jianlin

2011-11-01

An efficient algorithm for blind image deconvolution and its high-speed implementation are of great value in practice. SeDDaRA is further optimized, from algorithm structure to numerical calculation methods. The main optimizations are modularization of the structure for implementation feasibility, reduction of the data computation and dependency of the 2D-FFT/IFFT, and acceleration of the power operation by a segmented look-up table. The resulting Fast SeDDaRA is specialized for low complexity. As the final implementation, a hardware image restoration system is built using multi-DSP parallel processing. Experimental results show that the processing time and memory demand of Fast SeDDaRA decrease by at least 50%, and the data throughput of the image restoration system exceeds 7.8 Msps. The optimization proves efficient and feasible, and Fast SeDDaRA is able to support real-time applications.
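A segmented look-up table for the power operation can be sketched as follows. This is a guess at the general idea only: equal-width segments on [0, 1], linear interpolation inside each segment, and the exponent 0.5 are all assumptions made for illustration, and the segmentation actually used on the DSP is not described in the abstract.

```python
import numpy as np

def build_power_lut(alpha, n_segments=256):
    """Tabulate x**alpha at the edges of equal-width segments on [0, 1]."""
    edges = np.linspace(0.0, 1.0, n_segments + 1)
    return edges, edges ** alpha

def lut_power(x, edges, table):
    """Approximate x**alpha by linear interpolation within each segment."""
    return np.interp(x, edges, table)

alpha = 0.5                                # hypothetical exponent
edges, table = build_power_lut(alpha)
x = np.linspace(0.1, 1.0, 1000)
max_err = np.max(np.abs(lut_power(x, edges, table) - x ** alpha))
print(max_err)
```

The table replaces a transcendental evaluation per sample with one lookup and one multiply-add, at the cost of an interpolation error that grows where the function curves sharply (here, near zero), which is one reason to use non-uniform segments in practice.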
Cymatics for the cloaking of flexural vibrations in a structured plate

PubMed Central

Misseroni, D.; Colquitt, D. J.; Movchan, A. B.; Movchan, N. V.; Jones, I. S.

2016-01-01

Based on rigorous theoretical findings, we present a proof-of-concept design for a structured square cloak enclosing a void in an elastic lattice. We implement high-precision fabrication and experimental testing of an elastic invisibility cloak for flexural waves in a mechanical lattice. This is accompanied by verifications and numerical modelling performed through finite element simulations. The primary advantage of our square lattice cloak, over other designs, is the straightforward implementation and the ease of construction. The elastic lattice cloak, implemented experimentally, shows high efficiency. PMID:27068339
Implementing Parquet equations using HPX

NASA Astrophysics Data System (ADS)

Kellar, Samuel; Wagle, Bibek; Yang, Shuxiang; Tam, Ka-Ming; Kaiser, Hartmut; Moreno, Juana; Jarrell, Mark

A new C++ runtime system (HPX) enables simulations of complex systems to run more efficiently on parallel and heterogeneous systems. This increased efficiency allows for solutions to larger simulations of the parquet approximation for a system with impurities. The relevancy of the parquet equations depends upon the ability to solve systems which require long runs and large amounts of memory. These limitations, in addition to numerical complications arising from stability of the solutions, necessitate running on large distributed systems. As computational resources trend towards the exascale and the limitations arising from computational resources vanish, efficiency of large scale simulations becomes a focus. HPX facilitates efficient simulations through intelligent overlapping of computation and communication. Simulations such as the parquet equations, which require the transfer of large amounts of data, should benefit from HPX implementations. Supported by the NSF EPSCoR Cooperative Agreement No. EPS-1003897 with additional support from the Louisiana Board of Regents.
An Initial Multi-Domain Modeling of an Actively Cooled Structure

NASA Technical Reports Server (NTRS)

Steinthorsson, Erlendur

1997-01-01

A methodology for the simulation of turbine cooling flows is being developed. The methodology seeks to combine numerical techniques that optimize both accuracy and computational efficiency. Key components of the methodology include the use of multiblock grid systems for modeling complex geometries, and multigrid convergence acceleration for enhancing computational efficiency in highly resolved fluid flow simulations. The use of the methodology has been demonstrated in several turbo machinery flow and heat transfer studies. Ongoing and future work involves implementing additional turbulence models, improving computational efficiency, adding AMR.
Designing Adaptive Low-Dissipative High Order Schemes for Long-Time Integrations. Chapter 1

NASA Technical Reports Server (NTRS)

Yee, Helen C.; Sjoegreen, B.; Mansour, Nagi N. (Technical Monitor)

2001-01-01

A general framework for the design of adaptive low-dissipative high order schemes is presented. It encompasses a rather complete treatment of the numerical approach based on four integrated design criteria: (1) For stability considerations, condition the governing equations before the application of the appropriate numerical scheme whenever it is possible; (2) For consistency, compatible schemes that possess stability properties, including physical and numerical boundary condition treatments, similar to those of the discrete analogue of the continuum are preferred; (3) For the minimization of numerical dissipation contamination, efficient and adaptive numerical dissipation control to further improve nonlinear stability and accuracy should be used; and (4) For practical considerations, the numerical approach should be efficient and applicable to general geometries, and an efficient and reliable dynamic grid adaptation should be used if necessary. These design criteria are, in general, very useful to a wide spectrum of flow simulations. However, the demand on the overall numerical approach for nonlinear stability and accuracy is much more stringent for long-time integration of complex multiscale viscous shock/shear/turbulence/acoustics interactions and numerical combustion. Robust classical numerical methods for less complex flow physics are not suitable or practical for such applications. The present approach is designed expressly to address such flow problems, especially unsteady flows. The minimization of employing very fine grids to overcome the production of spurious numerical solutions and/or instability due to under-resolved grids is also sought. The incremental studies to illustrate the performance of the approach are summarized. Extensive testing and full implementation of the approach is forthcoming. The results shown so far are very encouraging.
An efficient spectral method for the simulation of dynamos in Cartesian geometry and its implementation on massively parallel computers

NASA Astrophysics Data System (ADS)

Stellmach, Stephan; Hansen, Ulrich

2008-05-01

Numerical simulations of the process of convection and magnetic field generation in planetary cores still fail to reach geophysically realistic control parameter values. Future progress in this field depends crucially on efficient numerical algorithms which are able to take advantage of the newest generation of parallel computers. Desirable features of simulation algorithms include (1) spectral accuracy, (2) an operation count per time step that is small and roughly proportional to the number of grid points, (3) memory requirements that scale linearly with resolution, (4) an implicit treatment of all linear terms including the Coriolis force, (5) the ability to treat all kinds of common boundary conditions, and (6) reasonable efficiency on massively parallel machines with tens of thousands of processors. So far, algorithms for fully self-consistent dynamo simulations in spherical shells do not achieve all these criteria simultaneously, resulting in strong restrictions on the possible resolutions. In this paper, we demonstrate that local dynamo models in which the process of convection and magnetic field generation is only simulated for a small part of a planetary core in Cartesian geometry can achieve the above goal. We propose an algorithm that fulfills the first five of the above criteria and demonstrate that a model implementation of our method on an IBM Blue Gene/L system scales impressively well for up to O(10^4) processors. This allows for numerical simulations at rather extreme parameter values.
Investigating Power Capping toward Energy-Efficient Scientific Applications

DOE PAGES

Haidar, Azzam; Jagode, Heike; Vaccaro, Phil; ...

2018-03-22

The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore how different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.
Magnetohydrodynamic Simulations of Black Hole Accretion Flows Using PATCHWORK, a Multi-Patch, multi-code approach

NASA Astrophysics Data System (ADS)

Avara, Mark J.; Noble, Scott; Shiokawa, Hotaka; Cheng, Roseanne; Campanelli, Manuela; Krolik, Julian H.

2017-08-01

A multi-patch approach to numerical simulations of black hole accretion flows allows one to robustly match numerical grid shape and equations solved to the natural structure of the physical system. For instance, a Cartesian gridded patch can be used to cover coordinate singularities on a spherical-polar grid, increasing computational efficiency and better capturing the physical system through natural symmetries. We will present early tests, initial applications, and first results from the new MHD implementation of the PATCHWORK framework.
An accurate method for solving a class of fractional Sturm-Liouville eigenvalue problems

NASA Astrophysics Data System (ADS)

Kashkari, Bothayna S. H.; Syam, Muhammed I.

2018-06-01

This article is devoted to both the theoretical and numerical study of the eigenvalues of a nonsingular fractional second-order Sturm-Liouville problem. In this paper, we implement a fractional-order Legendre Tau method to approximate the eigenvalues. This method transforms the Sturm-Liouville problem to a sparse nonsingular linear system which is solved using the continuation method. Theoretical results for the considered problem are provided and proved. Numerical results are presented to show the efficiency of the proposed method.
Picture Archiving and Communication System (PACS) implementation, integration & benefits in an integrated health system.

PubMed

Mansoori, Bahar; Erhard, Karen K; Sunshine, Jeffrey L

2012-02-01

The availability of the Picture Archiving and Communication System (PACS) has revolutionized the practice of radiology in the past two decades and has been shown to eventually increase productivity in radiology and medicine. PACS implementation and integration may bring along numerous unexpected issues, particularly in a large-scale enterprise. To achieve a successful PACS implementation, identifying the critical success and failure factors is essential. This article provides an overview of the process of implementing and integrating PACS in a comprehensive health system comprising an academic core hospital and numerous community hospitals. Important issues are addressed, touching all stages from planning to operation and training. The impact of an enterprise-wide radiology information system and PACS at the academic medical center (four specialty hospitals), in six additional community hospitals, and in all associated outpatient clinics as well as the implications on the productivity and efficiency of the entire enterprise are presented. Copyright © 2012 AUR. Published by Elsevier Inc. All rights reserved.
Toward Fast and Accurate Binding Affinity Prediction with pmemdGTI: An Efficient Implementation of GPU-Accelerated Thermodynamic Integration.

PubMed

Lee, Tai-Sung; Hu, Yuan; Sherborne, Brad; Guo, Zhuyan; York, Darrin M

2017-07-11

We report the implementation of the thermodynamic integration method on the pmemd module of the AMBER 16 package on GPUs (pmemdGTI). The pmemdGTI code typically delivers over 2 orders of magnitude of speed-up relative to a single CPU core for the calculation of ligand-protein binding affinities with no statistically significant numerical differences and thus provides a powerful new tool for drug discovery applications.
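Thermodynamic integration estimates a free energy difference as the integral over a coupling parameter λ of the ensemble average ⟨dU/dλ⟩. The sketch below works through that formula on a toy system where every quantity is known analytically: a one-dimensional harmonic oscillator whose force constant is switched from `k0` to `k1`. The system, the force constants, and the 21 λ windows are hypothetical choices for illustration and have nothing to do with the AMBER implementation, where the averages come from molecular dynamics sampling.

```python
import numpy as np

kT = 0.593          # approximate thermal energy in kcal/mol near 298 K
k0, k1 = 1.0, 4.0   # hypothetical force constants of the two end states

def mean_dU_dlambda(lam):
    """Ensemble average <dU/dlambda> for U = 0.5*k(lam)*x**2,
    with k(lam) = k0 + lam*(k1 - k0); uses <x**2> = kT / k(lam)."""
    k = k0 + lam * (k1 - k0)
    return 0.5 * (k1 - k0) * kT / k

# Evaluate the average on a grid of lambda windows and integrate (trapezoid rule).
lambdas = np.linspace(0.0, 1.0, 21)
values = mean_dU_dlambda(lambdas)
delta_F_ti = float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(lambdas)))

# Closed-form result for the harmonic oscillator, for comparison.
delta_F_exact = 0.5 * kT * np.log(k1 / k0)
print(delta_F_ti, delta_F_exact)
```

In a real calculation each value of ⟨dU/dλ⟩ carries statistical error from finite sampling, so the number and placement of λ windows trade quadrature error against simulation cost.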
Velocity-gauge real-time TDDFT within a numerical atomic orbital basis set

NASA Astrophysics Data System (ADS)

Pemmaraju, C. D.; Vila, F. D.; Kas, J. J.; Sato, S. A.; Rehr, J. J.; Yabana, K.; Prendergast, David

2018-05-01

The interaction of laser fields with solid-state systems can be modeled efficiently within the velocity-gauge formalism of real-time time dependent density functional theory (RT-TDDFT). In this article, we discuss the implementation of the velocity-gauge RT-TDDFT equations for electron dynamics within a linear combination of atomic orbitals (LCAO) basis set framework. Numerical results obtained from our LCAO implementation, for the electronic response of periodic systems to both weak and intense laser fields, are compared to those obtained from established real-space grid and Full-Potential Linearized Augmented Planewave approaches. Potential applications of the LCAO based scheme in the context of extreme ultra-violet and soft X-ray spectroscopies involving core-electronic excitations are discussed.
Unconditionally energy stable time stepping scheme for Cahn–Morral equation: Application to multi-component spinodal decomposition and optimal space tiling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tavakoli, Rouhollah, E-mail: rtavakoli@sharif.ir

An unconditionally energy stable time stepping scheme is introduced to solve Cahn–Morral-like equations in the present study. It is constructed based on the combination of David Eyre's time stepping scheme and Schur complement approach. Although the presented method is general and independent of the choice of homogeneous free energy density function term, logarithmic and polynomial energy functions are specifically considered in this paper. The method is applied to study the spinodal decomposition in multi-component systems and optimal space tiling problems. A penalization strategy is developed, in the case of the latter problem, to avoid trivial solutions. Extensive numerical experiments demonstrate the success and performance of the presented method. According to the numerical results, the method is convergent and energy stable, independent of the choice of time stepsize. Its MATLAB implementation is included in the appendix for the numerical evaluation of algorithm and reproduction of the presented results. -- Highlights: • Extension of Eyre's convex–concave splitting scheme to multiphase systems. • Efficient solution of spinodal decomposition in multi-component systems. • Efficient solution of least perimeter periodic space partitioning problem. • Developing a penalization strategy to avoid trivial solutions. • Presentation of MATLAB implementation of the introduced algorithm.
A fast numerical solution of scattering by a cylinder: Spectral method for the boundary integral equations

NASA Technical Reports Server (NTRS)

Hu, Fang Q.

1994-01-01

It is known that the exact analytic solutions of wave scattering by a circular cylinder, when they exist, are not in a closed form but in infinite series that converge slowly for high frequency waves. In this paper, we present a fast numerical solution for the scattering problem in which the boundary integral equations, reformulated from the Helmholtz equation, are solved using a Fourier spectral method. It is shown that the special geometry considered here allows the implementation of the spectral method to be simple and very efficient. The present method differs from previous approaches in that the singularities of the integral kernels are removed and dealt with accurately. The proposed method preserves the spectral accuracy and is shown to have an exponential rate of convergence. Aspects of efficient implementation using FFT are discussed. Moreover, the boundary integral equations of combined single and double-layer representation are used in the present paper. This ensures the uniqueness of the numerical solution for the scattering problem at all frequencies. Although a strongly singular kernel is encountered for the Neumann boundary conditions, we show that the hypersingularity can be handled easily in the spectral method. Numerical examples that demonstrate the validity of the method are also presented.
Numerical aspects and implementation of a two-layer zonal wall model for LES of compressible turbulent flows on unstructured meshes

NASA Astrophysics Data System (ADS)

Park, George Ilhwan; Moin, Parviz

2016-01-01

This paper focuses on numerical and practical aspects associated with a parallel implementation of a two-layer zonal wall model for large-eddy simulation (LES) of compressible wall-bounded turbulent flows on unstructured meshes. A zonal wall model based on the solution of unsteady three-dimensional Reynolds-averaged Navier-Stokes (RANS) equations on a separate near-wall grid is implemented in an unstructured, cell-centered finite-volume LES solver. The main challenge in its implementation is to couple two parallel, unstructured flow solvers for efficient boundary data communication and simultaneous time integrations. A coupling strategy with good load balancing and low processor underutilization is identified. Face mapping and interpolation procedures at the coupling interface are explained in detail. The method of manufactured solutions is used for verifying the correct implementation of solver coupling, and parallel performance of the combined wall-modeled LES (WMLES) solver is investigated. The method has successfully been applied to several attached and separated flows, including a transitional flow over a flat plate and a separated flow over an airfoil at an angle of attack.
Lattice Boltzmann Method for 3-D Flows with Curved Boundary

NASA Technical Reports Server (NTRS)

Mei, Renwei; Shyy, Wei; Yu, Dazhi; Luo, Li-Shi

2002-01-01

In this work, we investigate two issues that are important to computational efficiency and reliability in fluid dynamics applications of the lattice Boltzmann equation (LBE): (1) computational stability and accuracy of different lattice Boltzmann models and (2) the treatment of the boundary conditions on curved solid boundaries and their 3-D implementations. Three athermal 3-D LBE models (D3Q15, D3Q19, and D3Q27) are studied and compared in terms of efficiency, accuracy, and robustness. The boundary treatment recently developed by Filippova and Hanel and Mei et al. in 2-D is extended to and implemented for 3-D. The convergence, stability, and computational efficiency of the 3-D LBE models with the boundary treatment for curved boundaries were tested in simulations of four 3-D flows: (1) fully developed flows in a square duct, (2) flow in a 3-D lid-driven cavity, (3) fully developed flows in a circular pipe, and (4) a uniform flow over a sphere. We found that while the fifteen-velocity 3-D (D3Q15) model is more prone to numerical instability and the D3Q27 is more computationally intensive, the D3Q19 model provides a balance between computational reliability and efficiency. Through numerical simulations, we demonstrated that the boundary treatment for 3-D arbitrary curved geometry has second-order accuracy and possesses satisfactory stability characteristics.
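To make the three lattice names concrete, the sketch below builds the D3Q15, D3Q19, and D3Q27 velocity sets and checks two basic isotropy conditions. The weights are the standard textbook values for a lattice sound speed squared of 1/3; they are not quoted in the abstract and are included here as an assumption.

```python
from fractions import Fraction
from itertools import product

# Weights keyed by squared speed |c|^2 (standard values, assumed here).
WEIGHTS = {
    "D3Q15": {0: Fraction(2, 9), 1: Fraction(1, 9), 3: Fraction(1, 72)},
    "D3Q19": {0: Fraction(1, 3), 1: Fraction(1, 18), 2: Fraction(1, 36)},
    "D3Q27": {0: Fraction(8, 27), 1: Fraction(2, 27),
              2: Fraction(1, 54), 3: Fraction(1, 216)},
}

def lattice(name):
    """Return the discrete velocities and matching weights of a 3-D lattice."""
    table = WEIGHTS[name]
    speed2 = lambda c: sum(k * k for k in c)
    velocities = [c for c in product((-1, 0, 1), repeat=3) if speed2(c) in table]
    weights = [table[speed2(c)] for c in velocities]
    return velocities, weights

def moments(name):
    """Number of velocities, sum of weights, and sum of w * c_x * c_x."""
    velocities, weights = lattice(name)
    m0 = sum(weights)
    m2 = sum(w * c[0] * c[0] for c, w in zip(velocities, weights))
    return len(velocities), m0, m2

for name in WEIGHTS:
    print(name, moments(name))
```

The velocity count is what drives the cost difference noted in the abstract: each additional direction adds one distribution function to stream and collide per lattice node.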
Topography Modeling in Atmospheric Flows Using the Immersed Boundary Method

NASA Technical Reports Server (NTRS)

Ackerman, A. S.; Senocak, I.; Mansour, N. N.; Stevens, D. E.

2004-01-01

Numerical simulation of flow over complex geometry needs accurate and efficient computational methods. Different techniques are available to handle complex geometry. The unstructured grid and multi-block body-fitted grid techniques have been widely adopted for complex geometry in engineering applications. In atmospheric applications, terrain-fitted single grid techniques have found common use. Although these are very effective techniques, their implementation, coupling with the flow algorithm, and efficient parallelization of the complete method are more involved than for a Cartesian grid method. The grid generation can be tedious, and one needs to pay special attention in the numerics to handle skewed cells for conservation purposes. Researchers have long sought alternative methods to ease the effort involved in simulating flow over complex geometry.
Numerical method of lines for the relaxational dynamics of nematic liquid crystals.

PubMed

Bhattacharjee, A K; Menon, Gautam I; Adhikari, R

2008-08-01

We propose an efficient numerical scheme, based on the method of lines, for solving the Landau-de Gennes equations describing the relaxational dynamics of nematic liquid crystals. Our method is computationally easy to implement, balancing requirements of efficiency and accuracy. We benchmark our method through the study of the following problems: the isotropic-nematic interface, growth of nematic droplets in the isotropic phase, and the kinetics of coarsening following a quench into the nematic phase. Our results, obtained through solutions of the full coarse-grained equations of motion with no approximations, provide a stringent test of the de Gennes ansatz for the isotropic-nematic interface, illustrate the anisotropic character of droplets in the nucleation regime, and validate dynamical scaling in the coarsening regime.
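As a rough illustration of the method of lines, the sketch below discretizes space first and then hands the resulting ODE system to a standard time integrator. It uses the scalar diffusion equation u_t = u_xx on a periodic domain as a stand-in; the Landau-de Gennes tensor equations of the paper are not reproduced, and the grid size, step size, and function names are assumptions.

```python
import math

def rhs(u, dx):
    """Semi-discrete right-hand side: second-order periodic Laplacian."""
    n = len(u)
    return [(u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / dx ** 2 for i in range(n)]

def rk4_step(u, dt, dx):
    """One classical Runge-Kutta step for the ODE system du/dt = rhs(u)."""
    shift = lambda a, k: [ui + a * ki for ui, ki in zip(u, k)]
    k1 = rhs(u, dx)
    k2 = rhs(shift(dt / 2.0, k1), dx)
    k3 = rhs(shift(dt / 2.0, k2), dx)
    k4 = rhs(shift(dt, k3), dx)
    return [ui + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for ui, a, b, c, d in zip(u, k1, k2, k3, k4)]

def relax(n=64, t_end=0.5):
    """Relax u(x, 0) = sin(x) on [0, 2*pi) up to time t_end."""
    dx = 2.0 * math.pi / n
    steps = math.ceil(t_end / (0.2 * dx * dx))   # conservative explicit step
    dt = t_end / steps
    u = [math.sin(i * dx) for i in range(n)]
    for _ in range(steps):
        u = rk4_step(u, dt, dx)
    return u

u = relax()
print(max(u), math.exp(-0.5))   # numerical vs. exact amplitude of the sine mode
```

Separating the spatial and temporal discretizations in this way means the time integrator can be swapped without touching the spatial operator, which is part of why the approach is easy to implement.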

Efficient numerical method of freeform lens design for arbitrary irradiance shaping

NASA Astrophysics Data System (ADS)

Wojtanowski, Jacek

2018-05-01

A computational method to design a lens with a flat entrance surface and a freeform exit surface that can transform a collimated, generally non-uniform input beam into a beam with a desired irradiance distribution of arbitrary shape is presented. The methodology is based on non-linear elliptic partial differential equations, known as Monge-Ampère PDEs. This paper describes an original numerical algorithm to solve this problem by applying the Gauss-Seidel method with simplified boundary conditions. A joint MATLAB-ZEMAX environment is used to implement and verify the method. To prove the efficiency of the proposed approach, an exemplary study where the designed lens is faced with the challenging illumination task is shown. An analysis of solution stability, iteration-to-iteration ray mapping evolution (attached in video format), depth of focus and non-zero étendue efficiency is performed.
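The Gauss-Seidel iteration mentioned in this abstract can be sketched on a simpler problem. The example below applies it to the linear Poisson equation on the unit square with zero boundary values, not to the non-linear Monge-Ampère equation of the paper; the grid size, sweep count, and test source term are assumptions.

```python
import math

def gauss_seidel_poisson(source, n, sweeps):
    """Solve -laplace(u) = source on the unit square with u = 0 on the boundary.

    n is the number of interior points per side; updated values are used as
    soon as they are available, which is what distinguishes Gauss-Seidel
    from Jacobi iteration.
    """
    h = 1.0 / (n + 1)
    u = [[0.0] * (n + 2) for _ in range(n + 2)]
    for _ in range(sweeps):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                  + u[i][j - 1] + u[i][j + 1]
                                  + h * h * source(i * h, j * h))
    return u

# Test problem with known solution u = sin(pi*x) * sin(pi*y).
source = lambda x, y: 2.0 * math.pi ** 2 * math.sin(math.pi * x) * math.sin(math.pi * y)
u = gauss_seidel_poisson(source, n=15, sweeps=500)
print(u[8][8])   # value at the centre (0.5, 0.5); the exact solution is 1
```

For a non-linear equation the same sweep structure applies, but each point update then requires solving a local non-linear relation instead of a simple average.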
Subwavelength-thick lenses with high numerical apertures and large efficiency based on high-contrast transmitarrays

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arbabi, Amir; Horie, Yu; Ball, Alexander J.

2015-05-07

Flat optical devices thinner than a wavelength promise to replace conventional free-space components for wavefront and polarization control. Transmissive flat lenses are particularly interesting for applications in imaging and on-chip optoelectronic integration. Several designs based on plasmonic metasurfaces, high-contrast transmitarrays and gratings have been recently implemented but have not provided a performance comparable to conventional curved lenses. Here we report polarization-insensitive, micron-thick, high-contrast transmitarray micro-lenses with focal spots as small as 0.57 λ. The measured focusing efficiency is up to 82%. A rigorous method for ultrathin lens design, and the trade-off between high efficiency and small spot size (or large numerical aperture) are discussed. The micro-lenses, composed of silicon nano-posts on glass, are fabricated in one lithographic step that could be performed with high-throughput photo or nanoimprint lithography, thus enabling widespread adoption.
Efficient Simulation Budget Allocation for Selecting an Optimal Subset

NASA Technical Reports Server (NTRS)

Chen, Chun-Hung; He, Donghai; Fu, Michael; Lee, Loo Hay

2008-01-01

We consider a class of the subset selection problem in ranking and selection. The objective is to identify the top m out of k designs based on simulated output. Traditional procedures are conservative and inefficient. Using the optimal computing budget allocation framework, we formulate the problem as that of maximizing the probability of correctly selecting all of the top-m designs subject to a constraint on the total number of samples available. For an approximation of this correct selection probability, we derive an asymptotically optimal allocation and propose an easy-to-implement heuristic sequential allocation procedure. Numerical experiments indicate that the resulting allocations are superior to other methods in the literature that we tested, and the relative efficiency increases for larger problems. In addition, preliminary numerical results indicate that the proposed new procedure has the potential to enhance computational efficiency for simulation optimization.
Efficient field-theoretic simulation of polymer solutions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Villet, Michael C.; Fredrickson, Glenn H., E-mail: ghf@mrl.ucsb.edu; Department of Materials, University of California, Santa Barbara, California 93106

2014-12-14

We present several developments that facilitate the efficient field-theoretic simulation of polymers by complex Langevin sampling. A regularization scheme using finite Gaussian excluded volume interactions is used to derive a polymer solution model that appears free of ultraviolet divergences and hence is well-suited for lattice-discretized field-theoretic simulation. We show that such models can exhibit ultraviolet sensitivity, a numerical pathology that dramatically increases sampling error in the continuum lattice limit, and further show that this pathology can be eliminated by appropriate model reformulation by variable transformation. We present an exponential time differencing algorithm for integrating complex Langevin equations for field-theoretic simulation, and show that the algorithm exhibits excellent accuracy and stability properties for our regularized polymer model. These developments collectively enable substantially more efficient field-theoretic simulation of polymers, and illustrate the importance of simultaneously addressing analytical and numerical pathologies when implementing such computations.
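The idea behind exponential time differencing can be shown on a scalar stiff equation. The sketch below is a deterministic toy, u' = -c u + cos(t), integrated with first-order ETD and with forward Euler for contrast; it does not include the noise term or the field-theoretic model of the paper, and the parameter values are assumptions.

```python
import math

def etd1(c, forcing, u0, h, steps):
    """First-order exponential time differencing for u' = -c*u + forcing(t).

    The stiff linear part is integrated exactly; only the forcing is frozen
    over each step.
    """
    decay = math.exp(-c * h)
    gain = (1.0 - decay) / c          # integral of exp(-c*s) over one step
    u, t = u0, 0.0
    for _ in range(steps):
        u = decay * u + gain * forcing(t)
        t += h
    return u

def forward_euler(c, forcing, u0, h, steps):
    """Fixed-step explicit Euler, unstable here because c*h > 2."""
    u, t = u0, 0.0
    for _ in range(steps):
        u = u + h * (-c * u + forcing(t))
        t += h
    return u

c, h, steps = 100.0, 0.05, 20         # c*h = 5, final time t = 1
u0 = c / (c * c + 1.0)                # start on the slowly varying solution
exact = (c * math.cos(1.0) + math.sin(1.0)) / (c * c + 1.0)

print(abs(etd1(c, math.cos, u0, h, steps) - exact))
print(abs(forward_euler(c, math.cos, u0, h, steps) - exact))
```

With the same step size, the exponential scheme stays close to the exact solution while forward Euler diverges, which is the stability property that makes the approach attractive for stiff modes.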
Computer investigations of the turbulent flow around a NACA2415 airfoil wind turbine

NASA Astrophysics Data System (ADS)

Driss, Zied; Chelbi, Tarek; Abid, Mohamed Salah

2015-12-01

In this work, computer investigations are carried out to study the flow field developing around a NACA2415 airfoil wind turbine. The Navier-Stokes equations in conjunction with the standard k-ɛ turbulence model are considered. These equations are solved numerically to determine the local characteristics of the flow. The models tested are implemented in the software "SolidWorks Flow Simulation", which uses a finite volume scheme. The numerical results are validated against experiments conducted in an open wind tunnel. This will help improve the aerodynamic efficiency in the design of packaged installations of the NACA2415 airfoil type wind turbine.
A time-efficient implementation of Extended Kalman Filter for sequential orbit determination and a case study for onboard application

NASA Astrophysics Data System (ADS)

Tang, Jingshi; Wang, Haihong; Chen, Qiuli; Chen, Zhonggui; Zheng, Jinjun; Cheng, Haowen; Liu, Lin

2018-07-01

Onboard orbit determination (OD) is often used in space missions, with which mission support can be partially accomplished autonomously, with less dependency on ground stations. In major Global Navigation Satellite Systems (GNSS), inter-satellite link is also an essential upgrade in the future generations. To serve for autonomous operation, sequential OD method is crucial to provide real-time or near real-time solutions. The Extended Kalman Filter (EKF) is an effective and convenient sequential estimator that is widely used in onboard application. The filter requires the solutions of state transition matrix (STM) and the process noise transition matrix, which are always obtained by numerical integration. However, numerically integrating the differential equations is a CPU intensive process and consumes a large portion of the time in EKF procedures. In this paper, we present an implementation that uses the analytical solutions of these transition matrices to replace the numerical calculations. This analytical implementation is demonstrated and verified using a fictitious constellation based on selected medium Earth orbit (MEO) and inclined Geosynchronous orbit (IGSO) satellites. We show that this implementation performs effectively and converges quickly, steadily and accurately in the presence of considerable errors in the initial values, measurements and force models. The filter is able to converge within 2-4 h of flight time in our simulation. The observation residual is consistent with simulated measurement error, which is about a few centimeters in our scenarios. Compared to results implemented with numerically integrated STM, the analytical implementation shows results with consistent accuracy, while it takes only about half the CPU time to filter a 10-day measurement series. The future possible extensions are also discussed to fit in various missions.
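The trade-off described here, a closed-form state transition matrix versus one obtained by numerical integration, can be illustrated on a toy linear system. The sketch below uses a harmonic oscillator, for which the STM is known exactly, rather than the orbital dynamics of the paper; the function names and parameter values are assumptions.

```python
import math

def stm_analytic(w, t):
    """Closed-form STM of x'' = -w^2 x for the state (x, x')."""
    c, s = math.cos(w * t), math.sin(w * t)
    return [[c, s / w], [-w * s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def stm_numeric(w, t, steps):
    """Integrate dPhi/dt = A Phi, Phi(0) = I, with classical RK4."""
    a = [[0.0, 1.0], [-w * w, 0.0]]
    phi = [[1.0, 0.0], [0.0, 1.0]]
    h = t / steps
    axpy = lambda x, y, s: [[x[i][j] + s * y[i][j] for j in range(2)]
                            for i in range(2)]
    for _ in range(steps):
        k1 = matmul(a, phi)
        k2 = matmul(a, axpy(phi, k1, h / 2.0))
        k3 = matmul(a, axpy(phi, k2, h / 2.0))
        k4 = matmul(a, axpy(phi, k3, h))
        phi = [[phi[i][j] + h / 6.0 * (k1[i][j] + 2.0 * k2[i][j]
                                       + 2.0 * k3[i][j] + k4[i][j])
                for j in range(2)] for i in range(2)]
    return phi

exact = stm_analytic(2.0, 3.0)
approx = stm_numeric(2.0, 3.0, steps=600)
print(max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2)))
```

The analytic version costs two trigonometric evaluations, while the integrated version costs four matrix products per step, which gives a sense of where the reported CPU savings could come from.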
Highly Efficient and Scalable Compound Decomposition of Two-Electron Integral Tensor and Its Application in Coupled Cluster Calculations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peng, Bo; Kowalski, Karol

The representation and storage of two-electron integral tensors are vital in large-scale applications of accurate electronic structure methods. Low-rank representation and efficient storage strategy of integral tensors can significantly reduce the numerical overhead and consequently time-to-solution of these methods. In this paper, by combining pivoted incomplete Cholesky decomposition (CD) with a follow-up truncated singular vector decomposition (SVD), we develop a decomposition strategy to approximately represent the two-electron integral tensor in terms of low-rank vectors. A systematic benchmark test on a series of 1-D, 2-D, and 3-D carbon-hydrogen systems demonstrates high efficiency and scalability of the compound two-step decomposition of the two-electron integral tensor in our implementation. For the size of atomic basis set N_b ranging from ~100 up to ~2,000, the observed numerical scaling of our implementation shows O(N_b^{2.5~3}) versus O(N_b^{3~4}) of single CD in most of other implementations. More importantly, this decomposition strategy can significantly reduce the storage requirement of the atomic-orbital (AO) two-electron integral tensor from O(N_b^4) to O(N_b^2 log_{10}(N_b)) with moderate decomposition thresholds. The accuracy tests have been performed using ground- and excited-state formulations of coupled-cluster formalism employing single and double excitations (CCSD) on several benchmark systems including the C_{60} molecule described by nearly 1,400 basis functions. The results show that the decomposition thresholds can be generally set to 10^{-4} to 10^{-3} to give acceptable compromise between efficiency and accuracy.
Highly Efficient and Scalable Compound Decomposition of Two-Electron Integral Tensor and Its Application in Coupled Cluster Calculations.

PubMed

Peng, Bo; Kowalski, Karol

2017-09-12

The representation and storage of two-electron integral tensors are vital in large-scale applications of accurate electronic structure methods. Low-rank representation and efficient storage strategy of integral tensors can significantly reduce the numerical overhead and consequently time-to-solution of these methods. In this work, by combining pivoted incomplete Cholesky decomposition (CD) with a follow-up truncated singular vector decomposition (SVD), we develop a decomposition strategy to approximately represent the two-electron integral tensor in terms of low-rank vectors. A systematic benchmark test on a series of 1-D, 2-D, and 3-D carbon-hydrogen systems demonstrates high efficiency and scalability of the compound two-step decomposition of the two-electron integral tensor in our implementation. For the size of the atomic basis set, N_b, ranging from ∼100 up to ∼2,000, the observed numerical scaling of our implementation shows O(N_b^{2.5~3}) versus O(N_b^{3~4}) cost of performing single CD on the two-electron integral tensor in most of the other implementations. More importantly, this decomposition strategy can significantly reduce the storage requirement of the atomic orbital (AO) two-electron integral tensor from O(N_b^4) to O(N_b^2 log_{10}(N_b)) with moderate decomposition thresholds. The accuracy tests have been performed using ground- and excited-state formulations of coupled cluster formalism employing single and double excitations (CCSD) on several benchmark systems including the C_{60} molecule described by nearly 1,400 basis functions. The results show that the decomposition thresholds can be generally set to 10^{-4} to 10^{-3} to give acceptable compromise between efficiency and accuracy.
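The two-step idea in this abstract, a pivoted incomplete Cholesky factorization followed by a truncated SVD of the Cholesky vectors, can be sketched on a small positive semidefinite matrix. This is only an illustration under assumptions: it uses NumPy, a random rank-5 test matrix in place of a two-electron integral tensor, and hypothetical function names and thresholds.

```python
import numpy as np

def pivoted_cholesky(a, tol):
    """Return L (n x r) with a ~= L @ L.T for a symmetric PSD matrix a.

    Pivots on the largest remaining diagonal entry and stops once that
    entry drops to tol or below.
    """
    n = a.shape[0]
    d = np.diag(a).astype(float).copy()       # diagonal of the residual
    cols = []
    while len(cols) < n:
        p = int(np.argmax(d))
        if d[p] <= tol:
            break
        col = a[:, p].astype(float).copy()
        for prev in cols:                     # subtract earlier rank-1 terms
            col -= prev * prev[p]
        col /= np.sqrt(d[p])
        cols.append(col)
        d -= col ** 2
    return np.column_stack(cols) if cols else np.zeros((n, 0))

def compress(l, svd_tol):
    """Truncated SVD of the Cholesky vectors: keep singular values > svd_tol."""
    u, s, _ = np.linalg.svd(l, full_matrices=False)
    keep = s > svd_tol
    return u[:, keep] * s[keep]

rng = np.random.default_rng(0)
b = rng.standard_normal((40, 5))
a = b @ b.T                                   # rank-5 PSD test matrix

l = pivoted_cholesky(a, tol=1e-10)
f = compress(l, svd_tol=1e-8)
print(l.shape, f.shape, np.max(np.abs(a - f @ f.T)))
```

In this toy case both steps find the same rank; the second step pays off when the Cholesky vectors themselves are numerically redundant at the chosen threshold.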
Implementation of a block Lanczos algorithm for Eigenproblem solution of gyroscopic systems

NASA Technical Reports Server (NTRS)

Gupta, Kajal K.; Lawson, Charles L.

1987-01-01

The details of implementation of a general numerical procedure developed for the accurate and economical computation of natural frequencies and associated modes of any elastic structure rotating along an arbitrary axis are described. A block version of the Lanczos algorithm is derived for the solution that fully exploits associated matrix sparsity and employs only real numbers in all relevant computations. It is also capable of determining multiple roots and proves to be most efficient when compared to other, similar, existing techniques.
Implementation of a modular software system for multiphysical processes in porous media

NASA Astrophysics Data System (ADS)

Naumov, Dmitri; Watanabe, Norihiro; Bilke, Lars; Fischer, Thomas; Lehmann, Christoph; Rink, Karsten; Walther, Marc; Wang, Wenqing; Kolditz, Olaf

2016-04-01

Subsurface georeservoirs are a candidate technology for large scale energy storage required as part of the transition to renewable energy sources. The increased use of the subsurface results in competing interests and possible impacts on protected entities. To optimize and plan the use of the subsurface in large scale scenario analyses, powerful numerical frameworks are required that aid process understanding and can capture the coupled thermal (T), hydraulic (H), mechanical (M), and chemical (C) processes with high computational efficiency. Due to the multitude of different couplings between basic T, H, M, or C processes and the necessity to implement new numerical schemes, the development focus has moved to the software's modularity. The decreased coupling between the components results in two major advantages: easier addition of specialized processes and improvement of the code's testability and therefore its quality. The idea of modularization is implemented on several levels, in addition to library based separation of the previous code version, by using generalized algorithms available in the Standard Template Library and the Boost library, relying on efficient implementations of linear algebra solvers, using concepts when designing new types, and localization of frequently accessed data structures. This procedure shows certain benefits for a flexible high-performance framework applied to the analysis of multipurpose georeservoirs.
BOOK REVIEW: Advanced Topics in Computational Partial Differential Equations: Numerical Methods and Diffpack Programming

NASA Astrophysics Data System (ADS)

Katsaounis, T. D.

2005-02-01

The scope of this book is to present well known simple and advanced numerical methods for solving partial differential equations (PDEs) and how to implement these methods using the programming environment of the software package Diffpack. A basic background in PDEs and numerical methods is required by the potential reader. Further, a basic knowledge of the finite element method and its implementation in one and two space dimensions is required. The authors claim that no prior knowledge of the package Diffpack is required, which is true, but the reader should be at least familiar with an object oriented programming language like C++ in order to better comprehend the programming environment of Diffpack. Certainly, a prior knowledge or usage of Diffpack would be a great advantage to the reader. The book consists of 15 chapters, each one written by one or more authors. Each chapter is basically divided into two parts: the first part is about mathematical models described by PDEs and numerical methods to solve these models and the second part describes how to implement the numerical methods using the programming environment of Diffpack. Each chapter closes with a list of references on its subject. The first nine chapters cover well known numerical methods for solving the basic types of PDEs. Further, programming techniques on the serial as well as on the parallel implementation of numerical methods are also included in these chapters. The last five chapters are dedicated to applications, modelled by PDEs, in a variety of fields. The first chapter is an introduction to parallel processing. It covers fundamentals of parallel processing in a simple and concrete way and no prior knowledge of the subject is required. Examples of parallel implementation of basic linear algebra operations are presented using the Message Passing Interface (MPI) programming environment. Here, some knowledge of MPI routines is required by the reader. 
Examples solving in parallel simple PDEs using Diffpack and MPI are also presented. Chapter 2 presents the overlapping domain decomposition method for solving PDEs. It is well known that these methods are suitable for parallel processing. The first part of the chapter covers the mathematical formulation of the method as well as algorithmic and implementational issues. The second part presents a serial and a parallel implementational framework within the programming environment of Diffpack. The chapter closes by showing how to solve two application examples with the overlapping domain decomposition method using Diffpack. Chapter 3 is a tutorial about how to incorporate the multigrid solver in Diffpack. The method is illustrated by examples such as a Poisson solver, a general elliptic problem with various types of boundary conditions and a nonlinear Poisson type problem. In chapter 4 the mixed finite element is introduced. Technical issues concerning the practical implementation of the method are also presented. The main difficulties of the efficient implementation of the method, especially in two and three space dimensions on unstructured grids, are presented and addressed in the framework of Diffpack. The implementational process is illustrated by two examples, namely the system formulation of the Poisson problem and the Stokes problem. Chapter 5 is closely related to chapter 4 and addresses the problem of how to solve efficiently the linear systems arising by the application of the mixed finite element method. The proposed method is block preconditioning. Efficient techniques for implementing the method within Diffpack are presented. Optimal block preconditioners are used to solve the system formulation of the Poisson problem, the Stokes problem and the bidomain model for the electrical activity in the heart. The subject of chapter 6 is systems of PDEs. Linear and nonlinear systems are discussed. Fully implicit and operator splitting methods are presented. 
Special attention is paid to how existing solvers for scalar equations in Diffpack can be used to derive fully implicit solvers for systems. The proposed techniques are illustrated in terms of two applications, namely a system of PDEs modelling pipeflow and a two-phase porous media flow. Stochastic PDEs is the topic of chapter 7. The first part of the chapter is a simple introduction to stochastic PDEs; basic analytical properties are presented for simple models like transport phenomena and viscous drag forces. The second part considers the numerical solution of stochastic PDEs. Two basic techniques are presented, namely Monte Carlo and perturbation methods. The last part explains how to implement and incorporate these solvers into Diffpack. Chapter 8 describes how to operate Diffpack from Python scripts. The main goal here is to provide all the programming and technical details in order to glue the programming environment of Diffpack with visualization packages through Python and in general take advantage of the Python interfaces. Chapter 9 attempts to show how to use numerical experiments to measure the performance of various PDE solvers. The authors gathered a rather impressive list, a total of 14 PDE solvers. Solvers for problems like Poisson, Navier--Stokes, elasticity, two-phase flows and methods such as finite difference, finite element, multigrid, and gradient type methods are presented. The authors provide a series of numerical results combining various solvers with various methods in order to gain insight into their computational performance and efficiency. In Chapter 10 the authors consider a computationally challenging problem, namely the computation of the electrical activity of the human heart. After a brief introduction on the biology of the problem the authors present the mathematical models involved and a numerical method for solving them within the framework of Diffpack. 
Chapters 11 and 12 are closely related; actually they could have been combined in a single chapter. Chapter 11 introduces several mathematical models used in finance, based on the Black--Scholes equation. Chapter 12 considers several numerical methods like Monte Carlo, lattice methods, finite difference and finite element methods. Implementation of these methods within Diffpack is presented in the last part of the chapter. Chapter 13 presents how the finite element method is used for the modelling and analysis of elastic structures. The authors describe the structural elements of Diffpack which include popular elements such as beams and plates and examples are presented on how to use them to simulate elastic structures. Chapter 14 describes an application problem, namely the extrusion of aluminum. This is a rather complicated process which involves non-Newtonian flow, heat transfer and elasticity. The authors describe the systems of PDEs modelling the underlying process and use a finite element method to obtain a numerical solution. The implementation of the numerical method in Diffpack is presented along with some applications. The last chapter, chapter 15, focuses on mathematical and numerical models of systems of PDEs governing geological processes in sedimentary basins. The underlying mathematical model is solved using the finite element method within a fully implicit scheme. The authors discuss the implementational issues involved within Diffpack and they present results from several examples. In summary, the book focuses on the computational and implementational issues involved in solving partial differential equations. The potential reader should have a basic knowledge of PDEs and the finite difference and finite element methods. The examples presented are solved within the programming framework of Diffpack and the reader should have prior experience with the particular software in order to take full advantage of the book.
Overall the book is well written, the subject of each chapter is well presented and can serve as a reference for graduate students, researchers and engineers who are interested in the numerical solution of partial differential equations modelling various applications.
Boundary particle method for Laplace transformed time fractional diffusion equations

NASA Astrophysics Data System (ADS)

Fu, Zhuo-Jia; Chen, Wen; Yang, Hai-Tian

2013-02-01

This paper develops a novel boundary meshless approach, Laplace transformed boundary particle method (LTBPM), for numerical modeling of time fractional diffusion equations. It implements Laplace transform technique to obtain the corresponding time-independent inhomogeneous equation in Laplace space and then employs a truly boundary-only meshless boundary particle method (BPM) to solve this Laplace-transformed problem. Unlike the other boundary discretization methods, the BPM does not require any inner nodes, since the recursive composite multiple reciprocity technique (RC-MRM) is used to convert the inhomogeneous problem into the higher-order homogeneous problem. Finally, the Stehfest numerical inverse Laplace transform (NILT) is implemented to retrieve the numerical solutions of time fractional diffusion equations from the corresponding BPM solutions. In comparison with finite difference discretization, the LTBPM introduces Laplace transform and Stehfest NILT algorithm to deal with time fractional derivative term, which evades costly convolution integral calculation in time fractional derivation approximation and avoids the effect of time step on numerical accuracy and stability. Consequently, it can effectively simulate long time-history fractional diffusion systems. Error analysis and numerical experiments demonstrate that the present LTBPM is highly accurate and computationally efficient for 2D and 3D time fractional diffusion equations.
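The Stehfest inversion step named in this abstract can be sketched in isolation. The example below implements the standard Gaver-Stehfest formula and applies it to F(s) = 1/(s + 1), whose inverse is exp(-t); the boundary particle solver itself is not reproduced, and the number of terms N = 12 and the test transform are assumptions.

```python
import math

def stehfest_coefficients(n):
    """Gaver-Stehfest weights V_1..V_n for an even number of terms n."""
    half = n // 2
    weights = []
    for k in range(1, n + 1):
        total = 0.0
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += (j ** half * math.factorial(2 * j)) / (
                math.factorial(half - j) * math.factorial(j)
                * math.factorial(j - 1) * math.factorial(k - j)
                * math.factorial(2 * j - k))
        weights.append((-1) ** (k + half) * total)
    return weights

def stehfest_invert(transform, t, n=12):
    """Approximate f(t) from its Laplace transform sampled on the real axis."""
    a = math.log(2.0) / t
    weights = stehfest_coefficients(n)
    return a * sum(v * transform(k * a) for k, v in enumerate(weights, start=1))

approx = stehfest_invert(lambda s: 1.0 / (s + 1.0), t=1.0)
print(approx, math.exp(-1.0))
```

Because each output time is computed independently from a handful of transform evaluations, no time stepping is involved, which matches the abstract's point about avoiding step-size effects on accuracy and stability. The weights alternate in sign and grow quickly with N, so larger N does not help in double precision.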
NAS Applications and Advanced Algorithms

NASA Technical Reports Server (NTRS)

Bailey, David H.; Biswas, Rupak; VanDerWijngaart, Rob; Kutler, Paul (Technical Monitor)

1997-01-01

This paper examines the applications most commonly run on the supercomputers at the Numerical Aerospace Simulation (NAS) facility. It analyzes the extent to which such applications are fundamentally oriented to vector computers, and whether or not they can be efficiently implemented on hierarchical memory machines, such as systems with cache memories and highly parallel, distributed memory systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Meng-Zheng; School of Physics and Electronic Information, Huaibei Normal University, Huaibei 235000; Ye, Liu, E-mail: yeliu@ahu.edu.cn

An efficient scheme is proposed to implement phase-covariant quantum cloning by using a superconducting transmon qubit coupled to a microwave cavity resonator in the strong dispersive limit of circuit quantum electrodynamics (QED). By solving the master equation numerically, we plot the Wigner function and Poisson distribution of the cavity mode after each operation in the cloning transformation sequence according to two logic circuits proposed. The visualizations of the quasi-probability distribution in phase-space for the cavity mode and the occupation probability distribution in the Fock basis enable us to follow the evolution of the cavity mode during the phase-covariant cloning (PCC) transformation. With the help of numerical simulation, we find that the present cloning machine is not the isotropic model because its output fidelity depends on the polar angle and the azimuthal angle of the initial input state on the Bloch sphere. The fidelity for the actual output clone of the present scheme is slightly smaller than that of the theoretical case. The simulation results are consistent with the theoretical ones. This further corroborates that our scheme based on circuit QED can efficiently implement the PCC transformation.
Analytic Formulation and Numerical Implementation of an Acoustic Pressure Gradient Prediction

NASA Technical Reports Server (NTRS)

Lee, Seongkyu; Brentner, Kenneth S.; Farassat, F.; Morris, Philip J.

2008-01-01

Two new analytical formulations of the acoustic pressure gradient have been developed and implemented in the PSU-WOPWOP rotor noise prediction code. The pressure gradient can be used to solve the boundary condition for scattering problems and is a key ingredient in solving acoustic scattering problems. The first formulation is derived from the gradient of the Ffowcs Williams-Hawkings (FW-H) equation. This formulation has a form involving the observer time differentiation outside the integrals. In the second formulation, the time differentiation is taken inside the integrals analytically. This formulation avoids the numerical time differentiation with respect to the observer time and is therefore computationally more efficient. The acoustic pressure gradient predicted by these new formulations is validated through comparison with available exact solutions for stationary and moving monopole sources. The agreement between the predictions and exact solutions is excellent. The formulations are applied to the rotor noise problems for two model rotors. A purely numerical approach is compared with the analytical formulations. The agreement between the analytical formulations and the numerical method is excellent for both stationary and moving observer cases.
Ancient numerical daemons of conceptual hydrological modeling: 2. Impact of time stepping schemes on model analysis and prediction

NASA Astrophysics Data System (ADS)

Kavetski, Dmitri; Clark, Martyn P.

2010-10-01

Despite the widespread use of conceptual hydrological models in environmental research and operations, they remain frequently implemented using numerically unreliable methods. This paper considers the impact of the time stepping scheme on model analysis (sensitivity analysis, parameter optimization, and Markov chain Monte Carlo-based uncertainty estimation) and prediction. It builds on the companion paper (Clark and Kavetski, 2010), which focused on numerical accuracy, fidelity, and computational efficiency. Empirical and theoretical analysis of eight distinct time stepping schemes for six different hydrological models in 13 diverse basins demonstrates several critical conclusions. (1) Unreliable time stepping schemes, in particular, fixed-step explicit methods, suffer from troublesome numerical artifacts that severely deform the objective function of the model. These deformations are not rare isolated instances but can arise in any model structure, in any catchment, and under common hydroclimatic conditions. (2) Sensitivity analysis can be severely contaminated by numerical errors, often to the extent that it becomes dominated by the sensitivity of truncation errors rather than the model equations. (3) Robust time stepping schemes generally produce "better behaved" objective functions, free of spurious local optima, and with sufficient numerical continuity to permit parameter optimization using efficient quasi Newton methods. When implemented within a multistart framework, modern Newton-type optimizers are robust even when started far from the optima and provide valuable diagnostic insights not directly available from evolutionary global optimizers. (4) Unreliable time stepping schemes lead to inconsistent and biased inferences of the model parameters and internal states. 
(5) Even when interactions between hydrological parameters and numerical errors provide "the right result for the wrong reason" and the calibrated model performance appears adequate, unreliable time stepping schemes make the model unnecessarily fragile in predictive mode, undermining validation assessments and operational use. Erroneous or misleading conclusions of model analysis and prediction arising from numerical artifacts in hydrological models are intolerable, especially given that robust numerics are accepted as mainstream in other areas of science and engineering. We hope that the vivid empirical findings will encourage the conceptual hydrological community to close its Pandora's box of numerical problems, paving the way for more meaningful model application and interpretation.
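The contrast between fixed-step explicit and robust time stepping can be sketched on a toy problem. The example below assumes a single linear reservoir, dS/dt = P - k*S, which is an illustrative stand-in and not one of the six models studied in the paper; the function names and parameter values are hypothetical.

```python
def explicit_euler(s0, k, p, h, n_steps):
    """Fixed-step explicit Euler for dS/dt = p - k*S (toy linear reservoir)."""
    s = s0
    history = [s]
    for _ in range(n_steps):
        s = s + h * (p - k * s)  # flux evaluated at the start of the step
        history.append(s)
    return history


def implicit_euler(s0, k, p, h, n_steps):
    """Fixed-step implicit Euler for the same equation.

    The implicit update S_new = S + h*(p - k*S_new) is linear in S_new,
    so it can be solved in closed form.
    """
    s = s0
    history = [s]
    for _ in range(n_steps):
        s = (s + h * p) / (1.0 + h * k)
        history.append(s)
    return history


if __name__ == "__main__":
    # Assumed values: a fast-draining store (k*h = 2.5) with no inflow.
    print("explicit:", explicit_euler(10.0, 2.5, 0.0, 1.0, 5))
    print("implicit:", implicit_euler(10.0, 2.5, 0.0, 1.0, 5))
```

With k*h > 2 the explicit scheme produces negative storage that oscillates with growing amplitude, while the implicit scheme decays monotonically toward zero. Artifacts of this kind are what can distort an objective function during calibration.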
Error and Symmetry Analysis of Misner's Algorithm for Spherical Harmonic Decomposition on a Cubic Grid

NASA Technical Reports Server (NTRS)

Fiske, David R.

2004-01-01

In an earlier paper, Misner (2004, Class. Quant. Grav., 21, S243) presented a novel algorithm for computing the spherical harmonic components of data represented on a cubic grid. I extend Misner's original analysis by making detailed estimates of the numerical errors accrued by the algorithm, by using symmetry arguments to suggest a more efficient implementation scheme, and by explaining how the algorithm can be applied efficiently on data with explicit reflection symmetries.
The instanton method and its numerical implementation in fluid mechanics

NASA Astrophysics Data System (ADS)

Grafke, Tobias; Grauer, Rainer; Schäfer, Tobias

2015-08-01

A precise characterization of structures occurring in turbulent fluid flows at high Reynolds numbers is one of the last open problems of classical physics. In this review we discuss recent developments related to the application of instanton methods to turbulence. Instantons are saddle point configurations of the underlying path integrals. They are equivalent to minimizers of the related Freidlin-Wentzell action and known to be able to characterize rare events in such systems. While there is an impressive body of work concerning their analytical description, this review focuses on the question of how to compute these minimizers numerically. In a short introduction we present the relevant mathematical and physical background before we discuss the stochastic Burgers equation in detail. We present algorithms to compute instantons numerically by an efficient solution of the corresponding Euler-Lagrange equations. A second focus is the discussion of a recently developed numerical filtering technique that allows instantons to be extracted from direct numerical simulations. We then present modifications of the algorithms to make them efficient when applied to two- or three-dimensional (2D or 3D) fluid dynamical problems. We illustrate these ideas using the 2D Burgers equation and the 3D Navier-Stokes equations.
Reliability-Based Stability Analysis of Rock Slopes Using Numerical Analysis and Response Surface Method

NASA Astrophysics Data System (ADS)

Dadashzadeh, N.; Duzgun, H. S. B.; Yesiloglu-Gultekin, N.

2017-08-01

While advanced numerical techniques in slope stability analysis are successfully used in deterministic studies, they have so far found limited use in probabilistic analyses due to their high computation cost. The first-order reliability method (FORM) is one of the most efficient probabilistic techniques to perform probabilistic stability analysis by considering the associated uncertainties in the analysis parameters. However, it is not possible to directly use FORM in numerical slope stability evaluations as it requires definition of a limit state performance function. In this study, an integrated methodology for probabilistic numerical modeling of rock slope stability is proposed. The methodology is based on the response surface method, where FORM is used to develop an explicit performance function from the results of numerical simulations. The proposed methodology is implemented by considering a large potential rock wedge in Sumela Monastery, Turkey. The accuracy of the developed performance function in representing the limit state surface is evaluated by monitoring the slope behavior. The calculated probability of failure is compared with that obtained from the Monte Carlo simulation (MCS) method. The proposed methodology is found to be 72% more efficient than MCS, while the accuracy is decreased with an error of 24%.
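The trade-off between a first-order reliability estimate and Monte Carlo sampling can be illustrated on a deliberately simple case. The sketch below assumes a linear limit state g = R - S with independent normal resistance R and load S, for which the first-order result is exact. It is not the rock-wedge performance function of the study, and all names and numbers are hypothetical.

```python
import random
from statistics import NormalDist


def failure_probability_first_order(mu_r, sd_r, mu_s, sd_s):
    """Reliability index beta and P(g < 0) for g = R - S, R and S independent normals."""
    beta = (mu_r - mu_s) / (sd_r ** 2 + sd_s ** 2) ** 0.5
    return NormalDist().cdf(-beta), beta


def failure_probability_mcs(mu_r, sd_r, mu_s, sd_s, n_samples, seed=0):
    """Crude Monte Carlo estimate of P(g < 0); cost grows with n_samples."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_samples):
        r = rng.gauss(mu_r, sd_r)
        s = rng.gauss(mu_s, sd_s)
        if r - s < 0.0:
            failures += 1
    return failures / n_samples


if __name__ == "__main__":
    pf, beta = failure_probability_first_order(10.0, 1.5, 7.0, 2.0)
    print("beta:", beta, "first-order Pf:", pf)
    print("MCS Pf:", failure_probability_mcs(10.0, 1.5, 7.0, 2.0, 20000))
```

The first-order estimate needs a single closed-form evaluation here, whereas the Monte Carlo estimate needs thousands of limit-state evaluations. When each evaluation is a full numerical slope simulation, that cost difference is what motivates a response-surface approach.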
The design and implementation of a parallel unstructured Euler solver using software primitives

NASA Technical Reports Server (NTRS)

Das, R.; Mavriplis, D. J.; Saltz, J.; Gupta, S.; Ponnusamy, R.

1992-01-01

This paper is concerned with the implementation of a three-dimensional unstructured grid Euler-solver on massively parallel distributed-memory computer architectures. The goal is to minimize solution time by achieving high computational rates with a numerically efficient algorithm. An unstructured multigrid algorithm with an edge-based data structure has been adopted, and a number of optimizations have been devised and implemented in order to accelerate the parallel communication rates. The implementation is carried out by creating a set of software tools, which provide an interface between the parallelization issues and the sequential code, while providing a basis for future automatic run-time compilation support. Large practical unstructured grid problems are solved on the Intel iPSC/860 hypercube and Intel Touchstone Delta machine. The quantitative effects of the various optimizations are demonstrated, and we show that the combined effect of these optimizations leads to roughly a factor of three performance improvement. The overall solution efficiency is compared with that obtained on the CRAY-YMP vector supercomputer.

Velocity-gauge real-time TDDFT within a numerical atomic orbital basis set

DOE PAGES

Pemmaraju, C. D.; Vila, F. D.; Kas, J. J.; ...

2018-02-07

The interaction of laser fields with solid-state systems can be modeled efficiently within the velocity-gauge formalism of real-time time dependent density functional theory (RT-TDDFT). In this article, we discuss the implementation of the velocity-gauge RT-TDDFT equations for electron dynamics within a linear combination of atomic orbitals (LCAO) basis set framework. Numerical results obtained from our LCAO implementation, for the electronic response of periodic systems to both weak and intense laser fields, are compared to those obtained from established real-space grid and Full-Potential Linearized Augmented Planewave approaches. As a result, potential applications of the LCAO based scheme in the context of extreme ultra-violet and soft X-ray spectroscopies involving core-electronic excitations are discussed.
Variational finite-difference methods in linear and nonlinear problems of the deformation of metallic and composite shells (review)

NASA Astrophysics Data System (ADS)

Maksimyuk, V. A.; Storozhuk, E. A.; Chernyshenko, I. S.

2012-11-01

Variational finite-difference methods of solving linear and nonlinear problems for thin and nonthin shells (plates) made of homogeneous isotropic (metallic) and orthotropic (composite) materials are analyzed and their classification principles and structure are discussed. Scalar and vector variational finite-difference methods that implement the Kirchhoff-Love hypotheses analytically or algorithmically using Lagrange multipliers are outlined. The Timoshenko hypotheses are implemented in a traditional way, i.e., analytically. The stress-strain state of metallic and composite shells of complex geometry is analyzed numerically. The numerical results are presented in the form of graphs and tables and used to assess the efficiency of using the variational finite-difference methods to solve linear and nonlinear problems of the statics of shells (plates).
MALBEC: a new CUDA-C ray-tracer in general relativity

NASA Astrophysics Data System (ADS)

Quiroga, G. D.

2018-06-01

A new CUDA-C code for tracing orbits around non-charged black holes is presented. This code, named MALBEC, takes advantage of graphics processing units and the CUDA platform for tracking null and timelike test particles in Schwarzschild and Kerr spacetimes. Also, a new general set of equations that describe the closed circular orbits of any timelike test particle in the equatorial plane is derived. These equations are extremely important for comparing the analytical behavior of the orbits with the numerical results and for verifying the correct implementation of the Runge-Kutta algorithm in MALBEC. Finally, other numerical tests are performed, demonstrating that MALBEC is able to reproduce some well-known results in these metrics in a faster and more efficient way than a conventional CPU implementation.
QuantumOptics.jl: A Julia framework for simulating open quantum systems

NASA Astrophysics Data System (ADS)

Krämer, Sebastian; Plankensteiner, David; Ostermann, Laurin; Ritsch, Helmut

2018-06-01

We present an open source computational framework geared towards the efficient numerical investigation of open quantum systems written in the Julia programming language. Built exclusively in Julia and based on standard quantum optics notation, the toolbox offers speed comparable to low-level statically typed languages, without compromising on the accessibility and code readability found in dynamic languages. After introducing the framework, we highlight its features and showcase implementations of generic quantum models. Finally, we compare its usability and performance to two well-established and widely used numerical quantum libraries.
A Polynomial Time, Numerically Stable Integer Relation Algorithm

NASA Technical Reports Server (NTRS)

Ferguson, Helaman R. P.; Bailey, David H.; Kutler, Paul (Technical Monitor)

1998-01-01

Let $x = (x_1, x_2, \ldots, x_n)$ be a vector of real numbers. $x$ is said to possess an integer relation if there exist integers $a_i$, not all zero, such that $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0$. Beginning in 1977, several algorithms (with proofs) have been discovered to recover the $a_i$ given $x$. The most efficient of these existing integer relation algorithms (in terms of run time and the precision required of the input) has the drawback of being very unstable numerically. It often requires a numeric precision level in the thousands of digits to reliably recover relations in modest-sized test problems. We present here a new algorithm for finding integer relations, which we have named the "PSLQ" algorithm. It is proved in this paper that the PSLQ algorithm terminates with a relation in a number of iterations that is bounded by a polynomial in $n$. Because this algorithm employs a numerically stable matrix reduction procedure, it is free from the numerical difficulties that plague other integer relation algorithms. Furthermore, its stability admits an efficient implementation with lower run times on average than other algorithms currently in use. Finally, this stability can be used to prove that relation bounds obtained from computer runs using this algorithm are numerically accurate.
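To make the definition of an integer relation concrete, the sketch below searches for one by brute force over small coefficients. This is explicitly not the PSLQ algorithm: its cost grows exponentially with the vector length, which is the kind of cost a polynomial-time method avoids. The function name, the coefficient bound, and the tolerance are assumptions made for illustration.

```python
import itertools
import math


def find_integer_relation(x, max_coeff, tol=1e-9):
    """Brute-force search for integers a_i, not all zero, with |sum(a_i * x_i)| < tol.

    Coefficients are restricted to [-max_coeff, max_coeff]. Returns the first
    hit in lexicographic order, or None if no relation is found in that box.
    """
    coeff_range = range(-max_coeff, max_coeff + 1)
    for a in itertools.product(coeff_range, repeat=len(x)):
        if any(a) and abs(sum(ai * xi for ai, xi in zip(a, x))) < tol:
            return a
    return None


# The golden ratio satisfies phi^2 = phi + 1, so (phi^2, phi, 1) has a relation.
phi = (1 + math.sqrt(5)) / 2
relation = find_integer_relation([phi * phi, phi, 1.0], max_coeff=1)
print(relation)  # (-1, 1, 1), i.e. -phi^2 + phi + 1 = 0

# No relation with small coefficients exists between 1 and pi.
print(find_integer_relation([1.0, math.pi], max_coeff=5))  # None
```

Note that the floating-point tolerance stands in for the precision question raised in the abstract: a relation is only "found" to within the working precision, which is why numerical stability matters for such algorithms.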
Wideband piezoelectric energy harvester for low-frequency application with plucking mechanism

NASA Astrophysics Data System (ADS)

Hiraki, Yasuhiro; Masuda, Arata; Ikeda, Naoto; Katsumura, Hidenori; Kagata, Hiroshi; Okumura, Hidenori

2015-04-01

Wireless sensor networks need energy harvesting from vibrational environments for their power supply. The conventional resonance type vibration energy harvesters, however, are not always effective for low frequency application. The purpose of this paper is to propose a high efficiency energy harvester for low frequency application by utilizing plucking and SSHI techniques, and to investigate the effects of applying those techniques in terms of the energy harvesting efficiency. First, we derived an approximate formulation of the energy harvesting efficiency of the plucking device by theoretical analysis. Next, it was confirmed that the improved efficiency agreed with numerical and experimental results. Also, a parallel SSHI, a switching circuit technique to improve the performance of the harvester, was introduced and examined by numerical simulations and experiments. Contrary to the simulated results, in which the efficiency was improved from 13.1% to 22.6% by introducing the SSHI circuit, the efficiency obtained in the experiment was only 7.43%. This would be due to the internal resistance of the inductors and photo MOS relays on the switching circuit, and a simulation including this factor revealed its large negative influence. This result suggested that reducing the switching resistance is important to the implementation of SSHI.
Multiscale Modeling and Uncertainty Quantification for Nuclear Fuel Performance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Estep, Donald; El-Azab, Anter; Pernice, Michael

2017-03-23

In this project, we will address the challenges associated with constructing high fidelity multiscale models of nuclear fuel performance. We (*) propose a novel approach for coupling mesoscale and macroscale models, (*) devise efficient numerical methods for simulating the coupled system, and (*) devise and analyze effective numerical approaches for error and uncertainty quantification for the coupled multiscale system. As an integral part of the project, we will carry out analysis of the effects of upscaling and downscaling, investigate efficient methods for stochastic sensitivity analysis of the individual macroscale and mesoscale models, and carry out a posteriori error analysis for computed results. We will pursue development and implementation of solutions in software used at Idaho National Laboratories on models of interest to the Nuclear Energy Advanced Modeling and Simulation (NEAMS) program.
Programmable logic construction kits for hyper-real-time neuronal modeling.

PubMed

Guerrero-Rivera, Ruben; Morrison, Abigail; Diesmann, Markus; Pearce, Tim C

2006-11-01

Programmable logic designs are presented that achieve exact integration of leaky integrate-and-fire soma and dynamical synapse neuronal models and incorporate spike-time dependent plasticity and axonal delays. Highly accurate numerical performance has been achieved by modifying simpler forward-Euler-based circuitry requiring minimal circuit allocation, which, as we show, behaves equivalently to exact integration. These designs have been implemented and simulated at the behavioral and physical device levels, demonstrating close agreement with both numerical and analytical results. By exploiting finely grained parallelism and single clock cycle numerical iteration, these designs achieve simulation speeds at least five orders of magnitude faster than the nervous system, termed here hyper-real-time operation, when deployed on commercially available field-programmable gate array (FPGA) devices. Taken together, our designs form a programmable logic construction kit of commonly used neuronal model elements that supports the building of large and complex architectures of spiking neuron networks for real-time neuromorphic implementation, neurophysiological interfacing, or efficient parameter space investigations.
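The difference between forward-Euler and exact integration of a leaky integrator can be sketched in a few lines. The example assumes subthreshold membrane dynamics tau * dV/dt = -(V - V_inf) with constant input, which is a common textbook form and not necessarily the exact model or circuit of the paper; names and values are hypothetical.

```python
import math


def step_exact(v, v_inf, tau, h):
    """Exact update over one step h: valid because the ODE is linear."""
    return v_inf + (v - v_inf) * math.exp(-h / tau)


def step_forward_euler(v, v_inf, tau, h):
    """Forward-Euler update: same form, with exp(-h/tau) replaced by (1 - h/tau)."""
    return v + (h / tau) * (v_inf - v)


def simulate(step, v0, v_inf, tau, h, n_steps):
    """Apply a one-step update rule n_steps times and return the final voltage."""
    v = v0
    for _ in range(n_steps):
        v = step(v, v_inf, tau, h)
    return v


if __name__ == "__main__":
    v0, v_inf, tau, h, n = 0.0, 20.0, 10.0, 1.0, 10  # assumed mV and ms
    analytic = v_inf + (v0 - v_inf) * math.exp(-n * h / tau)
    print("analytic:", analytic)
    print("exact   :", simulate(step_exact, v0, v_inf, tau, h, n))
    print("euler   :", simulate(step_forward_euler, v0, v_inf, tau, h, n))
```

Both updates are a single multiply-add per step once the decay factor is precomputed, so the exact scheme costs no more per iteration than forward Euler. This is one plausible reading of why a small modification of Euler circuitry can reach exact-integration accuracy.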
Interface modeling in incompressible media using level sets in Escript

NASA Astrophysics Data System (ADS)

Gross, L.; Bourgouin, L.; Hale, A. J.; Mühlhaus, H.-B.

2007-08-01

We use a finite element (FEM) formulation of the level set method to model geological fluid flow problems involving interface propagation. Interface problems are ubiquitous in geophysics. Here we focus on a Rayleigh-Taylor instability, namely mantle plume evolution, and the growth of lava domes. Both problems require the accurate description of the propagation of an interface between heavy and light materials (plume) or between highly viscous lava and low-viscosity air (lava dome), respectively. The implementation of the models is based on Escript, which is a Python module for the solution of partial differential equations (PDEs) using spatial discretization techniques such as FEM. It is designed to describe numerical models in the language of PDEs while using computational components implemented in C and C++ to achieve high performance for time-intensive numerical calculations. A critical step in the solution of geological flow problems is the solution of the velocity-pressure problem. We describe how the Escript module can be used for a high-level implementation of an efficient variant of the well-known Uzawa scheme. We begin with a brief outline of the Escript modules and then present illustrations of its usage for the numerical solutions of the problems mentioned above.
Solving the linear inviscid shallow water equations in one dimension, with variable depth, using a recursion formula

NASA Astrophysics Data System (ADS)

Hernandez-Walls, R.; Martín-Atienza, B.; Salinas-Matus, M.; Castillo, J.

2017-11-01

When solving the linear inviscid shallow water equations with variable depth in one dimension using finite differences, a tridiagonal system of equations must be solved. Here we present an approach, which is more efficient than the commonly used numerical method, to solve this tridiagonal system of equations using a recursion formula. We illustrate this approach with an example in which we solve for a rectangular channel to find the resonance modes. Our numerical solution agrees very well with the analytical solution. This new method is easy for undergraduate students to use and understand, so it can be implemented in undergraduate courses such as Numerical Methods, Linear Algebra or Differential Equations.
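For reference, a tridiagonal system is conventionally solved with the Thomas algorithm (forward elimination followed by back substitution). The sketch below shows that baseline, assuming it is the kind of "commonly used numerical method" the abstract refers to; the paper's own recursion formula is not given in the abstract and is not reproduced here. The argument layout is a convention chosen for this example.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system of size n.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    lower[0] and upper[n-1] are ignored. No pivoting is performed, so the
    matrix is assumed to be diagonally dominant or otherwise well behaved.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # Second-difference matrix with a right-hand side whose solution is all ones.
    print(solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0],
                            [-1.0, -1.0, 0.0], [1.0, 0.0, 1.0]))
```

The algorithm runs in O(n) operations and O(n) storage, so any competing recursion has to improve on constant factors or simplicity rather than on asymptotic cost.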
Implementing Free Primary Education Policy in Malawi and Ghana: Equity and Efficiency Analysis

ERIC Educational Resources Information Center

Inoue, Kazuma; Oketch, Moses

2008-01-01

Malawi and Ghana are among the numerous Sub-Saharan Africa countries that have in recent years introduced Free Primary Education (FPE) policy as a means to realizing the 2015 Education for All and Millennium Development Goals international targets. The introduction of FPE policy is, however, a huge challenge for any national government that has…
High-Order Methods for Incompressible Fluid Flow

NASA Astrophysics Data System (ADS)

Deville, M. O.; Fischer, P. F.; Mund, E. H.

2002-08-01

High-order numerical methods provide an efficient approach to simulating many physical problems. This book considers the range of mathematical, engineering, and computer science topics that form the foundation of high-order numerical methods for the simulation of incompressible fluid flows in complex domains. Introductory chapters present high-order spatial and temporal discretizations for one-dimensional problems. These are extended to multiple space dimensions with a detailed discussion of tensor-product forms, multi-domain methods, and preconditioners for iterative solution techniques. Numerous discretizations of the steady and unsteady Stokes and Navier-Stokes equations are presented, with particular attention given to enforcement of incompressibility. Advanced discretizations, implementation issues, and parallel and vector performance are considered in the closing sections. Numerous examples are provided throughout to illustrate the capabilities of high-order methods in actual applications.
A developed nearly analytic discrete method for forward modeling in the frequency domain

NASA Astrophysics Data System (ADS)

Liu, Shaolin; Lang, Chao; Yang, Hui; Wang, Wenshuai

2018-02-01

High-efficiency forward modeling methods play a fundamental role in full waveform inversion (FWI). In this paper, the developed nearly analytic discrete (DNAD) method is proposed to accelerate frequency-domain forward modeling processes. We first derive the discretization of frequency-domain wave equations via numerical schemes based on the nearly analytic discrete (NAD) method to obtain a linear system. The coefficients of numerical stencils are optimized to make the linear system easier to solve and to minimize computing time. Wavefield simulation and numerical dispersion analysis are performed to compare the numerical behavior of DNAD method with that of the conventional NAD method. The results demonstrate the superiority of our proposed method. Finally, the DNAD method is implemented in frequency-domain FWI, and high-resolution inverse results are obtained.
Dual initiation strip charge apparatus and methods for making and implementing the same

DOEpatents

Jakaboski, Juan-Carlos [Albuquerque, NM]; Todd, Steven N. [Rio Rancho, NM]; Polisar, Stephen [Albuquerque, NM]; Hughs, Chance [Tijeras, NM]

2011-03-22

A Dual Initiation Strip Charge (DISC) apparatus is initiated by a single initiation source and detonates a strip of explosive charge at two separate contacts. The reflections of the explosively induced stresses meet, create a fracture, and breach a target along a generally single fracture contour, producing generally fragment-free scattering and no spallation. Methods for making and implementing a DISC apparatus provide numerous advantages over previous methods of creating explosive charges by utilizing steps for rapid prototyping; by implementing efficient steps and designs for metering consistent, repeatable, and controlled amounts of high explosive; and by utilizing readily available materials.
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Chiou, Jin-Chern

1990-01-01

Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAEs) viewpoint. Constraint violations during the time integration process are minimized, and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion are treated with a two-stage staggered explicit-implicit numerical algorithm that takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, parallel implementation of the present constraint treatment techniques and the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAEs and the constraint treatment techniques were transformed into arrowhead matrices, for which a Schur complement form was derived. By fully exploiting sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the system equations written in Schur complement form. A software testbed was designed and implemented on both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speedup of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
Discrete square root filtering - A survey of current techniques.

NASA Technical Reports Server (NTRS)

Kaminskii, P. G.; Bryson, A. E., Jr.; Schmidt, S. F.

1971-01-01

Current techniques in square root filtering are surveyed and related by applying a duality association. Four efficient square root implementations are suggested, and compared with three common conventional implementations in terms of computational complexity and precision. It is shown that the square root computational burden should not exceed the conventional by more than 50% in most practical problems. An examination of numerical conditioning predicts that the square root approach can yield twice the effective precision of the conventional filter in ill-conditioned problems. This prediction is verified in two examples.
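The "twice the effective precision" prediction can be motivated by a standard conditioning identity. The sketch below assumes a symmetric positive definite covariance $P$ with a square-root factor $S$ and uses the 2-norm condition number $\kappa_2$; this notation is introduced here and is not taken from the survey.

```latex
P = S S^{T}, \qquad \lambda_i(P) = \sigma_i(S)^{2}
\;\Longrightarrow\;
\kappa_2(P) = \frac{\lambda_{\max}(P)}{\lambda_{\min}(P)}
            = \left( \frac{\sigma_{\max}(S)}{\sigma_{\min}(S)} \right)^{2}
            = \kappa_2(S)^{2},
\qquad
\log_{10} \kappa_2(S) = \tfrac{1}{2} \log_{10} \kappa_2(P).
```

Since roughly $\log_{10} \kappa_2$ decimal digits are lost to conditioning, propagating $S$ instead of $P$ loses about half as many digits, which is consistent with the doubling of effective precision reported for ill-conditioned problems.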
A SCILAB Program for Computing General-Relativistic Models of Rotating Neutron Stars by Implementing Hartle's Perturbation Method

NASA Astrophysics Data System (ADS)

Papasotiriou, P. J.; Geroyannis, V. S.

We apply Hartle's perturbation method to the computation of relativistic rigidly rotating neutron star models. The program has been written in SCILAB (© INRIA ENPC), a matrix-oriented high-level programming language. The numerical method is described in great detail and is applied to many models in slow or fast rotation. We show that, although the method is perturbative, it gives accurate results for all practical purposes and it should prove an efficient tool for computing rapidly rotating pulsars.
Application of the exact exchange potential method for half metallic intermediate band alloy semiconductor.

PubMed

Fernández, J J; Tablero, C; Wahnón, P

2004-06-08

In this paper we present an analysis of the convergence of the band structure properties, particularly the influence on the modification of the bandgap and bandwidth values in half metallic compounds by the use of the exact exchange formalism. This formalism for general solids has been implemented using a localized basis set of numerical functions to represent the exchange density. The implementation has been carried out using a code which uses a linear combination of confined numerical pseudoatomic functions to represent the Kohn-Sham orbitals. The application of this exact exchange scheme to a half-metallic semiconductor compound, in particular to Ga(4)P(3)Ti, a promising material in the field of high efficiency solar cells, confirms the existence of the isolated intermediate band in this compound. (c) 2004 American Institute of Physics.
Improvement of Speckle Contrast Image Processing by an Efficient Algorithm.

PubMed

Steimers, A; Farnung, W; Kohl-Bareis, M

2016-01-01

We demonstrate an efficient algorithm for the temporal and spatial based calculation of speckle contrast for the imaging of blood flow by laser speckle contrast analysis (LASCA). It reduces the numerical complexity of necessary calculations, facilitates a multi-core and many-core implementation of the speckle analysis and enables an independence of temporal or spatial resolution and SNR. The new algorithm was evaluated for both spatial and temporal based analysis of speckle patterns with different image sizes and amounts of recruited pixels as sequential, multi-core and many-core code.
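Speckle contrast is conventionally defined as K = sigma / mean of the intensity. The abstract does not describe the authors' algorithm, so the sketch below only shows one common way to compute temporal contrast per pixel in a single pass, by accumulating sums of I and I^2. The function name and the nested-list image layout are assumptions made for illustration.

```python
import math


def temporal_speckle_contrast(frames):
    """Per-pixel temporal speckle contrast K = std / mean over a stack of frames.

    frames: list of equally sized 2D lists (rows of pixel intensities).
    Uses running sums of I and I^2, so each frame is visited only once.
    """
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    sum_i = [[0.0] * cols for _ in range(rows)]
    sum_i2 = [[0.0] * cols for _ in range(rows)]
    for frame in frames:
        for r in range(rows):
            for c in range(cols):
                v = frame[r][c]
                sum_i[r][c] += v
                sum_i2[r][c] += v * v
    contrast = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            mean = sum_i[r][c] / n
            # Population variance; clamp tiny negative values from round-off.
            var = max(sum_i2[r][c] / n - mean * mean, 0.0)
            contrast[r][c] = math.sqrt(var) / mean if mean > 0 else 0.0
    return contrast


if __name__ == "__main__":
    stack = [[[1.0, 2.0]], [[3.0, 2.0]]]  # two 1x2 frames
    print(temporal_speckle_contrast(stack))
```

Because every pixel is processed independently, the per-pixel loops map naturally onto multi-core or many-core hardware, which is the kind of parallelism the abstract mentions.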

Improved transfer efficiencies in radio-frequency-driven recoupling solid-state NMR by adiabatic sweep through the dipolar recoupling condition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Straasø, Lasse A.; Shankar, Ravi; Nielsen, Niels Chr.

The homonuclear radio-frequency driven recoupling (RFDR) experiment is commonly used in solid-state NMR spectroscopy to gain insight into the structure of biological samples due to its ease of implementation, stability towards fluctuations/missetting of radio-frequency (rf) field strength, and in general low rf requirements. A theoretical operator-based Floquet description is presented to appreciate the effect of having a temporal displacement of the π-pulses in the RFDR experiment. From this description, we demonstrate improved transfer efficiency for the RFDR experiment by generating an adiabatic passage through the zero-quantum recoupling condition. We have compared the performances of RFDR and the improved sequence to mediate efficient ¹³CO to ¹³Cα polarization transfer for uniformly ¹³C,¹⁵N-labeled glycine and for the fibril forming peptide SNNFGAILSS (one-letter amino acid codes) uniformly ¹³C,¹⁵N-labeled at the FGAIL residues. Using numerically optimized sweeps, we get experimental gains of approximately 20% for glycine where numerical simulations predict an improvement of 25% relative to the standard implementation. For the fibril forming peptide, using the same sweep parameters as found for glycine, we have gains in the order of 10%–20% depending on the spectral regions of interest.
Comparison Between 2D and 3D Simulations of Rate Dependent Friction Using DEM

NASA Astrophysics Data System (ADS)

Wang, C.; Elsworth, D.

2017-12-01

Rate-state dependent constitutive laws of frictional evolution have been successful in representing many of the first- and second-order components of earthquake rupture. Although this constitutive law has been successfully applied in numerical models, difficulty remains in efficient implementation of this constitutive law in computationally expensive granular mechanics simulations using discrete element methods (DEM). This study introduces a novel approach to implementing a rate-dependent constitutive relation of contact friction in DEM. This is essentially an implementation of a slip-weakening constitutive law onto local particle contacts without sacrificing computational efficiency. This implementation allows the analysis of slip stability of simulated fault gouge materials. Velocity-stepping experiments are reported on both uniform and textured distributions of quartz and talc as 3D analogs of gouge mixtures. Distinct local slip stability parameters (a-b) are assigned to the quartz and talc, respectively. We separately vary talc content from 0 to 100% in the uniform mixtures and talc layer thickness from 1 to 20 particles in the textured mixtures. Applied shear displacements are cycled through velocities of 1 μm/s and 10 μm/s. Frictional evolution data are collected and compared to 2D simulation results. We show that dimensionality significantly impacts the evolution of friction. 3D simulation results are more representative of laboratory observed behavior, and numerical noise is shown at a magnitude of 0.01 in terms of friction coefficient. Stability parameters (a-b) can be straightforwardly obtained from analyzing velocity steps, and are different from locally assigned (a-b) values. Sensitivity studies on normal stress, shear velocity, particle size, local (a-b) values, and characteristic slip distance (Dc) show that the implementation is sensitive to local (a-b) values and relations between (Dc) and particle size.
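Extracting (a-b) from a velocity step follows from the standard steady-state form of rate-and-state friction, mu_ss(V) = mu0 + (a-b) * ln(V / V0). The sketch below applies that relation to the 1 μm/s to 10 μm/s step mentioned in the abstract; the friction values are made up for illustration and the function names are hypothetical.

```python
import math


def steady_state_friction(mu0, a_minus_b, v, v0):
    """Steady-state friction coefficient at sliding velocity v (reference v0)."""
    return mu0 + a_minus_b * math.log(v / v0)


def a_minus_b_from_step(mu_ss_before, mu_ss_after, v_before, v_after):
    """Recover (a-b) from steady-state friction measured before and after a velocity step."""
    return (mu_ss_after - mu_ss_before) / math.log(v_after / v_before)


if __name__ == "__main__":
    v1, v2 = 1e-6, 1e-5  # 1 um/s and 10 um/s, in m/s
    mu1 = steady_state_friction(0.6, -0.004, v1, v1)  # assumed mu0 and (a-b)
    mu2 = steady_state_friction(0.6, -0.004, v2, v1)
    print("recovered (a-b):", a_minus_b_from_step(mu1, mu2, v1, v2))
```

A negative (a-b) indicates velocity weakening (potentially unstable slip) and a positive value indicates velocity strengthening. With numerical noise of about 0.01 in friction coefficient, steady-state levels would need to be averaged over many samples before a step of this size can be resolved.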
Resolved-particle simulation by the Physalis method: Enhancements and new capabilities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sierakowski, Adam J., E-mail: sierakowski@jhu.edu; Prosperetti, Andrea; Faculty of Science and Technology and J.M. Burgers Centre for Fluid Dynamics, University of Twente, P.O. Box 217, 7500 AE Enschede

2016-03-15

We present enhancements and new capabilities of the Physalis method for simulating disperse multiphase flows using particle-resolved simulation. The current work enhances the previous method by incorporating a new type of pressure-Poisson solver that couples with a new Physalis particle pressure boundary condition scheme and a new particle interior treatment to significantly improve overall numerical efficiency. Further, we implement a more efficient method of calculating the Physalis scalar products and incorporate short-range particle interaction models. We provide validation and benchmarking for the Physalis method against experiments of a sedimenting particle and of normal wall collisions. We conclude with an illustrative simulation of 2048 particles sedimenting in a duct. In the appendix, we present a complete and self-consistent description of the analytical development and numerical methods.
A Three-Dimensional Linearized Unsteady Euler Analysis for Turbomachinery Blade Rows

NASA Technical Reports Server (NTRS)

Montgomery, Matthew D.; Verdon, Joseph M.

1996-01-01

A three-dimensional, linearized Euler analysis is being developed to provide an efficient unsteady aerodynamic analysis that can be used to predict the aeroelastic and aeroacoustic response characteristics of axial-flow turbomachinery blading. The field equations and boundary conditions needed to describe nonlinear and linearized inviscid unsteady flows through a blade row operating within a cylindrical annular duct are presented. In addition, a numerical model for linearized inviscid unsteady flow, which is based upon an existing nonlinear, implicit, wave-split, finite volume analysis, is described. These aerodynamic and numerical models have been implemented in an unsteady flow code called LINFLUX. A preliminary version of the LINFLUX code is applied herein to selected benchmark three-dimensional, subsonic, unsteady flows, to illustrate its current capabilities and to uncover existing problems and deficiencies. The numerical results indicate that good progress has been made toward developing a reliable and useful three-dimensional prediction capability. However, some problems, associated with the implementation of an unsteady displacement field and numerical errors near solid boundaries, still exist. Also, accurate far-field conditions must be incorporated into the LINFLUX analysis, so that this analysis can be applied to unsteady flows driven by external aerodynamic excitations.
Performance Analysis and Optimization on the UCLA Parallel Atmospheric General Circulation Model Code

NASA Technical Reports Server (NTRS)

Lou, John; Ferraro, Robert; Farrara, John; Mechoso, Carlos

1996-01-01

An analysis is presented of several factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on massively parallel computer systems. Several modifications to the original parallel AGCM code, aimed at improving its numerical efficiency, interprocessor communication cost, load balance, and issues affecting single-node code performance, are discussed.
Citizen Empowerment through e-Democracy: Patterns of E-Government Adoption for Small-Sized Cities in Missouri

ERIC Educational Resources Information Center

Massey, Floyd E., III

2014-01-01

E-government is one of the buzzwords in discussing modernizing public administration. Numerous researchers have conducted studies related to the implementation of e-government and e-government 2.0 programs. The main goal of e-government programs is to increase government efficiency and offer benefits to citizens. As the requirements of government…
On some theoretical and practical aspects of multigrid methods. [to solve finite element systems from elliptic equations]

NASA Technical Reports Server (NTRS)

Nicolaides, R. A.

1979-01-01

A description and explanation of a simple multigrid algorithm for solving finite element systems is given. Numerical results for an implementation are reported for a number of elliptic equations, including cases with singular coefficients and indefinite equations. The method shows the high efficiency, essentially independent of the grid spacing, predicted by the theory.
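The kind of algorithm described here can be sketched as a recursive V-cycle. The example below is a minimal illustration under stated assumptions, not the report's method: it uses a 1D Poisson problem, -u'' = f with zero boundary values, discretized by finite differences (rather than finite elements), with weighted-Jacobi smoothing, full-weighting restriction, and linear interpolation. All function names are hypothetical.

```python
import math

def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted-Jacobi sweeps for -u'' = f; the two end values stay fixed."""
    for _ in range(sweeps):
        old = u[:]
        for i in range(1, len(u) - 1):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            u[i] = (1.0 - omega) * old[i] + omega * jacobi
    return u

def residual(u, f, h):
    """r = f - A u at interior points, with A the 3-point Laplacian."""
    r = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        r[i] = f[i] - (2.0 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r

def v_cycle(u, f, h):
    """One V(2,2) cycle; len(u) - 1 must be a power of two."""
    n = len(u) - 1
    if n == 2:
        u[1] = 0.5 * (u[0] + u[2] + h * h * f[1])  # single unknown: exact solve
        return u
    smooth(u, f, h, sweeps=2)
    r = residual(u, f, h)
    rc = [0.0] * (n // 2 + 1)
    for j in range(1, n // 2):  # full-weighting restriction
        rc[j] = 0.25 * r[2 * j - 1] + 0.5 * r[2 * j] + 0.25 * r[2 * j + 1]
    ec = v_cycle([0.0] * (n // 2 + 1), rc, 2.0 * h)  # coarse error equation
    for j in range(n // 2):  # linear interpolation of the correction
        u[2 * j] += ec[j]
        u[2 * j + 1] += 0.5 * (ec[j] + ec[j + 1])
    u[n] += ec[n // 2]
    smooth(u, f, h, sweeps=2)
    return u

def solve(n, cycles):
    """Run V-cycles on -u'' = pi^2 sin(pi x); return grid, solution, residual history."""
    h = 1.0 / n
    x = [i * h for i in range(n + 1)]
    f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    u = [0.0] * (n + 1)
    history = [max(abs(r) for r in residual(u, f, h))]
    for _ in range(cycles):
        v_cycle(u, f, h)
        history.append(max(abs(r) for r in residual(u, f, h)))
    return x, u, history

for n in (32, 128):
    _, _, hist = solve(n, 6)
    print(n, [f"{r:.1e}" for r in hist])
```

Comparing the printed residual histories for the two grid sizes gives a feel for how weakly the reduction per cycle depends on the grid spacing in this model problem.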
Low Order Modeling Tools for Preliminary Pressure Gain Combustion Benefits Analyses

NASA Technical Reports Server (NTRS)

Paxson, Daniel E.

2012-01-01

Pressure gain combustion (PGC) offers the promise of higher thermodynamic cycle efficiency and greater specific power in propulsion and power systems. This presentation describes a model, developed under a cooperative agreement between NASA and AFRL, for preliminarily assessing the performance enhancement and preliminary size requirements of PGC components either as stand-alone thrust producers or coupled with surrounding turbomachinery. The model is implemented in the Numerical Propulsion Simulation System (NPSS) environment allowing various configurations to be examined at numerous operating points. The validated model is simple, yet physics-based. It executes quickly in NPSS, yet produces realistic results.
Parareal in time 3D numerical solver for the LWR Benchmark neutron diffusion transient model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baudron, Anne-Marie, E-mail: anne-marie.baudron@cea.fr; CEA-DRN/DMT/SERMA, CEN-Saclay, 91191 Gif sur Yvette Cedex; Lautard, Jean-Jacques, E-mail: jean-jacques.lautard@cea.fr

2014-12-15

In this paper we present a time-parallel algorithm for the 3D neutron calculation of a transient model in a nuclear reactor core. The neutron calculation consists of numerically solving the time-dependent diffusion approximation equation, which is a simplified transport equation. The numerical resolution is done with a finite element method based on a tetrahedral meshing of the computational domain, representing the reactor core, and time discretization is achieved using a θ-scheme. The transient model includes control rods that move during the reaction. Therefore, cross-sections (piecewise constants) are taken into account by interpolations with respect to the velocity of the control rods. Parallelism across time is achieved by adapting the parareal in time algorithm to the problem at hand. This parallel method is a predictor-corrector scheme that iteratively combines two kinds of numerical propagators, one coarse and one fine. Our method is made efficient by means of a coarse solver defined with a large time step and a fixed-position control rod model, while the fine propagator is assumed to be a high-order numerical approximation of the full model. The parallel implementation of our method provides good scalability of the algorithm. Numerical results show the efficiency of the parareal method on a large light water reactor transient model corresponding to the Langenbuch–Maurer–Werner benchmark.
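The predictor-corrector structure of parareal can be sketched on a toy problem. The example below is an illustration under assumptions, not the paper's solver: it uses the scalar ODE y' = lam*y instead of the neutron diffusion model, a single backward Euler step (θ = 1) per time window as the coarse propagator, and many Crank-Nicolson steps (θ = 1/2) as the fine propagator. Function names and parameter values are hypothetical.

```python
def theta_steps(y, lam, dt, steps, theta):
    """Advance y' = lam*y by `steps` theta-scheme steps of size dt."""
    growth = (1.0 + (1.0 - theta) * lam * dt) / (1.0 - theta * lam * dt)
    return y * growth ** steps

def parareal(y0, lam, t_end, windows, iterations, fine_steps=100):
    """Return the parareal approximations at the window boundaries."""
    dt = t_end / windows

    def coarse(y):
        return theta_steps(y, lam, dt, 1, theta=1.0)

    def fine(y):
        return theta_steps(y, lam, dt / fine_steps, fine_steps, theta=0.5)

    # Predictor: one sequential sweep with the cheap coarse propagator.
    u = [y0]
    for n in range(windows):
        u.append(coarse(u[n]))

    for _ in range(iterations):
        # These propagations start from the previous iterate, so each window
        # is independent; this is the part that would run in parallel.
        f_old = [fine(u[n]) for n in range(windows)]
        g_old = [coarse(u[n]) for n in range(windows)]
        # Corrector: sequential, but it only calls the coarse propagator.
        new = [y0]
        for n in range(windows):
            new.append(coarse(new[n]) + f_old[n] - g_old[n])
        u = new
    return u

# Sequential fine solution used as the reference.
reference = theta_steps(1.0, -1.0, 2.0 / 1000, 1000, theta=0.5)
errors = [abs(parareal(1.0, -1.0, 2.0, 10, k)[-1] - reference) for k in range(4)]
print(errors)
```

After as many iterations as there are windows, the iterate coincides with the sequential fine solution up to rounding, so a speedup requires convergence in far fewer iterations.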
Monolithic multigrid method for the coupled Stokes flow and deformable porous medium system

NASA Astrophysics Data System (ADS)

Luo, P.; Rodrigo, C.; Gaspar, F. J.; Oosterlee, C. W.

2018-01-01

The interaction between fluid flow and a deformable porous medium is a complicated multi-physics problem, which can be described by a coupled model based on the Stokes and poroelastic equations. A monolithic multigrid method together with either a coupled Vanka smoother or a decoupled Uzawa smoother is employed as an efficient numerical technique for the linear discrete system obtained by finite volumes on staggered grids. A specialty in our modeling approach is that at the interface of the fluid and poroelastic medium, two unknowns from the different subsystems are defined at the same grid point. We propose a special discretization at and near the points on the interface, which combines the approximation of the governing equations and the considered interface conditions. In the decoupled Uzawa smoother, Local Fourier Analysis (LFA) helps us to select optimal values of the relaxation parameter appearing. To implement the monolithic multigrid method, grid partitioning is used to deal with the interface updates when communication is required between two subdomains. Numerical experiments show that the proposed numerical method has an excellent convergence rate. The efficiency and robustness of the method are confirmed in numerical experiments with typically small realistic values of the physical coefficients.
7Be and hydrological model for more efficient implementation of erosion control measure

NASA Astrophysics Data System (ADS)

Al-Barri, Bashar; Bode, Samuel; Blake, William; Ryken, Nick; Cornelis, Wim; Boeckx, Pascal

2014-05-01

Increased concern about the on-site and off-site impacts of soil erosion in agricultural and forested areas has spurred interest in innovative methods to assess, in an unbiased way, spatial and temporal soil erosion rates and redistribution patterns. Hence the interest in precisely estimating the magnitude of the problem and in applying erosion control measures (ECM) more efficiently. The latest generation of physically-based hydrological models, which fully couple overland flow and subsurface flow in three dimensions, permits implementing ECM at small and large scales more effectively if coupled with a sediment transport algorithm. While many studies have focused on integrating empirical or numerical models based on traditional erosion budget measurements into 3D hydrological models, few studies have evaluated the efficiency of ECM at watershed scale, and very little attention has been given to the potential of environmental Fallout Radio-Nuclides (FRNs) in such applications. The use of the FRN tracer 7Be in soil erosion/deposition research has proved to overcome many (if not all) of the problems associated with the conventional approaches, providing reliable data for efficient land use management. This poster will underline the pros and cons of using conventional methods and 7Be tracers to evaluate the efficiency of coconut dams installed as ECM in an experimental field in Belgium. It will also outline the potential of 7Be in providing valuable inputs for developing the numerical sediment transport algorithm needed for the hydrological model at field scale, leading to an assessment of the possibility of using this short-lived tracer as a validation tool for the upgraded hydrological model at watershed scale in further steps. Keywords: FRN, erosion control measures, hydrological models
Numerical implementation of complex orthogonalization, parallel transport on Stiefel bundles, and analyticity

NASA Astrophysics Data System (ADS)

Avitabile, Daniele; Bridges, Thomas J.

2010-06-01

Numerical integration of complex linear systems of ODEs depending analytically on an eigenvalue parameter is considered. Complex orthogonalization, which is required to stabilize the numerical integration, results in non-analytic systems. It is shown that properties of eigenvalues are still efficiently recoverable by extracting information from a non-analytic characteristic function. The orthonormal systems are constructed using the geometry of Stiefel bundles. Different forms of continuous orthogonalization in the literature are shown to correspond to different choices of connection one-form on the Stiefel bundle. For the numerical integration, Gauss-Legendre Runge-Kutta algorithms are the principal choice for preserving orthogonality, and performance results are shown for a range of GLRK methods. The theory and methods are tested by application to example boundary value problems including the Orr-Sommerfeld equation in hydrodynamic stability.
Low-rank approximation in the numerical modeling of the Farley-Buneman instability in ionospheric plasma

NASA Astrophysics Data System (ADS)

Dolgov, S. V.; Smirnov, A. P.; Tyrtyshnikov, E. E.

2014-04-01

We consider numerical modeling of the Farley-Buneman instability in the Earth's ionosphere plasma. The ion behavior is governed by the kinetic Vlasov equation with the BGK collisional term in the four-dimensional phase space, and since the finite difference discretization on a tensor product grid is used, this equation becomes the most computationally challenging part of the scheme. To relax the complexity and memory consumption, an adaptive model reduction using the low-rank separation of variables, namely the Tensor Train format, is employed. The approach was verified via a prototype MATLAB implementation. Numerical experiments demonstrate the possibility of efficient separation of space and velocity variables, resulting in the solution storage reduction by a factor of order tens.
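To give a feel for where the storage reduction comes from, the sketch below counts stored entries for a full four-dimensional grid function and for a Tensor Train representation, whose k-th core has shape r_{k-1} x n_k x r_k with boundary ranks equal to 1. The grid sizes and ranks are hypothetical and are not taken from the paper.

```python
def full_storage(dims):
    """Number of entries in a full tensor with the given mode sizes."""
    total = 1
    for n in dims:
        total *= n
    return total

def tt_storage(dims, ranks):
    """Number of entries in TT cores of shape ranks[k] x dims[k] x ranks[k+1]."""
    if len(ranks) != len(dims) + 1 or ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("need len(dims) + 1 ranks with boundary ranks equal to 1")
    return sum(ranks[k] * dims[k] * ranks[k + 1] for k in range(len(dims)))

dims = [64, 64, 64, 64]      # hypothetical 4D phase-space grid
ranks = [1, 20, 20, 20, 1]   # hypothetical TT ranks
ratio = full_storage(dims) / tt_storage(dims, ranks)
print(full_storage(dims), tt_storage(dims, ranks), round(ratio))
```

The achievable ratio depends entirely on how small the ranks can be kept for a given accuracy, which is problem dependent.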
Flood predictions using the parallel version of distributed numerical physical rainfall-runoff model TOPKAPI

NASA Astrophysics Data System (ADS)

Boyko, Oleksiy; Zheleznyak, Mark

2015-04-01

The original numerical code TOPKAPI-IMMS of the distributed rainfall-runoff model TOPKAPI (Todini et al., 1996-2014) is developed and implemented in Ukraine. A parallel version of the code has recently been developed for use on multiprocessor systems: multicore/multiprocessor PCs and clusters. The algorithm is based on a binary-tree decomposition of the watershed to balance the amount of computation across all processors/cores. The Message Passing Interface (MPI) is used as the parallel computing framework. The numerical efficiency of the parallelization algorithms is demonstrated in case studies of flood prediction for mountain watersheds of the Ukrainian Carpathian region. The modeling results are compared with predictions based on lumped-parameter models.
Comparison of the Computational Efficiency of the Original Versus Reformulated High-Fidelity Generalized Method of Cells

NASA Technical Reports Server (NTRS)

Arnold, Steven M.; Bednarcyk, Brett; Aboudi, Jacob

2004-01-01

The High-Fidelity Generalized Method of Cells (HFGMC) micromechanics model has recently been reformulated by Bansal and Pindera (in the context of elastic phases with perfect bonding) to maximize its computational efficiency. This reformulated version of HFGMC has now been extended to include both inelastic phases and imperfect fiber-matrix bonding. The present paper presents an overview of the HFGMC theory in both its original and reformulated forms and a comparison of the results of the two implementations. The objective is to establish the correlation between the two HFGMC formulations and document the improved efficiency offered by the reformulation. The results compare the macro and micro scale predictions of the continuous reinforcement (doubly-periodic) and discontinuous reinforcement (triply-periodic) versions of both formulations into the inelastic regime, and, in the case of the discontinuous reinforcement version, with both perfect and weak interfacial bonding. The results demonstrate that identical predictions are obtained using either the original or reformulated implementations of HFGMC aside from small numerical differences in the inelastic regime due to the different implementation schemes used for the inelastic terms present in the two formulations. Finally, a direct comparison of execution times is presented for the original formulation and reformulation code implementations. It is shown that as the discretization employed in representing the composite repeating unit cell becomes increasingly refined (requiring a larger number of sub-volumes), the reformulated implementation becomes significantly (approximately an order of magnitude at best) more computationally efficient in both the continuous reinforcement (doubly-periodic) and discontinuous reinforcement (triply-periodic) cases.
SENR/NRPy+: Numerical relativity in singular curvilinear coordinate systems

NASA Astrophysics Data System (ADS)

Ruchlin, Ian; Etienne, Zachariah B.; Baumgarte, Thomas W.

2018-03-01

We report on a new open-source, user-friendly numerical relativity code package called SENR/NRPy+. Our code extends previous implementations of the BSSN reference-metric formulation to a much broader class of curvilinear coordinate systems, making it ideally suited to modeling physical configurations with approximate or exact symmetries. In the context of modeling black hole dynamics, it is orders of magnitude more efficient than other widely used open-source numerical relativity codes. NRPy+ provides a Python-based interface in which equations are written in natural tensorial form and output at arbitrary finite difference order as highly efficient C code, putting complex tensorial equations at the scientist's fingertips without the need for an expensive software license. SENR provides the algorithmic framework that combines the C codes generated by NRPy+ into a functioning numerical relativity code. We validate against two other established, state-of-the-art codes, and achieve excellent agreement. For the first time, in the context of moving puncture black hole evolutions, we demonstrate nearly exponential convergence of constraint violation and gravitational waveform errors to zero as the order of spatial finite difference derivatives is increased, while fixing the numerical grids at moderate resolution in a singular coordinate system. Such behavior outside the horizons is remarkable, as numerical errors do not converge to zero near punctures, and all points along the polar axis are coordinate singularities. The formulation addresses such coordinate singularities via cell-centered grids and a simple change of basis that analytically regularizes tensor components with respect to the coordinates. Future plans include extending this formulation to allow dynamical coordinate grids and bispherical-like distribution of points to efficiently capture orbiting compact binary dynamics.
libvdwxc: a library for exchange-correlation functionals in the vdW-DF family

NASA Astrophysics Data System (ADS)

Hjorth Larsen, Ask; Kuisma, Mikael; Löfgren, Joakim; Pouillon, Yann; Erhart, Paul; Hyldgaard, Per

2017-09-01

We present libvdwxc, a general library for evaluating the energy and potential for the family of vdW-DF exchange-correlation functionals. libvdwxc is written in C and provides an efficient implementation of the vdW-DF method and can be interfaced with various general-purpose DFT codes. Currently, the Gpaw and Octopus codes implement interfaces to libvdwxc. The present implementation emphasizes scalability and parallel performance, and thereby enables ab initio calculations of nanometer-scale complexes. The numerical accuracy is benchmarked on the S22 test set whereas parallel performance is benchmarked on ligand-protected gold nanoparticles (Au144(SC11NH25)60) up to 9696 atoms.
Influence of the Numerical Scheme on the Solution Quality of the SWE for Tsunami Numerical Codes: The Tohoku-Oki, 2011 Example

NASA Astrophysics Data System (ADS)

Reis, C.; Clain, S.; Figueiredo, J.; Baptista, M. A.; Miranda, J. M. A.

2015-12-01

Numerical tools have become very important for scenario evaluations of hazardous phenomena such as tsunamis. Nevertheless, the predictions depend strongly on the quality of the numerical tool, and the design of efficient numerical schemes still receives considerable attention in order to provide robust and accurate solutions. In this study we compare the efficiency of two finite volume numerical codes with second-order discretization, implemented with different methods to solve the non-conservative shallow water equations: the MUSCL (Monotonic Upstream-Centered Scheme for Conservation Laws) method and the MOOD (Multi-dimensional Optimal Order Detection) method, which optimizes the accuracy of the approximation as a function of the local smoothness of the solution. MUSCL is based on a priori criteria, where the limiting procedure is performed before updating the solution to the next time step, leading to unnecessary accuracy reduction. In contrast, the new MOOD technique uses a posteriori detectors to prevent the solution from oscillating in the vicinity of discontinuities. Indeed, a candidate solution is computed and corrections are performed only for the cells where non-physical oscillations are detected. Using a simple one-dimensional analytical benchmark, 'Single wave on a sloping beach', we show that the classical 1D shallow-water system can be accurately solved with the finite volume method equipped with the MOOD technique, which provides a better approximation with sharper shocks and less numerical diffusion. For the code validation, we also use the Tohoku-Oki 2011 tsunami and reproduce two DART records, demonstrating that the quality of the solution may deeply interfere with the scenario one can assess. This work is funded by the Portugal-France research agreement, through the research project GEONUM FCT-ANR/MAT-NAN/0122/2012.
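The a priori limiting that characterizes MUSCL-type schemes can be sketched as follows. This is a minimal illustration assuming a scalar 1D field of cell averages and the minmod limiter; the abstract does not say which limiter the codes use, and the function names are hypothetical. The MOOD a posteriori detection loop is not shown.

```python
def minmod(a, b):
    """Return the smaller-magnitude slope, or zero if the signs differ."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def muscl_interface_states(u):
    """Limited left/right states at each face between cells i and i+1.

    Slopes are limited before any update is computed (a priori limiting);
    the first and last cells keep a zero slope.
    """
    slopes = [0.0] * len(u)
    for i in range(1, len(u) - 1):
        slopes[i] = minmod(u[i] - u[i - 1], u[i + 1] - u[i])
    left = [u[i] + 0.5 * slopes[i] for i in range(len(u) - 1)]
    right = [u[i + 1] - 0.5 * slopes[i + 1] for i in range(len(u) - 1)]
    return left, right

# A discontinuity: the limiter falls back to first order, so no overshoot.
step_left, step_right = muscl_interface_states([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
# Smooth linear data: interior faces are reconstructed without loss.
lin_left, lin_right = muscl_interface_states([0.0, 1.0, 2.0, 3.0, 4.0])
print(step_left, step_right)
print(lin_left, lin_right)
```

Because the limiter acts on every cell whose neighbouring differences change sign, it also flattens smooth extrema, which is one way the accuracy reduction mentioned in the abstract can arise.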
Retrofitting a 1960s Split-Level, Cold-Climate Home

DOE Office of Scientific and Technical Information (OSTI.GOV)

Puttagunta, Srikanth

2015-07-13

National programs such as Home Performance with ENERGY STAR® and numerous other utility air-sealing programs have made homeowners aware of the benefits of energy-efficiency retrofits. Yet these programs tend to focus only on the low-hanging fruit: they recommend air sealing the thermal envelope and ductwork where accessible, switching to efficient lighting and low-flow fixtures, and improving the efficiency of mechanical systems (though insufficient funds or lack of knowledge to implement these improvements commonly prevent the implementation of these higher cost upgrades). At the other end of the spectrum, various utilities across the country are encouraging deep energy retrofit programs. Although deep energy retrofits typically seek 50% energy savings, they are often quite costly and are most applicable to gut-rehab projects. A significant potential for lowering energy use in existing homes lies between the low-hanging fruit and deep energy retrofit approaches: retrofits that save approximately 30% in energy compared to the pre-retrofit conditions. The energy-efficiency measures need to be nonintrusive so the retrofit projects can be accomplished in occupied homes.
Theory and implementation of H-matrix based iterative and direct solvers for Helmholtz and elastodynamic oscillatory kernels

NASA Astrophysics Data System (ADS)

Chaillat, Stéphanie; Desiderio, Luca; Ciarlet, Patrick

2017-12-01

In this work, we study the accuracy and efficiency of hierarchical matrix (H-matrix) based fast methods for solving dense linear systems arising from the discretization of the 3D elastodynamic Green's tensors. It is well known in the literature that standard H-matrix based methods, although very efficient tools for asymptotically smooth kernels, are not optimal for oscillatory kernels. H2-matrix and directional approaches have been proposed to overcome this problem. However the implementation of such methods is much more involved than the standard H-matrix representation. The central questions we address are twofold. (i) What is the frequency-range in which the H-matrix format is an efficient representation for 3D elastodynamic problems? (ii) What can be expected of such an approach to model problems in mechanical engineering? We show that even though the method is not optimal (in the sense that more involved representations can lead to faster algorithms) an efficient solver can be easily developed. The capabilities of the method are illustrated on numerical examples using the Boundary Element Method.

An Efficient Scheme for Updating Sparse Cholesky Factors

NASA Technical Reports Server (NTRS)

Raghavan, Padma

2002-01-01

Raghavan had earlier developed the software package DSCPACK, which can be used for solving sparse linear systems where the coefficient matrix is symmetric and positive definite (this project was not funded by NASA but by agencies such as NSF). DSCPACK-S is the serial code and DSCPACK-P is a parallel implementation suitable for multiprocessors or networks of workstations with message passing using MPI. The main algorithm used is the Cholesky factorization of a sparse symmetric positive definite matrix, A = LL^T. The code can also compute the factorization A = LDL^T. The complexity of the software arises from several factors relating to the sparsity of the matrix A. A sparse N x N matrix A typically has fewer than cN nonzeroes, where c is a small constant. If the matrix were dense, it would have O(N^2) nonzeroes. The most complicated part of such sparse Cholesky factorization relates to fill-in, i.e., zeroes in the original matrix that become nonzeroes in the factor L. An efficient implementation depends to a large extent on complex data structures and on techniques from graph theory to reduce, identify, and manage fill. DSCPACK is based on an efficient multifrontal implementation with fill-managing algorithms and implementation arising from earlier research by Raghavan and others. Sparse Cholesky factorization is typically a four-step process: (1) ordering to compute a fill-reducing numbering, (2) symbolic factorization to determine the nonzero structure of L, (3) numeric factorization to compute L, and (4) triangular solution to solve Ly = b and L^T x = y. The first two steps are symbolic and are performed using the graph of the matrix. The numeric factorization step is of dominant cost and there are several schemes for improving performance by exploiting the nested and dense structure of groups of columns in the factor. The latter are aimed at better utilization of the cache-memory hierarchy on modern processors to prevent cache misses and provide execution rates (operations/second) that are close to the peak rates for dense matrix computations. Currently, DSCPACK is being used in an application at NASA directed by J. Newman and M. James. We propose the implementation of efficient schemes for updating the LL^T or LDL^T factors computed in DSCPACK-S to meet the computational requirements of their project. A brief description is provided in the next section.
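Why the ordering step matters for fill-in can be seen on a small example. The sketch below is an illustration only and does not use DSCPACK or its API: it runs a plain dense Cholesky factorization in pure Python on a hypothetical "arrow" matrix (a diagonal plus one dense row and column) and counts the nonzeroes in L for two numberings of the same unknowns.

```python
import math

def cholesky(a):
    """Dense Cholesky factor L (lower triangular) with A = L L^T."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l

def arrow_matrix(n, hub):
    """SPD test matrix: n on the diagonal, ones in row and column `hub`."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = float(n)  # diagonally dominant, hence positive definite
    for i in range(n):
        if i != hub:
            a[i][hub] = a[hub][i] = 1.0
    return a

def count_nonzeros(l, tol=1e-14):
    return sum(1 for row in l for v in row if abs(v) > tol)

n = 6
dense_first = count_nonzeros(cholesky(arrow_matrix(n, hub=0)))
dense_last = count_nonzeros(cholesky(arrow_matrix(n, hub=n - 1)))
print(dense_first, dense_last)
```

Numbering the densely connected unknown first fills the whole lower triangle of L, while numbering it last produces no fill at all; fill-reducing orderings look for numberings of the second kind on general sparsity graphs.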
LS-DYNA Simulation of Hemispherical-punch Stamping Process Using an Efficient Algorithm for Continuum Damage Based Elastoplastic Constitutive Equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Salajegheh, Nima; Abedrabbo, Nader; Pourboghrat, Farhang

An efficient integration algorithm for continuum damage based elastoplastic constitutive equations is implemented in LS-DYNA. The isotropic damage parameter is defined as the ratio of the damaged surface area over the total cross section area of the representative volume element. This parameter is incorporated into the integration algorithm as an internal variable. The developed damage model is then implemented in the FEM code LS-DYNA as user material subroutine (UMAT). Pure stretch experiments of a hemispherical punch are carried out for copper sheets and the results are compared against the predictions of the implemented damage model. Evaluation of damage parameters is carried out and the optimized values that correctly predicted the failure in the sheet are reported. Prediction of failure in the numerical analysis is performed through element deletion using the critical damage value. The set of failure parameters which accurately predict the failure behavior in copper sheets compared to experimental data is reported as well.
A fast new algorithm for a robot neurocontroller using inverse QR decomposition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morris, A.S.; Khemaissia, S.

2000-01-01

A new adaptive neural network controller for robots is presented. The controller is based on direct adaptive techniques. Unlike many neural network controllers in the literature, inverse dynamical model evaluation is not required. A numerically robust, computationally efficient processing scheme for neural network weight estimation is described, namely, the inverse QR decomposition (INVQR). The inverse QR decomposition and a weighted recursive least-squares (WRLS) method for neural network weight estimation are derived using Cholesky factorization of the data matrix. The algorithm that performs the efficient INVQR of the underlying space-time data matrix may be implemented in parallel on a triangular array. Furthermore, its systolic architecture is well suited for VLSI implementation. Another important benefit of the INVQR decomposition is that it solves directly for the time-recursive least-squares filter vector, while avoiding the sequential back-substitution step required by the QR decomposition approaches.
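For orientation, the weighted recursive least-squares problem that the INVQR scheme addresses can be sketched in its conventional covariance form. The code below is that conventional recursion with an exponential forgetting factor, not the inverse QR algorithm of the paper; the function name, the two-weight linear model, and the data are hypothetical.

```python
import math

def rls_update(w, p, x, d, lam=0.99):
    """One exponentially weighted RLS step.

    w: current weights, p: inverse of the weighted correlation matrix,
    x: input vector, d: desired output, lam: forgetting factor.
    """
    n = len(w)
    px = [sum(p[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * px[i] for i in range(n))
    k = [v / denom for v in px]                    # gain vector
    err = d - sum(w[i] * x[i] for i in range(n))   # a priori error
    w = [w[i] + k[i] * err for i in range(n)]
    # P <- (P - k x^T P) / lam, using the symmetry of P.
    p = [[(p[i][j] - k[i] * px[j]) / lam for j in range(n)] for i in range(n)]
    return w, p

true_w = [2.0, -1.0]                      # hypothetical weights to recover
w = [0.0, 0.0]
p = [[1.0e6, 0.0], [0.0, 1.0e6]]          # large initial P: weak prior
for t in range(50):
    x = [math.sin(0.3 * t), math.cos(0.7 * t)]
    d = true_w[0] * x[0] + true_w[1] * x[1]
    w, p = rls_update(w, p, x, d)
print(w)
```

The explicit update of P in this form is the step that square-root and QR-based variants such as INVQR reorganize for better numerical robustness.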
Comparison of methods for developing the dynamics of rigid-body systems

NASA Technical Reports Server (NTRS)

Ju, M. S.; Mansour, J. M.

1989-01-01

Several approaches for developing the equations of motion for a three-degree-of-freedom PUMA robot were compared on the basis of computational efficiency (i.e., the number of additions, subtractions, multiplications, and divisions). Of particular interest was the investigation of the use of computer algebra as a tool for developing the equations of motion. Three approaches were implemented algebraically: Lagrange's method, Kane's method, and Wittenburg's method. Each formulation was developed in absolute and relative coordinates. These six cases were compared to each other and to a recursive numerical formulation. The results showed that all of the formulations implemented algebraically required fewer calculations than the recursive numerical algorithm. The algebraic formulations required fewer calculations in absolute coordinates than in relative coordinates. Each of the algebraic formulations could be simplified, using patterns from Kane's method, to yield the same number of calculations in a given coordinate system.
Implementation of a kappa-epsilon turbulence model to RPLUS3D code

NASA Technical Reports Server (NTRS)

Chitsomboon, Tawit

1992-01-01

The RPLUS3D code has been developed at the NASA Lewis Research Center to support the National Aerospace Plane (NASP) project. The code has the ability to solve three-dimensional flowfields with finite-rate combustion of hydrogen and air. The combustion process of the hydrogen-air system is simulated by an 18-reaction-path, 8-species chemical kinetic mechanism. The code uses a Lower-Upper (LU) decomposition numerical algorithm as its basis, making it a very efficient and robust code. Except for the Jacobian matrix for the implicit chemistry source terms, there is no inversion of a matrix even though a fully implicit numerical algorithm is used. A k-epsilon turbulence model has recently been incorporated into the code. Initial validations have been conducted for a flow over a flat plate. Results of the validation studies are shown. Some difficulties in implementing the k-epsilon equations in the code are also discussed.
Implementation of a kappa-epsilon turbulence model to RPLUS3D code

NASA Astrophysics Data System (ADS)

Chitsomboon, Tawit

1992-02-01

The RPLUS3D code has been developed at the NASA Lewis Research Center to support the National Aerospace Plane (NASP) project. The code has the ability to solve three-dimensional flowfields with finite-rate combustion of hydrogen and air. The combustion processes of the hydrogen-air system are simulated by an 18 reaction path, 8 species chemical kinetic mechanism. The code uses a Lower-Upper (LU) decomposition numerical algorithm as its basis, making it a very efficient and robust code. Except for the Jacobian matrix for the implicit chemistry source terms, there is no inversion of a matrix even though a fully implicit numerical algorithm is used. A k-epsilon turbulence model has recently been incorporated into the code. Initial validations have been conducted for a flow over a flat plate. Results of the validation studies are shown. Some difficulties in implementing the k-epsilon equations in the code are also discussed.
SToRM: A Model for 2D environmental hydraulics

USGS Publications Warehouse

Simões, Francisco J. M.

2017-01-01

A two-dimensional (depth-averaged) finite volume Godunov-type shallow water model developed for flow over complex topography is presented. The model, SToRM, is based on an unstructured cell-centered finite volume formulation and on nonlinear strong stability preserving Runge-Kutta time stepping schemes. The numerical discretization is founded on the classical and well established shallow water equations in hyperbolic conservative form, but the convective fluxes are calculated using auto-switching Riemann and diffusive numerical fluxes. Computational efficiency is achieved through a parallel implementation based on the OpenMP standard and the Fortran programming language. SToRM’s implementation within a graphical user interface is discussed. Field application of SToRM is illustrated by utilizing it to estimate peak flow discharges in a flooding event of the St. Vrain Creek in Colorado, U.S.A., in 2013, which reached 850 m³/s (~30,000 ft³/s) at the location of this study.
The changing face of surgical education: simulation as the new paradigm.

PubMed

Scott, Daniel J; Cendan, Juan C; Pugh, Carla M; Minter, Rebecca M; Dunnington, Gary L; Kozar, Rosemary A

2008-06-15

Surgical simulation has evolved considerably over the past two decades and now plays a major role in training efforts designed to foster the acquisition of new skills and knowledge outside of the clinical environment. Numerous driving forces have fueled this fundamental change in educational methods, including concerns over patient safety and the need to maximize efficiency within the context of limited work hours and clinical exposure. The importance of simulation has been recognized by the major stake-holders in surgical education, and the Residency Review Committee has mandated that all programs implement skills training curricula in 2008. Numerous issues now face educators who must use these novel training methods. It is important that these individuals have a solid understanding of content, development, research, and implementation aspects regarding simulation. This paper highlights presentations about these topics from a panel of experts convened at the 2008 Academic Surgical Congress.
THE CHANGING FACE OF SURGICAL EDUCATION: SIMULATION AS THE NEW PARADIGM

PubMed Central

Scott, Daniel J.; Cendan, Juan C.; Pugh, Carla M.; Minter, Rebecca M.; Dunnington, Gary L.; Kozar, Rosemary A.

2009-01-01

Surgical simulation has evolved considerably over the past two decades and now plays a major role in training efforts designed to foster the acquisition of new skills and knowledge outside of the clinical environment. Numerous driving forces have fueled this fundamental change in educational methods, including concerns over patient safety and the need to maximize efficiency within the context of limited work hours and clinical exposure. The importance of simulation has been recognized by the major stake-holders in surgical education, and the Residency Review Committee has mandated that all programs implement skills training curricula in 2008. Numerous issues now face educators who must use these novel training methods. It is important that these individuals have a solid understanding of content, development, research, and implementation aspects regarding simulation. This paper highlights presentations about these topics from a panel of experts convened at the 2008 Academic Surgical Congress. PMID:18498868
Highly efficient and exact method for parallelization of grid-based algorithms and its implementation in DelPhi

PubMed Central

Li, Chuan; Li, Lin; Zhang, Jie; Alexov, Emil

2012-01-01

The Gauss-Seidel method is a standard iterative numerical method widely used to solve a system of equations and, in general, is more efficient compared to other iterative methods, such as the Jacobi method. However, the standard implementation of the Gauss-Seidel method restricts its utilization in parallel computing due to its requirement of using updated neighboring values (i.e., in the current iteration) as soon as they are available. Here we report an efficient and exact (not requiring assumptions) method to parallelize iterations and to reduce the computational time as a linear/nearly linear function of the number of CPUs. In contrast to other existing solutions, our method does not require any assumptions and is equally applicable for solving linear and nonlinear equations. This approach is implemented in the DelPhi program, which is a finite difference Poisson-Boltzmann equation solver to model electrostatics in molecular biology. This development makes the iterative procedure for obtaining the electrostatic potential distribution in the parallelized DelPhi severalfold faster than that in the serial code. Further, we demonstrate the advantages of the new parallelized DelPhi by computing the electrostatic potential and the corresponding energies of large supramolecular structures. PMID:22674480
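The contrast between the Jacobi and Gauss-Seidel sweeps mentioned in this abstract can be illustrated with a minimal serial sketch. It does not reproduce the parallelization method of the paper or any DelPhi code; the test system (a small tridiagonal matrix), the function names, and the tolerance are all assumptions chosen for illustration. The in-place update inside `gauss_seidel_step` is the data dependence that makes naive parallelization difficult.

```python
def jacobi_step(A, b, x):
    """One Jacobi sweep: every update reads only values from the previous iteration."""
    n = len(b)
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]


def gauss_seidel_step(A, b, x):
    """One Gauss-Seidel sweep: each update immediately reuses already-updated entries."""
    n = len(b)
    x = list(x)  # copy, then update in place so later rows see new values
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x


def iterate(step, A, b, tol=1e-10, max_iter=10_000):
    """Repeat `step` from a zero start until successive iterates differ by less than `tol`."""
    x = [0.0] * len(b)
    for k in range(1, max_iter + 1):
        x_new = step(A, b, x)
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("iteration did not converge")


# Hypothetical test problem: tridiagonal (-1, 2, -1) matrix with a constant right-hand side.
n = 8
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n

x_jacobi, it_jacobi = iterate(jacobi_step, A, b)
x_gs, it_gs = iterate(gauss_seidel_step, A, b)
print("Jacobi iterations:", it_jacobi)
print("Gauss-Seidel iterations:", it_gs)
```

On this particular system Gauss-Seidel needs fewer sweeps than Jacobi to reach the same tolerance, which is the efficiency advantage the abstract refers to; the price is that row `i` cannot be updated until the rows before it are finished.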
Computing rank-revealing QR factorizations of dense matrices.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bischof, C. H.; Quintana-Orti, G.; Mathematics and Computer Science

1998-06-01

We develop algorithms and implementations for computing rank-revealing QR (RRQR) factorizations of dense matrices. First, we develop an efficient block algorithm for approximating an RRQR factorization, employing a windowed version of the commonly used Golub pivoting strategy, aided by incremental condition estimation. Second, we develop efficiently implementable variants of guaranteed reliable RRQR algorithms for triangular matrices originally suggested by Chandrasekaran and Ipsen and by Pan and Tang. We suggest algorithmic improvements with respect to condition estimation, termination criteria, and Givens updating. By combining the block algorithm with one of the triangular postprocessing steps, we arrive at an efficient and reliable algorithm for computing an RRQR factorization of a dense matrix. Experimental results on IBM RS/6000 and SGI R8000 platforms show that this approach performs up to three times faster than the less reliable QR factorization with column pivoting as it is currently implemented in LAPACK, and comes within 15% of the performance of the LAPACK block algorithm for computing a QR factorization without any column exchanges. Thus, we expect this routine to be useful in many circumstances where numerical rank deficiency cannot be ruled out, but currently has been ignored because of the computational cost of dealing with it.
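For readers unfamiliar with the baseline this abstract compares against, the following is a minimal sketch of QR factorization with column pivoting, where each step moves the remaining column of largest norm to the front. It is not the block RRQR algorithm of the paper and not the LAPACK routine: it uses modified Gram-Schmidt on plain Python lists for brevity, and the function name, relative tolerance, and test matrix are assumptions made for illustration.

```python
import math


def qr_column_pivoting(A, tol=1e-10):
    """Modified Gram-Schmidt QR with largest-norm column pivoting.

    Returns (Q, R, perm, rank), where Q is a list of orthonormal columns,
    R is upper triangular in its first `rank` rows, and column j of the
    permuted matrix is A[:, perm[j]].
    """
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]  # column-wise working copy
    perm = list(range(n))
    steps = min(m, n)
    Q = []
    R = [[0.0] * n for _ in range(steps)]
    rank = 0
    scale = 0.0
    for k in range(steps):
        # Pivot: pick the remaining column with the largest residual norm.
        norms = [math.sqrt(sum(v * v for v in cols[j])) for j in range(k, n)]
        offset = max(range(len(norms)), key=norms.__getitem__)
        p = k + offset
        if k == 0:
            scale = norms[offset]
        if norms[offset] <= tol * scale:
            break  # remaining columns are numerically zero: rank found
        cols[k], cols[p] = cols[p], cols[k]
        perm[k], perm[p] = perm[p], perm[k]
        for row in R:
            row[k], row[p] = row[p], row[k]
        R[k][k] = norms[offset]
        q = [v / R[k][k] for v in cols[k]]
        Q.append(q)
        # Remove the q-component from every remaining column.
        for j in range(k + 1, n):
            R[k][j] = sum(qi * cj for qi, cj in zip(q, cols[j]))
            cols[j] = [cj - R[k][j] * qi for cj, qi in zip(cols[j], q)]
        rank += 1
    return Q, R, perm, rank


# Hypothetical rank-deficient example: the third column is the sum of the first two.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 9.0],
     [7.0, 8.0, 15.0]]
Q, R, perm, rank = qr_column_pivoting(A)
print("numerical rank:", rank)
print("column order:", perm)
```

The early exit is what exposes the numerical rank here; the abstract's point is that this simple pivoting strategy is not guaranteed to reveal rank in every case, which motivates the more reliable postprocessing steps it describes.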
Simulation of ring polymer melts with GPU acceleration

NASA Astrophysics Data System (ADS)

Schram, R. D.; Barkema, G. T.

2018-06-01

We implemented the elastic lattice polymer model on the GPU (Graphics Processing Unit), and show that the GPU is very efficient for polymer simulations of dense polymer melts. The implementation is able to perform up to 4.1 × 10^9 Monte Carlo moves per second. Compared to our standard CPU implementation, we find an effective speed-up of a factor 92. Using this GPU implementation we studied the equilibrium properties and the dynamics of non-concatenated ring polymers in a melt of such polymers, using Rouse modes. With increasing polymer length, we found a very slow transition to compactness with a growth exponent ν ≈ 1/3. Numerically we find that the longest internal time scale of the polymer scales as N^3.1, with N the molecular weight of the ring polymer.
Seaworthy Quantum Key Distribution Design and Validation (SEAKEY)

DTIC Science & Technology

2014-10-30

to single photon detection, at comparable detection efficiencies. On the other hand, error-correction codes are better developed for small-alphabet...protocol is several orders of magnitude better than the Shapiro protocol, which needs entangled states. The bits/mode performance achieved by our...putting together a software tool implemented in MATLAB, which talks to the MODTRAN database via an intermediate numerical dump of transmission data
Reducing full one-loop amplitudes to scalar integrals at the integrand level

NASA Astrophysics Data System (ADS)

Ossola, Giovanni; Papadopoulos, Costas G.; Pittau, Roberto

2007-02-01

We show how to extract the coefficients of the 4-, 3-, 2- and 1-point one-loop scalar integrals from the full one-loop amplitude of arbitrary scattering processes. In a similar fashion, also the rational terms can be derived. Basically no information on the analytical structure of the amplitude is required, making our method appealing for an efficient numerical implementation.
An efficient numerical method for solving the Boltzmann equation in multidimensions

NASA Astrophysics Data System (ADS)

Dimarco, Giacomo; Loubère, Raphaël; Narski, Jacek; Rey, Thomas

2018-01-01

In this paper we deal with the extension of the Fast Kinetic Scheme (FKS) (Dimarco and Loubère, 2013 [26]), originally constructed for solving the BGK equation, to the more challenging case of the Boltzmann equation. The scheme combines a robust and fast method for treating the transport part, based on an innovative Lagrangian technique, with conservative fast spectral schemes to treat the collisional operator by means of an operator splitting approach. This approach, along with several implementation features related to the parallelization of the algorithm, makes it possible to construct an efficient simulation tool which is numerically tested against exact and reference solutions on classical problems arising in rarefied gas dynamics. We present results up to the 3D × 3D case for unsteady flows for the Variable Hard Sphere model which may serve as a benchmark for future comparisons between different numerical methods for solving the multidimensional Boltzmann equation. For this reason, we also provide for each problem studied details on the computational cost and memory consumption as well as comparisons with the BGK model or the limit model of compressible Euler equations.
An Efficient numerical method to calculate the conductivity tensor for disordered topological matter

NASA Astrophysics Data System (ADS)

Garcia, Jose H.; Covaci, Lucian; Rappoport, Tatiana G.

2015-03-01

We propose a new efficient numerical approach to calculate the conductivity tensor in solids. We use a real-space implementation of the Kubo formalism where both diagonal and off-diagonal conductivities are treated on the same footing. We adopt a formulation of the Kubo theory that is known as the Bastin formula and expand the Green's functions involved in terms of Chebyshev polynomials using the kernel polynomial method. Within this method, all the computational effort is on the calculation of the expansion coefficients. It also has the advantage of obtaining both conductivities in a single calculation step and for various values of temperature and chemical potential, capturing the topology of the band-structure. Our numerical technique is very general and is suitable for the calculation of transport properties of disordered systems. We analyze how the method's accuracy varies with the number of moments used in the expansion and illustrate our approach by calculating the transverse conductivity of different topological systems. T.G.R, J.H.G and L.C. acknowledge Brazilian agencies CNPq, FAPERJ and INCT de Nanoestruturas de Carbono, Flemish Science Foundation for financial support.
Contour integral method for obtaining the self-energy matrices of electrodes in electron transport calculations

NASA Astrophysics Data System (ADS)

Iwase, Shigeru; Futamura, Yasunori; Imakura, Akira; Sakurai, Tetsuya; Tsukamoto, Shigeru; Ono, Tomoya

2018-05-01

We propose an efficient computational method for evaluating the self-energy matrices of electrodes to study ballistic electron transport properties in nanoscale systems. To reduce the high computational cost incurred in large systems, a contour integral eigensolver based on the Sakurai-Sugiura method combined with the shifted biconjugate gradient method is developed to solve an exponential-type eigenvalue problem for complex wave vectors. A remarkable feature of the proposed algorithm is that the numerical procedure is very similar to that of conventional band structure calculations. We implement the developed method in the framework of the real-space higher-order finite-difference scheme with nonlocal pseudopotentials. Numerical tests for a wide variety of materials validate the robustness, accuracy, and efficiency of the proposed method. As an illustration of the method, we present the electron transport property of the freestanding silicene with the line defect originating from the reversed buckled phases.
SandiaMRCR

DOE Office of Scientific and Technical Information (OSTI.GOV)

2012-01-05

SandiaMCR was developed to identify pure components and their concentrations from spectral data. This software efficiently implements the multivariate calibration regression alternating least squares (MCR-ALS), principal component analysis (PCA), and singular value decomposition (SVD). Version 3.37 also includes the PARAFAC-ALS Tucker-1 (for trilinear analysis) algorithms. The alternating least squares methods can be used to determine the composition without or with incomplete prior information on the constituents and their concentrations. It allows the specification of numerous preprocessing, initialization and data selection and compression options for the efficient processing of large data sets. The software includes numerous options including the definition of equality and non-negativity constraints to realistically restrict the solution set, various normalization or weighting options based on the statistics of the data, several initialization choices and data compression. The software has been designed to provide a practicing spectroscopist the tools required to routinely analyze data in a reasonable time and without requiring expert intervention.
A GPU accelerated and error-controlled solver for the unbounded Poisson equation in three dimensions

NASA Astrophysics Data System (ADS)

Exl, Lukas

2017-12-01

An efficient solver for the three dimensional free-space Poisson equation is presented. The underlying numerical method is based on finite Fourier series approximation. While the error of all involved approximations can be fully controlled, the overall computation error is driven by the convergence of the finite Fourier series of the density. For smooth and fast-decaying densities the proposed method will be spectrally accurate. The method scales with O(N log N) operations, where N is the total number of discretization points in the Cartesian grid. The majority of the computational costs come from fast Fourier transforms (FFT), which makes it ideal for GPU computation. Several numerical computations on CPU and GPU validate the method and show efficiency and convergence behavior. Tests are performed using the Vienna Scientific Cluster 3 (VSC3). A free MATLAB implementation for CPU and GPU is provided to the interested community.
A new numerically stable implementation of the T-matrix method for electromagnetic scattering by spheroidal particles

NASA Astrophysics Data System (ADS)

Somerville, W. R. C.; Auguié, B.; Le Ru, E. C.

2013-07-01

We propose, describe, and demonstrate a new numerically stable implementation of the extended boundary-condition method (EBCM) to compute the T-matrix for electromagnetic scattering by spheroidal particles. Our approach relies on the fact that for many of the EBCM integrals in the special case of spheroids, a leading part of the integrand integrates exactly to zero, which causes catastrophic loss of precision in numerical computations. This feature was in fact first pointed out by Waterman in the context of acoustic scattering and electromagnetic scattering by infinite cylinders. We have recently studied it in detail in the case of electromagnetic scattering by particles. Based on this study, the principle of our new implementation is therefore to compute all the integrands without the problematic part to avoid the primary cause of loss of precision. Particular attention is also given to choosing the algorithms that minimise loss of precision in every step of the method, without compromising on speed. We show that the resulting implementation can efficiently compute in double precision arithmetic the T-matrix and therefore optical properties of spheroidal particles to a high precision, often down to a remarkable accuracy (10^-10 relative error), over a wide range of parameters that are typically considered problematic. We discuss examples such as high-aspect ratio metallic nanorods and large size parameter (≈35) dielectric particles, which had been previously modelled only using quadruple-precision arithmetic codes.

Numerical study on the power extraction performance of a flapping foil with a flexible tail

NASA Astrophysics Data System (ADS)

Wu, J.; Shu, C.; Zhao, N.; Tian, F.-B.

2015-01-01

The numerical study on the power extraction performance of a flapping foil with a flexible tail is performed in this work. A NACA0015 airfoil is arranged in a two-dimensional laminar flow and imposed with a synchronous harmonic plunge and pitch rotary motion. A flat plate that is attached to the trailing edge of the foil is utilized to model a tail, and so they are viewed as a whole for the purpose of power extraction. In addition, the tail either is rigid or can deform due to the exerted hydrodynamic forces. To implement numerical simulations, an immersed boundary-lattice Boltzmann method is employed. At a Reynolds number of 1100 and the position of the pitching axis at third chord, the influences of the mass and flexibility of the tail as well as the frequency of motion on the power extraction are systematically examined. It is found that compared to the foil with a rigid tail, the efficiency of power extraction for the foil with a deformable tail can be improved. Based on the numerical analysis, it is indicated that the enhanced plunging component of the power extraction, which is caused by the increased lift force, directly contributes to the efficiency improvement. Since a flexible tail with medium and high masses is not beneficial to the efficiency improvement, a flexible tail with low mass together with high flexibility is recommended in the flapping foil based power extraction system.
Impact of the shape of the implantable ports on their efficiency of flow (injection and flushing)

PubMed Central

Guiffant, Gérard; Flaud, Patrice; Durussel, Jean Jacques; Merckx, Jacques

2014-01-01

Now widely used, totally implantable venous access devices allow mid- and long-term, frequent, repeated, or continuous injection of therapeutic products by vascular, cavitary, or perineural access. The effective flushing of these devices is a key factor that ensures their long-lasting use. We present experimental results and a numerical simulation to demonstrate that the implementation of rounded edge wall cavities improves flushing efficiency. We use the same approaches to suggest that the deposit amount may be reduced by the use of rounded edge wall cavities. PMID:25258561
Communication: An efficient approach to compute state-specific nuclear gradients for a generic state-averaged multi-configuration self consistent field wavefunction.

PubMed

Granovsky, Alexander A

2015-12-21

We present a new, very efficient semi-numerical approach for the computation of state-specific nuclear gradients of a generic state-averaged multi-configuration self consistent field wavefunction. Our approach eliminates the costly coupled-perturbed multi-configuration Hartree-Fock step as well as the associated integral transformation stage. The details of the implementation within the Firefly quantum chemistry package are discussed and several sample applications are given. The new approach is routinely applicable to geometry optimization of molecular systems with 1000+ basis functions using a standalone multi-core workstation.
Experimental optimization during SERS application

NASA Astrophysics Data System (ADS)

Laha, Ranjit; Das, Gour Mohan; Ranjan, Pranay; Dantham, Venkata Ramanaiah

2018-05-01

The well-known surface enhanced Raman scattering (SERS) technique needs a lot of experimental optimization for its proper implementation. In this report, we demonstrate efficient SERS using gold nanoparticles (AuNPs) on a quartz plate. The AuNPs were prepared by depositing a direct current sputtered Au thin film followed by suitable annealing. The parameters varied to obtain the best SERS effect were 1) the numerical aperture of the Raman objective lens and 2) the sputtering duration of the Au film. It was found that AuNPs formed from the Au layer deposited for 40 s and a Raman objective lens of magnification 50X are the best combination for obtaining an efficient SERS effect.
Communication: An efficient approach to compute state-specific nuclear gradients for a generic state-averaged multi-configuration self consistent field wavefunction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Granovsky, Alexander A.

We present a new, very efficient semi-numerical approach for the computation of state-specific nuclear gradients of a generic state-averaged multi-configuration self consistent field wavefunction. Our approach eliminates the costly coupled-perturbed multi-configuration Hartree-Fock step as well as the associated integral transformation stage. The details of the implementation within the Firefly quantum chemistry package are discussed and several sample applications are given. The new approach is routinely applicable to geometry optimization of molecular systems with 1000+ basis functions using a standalone multi-core workstation.
Medical team training and coaching in the Veterans Health Administration; assessment and impact on the first 32 facilities in the programme.

PubMed

Neily, Julia; Mills, Peter D; Lee, Pamela; Carney, Brian; West, Priscilla; Percarpio, Katherine; Mazzia, Lisa; Paull, Douglas E; Bagian, James P

2010-08-01

Communication is problematic in healthcare. The Veterans Health Administration is implementing Medical Team Training. The authors describe results of the first 32 of 130 sites to undergo the programme. This report is unique; it provides aggregate results of a crew resource-management programme for numerous facilities. Facilities were taught medical team training and implemented briefings, debriefings and other projects. The authors coached teams through consultative phone interviews over a year. Implementation teams self-reported implementation and rated programme impact: 1='no impact' and 5='significant impact.' We used logistic regression to examine implementation of briefing/debriefing. Ninety-seven per cent of facilities implemented briefings and debriefings, and all implemented an additional project. As of the final interview, 73% of OR and 67% of ICU implementation teams self-reported and rated staff impact 4-5. Eighty-six per cent of OR and 82% of ICU implementation teams self-reported and rated patient impact 4-5. Improved teamwork was reported by 84% of OR and 75% of ICU implementation teams. Efficiency improvements were reported by 94% of OR implementation teams. Almost all facilities (97%) reported a success story or avoiding an undesirable event. Sites with lower volume were more likely to conduct briefings/debriefings in all cases for all surgical services (p=0.03). Sites are implementing the programme with a positive impact on patients and staff, and improving teamwork, efficiency and safety. A unique feature of the programme is that implementation was facilitated through follow-up support. This may have contributed to the early success of the programme.
The Design of a Templated C++ Small Vector Class for Numerical Computing

NASA Technical Reports Server (NTRS)

Moran, Patrick J.

2000-01-01

We describe the design and implementation of a templated C++ class for vectors. The vector class is templated both for vector length and vector component type; the vector length is fixed at template instantiation time. The vector implementation is such that for a vector of N components of type T, the total number of bytes required by the vector is equal to N * sizeof(T), where sizeof is the built-in C operator. The property of having a size no bigger than that required by the components themselves is key in many numerical computing applications, where one may allocate very large arrays of small, fixed-length vectors. In addition to the design trade-offs motivating our fixed-length vector design choice, we review some of the C++ template features essential to an efficient, succinct implementation. In particular, we highlight some of the standard C++ features, such as partial template specialization, that are not supported by all compilers currently. This report provides an inventory listing the relevant support currently provided by some key compilers, as well as test code one can use to verify compiler capabilities.
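The size property described in this abstract (a vector of N components of type T occupying exactly N times the size of T) can be checked outside C++ with C-layout array types. The sketch below uses Python's standard `ctypes` module as a stand-in; it is not the templated class from the report, and the helper name `fixed_vector_type` and the type names `Vec3f` and `Vec3fArray` are assumptions introduced for illustration.

```python
import ctypes


def fixed_vector_type(component_type, length):
    """Return a C-layout array type with `length` components of `component_type`.

    The length is fixed when the type is created, loosely mirroring a
    vector whose length is fixed at template instantiation time.
    """
    return component_type * length


# A small fixed-length vector type and a large array of such vectors.
Vec3f = fixed_vector_type(ctypes.c_float, 3)
Vec3fArray = Vec3f * 1000

component_size = ctypes.sizeof(ctypes.c_float)
print("bytes per component:", component_size)
print("bytes per Vec3f:", ctypes.sizeof(Vec3f))
print("bytes per 1000-vector array:", ctypes.sizeof(Vec3fArray))

# Components can be read and written by index, as with a plain C array.
v = Vec3f(1.0, 2.0, 3.0)
v[1] = 5.0
```

Because the array type carries no per-vector header or padding, a large array of small vectors costs only the storage of its components, which is the reason the abstract gives for insisting on this layout.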
A gradient enhanced plasticity-damage microplane model for concrete

NASA Astrophysics Data System (ADS)

Zreid, Imadeddin; Kaliske, Michael

2018-03-01

Computational modeling of concrete poses two main types of challenges. The first is the mathematical description of local response for such a heterogeneous material under all stress states, and the second is the stability and efficiency of the numerical implementation in finite element codes. The paper at hand presents a comprehensive approach addressing both issues. Adopting the microplane theory, a combined plasticity-damage model is formulated and regularized by an implicit gradient enhancement. The plasticity part introduces a new microplane smooth 3-surface cap yield function, which provides a stable numerical solution within an implicit finite element algorithm. The damage part utilizes a split, which can describe the transition of loading between tension and compression. Regularization of the model by the implicit gradient approach eliminates the mesh sensitivity and numerical instabilities. Identification methods for model parameters are proposed and several numerical examples of plain and reinforced concrete are carried out for illustration.
A Parallel Numerical Algorithm To Solve Linear Systems Of Equations Emerging From 3D Radiative Transfer

NASA Astrophysics Data System (ADS)

Wichert, Viktoria; Arkenberg, Mario; Hauschildt, Peter H.

2016-10-01

Highly resolved state-of-the-art 3D atmosphere simulations will remain computationally extremely expensive for years to come. In addition to the need for more computing power, rethinking coding practices is necessary. We take a dual approach by introducing especially adapted, parallel numerical methods and correspondingly parallelizing critical code passages. In the following, we present our respective work on PHOENIX/3D. With new parallel numerical algorithms, there is a big opportunity for improvement when iteratively solving the system of equations emerging from the operator splitting of the radiative transfer equation J = ΛS. The narrow-banded approximate Λ-operator Λ*, which is used in PHOENIX/3D, occurs in each iteration step. By implementing a numerical algorithm which takes advantage of its characteristic traits, the parallel code's efficiency is further increased and a speed-up in computational time can be achieved.
Traveling-Wave Solutions of the Kolmogorov-Petrovskii-Piskunov Equation

NASA Astrophysics Data System (ADS)

Pikulin, S. V.

2018-02-01

We consider quasi-stationary solutions of a problem without initial conditions for the Kolmogorov-Petrovskii-Piskunov (KPP) equation, which is a quasilinear parabolic one arising in the modeling of certain reaction-diffusion processes in the theory of combustion, mathematical biology, and other areas of natural sciences. A new efficiently numerically implementable analytical representation is constructed for self-similar plane traveling-wave solutions of the KPP equation with a special right-hand side. Sufficient conditions for an auxiliary function involved in this representation to be analytical for all values of its argument, including the endpoints, are obtained. Numerical results are obtained for model examples.
Lattice dynamics calculations based on density-functional perturbation theory in real space

NASA Astrophysics Data System (ADS)

Shang, Honghui; Carbogno, Christian; Rinke, Patrick; Scheffler, Matthias

2017-06-01

A real-space formalism for density-functional perturbation theory (DFPT) is derived and applied for the computation of harmonic vibrational properties in molecules and solids. The practical implementation using numeric atom-centered orbitals as basis functions is demonstrated exemplarily for the all-electron Fritz Haber Institute ab initio molecular simulations (FHI-aims) package. The convergence of the calculations with respect to numerical parameters is carefully investigated and a systematic comparison with finite-difference approaches is performed both for finite (molecules) and extended (periodic) systems. Finally, the scaling tests and scalability tests on massively parallel computer systems demonstrate the computational efficiency.
Reliability-Based Control Design for Uncertain Systems

NASA Technical Reports Server (NTRS)

Crespo, Luis G.; Kenny, Sean P.

2005-01-01

This paper presents a robust control design methodology for systems with probabilistic parametric uncertainty. Control design is carried out by solving a reliability-based multi-objective optimization problem where the probability of violating design requirements is minimized. Simultaneously, failure domains are optimally enlarged to enable global improvements in the closed-loop performance. To enable an efficient numerical implementation, a hybrid approach for estimating reliability metrics is developed. This approach, which integrates deterministic sampling and asymptotic approximations, greatly reduces the numerical burden associated with complex probabilistic computations without compromising the accuracy of the results. Examples using output-feedback and full-state feedback with state estimation are used to demonstrate the ideas proposed.
Memory efficient solution of the primitive equations for numerical weather prediction on the CYBER 205

NASA Technical Reports Server (NTRS)

Tuccillo, J. J.

1984-01-01

Numerical Weather Prediction (NWP), for both operational and research purposes, requires not only fast computational speed but also large memory. A technique for solving the Primitive Equations for atmospheric motion on the CYBER 205, as implemented in the Mesoscale Atmospheric Simulation System, which is fully vectorized and requires substantially less memory than other techniques such as the Leapfrog or Adams-Bashforth Schemes, is discussed. The technique presented uses the Euler-Backward time marching scheme. Also discussed are several techniques for reducing computational time of the model by replacing slow intrinsic routines by faster algorithms which use only hardware vector instructions.
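The memory argument in this abstract can be made concrete with a small sketch. It assumes that the Euler-backward time marching scheme named here is the two-stage predictor-corrector often called the Matsuno scheme in the meteorological literature; the abstract itself does not spell out the formula. The test equation (a simple oscillation), the step size, and the function name are likewise assumptions for illustration. Only the current state is carried between steps, whereas Leapfrog must keep two time levels and Adams-Bashforth must keep earlier tendencies.

```python
def euler_backward_step(f, y, dt):
    """One Euler-backward (Matsuno) step, assuming the two-stage form:

    predictor: y* = y + dt * f(y)
    corrector: y_new = y + dt * f(y*)
    """
    y_star = y + dt * f(y)       # forward-Euler predictor
    return y + dt * f(y_star)    # tendency re-evaluated at the predicted state


# Hypothetical test problem: oscillation equation dy/dt = i * omega * y.
omega = 1.0
dt = 0.5


def tendency(y):
    return 1j * omega * y


# Amplification over one step, starting from y = 1.
amplification = euler_backward_step(tendency, 1.0 + 0.0j, dt)
growth = abs(amplification) ** 2
print("squared amplification per step:", growth)

# Marching forward needs only the current value, no stored history.
y = 1.0 + 0.0j
for _ in range(20):
    y = euler_backward_step(tendency, y, dt)
print("amplitude after 20 steps:", abs(y))
```

For this test equation the squared amplification factor is 1 - p^2 + p^4 with p = omega * dt, so the scheme damps the oscillation for p < 1; that damping is the trade-off accepted in exchange for the single-time-level storage.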
Power corrections in the N -jettiness subtraction scheme

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boughezal, Radja; Liu, Xiaohui; Petriello, Frank

We discuss the leading-logarithmic power corrections in the N-jettiness subtraction scheme for higher-order perturbative QCD calculations. We compute the next-to-leading order power corrections for an arbitrary N-jet process, and we explicitly calculate the power correction through next-to-next-to-leading order for color-singlet production for both $q\bar{q}$ and gg initiated processes. Our results are compact and simple to implement numerically. Including the leading power correction in the N-jettiness subtraction scheme substantially improves its numerical efficiency. Finally, we discuss what features of our techniques extend to processes containing final-state jets.
Power corrections in the N-jettiness subtraction scheme

DOE PAGES

Boughezal, Radja; Liu, Xiaohui; Petriello, Frank

2017-03-30

We discuss the leading-logarithmic power corrections in the N-jettiness subtraction scheme for higher-order perturbative QCD calculations. We compute the next-to-leading order power corrections for an arbitrary N-jet process, and we explicitly calculate the power correction through next-to-next-to-leading order for color-singlet production for both $q\bar{q}$ and gg initiated processes. Our results are compact and simple to implement numerically. Including the leading power correction in the N-jettiness subtraction scheme substantially improves its numerical efficiency. Finally, we discuss what features of our techniques extend to processes containing final-state jets.
Vectorization on the star computer of several numerical methods for a fluid flow problem

NASA Technical Reports Server (NTRS)

Lambiotte, J. J., Jr.; Howser, L. M.

1974-01-01

A reexamination of some numerical methods is considered in light of the new class of computers which use vector streaming to achieve high computation rates. A study has been made of the effect on the relative efficiency of several numerical methods applied to a particular fluid flow problem when they are implemented on a vector computer. The method of Brailovskaya, the alternating direction implicit method, a fully implicit method, and a new method called partial implicitization have been applied to the problem of determining the steady state solution of the two-dimensional flow of a viscous incompressible fluid in a square cavity driven by a sliding wall. Results are obtained for three mesh sizes and a comparison is made of the methods for serial computation.
Numerical simulation of coupled electrochemical and transport processes in battery systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liaw, B.Y.; Gu, W.B.; Wang, C.Y.

1997-12-31

Advanced numerical modeling to simulate dynamic battery performance characteristics for several types of advanced batteries is being conducted using computational fluid dynamics (CFD) techniques. The CFD techniques provide efficient algorithms to solve a large set of highly nonlinear partial differential equations that represent the complex battery behavior governed by coupled electrochemical reactions and transport processes. The authors have recently successfully applied such techniques to model advanced lead-acid, Ni-Cd and Ni-MH cells. In this paper, the authors briefly discuss how the governing equations were numerically implemented, show some preliminary modeling results, and compare them with other modeling or experimental data reported in the literature. The authors describe the advantages and implications of using the CFD techniques and their capabilities in future battery applications.
A Numerical Method for Solving the 3D Unsteady Incompressible Navier-Stokes Equations in Curvilinear Domains with Complex Immersed Boundaries.

PubMed

Ge, Liang; Sotiropoulos, Fotis

2007-08-01

A novel numerical method is developed that integrates boundary-conforming grids with a sharp-interface, immersed boundary methodology. The method is intended for simulating internal flows containing complex, moving immersed boundaries such as those encountered in several cardiovascular applications. The background domain (e.g., the empty aorta) is discretized efficiently with a curvilinear boundary-fitted mesh while the complex moving immersed boundary (say a prosthetic heart valve) is treated with the sharp-interface, hybrid Cartesian/immersed-boundary approach of Gilmanov and Sotiropoulos [1]. To facilitate the implementation of this novel modeling paradigm in complex flow simulations, an accurate and efficient numerical method is developed for solving the unsteady, incompressible Navier-Stokes equations in generalized curvilinear coordinates. The method employs a novel, fully-curvilinear staggered grid discretization approach, which does not require either the explicit evaluation of the Christoffel symbols or the discretization of all three momentum equations at cell interfaces as done in previous formulations. The equations are integrated in time using an efficient, second-order accurate fractional step methodology coupled with a Jacobian-free, Newton-Krylov solver for the momentum equations and a GMRES solver enhanced with multigrid as preconditioner for the Poisson equation. Several numerical experiments are carried out on fine computational meshes to demonstrate the accuracy and efficiency of the proposed method for standard benchmark problems as well as for unsteady, pulsatile flow through a curved pipe bend. To demonstrate the ability of the method to simulate flows with complex, moving immersed boundaries we apply it to calculate pulsatile, physiological flow through a mechanical, bileaflet heart valve mounted in a model straight aorta with an anatomical-like triple sinus.
A Numerical Method for Solving the 3D Unsteady Incompressible Navier-Stokes Equations in Curvilinear Domains with Complex Immersed Boundaries

PubMed Central

Ge, Liang; Sotiropoulos, Fotis

2008-01-01

A novel numerical method is developed that integrates boundary-conforming grids with a sharp-interface, immersed boundary methodology. The method is intended for simulating internal flows containing complex, moving immersed boundaries such as those encountered in several cardiovascular applications. The background domain (e.g., the empty aorta) is discretized efficiently with a curvilinear boundary-fitted mesh while the complex moving immersed boundary (say a prosthetic heart valve) is treated with the sharp-interface, hybrid Cartesian/immersed-boundary approach of Gilmanov and Sotiropoulos [1]. To facilitate the implementation of this novel modeling paradigm in complex flow simulations, an accurate and efficient numerical method is developed for solving the unsteady, incompressible Navier-Stokes equations in generalized curvilinear coordinates. The method employs a novel, fully-curvilinear staggered grid discretization approach, which does not require either the explicit evaluation of the Christoffel symbols or the discretization of all three momentum equations at cell interfaces as done in previous formulations. The equations are integrated in time using an efficient, second-order accurate fractional step methodology coupled with a Jacobian-free, Newton-Krylov solver for the momentum equations and a GMRES solver enhanced with multigrid as preconditioner for the Poisson equation. Several numerical experiments are carried out on fine computational meshes to demonstrate the accuracy and efficiency of the proposed method for standard benchmark problems as well as for unsteady, pulsatile flow through a curved pipe bend. To demonstrate the ability of the method to simulate flows with complex, moving immersed boundaries we apply it to calculate pulsatile, physiological flow through a mechanical, bileaflet heart valve mounted in a model straight aorta with an anatomical-like triple sinus. PMID:19194533
Kalman filters for assimilating near-surface observations into the Richards equation - Part 1: Retrieving state profiles with linear and nonlinear numerical schemes

NASA Astrophysics Data System (ADS)

Chirico, G. B.; Medina, H.; Romano, N.

2014-07-01

This paper examines the potential of different algorithms, based on the Kalman filtering approach, for assimilating near-surface observations into a one-dimensional Richards equation governing soil water flow in soil. Our specific objectives are: (i) to compare the efficiency of different Kalman filter algorithms in retrieving matric pressure head profiles when they are implemented with different numerical schemes of the Richards equation; (ii) to evaluate the performance of these algorithms when nonlinearities arise from the nonlinearity of the observation equation, i.e. when surface soil water content observations are assimilated to retrieve matric pressure head values. The study is based on a synthetic simulation of an evaporation process from a homogeneous soil column. Our first objective is achieved by implementing a Standard Kalman Filter (SKF) algorithm with both an explicit finite difference scheme (EX) and a Crank-Nicolson (CN) linear finite difference scheme of the Richards equation. The Unscented (UKF) and Ensemble Kalman Filters (EnKF) are applied to handle the nonlinearity of a backward Euler finite difference scheme. To accomplish the second objective, an analogous framework is applied, with the exception of replacing SKF with the Extended Kalman Filter (EKF) in combination with a CN numerical scheme, so as to handle the nonlinearity of the observation equation. While the EX scheme is computationally too inefficient to be implemented in an operational assimilation scheme, the retrieval algorithm implemented with a CN scheme is found to be computationally more feasible and accurate than those implemented with the backward Euler scheme, at least for the examined one-dimensional problem. The UKF appears to be as feasible as the EnKF when one has to handle nonlinear numerical schemes or additional nonlinearities arising from the observation equation, at least for systems of small dimensionality as the one examined in this study.
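For readers unfamiliar with the Standard Kalman Filter (SKF) referred to in this abstract, a minimal scalar sketch of its predict and update steps follows. It is the textbook linear filter, not the authors' Richards-equation implementation; the symbols `a` (state transition), `q` (process-noise variance), `h` (observation operator) and `r` (observation-noise variance) are hypothetical names chosen for the example.

```python
def kalman_predict(x, P, a, q):
    """Propagate the state estimate x and its variance P one step forward."""
    x_pred = a * x
    P_pred = a * P * a + q
    return x_pred, P_pred

def kalman_update(x, P, z, h, r):
    """Assimilate one observation z of the quantity h * x."""
    S = h * P * h + r            # innovation variance
    K = P * h / S                # Kalman gain
    x_new = x + K * (z - h * x)  # correct the state with the innovation
    P_new = (1.0 - K * h) * P    # reduce the variance accordingly
    return x_new, P_new

# Hypothetical numbers: prior estimate 0 with unit variance, observation 2
# with unit noise variance, so the gain is 0.5.
x, P = kalman_update(0.0, 1.0, z=2.0, h=1.0, r=1.0)
x, P = kalman_predict(x, P, a=1.0, q=0.1)
```

The EKF, UKF and EnKF variants named in the abstract differ in how they propagate `x` and `P` through nonlinear models; the update structure stays the same.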

Efficient Project Delivery Using Lean Principles - An Indian Case Study

NASA Astrophysics Data System (ADS)

Kovvuri, P. Ramachandra Reddy; Sawhney, Anil; Ahuja, Ritu; Sreekumar, Aiswarya

2016-03-01

The construction industry in India is growing at a rapid pace. Along with this growth, the industry is facing numerous challenges that are making delivery of projects inefficient. Experts believe that capacity constraints in the industry need to be addressed immediately. The government has recommended the 'introduction of efficient technologies and modern management techniques' to increase the productivity of the industry. In this context, lean principles can act as a lever to make project delivery more efficient and provide the much-needed impetus to the Indian construction sector. Around the globe lean principles are showing positive results on projects. Project teams are reporting improvements in construction time, cost and quality along with softer benefits of enhanced collaboration, coordination and trust in project teams. Can adoption of lean principles provide similar benefits in the Indian construction sector? This research was conducted to answer this question. Using an action research approach, a key lean construction tool called Last Planner System (LPS) was tested on a large Indian construction project. This paper investigates the improvements achieved in project delivery by adopting LPS in the Indian construction sector. Comparison of pre- and post-implementation data demonstrates an increase in the certainty of workflow and improved schedule compliance. This is measured through a simple LPS metric called percent plan complete. Explicit improvements in schedule performance are seen during an 8-week LPS implementation along with implicit improvements in coordination, collaboration and trust in the project team. This work reports the findings of LPS implementation on the case study project, outlining the barriers and drivers to adoption, strategies needed to ensure successful implementation and a roadmap for implementation. Based on the findings the authors envision that lean construction can make project delivery more efficient in India.
Numerical simulation of turbulent stratified flame propagation in a closed vessel

NASA Astrophysics Data System (ADS)

Gruselle, Catherine; Lartigue, Ghislain; Pepiot, Perrine; Moureau, Vincent; D'Angelo, Yves

2012-11-01

Reducing pollutant emissions while keeping a high combustion efficiency and a low fuel consumption is an important challenge for both gas turbines (GT) and internal combustion engines (ICE). To fulfill these new constraints, stratified combustion may constitute an efficient strategy. A tabulated chemistry approach based on FPI combined with a low-Mach number method is applied in the analysis of a turbulent propane-air flame with equivalence ratio (ER) stratification, which has been studied experimentally by Balusamy [S. Balusamy, Ph.D Thesis, INSA-Rouen (2010)]. Flame topology, along with flame velocity statistics, are well reproduced in the simulation, even if time-history effects are not accounted for in the tabulated approach. However, these effects may become significant when exhaust gas recirculation (EGR) is introduced. To better quantify them, both ER and EGR-stratified two-dimensional flames are simulated using finite-rate chemistry and a semi-detailed mechanism for propane oxidation. The numerical implementation is first investigated in terms of efficiency and accuracy, with a focus on splitting errors. The resulting flames are then analyzed to investigate potential extensions of the FPI technique to EGR stratification.
Implicit Kalman filtering

NASA Technical Reports Server (NTRS)

Skliar, M.; Ramirez, W. F.

1997-01-01

For an implicitly defined discrete system, a new algorithm for Kalman filtering is developed and an efficient numerical implementation scheme is proposed. Unlike the traditional explicit approach, the implicit filter can be readily applied to ill-conditioned systems and allows for generalization to descriptor systems. The implementation of the implicit filter depends on the solution of the congruence matrix equation $A_1 P_x A_1^T = P_y$. We develop a general iterative method for the solution of this equation, and prove necessary and sufficient conditions for convergence. It is shown that when the system matrices of an implicit system are sparse, the implicit Kalman filter requires significantly less computer time and storage to implement as compared to the traditional explicit Kalman filter. Simulation results are presented to illustrate and substantiate the theoretical developments.
Successful Municipal Separate Storm Sewer System Programs Implemented in the Navy - NESDI #494

DTIC Science & Technology

2014-06-01

account. Lastly, upon speaking with numerous stormwater personnel who use a spreadsheet software for data tracking, they recommended that staying well...existing data sources, gathering and maintaining the data needed, and completing and reviewing this collection of information. Send comments regarding...an organized manner. In the long-term, a comprehensive electronic methodology is recommended to keep data organized, be more efficient and to keep
Current and Future Applications of Machine Learning for the US Army

DTIC Science & Technology

2018-04-13

designing from the unwieldy application of the first principles of flight controls, aerodynamics, blade propulsion, and so on, the designers turned...when the number of features runs into millions can become challenging. To overcome these issues, regularization techniques have been developed which...and compiled to run efficiently on either CPU or GPU architectures. 5) Keras63 is a library that contains numerous implementations of commonly used
Research on an augmented Lagrangian penalty function algorithm for nonlinear programming

NASA Technical Reports Server (NTRS)

Frair, L.

1978-01-01

The augmented Lagrangian (ALAG) Penalty Function Algorithm for optimizing nonlinear mathematical models is discussed. The mathematical models of interest are deterministic in nature and finite dimensional optimization is assumed. A detailed review of penalty function techniques in general and the ALAG technique in particular is presented. Numerical experiments are conducted utilizing a number of nonlinear optimization problems to identify an efficient ALAG Penalty Function Technique for computer implementation.
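To make the augmented Lagrangian idea in this abstract concrete, the following is a minimal sketch for a single equality constraint, assuming the standard form L(x) = f(x) + lam*h(x) + (mu/2)*h(x)^2 with the multiplier update lam <- lam + mu*h(x). The inner solver (fixed-step gradient descent), the parameter values and the toy problem are all hypothetical choices for illustration and are not taken from the report.

```python
def augmented_lagrangian(grad_f, h, grad_h, x0,
                         mu=10.0, outer=10, inner=500, step=0.04):
    """Minimise f(x) subject to h(x) = 0 with a fixed penalty weight mu."""
    x = list(x0)
    lam = 0.0
    for _ in range(outer):
        # Inner loop: approximately minimise the augmented Lagrangian in x.
        for _ in range(inner):
            hx = h(x)
            gf, gh = grad_f(x), grad_h(x)
            # Gradient of f + lam*h + (mu/2)*h^2 is grad_f + (lam + mu*h)*grad_h.
            g = [gf_i + (lam + mu * hx) * gh_i for gf_i, gh_i in zip(gf, gh)]
            x = [xi - step * gi for xi, gi in zip(x, g)]
        # Outer loop: first-order multiplier update.
        lam += mu * h(x)
    return x, lam

# Toy problem: minimise x1^2 + x2^2 subject to x1 + x2 - 1 = 0.
x_opt, lam_opt = augmented_lagrangian(
    grad_f=lambda x: [2.0 * x[0], 2.0 * x[1]],
    h=lambda x: x[0] + x[1] - 1.0,
    grad_h=lambda x: [1.0, 1.0],
    x0=[0.0, 0.0],
)
```

For this toy problem the constrained minimiser is (0.5, 0.5) with multiplier -1 under the sign convention above. The fixed step of 0.04 is chosen to stay below the stability limit of gradient descent for mu = 10; a different penalty weight would need a different step.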
Fully Implicit, Nonlinear 3D Extended Magnetohydrodynamics

NASA Astrophysics Data System (ADS)

Chacon, Luis; Knoll, Dana

2003-10-01

Extended magnetohydrodynamics (XMHD) includes nonideal effects such as nonlinear, anisotropic transport and two-fluid (Hall) effects. XMHD supports multiple, separate time scales that make explicit time differencing approaches extremely inefficient. While a fully implicit implementation promises efficiency without sacrificing numerical accuracy (D. A. Knoll et al., J. Comput. Phys. 185 (2), 583-611 (2003)), the nonlinear nature of the XMHD system and the numerical stiffness associated with the fast waves make this endeavor difficult. Newton-Krylov methods are, however, ideally suited for such a task. These synergistically combine Newton's method for nonlinear convergence, and Krylov techniques to solve the associated Jacobian (linear) systems. Krylov methods can be implemented Jacobian-free and can be preconditioned for efficiency. Successful preconditioning strategies have been developed for 2D incompressible resistive (L. Chacón et al., J. Comput. Phys. 178 (1), 15-36 (2002)) and Hall (L. Chacón and D. A. Knoll, J. Comput. Phys. 188 (2), 573-592 (2003)) MHD models. These are based on "physics-based" ideas, in which knowledge of the physics is exploited to derive well-conditioned (diagonally-dominant) approximations to the original system that are amenable to optimal solver technologies (multigrid). In this work, we will describe the status of the extension of the 2D preconditioning ideas for a 3D compressible, single-fluid XMHD model.
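The Jacobian-free Newton-Krylov idea described in this abstract can be sketched on a small model problem. The example below is an assumption-laden illustration, not the XMHD solver: it solves a hypothetical 1D problem, -u'' + u^3 = 1 with zero boundary values, approximates Jacobian-vector products by finite differences of the residual, and uses unpreconditioned conjugate gradients as the Krylov method because this particular Jacobian is symmetric positive definite (a general solver would typically use GMRES with a preconditioner).

```python
import math

def residual(u):
    """F(u) for -u'' + u^3 = 1 on (0, 1), u(0) = u(1) = 0, central differences."""
    n = len(u)
    h2 = (1.0 / (n + 1)) ** 2
    F = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        F.append((2.0 * u[i] - left - right) / h2 + u[i] ** 3 - 1.0)
    return F

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def jac_vec(F, u, Fu, v):
    """Approximate J(u) v as (F(u + eps v) - F(u)) / eps; J is never formed."""
    nv = norm(v)
    if nv == 0.0:
        return [0.0] * len(v)
    eps = 1.5e-8 * (1.0 + norm(u)) / nv
    Fp = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(a - b) / eps for a, b in zip(Fp, Fu)]

def cg(matvec, b, tol=1e-6, max_iter=50):
    """Conjugate gradients using only matrix-vector products."""
    x = [0.0] * len(b)
    r, p = list(b), list(b)
    rs = dot(r, r)
    b_norm = math.sqrt(rs)
    for _ in range(max_iter):
        if math.sqrt(rs) <= tol * b_norm:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

def newton_krylov(F, u0, tol=1e-8, max_newton=20):
    """Newton iteration whose linear solves see the Jacobian only via jac_vec."""
    u = list(u0)
    for _ in range(max_newton):
        Fu = F(u)
        if norm(Fu) < tol:
            break
        du = cg(lambda v: jac_vec(F, u, Fu, v), [-f for f in Fu])
        u = [ui + di for ui, di in zip(u, du)]
    return u

u_sol = newton_krylov(residual, [0.0] * 8)
```

Each Krylov iteration costs one extra residual evaluation, which is why the quality of the preconditioner dominates the cost in stiff problems such as those discussed above.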
A SEMI-LAGRANGIAN TWO-LEVEL PRECONDITIONED NEWTON-KRYLOV SOLVER FOR CONSTRAINED DIFFEOMORPHIC IMAGE REGISTRATION.

PubMed

Mang, Andreas; Biros, George

2017-01-01

We propose an efficient numerical algorithm for the solution of diffeomorphic image registration problems. We use a variational formulation constrained by a partial differential equation (PDE), where the constraints are a scalar transport equation. We use a pseudospectral discretization in space and second-order accurate semi-Lagrangian time stepping scheme for the transport equations. We solve for a stationary velocity field using a preconditioned, globalized, matrix-free Newton-Krylov scheme. We propose and test a two-level Hessian preconditioner. We consider two strategies for inverting the preconditioner on the coarse grid: a nested preconditioned conjugate gradient method (exact solve) and a nested Chebyshev iterative method (inexact solve) with a fixed number of iterations. We test the performance of our solver in different synthetic and real-world two-dimensional application scenarios. We study grid convergence and computational efficiency of our new scheme. We compare the performance of our solver against our initial implementation that uses the same spatial discretization but a standard, explicit, second-order Runge-Kutta scheme for the numerical time integration of the transport equations and a single-level preconditioner. Our improved scheme delivers significant speedups over our original implementation. As a highlight, we observe a 20× speedup for a two-dimensional, real-world multi-subject medical image registration problem.
Avoiding numerical pitfalls in social force models

NASA Astrophysics Data System (ADS)

Köster, Gerta; Treml, Franz; Gödel, Marion

2013-06-01

The social force model of Helbing and Molnár is one of the best known approaches to simulate pedestrian motion, a collective phenomenon with nonlinear dynamics. It is based on the idea that the Newtonian laws of motion mostly carry over to pedestrian motion so that human trajectories can be computed by solving a set of ordinary differential equations for velocity and acceleration. The beauty and simplicity of this ansatz are strong reasons for its wide spread. However, the numerical implementation is not without pitfalls. Oscillations, collisions, and instabilities occur even for very small step sizes. Classic solution ideas from molecular dynamics do not apply to the problem because the system is not Hamiltonian despite its source of inspiration. Looking at the model through the eyes of a mathematician, however, we realize that the right hand side of the differential equation is nondifferentiable and even discontinuous at critical locations. This produces undesirable behavior in the exact solution and, at best, severe loss of accuracy in efficient numerical schemes even in short range simulations. We suggest a very simple mollified version of the social force model that conserves the desired dynamic properties of the original many-body system but elegantly and cost efficiently resolves several of the issues concerning stability and numerical resolution.
Influence of lubrication forces in direct numerical simulations of particle-laden flows

NASA Astrophysics Data System (ADS)

Maitri, Rohit; Peters, Frank; Padding, Johan; Kuipers, Hans

2016-11-01

Accurate numerical representation of particle-laden flows is important for fundamental understanding and optimization of complex processes such as proppant transport in fracking. Liquid-solid flows are fundamentally different from gas-solid flows because of lower density ratios (solid to fluid) and non-negligible lubrication forces. In this interface-resolved model, fluid-solid coupling is achieved by incorporating the no-slip boundary condition implicitly at the particles' surfaces by means of an efficient second-order ghost-cell immersed boundary method. A fixed Eulerian grid is used for solving the Navier-Stokes equations and the particle-particle interactions are implemented using the soft sphere collision and sub-grid scale lubrication model. Because the lubrication force acts over a range smaller than the grid size, it is important to implement the lubrication model accurately. In this work, different implementations of the lubrication model on particle dynamics are studied for various flow conditions. The effect of particle surface roughness on the lubrication force and the particle transport is also investigated. This study is aimed at developing a validated methodology to incorporate lubrication models in direct numerical simulation of particle-laden flows. This research is supported by Grant 13CSER014 of the Foundation for Fundamental Research on Matter (FOM), which is part of the Netherlands Organisation for Scientific Research (NWO).
Quantum generalisation of feedforward neural networks

NASA Astrophysics Data System (ADS)

Wan, Kwok Ho; Dahlsten, Oscar; Kristjánsson, Hlér; Gardner, Robert; Kim, M. S.

2017-09-01

We propose a quantum generalisation of a classical neural network. The classical neurons are firstly rendered reversible by adding ancillary bits. Then they are generalised to being quantum reversible, i.e., unitary (the classical networks we generalise are called feedforward, and have step-function activation functions). The quantum network can be trained efficiently using gradient descent on a cost function to perform quantum generalisations of classical tasks. We demonstrate numerically that it can: (i) compress quantum states onto a minimal number of qubits, creating a quantum autoencoder, and (ii) discover quantum communication protocols such as teleportation. Our general recipe is theoretical and implementation-independent. The quantum neuron module can naturally be implemented photonically.
Proportional Topology Optimization: A New Non-Sensitivity Method for Solving Stress Constrained and Minimum Compliance Problems and Its Implementation in MATLAB

PubMed Central

Biyikli, Emre; To, Albert C.

2015-01-01

A new topology optimization method called the Proportional Topology Optimization (PTO) is presented. As a non-sensitivity method, PTO is simple to understand, easy to implement, and is also efficient and accurate at the same time. It is implemented into two MATLAB programs to solve the stress constrained and minimum compliance problems. Descriptions of the algorithm and computer programs are provided in detail. The method is applied to solve three numerical examples for both types of problems. The method shows comparable efficiency and accuracy with an existing optimality criteria method which computes sensitivities. Also, the PTO stress constrained algorithm and minimum compliance algorithm are compared by feeding output from one algorithm to the other in an alternative manner, where the former yields lower maximum stress and volume fraction but higher compliance compared to the latter. Advantages and disadvantages of the proposed method and future work are discussed. The computer programs are self-contained and publicly shared on the website www.ptomethod.org. PMID:26678849
Implementation of a digital optical matrix-vector multiplier using a holographic look-up table and residue arithmetic

NASA Technical Reports Server (NTRS)

Habiby, Sarry F.

1987-01-01

The design and implementation of a digital (numerical) optical matrix-vector multiplier are presented. The objective is to demonstrate the operation of an optical processor designed to minimize computation time in performing a practical computing application. This is done by using the large array of processing elements in a Hughes liquid crystal light valve, and relying on the residue arithmetic representation, a holographic optical memory, and position coded optical look-up tables. In the design, all operations are performed in effectively one light valve response time regardless of matrix size. The features of the design allowing fast computation include the residue arithmetic representation, the mapping approach to computation, and the holographic memory. In addition, other features of the work include a practical light valve configuration for efficient polarization control, a model for recording multiple exposures in silver halides with equal reconstruction efficiency, and using light from an optical fiber for a reference beam source in constructing the hologram. The design can be extended to implement larger matrix arrays without increasing computation time.
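The residue arithmetic and look-up-table ideas in this abstract can be illustrated in software. The sketch below is not the optical design: it assumes a hypothetical set of pairwise coprime moduli (5, 7, 9, 11), precomputes small addition and multiplication tables for each modulus, performs a matrix-vector product channel by channel with table look-ups only, and recovers the integer results with the Chinese remainder theorem. Results are only valid while they stay below the product of the moduli (3465 here).

```python
from math import prod

MODULI = (5, 7, 9, 11)   # hypothetical pairwise coprime moduli
RANGE = prod(MODULI)     # representable range: 0 .. 3464

# One small look-up table per modulus and operation, built once.
ADD = {m: [[(a + b) % m for b in range(m)] for a in range(m)] for m in MODULI}
MUL = {m: [[(a * b) % m for b in range(m)] for a in range(m)] for m in MODULI}

def to_residues(x):
    """Encode a non-negative integer as one residue per modulus."""
    return tuple(x % m for m in MODULI)

def from_residues(res):
    """Decode with the Chinese remainder theorem."""
    total = 0
    for r, m in zip(res, MODULI):
        Mi = RANGE // m
        total += r * Mi * pow(Mi, -1, m)   # pow(.., -1, m) is the modular inverse
    return total % RANGE

def rns_matvec(A, x):
    """Matrix-vector product in which every arithmetic step is a table look-up."""
    xr = [to_residues(v) for v in x]
    out = []
    for row in A:
        acc = to_residues(0)
        for a, rx in zip(row, xr):
            ra = to_residues(a)
            # Channels are independent, so there is no carry between moduli.
            acc = tuple(ADD[m][c][MUL[m][p][q]]
                        for c, p, q, m in zip(acc, ra, rx, MODULI))
        out.append(from_residues(acc))
    return out

y = rns_matvec([[1, 2], [3, 4]], [5, 6])
```

Because no carries propagate between channels, every channel can in principle be evaluated at the same time, which is the property the abstract exploits to keep computation time independent of matrix size.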
Exploiting the chaotic behaviour of atmospheric models with reconfigurable architectures

NASA Astrophysics Data System (ADS)

Russell, Francis P.; Düben, Peter D.; Niu, Xinyu; Luk, Wayne; Palmer, T. N.

2017-12-01

Reconfigurable architectures are becoming mainstream: Amazon, Microsoft and IBM are supporting such architectures in their data centres. The computationally intensive nature of atmospheric modelling is an attractive target for hardware acceleration using reconfigurable computing. Performance of hardware designs can be improved through the use of reduced-precision arithmetic, but maintaining appropriate accuracy is essential. We explore reduced-precision optimisation for simulating chaotic systems, targeting atmospheric modelling, in which even minor changes in arithmetic behaviour will cause simulations to diverge quickly. The possibility of equally valid simulations having differing outcomes means that standard techniques for comparing numerical accuracy are inappropriate. We use the Hellinger distance to compare statistical behaviour between reduced-precision CPU implementations to guide reconfigurable designs of a chaotic system, then analyse accuracy, performance and power efficiency of the resulting implementations. Our results show that with only a limited loss in accuracy corresponding to less than 10% uncertainty in input parameters, the throughput and energy efficiency of a single-precision chaotic system implemented on a Xilinx Virtex-6 SX475T Field Programmable Gate Array (FPGA) can be more than doubled.
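The Hellinger distance used in this abstract compares probability distributions rather than individual trajectories, which is what makes it usable for chaotic systems. The sketch below assumes the usual discrete definition H(P, Q) = sqrt(sum_i (sqrt(p_i) - sqrt(q_i))^2) / sqrt(2). The logistic-map test system and the decimal rounding used to mimic reduced precision are hypothetical stand-ins for the paper's atmospheric model and FPGA arithmetic.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical)."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s) / math.sqrt(2.0)

def histogram(samples, lo, hi, bins):
    """Normalised histogram of the samples that fall inside [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for s in samples:
        if lo <= s <= hi:
            counts[min(int((s - lo) / width), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def logistic_orbit(x0, steps, digits=None):
    """Iterate x -> 3.9 x (1 - x); optionally round each state to mimic low precision."""
    xs, x = [], x0
    for _ in range(steps):
        x = 3.9 * x * (1.0 - x)
        if digits is not None:
            x = round(x, digits)
        xs.append(x)
    return xs

reference = histogram(logistic_orbit(0.2, 5000), 0.0, 1.0, 20)
reduced = histogram(logistic_orbit(0.2, 5000, digits=3), 0.0, 1.0, 20)
distance = hellinger(reference, reduced)
```

The two orbits diverge pointwise after a few dozen steps, so a pointwise error norm would be meaningless here; `distance` instead measures how far the long-run statistics have drifted.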
Experience with a Genetic Algorithm Implemented on a Multiprocessor Computer

NASA Technical Reports Server (NTRS)

Plassman, Gerald E.; Sobieszczanski-Sobieski, Jaroslaw

2000-01-01

Numerical experiments were conducted to find out the extent to which a Genetic Algorithm (GA) may benefit from a multiprocessor implementation, considering, on one hand, that analyses of individual designs in a population are independent of each other so that they may be executed concurrently on separate processors, and, on the other hand, that there are some operations in a GA that cannot be so distributed. The algorithm experimented with was based on a Gaussian distribution rather than bit exchange in the GA reproductive mechanism, and the test case was a hub frame structure of up to 1080 design variables. The experimentation engaging up to 128 processors confirmed expectations of radical elapsed time reductions compared to a conventional single processor implementation. It also demonstrated that the time spent in the non-distributable parts of the algorithm and the attendant cross-processor communication may have a very detrimental effect on the efficient utilization of the multiprocessor machine and on the number of processors that can be used effectively in a concurrent manner. Three techniques were devised and tested to mitigate that effect, resulting in efficiency increasing to exceed 99 percent.
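The trade-off this abstract describes, between distributable analyses and non-distributable operations, can be quantified with Amdahl's law. This is a swapped-in textbook model, not the authors' analysis, and it ignores communication cost; the function names are hypothetical. It gives a feel for how small the serial share must be for the efficiency figures quoted above.

```python
def speedup(serial_fraction, processors):
    """Amdahl's law: speedup when only the non-serial part is distributed."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

def efficiency(serial_fraction, processors):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup(serial_fraction, processors) / processors

def max_serial_fraction(target_efficiency, processors):
    """Largest serial fraction s that still reaches the target efficiency.

    Obtained by inverting E = 1 / (1 + s * (p - 1)).
    """
    return (1.0 / target_efficiency - 1.0) / (processors - 1)

# Serial share tolerable for 99 percent efficiency on 128 processors.
s_limit = max_serial_fraction(0.99, 128)
```

Under this model `s_limit` is below one part in ten thousand, which suggests why dedicated techniques for the non-distributable parts were needed.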
Implementation of the infinite-range exterior complex scaling to the time-dependent complete-active-space self-consistent-field method

NASA Astrophysics Data System (ADS)

Orimo, Yuki; Sato, Takeshi; Scrinzi, Armin; Ishikawa, Kenichi L.

2018-02-01

We present a numerical implementation of the infinite-range exterior complex scaling [Scrinzi, Phys. Rev. A 81, 053845 (2010), 10.1103/PhysRevA.81.053845] as an efficient absorbing boundary to the time-dependent complete-active-space self-consistent field method [Sato, Ishikawa, Březinová, Lackner, Nagele, and Burgdörfer, Phys. Rev. A 94, 023405 (2016), 10.1103/PhysRevA.94.023405] for multielectron atoms subject to an intense laser pulse. We introduce Gauss-Laguerre-Radau quadrature points to construct discrete variable representation basis functions in the last radial finite element extending to infinity. This implementation is applied to strong-field ionization and high-harmonic generation in He, Be, and Ne atoms. It efficiently prevents unphysical reflection of photoelectron wave packets at the simulation boundary, enabling accurate simulations with substantially reduced computational cost, even under significant (≈50%) double ionization. For the case of a simulation of high-harmonic generation from Ne, for example, 80% cost reduction is achieved, compared to a mask-function absorption boundary.
The application of quaternions and other spatial representations to the reconstruction of re-entry vehicle motion.

DOE Office of Scientific and Technical Information (OSTI.GOV)

De Sapio, Vincent

2010-09-01

The analysis of spacecraft kinematics and dynamics requires an efficient scheme for spatial representation. While the representation of displacement in three dimensional Euclidean space is straightforward, orientation in three dimensions poses particular challenges. The unit quaternion provides an approach that mitigates many of the problems intrinsic in other representation approaches, including the ill-conditioning that arises from computing many successive rotations. This report focuses on the computational utility of unit quaternions and their application to the reconstruction of re-entry vehicle (RV) motion history from sensor data. To this end they will be used in conjunction with other kinematic and data processing techniques. We will present a numerical implementation for the reconstruction of RV motion solely from gyroscope and accelerometer data. This will make use of unit quaternions due to their numerical efficacy in dealing with the composition of many incremental rotations over a time series. In addition to signal processing and data conditioning procedures, algorithms for numerical quaternion-based integration of gyroscope data will be addressed, as well as accelerometer triangulation and integration to yield RV trajectory. Actual processed flight data will be presented to demonstrate the implementation of these methods.
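The composition of many incremental rotations mentioned in this abstract can be sketched as follows. This is a generic illustration, not the report's algorithm: it assumes gyroscope samples are body-frame angular rates in rad/s, held constant over each time step, that the quaternion is stored as (w, x, y, z), and that it maps body to inertial coordinates so that increments are applied by right-multiplication. All function names are hypothetical.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)

def delta_quat(omega, dt):
    """Unit quaternion for a rotation at constant rate omega over time dt."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:
        # Small-angle limit avoids dividing by a vanishing rate.
        return (1.0, 0.5 * wx * dt, 0.5 * wy * dt, 0.5 * wz * dt)
    s = math.sin(0.5 * angle) / rate
    return (math.cos(0.5 * angle), wx * s, wy * s, wz * s)

def integrate_gyro(samples, dt, q0=(1.0, 0.0, 0.0, 0.0)):
    """Accumulate orientation from a sequence of angular-rate samples."""
    q = q0
    for omega in samples:
        q = quat_mul(q, delta_quat(omega, dt))   # body-frame increment
        n = math.sqrt(sum(c * c for c in q))
        q = tuple(c / n for c in q)              # renormalise against drift
    return q

# Hypothetical data: a steady 90 degrees per second about the body z axis for 1 s.
q_end = integrate_gyro([(0.0, 0.0, math.pi / 2)] * 100, dt=0.01)
```

Renormalising after each step keeps the result a unit quaternion however many increments are composed, whereas repeated products of rotation matrices would need a more expensive re-orthogonalisation.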
Calculation of ionized fields in DC electrostatic precipitators in the presence of dust and electric wind

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cristina, S.; Feliziani, M.

1995-11-01

This paper describes a new procedure for the numerical computation of the electric field and current density distributions in a dc electrostatic precipitator in the presence of dust, taking into account the particle-size distribution. Poisson's and continuity equations are numerically solved by supposing that the coronating conductors satisfy Kaptzov's assumption on the emitter surfaces. Two iterative numerical procedures, both based on the finite element method (FEM), are implemented for evaluating, respectively, the unknown ionic charge density and the particle charge density distributions. The V-I characteristic and the precipitation efficiencies for the individual particle-size classes, calculated with reference to the pilot precipitator installed by ENEL (Italian Electricity Board) at its Marghera (Venice) coal-fired power station, are found to be very close to those measured experimentally.
Engine dynamic analysis with general nonlinear finite element codes. II - Bearing element implementation, overall numerical characteristics and benchmarking

NASA Technical Reports Server (NTRS)

Padovan, J.; Adams, M.; Lam, P.; Fertis, D.; Zeid, I.

1982-01-01

Second-year efforts within a three-year study to develop and extend finite element (FE) methodology to efficiently handle the transient/steady state response of rotor-bearing-stator structure associated with gas turbine engines are outlined. The two main areas aim at (1) implanting the squeeze film damper element into a general purpose FE code for testing and evaluation; and (2) determining the numerical characteristics of the FE-generated rotor-bearing-stator simulation scheme. The governing FE field equations are set out and the solution methodology is presented. The choice of ADINA as the general-purpose FE code is explained, and the numerical operational characteristics of the direct integration approach of FE-generated rotor-bearing-stator simulations are determined, including benchmarking, comparison of explicit vs. implicit methodologies of direct integration, and demonstration problems.
COMETS2: An advanced MATLAB toolbox for the numerical analysis of electric fields generated by transcranial direct current stimulation.

PubMed

Lee, Chany; Jung, Young-Jin; Lee, Sang Jun; Im, Chang-Hwan

2017-02-01

Since there is no way to measure electric current generated by transcranial direct current stimulation (tDCS) inside the human head through in vivo experiments, numerical analysis based on the finite element method has been widely used to estimate the electric field inside the head. In 2013, we released a MATLAB toolbox named COMETS, which has been used by a number of groups and has helped researchers to gain insight into the electric field distribution during stimulation. The aim of this study was to develop an advanced MATLAB toolbox, named COMETS2, for the numerical analysis of the electric field generated by tDCS. COMETS2 can generate rectangular pad electrodes of any size at any position on the scalp surface. To reduce the large computational burden when repeatedly testing multiple electrode locations and sizes, a new technique to decompose the global stiffness matrix was proposed. As examples of potential applications, we observed the effects of sizes and displacements of electrodes on the results of electric field analysis. The proposed mesh decomposition method significantly enhanced the overall computational efficiency. We implemented an automatic electrode modeler for the first time, and proposed a new technique to enhance the computational efficiency. In this paper, an efficient toolbox for tDCS analysis is introduced (freely available at http://www.cometstool.com). It is expected that COMETS2 will be a useful toolbox for researchers who want to benefit from the numerical analysis of electric fields generated by tDCS. Copyright © 2016. Published by Elsevier B.V.

Two-dimensional nonsteady viscous flow simulation on the Navier-Stokes computer miniNode

NASA Technical Reports Server (NTRS)

Nosenchuck, Daniel M.; Littman, Michael G.; Flannery, William

1986-01-01

The needs of large-scale scientific computation are outpacing the growth in performance of mainframe supercomputers. In particular, problems in fluid mechanics involving complex flow simulations require far more speed and capacity than that provided by current and proposed Class VI supercomputers. To address this concern, the Navier-Stokes Computer (NSC) was developed. The NSC is a parallel-processing machine, comprised of individual Nodes, each comparable in performance to current supercomputers. The global architecture is that of a hypercube, and a 128-Node NSC has been designed. New architectural features, such as a reconfigurable many-function ALU pipeline and a multifunction memory-ALU switch, have provided the capability to efficiently implement a wide range of algorithms. Efficient algorithms typically involve numerically intensive tasks, which often include conditional operations. These operations may be efficiently implemented on the NSC without, in general, sacrificing vector-processing speed. To illustrate the architecture, programming, and several of the capabilities of the NSC, the simulation of two-dimensional, nonsteady viscous flows on a prototype Node, called the miniNode, is presented.
Development of an Aerothermoelastic-Acoustics Simulation Capability of Flight Vehicles

NASA Technical Reports Server (NTRS)

Gupta, K. K.; Choi, S. B.; Ibrahim, A.

2010-01-01

A novel numerical, finite element based analysis methodology is presented in this paper suitable for accurate and efficient simulation of practical, complex flight vehicles. An associated computer code, developed in this connection, is also described in some detail. Thermal effects of high speed flow obtained from a heat conduction analysis are incorporated in the modal analysis which in turn affects the unsteady flow arising out of interaction of elastic structures with the air. Numerical examples pertaining to representative problems are given in much detail testifying to the efficacy of the advocated techniques. This is a unique implementation of temperature effects in a finite element CFD based multidisciplinary simulation analysis capability involving large scale computations.
bhlight: General Relativistic Radiation Magnetohydrodynamics with Monte Carlo Transport

DOE PAGES

Ryan, Benjamin R; Dolence, Joshua C.; Gammie, Charles F.

2015-06-25

We present bhlight, a numerical scheme for solving the equations of general relativistic radiation magnetohydrodynamics using a direct Monte Carlo solution of the frequency-dependent radiative transport equation. bhlight is designed to evolve black hole accretion flows at intermediate accretion rate, in the regime between the classical radiatively efficient disk and the radiatively inefficient accretion flow (RIAF), in which global radiative effects play a sub-dominant but non-negligible role in disk dynamics. We describe the governing equations, numerical method, idiosyncrasies of our implementation, and a suite of test and convergence results. We also describe example applications to radiative Bondi accretion and to a slowly accreting Kerr black hole in axisymmetry.
Computational fluid dynamics uses in fluid dynamics/aerodynamics education

NASA Technical Reports Server (NTRS)

Holst, Terry L.

1994-01-01

The field of computational fluid dynamics (CFD) has advanced to the point where it can now be used for the purpose of fluid dynamics physics education. Because of the tremendous wealth of information available from numerical simulation, certain fundamental concepts can be efficiently communicated using an interactive graphical interrogation of the appropriate numerical simulation data base. In other situations, a large amount of aerodynamic information can be communicated to the student by interactive use of simple CFD tools on a workstation or even in a personal computer environment. The emphasis in this presentation is to discuss ideas for how this process might be implemented. Specific examples, taken from previous publications, will be used to highlight the presentation.
Simple numerical method for predicting steady compressible flows

NASA Technical Reports Server (NTRS)

Vonlavante, Ernst; Nelson, N. Duane

1986-01-01

A numerical method for solving the isenthalpic form of the governing equations for compressible viscous and inviscid flows was developed. The method was based on the concept of flux vector splitting in its implicit form. The method was tested on several demanding inviscid and viscous configurations. Two different forms of the implicit operator were investigated. The time marching to steady state was accelerated by the implementation of the multigrid procedure. Its various forms very effectively increased the rate of convergence of the present scheme. High quality steady state results were obtained in most of the test cases; these required only short computational times due to the relative efficiency of the basic method.
Two-Level Hierarchical FEM Method for Modeling Passive Microwave Devices

NASA Astrophysics Data System (ADS)

Polstyanko, Sergey V.; Lee, Jin-Fa

1998-03-01

In recent years multigrid methods have been proven to be very efficient for solving large systems of linear equations resulting from the discretization of positive definite differential equations by either the finite difference method or the h-version of the finite element method. In this paper an iterative method of the multiple level type is proposed for solving systems of algebraic equations which arise from the p-version of the finite element analysis applied to indefinite problems. A two-level V-cycle algorithm has been implemented and studied with a Gauss-Seidel iterative scheme used as a smoother. The convergence of the method has been investigated, and numerical results for a number of numerical examples are presented.
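The structure of a two-level V-cycle with a Gauss-Seidel smoother can be sketched in Python as follows. Note the substitution: the paper builds its levels from p-version hierarchical bases for indefinite problems, whereas this sketch uses the simpler geometric (h-version) setting of the 1D Poisson problem -u'' = f with zero boundary values, discretized by finite differences. The function names are hypothetical, and the coarse problem is "solved" by many Gauss-Seidel sweeps rather than by a direct solver.

```python
def gauss_seidel(u, f, h, sweeps):
    """In-place Gauss-Seidel sweeps for (2u_i - u_{i-1} - u_{i+1}) / h^2 = f_i."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            u[i] = 0.5 * (left + right + h * h * f[i])

def residual(u, f, h):
    """Residual r = f - A u of the discrete Poisson operator."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(f[i] - (2.0 * u[i] - left - right) / (h * h))
    return r

def two_grid_cycle(u, f, h, pre=2, post=2):
    """One two-level V-cycle, updating u in place; len(u) must be odd."""
    gauss_seidel(u, f, h, pre)                      # pre-smoothing
    r = residual(u, f, h)
    nc = (len(u) - 1) // 2                          # number of coarse unknowns
    # Full-weighting restriction: coarse point j sits at fine index 2j + 1.
    rc = [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
          for j in range(nc)]
    ec = [0.0] * nc
    gauss_seidel(ec, rc, 2.0 * h, 200)              # approximate coarse solve
    # Linear interpolation of the coarse correction back to the fine grid.
    for j in range(nc):
        u[2 * j + 1] += ec[j]
    for j in range(nc + 1):
        left = ec[j - 1] if j > 0 else 0.0
        right = ec[j] if j < nc else 0.0
        u[2 * j] += 0.5 * (left + right)
    gauss_seidel(u, f, h, post)                     # post-smoothing
```

The smoother damps the oscillatory error components cheaply, and the coarse-level correction removes the smooth components that Gauss-Seidel alone reduces only slowly.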
Computational methods for reactive transport modeling: A Gibbs energy minimization approach for multiphase equilibrium calculations

NASA Astrophysics Data System (ADS)

Leal, Allan M. M.; Kulik, Dmitrii A.; Kosakowski, Georg

2016-02-01

We present a numerical method for multiphase chemical equilibrium calculations based on a Gibbs energy minimization approach. The method can accurately and efficiently determine the stable phase assemblage at equilibrium independently of the type of phases and species that constitute the chemical system. We have successfully applied our chemical equilibrium algorithm in reactive transport simulations to demonstrate its effective use in computationally intensive applications. We used FEniCS to solve the governing partial differential equations of mass transport in porous media using finite element methods in unstructured meshes. Our equilibrium calculations were benchmarked with GEMS3K, the numerical kernel of the geochemical package GEMS. This allowed us to compare our results with a well-established Gibbs energy minimization algorithm, as well as their performance on every mesh node, at every time step of the transport simulation. The benchmark shows that our novel chemical equilibrium algorithm is accurate, robust, and efficient for reactive transport applications, and it is an improvement over the Gibbs energy minimization algorithm used in GEMS3K. The proposed chemical equilibrium method has been implemented in Reaktoro, a unified framework for modeling chemically reactive systems, which is now used as an alternative numerical kernel of GEMS.
Computing Generalized Matrix Inverse on Spiking Neural Substrate.

PubMed

Shukla, Rohit; Khoram, Soroosh; Jorgensen, Erik; Li, Jing; Lipasti, Mikko; Wright, Stephen

2018-01-01

Emerging neural hardware substrates, such as IBM's TrueNorth Neurosynaptic System, can provide an appealing platform for deploying numerical algorithms. For example, a recurrent Hopfield neural network can be used to find the Moore-Penrose generalized inverse of a matrix, thus enabling a broad class of linear optimizations to be solved efficiently, at low energy cost. However, deploying numerical algorithms on hardware platforms that severely limit the range and precision of representation for numeric quantities can be quite challenging. This paper discusses these challenges and proposes a rigorous mathematical framework for reasoning about range and precision on such substrates. The paper derives techniques for normalizing inputs and properly quantizing synaptic weights originating from arbitrary systems of linear equations, so that solvers for those systems can be implemented in a provably correct manner on hardware-constrained neural substrates. The analytical model is empirically validated on the IBM TrueNorth platform, and results show that the guarantees provided by the framework for range and precision hold under experimental conditions. Experiments with optical flow demonstrate the energy benefits of deploying a reduced-precision and energy-efficient generalized matrix inverse engine on the IBM TrueNorth platform, reflecting 10× to 100× improvement over FPGA and ARM core baselines.
BeamDyn: a high-fidelity wind turbine blade solver in the FAST modular framework

DOE PAGES

Wang, Qi; Sprague, Michael A.; Jonkman, Jason; ...

2017-03-14

This paper presents a numerical implementation of the geometrically exact beam theory based on the Legendre-spectral-finite-element (LSFE) method. The displacement-based geometrically exact beam theory is presented, and the special treatment of three-dimensional rotation parameters is reviewed. An LSFE is a high-order finite element with nodes located at the Gauss-Legendre-Lobatto points. These elements can be an order of magnitude more computationally efficient than low-order finite elements for a given accuracy level. The new module, BeamDyn, is implemented in the FAST modularization framework for dynamic simulation of highly flexible composite-material wind turbine blades within the FAST aeroelastic engineering model. The framework allows for fully interactive simulations of turbine blades in operating conditions. Numerical examples are provided to validate BeamDyn and examine the LSFE performance as well as the coupling algorithm in the FAST modularization framework. BeamDyn can also be used as a stand-alone high-fidelity beam tool.
XGC developments for a more efficient XGC-GENE code coupling

NASA Astrophysics Data System (ADS)

Dominski, Julien; Hager, Robert; Ku, Seung-Hoe; Chang, Cs

2017-10-01

In the Exascale Computing Program, the High-Fidelity Whole Device Modeling project initially aims at delivering a tightly-coupled simulation of plasma neoclassical and turbulence dynamics from the core to the edge of the tokamak. To permit such simulations, the gyrokinetic codes GENE and XGC will be coupled together. Numerical efforts are made to improve the numerical schemes agreement in the coupling region. One of the difficulties of coupling those codes together is the incompatibility of their grids. GENE is a continuum grid-based code and XGC is a Particle-In-Cell code using unstructured triangular mesh. A field-aligned filter is thus implemented in XGC. Even if XGC originally had an approximately field-following mesh, this field-aligned filter permits to have a perturbation discretization closer to the one solved in the field-aligned code GENE. Additionally, new XGC gyro-averaging matrices are implemented on a velocity grid adapted to the plasma properties, thus ensuring same accuracy from the core to the edge regions.
BeamDyn: a high-fidelity wind turbine blade solver in the FAST modular framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Qi; Sprague, Michael A.; Jonkman, Jason

This paper presents a numerical implementation of the geometrically exact beam theory based on the Legendre-spectral-finite-element (LSFE) method. The displacement-based geometrically exact beam theory is presented, and the special treatment of three-dimensional rotation parameters is reviewed. An LSFE is a high-order finite element with nodes located at the Gauss-Legendre-Lobatto points. These elements can be an order of magnitude more computationally efficient than low-order finite elements for a given accuracy level. The new module, BeamDyn, is implemented in the FAST modularization framework for dynamic simulation of highly flexible composite-material wind turbine blades within the FAST aeroelastic engineering model. The framework allows for fully interactive simulations of turbine blades in operating conditions. Numerical examples are provided to validate BeamDyn and examine the LSFE performance as well as the coupling algorithm in the FAST modularization framework. BeamDyn can also be used as a stand-alone high-fidelity beam tool.
High performance volume-of-intersection projectors for 3D-PET image reconstruction based on polar symmetries and SIMD vectorisation.

PubMed

Scheins, J J; Vahedipour, K; Pietrzyk, U; Shah, N J

2015-12-21

For high-resolution, iterative 3D PET image reconstruction the efficient implementation of forward-backward projectors is essential to minimise the calculation time. Mathematically, the projectors are summarised as a system response matrix (SRM) whose elements define the contribution of image voxels to lines-of-response (LORs). In fact, the SRM easily comprises billions of non-zero matrix elements to evaluate the tremendous number of LORs as provided by state-of-the-art PET scanners. Hence, the performance of iterative algorithms, e.g. maximum-likelihood-expectation-maximisation (MLEM), suffers from severe computational problems due to the intensive memory access and huge number of floating point operations. Here, symmetries occupy a key role in terms of efficient implementation. They reduce the amount of independent SRM elements, thus allowing for a significant matrix compression according to the number of exploitable symmetries. With our previous work, the PET REconstruction Software TOolkit (PRESTO), very high compression factors (>300) are demonstrated by using specific non-Cartesian voxel patterns involving discrete polar symmetries. In this way, a pre-calculated memory-resident SRM using complex volume-of-intersection calculations can be achieved. However, our original ray-driven implementation suffers from addressing voxels, projection data and SRM elements in disfavoured memory access patterns. As a consequence, a rather limited numerical throughput is observed due to the massive waste of memory bandwidth and inefficient usage of cache respectively. In this work, an advantageous symmetry-driven evaluation of the forward-backward projectors is proposed to overcome these inefficiencies. The polar symmetries applied in PRESTO suggest a novel organisation of image data and LOR projection data in memory to enable an efficient single instruction multiple data vectorisation, i.e. simultaneous use of any SRM element for symmetric LORs. 
In addition, the calculation time is further reduced by using simultaneous multi-threading (SMT). A global speedup factor of 11 without SMT and above 100 with SMT has been achieved for the improved CPU-based implementation while obtaining equivalent numerical results.
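The MLEM iteration that these forward-backward projectors serve can be written down in a few lines. The following Python sketch uses a small dense system response matrix stored as nested lists, which is an assumption made purely for illustration; it does not reflect the compressed, symmetry-driven SRM storage or the SIMD vectorisation described in the abstract, and the function name is hypothetical.

```python
def mlem(A, y, n_iter):
    """Maximum-likelihood expectation-maximisation for y ~ Poisson(A x).

    A[i][j] is the contribution of voxel j to line-of-response i,
    y[i] is the measured count on line-of-response i.
    """
    n_rows, n_cols = len(A), len(A[0])
    # Sensitivity image: column sums of the system response matrix.
    sens = [sum(A[i][j] for i in range(n_rows)) for j in range(n_cols)]
    x = [1.0] * n_cols  # strictly positive start image
    for _ in range(n_iter):
        # Forward projection: expected counts for the current image.
        proj = [sum(A[i][j] * x[j] for j in range(n_cols)) for i in range(n_rows)]
        # Ratio of measured to expected counts (guarding empty projections).
        ratio = [y[i] / proj[i] if proj[i] > 0.0 else 0.0 for i in range(n_rows)]
        # Back projection of the ratios, then multiplicative update.
        back = [sum(A[i][j] * ratio[i] for i in range(n_rows)) for j in range(n_cols)]
        x = [x[j] * back[j] / sens[j] for j in range(n_cols)]
    return x
```

Each iteration costs one forward and one backward projection, which is why the memory access pattern of the SRM dominates the run time for realistic scanner geometries.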
High performance volume-of-intersection projectors for 3D-PET image reconstruction based on polar symmetries and SIMD vectorisation

NASA Astrophysics Data System (ADS)

Scheins, J. J.; Vahedipour, K.; Pietrzyk, U.; Shah, N. J.

2015-12-01

For high-resolution, iterative 3D PET image reconstruction the efficient implementation of forward-backward projectors is essential to minimise the calculation time. Mathematically, the projectors are summarised as a system response matrix (SRM) whose elements define the contribution of image voxels to lines-of-response (LORs). In fact, the SRM easily comprises billions of non-zero matrix elements to evaluate the tremendous number of LORs as provided by state-of-the-art PET scanners. Hence, the performance of iterative algorithms, e.g. maximum-likelihood-expectation-maximisation (MLEM), suffers from severe computational problems due to the intensive memory access and huge number of floating point operations. Here, symmetries occupy a key role in terms of efficient implementation. They reduce the amount of independent SRM elements, thus allowing for a significant matrix compression according to the number of exploitable symmetries. With our previous work, the PET REconstruction Software TOolkit (PRESTO), very high compression factors (>300) are demonstrated by using specific non-Cartesian voxel patterns involving discrete polar symmetries. In this way, a pre-calculated memory-resident SRM using complex volume-of-intersection calculations can be achieved. However, our original ray-driven implementation suffers from addressing voxels, projection data and SRM elements in disfavoured memory access patterns. As a consequence, a rather limited numerical throughput is observed due to the massive waste of memory bandwidth and inefficient usage of cache respectively. In this work, an advantageous symmetry-driven evaluation of the forward-backward projectors is proposed to overcome these inefficiencies. The polar symmetries applied in PRESTO suggest a novel organisation of image data and LOR projection data in memory to enable an efficient single instruction multiple data vectorisation, i.e. simultaneous use of any SRM element for symmetric LORs. 
In addition, the calculation time is further reduced by using simultaneous multi-threading (SMT). A global speedup factor of 11 without SMT and above 100 with SMT has been achieved for the improved CPU-based implementation while obtaining equivalent numerical results.
Multilevel Optimization Framework for Hierarchical Stiffened Shells Accelerated by Adaptive Equivalent Strategy

NASA Astrophysics Data System (ADS)

Wang, Bo; Tian, Kuo; Zhao, Haixin; Hao, Peng; Zhu, Tianyu; Zhang, Ke; Ma, Yunlong

2017-06-01

In order to improve the post-buckling optimization efficiency of hierarchical stiffened shells, a multilevel optimization framework accelerated by an adaptive equivalent strategy is presented in this paper. Firstly, the Numerical-based Smeared Stiffener Method (NSSM) for hierarchical stiffened shells is derived by means of the numerical implementation of asymptotic homogenization (NIAH) method. Based on the NSSM, a reasonable adaptive equivalent strategy for hierarchical stiffened shells is developed from the concept of hierarchy reduction. Its core idea is to self-adaptively decide which hierarchy of the structure should be equivalent according to the critical buckling mode rapidly predicted by NSSM. Compared with the detailed model, the high prediction accuracy and efficiency of the proposed model are highlighted. On the basis of this adaptive equivalent model, a multilevel optimization framework is then established by decomposing the complex entire optimization process into major-stiffener-level and minor-stiffener-level sub-optimizations, during which Fixed Point Iteration (FPI) is employed to accelerate convergence. Finally, illustrative examples of the multilevel framework are carried out to demonstrate its efficiency and effectiveness in searching for the global optimum, in contrast with the single-level optimization method. Remarkably, the high efficiency and flexibility of the adaptive equivalent strategy are demonstrated by comparison with the single equivalent strategy.
A parallel computing engine for a class of time critical processes.

PubMed

Nabhan, T M; Zomaya, A Y

1997-01-01

This paper focuses on the efficient parallel implementation of systems of numerically intensive nature over loosely coupled multiprocessor architectures. These analytical models are of significant importance to many real-time systems that have to meet severe time constraints. A parallel computing engine (PCE) has been developed in this work for the efficient simplification and the near optimal scheduling of numerical models over the different cooperating processors of the parallel computer. First, the analytical system is efficiently coded in its general form. The model is then simplified by using any available information (e.g., constant parameters). A task graph representing the interconnections among the different components (or equations) is generated. The graph can then be compressed to control the computation/communication requirements. The task scheduler employs a graph-based iterative scheme, based on the simulated annealing algorithm, to map the vertices of the task graph onto a Multiple-Instruction-stream Multiple-Data-stream (MIMD) type of architecture. The algorithm uses a nonanalytical cost function that properly considers the computation capability of the processors, the network topology, the communication time, and congestion possibilities. Moreover, the proposed technique is simple, flexible, and computationally viable. The efficiency of the algorithm is demonstrated by two case studies with good results.
Iterative methods for 3D implicit finite-difference migration using the complex Padé approximation

NASA Astrophysics Data System (ADS)

Costa, Carlos A. N.; Campos, Itamara S.; Costa, Jessé C.; Neto, Francisco A.; Schleicher, Jörg; Novais, Amélia

2013-08-01

Conventional implementations of 3D finite-difference (FD) migration use splitting techniques to accelerate performance and save computational cost. However, such techniques are plagued with numerical anisotropy that jeopardises the correct positioning of dipping reflectors in the directions not used for the operator splitting. We implement 3D downward continuation FD migration without splitting using a complex Padé approximation. In this way, the numerical anisotropy is eliminated at the expense of a computationally more intensive solution of a large-band linear system. We compare the performance of the iterative stabilized biconjugate gradient (BICGSTAB) and that of the multifrontal massively parallel direct solver (MUMPS). It turns out that the use of the complex Padé approximation not only stabilizes the solution, but also acts as an effective preconditioner for the BICGSTAB algorithm, reducing the number of iterations as compared to the implementation using the real Padé expansion. As a consequence, the iterative BICGSTAB method is more efficient than the direct MUMPS method when solving a single term in the Padé expansion. The results of both algorithms, here evaluated by computing the migration impulse response in the SEG/EAGE salt model, are of comparable quality.
Triangular covariance factorizations for. Ph.D. Thesis. - Calif. Univ.

NASA Technical Reports Server (NTRS)

Thornton, C. L.

1976-01-01

An improved computational form of the discrete Kalman filter is derived using an upper triangular factorization of the error covariance matrix. The covariance P is factored such that P = UDU^T where U is unit upper triangular and D is diagonal. Recursions are developed for propagating the U-D covariance factors together with the corresponding state estimate. The resulting algorithm, referred to as the U-D filter, combines the superior numerical precision of square root filtering techniques with an efficiency comparable to that of Kalman's original formula. Moreover, this method is easily implemented and involves no more computer storage than the Kalman algorithm. These characteristics make the U-D method an attractive realtime filtering technique. A new covariance error analysis technique is obtained from an extension of the U-D filter equations. This evaluation method is flexible and efficient and may provide significantly improved numerical results. Cost comparisons show that for a large class of problems the U-D evaluation algorithm is noticeably less expensive than conventional error analysis methods.
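The factorization itself, P = U D U^T with U unit upper triangular and D diagonal, can be computed column by column starting from the last column. The Python sketch below shows only this initial factorization for a symmetric positive definite matrix; it does not include the measurement-update and time-propagation recursions of the U-D filter that the thesis develops, and the function names are hypothetical.

```python
def ud_factorize(P):
    """Factor a symmetric positive definite P as U D U^T.

    Returns (U, D) with U unit upper triangular (nested lists) and D the
    list of diagonal entries. Columns are processed from last to first,
    because entry (i, j) depends on the columns k > j.
    """
    n = len(P)
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n - 1, -1, -1):
        D[j] = P[j][j] - sum(D[k] * U[j][k] ** 2 for k in range(j + 1, n))
        for i in range(j):
            s = sum(D[k] * U[i][k] * U[j][k] for k in range(j + 1, n))
            U[i][j] = (P[i][j] - s) / D[j]
    return U, D

def ud_reconstruct(U, D):
    """Rebuild P = U D U^T from its factors (useful as a consistency check)."""
    n = len(D)
    return [[sum(U[i][k] * D[k] * U[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Because the factors are propagated instead of P itself, no square roots are needed, and D staying positive is what keeps the implied covariance positive definite.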
Adaptive θ-methods for pricing American options

NASA Astrophysics Data System (ADS)

Khaliq, Abdul Q. M.; Voss, David A.; Kazmi, Kamran

2008-12-01

We develop adaptive θ-methods for solving the Black-Scholes PDE for American options. By adding a small, continuous term, the Black-Scholes PDE becomes an advection-diffusion-reaction equation on a fixed spatial domain. Standard implementation of θ-methods would require a Newton-type iterative procedure at each time step thereby increasing the computational complexity of the methods. Our linearly implicit approach avoids such complications. We establish a general framework under which θ-methods satisfy a discrete version of the positivity constraint characteristic of American options, and numerically demonstrate the sensitivity of the constraint. The positivity results are established for the single-asset and independent two-asset models. In addition, we have incorporated and analyzed an adaptive time-step control strategy to increase the computational efficiency. Numerical experiments are presented for one- and two-asset American options, using adaptive exponential splitting for two-asset problems. The approach is compared with an iterative solution of the two-asset problem in terms of computational efficiency.
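The basic theta time-stepping family behind this work can be illustrated on the scalar linear model problem y' = a*y. This Python sketch is only that model problem: it is not the authors' linearly implicit scheme, it contains no Black-Scholes operator, penalty term, or adaptive step control, and the function names are hypothetical. The parameter theta = 0 gives explicit Euler, theta = 1 backward Euler, and theta = 0.5 Crank-Nicolson.

```python
def theta_step(y, a, dt, theta):
    """One step of y_new = y + dt * ((1 - theta) * a * y + theta * a * y_new)."""
    return y * (1.0 + (1.0 - theta) * a * dt) / (1.0 - theta * a * dt)

def theta_integrate(y0, a, t_end, n_steps, theta):
    """Integrate y' = a*y from 0 to t_end with n_steps equal theta-method steps."""
    dt = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = theta_step(y, a, dt, theta)
    return y
```

Even on this model problem the choice of theta and step size decides whether positivity is preserved: with a = -100 and dt = 1, a backward Euler step keeps a positive value positive, while an explicit Euler step flips its sign.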
Enhancement of the Open National Combustion Code (OpenNCC) and Initial Simulation of Energy Efficient Engine Combustor

NASA Technical Reports Server (NTRS)

Miki, Kenji; Moder, Jeff; Liou, Meng-Sing

2016-01-01

In this paper, we present the recent enhancement of the Open National Combustion Code (OpenNCC) and apply the OpenNCC to model a realistic combustor configuration (Energy Efficient Engine (E3)). First, we perform a series of validation tests for the newly-implemented advection upstream splitting method (AUSM) and the extended version of the AUSM-family schemes (AUSM+-up). Compared with the analytical/experimental data of the validation tests, we achieved good agreement. In the steady-state E3 cold flow results using the Reynolds-averaged Navier-Stokes (RANS), we find a noticeable difference in the flow fields calculated by the two different numerical schemes, the standard Jameson-Schmidt-Turkel (JST) scheme and the AUSM scheme. The main differences are that the AUSM scheme is less numerically dissipative and it predicts much stronger reverse flow in the recirculation zone. This study indicates that the two schemes could show different flame-holding predictions and overall flame structures.
Renormalization group approach to symmetry protected topological phases

NASA Astrophysics Data System (ADS)

van Nieuwenburg, Evert P. L.; Schnyder, Andreas P.; Chen, Wei

2018-04-01

A defining feature of a symmetry protected topological phase (SPT) in one dimension is the degeneracy of the Schmidt values for any given bipartition. For the system to go through a topological phase transition separating two SPTs, the Schmidt values must either split or cross at the critical point in order to change their degeneracies. A renormalization group (RG) approach based on this splitting or crossing is proposed, through which we obtain an RG flow that identifies the topological phase transitions in the parameter space. Our approach can be implemented numerically in an efficient manner, for example, using the matrix product state formalism, since only the first few largest Schmidt values need to be calculated with sufficient accuracy. Using several concrete models, we demonstrate that the critical points and fixed points of the RG flow coincide with the maxima and minima of the entanglement entropy, respectively, and the method can serve as a numerically efficient tool to analyze interacting SPTs in the parameter space.
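The two quantities this abstract relies on, the degeneracy of the leading Schmidt values and the entanglement entropy, are simple functions of the Schmidt spectrum. The Python sketch below assumes the Schmidt values of a bipartition are already available (for instance from a matrix product state calculation, which is not shown) and uses hypothetical function names; it is not the authors' RG procedure.

```python
import math

def entanglement_entropy(schmidt_values):
    """Von Neumann entropy S = -sum_i p_i ln p_i with p_i = s_i^2 / sum_j s_j^2."""
    norm = sum(s * s for s in schmidt_values)
    entropy = 0.0
    for s in schmidt_values:
        p = s * s / norm
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy

def schmidt_gap(schmidt_values):
    """Difference between the two largest Schmidt values; zero means degeneracy."""
    top = sorted(schmidt_values, reverse=True)[:2]
    return top[0] - top[1]
```

A doubly degenerate leading pair, such as two Schmidt values of 1/sqrt(2), gives a vanishing gap and an entropy of ln 2, whereas a product state with a single Schmidt value has zero entropy.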

Photon Throughput Calculations for a Spherical Crystal Spectrometer

NASA Astrophysics Data System (ADS)

Gilman, C. J.; Bitter, M.; Delgado-Aparicio, L.; Efthimion, P. C.; Hill, K.; Kraus, B.; Gao, L.; Pablant, N.

2017-10-01

X-ray imaging crystal spectrometers of the type described in Refs. have become a standard diagnostic for Doppler measurements of profiles of the ion temperature and the plasma flow velocities in magnetically confined, hot fusion plasmas. These instruments have by now been implemented on major tokamak and stellarator experiments in Korea, China, Japan, and Germany and are currently also being designed by PPPL for ITER. A still missing part in the present data analysis is an efficient code for photon throughput calculations to evaluate the chord-integrated spectral data. The existing ray tracing codes cannot be used for a data analysis between shots, since they require extensive and time consuming numerical calculations. Here, we present a detailed analysis of the geometrical properties of the ray pattern. This method allows us to minimize the extent of numerical calculations and to create a more efficient code. This work was performed under the auspices of the U.S. Department of Energy by Princeton Plasma Physics Laboratory under contract DE-AC02-09CH11466.
Structure-preserving and rank-revealing QR-factorizations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bischof, C.H.; Hansen, P.C.

1991-11-01

The rank-revealing QR-factorization (RRQR-factorization) is a special QR-factorization that is guaranteed to reveal the numerical rank of the matrix under consideration. This makes the RRQR-factorization a useful tool in the numerical treatment of many rank-deficient problems in numerical linear algebra. In this paper, a framework is presented for the efficient implementation of RRQR algorithms, in particular, for sparse matrices. A sparse RRQR-algorithm should seek to preserve the structure and sparsity of the matrix as much as possible while retaining the ability to capture safely the numerical rank. To this end, the paper proposes to compute an initial QR-factorization using a restricted pivoting strategy guarded by incremental condition estimation (ICE), and then applies the algorithm suggested by Chan and Foster to this QR-factorization. The column exchange strategy used in the initial QR factorization will exploit the fact that certain column exchanges do not change the sparsity structure, and compute a sparse QR-factorization that is a good approximation of the sought-after RRQR-factorization. Due to quantities produced by ICE, the Chan/Foster RRQR algorithm can be implemented very cheaply, thus verifying that the sought-after RRQR-factorization has indeed been computed. Experimental results on a model problem show that the initial QR-factorization is indeed very likely to produce RRQR-factorization.
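To make the idea of reading a numerical rank off a QR-factorization concrete, here is a minimal dense sketch using Gram-Schmidt with ordinary column pivoting. It is not the restricted, sparsity-preserving pivoting with ICE described above, nor the Chan/Foster algorithm; plain column pivoting usually works in practice but is not guaranteed to reveal the rank. The function name and the tolerance are assumptions.

```python
import math

def pivoted_qr_rank(a, tol=1e-10):
    """Estimate the numerical rank of a (given as a list of rows) from the
    diagonal of R in a column-pivoted Gram-Schmidt QR. Returns (rank, diag)."""
    m, n = len(a), len(a[0])
    cols = [[a[i][j] for i in range(m)] for j in range(n)]  # work column-wise
    diag = []
    for k in range(min(m, n)):
        # Pivot: move the remaining column with the largest norm to position k.
        norms = [math.sqrt(sum(v * v for v in c)) for c in cols[k:]]
        p = k + max(range(len(norms)), key=norms.__getitem__)
        cols[k], cols[p] = cols[p], cols[k]
        r_kk = norms[p - k]
        diag.append(r_kk)
        if r_kk == 0.0:
            break
        q = [v / r_kk for v in cols[k]]
        # Orthogonalize the remaining columns against the new basis vector q.
        for j in range(k + 1, n):
            proj = sum(qi * cj for qi, cj in zip(q, cols[j]))
            cols[j] = [cj - proj * qi for cj, qi in zip(cols[j], q)]
    if not diag or diag[0] == 0.0:
        return 0, diag
    rank = sum(1 for d in diag if d > tol * diag[0])
    return rank, diag

# Third column equals the sum of the first two, so the rank is 2.
rank_deficient = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

The rank is taken as the number of diagonal entries of R that stay above a relative threshold; a sparse implementation would additionally restrict which columns may be exchanged, as the abstract describes.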
CORDIC-based digital signal processing (DSP) element for adaptive signal processing

NASA Astrophysics Data System (ADS)

Bolstad, Gregory D.; Neeld, Kenneth B.

1995-04-01

The High Performance Adaptive Weight Computation (HAWC) processing element is a CORDIC based application specific DSP element that, when connected in a linear array, can perform extremely high throughput (100s of GFLOPS) matrix arithmetic operations on linear systems of equations in real time. In particular, it very efficiently performs the numerically intense computation of optimal least squares solutions for large, over-determined linear systems. Most techniques for computing solutions to these types of problems have used either a hard-wired, non-programmable systolic array approach, or more commonly, programmable DSP or microprocessor approaches. The custom logic methods can be efficient, but are generally inflexible. Approaches using multiple programmable generic DSP devices are very flexible, but suffer from poor efficiency and high computation latencies, primarily due to the large number of DSP devices that must be utilized to achieve the necessary arithmetic throughput. The HAWC processor is implemented as a highly optimized systolic array, yet retains some of the flexibility of a programmable data-flow system, allowing efficient implementation of algorithm variations. This provides flexible matrix processing capabilities that are one to three orders of magnitude less expensive and more dense than the current state of the art, and more importantly, allows a realizable solution to matrix processing problems that were previously considered impractical to physically implement. HAWC has direct applications in RADAR, SONAR, communications, and image processing, as well as in many other types of systems.
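For readers unfamiliar with CORDIC, the sketch below shows the basic rotation-mode iteration, which rotates a vector using only additions and scalings by powers of two. It is written in floating point for clarity, whereas hardware would use fixed-point shifts; how the HAWC element actually applies CORDIC to least-squares computations is not described in the abstract, so this is only an illustration of the underlying primitive, and the function name and iteration count are assumptions.

```python
import math

def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC: returns (sin(theta), cos(theta)) for |theta| <= pi/2."""
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector; compensate the total gain up front.
    gain = 1.0
    for i in range(iterations):
        gain *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0      # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x

sin_val, cos_val = cordic_sin_cos(math.pi / 6)
```

Each iteration adds roughly one bit of accuracy, which is why the method maps well onto simple pipelined hardware.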
Continuum topology optimization considering uncertainties in load locations based on the cloud model

NASA Astrophysics Data System (ADS)

Liu, Jie; Wen, Guilin

2018-06-01

Few researchers have paid attention to designing structures in consideration of uncertainties in the loading locations, which may significantly influence the structural performance. In this work, cloud models are employed to depict the uncertainties in the loading locations. A robust algorithm is developed in the context of minimizing the expectation of the structural compliance, while conforming to a material volume constraint. To guarantee optimal solutions, sufficient cloud drops are used, which in turn leads to low efficiency. An innovative strategy is then implemented to enormously improve the computational efficiency. A modified soft-kill bi-directional evolutionary structural optimization method using derived sensitivity numbers is used to output the robust novel configurations. Several numerical examples are presented to demonstrate the effectiveness and efficiency of the proposed algorithm.
Army Reserve Comprehensive Water Efficiency Assessments

DOE Office of Scientific and Technical Information (OSTI.GOV)

McMordie Stoughton, Kate; Kearney, Jaime

The Army Reserve has partnered with the Pacific Northwest National Laboratory (PNNL) to develop comprehensive water assessments for numerous Army Reserve Centers in all five regions including the Pacific islands and Puerto Rico, and at Fort Buchanan and Fort Hunter Liggett. The objective of these assessments is to quantify water use at the site, and identify innovative water efficiency projects that can be implemented to help reduce water demand and increase efficiency. Several of these assessments have focused on a strategic plan for achieving net zero water to help meet the Army’s Net Zero Directive. The Army Reserve has also leveraged this approach as part of the energy conservation investment program (ECIP), energy savings performance contracts (ESPCs), and utility energy service contracts (UESCs). This article documents the process involved.
Efficient Load Balancing and Data Remapping for Adaptive Grid Calculations

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Biswas, Rupak

1997-01-01

Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method to dynamically balance the processor workloads with a global view. This paper presents, for the first time, the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculations. Mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. Previous results indicated that mesh repartitioning and data remapping are potential bottlenecks for performing large-scale scientific calculations. We resolve these issues and demonstrate that our framework remains viable on a large number of processors.
A stable high-order perturbation of surfaces method for numerical simulation of diffraction problems in triply layered media

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong, Youngjoon, E-mail: hongy@uic.edu; Nicholls, David P., E-mail: davidn@uic.edu

The accurate numerical simulation of linear waves interacting with periodic layered media is a crucial capability in engineering applications. In this contribution we study the stable and high-order accurate numerical simulation of the interaction of linear, time-harmonic waves with a periodic, triply layered medium with irregular interfaces. In contrast with volumetric approaches, High-Order Perturbation of Surfaces (HOPS) algorithms are inexpensive interfacial methods which rapidly and recursively estimate scattering returns by perturbation of the interface shape. In comparison with Boundary Integral/Element Methods, the stable HOPS algorithm we describe here does not require specialized quadrature rules, periodization strategies, or the solution of dense non-symmetric positive definite linear systems. In addition, the algorithm is provably stable as opposed to other classical HOPS approaches. With numerical experiments we show the remarkable efficiency, fidelity, and accuracy one can achieve with an implementation of this algorithm.
A general spectral method for the numerical simulation of one-dimensional interacting fermions

NASA Astrophysics Data System (ADS)

Clason, Christian; von Winckel, Gregory

2012-08-01

This software implements a general framework for the direct numerical simulation of systems of interacting fermions in one spatial dimension. The approach is based on a specially adapted nodal spectral Galerkin method, where the basis functions are constructed to obey the antisymmetry relations of fermionic wave functions. An efficient Matlab program for the assembly of the stiffness and potential matrices is presented, which exploits the combinatorial structure of the sparsity pattern arising from this discretization to achieve optimal run-time complexity. This program allows the accurate discretization of systems with multiple fermions subject to arbitrary potentials, e.g., for verifying the accuracy of multi-particle approximations such as Hartree-Fock in the few-particle limit. It can be used for eigenvalue computations or numerical solutions of the time-dependent Schrödinger equation. The new version includes a Python implementation of the presented approach.
New version program summary
Program title: assembleFermiMatrix
Catalogue identifier: AEKO_v1_1
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEKO_v1_1.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 332
No. of bytes in distributed program, including test data, etc.: 5418
Distribution format: tar.gz
Programming language: MATLAB/GNU Octave, Python
Computer: Any architecture supported by MATLAB, GNU Octave or Python
Operating system: Any supported by MATLAB, GNU Octave or Python
RAM: Depends on the data
Classification: 4.3, 2.2
External routines: Python 2.7+, NumPy 1.3+, SciPy 0.10+
Catalogue identifier of previous version: AEKO_v1_0
Journal reference of previous version: Comput. Phys. Commun. 183 (2012) 405
Does the new version supersede the previous version?: Yes
Nature of problem: The direct numerical solution of the multi-particle one-dimensional Schrödinger equation in a quantum well is challenging due to the exponential growth in the number of degrees of freedom with increasing particles.
Solution method: A nodal spectral Galerkin scheme is used where the basis functions are constructed to obey the antisymmetry relations of the fermionic wave function. The assembly of these matrices is performed efficiently by exploiting the combinatorial structure of the sparsity patterns.
Reasons for new version: A Python implementation is now included.
Summary of revisions: Added a Python implementation; small documentation fixes in Matlab implementation. No change in features of the package.
Restrictions: Only one-dimensional computational domains with homogeneous Dirichlet or periodic boundary conditions are supported.
Running time: Seconds to minutes.
A Class of High-Resolution Explicit and Implicit Shock-Capturing Methods

NASA Technical Reports Server (NTRS)

Yee, H. C.

1994-01-01

The development of shock-capturing finite difference methods for hyperbolic conservation laws has been a rapidly growing area for the last decade. Many of the fundamental concepts, state-of-the-art developments and applications to fluid dynamics problems can only be found in meeting proceedings, scientific journals and internal reports. This paper attempts to give a unified and generalized formulation of a class of high-resolution, explicit and implicit shock capturing methods, and to illustrate their versatility in various steady and unsteady complex shock waves, perfect gases, equilibrium real gases and nonequilibrium flow computations. These numerical methods are formulated for the purpose of ease and efficient implementation into a practical computer code. The various constructions of high-resolution shock-capturing methods fall nicely into the present framework and a computer code can be implemented with the various methods as separate modules. Included is a systematic overview of the basic design principle of the various related numerical methods. Special emphasis will be on the construction of the basic nonlinear, spatially second and third-order schemes for nonlinear scalar hyperbolic conservation laws and the methods of extending these nonlinear scalar schemes to nonlinear systems via the approximate Riemann solvers and flux-vector splitting approaches. Generalization of these methods to efficiently include real gases and large systems of nonequilibrium flows will be discussed. Some issues concerning the applicability of these methods, which were designed for homogeneous hyperbolic conservation laws, to problems containing stiff source terms and shock waves are also included. The performance of some of these schemes is illustrated by numerical examples for one-, two- and three-dimensional gas-dynamics problems. The use of the Lax-Friedrichs numerical flux to obtain high-resolution shock-capturing schemes is generalized. This method can be extended to nonlinear systems of equations without the use of Riemann solvers or flux-vector splitting approaches and thus provides a large savings for multidimensional, equilibrium real gases and nonequilibrium flow computations.
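The Lax-Friedrichs flux mentioned above can be illustrated with the classical first-order scheme for a scalar conservation law. The sketch below applies it to the inviscid Burgers equation on a periodic grid; it is the basic low-order building block, not the high-resolution generalization the report describes, and the grid size, initial profile, and CFL factor are assumptions chosen for illustration.

```python
import math

def lax_friedrichs_step(u, dt, dx, flux):
    """One first-order Lax-Friedrichs update with periodic boundaries.
    No Riemann solver is needed: only pointwise flux evaluations."""
    n = len(u)
    f = [flux(v) for v in u]
    new_u = []
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        new_u.append(0.5 * (u[left] + u[right])
                     - 0.5 * dt / dx * (f[right] - f[left]))
    return new_u

def burgers_flux(v):
    """Flux f(u) = u^2 / 2 of the inviscid Burgers equation."""
    return 0.5 * v * v

n = 100
dx = 1.0 / n
u = [1.0 + 0.5 * math.sin(2.0 * math.pi * i * dx) for i in range(n)]
dt = 0.4 * dx / max(abs(v) for v in u)  # CFL-limited step, since |f'(u)| = |u|
mass_before = sum(u) * dx
for _ in range(50):
    u = lax_friedrichs_step(u, dt, dx, burgers_flux)
mass_after = sum(u) * dx
```

Because the update is written in conservation form, the discrete total of u is preserved up to rounding, and under the CFL restriction the solution stays within the bounds of the initial data.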
tran-SAS v1.0: a numerical model to compute catchment-scale hydrologic transport using StorAge Selection functions

NASA Astrophysics Data System (ADS)

Benettin, Paolo; Bertuzzo, Enrico

2018-04-01

This paper presents the tran-SAS package, which includes a set of codes to model solute transport and water residence times through a hydrological system. The model is based on a catchment-scale approach that aims at reproducing the integrated response of the system at one of its outlets. The codes are implemented in MATLAB and are meant to be easy to edit, so that users with minimal programming knowledge can adapt them to the desired application. The problem of large-scale solute transport has both theoretical and practical implications. On the one side, the ability to represent the ensemble of water flow trajectories through a heterogeneous system helps unravel streamflow generation processes and allows us to make inferences on plant-water interactions. On the other side, transport models are a practical tool that can be used to estimate the persistence of solutes in the environment. The core of the package is based on the implementation of an age master equation (ME), which is solved using general StorAge Selection (SAS) functions. The age ME is first converted into a set of ordinary differential equations, each addressing the transport of an individual precipitation input through the catchment, and then it is discretized using an explicit numerical scheme. Results show that the implementation is efficient and allows the model to run in short times. The numerical accuracy is critically evaluated and it is shown to be satisfactory in most cases of hydrologic interest. Additionally, a higher-order implementation is provided within the package to evaluate and, if necessary, to improve the numerical accuracy of the results. The codes can be used to model streamflow age and solute concentration, but a number of additional outputs can be obtained by editing the codes to further advance the ability to understand and model catchment transport processes.
Mean-Field Description of Ionic Size Effects with Non-Uniform Ionic Sizes: A Numerical Approach

PubMed Central

Zhou, Shenggao; Wang, Zhongming; Li, Bo

2013-01-01

Ionic size effects are significant in many biological systems. Mean-field descriptions of such effects can be efficient but also challenging. When ionic sizes are different, explicit formulas in such descriptions are not available for the dependence of the ionic concentrations on the electrostatic potential, i.e., there are no explicit Boltzmann-type distributions. This work begins with a variational formulation of the continuum electrostatics of an ionic solution with such non-uniform ionic sizes as well as multiple ionic valences. An augmented Lagrange multiplier method is then developed and implemented to numerically solve the underlying constrained optimization problem. The method is shown to be accurate and efficient, and is applied to ionic systems with non-uniform ionic sizes such as the sodium chloride solution. Extensive numerical tests demonstrate that the mean-field model and numerical method capture qualitatively some significant ionic size effects, particularly those for multivalent ionic solutions, such as the stratification of multivalent counterions near a charged surface. The ionic valence-to-volume ratio is found to be the key physical parameter in the stratification of concentrations. None of these effects is well described by the classical Poisson–Boltzmann theory, or by the generalized Poisson–Boltzmann theory that treats uniform ionic sizes. Finally, various issues such as the close packing, limitation of the continuum model, and generalization of this work to molecular solvation are discussed. PMID:21929014
Research highlights: June 1990 - May 1991

NASA Technical Reports Server (NTRS)

1991-01-01

Linear instability calculations at MSFC have suggested that the Geophysical Fluid Flow Cell (GFFC) should exhibit classic baroclinic instability at accessible parameter settings. Interest was in the mechanisms of transition to temporal chaos and the evolution of spatio-temporal chaos. In order to understand more about such transitions, high resolution numerical experiments for the physically simplest model of two layer baroclinic instability were conducted. This model has the advantage that the numerical code is exponentially convergent and can be efficiently run for very long times, enabling the study of chaotic attractors without the often devastating effects of low-order truncation found in many previous studies. Numerical algorithms for implementing an empirical orthogonal function (EOF) analysis of the high resolution numerical results were completed. Under conditions of rapid rotation and relatively low differential heating, convection in a spherical shell takes place as columnar banana cells wrapped around the annular gap, but with axes oriented along the axis of rotation; these were clearly evident in the GFFC experiments. The results of recent numerical simulations of columnar convection and future research plans are presented.
Efficient computation of the joint sample frequency spectra for multiple populations.

PubMed

Kamm, John A; Terhorst, Jonathan; Song, Yun S

2017-01-01

A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
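As a small illustration of the summary statistic itself, the sketch below tabulates the observed sample frequency spectrum of a single population from a 0/1 allele matrix. It does not compute the expected joint SFS under a demographic model, which is what the paper's algorithms and the momi package address; the function name and the toy data are assumptions.

```python
from collections import Counter

def sample_frequency_spectrum(genotypes):
    """genotypes: list of sites, each a list of 0/1 alleles (1 = mutant allele),
    one entry per sampled sequence. Returns a list whose entry k-1 is the number
    of polymorphic sites carrying exactly k mutant alleles, for k = 1 .. n-1."""
    n = len(genotypes[0])
    counts = Counter(sum(site) for site in genotypes)
    return [counts.get(k, 0) for k in range(1, n)]

# Hypothetical sample of four sequences at five sites.
sites = [
    [0, 1, 0, 0],  # one mutant copy
    [1, 1, 0, 0],  # two mutant copies
    [0, 0, 0, 1],  # one mutant copy
    [1, 1, 1, 0],  # three mutant copies
    [0, 0, 0, 0],  # monomorphic, not counted
]
spectrum = sample_frequency_spectrum(sites)
```

The spectrum reduces an arbitrarily long alignment to n-1 counts, which is the dimensional reduction the abstract refers to; the joint SFS extends this to one axis per population.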
Efficient computation of the joint sample frequency spectra for multiple populations

PubMed Central

Kamm, John A.; Terhorst, Jonathan; Song, Yun S.

2016-01-01

A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity. PMID:28239248
An accurate and efficient acoustic eigensolver based on a fast multipole BEM and a contour integral method

NASA Astrophysics Data System (ADS)

Zheng, Chang-Jun; Gao, Hai-Feng; Du, Lei; Chen, Hai-Bo; Zhang, Chuanzeng

2016-01-01

An accurate numerical solver is developed in this paper for eigenproblems governed by the Helmholtz equation and formulated through the boundary element method. A contour integral method is used to convert the nonlinear eigenproblem into an ordinary eigenproblem, so that eigenvalues can be extracted accurately by solving a set of standard boundary element systems of equations. In order to accelerate the solution procedure, the parameters affecting the accuracy and efficiency of the method are studied and two contour paths are compared. Moreover, a wideband fast multipole method is implemented with a block IDR(s) solver to reduce the overall solution cost of the boundary element systems of equations with multiple right-hand sides. The Burton-Miller formulation is employed to identify the fictitious eigenfrequencies of the interior acoustic problems with multiply connected domains. The actual effect of the Burton-Miller formulation on tackling the fictitious eigenfrequency problem is investigated and the optimal choice of the coupling parameter as α = i/k is confirmed through exterior sphere examples. Furthermore, the numerical eigenvalues obtained by the developed method are compared with the results obtained by the finite element method to show the accuracy and efficiency of the developed method.
Design of an ultraprecision computerized numerical control chemical mechanical polishing machine and its implementation

NASA Astrophysics Data System (ADS)

Zhang, Chupeng; Zhao, Huiying; Zhu, Xueliang; Zhao, Shijie; Jiang, Chunye

2018-01-01

Chemical mechanical polishing (CMP) is a key process in the machining route of plane optics. To improve the polishing efficiency and accuracy, a CMP model and machine tool were developed. Based on the Preston equation and the axial run-out error measurement results of the m circles on the tin plate, a CMP model that could simulate the material removal at any point on the workpiece was presented. An analysis of the model indicated that lower axial run-out error led to lower material removal but better polishing efficiency and accuracy. Based on this conclusion, the CMP machine was designed, and the ultraprecision gas hydrostatic guideway and rotary table as well as the Siemens 840Dsl numerical control system were incorporated in the CMP machine. To verify the design principles of the machine, a series of detection and machining experiments were conducted. The LK-G5000 laser sensor was employed for detecting the straightness error of the gas hydrostatic guideway and the axial run-out error of the gas hydrostatic rotary table. A 300-mm-diameter optic was chosen for the surface profile machining experiments performed to determine the CMP efficiency and accuracy.
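The Preston equation referred to above states that the removal rate is proportional to the contact pressure and the relative velocity, dh/dt = k_p · p · v. A minimal sketch for constant conditions follows; the numerical values are hypothetical, and the way the paper couples the axial run-out measurements into the pressure term is not given in the abstract.

```python
def preston_removal_depth(k_p, pressure, velocity, duration):
    """Removed depth h = k_p * p * v * t for constant pressure and relative speed.
    Units assumed here: k_p in 1/Pa, pressure in Pa, velocity in m/s, duration in s."""
    return k_p * pressure * velocity * duration

# Hypothetical values chosen only to show the arithmetic.
depth_m = preston_removal_depth(k_p=1e-13, pressure=2e4, velocity=0.5, duration=60.0)
depth_nm = depth_m * 1e9
```

In a point-by-point model, the pressure and velocity would vary across the workpiece and over time, and the removed depth would be the time integral of the local product.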
Diffusion approximation-based simulation of stochastic ion channels: which method to use?

PubMed Central

Pezo, Danilo; Soudry, Daniel; Orio, Patricio

2014-01-01

To study the effects of stochastic ion channel fluctuations on neural dynamics, several numerical implementation methods have been proposed. Gillespie's method for Markov Chains (MC) simulation is highly accurate, yet it becomes computationally intensive in the regime of a high number of channels. Many recent works aim to speed simulation time using the Langevin-based Diffusion Approximation (DA). Under this common theoretical approach, each implementation differs in how it handles various numerical difficulties—such as bounding of state variables to [0,1]. Here we review and test a set of the most recently published DA implementations (Goldwyn et al., 2011; Linaro et al., 2011; Dangerfield et al., 2012; Orio and Soudry, 2012; Schmandt and Galán, 2012; Güler, 2013; Huang et al., 2013a), comparing all of them in a set of numerical simulations that assess numerical accuracy and computational efficiency on three different models: (1) the original Hodgkin and Huxley model, (2) a model with faster sodium channels, and (3) a multi-compartmental model inspired by granular cells. We conclude that for a low number of channels (usually below 1000 per simulated compartment) one should use MC—which is the fastest and most accurate method. For a high number of channels, we recommend using the method by Orio and Soudry (2012), possibly combined with the method by Schmandt and Galán (2012) for increased speed and slightly reduced accuracy. Consequently, MC modeling may be the best method for detailed multicompartment neuron models—in which a model neuron with many thousands of channels is segmented into many compartments with a few hundred channels. PMID:25404914
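The contrast between the two approaches can be sketched on the simplest possible case, a population of independent two-state (closed/open) channels. The first function is an exact Gillespie simulation; the second is an Euler-Maruyama diffusion approximation in which the open fraction is simply clipped to [0, 1]. Clipping is only one assumed way of bounding the state and is not claimed to match any of the reviewed implementations; the rates, channel count, and step size are likewise hypothetical.

```python
import math
import random

def gillespie_open_fraction(n_channels, alpha, beta, t_end, rng):
    """Exact Markov-chain simulation of n two-state channels with opening rate
    alpha and closing rate beta. Returns the time-averaged open fraction."""
    t, n_open, area = 0.0, 0, 0.0
    while True:
        rate_open = alpha * (n_channels - n_open)
        rate_close = beta * n_open
        total = rate_open + rate_close
        dt = rng.expovariate(total)          # waiting time to the next transition
        if t + dt >= t_end:
            area += n_open * (t_end - t)
            break
        area += n_open * dt
        t += dt
        if rng.random() * total < rate_open:
            n_open += 1
        else:
            n_open -= 1
    return area / (n_channels * t_end)

def langevin_open_fraction(n_channels, alpha, beta, t_end, dt, rng):
    """Diffusion approximation integrated with Euler-Maruyama; the open
    fraction x is clipped to [0, 1] after every step."""
    x, acc, steps = 0.0, 0.0, int(round(t_end / dt))
    for _ in range(steps):
        drift = alpha * (1.0 - x) - beta * x
        noise_var = (alpha * (1.0 - x) + beta * x) / n_channels
        x += drift * dt + math.sqrt(noise_var * dt) * rng.gauss(0.0, 1.0)
        x = min(1.0, max(0.0, x))
        acc += x
    return acc / steps

mc_fraction = gillespie_open_fraction(500, 1.0, 3.0, 50.0, random.Random(1))
da_fraction = langevin_open_fraction(500, 1.0, 3.0, 50.0, 0.01, random.Random(2))
```

The cost of the Gillespie loop grows with the number of transitions, and hence with the channel count, while the cost of the Langevin loop depends only on the number of time steps; this is the trade-off the abstract's recommendations are about.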
A Cascaded Self-Similar Rat-Race Hybrid Coupler Architecture and its Compact Ka-Band Implementation

DTIC Science & Technology

2017-03-01

real-estate and limit the system-level performance, including bandwidth, gain, and energy-efficiency. These many challenges are positioning passive...and are used in numerous RF/mm-wave systems for radar and wireless communications. Although a Marchand balun covers a large bandwidth, it is...requires multiple λ/4 transmission lines (t-lines), making its on-chip designs very costly even for RF/mm-wave bands. Reported miniaturized rat-race
Efficient XML Interchange (EXI) Compression and Performance Benefits: Development, Implementation and Evaluation

DTIC Science & Technology

2010-03-01

to a graphics card, and not the redesign of XML. The justification is that if XML is going to be prevalent, special optimized hardware is...the answer, similar to the specialized functions of a video card. Given the Moore’s law that processing power doubles every few years, let the...and numerous multimedia players such as iTunes from Apple. These applications are free to use, but the source is restricted by software licenses
Rapid Prediction of Unsteady Three-Dimensional Viscous Flows in Turbopump Geometries

NASA Technical Reports Server (NTRS)

Dorney, Daniel J.

1998-01-01

A program is underway to improve the efficiency of a three-dimensional Navier-Stokes code and generalize it for nozzle and turbopump geometries. Code modifications will include the implementation of parallel processing software, incorporating new physical models and generalizing the multi-block capability to allow the simultaneous simulation of nozzle and turbopump configurations. The current report contains details of code modifications, numerical results of several flow simulations and the status of the parallelization effort.

Hybrid ODE/SSA methods and the cell cycle model

NASA Astrophysics Data System (ADS)

Wang, S.; Chen, M.; Cao, Y.

2017-07-01

Stochastic effects in cellular systems have been an important topic in systems biology. Stochastic modeling and simulation methods are important tools to study stochastic effects. Given the low efficiency of stochastic simulation algorithms, the hybrid method, which combines an ordinary differential equation (ODE) system with a stochastic chemically reacting system, shows its unique advantages in the modeling and simulation of biochemical systems. The efficiency of the hybrid method is usually limited by reactions in the stochastic subsystem, which are modeled and simulated using Gillespie's framework and frequently interrupt the integration of the ODE subsystem. In this paper we develop an efficient implementation approach for the hybrid method coupled with traditional ODE solvers. We also compare the efficiency of hybrid methods with three widely used ODE solvers: RADAU5, DASSL, and DLSODAR. Numerical experiments with three biochemical models are presented. A detailed discussion is presented of the performance of the three ODE solvers.
Parallel Domain Decomposition Formulation and Software for Large-Scale Sparse Symmetrical/Unsymmetrical Aeroacoustic Applications

NASA Technical Reports Server (NTRS)

Nguyen, D. T.; Watson, Willie R. (Technical Monitor)

2005-01-01

The overall objectives of this research work are to formulate and validate efficient parallel algorithms, and to efficiently design and implement computer software for solving large-scale acoustic problems arising from the unified frameworks of finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should take full advantage of the multiple processing capabilities offered by most modern high performance computing platforms for efficient parallel computation. To achieve this objective, the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper preconditioning strategies, unrolling strategies, and effective interprocessor communication schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving a series of structural and acoustic (symmetrical and unsymmetrical) problems on different computing platforms. Comparisons with existing "commercialized" and/or "public domain" software are also included, whenever possible.
Eigenvalue routines in NASTRAN: A comparison with the Block Lanczos method

NASA Technical Reports Server (NTRS)

Tischler, V. A.; Venkayya, Vipperla B.

1993-01-01

The NASA STRuctural ANalysis (NASTRAN) program is one of the most extensively used engineering applications software in the world. It contains a wealth of matrix operations and numerical solution techniques, and they were used to construct efficient eigenvalue routines. The purpose of this paper is to examine the current eigenvalue routines in NASTRAN and to make efficiency comparisons with a more recent implementation of the Block Lanczos algorithm by Boeing Computer Services (BCS). This eigenvalue routine is now available in the BCS mathematics library as well as in several commercial versions of NASTRAN. In addition, CRAY maintains a modified version of this routine on their network. Several example problems, with a varying number of degrees of freedom, were selected primarily for efficiency bench-marking. Accuracy is not an issue, because they all gave comparable results. The Block Lanczos algorithm was found to be extremely efficient, in particular, for very large size problems.
Numerical modelling of emissions of nitrogen oxides in solid fuel combustion.

PubMed

Bešenić, Tibor; Mikulčić, Hrvoje; Vujanović, Milan; Duić, Neven

2018-06-01

Among the combustion products, nitrogen oxides are one of the main contributors to a negative impact on the environment, participating in harmful processes such as tropospheric ozone and acid rain production. The main source of emissions of nitrogen oxides is the human combustion of fossil fuels. Their formation models are investigated and implemented with the goal of obtaining a tool for studying the production of nitrogen-containing pollutants. In this work, numerical simulation of solid fuel combustion was carried out on a three-dimensional model of a drop tube furnace by using the commercial software FIRE. It was used for simulating turbulent fluid flow and temperature field, concentrations of the reactants and products, as well as the fluid-particle interaction by numerically solving the integro-differential equations describing these processes. Chemical reaction mechanisms for the formation of nitrogen oxides were implemented through user functions. To achieve reasonable calculation times for running the simulations, as well as efficient coupling with the turbulent mixing process, the nitrogen scheme is limited to sufficiently few homogeneous reactions and species. Turbulent fluctuations that affect the reaction rates of nitrogen oxides' concentration are modelled by a probability density function approach. Results of the implemented model for nitrogen oxides' formation from coal and biomass are compared to the experimental data. Temperature, burnout and nitrogen oxides' concentration profiles are compared, showing satisfactory agreement. The new model allows the simulation of pollutant formation in real-world applications. Copyright © 2018 Elsevier Ltd. All rights reserved.
A coupled approach for the three-dimensional simulation of pipe leakage in variably saturated soil

NASA Astrophysics Data System (ADS)

Peche, Aaron; Graf, Thomas; Fuchs, Lothar; Neuweiler, Insa

2017-12-01

In urban water pipe networks, pipe leakage may lead to subsurface contamination or to reduced waste water treatment efficiency. The quantification of pipe leakage is challenging due to inaccessibility and unknown hydraulic properties of the soil. A novel physically-based model for three-dimensional numerical simulation of pipe leakage in variably saturated soil is presented. We describe the newly implemented coupling between the pipe flow simulator HYSTEM-EXTRAN and the groundwater flow simulator OpenGeoSys and its validation. We further describe a novel upscaling of leakage using transfer functions derived from numerical simulations. This upscaling enables the simulation of numerous pipe defects with the benefit of reduced computation times. Finally, we investigate the response of leakage to different time-dependent pipe flow events and conclude that larger pipe flow volume and duration lead to larger leakage while the peak position in time has a small effect on leakage.
Novel numerical techniques for magma dynamics

NASA Astrophysics Data System (ADS)

Rhebergen, S.; Katz, R. F.; Wathen, A.; Alisic, L.; Rudge, J. F.; Wells, G.

2013-12-01

We discuss the development of finite element techniques and solvers for magma dynamics computations. These are implemented within the FEniCS framework. This approach allows for user-friendly, expressive, high-level code development, but also provides access to powerful, scalable numerical solvers and a large family of finite element discretisations. With the recent addition of dolfin-adjoint, FEniCS supports automated adjoint and tangent-linear models, enabling the rapid development of Generalised Stability Analysis. The ability to easily scale codes to three dimensions with large meshes, and/or to apply intricate adjoint calculations means that efficiency of the numerical algorithms is vital. We therefore describe our development and analysis of preconditioners designed specifically for finite element discretizations of equations governing magma dynamics. The preconditioners are based on Elman-Silvester-Wathen methods for the Stokes equation, and we extend these to flows with compaction. Our simulations are validated by comparison of results with laboratory experiments on partially molten aggregates.
Accurate complex scaling of three dimensional numerical potentials

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cerioni, Alessandro; Genovese, Luigi; Duchemin, Ivan

2013-05-28

The complex scaling method, which consists in continuing spatial coordinates into the complex plane, is a well-established method for computing resonant eigenfunctions of the time-independent Schroedinger operator. Whenever it is desirable to apply the complex scaling to investigate resonances in physical systems defined on numerical discrete grids, the most direct approach relies on the application of a similarity transformation to the original, unscaled Hamiltonian. We show that such an approach can be conveniently implemented in the Daubechies wavelet basis set, featuring a very promising level of generality, high accuracy, and no need for artificial convergence parameters. Complex scaling of three dimensional numerical potentials can be efficiently and accurately performed. By carrying out an illustrative resonant state computation in the case of a one-dimensional model potential, we then show that our wavelet-based approach may disclose new exciting opportunities in the field of computational non-Hermitian quantum mechanics.
Experimental and numerical study of two dimensional heat and mass transfer in unsaturated soil with an application to soil thermal energy storage (SBTES) systems

NASA Astrophysics Data System (ADS)

Moradi, A.; Smits, K. M.

2014-12-01

A promising energy storage option to compensate for daily and seasonal energy offsets is to inject and store heat generated from renewable energy sources (e.g. solar energy) in the ground, oftentimes referred to as soil borehole thermal energy storage (SBTES). In SBTES modeling efforts, it is widely recognized that the movement of water vapor is closely coupled to thermal processes. However, their mutual interactions are rarely considered in most soil water modeling efforts or in practical applications. The validation of numerical models that are designed to capture these processes is difficult due to the scarcity of experimental data, limiting the testing and refinement of heat and water transfer theories. A common assumption in most SBTES modeling approaches is to consider the soil as a purely conductive medium with constant hydraulic and thermal properties. However, this simplified approach can be improved upon by better understanding the coupled processes at play. Consequently, developing new modeling techniques along with suitable experimental tools to capture more of the complexity of coupled processes is critically important for obtaining the knowledge necessary for the efficient design and implementation of SBTES systems. The goal of this work is to better understand heat and mass transfer processes for SBTES. In this study, we implemented a fully coupled numerical model that solves for heat, liquid water and water vapor flux and allows for non-equilibrium liquid/gas phase change. This model was then used to investigate the influence of different hydraulic and thermal parameterizations on SBTES system efficiency. A two dimensional tank apparatus was used with a series of soil moisture, temperature and soil thermal properties sensors. Four experiments were performed with different test soils. Experimental results provide evidence of thermally induced moisture flow that was also confirmed by numerical results. 
Numerical results showed that, for the test conditions applied here, moisture flow is influenced more by thermal gradients than by hydraulic gradients. The results also demonstrate that convective fluxes are higher than conductive fluxes, indicating that moisture flow contributes more to the overall heat flux than conduction does.
Transient loads analysis for space flight applications

NASA Technical Reports Server (NTRS)

Thampi, S. K.; Vidyasagar, N. S.; Ganesan, N.

1992-01-01

A significant part of the flight readiness verification process involves transient analysis of the coupled Shuttle-payload system to determine the low frequency transient loads. This paper describes a methodology for transient loads analysis and its implementation for the Spacelab Life Sciences Mission. The analysis is carried out using two major software tools - NASTRAN and an external FORTRAN code called EZTRAN. This approach is adopted to overcome some of the limitations of NASTRAN's standard transient analysis capabilities. The method uses Data Recovery Matrices (DRM) to improve computational efficiency. The mode acceleration method is fully implemented in the DRM formulation to recover accurate displacements, stresses, and forces. The advantages of the method are demonstrated through a numerical example.
Fast Implementation of Quantum Phase Gates and Creation of Cluster States via Transitionless Quantum Driving

NASA Astrophysics Data System (ADS)

Zhang, Chun-Ling; Liu, Wen-Wu

2018-05-01

In this paper, combining transitionless quantum driving and quantum Zeno dynamics, we propose an efficient scheme for the fast implementation of a two-qubit quantum phase gate, which can be used to generate cluster states of atoms trapped in distant cavities. The influence of various error sources, including spontaneous emission and photon loss, on the fidelity is analyzed via numerical simulation. The results show that this scheme not only takes less time than the adiabatic scheme but also is insensitive to both error sources. Additionally, the creation of N-atom cluster states is put forward as a typical example of the applications of the phase gates.
Technological Innovation and Developmental Strategies for Sustainable Management of Aquatic Resources in Developing Countries

NASA Astrophysics Data System (ADS)

Agboola, Julius Ibukun

2014-12-01

Sustainable use and allocation of aquatic resources including water resources require implementation of ecologically appropriate technologies, efficient and relevant to local needs. Despite the numerous international agreements and provisions on transfer of technology, this has not been successfully achieved in developing countries. While reviewing some challenges to technological innovations and developments (TID), this paper analyzes five TID strategic approaches centered on grassroots technology development and provision of localized capacity for sustainable aquatic resources management. Three case studies provide examples of successful implementation of these strategies. Success requires the provision of localized capacity to manage technology through knowledge empowerment in rural communities situated within a framework of clear national priorities for technology development.
Technological innovation and developmental strategies for sustainable management of aquatic resources in developing countries.

PubMed

Agboola, Julius Ibukun

2014-12-01

Sustainable use and allocation of aquatic resources including water resources require implementation of ecologically appropriate technologies, efficient and relevant to local needs. Despite the numerous international agreements and provisions on transfer of technology, this has not been successfully achieved in developing countries. While reviewing some challenges to technological innovations and developments (TID), this paper analyzes five TID strategic approaches centered on grassroots technology development and provision of localized capacity for sustainable aquatic resources management. Three case studies provide examples of successful implementation of these strategies. Success requires the provision of localized capacity to manage technology through knowledge empowerment in rural communities situated within a framework of clear national priorities for technology development.
Improving the Numerical Stability of Fast Matrix Multiplication

DOE PAGES

Ballard, Grey; Benson, Austin R.; Druinsky, Alex; ...

2016-10-04

Fast algorithms for matrix multiplication, namely those that perform asymptotically fewer scalar operations than the classical algorithm, have been considered primarily of theoretical interest. Apart from Strassen's original algorithm, few fast algorithms have been efficiently implemented or used in practical applications. However, there exist many practical alternatives to Strassen's algorithm with varying performance and numerical properties. Fast algorithms are known to be numerically stable, but because their error bounds are slightly weaker than those of the classical algorithm, they are not used even in cases where they provide a performance benefit. We argue in this study that the numerical sacrifice of fast algorithms, particularly for the typical use cases of practical algorithms, is not prohibitive, and we explore ways to improve the accuracy both theoretically and empirically. The numerical accuracy of fast matrix multiplication depends on properties of the algorithm and of the input matrices, and we consider both contributions independently. We generalize and tighten previous error analyses of fast algorithms and compare their properties. We discuss algorithmic techniques for improving the error guarantees from two perspectives: manipulating the algorithms, and reducing input anomalies by various forms of diagonal scaling. In conclusion, we benchmark performance and demonstrate our improved numerical accuracy.
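For readers unfamiliar with Strassen's original algorithm named in the abstract above, the following is a minimal pure-Python sketch of its seven-product recursion. It assumes square matrices whose size is a power of two, and it does not include the error-reduction techniques (such as diagonal scaling) that the study discusses. The names `classical`, `strassen`, and `cutoff` are hypothetical and chosen for this illustration.

```python
def classical(A, B):
    """Classical O(n^3) matrix product on lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def split(M):
    """Return the four quadrants of a square matrix of even size."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Strassen product; falls back to the classical product below cutoff."""
    n = len(A)
    if n & (n - 1) != 0:
        raise ValueError("this sketch assumes a power-of-two matrix size")
    if n <= cutoff:
        return classical(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive block products instead of the classical eight.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

With integer inputs the result matches the classical product exactly; with floating-point inputs the extra additions and subtractions of blocks are the source of the weaker error bounds mentioned in the abstract.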
Numerical Analysis of Dusty-Gas Flows

NASA Astrophysics Data System (ADS)

Saito, T.

2002-02-01

This paper presents the development of a numerical code for simulating unsteady dusty-gas flows including shock and rarefaction waves. The numerical results obtained for a shock tube problem are used for validating the accuracy and performance of the code. The code is then extended for simulating two-dimensional problems. Since the interactions between the gas and particle phases are calculated with the operator splitting technique, we can choose numerical schemes independently for the different phases. A semi-analytical method is developed for the dust phase, while the TVD scheme of Harten and Yee is chosen for the gas phase. Throughout this study, computations are carried out on SGI Origin2000, a parallel computer with multiple RISC-based processors. The efficient use of the parallel computer system is an important issue and the code implementation on Origin2000 is also described. Flow profiles of both the gas and solid particles behind the steady shock wave are calculated by integrating the steady conservation equations. The good agreement between the pseudo-stationary solutions and those from the current numerical code validates the numerical approach and the actual coding. The pseudo-stationary shock profiles can also be used as initial conditions of unsteady multidimensional simulations.
Numerical solution of special ultra-relativistic Euler equations using central upwind scheme

NASA Astrophysics Data System (ADS)

Ghaffar, Tayabia; Yousaf, Muhammad; Qamar, Shamsul

2018-06-01

This article is concerned with the numerical approximation of one- and two-dimensional special ultra-relativistic Euler equations. The governing equations are coupled first-order nonlinear hyperbolic partial differential equations. These equations describe perfect fluid flow in terms of the particle density, the four-velocity and the pressure. A high-resolution shock-capturing central upwind scheme is employed to solve the model equations. To avoid excessive numerical diffusion, the considered scheme uses specific information about local propagation speeds. By using a Runge-Kutta time stepping method and MUSCL-type initial reconstruction, we have obtained second-order accuracy of the proposed scheme. After discussing the model equations and the numerical technique, several 1D and 2D test problems are investigated. For all the numerical test cases, our proposed scheme demonstrates very good agreement with the results obtained by well-established algorithms, even in the case of highly relativistic 2D test problems. For validation and comparison, the staggered central scheme and the kinetic flux-vector splitting (KFVS) method are also applied to the same model. The robustness and efficiency of the central upwind scheme are demonstrated by the numerical results.
Algorithms for optimized maximum entropy and diagnostic tools for analytic continuation.

PubMed

Bergeron, Dominic; Tremblay, A-M S

2016-08-01

Analytic continuation of numerical data obtained in imaginary time or frequency has become an essential part of many branches of quantum computational physics. It is, however, an ill-conditioned procedure and thus a hard numerical problem. The maximum-entropy approach, based on Bayesian inference, is the most widely used method to tackle that problem. Although the approach is well established and among the most reliable and efficient ones, useful developments of the method and of its implementation are still possible. In addition, while a few free software implementations are available, a well-documented, optimized, general purpose, and user-friendly software dedicated to that specific task is still lacking. Here we analyze all aspects of the implementation that are critical for accuracy and speed and present a highly optimized approach to maximum entropy. Original algorithmic and conceptual contributions include (1) numerical approximations that yield a computational complexity that is almost independent of temperature and spectrum shape (including sharp Drude peaks in broad background, for example) while ensuring quantitative accuracy of the result whenever precision of the data is sufficient, (2) a robust method of choosing the entropy weight α that follows from a simple consistency condition of the approach and the observation that information- and noise-fitting regimes can be identified clearly from the behavior of χ^{2} with respect to α, and (3) several diagnostics to assess the reliability of the result. Benchmarks with test spectral functions of different complexity and an example with an actual physical simulation are presented. Our implementation, which covers most typical cases for fermions, bosons, and response functions, is available as an open source, user-friendly software.
Algorithms for optimized maximum entropy and diagnostic tools for analytic continuation

NASA Astrophysics Data System (ADS)

Bergeron, Dominic; Tremblay, A.-M. S.

2016-08-01

Analytic continuation of numerical data obtained in imaginary time or frequency has become an essential part of many branches of quantum computational physics. It is, however, an ill-conditioned procedure and thus a hard numerical problem. The maximum-entropy approach, based on Bayesian inference, is the most widely used method to tackle that problem. Although the approach is well established and among the most reliable and efficient ones, useful developments of the method and of its implementation are still possible. In addition, while a few free software implementations are available, a well-documented, optimized, general purpose, and user-friendly software dedicated to that specific task is still lacking. Here we analyze all aspects of the implementation that are critical for accuracy and speed and present a highly optimized approach to maximum entropy. Original algorithmic and conceptual contributions include (1) numerical approximations that yield a computational complexity that is almost independent of temperature and spectrum shape (including sharp Drude peaks in broad background, for example) while ensuring quantitative accuracy of the result whenever precision of the data is sufficient, (2) a robust method of choosing the entropy weight α that follows from a simple consistency condition of the approach and the observation that information- and noise-fitting regimes can be identified clearly from the behavior of χ^{2} with respect to α, and (3) several diagnostics to assess the reliability of the result. Benchmarks with test spectral functions of different complexity and an example with an actual physical simulation are presented. Our implementation, which covers most typical cases for fermions, bosons, and response functions, is available as an open source, user-friendly software.
Towards a Highly Efficient Meshfree Simulation of Non-Newtonian Free Surface Ice Flow: Application to the Haut Glacier d'Arolla

NASA Astrophysics Data System (ADS)

Shcherbakov, V.; Ahlkrona, J.

2016-12-01

In this work we develop a highly efficient meshfree approach to ice sheet modeling. Traditionally, mesh-based methods such as finite element methods are employed to simulate glacier and ice sheet dynamics. These methods are mature and well developed. However, despite their numerous advantages, these methods suffer from some drawbacks, such as the need to remesh the computational domain every time it changes shape, which significantly complicates the implementation on moving domains, and a costly assembly procedure for nonlinear problems. We introduce a novel meshfree approach that frees us from all these issues. The approach is built upon a radial basis function (RBF) method that, thanks to its meshfree nature, allows for an efficient handling of moving margins and free ice surface. RBF methods are also accurate and easy to implement. Since the formulation is stated in strong form, it allows for a substantial reduction of the computational cost associated with the linear system assembly inside the nonlinear solver. We implement a global RBF method that defines an approximation on the entire computational domain. This method is highly accurate. However, its coefficient matrix is dense, which reduces computational efficiency. In order to overcome this issue we also implement a localized RBF method that rests upon a partition of unity approach to subdivide the domain into several smaller subdomains. The radial basis function partition of unity method (RBF-PUM) inherits the high approximation quality of the global RBF method while resulting in a sparse system of equations, which substantially increases the computational efficiency. To demonstrate the usefulness of the RBF methods we model the velocity field of ice flow in the Haut Glacier d'Arolla. We assume that the flow is governed by the nonlinear Blatter-Pattyn equations. 
We test the methods for different basal conditions and for a free moving surface. Both RBF methods are compared with a classical finite element method in terms of accuracy and efficiency. We find that the RBF methods are more efficient than the finite element method and well suited for ice dynamics modeling, especially the partition of unity approach.
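The dense coefficient matrix of a global RBF method, as described in the abstract above, can be seen in a small interpolation example. The following is a minimal one-dimensional sketch under stated assumptions: it uses a Gaussian basis function with a hand-picked shape parameter `eps`, and it only interpolates data rather than solving the Blatter-Pattyn equations. All function names are hypothetical and not taken from the work.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gaussian_rbf(r, eps):
    """Gaussian radial basis function with shape parameter eps."""
    return math.exp(-(eps * r) ** 2)


def rbf_fit(nodes, values, eps=6.0):
    """Global RBF fit: every node couples to every other node (dense matrix)."""
    A = [[gaussian_rbf(abs(xi - xj), eps) for xj in nodes] for xi in nodes]
    return solve_linear(A, values)


def rbf_eval(x, nodes, weights, eps=6.0):
    """Evaluate the RBF interpolant at a point x."""
    return sum(w * gaussian_rbf(abs(x - xj), eps) for w, xj in zip(weights, nodes))


# Example: interpolate sin(2*pi*x) from 9 equispaced nodes on [0, 1].
nodes = [i / 8 for i in range(9)]
values = [math.sin(2 * math.pi * x) for x in nodes]
weights = rbf_fit(nodes, values)
```

Because the Gaussian has global support, the matrix built in `rbf_fit` has no zero entries, so the solve cost grows quickly with the number of nodes. A partition of unity variant would instead fit small local problems and blend them, which is what yields the sparse system mentioned in the abstract.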
A numerical method for solving the 3D unsteady incompressible Navier Stokes equations in curvilinear domains with complex immersed boundaries

NASA Astrophysics Data System (ADS)

Ge, Liang; Sotiropoulos, Fotis

2007-08-01

A novel numerical method is developed that integrates boundary-conforming grids with a sharp interface, immersed boundary methodology. The method is intended for simulating internal flows containing complex, moving immersed boundaries such as those encountered in several cardiovascular applications. The background domain (e.g. the empty aorta) is discretized efficiently with a curvilinear boundary-fitted mesh while the complex moving immersed boundary (say a prosthetic heart valve) is treated with the sharp-interface, hybrid Cartesian/immersed-boundary approach of Gilmanov and Sotiropoulos [A. Gilmanov, F. Sotiropoulos, A hybrid cartesian/immersed boundary method for simulating flows with 3d, geometrically complex, moving bodies, Journal of Computational Physics 207 (2005) 457-492.]. To facilitate the implementation of this novel modeling paradigm in complex flow simulations, an accurate and efficient numerical method is developed for solving the unsteady, incompressible Navier-Stokes equations in generalized curvilinear coordinates. The method employs a novel, fully-curvilinear staggered grid discretization approach, which does not require either the explicit evaluation of the Christoffel symbols or the discretization of all three momentum equations at cell interfaces as done in previous formulations. The equations are integrated in time using an efficient, second-order accurate fractional step methodology coupled with a Jacobian-free, Newton-Krylov solver for the momentum equations and a GMRES solver enhanced with multigrid as preconditioner for the Poisson equation. Several numerical experiments are carried out on fine computational meshes to demonstrate the accuracy and efficiency of the proposed method for standard benchmark problems as well as for unsteady, pulsatile flow through a curved, pipe bend. 
To demonstrate the ability of the method to simulate flows with complex, moving immersed boundaries we apply it to calculate pulsatile, physiological flow through a mechanical, bileaflet heart valve mounted in a model straight aorta with an anatomical-like triple sinus.
Multiphoton ionization of many-electron atoms and highly-charged ions in intense laser fields: a relativistic time-dependent density functional theory approach

NASA Astrophysics Data System (ADS)

Tumakov, Dmitry A.; Telnov, Dmitry A.; Maltsev, Ilia A.; Plunien, Günter; Shabaev, Vladimir M.

2017-10-01

We develop an efficient numerical implementation of the relativistic time-dependent density functional theory (RTDDFT) to study multielectron highly-charged ions subject to intense linearly-polarized laser fields. The interaction with the electromagnetic field is described within the electric dipole approximation. The resulting time-dependent relativistic Kohn-Sham (RKS) equations possess an axial symmetry and are solved accurately and efficiently with the help of the time-dependent generalized pseudospectral method. As a case study, we calculate multiphoton ionization probabilities of the neutral argon atom and argon-like xenon ion. Relativistic effects are assessed by comparison of our present results with existing non-relativistic data.

Efficient implementations of a pseudodynamical stochastic filtering strategy for static elastography.

PubMed

Banerjee, Biswanath; Roy, Debasish; Vasu, Ram Mohan

2009-08-01

A computationally efficient pseudodynamical filtering setup is established for elasticity imaging (i.e., reconstruction of shear modulus distribution) in soft-tissue organs given statically recorded and partially measured displacement data. Unlike a regularized quasi-Newton method (QNM) that needs inversion of ill-conditioned matrices, the authors explore pseudodynamic extended and ensemble Kalman filters (PD-EKF and PD-EnKF) that use a parsimonious representation of states and bypass explicit regularization by recursion over pseudotime. Numerical experiments with QNM and the two filters suggest that the PD-EnKF is the most robust performer as it exhibits no sensitivity to process noise covariance and yields good reconstruction even with small ensemble sizes.
AN EFFICIENT HIGHER-ORDER FAST MULTIPOLE BOUNDARY ELEMENT SOLUTION FOR POISSON-BOLTZMANN BASED MOLECULAR ELECTROSTATICS

PubMed Central

Bajaj, Chandrajit; Chen, Shun-Chuan; Rand, Alexander

2011-01-01

In order to compute polarization energy of biomolecules, we describe a boundary element approach to solving the linearized Poisson-Boltzmann equation. Our approach combines several important features including the derivative boundary formulation of the problem and a smooth approximation of the molecular surface based on the algebraic spline molecular surface. State of the art software for numerical linear algebra and the kernel independent fast multipole method is used for both simplicity and efficiency of our implementation. We perform a variety of computational experiments, testing our method on a number of actual proteins involved in molecular docking and demonstrating the effectiveness of our solver for computing molecular polarization energy. PMID:21660123
Nonlinear combining and compression in multicore fibers

DOE PAGES

Chekhovskoy, I. S.; Rubenchik, A. M.; Shtyrina, O. V.; ...

2016-10-25

In this paper, we demonstrate numerically light-pulse combining and pulse compression using wave-collapse (self-focusing) energy-localization dynamics in a continuous-discrete nonlinear system, as implemented in a multicore fiber (MCF) using one-dimensional (1D) and 2D core distribution designs. Large-scale numerical simulations were performed to determine the conditions of the most efficient coherent combining and compression of pulses injected into the considered MCFs. We demonstrate the possibility of combining in a single core 90% of the total energy of pulses initially injected into all cores of a 7-core MCF with a hexagonal lattice. Finally, a pulse compression factor of about 720 can be obtained with a 19-core ring MCF.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Goodwin, D. L.; Kuprov, Ilya, E-mail: i.kuprov@soton.ac.uk

Quadratic convergence throughout the active space is achieved for the gradient ascent pulse engineering (GRAPE) family of quantum optimal control algorithms. We demonstrate in this communication that the Hessian of the GRAPE fidelity functional is unusually cheap, having the same asymptotic complexity scaling as the functional itself. This leads to the possibility of using very efficient numerical optimization techniques. In particular, the Newton-Raphson method with a rational function optimization (RFO) regularized Hessian is shown in this work to require fewer system trajectory evaluations than any other algorithm in the GRAPE family. This communication describes algebraic and numerical implementation aspects (matrix exponential recycling, Hessian regularization, etc.) for the RFO Newton-Raphson version of GRAPE and reports benchmarks for common spin state control problems in magnetic resonance spectroscopy.
A Strassen-Newton algorithm for high-speed parallelizable matrix inversion

NASA Technical Reports Server (NTRS)

Bailey, David H.; Ferguson, Helaman R. P.

1988-01-01

Techniques are described for computing matrix inverses by algorithms that are highly suited to massively parallel computation. The techniques are based on an algorithm suggested by Strassen (1969). Variations of this scheme use matrix Newton iterations and other methods to improve the numerical stability while at the same time preserving a very high level of parallelism. One-processor Cray-2 implementations of these schemes range from one that is up to 55 percent faster than a conventional library routine to one that is slower than a library routine but achieves excellent numerical stability. The problem of computing the solution to a single set of linear equations is discussed, and it is shown that this problem can also be solved efficiently using these techniques.
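The matrix Newton iteration mentioned in the abstract above can be sketched in a few lines. The version below uses the update X_{k+1} = X_k (2I - A X_k), which needs only matrix products and is therefore easy to parallelize. The starting guess (the transpose of A scaled by the product of its 1-norm and infinity-norm) is a standard textbook choice and is an assumption here, not necessarily the one used by the authors; the function names are hypothetical.

```python
def matmul(A, B):
    """Classical matrix product on lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def newton_inverse(A, tol=1e-12, max_iter=100):
    """Approximate the inverse of A with the iteration X <- X (2I - A X)."""
    n = len(A)
    norm_1 = max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in A)
    scale = 1.0 / (norm_1 * norm_inf)
    # Starting guess: scaled transpose, which guarantees convergence
    # for any nonsingular A (assumed choice for this sketch).
    X = [[A[j][i] * scale for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        AX = matmul(A, X)
        # Largest entry of the residual I - A X.
        residual = max(
            abs((1.0 if i == j else 0.0) - AX[i][j])
            for i in range(n) for j in range(n)
        )
        if residual < tol:
            break
        correction = [
            [(2.0 if i == j else 0.0) - AX[i][j] for j in range(n)]
            for i in range(n)
        ]
        X = matmul(X, correction)
    return X


# Example with a small, well-conditioned symmetric matrix.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
A_inv = newton_inverse(A)
```

Each step squares the residual I - A X, so convergence is quadratic once the residual is small. In a fast implementation the calls to `matmul` would be replaced by a Strassen-type product, which is the combination the title refers to.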
Complex amplitude reconstruction by iterative amplitude-phase retrieval algorithm with reference

NASA Astrophysics Data System (ADS)

Shen, Cheng; Guo, Cheng; Tan, Jiubin; Liu, Shutian; Liu, Zhengjun

2018-06-01

Multi-image iterative phase retrieval methods have been successfully applied in many research fields due to their simple but efficient implementation. However, there is a mismatch between the measurement of the first long imaging distance and the sequential interval. In this paper, an amplitude-phase retrieval algorithm with reference is put forward without additional measurements or a priori knowledge. It removes the need to measure the first imaging distance. With a designed update formula, it significantly raises the convergence speed and the reconstruction fidelity, especially in phase retrieval. Its superiority over the original amplitude-phase retrieval (APR) method is validated by numerical analysis and experiments. Furthermore, it provides a conceptual design of a compact holographic image sensor, which can achieve numerical refocusing easily.
Generation of structural topologies using efficient technique based on sorted compliances

NASA Astrophysics Data System (ADS)

Mazur, Monika; Tajs-Zielińska, Katarzyna; Bochenek, Bogdan

2018-01-01

Topology optimization, although well established, is still under active development. It has recently gained more attention as large computational resources have become available to designers. This process is simultaneously stimulated by a variety of emerging, innovative optimization methods. It is observed that traditional gradient-based mathematical programming algorithms are, in many cases, replaced by novel and efficient heuristic methods inspired by biological, chemical or physical phenomena. These methods become useful tools for structural optimization because of their versatility and easy numerical implementation. In this paper the engineering implementation of a novel heuristic algorithm for minimum compliance topology optimization is discussed. The performance of the topology generator is based on the implementation of a special function utilizing information on the compliance distribution within the design space. To cope with engineering problems, the algorithm has been combined with the structural analysis system Ansys.
Solving large sparse eigenvalue problems on supercomputers

NASA Technical Reports Server (NTRS)

Philippe, Bernard; Saad, Youcef

1988-01-01

An important problem in scientific computing consists in finding a few eigenvalues and corresponding eigenvectors of a very large and sparse matrix. The most popular methods to solve these problems are based on projection techniques on appropriate subspaces. The main attraction of these methods is that they only require the use of the matrix in the form of matrix by vector multiplications. The implementations on supercomputers of two such methods for symmetric matrices, namely Lanczos' method and Davidson's method are compared. Since one of the most important operations in these two methods is the multiplication of vectors by the sparse matrix, methods of performing this operation efficiently are discussed. The advantages and the disadvantages of each method are compared and implementation aspects are discussed. Numerical experiments on a one processor CRAY 2 and CRAY X-MP are reported. Possible parallel implementations are also discussed.
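The abstract above stresses that projection methods touch the matrix only through matrix-vector products. The following sketch illustrates this with a compressed sparse row (CSR) product and a plain Lanczos recurrence without reorthogonalization. The CSR storage format and the function names `csr_matvec` and `lanczos` are assumptions made for illustration. The abstract does not say which storage scheme the authors used.

```python
import math


def csr_matvec(data, indices, indptr, x):
    """y = A x for a matrix stored in CSR form (assumed format)."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def lanczos(matvec, v0, steps):
    """Run up to `steps` Lanczos steps for a symmetric operator.

    Returns the diagonal (alphas) and off-diagonal (betas) entries of
    the tridiagonal projection. The recurrence stops early when the
    residual norm vanishes (an invariant subspace has been found).
    If no early stop occurs, the last beta is the final residual norm.
    """
    norm = math.sqrt(sum(c * c for c in v0))
    v = [c / norm for c in v0]
    v_prev = [0.0] * len(v)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(steps):
        w = matvec(v)  # the only place the matrix is used
        alpha = sum(wi * vi for wi, vi in zip(w, v))
        w = [wi - alpha * vi - beta * pi
             for wi, vi, pi in zip(w, v, v_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(c * c for c in w))
        if beta < 1e-12:
            break
        betas.append(beta)
        v_prev, v = v, [c / beta for c in w]
    return alphas, betas


if __name__ == "__main__":
    # CSR form of the symmetric matrix [[2, 1], [1, 2]]
    data, indices, indptr = [2.0, 1.0, 1.0, 2.0], [0, 1, 0, 1], [0, 2, 4]
    print(lanczos(lambda x: csr_matvec(data, indices, indptr, x),
                  [1.0, 0.0], 2))
```

The eigenvalues of the small tridiagonal matrix built from `alphas` and `betas` approximate the extreme eigenvalues of the large matrix. A production code would add reorthogonalization or restarting.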
Efficient control schemes with limited computation complexity for Tomographic AO systems on VLTs and ELTs

NASA Astrophysics Data System (ADS)

Petit, C.; Le Louarn, M.; Fusco, T.; Madec, P.-Y.

2011-09-01

Various tomographic control solutions have been proposed during the last decades to ensure efficient or even optimal closed-loop correction for tomographic Adaptive Optics (AO) concepts such as Laser Tomographic AO (LTAO) and Multi-Conjugate AO (MCAO). The optimal solution, based on the Linear Quadratic Gaussian (LQG) approach, as well as suboptimal but efficient solutions such as Pseudo-Open Loop Control (POLC), require multiple Matrix Vector Multiplications (MVM). Disregarding their respective performance, these efficient control solutions thus exhibit a strong increase of on-line complexity, and their implementation may become difficult in demanding cases. Among them, two cases are of particular interest. First, the system Real-Time Computer (RTC) architecture and implementation is derived from past or present solutions and does not support multiple MVM. This is the case of the AO Facility, whose RTC architecture is derived from the SPARTA platform and inherits its simple MVM architecture, which does not fit with LTAO control solutions, for instance. Second, considering future systems such as Extremely Large Telescopes, the number of degrees of freedom is twenty to one hundred times bigger than in present systems. In these conditions, tomographic control solutions can hardly be used in their standard form and optimized implementations must be considered. Single MVM tomographic control solutions represent a potential solution, and straightforward solutions such as Virtual Deformable Mirrors have already been proposed for LTAO but with tuning issues. We investigate in this paper the possibility of deriving from tomographic control solutions, such as POLC or LQG, simplified control solutions that ensure a simple MVM architecture and could thus be implemented on present-day systems or future complex systems. We theoretically derive various solutions and analyze their respective performance on various systems by numerical simulation. We discuss the optimization of their performance and stability issues with respect to classic control solutions. We finally discuss off-line computation and implementation constraints.
Selected Systems Engineering Process Deficiencies and Their Consequences

NASA Technical Reports Server (NTRS)

Thomas, Lawrence Dale

2006-01-01

The systems engineering process is well established and well understood. While this statement could be argued in the light of the many systems engineering guidelines and that have been developed, comparative review of these respective descriptions reveal that they differ primarily in the number of discrete steps or other nuances, and are at their core essentially common. Likewise, the systems engineering textbooks differ primarily in the context for application of systems engineering or in the utilization of evolved tools and techniques, not in the basic method. Thus, failures in systems engineering cannot credibly be attributed to implementation of the wrong systems engineering process among alternatives. However, numerous systems failures can be attributed to deficient implementation of the systems engineering process. What may clearly be perceived as a system engineering deficiency in retrospect can appear to be a well considered system engineering efficiency in real time - an efficiency taken to reduce cost or meet a schedule, or more often both. Typically these efficiencies are grounded on apparently solid rationale, such as reuse of heritage hardware or software. Over time, unintended consequences of a systems engineering process deficiency may begin to be realized, and unfortunately often the consequence is system failure. This paper describes several actual cases of system failures that resulted from deficiencies in their systems engineering process implementation, including the Ariane 5 and the Hubble Space Telescope.
Selected systems engineering process deficiencies and their consequences

NASA Astrophysics Data System (ADS)

Thomas, L. Dale

2007-06-01

The systems engineering process is well established and well understood. While this statement could be argued in the light of the many systems engineering guidelines and that have been developed, comparative review of these respective descriptions reveal that they differ primarily in the number of discrete steps or other nuances, and are at their core essentially common. Likewise, the systems engineering textbooks differ primarily in the context for application of systems engineering or in the utilization of evolved tools and techniques, not in the basic method. Thus, failures in systems engineering cannot credibly be attributed to implementation of the wrong systems engineering process among alternatives. However, numerous system failures can be attributed to deficient implementation of the systems engineering process. What may clearly be perceived as a systems engineering deficiency in retrospect can appear to be a well considered system engineering efficiency in real time—an efficiency taken to reduce cost or meet a schedule, or more often both. Typically these efficiencies are grounded on apparently solid rationale, such as reuse of heritage hardware or software. Over time, unintended consequences of a systems engineering process deficiency may begin to be realized, and unfortunately often the consequence is systems failure. This paper describes several actual cases of system failures that resulted from deficiencies in their systems engineering process implementation, including the Ariane 5 and the Hubble Space Telescope.
Utilizing Radiofrequency Identification Technology to Improve Safety and Management of Blood Bank Supply Chains.

PubMed

Coustasse, Alberto; Meadows, Pamela; Hall, Robert S; Hibner, Travis; Deslich, Stacie

2015-11-01

The importance of efficiency in the supply chain of perishable products, such as the blood products used in transfusion services, cannot be overstated. Many problems can occur, such as the outdating of products, inventory management issues, patient misidentification, and mistransfusion. The purpose of this article was to identify the benefits and barriers associated with radiofrequency identification (RFID) usage in improving the blood bank supply chain. The methodology for this study was a qualitative literature review following a systematic approach. The review was limited to sources published from 2000 to 2014 in the English language. Sixty-five sources were found, and 56 were used in this research study. According to the finding of the present study, there are numerous benefits and barriers to RFID utilization in blood bank supply chains. RFID technology offers several benefits with regard to blood bank product management, including decreased transfusion errors, reduction of product loss, and more efficient inventory management. Barriers to RFID implementation include the cost associated with system implementation and patient privacy issues. Implementation of an RFID system can be a significant investment. However, when observing the positive impact that such systems may have on transfusion safety and inventory management, the cost associated with RFID systems can easily be justified. RFID in blood bank inventory management is vital to ensuring efficient product inventory management and positive patient outcomes.
The Oceanographic Multipurpose Software Environment (OMUSE v1.0)

NASA Astrophysics Data System (ADS)

Pelupessy, Inti; van Werkhoven, Ben; van Elteren, Arjen; Viebahn, Jan; Candy, Adam; Portegies Zwart, Simon; Dijkstra, Henk

2017-08-01

In this paper we present the Oceanographic Multipurpose Software Environment (OMUSE). OMUSE aims to provide a homogeneous environment for existing or newly developed numerical ocean simulation codes, simplifying their use and deployment. In this way, numerical experiments that combine ocean models representing different physics or spanning different ranges of physical scales can be easily designed. Rapid development of simulation models is made possible through the creation of simple high-level scripts. The low-level core of the abstraction in OMUSE is designed to deploy these simulations efficiently on heterogeneous high-performance computing resources. Cross-verification of simulation models with different codes and numerical methods is facilitated by the unified interface that OMUSE provides. Reproducibility in numerical experiments is fostered by allowing complex numerical experiments to be expressed in portable scripts that conform to a common OMUSE interface. Here, we present the design of OMUSE as well as the modules and model components currently included, which range from a simple conceptual quasi-geostrophic solver to the global circulation model POP (Parallel Ocean Program). The uniform access to the codes' simulation state and the extensive automation of data transfer and conversion operations aids the implementation of model couplings. We discuss the types of couplings that can be implemented using OMUSE. We also present example applications that demonstrate the straightforward model initialization and the concurrent use of data analysis tools on a running model. We give examples of multiscale and multiphysics simulations by embedding a regional ocean model into a global ocean model and by coupling a surface wave propagation model with a coastal circulation model.
An efficient mixed-precision, hybrid CPU-GPU implementation of a nonlinearly implicit one-dimensional particle-in-cell algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Guangye; Chacon, Luis; Barnes, Daniel C

2012-01-01

Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been developed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230, 18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver and is capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle orbit integrations from the field solver, while remaining fully self-consistent. This provides great flexibility, and dramatically improves the solver efficiency by reducing the degrees of freedom of the associated nonlinear system. However, it requires a particle push per nonlinear residual evaluation, which makes the particle push the most time-consuming operation in the algorithm. This paper describes a very efficient mixed-precision, hybrid CPU-GPU implementation of the implicit PIC algorithm. The JFNK solver is kept on the CPU (in double precision), while the inherent data parallelism of the particle mover is exploited by implementing it in single precision on a graphics processing unit (GPU) using CUDA. Performance-oriented optimizations, with the aid of an analytical performance model, the roofline model, are employed. Despite being highly dynamic, the adaptive, charge-conserving particle mover algorithm achieves up to 300-400 GOp/s (including single-precision floating-point, integer, and logic operations) on an Nvidia GeForce GTX580, corresponding to 20-25% absolute GPU efficiency (against the peak theoretical performance) and 50-70% intrinsic efficiency (against the algorithm's maximum operational throughput, which neglects all latencies). This is about 200-300 times faster than an equivalent serial CPU implementation. When the single-precision GPU particle mover is combined with a double-precision CPU JFNK field solver, overall performance gains of roughly 100 times vs. the double-precision CPU-only serial version are obtained, with no apparent loss of robustness or accuracy when applied to a challenging long-time scale ion acoustic wave simulation.
Density functional theory for molecular and periodic systems using density fitting and continuous fast multipole method: Analytical gradients.

PubMed

Łazarski, Roman; Burow, Asbjörn Manfred; Grajciar, Lukáš; Sierka, Marek

2016-10-30

A full implementation of analytical energy gradients for molecular and periodic systems is reported in the TURBOMOLE program package within the framework of Kohn-Sham density functional theory using Gaussian-type orbitals as basis functions. Its key component is a combination of the density fitting (DF) approximation and the continuous fast multipole method (CFMM) that allows for an efficient calculation of the Coulomb energy gradient. For the exchange-correlation part, the hierarchical numerical integration scheme (Burow and Sierka, Journal of Chemical Theory and Computation 2011, 7, 3097) is extended to energy gradients. Computational efficiency and asymptotic O(N) scaling behavior of the implementation are demonstrated for various molecular and periodic model systems, with the largest unit cell of hematite containing 640 atoms and 19,072 basis functions. The overall computational effort of the energy gradient is comparable to that of the Kohn-Sham matrix formation. © 2016 Wiley Periodicals, Inc.
Pure quasi-P-wave calculation in transversely isotropic media using a hybrid method

NASA Astrophysics Data System (ADS)

Wu, Zedong; Liu, Hongwei; Alkhalifah, Tariq

2018-07-01

The acoustic approximation for anisotropic media is widely used in current industry imaging and inversion algorithms mainly because P-waves constitute the majority of the energy recorded in seismic exploration. The resulting acoustic formulae tend to be simpler, resulting in more efficient implementations, and depend on fewer medium parameters. However, conventional solutions of the acoustic wave equation with higher-order derivatives suffer from shear wave artefacts. Thus, we derive a new acoustic wave equation for wave propagation in transversely isotropic (TI) media, which is based on a partially separable approximation of the dispersion relation for TI media and free of shear wave artefacts. Even though our resulting equation is not a partial differential equation, it is still a linear equation. Thus, we propose to implement this equation efficiently by combining the finite difference approximation with spectral evaluation of the space-independent parts. The resulting algorithm provides solutions without the constraint ɛ ≥ δ. Numerical tests demonstrate the effectiveness of the approach.
Mean of the typical decoding rates: a new translation efficiency index based on the analysis of ribosome profiling data.

PubMed

Dana, Alexandra; Tuller, Tamir

2014-12-01

Gene translation modeling and prediction is a fundamental problem that has numerous biomedical implementations. In this work we present a novel, user-friendly tool/index for calculating the mean of the typical decoding rates that enables predicting translation elongation efficiency of protein coding genes for different tissue types, developmental stages, and experimental conditions. The suggested translation efficiency index is based on the analysis of the organism's ribosome profiling data. This index could be used for example to predict changes in translation elongation efficiency of lowly expressed genes that usually have relatively low and/or biased ribosomal densities and protein levels measurements, or can be used for example for predicting translation efficiency of new genetically engineered genes. We demonstrate the usability of this index via the analysis of six organisms in different tissues and developmental stages. Distributable cross platform application and guideline are available for download at: http://www.cs.tau.ac.il/~tamirtul/MTDR/MTDR_Install.html. Copyright © 2015 Dana and Tuller.
Nonlinear Legendre Spectral Finite Elements for Wind Turbine Blade Dynamics: Preprint

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Q.; Sprague, M. A.; Jonkman, J.

2014-01-01

This paper presents a numerical implementation and examination of a new wind turbine blade finite element model based on Geometrically Exact Beam Theory (GEBT) and a high-order spectral finite element method. The displacement-based GEBT is presented, which includes the coupling effects that exist in composite structures and geometric nonlinearity. Legendre spectral finite elements (LSFEs) are high-order finite elements with nodes located at the Gauss-Legendre-Lobatto points. LSFEs can be an order of magnitude more efficient than low-order finite elements for a given accuracy level. Interpolation of the three-dimensional rotation, a major technical barrier in large-deformation simulation, is discussed in the context of LSFEs. It is shown, by numerical example, that the high-order LSFEs, where weak forms are evaluated with nodal quadrature, do not suffer from a drawback that exists in low-order finite elements where the tangent-stiffness matrix is calculated at the Gauss points. Finally, the new LSFE code is implemented in the new FAST Modularization Framework for dynamic simulation of highly flexible composite-material wind turbine blades. The framework allows for fully interactive simulations of turbine blades in operating conditions. Numerical examples showing validation and LSFE performance will be provided in the final paper.
Nonlinear interferometry approach to photonic sequential logic

NASA Astrophysics Data System (ADS)

Mabuchi, Hideo

2011-10-01

Motivated by rapidly advancing capabilities for extensive nanoscale patterning of optical materials, I propose an approach to implementing photonic sequential logic that exploits circuit-scale phase coherence for efficient realizations of fundamental components such as a NAND-gate-with-fanout and a bistable latch. Kerr-nonlinear optical resonators are utilized in combination with interference effects to drive the binary logic. Quantum-optical input-output models are characterized numerically using design parameters that yield attojoule-scale energy separation between the latch states.
Computation of forces arising from the polarizable continuum model within the domain-decomposition paradigm

NASA Astrophysics Data System (ADS)

Gatto, Paolo; Lipparini, Filippo; Stamm, Benjamin

2017-12-01

The domain-decomposition (dd) paradigm, originally introduced for the conductor-like screening model, has been recently extended to the dielectric Polarizable Continuum Model (PCM), resulting in the ddPCM method. We present here a complete derivation of the analytical derivatives of the ddPCM energy with respect to the positions of the solute's atoms and discuss their efficient implementation. As is the case for the energy, we observe a quadratic scaling, which is discussed and demonstrated with numerical tests.

Modified Method of Adaptive Artificial Viscosity for Solution of Gas Dynamics Problems on Parallel Computer Systems

NASA Astrophysics Data System (ADS)

Popov, Igor; Sukov, Sergey

2018-02-01

A modification of the adaptive artificial viscosity (AAV) method is considered. This modification is based on a one-stage time approximation and is adapted to the calculation of gas dynamics problems on unstructured grids with an arbitrary type of grid elements. The proposed numerical method has simplified logic, better performance and better parallel efficiency compared to the implementation of the original AAV method. Computer experiments demonstrate the robustness of the method and its convergence to the difference solution.
A Discontinuous Galerkin Finite Element Method for Hamilton-Jacobi Equations

NASA Technical Reports Server (NTRS)

Hu, Changqing; Shu, Chi-Wang

1998-01-01

In this paper, we present a discontinuous Galerkin finite element method for solving the nonlinear Hamilton-Jacobi equations. This method is based on the Runge-Kutta discontinuous Galerkin finite element method for solving conservation laws. The method has the flexibility of treating complicated geometry by using arbitrary triangulation, can achieve high order accuracy with a local, compact stencil, and is suited for efficient parallel implementation. One and two dimensional numerical examples are given to illustrate the capability of the method.
Pricing and simulation for real estate index options: Radial basis point interpolation

NASA Astrophysics Data System (ADS)

Gong, Pu; Zou, Dong; Wang, Jiayue

2018-06-01

This study employs the meshfree radial basis point interpolation (RBPI) for pricing real estate derivatives contingent on real estate index. This method combines radial and polynomial basis functions, which can guarantee the interpolation scheme with Kronecker property and effectively improve accuracy. An exponential change of variables, a mesh refinement algorithm and the Richardson extrapolation are employed in this study to implement the RBPI. Numerical results are presented to examine the computational efficiency and accuracy of our method.
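The abstract above lists Richardson extrapolation among its ingredients. As a generic illustration only, the sketch below applies it to a central finite-difference derivative rather than to the authors' option-pricing scheme. Combining the estimates at step sizes h and h/2 cancels the leading O(h^2) error term. The function names are hypothetical.

```python
import math


def central_difference(f, x, h):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def richardson_derivative(f, x, h):
    """Richardson extrapolation of the central difference.

    For a second-order method the error behaves like C*h^2, so
    (4*D(h/2) - D(h)) / 3 removes that term and leaves O(h^4).
    """
    coarse = central_difference(f, x, h)
    fine = central_difference(f, x, h / 2.0)
    return (4.0 * fine - coarse) / 3.0


if __name__ == "__main__":
    # d/dx exp(x) at x = 0 is exactly 1
    print(central_difference(math.exp, 0.0, 0.1))
    print(richardson_derivative(math.exp, 0.0, 0.1))
```

The same combination rule applies to any quantity whose discretization error has a known leading power of the step size.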
An Optimized Multicolor Point-Implicit Solver for Unstructured Grid Applications on Graphics Processing Units

NASA Technical Reports Server (NTRS)

Zubair, Mohammad; Nielsen, Eric; Luitjens, Justin; Hammond, Dana

2016-01-01

In the field of computational fluid dynamics, the Navier-Stokes equations are often solved using an unstructured-grid approach to accommodate geometric complexity. Implicit solution methodologies for such spatial discretizations generally require frequent solution of large tightly-coupled systems of block-sparse linear equations. The multicolor point-implicit solver used in the current work typically requires a significant fraction of the overall application run time. In this work, an efficient implementation of the solver for graphics processing units is proposed. Several factors present unique challenges to achieving an efficient implementation in this environment. These include the variable amount of parallelism available in different kernel calls, indirect memory access patterns, low arithmetic intensity, and the requirement to support variable block sizes. In this work, the solver is reformulated to use standard sparse and dense Basic Linear Algebra Subprograms (BLAS) functions. However, numerical experiments show that the performance of the BLAS functions available in existing CUDA libraries is suboptimal for matrices representative of those encountered in actual simulations. Instead, optimized versions of these functions are developed. Depending on block size, the new implementations show performance gains of up to 7x over the existing CUDA library functions.
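To illustrate the multicolor idea mentioned in the abstract above, the following toy sketch runs a two-color (red-black) Gauss-Seidel sweep on a scalar 1D Poisson system. It is a simplification chosen for illustration, not the block-sparse unstructured-grid solver of the paper. Unknowns of the same color do not depend on each other, so each color could be updated in parallel. The function name is hypothetical.

```python
def multicolor_gauss_seidel(f, sweeps=200):
    """Solve -u[i-1] + 2*u[i] - u[i+1] = f[i] with zero boundary values
    using two-color (even/odd) Gauss-Seidel sweeps."""
    n = len(f)
    u = [0.0] * n
    for _ in range(sweeps):
        for color in (0, 1):
            # All points of one color only read points of the other
            # color, so this inner loop is free of data dependencies.
            for i in range(color, n, 2):
                left = u[i - 1] if i > 0 else 0.0
                right = u[i + 1] if i < n - 1 else 0.0
                u[i] = 0.5 * (f[i] + left + right)
    return u


if __name__ == "__main__":
    # Right-hand side built from the exact solution [1, 2, 3, 2, 1]
    print(multicolor_gauss_seidel([0.0, 0.0, 2.0, 0.0, 0.0]))
```

On an unstructured grid the coloring would come from a graph-coloring pass, and the number of colors would generally exceed two.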
Efficient experimental design of high-fidelity three-qubit quantum gates via genetic programming

NASA Astrophysics Data System (ADS)

Devra, Amit; Prabhu, Prithviraj; Singh, Harpreet; Arvind; Dorai, Kavita

2018-03-01

We have designed efficient quantum circuits for the three-qubit Toffoli (controlled-controlled-NOT) and the Fredkin (controlled-SWAP) gate, optimized via genetic programming methods. The gates thus obtained were experimentally implemented on a three-qubit NMR quantum information processor, with a high fidelity. Toffoli and Fredkin gates in conjunction with the single-qubit Hadamard gates form a universal gate set for quantum computing and are an essential component of several quantum algorithms. Genetic algorithms are stochastic search algorithms based on the logic of natural selection and biological genetics and have been widely used for quantum information processing applications. We devised a new selection mechanism within the genetic algorithm framework to select individuals from a population. We call this mechanism the "Luck-Choose" mechanism and were able to achieve faster convergence to a solution using this mechanism, as compared to existing selection mechanisms. The optimization was performed under the constraint that the experimentally implemented pulses are of short duration and can be implemented with high fidelity. We demonstrate the advantage of our pulse sequences by comparing our results with existing experimental schemes and other numerical optimization methods.
Self-consistent implementation of meta-GGA functionals for the ONETEP linear-scaling electronic structure package.

PubMed

Womack, James C; Mardirossian, Narbe; Head-Gordon, Martin; Skylaris, Chris-Kriton

2016-11-28

Accurate and computationally efficient exchange-correlation functionals are critical to the successful application of linear-scaling density functional theory (DFT). Local and semi-local functionals of the density are naturally compatible with linear-scaling approaches, having a general form which assumes the locality of electronic interactions and which can be efficiently evaluated by numerical quadrature. Presently, the most sophisticated and flexible semi-local functionals are members of the meta-generalized-gradient approximation (meta-GGA) family, and depend upon the kinetic energy density, τ, in addition to the charge density and its gradient. In order to extend the theoretical and computational advantages of τ-dependent meta-GGA functionals to large-scale DFT calculations on thousands of atoms, we have implemented support for τ-dependent meta-GGA functionals in the ONETEP program. In this paper we lay out the theoretical innovations necessary to implement τ-dependent meta-GGA functionals within ONETEP's linear-scaling formalism. We present expressions for the gradient of the τ-dependent exchange-correlation energy, necessary for direct energy minimization. We also derive the forms of the τ-dependent exchange-correlation potential and kinetic energy density in terms of the strictly localized, self-consistently optimized orbitals used by ONETEP. To validate the numerical accuracy of our self-consistent meta-GGA implementation, we performed calculations using the B97M-V and PKZB meta-GGAs on a variety of small molecules. Using only a minimal basis set of self-consistently optimized local orbitals, we obtain energies in excellent agreement with large basis set calculations performed using other codes. Finally, to establish the linear-scaling computational cost and applicability of our approach to large-scale calculations, we present the outcome of self-consistent meta-GGA calculations on amyloid fibrils of increasing size, up to tens of thousands of atoms.
Accelerating Subsurface Transport Simulation on Heterogeneous Clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Villa, Oreste; Gawande, Nitin A.; Tumeo, Antonino

Reactive transport numerical models simulate chemical and microbiological reactions that occur along a flowpath. These models have to compute reactions for a large number of locations. They solve the set of ordinary differential equations (ODEs) that describes the reaction for each location through the Newton-Raphson technique. This technique involves computing a Jacobian matrix and a residual vector for each set of equations, and then iteratively solving the linearized system by performing Gaussian elimination and LU decomposition until convergence. STOMP, a well-known subsurface flow simulation tool, employs matrices with sizes on the order of 100x100 elements and, for numerical accuracy, LU factorization with full pivoting instead of the faster partial pivoting. Modern high performance computing systems are heterogeneous machines whose nodes integrate both CPUs and GPUs, exposing unprecedented amounts of parallelism. To exploit all their computational power, applications must use both types of processing elements. For the case of subsurface flow simulation, this mainly requires implementing efficient batched LU-based solvers and identifying efficient solutions for enabling load balancing among the different processors of the system. In this paper we discuss two approaches that allow scaling STOMP's performance on heterogeneous clusters. We initially identify the challenges in implementing batched LU-based solvers for small matrices on GPUs, and propose an implementation that fulfills STOMP's requirements. We compare this implementation to other existing solutions. Then, we combine the batched GPU solver with an OpenMP-based CPU solver, and present an adaptive load balancer that dynamically distributes the linear systems to solve between the two components inside a node. We show how these approaches, integrated into the full application, provide speedups of 6 to 7 times on large problems, executed on up to 16 nodes of a cluster with two AMD Opteron 6272 and a Tesla M2090 per node.
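The abstract above contrasts full pivoting with the faster partial pivoting. A minimal serial sketch of Gaussian elimination with full pivoting for one small dense system is given below. It is a plain-Python illustration under the assumption of a dense, nonsingular matrix, and it is not STOMP's or the authors' batched GPU implementation. The function name `solve_full_pivot` is hypothetical.

```python
def solve_full_pivot(a, b):
    """Solve A x = b by Gaussian elimination with full pivoting."""
    n = len(a)
    # Augmented matrix [A | b]; copies keep the inputs untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    col_order = list(range(n))  # tracks column (unknown) swaps
    for k in range(n):
        # Full pivoting: search the whole trailing submatrix for the
        # entry of largest magnitude, not only column k.
        p, q = max(((i, j) for i in range(k, n) for j in range(k, n)),
                   key=lambda ij: abs(m[ij[0]][ij[1]]))
        if m[p][q] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]                  # row swap
        for row in m:                            # column swap
            row[k], row[q] = row[q], row[k]
        col_order[k], col_order[q] = col_order[q], col_order[k]
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    # Back substitution on the permuted upper-triangular system.
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = s / m[i][i]
    # Undo the column permutation to recover the original unknowns.
    x = [0.0] * n
    for k in range(n):
        x[col_order[k]] = y[k]
    return x


if __name__ == "__main__":
    a = [[0.0, 2.0, 1.0], [1.0, 1.0, 0.0], [3.0, 0.0, 1.0]]
    print(solve_full_pivot(a, [7.0, 3.0, 6.0]))
```

The pivot search over the whole trailing submatrix is what makes full pivoting more expensive than partial pivoting, which scans a single column per step.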
Self-consistent implementation of meta-GGA functionals for the ONETEP linear-scaling electronic structure package

NASA Astrophysics Data System (ADS)

Womack, James C.; Mardirossian, Narbe; Head-Gordon, Martin; Skylaris, Chris-Kriton

2016-11-01

Accurate and computationally efficient exchange-correlation functionals are critical to the successful application of linear-scaling density functional theory (DFT). Local and semi-local functionals of the density are naturally compatible with linear-scaling approaches, having a general form which assumes the locality of electronic interactions and which can be efficiently evaluated by numerical quadrature. Presently, the most sophisticated and flexible semi-local functionals are members of the meta-generalized-gradient approximation (meta-GGA) family, and depend upon the kinetic energy density, τ, in addition to the charge density and its gradient. In order to extend the theoretical and computational advantages of τ-dependent meta-GGA functionals to large-scale DFT calculations on thousands of atoms, we have implemented support for τ-dependent meta-GGA functionals in the ONETEP program. In this paper we lay out the theoretical innovations necessary to implement τ-dependent meta-GGA functionals within ONETEP's linear-scaling formalism. We present expressions for the gradient of the τ-dependent exchange-correlation energy, necessary for direct energy minimization. We also derive the forms of the τ-dependent exchange-correlation potential and kinetic energy density in terms of the strictly localized, self-consistently optimized orbitals used by ONETEP. To validate the numerical accuracy of our self-consistent meta-GGA implementation, we performed calculations using the B97M-V and PKZB meta-GGAs on a variety of small molecules. Using only a minimal basis set of self-consistently optimized local orbitals, we obtain energies in excellent agreement with large basis set calculations performed using other codes. Finally, to establish the linear-scaling computational cost and applicability of our approach to large-scale calculations, we present the outcome of self-consistent meta-GGA calculations on amyloid fibrils of increasing size, up to tens of thousands of atoms.
Efficient grid-based techniques for density functional theory

NASA Astrophysics Data System (ADS)

Rodriguez-Hernandez, Juan Ignacio

Understanding the chemical and physical properties of molecules and materials at a fundamental level often requires quantum-mechanical models for these substances' electronic structure. This type of many-body quantum mechanics calculation is computationally demanding, hindering its application to substances with more than a few hundred atoms. The supreme goal of much research in quantum chemistry, and the topic of this dissertation, is to develop more efficient computational algorithms for electronic structure calculations. In particular, this dissertation develops two new numerical integration techniques for computing molecular and atomic properties within conventional Kohn-Sham Density Functional Theory (KS-DFT) of molecular electronic structure. The first of these grid-based techniques is based on the transformed sparse grid construction. In this construction, a sparse grid is generated in the unit cube and then mapped to real space according to the pro-molecular density using the conditional distribution transformation. The transformed sparse grid was implemented in the program deMon2k, where it is used as the numerical integrator for the exchange-correlation energy and potential in the KS-DFT procedure. We tested our grid by computing ground state energies, equilibrium geometries, and atomization energies. The accuracy of these test calculations shows that our grid is more efficient than some previous integration methods: our grids use fewer points to obtain the same accuracy. The transformed sparse grids were also tested for integrating, interpolating and differentiating in different dimensions (n = 1,2,3,6). The second technique is a grid-based method for computing atomic properties within QTAIM. It was also implemented in deMon2k. The performance of the method was tested by computing QTAIM atomic energies, charges, dipole moments, and quadrupole moments. For medium accuracy, our method is the fastest one we know of.
Laser deposition of resonant silicon nanoparticles on perovskite for photoluminescence enhancement

NASA Astrophysics Data System (ADS)

Tiguntseva, E. Y.; Zalogina, A. S.; Milichko, V. A.; Zuev, D. A.; Omelyanovich, M. M.; Ishteev, A.; Cerdan Pasaran, A.; Haroldson, R.; Makarov, S. V.; Zakhidov, A. A.

2017-11-01

Hybrid lead halide perovskite based optoelectronics is a promising area of modern technologies yielding excellent characteristics of light emitting diodes and lasers as well as high efficiencies of photovoltaic devices. However, the efficiency of perovskite based devices holds potential for further improvement. Here we demonstrate high photoluminescence efficiency of perovskite thin films via deposition of resonant silicon nanoparticles on their surface. The deposited nanoparticles have a number of advantages over their plasmonic counterparts, which were applied in previous studies. We show experimentally an increase of the photoluminescence of perovskite film with the silicon nanoparticles by 150 % as compared to the film without the nanoparticles. The results are supported by numerical calculations. Our results pave the way to high throughput implementation of low loss resonant nanoparticles in order to create highly effective perovskite based optoelectronic devices.
Semi-implicit finite difference methods for three-dimensional shallow water flow

USGS Publications Warehouse

Casulli, Vincenzo; Cheng, Ralph T.

1992-01-01

A semi-implicit finite difference method for the numerical solution of three-dimensional shallow water flows is presented and discussed. The governing equations are the primitive three-dimensional turbulent mean flow equations where the pressure distribution in the vertical has been assumed to be hydrostatic. In the method of solution a minimal degree of implicitness has been adopted in such a fashion that the resulting algorithm is stable and gives a maximal computational efficiency at a minimal computational cost. At each time step the numerical method requires the solution of one large linear system which can be formally decomposed into a set of small three-diagonal systems coupled with one five-diagonal system. All these linear systems are symmetric and positive definite. Thus the existence and uniqueness of the numerical solution are assured. When only one vertical layer is specified, this method reduces as a special case to a semi-implicit scheme for solving the corresponding two-dimensional shallow water equations. The resulting two- and three-dimensional algorithm has been shown to be fast, accurate and mass-conservative and can also be applied to simulate flooding and drying of tidal mud-flats in conjunction with three-dimensional flows. Furthermore, the resulting algorithm is fully vectorizable for an efficient implementation on modern vector computers.
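The small three-diagonal systems mentioned in this abstract are the kind of problem the Thomas algorithm solves in O(n) operations. The sketch below is a generic textbook version, not the authors' code; the function name solve_tridiagonal and the argument layout are assumptions made for illustration. It does no pivoting, which is safe for symmetric positive definite systems such as those described.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower: sub-diagonal (length n-1), lower[i-1] sits in row i
    diag:  main diagonal (length n)
    upper: super-diagonal (length n-1), upper[i] sits in row i
    rhs:   right-hand side (length n)
    No pivoting is performed, so the matrix should be symmetric
    positive definite or diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    if n > 1:
        c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the sub-diagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Because each system is independent of the others, many such solves can run side by side, which is consistent with the vectorizability the abstract points out.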
Extending semi-numeric reionization models to the first stars and galaxies

NASA Astrophysics Data System (ADS)

Koh, Daegene; Wise, John H.

2018-03-01

Semi-numeric methods have made it possible to efficiently model the epoch of reionization (EoR). While most implementations involve a reduction to a simple three-parameter model, we introduce a new mass-dependent ionizing efficiency parameter that folds in physical parameters that are constrained by the latest numerical simulations. This new parametrization enables the effective modelling of a broad range of host halo masses containing ionizing sources, extending from the smallest Population III host haloes with M ∼ 10^6 M⊙, which are often ignored, to the rarest cosmic peaks with M ∼ 10^12 M⊙ during EoR. We compare the resulting ionizing histories with a typical three-parameter model and also compare with the latest constraints from the Planck mission. Our model results in an optical depth due to Thomson scattering, τe = 0.057, that is consistent with Planck. The largest difference in our model is shown in the resulting bubble size distributions that peak at lower characteristic sizes and are broadened. We also consider the uncertainties of the various physical parameters, and comparing the resulting ionizing histories broadly disfavours a small contribution from galaxies. The smallest haloes cease a meaningful contribution to the ionizing photon budget after z = 10, implying that they play a role in determining the start of EoR and little else.
Efficient numerical evaluation of Feynman integrals

NASA Astrophysics Data System (ADS)

Li, Zhao; Wang, Jian; Yan, Qi-Shu; Zhao, Xiaoran

2016-03-01

Feynman loop integrals are a key ingredient for the calculation of higher order radiation effects, and are responsible for reliable and accurate theoretical prediction. We improve the efficiency of numerical integration in sector decomposition by implementing a quasi-Monte Carlo method associated with the CUDA/GPU technique. For demonstration we present the results of several Feynman integrals up to two loops in both Euclidean and physical kinematic regions in comparison with those obtained from FIESTA3. It is shown that both planar and non-planar two-loop master integrals in the physical kinematic region can be evaluated in less than half a minute with accuracy, which makes the direct numerical approach viable for precise investigation of higher order effects in multi-loop processes, e.g. the next-to-leading order QCD effect in Higgs pair production via gluon fusion with a finite top quark mass. Supported by the Natural Science Foundation of China (11305179 11475180), Youth Innovation Promotion Association, CAS, IHEP Innovation (Y4545170Y2), State Key Lab for Electronics and Particle Detectors, Open Project Program of State Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, China (Y4KF061CJ1), Cluster of Excellence Precision Physics, Fundamental Interactions and Structure of Matter (PRISMA-EXC 1098)
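The quasi-Monte Carlo idea in this abstract can be shown with a small sketch. The abstract does not say which low-discrepancy point set the authors use, so the Halton sequence here is an assumption swapped in for illustration, and the function names are hypothetical. The sketch runs serially on the CPU and integrates over the unit square only.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in a given base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += fraction * (index % base)
        index //= base
        fraction /= base
    return result


def halton_point(index, bases=(2, 3)):
    """Return the index-th Halton point; one coprime base per dimension."""
    return tuple(radical_inverse(index, b) for b in bases)


def qmc_integrate(f, n_points, bases=(2, 3)):
    """Estimate the integral of f over the unit hypercube.

    Uses the first n_points Halton points (starting at index 1) in place
    of the pseudo-random samples of ordinary Monte Carlo.
    """
    total = 0.0
    for i in range(1, n_points + 1):
        total += f(*halton_point(i, bases))
    return total / n_points


# Example: the integral of x*y over the unit square is exactly 0.25.
estimate = qmc_integrate(lambda x, y: x * y, 4096)
```

Each point depends only on its index, so the evaluations of f are independent of one another. That independence is what makes this kind of integration a natural fit for GPU execution.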
Computing Generalized Matrix Inverse on Spiking Neural Substrate

PubMed Central

Shukla, Rohit; Khoram, Soroosh; Jorgensen, Erik; Li, Jing; Lipasti, Mikko; Wright, Stephen

2018-01-01

Emerging neural hardware substrates, such as IBM's TrueNorth Neurosynaptic System, can provide an appealing platform for deploying numerical algorithms. For example, a recurrent Hopfield neural network can be used to find the Moore-Penrose generalized inverse of a matrix, thus enabling a broad class of linear optimizations to be solved efficiently, at low energy cost. However, deploying numerical algorithms on hardware platforms that severely limit the range and precision of representation for numeric quantities can be quite challenging. This paper discusses these challenges and proposes a rigorous mathematical framework for reasoning about range and precision on such substrates. The paper derives techniques for normalizing inputs and properly quantizing synaptic weights originating from arbitrary systems of linear equations, so that solvers for those systems can be implemented in a provably correct manner on hardware-constrained neural substrates. The analytical model is empirically validated on the IBM TrueNorth platform, and results show that the guarantees provided by the framework for range and precision hold under experimental conditions. Experiments with optical flow demonstrate the energy benefits of deploying a reduced-precision and energy-efficient generalized matrix inverse engine on the IBM TrueNorth platform, reflecting 10× to 100× improvement over FPGA and ARM core baselines. PMID:29593483
DNS of Low-Pressure Turbine Cascade Flows with Elevated Inflow Turbulence Using a Discontinuous-Galerkin Spectral-Element Method

NASA Technical Reports Server (NTRS)

Garai, Anirban; Diosady, Laslo T.; Murman, Scott M.; Madavan, Nateri K.

2016-01-01

Recent progress towards developing a new computational capability for accurate and efficient high-fidelity direct numerical simulation (DNS) and large-eddy simulation (LES) of turbomachinery is described. This capability is based on an entropy-stable Discontinuous-Galerkin spectral-element approach that extends to arbitrarily high orders of spatial and temporal accuracy, and is implemented in a computationally efficient manner on a modern high performance computer architecture. An inflow turbulence generation procedure based on a linear forcing approach has been incorporated in this framework and DNS conducted to study the effect of inflow turbulence on the suction-side separation bubble in low-pressure turbine (LPT) cascades. The T106 series of airfoil cascades in both lightly (T106A) and highly loaded (T106C) configurations at exit isentropic Reynolds numbers of 60,000 and 80,000, respectively, are considered. The numerical simulations are performed using 8th-order accurate spatial and 4th-order accurate temporal discretization. The changes in separation bubble topology due to elevated inflow turbulence are captured by the present method and the physical mechanisms leading to the changes are explained. The present results are in good agreement with prior numerical simulations but some expected discrepancies with the experimental data for the T106C case are noted and discussed.
Cost efficiency of the non-associative flow rule simulation of an industrial component

NASA Astrophysics Data System (ADS)

Galdos, Lander; de Argandoña, Eneko Saenz; Mendiguren, Joseba

2017-10-01

Over the last decade, the metal forming industry has become more and more competitive. In this context, FEM modeling has become a primary tool of information for the component and process design. Numerous researchers have focused on improving the accuracy of the material models implemented in the FEM in order to improve the efficiency of the simulations. Aimed at increasing the efficiency of the anisotropic behavior modelling, in recent years the use of non-associative flow rule models (NAFR) has been presented as an alternative to the classic associative flow rule models (AFR). In this work, the cost efficiency of the used flow rule model has been numerically analyzed by simulating an industrial drawing operation with two different models of the same degree of flexibility: one AFR model and one NAFR model. From the present study, it has been concluded that the flow rule has a negligible influence on the final drawing prediction; this is mainly driven by the model parameter identification procedure. Even though the NAFR formulation is complex when compared to the AFR, the present study shows that the total simulation time while using explicit FE solvers has been reduced without loss of accuracy. Furthermore, NAFR formulations have an advantage over AFR formulations in parameter identification because the formulation decouples the yield stress and the Lankford coefficients.
Efficient Hardware Implementation of the Horn-Schunck Algorithm for High-Resolution Real-Time Dense Optical Flow Sensor

PubMed Central

Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek

2014-01-01

This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1,920 × 1,080 @ 60 fps) possible. The structure of the optical flow module as well as pre- and post-filtering blocks and a flow reliability computation unit is described in detail. Three versions of optical flow modules, with different numerical precision, working frequency and result accuracy, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from the Middlebury University optical flow dataset. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real-time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems. PMID:24526303
Efficient hardware implementation of the Horn-Schunck algorithm for high-resolution real-time dense optical flow sensor.

PubMed

Komorkiewicz, Mateusz; Kryjak, Tomasz; Gorgon, Marek

2014-02-12

This article presents an efficient hardware implementation of the Horn-Schunck algorithm that can be used in an embedded optical flow sensor. An architecture is proposed that realises the iterative Horn-Schunck algorithm in a pipelined manner. This modification achieves a data throughput of 175 MPixels/s and makes processing of a Full HD video stream (1,920 × 1,080 @ 60 fps) possible. The structure of the optical flow module as well as pre- and post-filtering blocks and a flow reliability computation unit is described in detail. Three versions of optical flow modules, with different numerical precision, working frequency and result accuracy, are proposed. The errors caused by switching from floating- to fixed-point computations are also evaluated. The described architecture was tested on popular sequences from the Middlebury University optical flow dataset. It achieves state-of-the-art results among hardware implementations of single scale methods. The designed fixed-point architecture achieves performance of 418 GOPS with power efficiency of 34 GOPS/W. The proposed floating-point module achieves 103 GFLOPS, with power efficiency of 24 GFLOPS/W. Moreover, a 100 times speedup compared to a modern CPU with SIMD support is reported. A complete, working vision system realized on a Xilinx VC707 evaluation board is also presented. It is able to compute optical flow for a Full HD video stream received from an HDMI camera in real-time. The obtained results prove that FPGA devices are an ideal platform for embedded vision systems.
Numerical and experimental study on the steady cone-jet mode of electro-centrifugal spinning

NASA Astrophysics Data System (ADS)

Hashemi, Ali Reza; Pishevar, Ahmad Reza; Valipouri, Afsaneh; Pǎrǎu, Emilian I.

2018-01-01

This study focuses on a numerical investigation of an initial stable jet through the air-sealed electro-centrifugal spinning process, which is known as a viable method for the mass production of nanofibers. A liquid jet undergoing electric and centrifugal forces, as well as other forces, first travels in a stable trajectory and then goes through an unstable curled path to the collector. In numerical modeling, hydrodynamic equations have been solved using the perturbation method, and the boundary integral method has been implemented to efficiently solve the electric potential equation. Hydrodynamic equations have been coupled with the electric field using stress boundary conditions at the fluid-fluid interface. Perturbation equations were discretized by a second order finite difference method, and the Newton method was implemented to solve the discretized non-linear system. Also, the boundary element method was utilized to solve electrostatic equations. In the theoretical study, the fluid was described as a leaky dielectric with charges only on the surface of the jet traveling in dielectric air. The effect of the electric field induced around the nozzle tip on the jet instability and trajectory deviation was also experimentally studied through plate-plate geometry as well as point-plate geometry. It was numerically found that the centrifugal force prevails over the electric force as the rotational speed increases. Therefore, the alteration of the applied voltage does not significantly affect the jet thinning profile or the jet trajectory.
Efficient Numerical Diagonalization of Hermitian 3 × 3 Matrices

NASA Astrophysics Data System (ADS)

Kopp, Joachim

A very common problem in science is the numerical diagonalization of symmetric or hermitian 3 × 3 matrices. Since standard "black box" packages may be too inefficient if the number of matrices is large, we study several alternatives. We consider optimized implementations of the Jacobi, QL, and Cuppen algorithms and compare them with an analytical method relying on Cardano's formula for the eigenvalues and on vector cross products for the eigenvectors. Jacobi is the most accurate, but also the slowest method, while QL and Cuppen are good general purpose algorithms. The analytical algorithm outperforms the others by more than a factor of 2, but becomes inaccurate or may even fail completely if the matrix entries differ greatly in magnitude. This can mostly be circumvented by using a hybrid method, which falls back to QL if conditions are such that the analytical calculation might become too inaccurate. For all algorithms, we give an overview of the underlying mathematical ideas, and present detailed benchmark results. C and Fortran implementations of our code are available for download from .
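The analytical route described in this abstract can be sketched for the eigenvalues of a real symmetric 3 × 3 matrix. The sketch uses the trigonometric form of the cubic solution, a commonly used variant of Cardano's formula. It is not the author's C or Fortran code, the function name sym3_eigenvalues is hypothetical, and eigenvectors are omitted.

```python
import math


def sym3_eigenvalues(a):
    """Eigenvalues of a real symmetric 3x3 matrix, in ascending order.

    Closed-form solution of the characteristic cubic (trigonometric
    variant of Cardano's formula). Fast, but it can lose accuracy when
    the matrix entries differ greatly in magnitude.
    """
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:
        # Already diagonal.
        return sorted([a[0][0], a[1][1], a[2][2]])
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0  # mean eigenvalue
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # b = (a - q*I) / p is traceless with eigenvalues in [-2, 2].
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (
        b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
        - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
        + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0])
    )
    # Clamp against rounding so acos stays in its domain.
    r = max(-1.0, min(1.0, det_b / 2.0))
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest  # trace is the sum of eigenvalues
    return [smallest, middle, largest]
```

The clamp on r marks where such closed-form methods are fragile: nearly degenerate eigenvalues push r toward ±1, where acos is ill-conditioned. That sensitivity is the kind of accuracy loss that motivates the hybrid fallback to QL described above.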

Adaptive angular-velocity Vold-Kalman filter order tracking - Theoretical basis, numerical implementation and parameter investigation

NASA Astrophysics Data System (ADS)

Pan, M.-Ch.; Chu, W.-Ch.; Le, Duc-Do

2016-12-01

The paper presents an alternative Vold-Kalman filter order tracking (VKF_OT) method, i.e. adaptive angular-velocity VKF_OT technique, to extract and characterize order components in an adaptive manner for the condition monitoring and fault diagnosis of rotary machinery. The order/spectral waveforms to be tracked can be recursively solved by using Kalman filter based on the one-step state prediction. The paper comprises theoretical derivation of computation scheme, numerical implementation, and parameter investigation. Comparisons of the adaptive VKF_OT scheme with two other ones are performed through processing synthetic signals of designated order components. Processing parameters such as the weighting factor and the correlation matrix of process noise, and data conditions like the sampling frequency, which influence tracking behavior, are explored. The merits such as adaptive processing nature and computation efficiency brought by the proposed scheme are addressed although the computation was performed in off-line conditions. The proposed scheme can simultaneously extract multiple spectral components, and effectively decouple close and crossing orders associated with multi-axial reference rotating speeds.
Implementing N-quantum phase gate via circuit QED with qubit-qubit interaction

NASA Astrophysics Data System (ADS)

Said, T.; Chouikh, A.; Essammouni, K.; Bennai, M.

2016-02-01

We propose a method for realizing a quantum phase gate of one qubit simultaneously controlling N target qubits based on the qubit-qubit interaction. We show how to implement the proposed gate with one transmon qubit simultaneously controlling N transmon qubits in a circuit QED driven by a strong microwave field. In our scheme, the operation time of this phase gate is independent of the number N of qubits. In addition, this gate can be realized on a nanosecond timescale, much shorter than the decoherence and dephasing times, which are both on the microsecond scale. Numerical simulation of the occupation probabilities of the second excited level shows that the scheme could be achieved efficiently within current technology.
Equations of motion for a flexible spacecraft-lumped parameter idealization

NASA Technical Reports Server (NTRS)

Storch, Joel; Gates, Stephen

1982-01-01

The equations of motion for a flexible vehicle capable of arbitrary translational and rotational motions in inertial space accompanied by small elastic deformations are derived in an unabridged form. The vehicle is idealized as consisting of a single rigid body with an ensemble of mass particles interconnected by massless elastic structure. The internal elastic restoring forces are quantified in terms of a stiffness matrix. A transformation and truncation of elastic degrees of freedom is made in the interest of numerical integration efficiency. Deformation dependent terms are partitioned into a hierarchy of significance. The final set of motion equations is brought to a fully assembled first order form suitable for direct digital implementation. A FORTRAN program implementing the equations is given and its salient features described.
LS-DYNA Implementation of Polymer Matrix Composite Model Under High Strain Rate Impact

NASA Technical Reports Server (NTRS)

Zheng, Xia-Hua; Goldberg, Robert K.; Binienda, Wieslaw K.; Roberts, Gary D.

2003-01-01

A recently developed constitutive model is implemented into LS-DYNA as a user defined material model (UMAT) to characterize the nonlinear strain rate dependent behavior of polymers. By utilizing this model within a micromechanics technique based on a laminate analogy, an algorithm to analyze the strain rate dependent, nonlinear deformation of a fiber reinforced polymer matrix composite is then developed as a UMAT to simulate the response of these composites under high strain rate impact. The models are designed for shell elements in order to ensure computational efficiency. Experimental and numerical stress-strain curves are compared for two representative polymers and a representative polymer matrix composite, with the analytical model predicting the experimental response reasonably well.
Parallel implementation of a Lagrangian-based model on an adaptive mesh in C++: Application to sea-ice

NASA Astrophysics Data System (ADS)

Samaké, Abdoulaye; Rampal, Pierre; Bouillon, Sylvain; Ólason, Einar

2017-12-01

We present a parallel implementation framework for a new dynamic/thermodynamic sea-ice model, called neXtSIM, based on the Elasto-Brittle rheology and using an adaptive mesh. The spatial discretisation of the model is done using the finite-element method. The temporal discretisation is semi-implicit and the advection is achieved using either a pure Lagrangian scheme or an Arbitrary Lagrangian Eulerian scheme (ALE). The parallel implementation presented here focuses on the distributed-memory approach using the message-passing library MPI. The efficiency and the scalability of the parallel algorithms are illustrated by the numerical experiments performed using up to 500 processor cores of a cluster computing system. The performance obtained by the proposed parallel implementation of the neXtSIM code is shown to be sufficient to perform simulations for state-of-the-art sea ice forecasting and geophysical process studies over geographical domains of several million square kilometers, like the Arctic region.
The numerical simulation tool for the MAORY multiconjugate adaptive optics system

NASA Astrophysics Data System (ADS)

Arcidiacono, C.; Schreiber, L.; Bregoli, G.; Diolaiti, E.; Foppiani, I.; Agapito, G.; Puglisi, A.; Xompero, M.; Oberti, S.; Cosentino, G.; Lombini, M.; Butler, R. C.; Ciliegi, P.; Cortecchia, F.; Patti, M.; Esposito, S.; Feautrier, P.

2016-07-01

The Multiconjugate Adaptive Optics RelaY (MAORY) is an Adaptive Optics module to be mounted on the ESO European-Extremely Large Telescope (E-ELT). It is a hybrid Natural and Laser Guide System that will perform the correction of the atmospheric turbulence volume above the telescope feeding the Multi-AO Imaging Camera for Deep Observations Near Infrared spectro-imager (MICADO). We developed an end-to-end Monte-Carlo adaptive optics simulation tool to investigate the performance of MAORY and the calibration, acquisition, and operation strategies. MAORY will implement Multiconjugate Adaptive Optics combining Laser Guide Stars (LGS) and Natural Guide Stars (NGS) measurements. The simulation tool implements the various aspects of MAORY in an end-to-end fashion. The code has been developed using IDL and uses libraries in C++ and CUDA for efficiency improvements. Here we recall the code architecture, and we describe the modeled instrument components and the control strategies implemented in the code.
A high-order Lagrangian-decoupling method for the incompressible Navier-Stokes equations

NASA Technical Reports Server (NTRS)

Ho, Lee-Wing; Maday, Yvon; Patera, Anthony T.; Ronquist, Einar M.

1989-01-01

A high-order Lagrangian-decoupling method is presented for the unsteady convection-diffusion and incompressible Navier-Stokes equations. The method is based upon: (1) Lagrangian variational forms that reduce the convection-diffusion equation to a symmetric initial value problem; (2) implicit high-order backward-differentiation finite-difference schemes for integration along characteristics; (3) finite element or spectral element spatial discretizations; and (4) mesh-invariance procedures and high-order explicit time-stepping schemes for deducing function values at convected space-time points. The method improves upon previous finite element characteristic methods through the systematic and efficient extension to high order accuracy, and the introduction of a simple structure-preserving characteristic-foot calculation procedure which is readily implemented on modern architectures. The new method is significantly more efficient than explicit-convection schemes for the Navier-Stokes equations due to the decoupling of the convection and Stokes operators and the attendant increase in temporal stability. Numerous numerical examples are given for the convection-diffusion and Navier-Stokes equations for the particular case of a spectral element spatial discretization.
Parallel numerical modeling of hybrid-dimensional compositional non-isothermal Darcy flows in fractured porous media

NASA Astrophysics Data System (ADS)

Xing, F.; Masson, R.; Lopez, S.

2017-09-01

This paper introduces a new discrete fracture model accounting for non-isothermal compositional multiphase Darcy flows and complex networks of fractures with intersecting, immersed and non-immersed fractures. The so called hybrid-dimensional model using a 2D model in the fractures coupled with a 3D model in the matrix is first derived rigorously starting from the equi-dimensional matrix fracture model. Then, it is discretized using a fully implicit time integration combined with the Vertex Approximate Gradient (VAG) finite volume scheme which is adapted to polyhedral meshes and anisotropic heterogeneous media. The fully coupled systems are assembled and solved in parallel using the Single Program Multiple Data (SPMD) paradigm with one layer of ghost cells. This strategy allows for a local assembly of the discrete systems. An efficient preconditioner is implemented to solve the linear systems at each time step and each Newton type iteration of the simulation. The numerical efficiency of our approach is assessed on different meshes, fracture networks, and physical settings in terms of parallel scalability, nonlinear convergence and linear convergence.
Nonlinear mechanics of non-rigid origami: an efficient computational approach

NASA Astrophysics Data System (ADS)

Liu, K.; Paulino, G. H.

2017-10-01

Origami-inspired designs possess attractive applications to science and engineering (e.g. deployable, self-assembling, adaptable systems). The special geometric arrangement of panels and creases gives rise to unique mechanical properties of origami, such as reconfigurability, making origami designs well suited for tunable structures. Although often being ignored, origami structures exhibit additional soft modes beyond rigid folding due to the flexibility of thin sheets that further influence their behaviour. Actual behaviour of origami structures usually involves significant geometric nonlinearity, which amplifies the influence of additional soft modes. To investigate the nonlinear mechanics of origami structures with deformable panels, we present a structural engineering approach for simulating the nonlinear response of non-rigid origami structures. In this paper, we propose a fully nonlinear, displacement-based implicit formulation for performing static/quasi-static analyses of non-rigid origami structures based on `bar-and-hinge' models. The formulation itself leads to an efficient and robust numerical implementation. Agreement between real models and numerical simulations demonstrates the ability of the proposed approach to capture key features of origami behaviour.
Nonlinear mechanics of non-rigid origami: an efficient computational approach.

PubMed

Liu, K; Paulino, G H

2017-10-01

Origami-inspired designs possess attractive applications to science and engineering (e.g. deployable, self-assembling, adaptable systems). The special geometric arrangement of panels and creases gives rise to unique mechanical properties of origami, such as reconfigurability, making origami designs well suited for tunable structures. Although often being ignored, origami structures exhibit additional soft modes beyond rigid folding due to the flexibility of thin sheets that further influence their behaviour. Actual behaviour of origami structures usually involves significant geometric nonlinearity, which amplifies the influence of additional soft modes. To investigate the nonlinear mechanics of origami structures with deformable panels, we present a structural engineering approach for simulating the nonlinear response of non-rigid origami structures. In this paper, we propose a fully nonlinear, displacement-based implicit formulation for performing static/quasi-static analyses of non-rigid origami structures based on 'bar-and-hinge' models. The formulation itself leads to an efficient and robust numerical implementation. Agreement between real models and numerical simulations demonstrates the ability of the proposed approach to capture key features of origami behaviour.
622-Mbps Orthogonal Frequency Division Multiplexing (OFDM) Digital Modem Implemented

NASA Technical Reports Server (NTRS)

Kifle, Muli; Bizon, Thomas P.; Nguyen, Nam T.; Tran, Quang K.; Mortensen, Dale J.

2002-01-01

Future generation space communications systems feature significantly higher data rates and relatively smaller frequency spectrum allocations than systems currently deployed. This requires the application of bandwidth- and power-efficient signal transmission techniques. There are a number of approaches to implementing such techniques, including analog, digital, mixed-signal, single-channel, or multichannel systems. In general, the digital implementations offer more advantages; however, a fully digital implementation is very difficult because of the very high clock speeds required. Multichannel techniques are used to reduce the sampling rate. One such technique, multicarrier modulation, divides the data into a number of low-rate channels that are stacked in frequency. Orthogonal frequency division multiplexing (OFDM), a form of multicarrier modulation, is being proposed for numerous systems, including mobile wireless and digital subscriber link communication systems. In response to this challenge, NASA Glenn Research Center's Communication Technology Division has developed an OFDM digital modem (modulator and demodulator) with an aggregate information throughput of 622 Mbps. The basic OFDM waveform is constructed by dividing an incoming data stream into four channels, each using either 16-ary quadrature amplitude modulation (16-QAM) or 8-phase shift keying (8-PSK). An efficient implementation for an OFDM architecture is being achieved using the combination of a discrete Fourier transform (DFT) at the transmitter to digitally stack the individual carriers, inverse DFT at the receiver to perform the frequency translations, and a polyphase filter to facilitate the pulse shaping.
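The four-channel DFT structure described in this abstract can be illustrated with a minimal sketch. It assumes one 16-QAM symbol per channel, a Gray-coded level mapping chosen here for illustration, and no pulse shaping (the polyphase filter is omitted); the function names are hypothetical and are not taken from the NASA modem. The DFT is placed at the transmitter and the inverse DFT at the receiver, as the abstract states.

```python
import cmath

# Gray-coded amplitude levels for one axis of 16-QAM (assumed mapping, 2 bits per axis).
LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}
BITS = {level: pair for pair, level in LEVELS.items()}


def qam16_map(bits):
    """Map 4 bits to one 16-QAM constellation point."""
    return complex(LEVELS[tuple(bits[0:2])], LEVELS[tuple(bits[2:4])])


def nearest_level(x):
    """Return the constellation level closest to x."""
    return min(BITS, key=lambda level: abs(level - x))


def qam16_demap(symbol):
    """Nearest-level decision on each axis, returning 4 bits."""
    return list(BITS[nearest_level(symbol.real)] + BITS[nearest_level(symbol.imag)])


def dft(x, inverse=False):
    """Direct O(n^2) DFT; the inverse transform includes the 1/n factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out


# Four channels, four bits each.
bits = [1, 0, 0, 1,  0, 0, 1, 1,  1, 1, 0, 0,  0, 1, 1, 0]
symbols = [qam16_map(bits[i:i + 4]) for i in range(0, len(bits), 4)]

tx_block = dft(symbols)                    # transmitter: DFT stacks the four carriers
rx_symbols = dft(tx_block, inverse=True)   # receiver: inverse DFT separates them again
rx_bits = [b for s in rx_symbols for b in qam16_demap(s)]
```

Over an ideal channel the received bits equal the transmitted ones, because the two transforms are exact inverses of each other; a real modem would add pulse shaping, synchronization and channel effects between the two steps.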
Development of a linearized unsteady Euler analysis for turbomachinery blade rows

NASA Technical Reports Server (NTRS)

Verdon, Joseph M.; Montgomery, Matthew D.; Kousen, Kenneth A.

1995-01-01

A linearized unsteady aerodynamic analysis for axial-flow turbomachinery blading is described in this report. The linearization is based on the Euler equations of fluid motion and is motivated by the need for an efficient aerodynamic analysis that can be used in predicting the aeroelastic and aeroacoustic responses of blade rows. The field equations and surface conditions required for inviscid, nonlinear and linearized, unsteady aerodynamic analyses of three-dimensional flow through a single blade row operating within a cylindrical duct are derived. An existing numerical algorithm for determining time-accurate solutions of the nonlinear unsteady flow problem is described, and a numerical model, based upon this nonlinear flow solver, is formulated for the first-harmonic linear unsteady problem. The linearized aerodynamic and numerical models have been implemented into a first-harmonic unsteady flow code, called LINFLUX. At present this code applies only to two-dimensional flows, but an extension to three dimensions is planned as future work. The three-dimensional aerodynamic and numerical formulations are described in this report. Numerical results for two-dimensional unsteady cascade flows, excited by prescribed blade motions and prescribed aerodynamic disturbances at inlet and exit, are also provided to illustrate the present capabilities of the LINFLUX analysis.
Numerical Algorithms for Precise and Efficient Orbit Propagation and Positioning

NASA Astrophysics Data System (ADS)

Bradley, Ben K.

Motivated by the growing space catalog and the demands for precise orbit determination with shorter latency for science and reconnaissance missions, this research improves the computational performance of orbit propagation through more efficient and precise numerical integration and frame transformation implementations. Propagation of satellite orbits is required for astrodynamics applications including mission design, orbit determination in support of operations and payload data analysis, and conjunction assessment. Each of these applications has somewhat different requirements in terms of accuracy, precision, latency, and computational load. This dissertation develops procedures to achieve various levels of accuracy while minimizing computational cost for diverse orbit determination applications. This is done by addressing two aspects of orbit determination: (1) numerical integration used for orbit propagation and (2) precise frame transformations necessary for force model evaluation and station coordinate rotations. This dissertation describes a recently developed method for numerical integration, dubbed Bandlimited Collocation Implicit Runge-Kutta (BLC-IRK), and compares its efficiency in propagating orbits to existing techniques commonly used in astrodynamics. The BLC-IRK scheme uses generalized Gaussian quadratures for bandlimited functions. It requires significantly fewer force function evaluations than explicit Runge-Kutta schemes and approaches the efficiency of the 8th-order Gauss-Jackson multistep method. Converting between the Geocentric Celestial Reference System (GCRS) and International Terrestrial Reference System (ITRS) is necessary for many applications in astrodynamics, such as orbit propagation, orbit determination, and analyzing geoscience data from satellite missions.
This dissertation provides simplifications to the Celestial Intermediate Origin (CIO) transformation scheme and Earth orientation parameter (EOP) storage for use in positioning and orbit propagation, yielding savings in computation time and memory. Orbit propagation and position transformation simulations are analyzed to generate a complete set of recommendations for performing the ITRS/GCRS transformation for a wide range of needs, encompassing real-time on-board satellite operations and precise post-processing applications. In addition, a complete derivation of the ITRS/GCRS frame transformation time-derivative is detailed for use in velocity transformations between the GCRS and ITRS and is applied to orbit propagation in the rotating ITRS. EOP interpolation methods and ocean tide corrections are shown to impact the ITRS/GCRS transformation accuracy at the level of 5 cm and 20 cm on the surface of the Earth and at the Global Positioning System (GPS) altitude, respectively. The precession-nutation and EOP simplifications yield maximum propagation errors of approximately 2 cm and 1 m after 15 minutes and 6 hours in low-Earth orbit (LEO), respectively, while reducing computation time and memory usage. Finally, for orbit propagation in the ITRS, a simplified scheme is demonstrated that yields propagation errors under 5 cm after 15 minutes in LEO. This approach is beneficial for orbit determination based on GPS measurements. We conclude with a summary of recommendations on EOP usage and bias-precession-nutation implementations for achieving a wide range of transformation and propagation accuracies at several altitudes. This comprehensive set of recommendations allows satellite operators, astrodynamicists, and scientists to make informed decisions when choosing the best implementation for their application, balancing accuracy and computational complexity.
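As a rough illustration of one step of the CIO-based ITRS/GCRS transformation mentioned above, the sketch below computes the Earth rotation angle from a UT1 Julian date with the standard IAU 2000 linear expression and applies the corresponding rotation about the z-axis. It deliberately ignores precession-nutation and polar motion, so it is not the dissertation's simplified scheme; the function names are hypothetical.

```python
import math


def earth_rotation_angle(jd_ut1):
    """Earth rotation angle in radians, in [0, 2*pi), for a UT1 Julian date (IAU 2000)."""
    tu = jd_ut1 - 2451545.0
    fraction = (0.7790572732640 + 1.00273781191135448 * tu) % 1.0
    return 2.0 * math.pi * fraction


def rotate_z(angle, v):
    """Frame rotation R3(angle) applied to a 3-vector v."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x + s * y, -s * x + c * y, z)


# Rotate a position from the celestial intermediate frame to the terrestrial
# intermediate frame; precession-nutation and polar motion are neglected here.
era = earth_rotation_angle(2458000.5)
r_terrestrial = rotate_z(era, (7000.0, 0.0, 0.0))
```

Neglecting the remaining rotations is only acceptable when the accuracy requirements are loose; the abstract's centimetre-level error figures refer to the author's own simplifications, not to this sketch.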
Automatic numerical evaluation of vacancy-mediated transport for arbitrary crystals: Onsager coefficients in the dilute limit using a Green function approach

NASA Astrophysics Data System (ADS)

Trinkle, Dallas R.

2017-10-01

A general solution for vacancy-mediated diffusion in the dilute-vacancy/dilute-solute limit for arbitrary crystal structures is derived from the master equation. A general numerical approach to the vacancy lattice Green function reduces to the sum of a few analytic functions and numerical integration of a smooth function over the Brillouin zone for arbitrary crystals. The Dyson equation solves for the Green function in the presence of a solute with arbitrary but finite interaction range to compute the transport coefficients accurately, efficiently and automatically, including cases with very large differences in solute-vacancy exchange rates. The methodology takes advantage of the space group symmetry of a crystal to reduce the complexity of the matrix inversion in the Dyson equation. An open-source implementation of the algorithm is available, and numerical results are presented for the convergence of the integration error of the bare vacancy Green function, and tracer correlation factors for a variety of crystals including wurtzite (hexagonal diamond) and garnet.
Reconstructing Householder vectors from Tall-Skinny QR

DOE PAGES

Ballard, Grey Malone; Demmel, James; Grigori, Laura; ...

2015-08-05

The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the standard Householder algorithm for QR decomposition of matrices with many more rows than columns. However, TSQR produces a different representation of the orthogonal factor and therefore requires more software development to support the new representation. Further, implicitly applying the orthogonal factor to the trailing matrix in the context of factoring a square matrix is more complicated and costly than with the Householder representation. We show how to perform TSQR and then reconstruct the Householder vector representation with the same asymptotic communication efficiency and little extra computational cost. We demonstrate the high performance and numerical stability of this algorithm both theoretically and empirically. The new Householder reconstruction algorithm allows us to design more efficient parallel QR algorithms, with significantly lower latency cost compared to Householder QR and lower bandwidth and latency costs compared with the Communication-Avoiding QR (CAQR) algorithm. Experiments on supercomputers demonstrate the benefits of the communication cost improvements: in particular, our experiments show substantial improvements over tuned library implementations for tall-and-skinny matrices. Furthermore, we also provide algorithmic improvements to the Householder QR and CAQR algorithms, and we investigate several alternatives to the Householder reconstruction algorithm that sacrifice guarantees on numerical stability in some cases in order to obtain higher performance.
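The reduction at the heart of TSQR can be sketched in a few lines. The example below covers only the R factor with a single reduction level, and it uses modified Gram-Schmidt for the local factorizations purely to stay dependency-free (an assumption of this sketch; the paper works with Householder representations, and the reconstruction step it proposes is not shown). Function names are hypothetical.

```python
def qr_mgs(a):
    """Thin QR by modified Gram-Schmidt.

    a is a list of rows (m x n, m >= n, full column rank).
    Returns (q, r) with r upper triangular and a positive diagonal.
    """
    m, n = len(a), len(a[0])
    q = [[float(x) for x in row] for row in a]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = sum(q[i][j] ** 2 for i in range(m)) ** 0.5
        for i in range(m):
            q[i][j] /= r[j][j]
        for k in range(j + 1, n):
            r[j][k] = sum(q[i][j] * q[i][k] for i in range(m))
            for i in range(m):
                q[i][k] -= r[j][k] * q[i][j]
    return q, r


def tsqr_r(a, block_rows):
    """R factor of a tall matrix via one TSQR reduction level.

    Each row block is factored independently (this is the part that would run
    on separate processors); the small local R factors are stacked and
    factored once more.
    """
    stacked = []
    for start in range(0, len(a), block_rows):
        _, r_local = qr_mgs(a[start:start + block_rows])
        stacked.extend(r_local)
    _, r = qr_mgs(stacked)
    return r


a = [[float(i + 1), float((i + 1) ** 2)] for i in range(8)]   # 8 x 2 test matrix
r_tsqr = tsqr_r(a, block_rows=4)
r_direct = qr_mgs(a)[1]
```

Both routes give the same R because the stacked local factors have the same Gram matrix as the original matrix, and an R factor with positive diagonal is unique for a full-rank input. Only the small n x n factors need to be exchanged between blocks, which is where the communication savings come from.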
Framework to trade optimality for local processing in large-scale wavefront reconstruction problems.

PubMed

Haber, Aleksandar; Verhaegen, Michel

2016-11-15

We show that the minimum variance wavefront estimation problems permit localized approximate solutions, in the sense that the wavefront value at a point (excluding unobservable modes, such as the piston mode) can be approximated by a linear combination of the wavefront slope measurements in the point's neighborhood. This enables us to efficiently compute a wavefront estimate by performing a single sparse matrix-vector multiplication. Moreover, our results open the possibility for the development of wavefront estimators that can be easily implemented in a decentralized/distributed manner, and in which the estimate optimality can be easily traded for computational efficiency. We numerically validate our approach on Hudgin wavefront sensor geometries, and the results can be easily generalized to Fried geometries.
Towards developing robust algorithms for solving partial differential equations on MIMD machines

NASA Technical Reports Server (NTRS)

Saltz, Joel H.; Naik, Vijay K.

1988-01-01

Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined by implementing the algorithm on a simulated multiprocessor system.
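For orientation, the baseline that these methods extend can be sketched as follows: one backward-Euler step of the 1D heat equation solved by plain Jacobi sweeps. This is a serial, single-timestep illustration under assumed Dirichlet end values; the windowed multi-timestep iteration and the overlap of computation with communication proposed in the report are not reproduced here, and the function name is hypothetical.

```python
def implicit_heat_step_jacobi(u_old, r, sweeps=200):
    """One backward-Euler step of u_t = u_xx solved by Jacobi iteration.

    u_old holds the solution at the previous time level, with fixed end values;
    r = dt / dx**2. Each interior unknown satisfies
    (1 + 2r) * u[i] - r * (u[i-1] + u[i+1]) = u_old[i].
    """
    u = u_old[:]
    n = len(u)
    for _ in range(sweeps):
        new = u[:]
        for i in range(1, n - 1):
            # Jacobi: every update uses only values from the previous sweep.
            new[i] = (u_old[i] + r * (u[i - 1] + u[i + 1])) / (1.0 + 2.0 * r)
        u = new
    return u


u0 = [0.0, 0.0, 1.0, 0.0, 0.0]
u1 = implicit_heat_step_jacobi(u0, r=0.5)
```

Because each Jacobi update depends only on the previous sweep, all grid points can be updated concurrently, which is what makes the method a natural starting point for MIMD machines.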
Super-Nyquist shaping and processing technologies for high-spectral-efficiency optical systems

NASA Astrophysics Data System (ADS)

Jia, Zhensheng; Chien, Hung-Chang; Zhang, Junwen; Dong, Ze; Cai, Yi; Yu, Jianjun

2013-12-01

The implementations of super-Nyquist pulse generation, either in the digital domain using a digital-to-analog converter (DAC) or with an optical filter at the transmitter side, are introduced. Three corresponding signal processing algorithms at the receiver are presented and compared for high spectral-efficiency (SE) optical systems employing spectral prefiltering. These algorithms are designed to mitigate the inter-symbol-interference (ISI) and inter-channel-interference (ICI) impairments caused by the bandwidth constraint; they are 1-tap constant modulus algorithm (CMA) with 3-tap maximum likelihood sequence estimation (MLSE), regular CMA and digital filter with 2-tap MLSE, and constant multi-modulus algorithm (CMMA) with 2-tap MLSE. The principles and prefiltering tolerance are given through numerical and experimental results.
Towards developing robust algorithms for solving partial differential equations on MIMD machines

NASA Technical Reports Server (NTRS)

Saltz, J. H.; Naik, V. K.

1985-01-01

Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined by implementing the algorithm on a simulated multiprocessor system.
An efficient variable projection formulation for separable nonlinear least squares problems.

PubMed

Gan, Min; Li, Han-Xiong

2014-05-01

We consider in this paper a class of nonlinear least squares problems in which the model can be represented as a linear combination of nonlinear functions. The variable projection algorithm projects the linear parameters out of the problem, leaving the nonlinear least squares problems involving only the nonlinear parameters. To implement the variable projection algorithm more efficiently, we propose a new variable projection functional based on matrix decomposition. The advantage of the proposed formulation is that the size of the decomposed matrix may be much smaller than those of previous ones. The Levenberg-Marquardt algorithm using finite difference method is then applied to minimize the new criterion. Numerical results show that the proposed approach achieves significant reduction in computing time.
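The idea of projecting out the linear parameters can be shown on a toy separable model, y ≈ c0 + c1·exp(-a·t). In the sketch below the linear coefficients are eliminated with a closed-form 2x2 least-squares solve, and the remaining one-dimensional problem in a is minimized by a grid scan followed by golden-section refinement. That search is a stand-in chosen for brevity, not the Levenberg-Marquardt iteration or the matrix-decomposition functional of the paper; the model, data and function names are all assumptions.

```python
import math


def reduced_residual(a, t, y):
    """For fixed nonlinear parameter a, solve for c0, c1 by linear least squares.

    Returns (sum of squared residuals, c0, c1) for the model c0 + c1*exp(-a*t).
    """
    phi = [math.exp(-a * ti) for ti in t]
    n = len(t)
    s1 = sum(phi)
    s2 = sum(p * p for p in phi)
    b0 = sum(y)
    b1 = sum(p * yi for p, yi in zip(phi, y))
    det = n * s2 - s1 * s1                      # determinant of the normal matrix
    c0 = (s2 * b0 - s1 * b1) / det
    c1 = (n * b1 - s1 * b0) / det
    rss = sum((yi - c0 - c1 * p) ** 2 for p, yi in zip(phi, y))
    return rss, c0, c1


def fit_varpro(t, y, a_lo=0.1, a_hi=5.0, grid=50, iters=60):
    """Minimize the reduced functional over a only; c0, c1 follow from it."""
    f = lambda a: reduced_residual(a, t, y)[0]
    step = (a_hi - a_lo) / grid
    best = min(range(grid + 1), key=lambda k: f(a_lo + k * step))
    lo = a_lo + max(best - 1, 0) * step
    hi = a_lo + min(best + 1, grid) * step
    g = (math.sqrt(5.0) - 1.0) / 2.0            # golden-section ratio
    for _ in range(iters):
        x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if f(x1) < f(x2):
            hi = x2
        else:
            lo = x1
    a = 0.5 * (lo + hi)
    _, c0, c1 = reduced_residual(a, t, y)
    return a, c0, c1


t = [0.25 * k for k in range(17)]
y = [2.0 + 3.0 * math.exp(-1.3 * ti) for ti in t]   # noise-free synthetic data
a_hat, c0_hat, c1_hat = fit_varpro(t, y)
```

The optimizer only ever sees the single nonlinear parameter, which is the point of the variable projection approach: the search space shrinks from three parameters to one.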

Direct variational data assimilation algorithm for atmospheric chemistry data with transport and transformation model

NASA Astrophysics Data System (ADS)

Penenko, Alexey; Penenko, Vladimir; Nuterman, Roman; Baklanov, Alexander; Mahura, Alexander

2015-11-01

Atmospheric chemistry dynamics is studied with a convection-diffusion-reaction model. The numerical data assimilation algorithm presented is based on additive-averaged splitting schemes. It carries out "fine-grained" variational data assimilation on the separate splitting stages with respect to spatial dimensions and processes, i.e. the same measurement data is assimilated to different parts of the split model. This design has an efficient implementation due to the direct data assimilation algorithms of the transport process along coordinate lines. Results of numerical experiments with the chemical data assimilation algorithm for in situ concentration measurements on a real data scenario are presented. To construct the scenario, meteorological data were taken from EnviroHIRLAM model output, initial conditions from MOZART model output and measurements from the Airbase database.
Solution of elliptic PDEs by fast Poisson solvers using a local relaxation factor

NASA Technical Reports Server (NTRS)

Chang, Sin-Chung

1986-01-01

A large class of two- and three-dimensional, nonseparable elliptic partial differential equations (PDEs) is presently solved by means of novel one-step (D'Yakonov-Gunn) and two-step (accelerated one-step) iterative procedures, using a local, discrete Fourier analysis. In addition to being easily implemented and applicable to a variety of boundary conditions, these procedures are found to be computationally efficient on the basis of the results of numerical comparison with other established methods, which lack the present one's: (1) insensitivity to grid cell size and aspect ratio, and (2) ease of convergence rate estimation by means of the coefficient of the PDE being solved. The two-step procedure is numerically demonstrated to outperform the one-step procedure in the case of PDEs with variable coefficients.
Modified Newton-Raphson GRAPE methods for optimal control of spin systems

NASA Astrophysics Data System (ADS)

Goodwin, D. L.; Kuprov, Ilya

2016-05-01

Quadratic convergence throughout the active space is achieved for the gradient ascent pulse engineering (GRAPE) family of quantum optimal control algorithms. We demonstrate in this communication that the Hessian of the GRAPE fidelity functional is unusually cheap, having the same asymptotic complexity scaling as the functional itself. This leads to the possibility of using very efficient numerical optimization techniques. In particular, the Newton-Raphson method with a rational function optimization (RFO) regularized Hessian is shown in this work to require fewer system trajectory evaluations than any other algorithm in the GRAPE family. This communication describes algebraic and numerical implementation aspects (matrix exponential recycling, Hessian regularization, etc.) for the RFO Newton-Raphson version of GRAPE and reports benchmarks for common spin state control problems in magnetic resonance spectroscopy.
Modelling water uptake efficiency of root systems

NASA Astrophysics Data System (ADS)

Leitner, Daniel; Tron, Stefania; Schröder, Natalie; Bodner, Gernot; Javaux, Mathieu; Vanderborght, Jan; Vereecken, Harry; Schnepf, Andrea

2016-04-01

Water uptake is crucial for plant productivity. Trait-based breeding for more water efficient crops will enable sustainable agricultural management under specific pedoclimatic conditions, and can increase drought resistance of plants. Mathematical modelling can be used to find suitable root system traits for better water uptake efficiency, defined as the amount of water taken up per unit of root biomass. This approach requires long simulation times and a large number of simulation runs, since we test different root systems under different pedoclimatic conditions. In this work, we model water movement by the 1-dimensional Richards equation with the soil hydraulic properties described according to the van Genuchten model. Climatic conditions serve as the upper boundary condition. The root system grows during the simulation period and water uptake is calculated via a sink term (after Tron et al. 2015). The goal of this work is to compare different free software tools based on different numerical schemes to solve the model. We compare implementations using DUMUX (based on finite volumes), Hydrus 1D (based on finite elements), and a MATLAB implementation of Van Dam & Feddes (2000) (based on finite differences). We analyse the methods for accuracy, speed and flexibility. Using this model case study, we can clearly show the impact of various root system traits on water uptake efficiency. Furthermore, we can quantify frequent simplifications that are introduced in the modelling step like considering a static root system instead of a growing one, or considering a sink term based on root density instead of considering the full root hydraulic model (Javaux et al. 2008). References: Tron, S., Bodner, G., Laio, F., Ridolfi, L., & Leitner, D. (2015). Can diversity in root architecture explain plant water use efficiency? A modeling study. Ecological Modelling, 312, 200-210. Van Dam, J. C., & Feddes, R. A. (2000). Numerical simulation of infiltration, evaporation and shallow groundwater levels with the Richards equation. Journal of Hydrology, 233(1), 72-85. Javaux, M., Schröder, T., Vanderborght, J., & Vereecken, H. (2008). Use of a three-dimensional detailed modeling approach for predicting root water uptake. Vadose Zone Journal, 7(3), 1079-1088.
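Two ingredients named in this abstract are easy to write down: the van Genuchten water retention curve used for the soil hydraulic properties, and the water uptake efficiency defined as water taken up per unit of root biomass. The sketch below shows both; the parameter values are illustrative placeholders rather than values from the study, and the function names are hypothetical.

```python
def van_genuchten_theta(h, theta_r, theta_s, alpha, n):
    """Volumetric water content at pressure head h (negative when unsaturated).

    theta_r and theta_s are the residual and saturated water contents,
    alpha (1/length) and n (> 1) are shape parameters, and m = 1 - 1/n.
    """
    if h >= 0.0:
        return theta_s
    m = 1.0 - 1.0 / n
    effective_saturation = (1.0 + (alpha * abs(h)) ** n) ** (-m)
    return theta_r + (theta_s - theta_r) * effective_saturation


def water_uptake_efficiency(water_taken_up, root_biomass):
    """Amount of water taken up per unit of root biomass."""
    return water_taken_up / root_biomass


# Illustrative (assumed) parameters: heads in cm, alpha in 1/cm.
params = dict(theta_r=0.08, theta_s=0.43, alpha=0.04, n=1.6)
retention_curve = [van_genuchten_theta(h, **params) for h in (0.0, -10.0, -100.0, -1000.0)]
```

The water content falls monotonically from the saturated value toward the residual value as the soil dries, which is the behaviour the Richards equation solvers compared in the abstract all rely on.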
Faster and exact implementation of the continuous cellular automaton for anisotropic etching simulations

NASA Astrophysics Data System (ADS)

Ferrando, N.; Gosálvez, M. A.; Cerdá, J.; Gadea, R.; Sato, K.

2011-02-01

The current success of the continuous cellular automata for the simulation of anisotropic wet chemical etching of silicon in microengineering applications is based on a relatively fast, approximate, constant time stepping implementation (CTS), whose accuracy against the exact algorithm, a computationally slow, variable time stepping implementation (VTS), has not been previously analyzed in detail. In this study we show that the CTS implementation can generate moderately wrong etch rates and overall etching fronts, thus justifying the presentation of a novel, exact reformulation of the VTS implementation based on a new state variable, referred to as the predicted removal time (PRT), and the use of a self-balanced binary search tree that enables storage and efficient access to the PRT values in each time step in order to quickly remove the corresponding surface atom/s. The proposed PRT method reduces the simulation cost of the exact implementation from O(N^{5/3}) to O(N^{3/2} log N) without introducing any model simplifications. This enables more precise simulations (only limited by numerical precision errors) with affordable computational times that are similar to the less precise CTS implementation and even faster for low reactivity systems.
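The role of the predicted removal time can be illustrated with a toy event-driven loop. The sketch below etches a one-dimensional chain of atoms whose removal rate depends on the number of remaining neighbours; it keeps the PRT values in a binary heap with lazy deletion, which is a simpler stand-in for the self-balanced binary search tree used in the paper. The rate table, the chain geometry and the function name are all assumptions made for illustration.

```python
import heapq

# Illustrative removal rates indexed by the number of remaining neighbours.
RATE_BY_NEIGHBOURS = {0: 4.0, 1: 2.0, 2: 0.5}


def etch_chain(n_atoms):
    """Return [(removal_time, atom_index), ...] using exact, variable time steps."""
    alive = [True] * n_atoms
    occupancy = [1.0] * n_atoms        # fraction of each atom still present
    last_update = [0.0] * n_atoms      # time at which occupancy was last brought up to date
    version = [0] * n_atoms            # used to invalidate outdated heap entries

    def rate(i):
        neighbours = sum(1 for j in (i - 1, i + 1) if 0 <= j < n_atoms and alive[j])
        return RATE_BY_NEIGHBOURS[neighbours]

    current_rate = [rate(i) for i in range(n_atoms)]
    heap = [(occupancy[i] / current_rate[i], i, 0) for i in range(n_atoms)]
    heapq.heapify(heap)

    removed = []
    while heap:
        prt, i, ver = heapq.heappop(heap)          # smallest predicted removal time
        if not alive[i] or ver != version[i]:
            continue                               # stale entry, skip it
        alive[i] = False
        removed.append((prt, i))
        # Only the neighbours change rate, so only their PRT values are recomputed.
        for j in (i - 1, i + 1):
            if 0 <= j < n_atoms and alive[j]:
                elapsed = prt - last_update[j]
                occupancy[j] = max(occupancy[j] - current_rate[j] * elapsed, 0.0)
                last_update[j] = prt
                current_rate[j] = rate(j)
                version[j] += 1
                heapq.heappush(heap, (prt + occupancy[j] / current_rate[j], j, version[j]))
    return removed


events = etch_chain(6)
```

Time jumps directly from one removal event to the next, so no fixed time step has to be chosen, and each removal touches only the affected neighbours plus a logarithmic-cost queue update.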
Energy Efficiency and Demand Response for Residential Applications

NASA Astrophysics Data System (ADS)

Wellons, Christopher J., II

The purpose of this thesis is to analyze the costs, feasibility and benefits of implementing energy efficient devices and demand response programs in a residential consumer environment. Energy efficiency and demand response are important for many reasons, including grid stabilization. With energy demand increasing as the years pass, the drain on the grid is going up. There are two key solutions to this problem: increasing supply by building more power plants, and decreasing demand, both during peak periods by increasing participation in demand response programs and throughout the day by upgrading residential and commercial customers to energy efficient devices. This thesis focuses on utilizing demand response methods and energy efficient devices to reduce demand. Four simulations were created to analyze these methods. These simulations show the importance of energy efficiency and demand response participation to help stabilize the grid, integrate more alternative energy resources, and reduce emissions from fossil fuel generating facilities. The results of these numerical analyses show that demand response and energy efficiency can be beneficial to consumers and utilities, with demand response being the most beneficial to the utility and energy efficiency, specifically LED lighting, providing the most benefits to the consumer.
MILAMIN 2 - Fast MATLAB FEM solver

NASA Astrophysics Data System (ADS)

Dabrowski, Marcin; Krotkiewski, Marcin; Schmid, Daniel W.

2013-04-01

MILAMIN is a free and efficient MATLAB-based two-dimensional FEM solver utilizing unstructured meshes [Dabrowski et al., G-cubed (2008)]. The code consists of steady-state thermal diffusion and incompressible Stokes flow solvers implemented in approximately 200 lines of native MATLAB code. The brevity makes the code easily customizable. An important quality of MILAMIN is speed - it can handle millions of nodes within minutes on one CPU core of a standard desktop computer, and is faster than many commercial solutions. The new MILAMIN 2 allows three-dimensional modeling. It is designed as a set of functional modules that can be used as building blocks for efficient FEM simulations using MATLAB. The utilities are largely implemented as native MATLAB functions. For performance critical parts we use MUTILS - a suite of compiled MEX functions optimized for shared memory multi-core computers. The most important features of MILAMIN 2 are: 1. Modular approach to defining, tracking, and discretizing the geometry of the model 2. Interfaces to external mesh generators (e.g., Triangle, Fade2d, T3D) and mesh utilities (e.g., element type conversion, fast point location, boundary extraction) 3. Efficient computation of the stiffness matrix for a wide range of element types, anisotropic materials and three-dimensional problems 4. Fast global matrix assembly using a dedicated MEX function 5. Automatic integration rules 6. Flexible prescription (spatial, temporal, and field functions) and efficient application of Dirichlet, Neumann, and periodic boundary conditions 7. Treatment of transient and non-linear problems 8. Various iterative and multi-level solution strategies 9. Post-processing tools (e.g., numerical integration) 10. Visualization primitives using MATLAB, and VTK export functions. We provide a large number of examples that show how to implement a custom FEM solver using the MILAMIN 2 framework.
The examples are MATLAB scripts of increasing complexity that address a given technical topic (e.g., creating meshes, reordering nodes, applying boundary conditions), a given numerical topic (e.g., using various solution strategies, non-linear iterations), or that present a fully-developed solver designed to address a scientific topic (e.g., performing Stokes flow simulations in synthetic porous medium). References: Dabrowski, M., M. Krotkiewski, and D. W. Schmid, MILAMIN: MATLAB-based finite element method solver for large problems, Geochem. Geophys. Geosyst., 9, Q04030, 2008
An efficient and general approach for implementing thermodynamic phase equilibria information in geophysical and geodynamic studies

NASA Astrophysics Data System (ADS)

Afonso, Juan Carlos; Zlotnik, Sergio; Díez, Pedro

2015-10-01

We present a flexible, general, and efficient approach for implementing thermodynamic phase equilibria information (in the form of sets of physical parameters) into geophysical and geodynamic studies. The approach is based on Tensor Rank Decomposition methods, which transform the original multidimensional discrete information into a separated representation that contains significantly fewer terms, thus drastically reducing the amount of information to be stored in memory during a numerical simulation or geophysical inversion. Accordingly, the amount and resolution of the thermodynamic information that can be used in a simulation or inversion increases substantially. In addition, the method is independent of the actual software used to obtain the primary thermodynamic information, and therefore, it can be used in conjunction with any thermodynamic modeling program and/or database. Also, the errors associated with the decomposition procedure are readily controlled by the user, depending on her/his actual needs (e.g., preliminary runs versus full resolution runs). We illustrate the benefits, generality, and applicability of our approach with several examples of practical interest for both geodynamic modeling and geophysical inversion/modeling. Our results demonstrate that the proposed method is a competitive and attractive candidate for implementing thermodynamic constraints into a broad range of geophysical and geodynamic studies. MATLAB implementations of the method and examples are provided as supporting information and can be downloaded from the journal's website.
A parallel overset-curvilinear-immersed boundary framework for simulating complex 3D incompressible flows

PubMed Central

Borazjani, Iman; Ge, Liang; Le, Trung; Sotiropoulos, Fotis

2013-01-01

We develop an overset-curvilinear immersed boundary (overset-CURVIB) method in a general non-inertial frame of reference to simulate a wide range of challenging biological flow problems. The method incorporates overset-curvilinear grids to efficiently handle multi-connected geometries and increase the resolution locally near immersed boundaries. Complex bodies undergoing arbitrarily large deformations may be embedded within the overset-curvilinear background grid and treated as sharp interfaces using the curvilinear immersed boundary (CURVIB) method (Ge and Sotiropoulos, Journal of Computational Physics, 2007). The incompressible flow equations are formulated in a general non-inertial frame of reference to enhance the overall versatility and efficiency of the numerical approach. Efficient search algorithms to identify areas requiring blanking, donor cells, and interpolation coefficients for constructing the boundary conditions at grid interfaces of the overset grid are developed and implemented using efficient parallel computing communication strategies to transfer information among sub-domains. The governing equations are discretized using a second-order accurate finite-volume approach and integrated in time via an efficient fractional-step method. Various strategies for ensuring globally conservative interpolation at grid interfaces suitable for incompressible flow fractional step methods are implemented and evaluated. The method is verified and validated against experimental data, and its capabilities are demonstrated by simulating the flow past multiple aquatic swimmers and the systolic flow in an anatomic left ventricle with a mechanical heart valve implanted in the aortic position. PMID:23833331
High Step-Up DC-DC Converter for AC Photovoltaic Module with MPPT Control

NASA Astrophysics Data System (ADS)

Sundar, Govindasamy; Karthick, Narashiman; Rama Reddy, Sasi

2014-08-01

This paper presents a high-gain step-up boost converter, which is needed to raise the low output voltage of a PV panel to the higher voltage required by the application. A high-gain boost converter using a coupled-inductor technique is proposed together with MPPT control. The converter achieves a high step-up voltage-conversion ratio without extreme duty ratios or large coupled-inductor turns ratios, and the leakage energy of the coupled inductor is efficiently recycled to the load. MPPT control is used to extract the maximum power from the PV panel by controlling the duty ratio of the converter. The PV panel, boost converter, and MPPT are modeled using SimPowerSystems blocks in the MATLAB/Simulink environment. A prototype of the proposed converter has been implemented; its maximum measured efficiency is up to 95.4% and its full-load efficiency is 93.1%.
A global optimization method synthesizing heat transfer and thermodynamics for the power generation system with Brayton cycle

NASA Astrophysics Data System (ADS)

Fu, Rong-Huan; Zhang, Xing

2016-09-01

Supercritical carbon dioxide operated in a Brayton cycle offers numerous potential advantages for a power generation system, and many thermodynamic analyses have been conducted to increase its efficiency. Because a practical thermodynamic cycle contains many heat-absorbing and heat-releasing subprocesses, which are implemented by heat exchangers, optimizing the system by combining thermodynamics and heat transfer theory can increase the gross efficiency of the whole power generation system. This paper analyzes the influence of the performance of heat exchangers on the actual efficiency of an ideal Brayton cycle with a simple configuration, and proposes a new method to optimize the power generation system, which aims at the minimum energy consumption. Although the method is applied only to an ideal working fluid in this paper, its merits relative to optimization based on thermodynamic analysis alone are clearly shown.
Gradient-based Optimization for Poroelastic and Viscoelastic MR Elastography

PubMed Central

Tan, Likun; McGarry, Matthew D.J.; Van Houten, Elijah E.W.; Ji, Ming; Solamen, Ligin; Weaver, John B.

2017-01-01

We describe an efficient gradient computation for solving inverse problems arising in magnetic resonance elastography (MRE). The algorithm can be considered as a generalized ‘adjoint method’ based on a Lagrangian formulation. One requirement for the classic adjoint method is assurance of the self-adjoint property of the stiffness matrix in the elasticity problem. In this paper, we show this property is no longer a necessary condition in our algorithm, but the computational performance can be as efficient as the classic method, which involves only two forward solutions and is independent of the number of parameters to be estimated. The algorithm is developed and implemented in material property reconstructions using poroelastic and viscoelastic modeling. Various gradient- and Hessian-based optimization techniques have been tested on simulation, phantom and in vivo brain data. The numerical results show the feasibility and the efficiency of the proposed scheme for gradient calculation. PMID:27608454
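The cost argument in the abstract above (one forward solve plus one adjoint solve, regardless of the number of parameters) can be illustrated with the classic adjoint method on a toy linear problem. The sketch below is not the authors' generalized algorithm: the two-spring stiffness matrix, the misfit function, and all function names are assumptions chosen for illustration. For K(p) u = f and J = 0.5 |u - d|^2, the gradient is dJ/dp_i = -lambda^T (dK/dp_i) u, where K^T lambda = u - d.

```python
def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def stiffness(p):
    """Hypothetical stiffness matrix of two springs in series (k1, k2)."""
    k1, k2 = p
    return [[k1 + k2, -k2], [-k2, k2]]


# Derivatives of the stiffness matrix with respect to k1 and k2.
DK = [[[1.0, 0.0], [0.0, 0.0]],
      [[1.0, -1.0], [-1.0, 1.0]]]


def misfit(p, f, d):
    """J(p) = 0.5 * |u(p) - d|^2 with K(p) u = f."""
    u = solve2(stiffness(p), f)
    return 0.5 * sum((ui - di) ** 2 for ui, di in zip(u, d))


def adjoint_gradient(p, f, d):
    """Gradient of the misfit from one forward and one adjoint solve."""
    K = stiffness(p)
    u = solve2(K, f)                                   # forward solve
    Kt = [[K[0][0], K[1][0]], [K[0][1], K[1][1]]]      # transpose of K
    lam = solve2(Kt, [ui - di for ui, di in zip(u, d)])  # adjoint solve
    grad = []
    for dK in DK:
        dKu = [dK[0][0] * u[0] + dK[0][1] * u[1],
               dK[1][0] * u[0] + dK[1][1] * u[1]]
        grad.append(-(lam[0] * dKu[0] + lam[1] * dKu[1]))
    return grad
```

A finite-difference check of `adjoint_gradient` against `misfit` is a simple way to validate such a gradient; the finite-difference route needs extra solves per parameter, whereas the adjoint route does not.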
Optimizing luminescent solar concentrator design

DOE PAGES

Hernandez-Noyola, Hermilo; Potterveld, David H.; Holt, Roy J.; ...

2011-12-21

Luminescent Solar Concentrators (LSCs) use fluorescent materials and light guides to convert direct and diffuse sunlight into concentrated wavelength-shifted light that produces electrical power in small photovoltaic (PV) cells with the goal of significantly reducing the cost of solar energy utilization. In this paper we present an optimization analysis based on the implementation of a genetic algorithm (GA) subroutine to a numerical ray-tracing Monte Carlo model of an LSC, SIMSOLAR-P. The initial use of the GA implementation in SIMSOLAR-P is to find the optimal parameters of a hypothetical “perfect luminescent material” that obeys the Kennard-Stepanov (K-S) thermodynamic relationship between emission and absorption. The optimization balances the efficiency losses in the wavelength shift and PV conversion with the efficiency losses due to re-scattering of light out of the collector. The theoretical limits of efficiency are provided for one, two and three layer configurations; the results show that a single layer configuration is far from optimal and adding a second layer in the LSC with wavelength shifted material in the near infrared region significantly increases the power output, while the gain in power by adding a third layer is relatively small. Here, the results of this study provide a theoretical upper limit to the performance of an LSC and give guidance for the properties required for luminescent materials, such as quantum nanocrystals, to operate efficiently in planar LSC configurations.
Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures

DTIC Science & Technology

2017-10-04

Report: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures The views, opinions and/or findings contained in this...Chapel Hill Title: Efficient Numeric and Geometric Computations using Heterogeneous Shared Memory Architectures Report Term: 0-Other Email: dm...algorithms for scientific and geometric computing by exploiting the power and performance efficiency of heterogeneous shared memory architectures . These
Equilibrium shapes of a heterogeneous bubble in an electric field: a variational formulation and numerical verifications

NASA Astrophysics Data System (ADS)

Wang, Hanxiong; Liu, Liping; Liu, Dong

2017-03-01

The equilibrium shape of a bubble/droplet in an electric field is important for electrowetting over dielectrics (EWOD), electrohydrodynamic (EHD) enhancement for heat transfer and electro-deformation of a single biological cell, among others. In this work, we develop a general variational formulation that accounts for electro-mechanical couplings. In the context of EHD, we identify the free energy functional and the associated energy minimization problem that determines the equilibrium shape of a bubble in an electric field. Based on this variational formulation, we implement a fixed mesh level-set gradient method for computing the equilibrium shapes. This numerical scheme is efficient and is validated by comparison with analytical solutions in the absence of an electric field and with experimental results in the presence of an electric field. We also present simulation results for zero gravity, which will be useful for space applications. The variational formulation and numerical scheme are anticipated to have broad applications in areas of EWOD, EHD and electro-deformation in biomechanics.
GPU accelerated manifold correction method for spinning compact binaries

NASA Astrophysics Data System (ADS)

Ran, Chong-xi; Liu, Song; Zhong, Shuang-ying

2018-04-01

The graphics processing unit (GPU) acceleration of the manifold correction algorithm based on the compute unified device architecture (CUDA) technology is designed to simulate the dynamic evolution of the Post-Newtonian (PN) Hamiltonian formulation of spinning compact binaries. The feasibility and the efficiency of parallel computation on GPU have been confirmed by various numerical experiments. Numerical comparisons show that the accuracy of the manifold correction method executed on the GPU agrees well with that of the code executed on the central processing unit (CPU) alone. The acceleration achieved on the GPU increases substantially through the use of shared memory and register optimization techniques, without additional hardware cost; the speedup is nearly 13 times relative to the code executed on the CPU for a phase space scan (including 314 × 314 orbits). In addition, the GPU-accelerated manifold correction method is used to numerically study how dynamics are affected by the spin-induced quadrupole-monopole interaction for a black hole binary system.
Efficient solution of the Wigner-Liouville equation using a spectral decomposition of the force field

NASA Astrophysics Data System (ADS)

Van de Put, Maarten L.; Sorée, Bart; Magnus, Wim

2017-12-01

The Wigner-Liouville equation is reformulated using a spectral decomposition of the classical force field instead of the potential energy. The latter is shown to simplify the Wigner-Liouville kernel both conceptually and numerically as the spectral force Wigner-Liouville equation avoids the numerical evaluation of the highly oscillatory Wigner kernel which is nonlocal in both position and momentum. The quantum mechanical evolution is instead governed by a term local in space and non-local in momentum, where the non-locality in momentum has only a limited range. An interpretation of the time evolution in terms of two processes is presented; a classical evolution under the influence of the averaged driving field, and a probability-preserving quantum-mechanical generation and annihilation term. Using the inherent stability and reduced complexity, a direct deterministic numerical implementation using Chebyshev and Fourier pseudo-spectral methods is detailed. For the purpose of illustration, we present results for the time-evolution of a one-dimensional resonant tunneling diode driven out of equilibrium.
Numerical investigation of gapped edge states in fractional quantum Hall-superconductor heterostructures

NASA Astrophysics Data System (ADS)

Repellin, Cécile; Cook, Ashley M.; Neupert, Titus; Regnault, Nicolas

2018-03-01

Fractional quantum Hall-superconductor heterostructures may provide a platform towards non-abelian topological modes beyond Majoranas. However their quantitative theoretical study remains extremely challenging. We propose and implement a numerical setup for studying edge states of fractional quantum Hall droplets with a superconducting instability. The fully gapped edges carry a topological degree of freedom that can encode quantum information protected against local perturbations. We simulate such a system numerically using exact diagonalization by restricting the calculation to the quasihole-subspace of a (time-reversal symmetric) bilayer fractional quantum Hall system of Laughlin ν = 1/3 states. We show that the edge ground states are permuted by spin-dependent flux insertion and demonstrate their fractional 6π Josephson effect, evidencing their topological nature and the Cooper pairing of fractionalized quasiparticles. The versatility and efficiency of our setup make it a well suited method to tackle wider questions of edge phases and phase transitions in fractional quantum Hall systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Zakharov, Leonic E.; Li, Xujing

This paper formulates the Tokamak Magneto-Hydrodynamics (TMHD), initially outlined by X. Li and L.E. Zakharov [Plasma Science and Technology, accepted, ID:2013-257 (2013)] for proper simulations of macroscopic plasma dynamics. The simplest set of magneto-hydrodynamics equations, sufficient for disruption modeling and extendable to more refined physics, is explained in detail. First, the TMHD introduces to 3-D simulations the Reference Magnetic Coordinates (RMC), which are aligned with the magnetic field in the best possible way. The numerical implementation of RMC uses adaptive grids. Being consistent with the high anisotropy of the tokamak plasma, RMC allow simulations at realistic, very high plasma electric conductivity. Second, the TMHD splits the equation of motion into an equilibrium equation and the plasma advancing equation. This resolves the four-decade-old problem of Courant limitations of the time step in existing, plasma inertia driven numerical codes. The splitting allows disruption simulations on a relatively slow time scale in comparison with the fast time of ideal MHD instabilities. A new, efficient numerical scheme is proposed for TMHD.
Effect of fire-induced damage on the uniaxial strength characteristics of solid timber: A numerical study

NASA Astrophysics Data System (ADS)

Hopkin, D. J.; El-Rimawi, J.; Lennon, T.; Silberschmidt, V. V.

2011-07-01

The advent of the structural Eurocodes has allowed civil engineers to be more creative in the design of structures exposed to fire. Rather than rely upon regulatory guidance and prescriptive methods engineers are now able to use such codes to design buildings on the basis of credible design fires rather than accepted unrealistic standard-fire time-temperature curves. Through this process safer and more efficient structural designs are achievable. The key development in enabling performance-based fire design is the emergence of validated numerical models capable of predicting the mechanical response of a whole building or sub-assemblies at elevated temperature. In such a way, efficiency savings have been achieved in the design of steel, concrete and composite structures. However, at present, due to a combination of limited fundamental research and restrictions in the UK National Annex to the timber Eurocode, the design of fire-exposed timber structures using numerical modelling techniques is not generally undertaken. The 'fire design' of timber structures is covered in Eurocode 5 part 1.2 (EN 1995-1-2). In this code there is an advanced calculation annex (Annex B) intended to facilitate the implementation of numerical models in the design of fire-exposed timber structures. The properties contained in the code can, at present, only be applied to standard-fire exposure conditions. This is due to existing limitations related to the available thermal properties which are only valid for standard fire exposure. In an attempt to overcome this barrier the authors have proposed a 'modified conductivity model' (MCM) for determining the temperature of timber structural elements during the heating phase of non-standard fires. This is briefly outlined in this paper. In addition, in a further study, the MCM has been implemented in a coupled thermo-mechanical analysis of uniaxially loaded timber elements exposed to non-standard fires. 
The finite element package DIANA was adopted with plane-strain elements assuming two-dimensional heat flow. The resulting predictions of failure time for given levels of load are discussed and compared with the simplified 'effective cross section' method presented in EN 1995-1-2.

Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics

NASA Technical Reports Server (NTRS)

Goodrich, John W.; Dyson, Rodger W.

1999-01-01

The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems are being carried out to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required.
The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that have resulted from this work. A review of computational aeroacoustics has recently been given by Lele.
Hantush Well Function revisited

NASA Astrophysics Data System (ADS)

Veling, E. J. M.; Maas, C.

2010-11-01

In this paper, we comment on some recent numerical and analytical work to evaluate the Hantush Well Function. We correct an expression found in a Comment by Nadarajah [Nadarajah, S., 2007. A comment on numerical evaluation of Theis and Hantush-Jacob well functions. Journal of Hydrology 338, 152-153] to a paper by Prodanoff et al. [Prodanoff, J.A., Mansur, W.J., Mascarenhas, F.C.B., 2006. Numerical evaluation of Theis and Hantush-Jacob well functions. Journal of Hydrology 318, 173-183]. We subsequently derived another analytic representation based on a generalized hypergeometric function in two variables and from the hydrological literature we cite an analytic representation by Hunt [Hunt, B., 1977. Calculation of the leaky aquifer function. Journal of Hydrology 33, 179-183]. We have implemented both representations and compared the results. Using a convergence accelerator, Hunt's representation of the Hantush Well Function is efficient and accurate. While checking our implementations we found that Bear's table of the Hantush Well Function [Bear, J., 1979. Hydraulics of Groundwater. McGraw-Hill, New York, Tables 8-6] contains a number of typographical errors that are not present in the original table published by Hantush [Hantush, M.S., 1956. Analysis of data from pumping tests in leaky aquifers. Transactions, American Geophysical Union 37, 702-714]. Finally, we offer a very fast approximation with a maximum relative error of 0.0033 for the parameter range in the table given by Bear.
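For readers who want a reference value to check an implementation against, the Hantush Well Function can be evaluated directly from its integral definition, W(u, beta) = integral from u to infinity of exp(-y - beta^2/(4y)) / y dy, with beta = r/B. The sketch below is a brute-force quadrature, not Hunt's representation and not the fast approximation from the paper; the function name, the substitution y = exp(s), the truncation point, and the number of Simpson intervals are all assumptions made for illustration.

```python
import math


def hantush_w(u, beta, n=2000):
    """Hantush well function W(u, beta) by composite Simpson quadrature.

    With y = exp(s) the integrand becomes exp(-y - beta^2 / (4 y)) ds.
    The upper limit is truncated at y = u + 60, where exp(-y) is negligible.
    Requires u > 0.
    """
    if n % 2:
        n += 1                      # Simpson's rule needs an even interval count
    lo = math.log(u)
    hi = math.log(u + 60.0)
    h = (hi - lo) / n

    def f(s):
        y = math.exp(s)
        return math.exp(-y - beta * beta / (4.0 * y))

    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0
```

Two limits are useful sanity checks: for beta = 0 the function reduces to the Theis well function (the exponential integral E1(u)), and for u approaching 0 it tends to 2 K0(beta), where K0 is the modified Bessel function of the second kind.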
Anderson acceleration and application to the three-temperature energy equations

NASA Astrophysics Data System (ADS)

An, Hengbin; Jia, Xiaowei; Walker, Homer F.

2017-10-01

The Anderson acceleration method is an algorithm for accelerating the convergence of fixed-point iterations, including the Picard method. Anderson acceleration was first proposed in 1965 and, for some years, has been used successfully to accelerate the convergence of self-consistent field iterations in electronic-structure computations. Recently, the method has attracted growing attention in other application areas and among numerical analysts. Compared with a Newton-like method, an advantage of Anderson acceleration is that there is no need to form the Jacobian matrix. Thus the method is easy to implement. In this paper, an Anderson-accelerated Picard method is employed to solve the three-temperature energy equations, which are a type of strongly nonlinear radiation-diffusion equations. Two strategies are used to improve the robustness of the Anderson acceleration method. One strategy is to adjust the iterates when necessary to satisfy the physical constraint. Another strategy is to monitor and, if necessary, reduce the matrix condition number of the least-squares problem in the Anderson-acceleration implementation so that numerical stability can be guaranteed. Numerical results show that the Anderson-accelerated Picard method can solve the three-temperature energy equations efficiently. Compared with the Picard method without acceleration, Anderson acceleration can reduce the number of iterations by at least half. A comparison between a Jacobian-free Newton-Krylov method, the Picard method, and the Anderson-accelerated Picard method is conducted in this paper.
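The idea of Anderson acceleration described above can be sketched in a few lines for the simplest case, a history depth of one, where the least-squares problem has a single unknown and a closed-form solution. This is an illustrative toy under stated assumptions, not the authors' solver: it omits their constraint handling and condition-number monitoring, the function names are hypothetical, and the test problem used afterwards is x = cos(x) rather than the three-temperature equations.

```python
import math


def picard(g, x0, tol=1e-10, max_iter=500):
    """Plain fixed-point (Picard) iteration x <- g(x); returns (x, evaluations)."""
    x = list(x0)
    for k in range(1, max_iter + 1):
        gx = g(x)
        if max(abs(a - b) for a, b in zip(gx, x)) < tol:
            return gx, k
        x = gx
    return x, max_iter


def anderson_depth1(g, x0, tol=1e-10, max_iter=500):
    """Anderson acceleration with history depth 1; no Jacobian is formed."""
    x = list(x0)
    gx = g(x)
    f = [a - b for a, b in zip(gx, x)]        # residual f = g(x) - x
    x_new = gx                                # first step is a plain Picard step
    for k in range(1, max_iter + 1):
        if max(abs(v) for v in f) < tol:
            return gx, k
        g_prev, f_prev = gx, f
        x = x_new
        gx = g(x)
        f = [a - b for a, b in zip(gx, x)]
        df = [a - b for a, b in zip(f, f_prev)]
        denom = sum(d * d for d in df)
        # One-parameter least squares: minimize |f - gamma * df|
        gamma = sum(a * d for a, d in zip(f, df)) / denom if denom > 0.0 else 0.0
        x_new = [a - gamma * (a - b) for a, b in zip(gx, g_prev)]
    return gx, max_iter


# Toy usage: solve x = cos(x) with and without acceleration.
g_cos = lambda x: [math.cos(x[0])]
x_plain, n_plain = picard(g_cos, [1.0])
x_accel, n_accel = anderson_depth1(g_cos, [1.0])
```

Comparing `n_plain` with `n_accel` shows the reduction in function evaluations on this toy problem; larger history depths require solving a small least-squares system at each step, which is where conditioning becomes a concern.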
Numerical study on the electromechanical behavior of dielectric elastomer with the influence of surrounding medium

NASA Astrophysics Data System (ADS)

Jia; Lu

2016-01-01

The considerable electrically induced shape change, together with light weight, high efficiency, and low cost, makes dielectric elastomer a promising soft active material for the realization of actuators in broad applications. Although a number of prototype devices have been demonstrated in the past few years, the further development of this technology necessitates adequate analytical and numerical tools. In particular, previous theoretical studies have neglected the influence of the surrounding medium. Due to the large deformation and nonlinear equations of state involved in dielectric elastomer, the finite element method (FEM) is called for; however, the few available formulations employ homemade codes, which are inconvenient to implement. The aim of this work is to present a numerical approach with the commercial FEM package COMSOL to investigate the nonlinear response of dielectric elastomer under electric stimulation. The influence of surrounding free space on the electric field is analyzed and the corresponding electric force is taken into account through an electric surface traction on the edge bordering the surrounding medium. By employing the Maxwell stress tensor as actuation pressure, the mechanical and electric governing equations for dielectric elastomer are coupled, and then solved simultaneously with the Gent model of strain energy to derive the electrically induced large deformation as well as the electromechanical instability. The finite element implementation presented here may provide a powerful computational tool to help design and optimize the engineering applications of dielectric elastomer.
Non-Abelian gauge preheating

NASA Astrophysics Data System (ADS)

Adshead, Peter; Giblin, John T.; Weiner, Zachary J.

2017-12-01

We study preheating in models where a scalar inflaton is directly coupled to a non-Abelian SU(2) gauge field. In particular, we examine m²ϕ² inflation with a conformal, dilatonlike coupling to the non-Abelian sector. We describe a numerical scheme that combines lattice gauge theory with standard finite difference methods applied to the scalar field. We show that a significant tachyonic instability allows for efficient preheating, which is parametrically suppressed by increasing the non-Abelian self-coupling. Additionally, we comment on the technical implementation of the evolution scheme and setting initial conditions.
Y-MP floating point and Cholesky factorization

NASA Technical Reports Server (NTRS)

Carter, Russell

1991-01-01

The floating point arithmetics implemented in the Cray 2 and Cray Y-MP computer systems are nearly identical, but large scale computations performed on the two systems have exhibited significant differences in accuracy. The difference in accuracy is analyzed for the Cholesky factorization algorithm, and it is found that the source of the difference is the subtract magnitude operation of the Cray Y-MP. Results from numerical experiments for a range of problem sizes are presented, along with an efficient method for improving the accuracy of the factorization obtained on the Y-MP.
Toward a More Efficient Implementation of Antifibrillation Pacing

PubMed Central

Wilson, Dan; Moehlis, Jeff

2016-01-01

We devise a methodology to determine an optimal pattern of inputs to synchronize firing patterns of cardiac cells which only requires the ability to measure action potential durations in individual cells. In numerical bidomain simulations, the resulting synchronizing inputs are shown to terminate spiral waves with a higher probability than comparable inputs that do not synchronize the cells as strongly. These results suggest that designing stimuli which promote synchronization in cardiac tissue could improve the success rate of defibrillation, and point towards novel strategies for optimizing antifibrillation pacing. PMID:27391010
Faster Heavy Ion Transport for HZETRN

NASA Technical Reports Server (NTRS)

Slaba, Tony C.

2013-01-01

The deterministic particle transport code HZETRN was developed to enable fast and accurate space radiation transport through materials. As more complex transport solutions are implemented for neutrons, light ions (Z ≤ 2), mesons, and leptons, it is important to maintain overall computational efficiency. In this work, the heavy ion (Z > 2) transport algorithm in HZETRN is reviewed, and a simple modification is shown to provide an approximate 5x decrease in execution time for galactic cosmic ray transport. Convergence tests and other comparisons are carried out to verify that numerical accuracy is maintained in the new algorithm.
A Survey of Symplectic and Collocation Integration Methods for Orbit Propagation

NASA Technical Reports Server (NTRS)

Jones, Brandon A.; Anderson, Rodney L.

2012-01-01

Demands on numerical integration algorithms for astrodynamics applications continue to increase. Common methods, like explicit Runge-Kutta, meet the orbit propagation needs of most scenarios, but more specialized scenarios require new techniques to meet both computational efficiency and accuracy needs. This paper provides an extensive survey on the application of symplectic and collocation methods to astrodynamics. Both of these methods benefit from relatively recent theoretical developments, which improve their applicability to artificial satellite orbit propagation. This paper also details their implementation, with several tests demonstrating their advantages and disadvantages.
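As a concrete instance of the symplectic methods surveyed above, the sketch below propagates a two-body (Kepler) orbit with the second-order leapfrog (kick-drift-kick) scheme. It is a minimal illustration under stated assumptions, not one of the integrators evaluated in the paper: the units (mu = 1), the planar point-mass force model, and all function names are hypothetical.

```python
import math


def accel(x, y, mu=1.0):
    """Point-mass gravitational acceleration in the orbital plane."""
    r3 = (x * x + y * y) ** 1.5
    return -mu * x / r3, -mu * y / r3


def energy(pos, vel, mu=1.0):
    """Specific orbital energy (kinetic plus potential)."""
    return 0.5 * (vel[0] ** 2 + vel[1] ** 2) - mu / math.hypot(pos[0], pos[1])


def angular_momentum(pos, vel):
    """Specific angular momentum (z component)."""
    return pos[0] * vel[1] - pos[1] * vel[0]


def leapfrog(pos, vel, dt, n_steps, mu=1.0):
    """Kick-drift-kick leapfrog, a second-order symplectic integrator."""
    x, y = pos
    vx, vy = vel
    ax, ay = accel(x, y, mu)
    for _ in range(n_steps):
        vx += 0.5 * dt * ax            # half kick
        vy += 0.5 * dt * ay
        x += dt * vx                   # full drift
        y += dt * vy
        ax, ay = accel(x, y, mu)
        vx += 0.5 * dt * ax            # half kick
        vy += 0.5 * dt * ay
    return (x, y), (vx, vy)


# Toy usage: an eccentric orbit followed for roughly five periods.
pos0, vel0 = (1.0, 0.0), (0.0, 0.8)
pos1, vel1 = leapfrog(pos0, vel0, dt=1e-3, n_steps=20000)
```

Comparing `energy` and `angular_momentum` before and after the run shows the characteristic behavior of a symplectic scheme: the energy error stays bounded rather than drifting, and for a central force this particular scheme preserves angular momentum up to roundoff.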
Single-mode VCSEL operation via photocurrent feedback

NASA Astrophysics Data System (ADS)

Riyopoulos, Spilios

1999-04-01

On-axis channeling through the use of photoactive layers in VCSEL cavities is proposed to counteract hole burning and mode switching. The photoactive layers act as variable resistivity screens whose radial 'aperture' is controlled by the light itself. It is numerically demonstrated that absorption of a small fraction of the light intensity suffices for significant on-axis current peaking and single-mode operation at currents many times threshold, with minimum efficiency loss and optical mode distortion. Fabrication is implemented during the molecular beam epitaxy phase, without the wafer post-processing required for oxide apertures.
Numerical investigation of the staged gasification of wet wood

NASA Astrophysics Data System (ADS)

Donskoi, I. G.; Kozlov, A. N.; Svishchev, D. A.; Shamanskii, V. A.

2017-04-01

Gasification of wooden biomass makes it possible to utilize forestry wastes and agricultural residues for generation of heat and power in isolated small-scale power systems. In spite of the availability of a huge amount of cheap biomass, the implementation of the gasification process is impeded by formation of tar products and poor thermal stability of the process. These factors reduce the competitiveness of gasification as compared with alternative technologies. The use of staged technologies enables certain disadvantages of conventional processes to be avoided. One of the previously proposed staged processes is investigated in this paper. For this purpose, mathematical models were developed for individual stages of the process, such as pyrolysis, pyrolysis gas combustion, and semicoke gasification. The effect of controlling parameters on the efficiency of fuel conversion into combustible gases is studied numerically using these models. The controlling parameters selected are the heat input to the pyrolysis reactor, the excess of oxidizer during gas combustion, and the wood moisture content. The process efficiency criterion is the gasification chemical efficiency accounting for the input of external heat (used for fuel drying and pyrolysis). The generated regime diagrams represent the gasification efficiency as a function of controlling parameters. Modeling results demonstrate that an increase in the fraction of heat supplied from an external source can result in an adequate efficiency of the wood gasification through the use of steam generated during drying. There are regions where it is feasible to perform incomplete combustion of the pyrolysis gas prior to the gasification. The calculated chemical efficiency of the staged gasification is as high as 80-85%, which is 10-20% higher than in conventional single-stage processes.
Navier-Stokes simulations of unsteady transonic flow phenomena

NASA Technical Reports Server (NTRS)

Atwood, C. A.

1992-01-01

Numerical simulations of two classes of unsteady flows are obtained via the Navier-Stokes equations: a blast-wave/target interaction problem class and a transonic cavity flow problem class. The method developed for the viscous blast-wave/target interaction problem assumes a laminar, perfect gas implemented in a structured finite-volume framework. The approximately factored implicit scheme uses Newton subiterations to obtain the spatially and temporally second-order accurate time history of the blast-waves with stationary targets. The inviscid flux is evaluated using either of two upwind techniques, while the full viscous terms are computed by central differencing. Comparisons of unsteady numerical, analytical, and experimental results are made in two- and three-dimensions for Couette flows, a starting shock-tunnel, and a shock-tube blockage study. The results show accurate wave speed resolution and nonoscillatory discontinuity capturing of the predominantly inviscid flows. Viscous effects were increasingly significant at large post-interaction times. While the blast-wave/target interaction problem benefits from high-resolution methods applied to the Euler terms, the transonic cavity flow problem requires the use of an efficient scheme implemented in a geometrically flexible overset mesh environment. Hence, the Reynolds averaged Navier-Stokes equations implemented in a diagonal form are applied to the cavity flow class of problems. Comparisons between numerical and experimental results are made in two-dimensions for free shear layers and both rectangular and quieted cavities, and in three-dimensions for Stratospheric Observatory For Infrared Astronomy (SOFIA) geometries. The acoustic behavior of the rectangular and three-dimensional cavity flows compare well with experiment in terms of frequency, magnitude, and quieting trends. 
However, there is a more rapid decrease in computed acoustic energy with frequency than observed experimentally owing to numerical dissipation. In addition, optical phase distortion due to the time-varying density field is modelled using geometrical constructs. The computed optical distortion trends compare with the experimentally inferred result, but the fluctuating phase difference magnitude is underpredicted.
High-Performance Design Patterns for Modern Fortran

DOE PAGES

Haveraaen, Magne; Morris, Karla; Rouson, Damian; ...

2015-01-01

This paper presents ideas for using coordinate-free numerics in modern Fortran to achieve code flexibility in the partial differential equation (PDE) domain. We also show how Fortran, over the last few decades, has changed to become a language well-suited for state-of-the-art software development. Fortran’s new coarray distributed data structure, the language’s class mechanism, and its side-effect-free, pure procedure capability provide the scaffolding on which we implement HPC software. These features empower compilers to organize parallel computations with efficient communication. We present some programming patterns that support asynchronous evaluation of expressions comprised of parallel operations on distributed data. We implemented these patterns using coarrays and the message passing interface (MPI). We compared the codes’ complexity and performance. The MPI code is much more complex and depends on external libraries. The MPI code on Cray hardware using the Cray compiler is 1.5–2 times faster than the coarray code on the same hardware. The Intel compiler implements coarrays atop Intel’s MPI library with the result apparently being 2–2.5 times slower than manually coded MPI despite exhibiting nearly linear scaling efficiency. As compilers mature and further improvements to coarrays come in Fortran 2015, we expect this performance gap to narrow.
Fractional Steps methods for transient problems on commodity computer architectures

NASA Astrophysics Data System (ADS)

Krotkiewski, M.; Dabrowski, M.; Podladchikov, Y. Y.

2008-12-01

Fractional Steps methods are suitable for modeling transient processes that are central to many geological applications. Low memory requirements and modest computational complexity facilitate calculations on high-resolution three-dimensional models. An efficient implementation of Alternating Direction Implicit/Locally One-Dimensional schemes for an Opteron-based shared memory system is presented. The memory bandwidth usage, the main bottleneck on modern computer architectures, is specifically addressed. High efficiency of above 2 GFlops per CPU is sustained for problems of 1 billion degrees of freedom. The optimized sequential implementation of all 1D sweeps is comparable in execution time to copying the used data in the memory. Scalability of the parallel implementation on up to 8 CPUs is close to perfect. Performing one timestep of the Locally One-Dimensional scheme on a system of 1000^3 unknowns on 8 CPUs takes only 11 s. We validate the LOD scheme using a computational model of an isolated inclusion subject to a constant far field flux. Next, we study numerically the evolution of a diffusion front and the effective thermal conductivity of composites consisting of multiple inclusions and compare the results with predictions based on the differential effective medium approach. Finally, application of the developed parabolic solver is suggested for a real-world problem of fluid transport and reactions inside a reservoir.
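The 1D sweeps mentioned in this abstract each reduce to a tridiagonal solve along one grid line. The following is a minimal sketch, not the authors' optimized implementation: it assumes backward-Euler diffusion with a constant coefficient and zero Dirichlet values outside the line, and the names thomas_solve, lod_sweep_1d, and the parameter r (diffusivity times time step over grid spacing squared) are hypothetical.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[0] and upper[-1] are ignored. No pivoting is done, so the
    matrix is assumed to be diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def lod_sweep_1d(u, r):
    """One implicit sweep along a line: solve (I - r*D2) u_new = u.

    D2 is the second-difference operator [1, -2, 1]; values outside
    the line are taken as zero (an assumption of this sketch).
    """
    n = len(u)
    return thomas_solve([-r] * n, [1.0 + 2.0 * r] * n, [-r] * n, u)
```

A full LOD time step in 3D would apply lod_sweep_1d to every grid line along x, then y, then z. The cost per line is linear in its length, which is why memory traffic rather than arithmetic tends to dominate.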
SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets

PubMed Central

Mao, Hongliang

2017-01-01

Abstract Motivation: Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid diversification in sequences. Results: Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, of which sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimation of the abundance of SINEs in these genomes. Availability and Implementation: The code of SINE_Scan is freely available at http://github.com/maohlzj/SINE_Scan, implemented in PERL and supported on Linux. Contact: wangh8@fudan.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28062442
Dynamic extension of the Simulation Problem Analysis Kernel (SPANK)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sowell, E.F.; Buhl, W.F.

1988-07-15

The Simulation Problem Analysis Kernel (SPANK) is an object-oriented simulation environment for general simulation purposes. Among its unique features is use of the directed graph as the primary data structure, rather than the matrix. This allows straightforward use of graph algorithms for matching variables and equations, and reducing the problem graph for efficient numerical solution. The original prototype implementation demonstrated the principles for systems of algebraic equations, allowing simulation of steady-state, nonlinear systems (Sowell 1986). This paper describes how the same principles can be extended to include dynamic objects, allowing simulation of general dynamic systems. The theory is developed and an implementation is described. An example is taken from the field of building energy system simulation. 2 refs., 9 figs.
Algorithm 971: An Implementation of a Randomized Algorithm for Principal Component Analysis

PubMed Central

LI, HUAMIN; LINDERMAN, GEORGE C.; SZLAM, ARTHUR; STANTON, KELLY P.; KLUGER, YUVAL; TYGERT, MARK

2017-01-01

Recent years have witnessed intense development of randomized methods for low-rank approximation. These methods target principal component analysis and the calculation of truncated singular value decompositions. The present article presents an essentially black-box, foolproof implementation for Mathworks’ MATLAB, a popular software platform for numerical computation. As illustrated via several tests, the randomized algorithms for low-rank approximation outperform or at least match the classical deterministic techniques (such as Lanczos iterations run to convergence) in basically all respects: accuracy, computational efficiency (both speed and memory usage), ease-of-use, parallelizability, and reliability. However, the classical procedures remain the methods of choice for estimating spectral norms and are far superior for calculating the least singular values and corresponding singular vectors (or singular subspaces). PMID:28983138
Implementation of Implicit Adaptive Mesh Refinement in an Unstructured Finite-Volume Flow Solver

NASA Technical Reports Server (NTRS)

Schwing, Alan M.; Nompelis, Ioannis; Candler, Graham V.

2013-01-01

This paper explores the implementation of adaptive mesh refinement in an unstructured, finite-volume solver. Unsteady and steady problems are considered. The effect on the recovery of high-order numerics is explored and the results are favorable. Important to this work is the ability to provide a path for efficient, implicit time advancement. A method using a simple refinement sensor based on undivided differences is discussed and applied to a practical problem: a shock-shock interaction on a hypersonic, inviscid double-wedge. Cases are compared to uniform grids without the use of adapted meshes in order to assess error and computational expense. Discussion of difficulties, advances, and future work prepare this method for additional research. The potential for this method in more complicated flows is described.
New Developments in Modeling MHD Systems on High Performance Computing Architectures

NASA Astrophysics Data System (ADS)

Germaschewski, K.; Raeder, J.; Larson, D. J.; Bhattacharjee, A.

2009-04-01

Modeling the wide range of time and length scales present even in fluid models of plasmas like MHD and X-MHD (Extended MHD including two-fluid effects like Hall term, electron inertia, electron pressure gradient) is challenging even on state-of-the-art supercomputers. In recent years, HPC capacity has continued to grow exponentially, but at the expense of making the computer systems more and more difficult to program in order to get maximum performance. In this paper, we will present a new approach to managing the complexity caused by the need to write efficient codes: separating the numerical description of the problem, in our case a discretized right hand side (r.h.s.), from the actual implementation of efficiently evaluating it. An automatic code generator is used to describe the r.h.s. in a quasi-symbolic form while leaving the translation into efficient and parallelized code to a computer program itself. We implemented this approach for OpenGGCM (Open General Geospace Circulation Model), a model of the Earth's magnetosphere, which was accelerated by a factor of three on regular x86 architecture and a factor of 25 on the Cell BE architecture (commonly known for its deployment in Sony's PlayStation 3).
A Planar Microfluidic Mixer Based on Logarithmic Spirals

PubMed Central

Scherr, Thomas; Quitadamo, Christian; Tesvich, Preston; Park, Daniel Sang-Won; Tiersch, Terrence; Hayes, Daniel; Choi, Jin-Woo; Nandakumar, Krishnaswamy

2013-01-01

A passive, planar micromixer design based on logarithmic spirals is presented. The device was fabricated using polydimethylsiloxane soft photolithography techniques, and mixing performance was characterized via numerical simulation and fluorescent microscopy. Mixing efficiency initially declined as Reynolds number increased, and this trend continued until a Reynolds number of 15 where a minimum was reached at 53%. Mixing efficiency then began to increase reaching a maximum mixing efficiency of 86% at Re = 67. Three-dimensional simulations of fluid mixing in this design were compared to other planar geometries such as the Archimedes spiral and Meandering-S mixers. The implementation of logarithmic curvature offers several unique advantages that enhance mixing, namely a variable cross-sectional area and a logarithmically varying radius of curvature that creates 3-D Dean vortices. These flow phenomena were observed in simulations with multilayered fluid folding and validated with confocal microscopy. This design provides improved mixing performance over a broader range of Reynolds numbers than other reported planar mixers, all while avoiding external force fields, more complicated fabrication processes, and the introduction of flow obstructions or cavities that may unintentionally affect sensitive or particulate-containing samples. Due to the planar design requiring only single-step lithographic features, this compact geometry could be easily implemented into existing micro-total analysis systems requiring effective rapid mixing. PMID:23956497

A planar microfluidic mixer based on logarithmic spirals

NASA Astrophysics Data System (ADS)

Scherr, Thomas; Quitadamo, Christian; Tesvich, Preston; Sang-Won Park, Daniel; Tiersch, Terrence; Hayes, Daniel; Choi, Jin-Woo; Nandakumar, Krishnaswamy; Monroe, W. Todd

2012-05-01

A passive, planar micromixer design based on logarithmic spirals is presented. The device was fabricated using polydimethylsiloxane soft photolithography techniques, and mixing performance was characterized via numerical simulation and fluorescent microscopy. Mixing efficiency initially declined as the Reynolds number increased, and this trend continued until a Reynolds number of 15 where a minimum was reached at 53%. Mixing efficiency then began to increase reaching a maximum mixing efficiency of 86% at Re = 67. Three-dimensional (3D) simulations of fluid mixing in this design were compared to other planar geometries such as the Archimedes spiral and Meandering-S mixers. The implementation of logarithmic curvature offers several unique advantages that enhance mixing, namely a variable cross-sectional area and a logarithmically varying radius of curvature that creates 3D Dean vortices. These flow phenomena were observed in simulations with multilayered fluid folding and validated with confocal microscopy. This design provides improved mixing performance over a broader range of Reynolds numbers than other reported planar mixers, all while avoiding external force fields, more complicated fabrication processes and the introduction of flow obstructions or cavities that may unintentionally affect sensitive or particulate-containing samples. Due to the planar design requiring only single-step lithographic features, this compact geometry could be easily implemented into existing micro-total analysis systems requiring effective rapid mixing.
Higher-order compositional modeling of three-phase flow in 3D fractured porous media based on cross-flow equilibrium

NASA Astrophysics Data System (ADS)

Moortgat, Joachim; Firoozabadi, Abbas

2013-10-01

Numerical simulation of multiphase compositional flow in fractured porous media, when all the species can transfer between the phases, is a real challenge. Despite the broad applications in hydrocarbon reservoir engineering and hydrology, a compositional numerical simulator for three-phase flow in fractured media has not appeared in the literature, to the best of our knowledge. In this work, we present a three-phase fully compositional simulator for fractured media, based on higher-order finite element methods. To achieve computational efficiency, we invoke the cross-flow equilibrium (CFE) concept between discrete fractures and a small neighborhood in the matrix blocks. We adopt the mixed hybrid finite element (MHFE) method to approximate convective Darcy fluxes and the pressure equation. This approach is the most natural choice for flow in fractured media. The mass balance equations are discretized by the discontinuous Galerkin (DG) method, which is perhaps the most efficient approach to capture physical discontinuities in phase properties at the matrix-fracture interfaces and at phase boundaries. In this work, we account for gravity and Fickian diffusion. The modeling of capillary effects is discussed in a separate paper. We present the mathematical framework, using the implicit-pressure-explicit-composition (IMPEC) scheme, which facilitates rigorous thermodynamic stability analyses and the computation of phase behavior effects to account for transfer of species between the phases. A deceptively simple CFL condition is implemented to improve numerical stability and accuracy. We provide six numerical examples at both small and larger scales and in two and three dimensions, to demonstrate powerful features of the formulation.
Permeability Sensitivity Functions and Rapid Simulation of Hydraulic-Testing Measurements Using Perturbation Theory

NASA Astrophysics Data System (ADS)

Escobar Gómez, J. D.; Torres-Verdín, C.

2018-03-01

Single-well pressure-diffusion simulators enable improved quantitative understanding of hydraulic-testing measurements in the presence of arbitrary spatial variations of rock properties. Simulators of this type implement robust numerical algorithms which are often computationally expensive, thereby making the solution of the forward modeling problem onerous and inefficient. We introduce a time-domain perturbation theory for anisotropic permeable media to efficiently and accurately approximate the transient pressure response of spatially complex aquifers. Although theoretically valid for any spatially dependent rock/fluid property, our single-phase flow study emphasizes arbitrary spatial variations of permeability and anisotropy, which constitute key objectives of hydraulic-testing operations. Contrary to time-honored techniques, the perturbation method invokes pressure-flow deconvolution to compute the background medium's permeability sensitivity function (PSF) with a single numerical simulation run. Subsequently, the first-order term of the perturbed solution is obtained by solving an integral equation that weighs the spatial variations of permeability with the spatial-dependent and time-dependent PSF. Finally, discrete convolution transforms the constant-flow approximation to arbitrary multirate conditions. Multidimensional numerical simulation studies for a wide range of single-well field conditions indicate that perturbed solutions can be computed in less than a few CPU seconds with relative errors in pressure of <5%, corresponding to perturbations in background permeability of up to two orders of magnitude. Our work confirms that the proposed joint perturbation-convolution (JPC) method is an efficient alternative to analytical and numerical solutions for accurate modeling of pressure-diffusion phenomena induced by Neumann or Dirichlet boundary conditions.
Astrophysical fluid simulations of thermally ideal gases with non-constant adiabatic index: numerical implementation

NASA Astrophysics Data System (ADS)

Vaidya, B.; Mignone, A.; Bodo, G.; Massaglia, S.

2015-08-01

Context. An equation of state (EoS) is a relation between thermodynamic state variables and it is essential for closing the set of equations describing a fluid system. Although an ideal EoS with a constant adiabatic index Γ is the preferred choice owing to its simplistic implementation, many astrophysical fluid simulations may benefit from a more sophisticated treatment that can account for diverse chemical processes. Aims: In the present work we first review the basic thermodynamic principles of a gas mixture in terms of its thermal and caloric EoS by including effects like ionization, dissociation, and temperature dependent degrees of freedom such as molecular vibrations and rotations. The formulation is revisited in the context of plasmas that are either in equilibrium conditions (local thermodynamic- or collisional excitation-equilibria) or described by non-equilibrium chemistry coupled to optically thin radiative cooling. We then present a numerical implementation of thermally ideal gases obeying a more general caloric EoS with non-constant adiabatic index in Godunov-type numerical schemes. Methods: We discuss the necessary modifications to the Riemann solver and to the conversion between total energy and pressure (or vice versa) routinely invoked in Godunov-type schemes. We then present two different approaches for computing the EoS. The first employs root-finder methods and it is best suited for EoS in analytical form. The second is based on lookup tables and interpolation and results in a more computationally efficient approach, although care must be taken to ensure thermodynamic consistency. Results: A number of selected benchmarks demonstrate that the employment of a non-ideal EoS can lead to important differences in the solution when the temperature range is 500-10^4 K where dissociation and ionization occur. 
The implementation of selected EoS introduces additional computational costs although the employment of lookup table methods (when possible) can significantly reduce the overhead by a factor of ~ 3-4.
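The lookup-table approach described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function internal_energy below is a hypothetical, monotonically increasing caloric EoS in arbitrary units chosen only for illustration, and the table size and log spacing over 500 to 10^4 K are assumptions. The sketch tabulates e(T) once and then recovers T from e by binary search plus linear interpolation, replacing a root-finder call.

```python
import bisect


def internal_energy(T):
    """Hypothetical caloric EoS e(T), monotonic in T (illustration only)."""
    return 1.5 * T + T * T / (T + 5000.0)


def build_table(t_min=500.0, t_max=1.0e4, n=256):
    """Tabulate e(T) on n log-spaced temperatures between t_min and t_max."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    temps = [t_min * ratio ** k for k in range(n)]
    energies = [internal_energy(T) for T in temps]
    return temps, energies


def temperature_from_energy(e, temps, energies):
    """Invert the table: find T such that e(T) = e by linear interpolation.

    Values outside the tabulated range are clamped to the table ends.
    """
    if e <= energies[0]:
        return temps[0]
    if e >= energies[-1]:
        return temps[-1]
    j = bisect.bisect_right(energies, e)  # energies[j-1] <= e < energies[j]
    e0, e1 = energies[j - 1], energies[j]
    w = (e - e0) / (e1 - e0)
    return temps[j - 1] + w * (temps[j] - temps[j - 1])
```

For thermodynamic consistency, the same table and the same interpolant should be used in both directions (T to e and e to T), so that a round trip returns the starting state to within interpolation error.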
Barotropic Tidal Predictions and Validation in a Relocatable Modeling Environment. Revised

NASA Technical Reports Server (NTRS)

Mehra, Avichal; Passi, Ranjit; Kantha, Lakshmi; Payne, Steven; Brahmachari, Shuvobroto

1998-01-01

Under funding from the Office of Naval Research (ONR), the Mississippi State University Center for Air Sea Technology (CAST) has been working on developing a Relocatable Modeling Environment (RME) to provide a uniform and unbiased infrastructure for efficiently configuring numerical models in any geographic or oceanic region. Under Naval Oceanographic Office (NAVOCEANO) funding, the model was implemented and tested for NAVOCEANO use. With our current emphasis on ocean tidal modeling, CAST has adopted the Colorado University's numerical ocean model, known as CURReNTSS (Colorado University Rapidly Relocatable Nestable Storm Surge) Model, as the model of choice. During the RME development process, CURReNTSS has been relocated to several coastal oceanic regions, providing excellent results that demonstrate its veracity. This report documents the model validation results and provides a brief description of the Graphical User Interface.
Self adaptive solution strategies: Locally bound constrained Newton Raphson solution algorithms

NASA Technical Reports Server (NTRS)

Padovan, Joe

1991-01-01

A summary is given of strategies which enable the automatic adjustment of the constraint surfaces recently used to extend the range and numerical stability/efficiency of nonlinear finite element equation solvers. In addition to handling kinematic and material induced nonlinearity, both pre- and postbuckling behavior can be treated. The scheme employs localized bounds on various hierarchical partitions of the field variables. These are used to resize, shape, and orient the global constraint surface, thereby enabling essentially automatic load/deflection incrementation. Due to the generality of the approach taken, it can be implemented in conjunction with the constraints of an arbitrary functional type. To benchmark the method, several numerical experiments are presented. These include problems involving kinematic and material nonlinearity, as well as pre- and postbuckling characteristics. Also included is a list of papers published in the course of the work.
Implicit Total Variation Diminishing (TVD) schemes for steady-state calculations

NASA Technical Reports Server (NTRS)

Yee, H. C.; Warming, R. F.; Harten, A.

1983-01-01

The application of a new implicit unconditionally stable high resolution total variation diminishing (TVD) scheme to steady state calculations is examined. It is a member of a one parameter family of explicit and implicit second order accurate schemes developed by Harten for the computation of weak solutions of hyperbolic conservation laws. This scheme is guaranteed not to generate spurious oscillations for a nonlinear scalar equation and a constant coefficient system. Numerical experiments show that this scheme not only has a rapid convergence rate, but also generates a highly resolved approximation to the steady state solution. A detailed implementation of the implicit scheme for the one and two dimensional compressible inviscid equations of gas dynamics is presented. Some numerical computations of one and two dimensional fluid flows containing shocks demonstrate the efficiency and accuracy of this new scheme.
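The TVD property named in this abstract (the total variation of the solution does not grow from one step to the next) can be demonstrated on a much simpler problem than the one treated in the paper. The sketch below is an explicit, minmod-limited scheme for linear advection on a periodic grid, not Harten's implicit scheme; the function names and the parameter cfl (wave speed times time step over grid spacing, assumed to lie in [0, 1] with positive wave speed) are choices made for this illustration.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude, or zero if signs differ."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def total_variation(u):
    """Total variation of a periodic grid function."""
    n = len(u)
    return sum(abs(u[(i + 1) % n] - u[i]) for i in range(n))


def tvd_advect_step(u, cfl):
    """One explicit step of u_t + a*u_x = 0 (a > 0) on a periodic grid.

    Uses minmod-limited slopes, giving second-order accuracy away from
    extrema while keeping the total variation non-increasing for
    0 <= cfl <= 1.
    """
    n = len(u)
    # Limited slope in each cell from backward and forward differences.
    slope = [minmod(u[i] - u[i - 1], u[(i + 1) % n] - u[i]) for i in range(n)]
    # Upwind value at the right face of cell i, with limited correction.
    face = [u[i] + 0.5 * (1.0 - cfl) * slope[i] for i in range(n)]
    # Conservative update; index -1 wraps around for periodicity.
    return [u[i] - cfl * (face[i] - face[i - 1]) for i in range(n)]
```

Advecting a square pulse with this step keeps the solution between its initial minimum and maximum, so no spurious oscillations appear at the discontinuities.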
Research and Application of an Air Quality Early Warning System Based on a Modified Least Squares Support Vector Machine and a Cloud Model.

PubMed

Wang, Jianzhou; Niu, Tong; Wang, Rui

2017-03-02

The worsening atmospheric pollution increases the necessity of air quality early warning systems (EWSs). Despite the fact that a massive amount of investigation about EWS in theory and practicality has been conducted by numerous researchers, studies concerning the quantification of uncertain information and comprehensive evaluation are still lacking, which impedes further development in the area. In this paper, firstly a comprehensive warning system is proposed, which consists of two vital indispensable modules, namely effective forecasting and scientific evaluation, respectively. For the forecasting module, a novel hybrid model combining the theory of data preprocessing and numerical optimization is first developed to implement effective forecasting for air pollutant concentration. Especially, in order to further enhance the accuracy and robustness of the warning system, interval forecasting is implemented to quantify the uncertainties generated by forecasts, which can provide significant risk signals by using point forecasting for decision-makers. For the evaluation module, a cloud model, based on probability and fuzzy set theory, is developed to perform comprehensive evaluations of air quality, which can realize the transformation between qualitative concept and quantitative data. To verify the effectiveness and efficiency of the warning system, extensive simulations based on air pollutants data from Dalian in China were effectively implemented, which illustrate that the warning system is not only remarkably high-performance, but also widely applicable.
Research and Application of an Air Quality Early Warning System Based on a Modified Least Squares Support Vector Machine and a Cloud Model

PubMed Central

Wang, Jianzhou; Niu, Tong; Wang, Rui

2017-01-01

The worsening atmospheric pollution increases the necessity of air quality early warning systems (EWSs). Despite the fact that a massive amount of investigation about EWS in theory and practicality has been conducted by numerous researchers, studies concerning the quantification of uncertain information and comprehensive evaluation are still lacking, which impedes further development in the area. In this paper, firstly a comprehensive warning system is proposed, which consists of two vital indispensable modules, namely effective forecasting and scientific evaluation, respectively. For the forecasting module, a novel hybrid model combining the theory of data preprocessing and numerical optimization is first developed to implement effective forecasting for air pollutant concentration. Especially, in order to further enhance the accuracy and robustness of the warning system, interval forecasting is implemented to quantify the uncertainties generated by forecasts, which can provide significant risk signals by using point forecasting for decision-makers. For the evaluation module, a cloud model, based on probability and fuzzy set theory, is developed to perform comprehensive evaluations of air quality, which can realize the transformation between qualitative concept and quantitative data. To verify the effectiveness and efficiency of the warning system, extensive simulations based on air pollutants data from Dalian in China were effectively implemented, which illustrate that the warning system is not only remarkably high-performance, but also widely applicable. PMID:28257122
Numerical methods for large eddy simulation of acoustic combustion instabilities

NASA Astrophysics Data System (ADS)

Wall, Clifton T.

Acoustic combustion instabilities occur when interaction between the combustion process and acoustic modes in a combustor results in periodic oscillations in pressure, velocity, and heat release. If sufficiently large in amplitude, these instabilities can cause operational difficulties or the failure of combustor hardware. In many situations, the dominant instability is the result of the interaction between a low frequency acoustic mode of the combustor and the large scale hydrodynamics. Large eddy simulation (LES), therefore, is a promising tool for the prediction of these instabilities, since both the low frequency acoustic modes and the large scale hydrodynamics are well resolved in LES. Problems with the tractability of such simulations arise, however, due to the difficulty of solving the compressible Navier-Stokes equations efficiently at low Mach number and due to the large number of acoustic periods that are often required for such instabilities to reach limit cycles. An implicit numerical method for the solution of the compressible Navier-Stokes equations has been developed which avoids the acoustic CFL restriction, allowing for significant efficiency gains at low Mach number, while still resolving the low frequency acoustic modes of interest. In the limit of a uniform grid the numerical method causes no artificial damping of acoustic waves. New, non-reflecting boundary conditions have also been developed for use with the characteristic-based approach of Poinsot and Lele (1992). The new boundary conditions are implemented in a manner which allows for significant reduction of the computational domain of an LES by eliminating the need to perform LES in regions where one-dimensional acoustics significantly affect the instability but details of the hydrodynamics do not. These new numerical techniques have been demonstrated in an LES of an experimental combustor. 
The new techniques are shown to be an efficient means of performing LES of acoustic combustion instabilities and are shown to accurately predict the occurrence and frequency of the dominant mode of the instability observed in the experiment.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lott, P. Aaron; Woodward, Carol S.; Evans, Katherine J.

Performing accurate and efficient numerical simulation of global atmospheric climate models is challenging due to the disparate length and time scales over which physical processes interact. Implicit solvers enable the physical system to be integrated with a time step commensurate with processes being studied. The dominant cost of an implicit time step is the ancillary linear system solves, so we have developed a preconditioner aimed at improving the efficiency of these linear system solves. Our preconditioner is based on an approximate block factorization of the linearized shallow-water equations and has been implemented within the spectral element dynamical core within the Community Atmospheric Model (CAM-SE). Furthermore, in this paper we discuss the development and scalability of the preconditioner for a suite of test cases with the implicit shallow-water solver within CAM-SE.
An indirect approach to the extensive calculation of relationship coefficients

PubMed Central

Colleau, Jean-Jacques

2002-01-01

A method was described for calculating population statistics on relationship coefficients without using corresponding individual data. It relied on the structure of the inverse of the numerator relationship matrix between individuals under investigation and ancestors. Computation times were observed on simulated populations and were compared to those incurred with a conventional direct approach. The indirect approach turned out to be very efficient for multiplying the relationship matrix corresponding to planned matings (full design) by any vector. Efficiency was generally still good or very good for calculating statistics on these simulated populations. An extreme implementation of the method is the calculation of inbreeding coefficients themselves. Relative performances of the indirect method were good except when many full-sibs during many generations existed in the population. PMID:12270102
Simulating Progressive Damage of Notched Composite Laminates with Various Lamination Schemes

NASA Astrophysics Data System (ADS)

Mandal, B.; Chakrabarti, A.

2017-05-01

A three dimensional finite element based progressive damage model has been developed for the failure analysis of notched composite laminates. The material constitutive relations and the progressive damage algorithms are implemented into finite element code ABAQUS using user-defined subroutine UMAT. The existing failure criteria for the composite laminates are modified by including the failure criteria for fiber/matrix shear damage and delamination effects. The proposed numerical model is quite efficient and simple compared to other progressive damage models available in the literature. The efficiency of the present constitutive model and the computational scheme is verified by comparing the simulated results with the results available in the literature. A parametric study has been carried out to investigate the effect of change in lamination scheme on the failure behaviour of notched composite laminates.
The FLAME-slab method for electromagnetic wave scattering in aperiodic slabs

NASA Astrophysics Data System (ADS)

Mansha, Shampy; Tsukerman, Igor; Chong, Y. D.

2017-12-01

The proposed numerical method, "FLAME-slab," solves electromagnetic wave scattering problems for aperiodic slab structures by exploiting short-range regularities in these structures. The computational procedure involves special difference schemes with high accuracy even on coarse grids. These schemes are based on Trefftz approximations, utilizing functions that locally satisfy the governing differential equations, as is done in the Flexible Local Approximation Method (FLAME). Radiation boundary conditions are implemented via Fourier expansions in the air surrounding the slab. When applied to ensembles of slab structures with identical short-range features, such as amorphous or quasicrystalline lattices, the method is significantly more efficient, both in runtime and in memory consumption, than traditional approaches. This efficiency is due to the fact that the Trefftz functions need to be computed only once for the whole ensemble.
Three-dimensional cascaded lattice Boltzmann method: Improved implementation and consistent forcing scheme

NASA Astrophysics Data System (ADS)

Fei, Linlin; Luo, Kai H.; Li, Qing

2018-05-01

The cascaded or central-moment-based lattice Boltzmann method (CLBM) proposed in [Phys. Rev. E 73, 066705 (2006), 10.1103/PhysRevE.73.066705] possesses very good numerical stability. However, two constraints exist in three-dimensional (3D) CLBM simulations. First, the conventional implementation for 3D CLBM involves cumbersome operations and requires much higher computational cost compared to the single-relaxation-time (SRT) LBM. Second, it is a challenge to accurately incorporate a general force field into the 3D CLBM. In this paper, we present an improved method to implement CLBM in 3D. The main strategy is to adopt a simplified central moment set and carry out the central-moment-based collision operator based on a general multi-relaxation-time (GMRT) framework. Next, the recently proposed consistent forcing scheme for CLBM [Fei and Luo, Phys. Rev. E 96, 053307 (2017), 10.1103/PhysRevE.96.053307] is extended to incorporate a general force field into 3D CLBM. Compared with the recently developed nonorthogonal CLBM [Rosis, Phys. Rev. E 95, 013310 (2017), 10.1103/PhysRevE.95.013310], our implementation is proved to reduce the computational cost significantly. The inconsistency of adopting the discrete equilibrium distribution functions in the nonorthogonal CLBM is analyzed and validated. The 3D CLBM developed here in conjunction with the consistent forcing scheme is verified through numerical simulations of several canonical force-driven flows, highlighting very good properties in terms of accuracy, convergence, and consistency with the nonslip rule. Finally, the techniques developed here for 3D CLBM can be applied to make the implementation and execution of 3D MRT-LBM more efficient.
GPU acceleration of Runge Kutta-Fehlberg and its comparison with Dormand-Prince method

NASA Astrophysics Data System (ADS)

Seen, Wo Mei; Gobithaasan, R. U.; Miura, Kenjiro T.

2014-07-01

There is a significant reduction of processing time and speedup of performance in computer graphics with the emergence of Graphics Processing Units (GPUs). GPUs have been developed to surpass the Central Processing Unit (CPU) in terms of performance and processing speed. This evolution has opened up a new area in computing and research where highly parallel GPUs are used for non-graphical algorithms. Physical or phenomenal simulations and modelling can be accelerated through General Purpose Graphics Processing Units (GPGPU) and Compute Unified Device Architecture (CUDA) implementations. These phenomena can be represented with mathematical models in the form of Ordinary Differential Equations (ODEs), which capture the rate of change between independent and dependent variables. ODEs are numerically integrated over time in order to simulate these behaviours. The classical Runge-Kutta (RK) scheme is the common method used to numerically solve ODEs. The Runge-Kutta-Fehlberg (RKF) scheme has been specially developed to provide an estimate of the principal local truncation error at each step, known as the embedding estimate technique. This paper delves into the implementation of the RKF scheme for GPU devices and compares its results with the Dormand-Prince method. A pseudo code is developed to show the implementation in detail. Hence, practitioners will be able to understand the data allocation in GPU, formation of RKF kernels and the flow of data to/from GPU-CPU upon RKF kernel evaluation. The pseudo code is then written in C Language and two ODE models are executed to show the achievable speedup as compared to CPU implementation. The accuracy and efficiency of the proposed implementation method are discussed in the final section of this paper.
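The embedded error estimate described in this abstract can be illustrated with a minimal CPU-side sketch of a single Runge-Kutta-Fehlberg 4(5) step. It uses the standard published Fehlberg coefficients; the function name `rkf45_step` and the scalar form are illustrative assumptions, and the sketch does not reproduce the authors' GPU kernels or pseudo code.

```python
def rkf45_step(f, t, y, h):
    """One Runge-Kutta-Fehlberg 4(5) step for a scalar ODE y' = f(t, y).

    Returns the 5th-order solution and |y5 - y4|, the embedded estimate of
    the local truncation error. (Hypothetical helper, not the paper's code.)
    """
    k1 = h * f(t, y)
    k2 = h * f(t + h / 4, y + k1 / 4)
    k3 = h * f(t + 3 * h / 8, y + 3 * k1 / 32 + 9 * k2 / 32)
    k4 = h * f(t + 12 * h / 13,
               y + (1932 * k1 - 7200 * k2 + 7296 * k3) / 2197)
    k5 = h * f(t + h,
               y + 439 * k1 / 216 - 8 * k2 + 3680 * k3 / 513 - 845 * k4 / 4104)
    k6 = h * f(t + h / 2,
               y - 8 * k1 / 27 + 2 * k2 - 3544 * k3 / 2565
               + 1859 * k4 / 4104 - 11 * k5 / 40)

    # 4th- and 5th-order solutions built from the same six stage values.
    y4 = y + 25 * k1 / 216 + 1408 * k3 / 2565 + 2197 * k4 / 4104 - k5 / 5
    y5 = (y + 16 * k1 / 135 + 6656 * k3 / 12825 + 28561 * k4 / 56430
          - 9 * k5 / 50 + 2 * k6 / 55)
    return y5, abs(y5 - y4)
```

In an adaptive integrator the returned estimate would typically be compared with a tolerance to accept the step or shrink `h`; in a GPU setting, many independent systems could be advanced in parallel with the same stage arithmetic.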
GPU-based simulation of optical propagation through turbulence for active and passive imaging

NASA Astrophysics Data System (ADS)

Monnier, Goulven; Duval, François-Régis; Amram, Solène

2014-10-01

IMOTEP is a GPU-based (Graphics Processing Unit) software relying on a fast parallel implementation of Fresnel diffraction through successive phase screens. Its applications include active imaging, laser telemetry and passive imaging through turbulence with anisoplanatic spatial and temporal fluctuations. Thanks to parallel implementation on GPU, speedups ranging from 40X to 70X are achieved. The present paper gives a brief overview of IMOTEP models, algorithms, implementation and user interface. It then focuses on major improvements recently brought to the anisoplanatic imaging simulation method. Previously, we took advantage of the computational power offered by the GPU to develop a simulation method based on large series of deterministic realisations of the PSF distorted by turbulence. The phase screen propagation algorithm, by reproducing higher moments of the incident wavefront distortion, provides realistic PSFs. However, we first used a coarse Gaussian model to fit the numerical PSFs and characterise their spatial statistics through only 3 parameters (two-dimensional displacements of centroid and width). As a result, this approach was unable to reproduce the effects related to the details of the PSF structure, especially the "speckles" leading to prominent high-frequency content in short-exposure images. To overcome this limitation, we recently implemented a new empirical model of the PSF, based on Principal Components Analysis (PCA), intended to capture most of the PSF complexity. The GPU implementation allows estimating and handling efficiently the numerous (up to several hundreds) principal components typically required under the strong turbulence regime. A first demanding computational step involves PCA, phase screen propagation and covariance estimates. In a second step, realistic instantaneous images, fully accounting for anisoplanatic effects, are quickly generated. Preliminary results are presented.
Streamline integration as a method for two-dimensional elliptic grid generation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wiesenberger, M., E-mail: Matthias.Wiesenberger@uibk.ac.at; Held, M.; Einkemmer, L.

We propose a new numerical algorithm to construct a structured numerical elliptic grid of a doubly connected domain. Our method is applicable to domains with boundaries defined by two contour lines of a two-dimensional function. Furthermore, we can adapt any analytically given boundary aligned structured grid, which specifically includes polar and Cartesian grids. The resulting coordinate lines are orthogonal to the boundary. Grid points as well as the elements of the Jacobian matrix can be computed efficiently and up to machine precision. In the simplest case we construct conformal grids, yet with the help of weight functions and monitor metrics we can control the distribution of cells across the domain. Our algorithm is parallelizable and easy to implement with elementary numerical methods. We assess the quality of grids by considering both the distribution of cell sizes and the accuracy of the solution to elliptic problems. Among the tested grids these key properties are best fulfilled by the grid constructed with the monitor metric approach. Highlights: • Construct structured, elliptic numerical grids with elementary numerical methods. • Align coordinate lines with or make them orthogonal to the domain boundary. • Compute grid points and metric elements up to machine precision. • Control cell distribution by adaption functions or monitor metrics.
Development of a CFD Code for Analysis of Fluid Dynamic Forces in Seals

NASA Technical Reports Server (NTRS)

Athavale, Mahesh M.; Przekwas, Andrzej J.; Singhal, Ashok K.

1991-01-01

The aim is to develop a 3-D computational fluid dynamics (CFD) code for the analysis of fluid flow in cylindrical seals and evaluation of the dynamic forces on the seals. This code is expected to serve as a scientific tool for detailed flow analysis as well as a check for the accuracy of the 2D industrial codes. The features necessary in the CFD code are outlined. The initial focus was to develop or modify and implement new techniques and physical models. These include collocated grid formulation, rotating coordinate frames and moving grid formulation. Other advanced numerical techniques include higher order spatial and temporal differencing and an efficient linear equation solver. These techniques were implemented in a 2D flow solver for initial testing. Several benchmark test cases were computed using the 2D code, and the results of these were compared to analytical solutions or experimental data to check the accuracy. Tests presented here include planar wedge flow, flow due to an enclosed rotor, and flow in a 2D seal with a whirling rotor. Comparisons between numerical and experimental results for an annular seal and a 7-cavity labyrinth seal are also included.
Parameter optimization for the visco-hyperelastic constitutive model of tendon using FEM.

PubMed

Tang, C Y; Ng, G Y F; Wang, Z W; Tsui, C P; Zhang, G

2011-01-01

Numerous constitutive models describing the mechanical properties of tendons have been proposed during the past few decades. However, few were widely used owing to the lack of implementation in general finite element (FE) software, and very few systematic studies have been done on selecting the most appropriate parameters for these constitutive laws. In this work, the visco-hyperelastic constitutive model of the tendon, implemented through the use of a three-parameter Mooney-Rivlin form and a sixty-four-parameter Prony series, was first analyzed using ANSYS FE software. Afterwards, an integrated optimization scheme was developed by coupling two optimization toolboxes (OPTs) of ANSYS and MATLAB for estimating these unknown constitutive parameters of the tendon. Finally, a group of Sprague-Dawley rat tendons was used to carry out experimental and numerical simulation investigations. The simulated results showed good agreement with the experimental data. An important finding was that a large number of Maxwell elements was not necessary for assuring the accuracy of the model, a point that is often neglected in the open literature. Thus, these results showed that the constitutive parameter optimization scheme was reliable and highly efficient. Furthermore, the approach can be extended to study other tendons or ligaments, as well as any visco-hyperelastic solid materials.

Investigation of flow in data rack

NASA Astrophysics Data System (ADS)

Manoch, Lukáš; Nožička, Jiří; Pohan, Petr

2012-04-01

The main purpose of this paper was to set up a functioning numerical model of a data rack verified by experimental measurement. The numerical model was verified by means of the PIV method (Particle Image Velocimetry). The numerical model was "found" using assumed and preset values from the experimental measurement, which represent the boundary conditions. The server model was conceived as a four-channel model with a controlled flow rate and without simulation of heat transfer. The flow rate in each channel was implemented by means of a pressure loss. The numerical model was further used for simulation of several phases and configurations of the data rack (21U rack space) fitted with two Dell Precision R5400 server workstations. The flow field at the inlet of the data rack in front of the workstations was observed and evaluated with a 2U free space left between the workstations and the remaining inlet space either blanked off or fully opened. The results of this paper will serve for designing optimization treatment of the data rack from the viewpoint of cooling efficiency, both within the data rack and within the data center design.
THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

NASA Astrophysics Data System (ADS)

Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

2015-07-01

The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Derived from TOUGHREACT, a well-established code for simulating subsurface multi-phase flow and reactive transport problems, we developed THC-MP, a high performance computing code for massively parallel computers, which greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structure, and implemented the data initialization and exchange between the computing nodes and the core solving module using the hybrid parallel iterative and direct solver. Numerical accuracy of THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from the parallel computing and sequential computing (original code). Execution efficiency and code scalability were examined through field scale carbon sequestration applications on a multicore cluster. The results successfully demonstrate the enhanced performance of THC-MP on parallel computing facilities.
Numerical integration of the extended variable generalized Langevin equation with a positive Prony representable memory kernel.

PubMed

Baczewski, Andrew D; Bond, Stephen D

2013-07-28

Generalized Langevin dynamics (GLD) arise in the modeling of a number of systems, ranging from structured fluids that exhibit a viscoelastic mechanical response, to biological systems, and other media that exhibit anomalous diffusive phenomena. Molecular dynamics (MD) simulations that include GLD in conjunction with external and/or pairwise forces require the development of numerical integrators that are efficient, stable, and have known convergence properties. In this article, we derive a family of extended variable integrators for the Generalized Langevin equation with a positive Prony series memory kernel. Using stability and error analysis, we identify a superlative choice of parameters and implement the corresponding numerical algorithm in the LAMMPS MD software package. Salient features of the algorithm include exact conservation of the first and second moments of the equilibrium velocity distribution in some important cases, stable behavior in the limit of conventional Langevin dynamics, and the use of a convolution-free formalism that obviates the need for explicit storage of the time history of particle velocities. Capability is demonstrated with respect to accuracy in numerous canonical examples, stability in certain limits, and an exemplary application in which the effect of a harmonic confining potential is mapped onto a memory kernel.
MUTILS - a set of efficient modeling tools for multi-core CPUs implemented in MEX

NASA Astrophysics Data System (ADS)

Krotkiewski, Marcin; Dabrowski, Marcin

2013-04-01

The need for computational performance is common in scientific applications, and in particular in numerical simulations, where high resolution models require efficient processing of large amounts of data. Especially in the context of geological problems the need to increase the model resolution to resolve physical and geometrical complexities seems to have no limits. Alas, the performance of new generations of CPUs does not improve any longer by simply increasing clock speeds. Current industrial trends are to increase the number of computational cores. As a result, parallel implementations are required in order to fully utilize the potential of new processors, and to study more complex models. We target simulations on small to medium scale shared memory computers: laptops and desktop PCs with ~8 CPU cores and up to tens of GB of memory to high-end servers with ~50 CPU cores and hundreds of GB of memory. In this setting MATLAB is often the environment of choice for scientists that want to implement their own models with little effort. It is a useful general purpose mathematical software package, but due to its versatility some of its functionality is not as efficient as it could be. In particular, the challenges of modern multi-core architectures are not fully addressed. We have developed MILAMIN 2 - an efficient FEM modeling environment written in native MATLAB. Amongst others, MILAMIN provides functions to define model geometry, generate and convert structured and unstructured meshes (also through interfaces to external mesh generators), compute element and system matrices, apply boundary conditions, solve the system of linear equations, address non-linear and transient problems, and perform post-processing. MILAMIN strives to combine the ease of code development and the computational efficiency. Where possible, the code is optimized and/or parallelized within the MATLAB framework.
Native MATLAB is augmented with the MUTILS library - a set of MEX functions that implement the computationally intensive, performance critical parts of the code, which we have identified to be bottlenecks. Here, we discuss the functionality and performance of the MUTILS library. Currently, it includes: 1. time and memory efficient assembly of sparse matrices for FEM simulations; 2. parallel sparse matrix - vector product with optimizations specific to symmetric matrices and multiple degrees of freedom per node; 3. parallel point in triangle location and point in tetrahedron location for unstructured, adaptive 2D and 3D meshes (useful for 'marker in cell' type of methods); 4. parallel FEM interpolation for 2D and 3D meshes of elements of different types and orders, and for different number of degrees of freedom per node; 5. a stand-alone, MEX implementation of the Conjugate Gradients iterative solver; 6. interface to METIS graph partitioning and a fast implementation of RCM reordering.
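As an illustration of the kind of kernel named in item 5, the following is a minimal serial sketch of the textbook Conjugate Gradients iteration for a symmetric positive definite system. It is not the MUTILS MEX code: the names `conjugate_gradient`, `matvec` and `dot` are hypothetical, and the matrix is supplied only through a matrix-vector product callback, which is where a parallel sparse product such as item 2 would plug in.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    `matvec(v)` must return A @ v as a list. Starts from x = 0 and stops
    when the residual 2-norm drops to `tol` or after `max_iter` iterations.
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 <= tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)                      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                           # keeps directions A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

Passing the operator as a callback keeps the solver independent of the sparse storage format, so the same loop works with a dense test matrix or a compressed sparse structure.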
Relationship between the spectral line based weighted-sum-of-gray-gases model and the full spectrum k-distribution model

NASA Astrophysics Data System (ADS)

Chu, Huaqiang; Liu, Fengshan; Consalvi, Jean-Louis

2014-08-01

The relationship between the spectral line based weighted-sum-of-gray-gases (SLW) model and the full-spectrum k-distribution (FSK) model in isothermal and homogeneous media is investigated in this paper. The SLW transfer equation can be derived from the FSK transfer equation expressed in the k-distribution function without approximation. This confirms that the SLW model is equivalent to the FSK model in the k-distribution function form. The numerical implementation of the SLW relies on a somewhat arbitrary discretization of the absorption cross section, whereas the FSK model finds the spectrally integrated intensity by integration over the smoothly varying cumulative-k distribution function using a Gaussian quadrature scheme. The latter is therefore in general more efficient, as fewer gray gases are required to achieve a prescribed accuracy. Sample numerical calculations were conducted to demonstrate the different efficiency of these two methods. The FSK model is found to be more accurate than the SLW model for radiation transfer in H2O; however, the SLW model is more accurate in media containing CO2 as the only radiating gas due to its explicit treatment of ‘clear gas.’
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, (2) using a compact scheme to gain high order accuracy in numerical discretization.
The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
An Efficient Local Correlation Matrix Decomposition Approach for the Localization Implementation of Ensemble-Based Assimilation Methods

NASA Astrophysics Data System (ADS)

Zhang, Hongqin; Tian, Xiangjun

2018-04-01

Ensemble-based data assimilation methods often use the so-called localization scheme to improve the representation of the ensemble background error covariance (Be). Extensive research has been undertaken to reduce the computational cost of these methods by using the localized ensemble samples to localize Be by means of a direct decomposition of the local correlation matrix C. However, the computational costs of the direct decomposition of the local correlation matrix C are still extremely high due to its high dimension. In this paper, we propose an efficient local correlation matrix decomposition approach based on the concept of alternating directions. This approach is intended to avoid direct decomposition of the correlation matrix. Instead, we first decompose the correlation matrix into 1-D correlation matrices in the three coordinate directions, then construct their empirical orthogonal function decomposition at low resolution. This procedure is followed by the 1-D spline interpolation process to transform the above decompositions to the high-resolution grid. Finally, an efficient correlation matrix decomposition is achieved by computing the very similar Kronecker product. We conducted a series of comparison experiments to illustrate the validity and accuracy of the proposed local correlation matrix decomposition approach. The effectiveness of the proposed correlation matrix decomposition approach and its efficient localization implementation of the nonlinear least-squares four-dimensional variational assimilation are further demonstrated by several groups of numerical experiments based on the Advanced Research Weather Research and Forecasting model.
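The reason a Kronecker product makes this decomposition inexpensive can be stated in one identity. The block below is a sketch under the assumption that the local correlation matrix is separable into 1-D correlation matrices along the three coordinate directions; the symbols $C_x, C_y, C_z, E_d, \Lambda_d$ and the ordering of the factors are illustrative and not taken from the paper.

```latex
\begin{aligned}
C &= C_x \otimes C_y \otimes C_z, \qquad
C_d = E_d \Lambda_d E_d^{\mathsf T} \quad (d = x, y, z), \\
C &= (E_x \otimes E_y \otimes E_z)\,
     (\Lambda_x \otimes \Lambda_y \otimes \Lambda_z)\,
     (E_x \otimes E_y \otimes E_z)^{\mathsf T}.
\end{aligned}
```

The second line follows from the mixed-product property $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$, so only three small 1-D eigenproblems need to be solved instead of one eigenproblem of the full dimension.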
An Implementation Method of the Fractional-Order PID Control System Considering the Memory Constraint and its Application to the Temperature Control of Heat Plate

NASA Astrophysics Data System (ADS)

Sasano, Koji; Okajima, Hiroshi; Matsunaga, Nobutomo

Recently, fractional-order PID (FO-PID) control, an extension of PID control, has attracted attention. The FO-PID requires a high-order filter, but such a filter is difficult to realize because of the memory limitations of digital computers. To implement FO-PID, approximations of the fractional integrator and differentiator are required. The short memory principle (SMP) is one of the effective approximation methods. However, it has the disadvantage that the filter approximated with SMP cannot eliminate the steady-state error. To address this problem, we introduce a distributed implementation of the integrator and a dynamic quantizer to make efficient use of the permissible memory. The objective of this study is to clarify how to implement an accurate FO-PID with limited memory. In this paper, we propose an implementation method for FO-PID under a memory constraint using a dynamic quantizer. The trade-off between the approximation of the fractional elements and the quantized data size is examined so that the responses approach those of the ideal FO-PID. The effectiveness of the proposed method is evaluated by a numerical example and an experiment on the temperature control of a heat plate.
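The short memory principle mentioned in this abstract can be sketched with the Grünwald-Letnikov approximation of a fractional operator, truncated to the most recent samples. This is a generic illustration and not the authors' filter design: the function names, the `memory` parameter and the sample layout (newest sample last) are assumptions, and the dynamic quantizer is not modelled.

```python
def gl_weights(alpha, n):
    """First n Grünwald-Letnikov weights w_k = (-1)^k * binom(alpha, k)."""
    w = [1.0]
    for k in range(1, n):
        # Standard recursion: w_k = w_{k-1} * (1 - (alpha + 1) / k)
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w


def gl_fractional_operator(samples, alpha, h, memory=None):
    """Approximate D^alpha at the newest sample (alpha < 0 integrates).

    `samples` are equally spaced with step h, newest last. If `memory` is
    given, only the last `memory` samples are used (short memory principle).
    """
    n = len(samples) if memory is None else min(memory, len(samples))
    w = gl_weights(alpha, n)
    acc = sum(w[k] * samples[-1 - k] for k in range(n))
    return acc / h ** alpha
```

For `alpha = -1` every weight equals one, so the operator reduces to a running rectangular sum; truncating it to a fixed window discards the older history, which is consistent with the abstract's remark that an SMP-approximated filter cannot remove the steady-state error.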
Application of the MacCormack scheme to overland flow routing for high-spatial resolution distributed hydrological model

NASA Astrophysics Data System (ADS)

Zhang, Ling; Nan, Zhuotong; Liang, Xu; Xu, Yi; Hernández, Felipe; Li, Lianxia

2018-03-01

Although process-based distributed hydrological models (PDHMs) have evolved rapidly over the last few decades, their extensive application is still challenged by computational expense. This study attempted, for the first time, to apply the numerically efficient MacCormack algorithm to overland flow routing in a representative high-spatial resolution PDHM, i.e., the distributed hydrology-soil-vegetation model (DHSVM), in order to improve its computational efficiency. The analytical verification indicates that both the semi and full versions of the MacCormack schemes exhibit robust numerical stability and are more computationally efficient than the conventional explicit linear scheme. The full version outperforms the semi version in terms of simulation accuracy when the same time step is adopted. The semi-MacCormack scheme was implemented in DHSVM (version 3.1.2) to solve the kinematic wave equations for overland flow routing. The performance and practicality of the enhanced DHSVM-MacCormack model were assessed by performing two groups of modeling experiments in the Mercer Creek watershed, a small urban catchment near Bellevue, Washington. The experiments show that DHSVM-MacCormack can considerably improve the computational efficiency without compromising the simulation accuracy of the original DHSVM model. More specifically, with the same computational environment and model settings, the computational time required by DHSVM-MacCormack can be reduced to several dozen minutes for a simulation period of three months (in contrast with one day and a half by the original DHSVM model) without noticeable sacrifice of the accuracy. The MacCormack scheme proves to be applicable to overland flow routing in DHSVM, which implies that it can be coupled into other PDHMs for watershed routing to either significantly improve their computational efficiency or to make the kinematic wave routing for high resolution modeling computationally feasible.
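For readers unfamiliar with the scheme, the following is a minimal sketch of the standard two-step (predictor-corrector) MacCormack update for a 1-D conservation law of kinematic-wave form, dh/dt + dq/dx = source with q = flux(h). It is not the DHSVM implementation and does not claim to match the paper's "semi" or "full" variants; the function names, the fixed upstream boundary and the flux parameters `alpha` and `m` are illustrative assumptions.

```python
def maccormack_step(h, flux, dt, dx, source=0.0):
    """Advance dh/dt + dq/dx = source by one time step, with q = flux(h).

    Assumes at least two nodes and waves travelling towards increasing
    index, so h[0] is held fixed as the upstream boundary value.
    """
    n = len(h)
    c = dt / dx
    q = [flux(v) for v in h]

    # Predictor: forward difference (one-sided backward at the last node).
    hp = [0.0] * n
    for i in range(n - 1):
        hp[i] = h[i] - c * (q[i + 1] - q[i]) + dt * source
    hp[n - 1] = h[n - 1] - c * (q[n - 1] - q[n - 2]) + dt * source

    # Corrector: backward difference on the predicted state, then average.
    qp = [flux(v) for v in hp]
    hn = [h[0]] * n
    for i in range(1, n):
        hn[i] = 0.5 * (h[i] + hp[i] - c * (qp[i] - qp[i - 1]) + dt * source)
    return hn


def kinematic_flux(depth, alpha=1.0, m=5.0 / 3.0):
    """Kinematic-wave flux q = alpha * depth**m (negative depths clipped)."""
    return alpha * max(depth, 0.0) ** m
```

With a linear flux and a Courant number of one, a single step shifts the profile by exactly one cell, which makes a convenient sanity check; for the non-linear kinematic flux the time step would normally be limited by a stability condition on the wave celerity.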
Rapid execution of fan beam image reconstruction algorithms using efficient computational techniques and special-purpose processors

NASA Astrophysics Data System (ADS)

Gilbert, B. K.; Robb, R. A.; Chu, A.; Kenue, S. K.; Lent, A. H.; Swartzlander, E. E., Jr.

1981-02-01

Rapid advances during the past ten years of several forms of computer-assisted tomography (CT) have resulted in the development of numerous algorithms to convert raw projection data into cross-sectional images. These reconstruction algorithms are either 'iterative,' in which a large matrix algebraic equation is solved by successive approximation techniques; or 'closed form'. Continuing evolution of the closed form algorithms has allowed the newest versions to produce excellent reconstructed images in most applications. This paper will review several computer software and special-purpose digital hardware implementations of closed form algorithms, either proposed during the past several years by a number of workers or actually implemented in commercial or research CT scanners. The discussion will also cover a number of recently investigated algorithmic modifications which reduce the amount of computation required to execute the reconstruction process, as well as several new special-purpose digital hardware implementations under development in laboratories at the Mayo Clinic.
Full Parallel Implementation of an All-Electron Four-Component Dirac-Kohn-Sham Program.

PubMed

Rampino, Sergio; Belpassi, Leonardo; Tarantelli, Francesco; Storchi, Loriano

2014-09-09

A full distributed-memory implementation of the Dirac-Kohn-Sham (DKS) module of the program BERTHA (Belpassi et al., Phys. Chem. Chem. Phys. 2011, 13, 12368-12394) is presented, where the self-consistent field (SCF) procedure is replicated on all the parallel processes, each process working on subsets of the global matrices. The key feature of the implementation is an efficient procedure for switching between two matrix distribution schemes, one (integral-driven) optimal for the parallel computation of the matrix elements and another (block-cyclic) optimal for the parallel linear algebra operations. This approach, making both CPU-time and memory scalable with the number of processors used, virtually overcomes at once both time and memory barriers associated with DKS calculations. Performance, portability, and numerical stability of the code are illustrated on the basis of test calculations on three gold clusters of increasing size, an organometallic compound, and a perovskite model. The calculations are performed on a Beowulf and a BlueGene/Q system.
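The block-cyclic scheme mentioned above can be illustrated with the usual one-dimensional index mapping. This is a generic sketch of the distribution pattern, not code from BERTHA; the function name and the zero-based, zero-offset convention are assumptions. A two-dimensional block-cyclic layout would apply the same mapping independently to rows and columns on a process grid.

```python
def block_cyclic_owner(g, block_size, num_procs):
    """Map a 0-based global index to (owning process, local index)."""
    block = g // block_size             # which block the index falls in
    proc = block % num_procs            # blocks are dealt out round-robin
    local_block = block // num_procs    # earlier blocks held by that process
    return proc, local_block * block_size + g % block_size
```

Dealing blocks out cyclically spreads rows and columns from every part of the matrix across all processes, which is why this layout is commonly preferred for load-balanced dense linear algebra, whereas a layout driven by integral evaluation can follow the structure of the basis functions instead.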
New insight in spiral drawing analysis methods - Application to action tremor quantification.

PubMed

Legrand, André Pierre; Rivals, Isabelle; Richard, Aliénor; Apartis, Emmanuelle; Roze, Emmanuel; Vidailhet, Marie; Meunier, Sabine; Hainque, Elodie

2017-10-01

Spiral drawing is one of the standard tests used to assess tremor severity for the clinical evaluation of medical treatments. Tremor severity is estimated through visual rating of the drawings by movement disorders experts. Different approaches based on the mathematical signal analysis of the recorded spiral drawings were proposed to replace this rater dependent estimate. The objective of the present study is to propose new numerical methods and to evaluate them in terms of agreement with visual rating and reproducibility. Series of spiral drawings of patients with essential tremor were visually rated by a board of experts. In addition to the usual velocity analysis, three new numerical methods were tested and compared, namely static and dynamic unraveling, and empirical mode decomposition. The reproducibility of both visual and numerical ratings was estimated, and their agreement was evaluated. The statistical analysis demonstrated excellent agreement between visual and numerical ratings, and more reproducible results with numerical methods than with visual ratings. The velocity method and the new numerical methods are in good agreement. Among the latter, static and dynamic unravelling both display a smaller dispersion and are easier for automatic analysis. The reliable scores obtained through the proposed numerical methods allow considering that their implementation on a digitized tablet, be it connected with a computer or independent, provides an efficient automatic tool for tremor severity assessment. Copyright © 2017 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
Using SPARK as a Solver for Modelica

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wetter, Michael; Haves, Philip

Modelica is an object-oriented acausal modeling language that is well positioned to become a de-facto standard for expressing models of complex physical systems. To simulate a model expressed in Modelica, it needs to be translated into executable code. For generating run-time efficient code, such a translation needs to employ algebraic formula manipulations. As the SPARK solver has been shown to be competitive for generating such code but currently cannot be used with the Modelica language, we report in this paper how SPARK's symbolic and numerical algorithms can be implemented in OpenModelica, an open-source implementation of a Modelica modeling and simulation environment. We also report benchmark results that show that for our air flow network simulation benchmark, the SPARK solver is competitive with Dymola, which is believed to provide the best solver for Modelica.
Bistatic passive radar simulator with spatial filtering subsystem

NASA Astrophysics Data System (ADS)

Hossa, Robert; Szlachetko, Boguslaw; Lewandowski, Andrzej; Górski, Maksymilian

2009-06-01

The purpose of this paper is to briefly introduce the structure and features of the developed virtual passive FM radar implemented in the Matlab numerical computing environment and to present many alternative modes of its operation. The idea of the proposed solution is based on an analytic representation of the transmitted direct signals and reflected echo signals. As a spatial filtering subsystem, a beamforming network with ULA and UCA dipole configurations dedicated to the bistatic radar concept is considered, and computationally efficient procedures are presented in detail. Finally, exemplary results of computer simulations with the elaborated virtual simulator are provided and discussed.
A computationally efficient scheme for the non-linear diffusion equation

NASA Astrophysics Data System (ADS)

Termonia, P.; Van de Vyver, H.

2009-04-01

This Letter proposes a new numerical scheme for integrating the non-linear diffusion equation and shows that it is linearly stable. Tests comparing this scheme to a popular decentered version of the linearized Crank-Nicolson scheme show that, although the new scheme is slightly less accurate in treating the highly resolved waves, it (i) better treats highly non-linear systems, (ii) better handles the short waves, (iii) for a given test bed turns out to be three to four times cheaper computationally, and (iv) is easier to implement.
A gradient based algorithm to solve inverse plane bimodular problems of identification

NASA Astrophysics Data System (ADS)

Ran, Chunjiang; Yang, Haitian; Zhang, Guoqing

2018-02-01

This paper presents a gradient based algorithm to solve inverse plane bimodular problems of identifying constitutive parameters, including tensile/compressive moduli and tensile/compressive Poisson's ratios. For the forward bimodular problem, a FE tangent stiffness matrix is derived, facilitating the implementation of gradient based algorithms. For the inverse bimodular problem of identification, a two-level sensitivity analysis based strategy is proposed. Numerical verification in terms of accuracy and efficiency is provided, and the impacts of initial guess, number of measurement points, regional inhomogeneity, and noisy data on the identification are taken into account.
2D/3D Synthetic Vision Navigation Display

NASA Technical Reports Server (NTRS)

Prinzel, Lawrence J., III; Kramer, Lynda J.; Arthur, J. J., III; Bailey, Randall E.; Sweeters, Jason L.

2008-01-01

Flight-deck display software was designed and developed at NASA Langley Research Center to provide two-dimensional (2D) and three-dimensional (3D) terrain, obstacle, and flight-path perspectives on a single navigation display. The objective was to optimize the presentation of synthetic vision (SV) system technology that permits pilots to view multiple perspectives of flight-deck display symbology and 3D terrain information. Research was conducted to evaluate the efficacy of the concept. The concept has numerous unique implementation features that would permit enhanced operational concepts and efficiencies in both current and future aircraft.
Calculation of transmission probability by solving an eigenvalue problem

NASA Astrophysics Data System (ADS)

Bubin, Sergiy; Varga, Kálmán

2010-11-01

The electron transmission probability in nanodevices is calculated by solving an eigenvalue problem. The eigenvalues are the transmission probabilities and the number of nonzero eigenvalues is equal to the number of open quantum transmission eigenchannels. The number of open eigenchannels is typically a few dozen at most, thus the computational cost amounts to the calculation of a few outer eigenvalues of a complex Hermitian matrix (the transmission matrix). The method is implemented on a real space grid basis providing an alternative to localized atomic orbital based quantum transport calculations. Numerical examples are presented to illustrate the efficiency of the method.
Generating higher-order quantum dissipation from lower-order parametric processes

NASA Astrophysics Data System (ADS)

Mundhada, S. O.; Grimm, A.; Touzard, S.; Vool, U.; Shankar, S.; Devoret, M. H.; Mirrahimi, M.

2017-06-01

The stabilisation of quantum manifolds is at the heart of error-protected quantum information storage and manipulation. Nonlinear driven-dissipative processes achieve such stabilisation in a hardware efficient manner. Josephson circuits with parametric pump drives implement these nonlinear interactions. In this article, we propose a scheme to engineer a four-photon drive and dissipation on a harmonic oscillator by cascading experimentally demonstrated two-photon processes. This would stabilise a four-dimensional degenerate manifold in a superconducting resonator. We analyse the performance of the scheme using numerical simulations of a realisable system with experimentally achievable parameters.
Quasi interpolation with Voronoi splines.

PubMed

Mirzargar, Mahsa; Entezari, Alireza

2011-12-01

We present a quasi interpolation framework that attains the optimal approximation-order of Voronoi splines for reconstruction of volumetric data sampled on general lattices. The quasi interpolation framework of Voronoi splines provides an unbiased reconstruction method across various lattices. Therefore this framework allows us to analyze and contrast the sampling-theoretic performance of general lattices, using signal reconstruction, in an unbiased manner. Our quasi interpolation methodology is implemented as an efficient FIR filter that can be applied online or as a preprocessing step. We present visual and numerical experiments that demonstrate the improved accuracy of reconstruction across lattices, using the quasi interpolation framework. © 2011 IEEE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yanai, Takeshi; Fann, George I.; Beylkin, Gregory

Using the fully numerical method for time-dependent Hartree–Fock and density functional theory (TD-HF/DFT) with the Tamm–Dancoff (TD) approximation we use a multiresolution analysis (MRA) approach to present our findings. From a reformulation with effective use of the density matrix operator, we obtain a general form of the HF/DFT linear response equation in the first quantization formalism. It can be readily rewritten as an integral equation with the bound-state Helmholtz (BSH) kernel for the Green's function. The MRA implementation of the resultant equation permits excited state calculations without virtual orbitals. Moreover, the integral equation is efficiently and adaptively solved using a numerical multiresolution solver with multiwavelet bases. Our implementation of the TD-HF/DFT methods is applied for calculating the excitation energies of H2, Be, N2, H2O, and C2H4 molecules. The numerical errors of the calculated excitation energies converge in proportion to the residuals of the equation in the molecular orbitals and response functions. The energies of the excited states at a variety of length scales ranging from short-range valence excitations to long-range Rydberg-type ones are consistently accurate. It is shown that the multiresolution calculations yield the correct exponential asymptotic tails for the response functions, whereas those computed with Gaussian basis functions are too diffuse or decay too rapidly. Finally, we introduce a simple asymptotic correction to the local spin-density approximation (LSDA) so that in the TDDFT calculations, the excited states are correctly bound.
Efficient integration method for fictitious domain approaches

NASA Astrophysics Data System (ADS)

Duczek, Sascha; Gabbert, Ulrich

2015-10-01

In the current article, we present an efficient and accurate numerical method for the integration of the system matrices in fictitious domain approaches such as the finite cell method (FCM). In the framework of the FCM, the physical domain is embedded in a geometrically larger domain of simple shape which is discretized using a regular Cartesian grid of cells. Therefore, a spacetree-based adaptive quadrature technique is normally deployed to resolve the geometry of the structure. Depending on the complexity of the structure under investigation, this method accounts for most of the computational effort. To reduce the computational cost of computing the system matrices, an efficient quadrature scheme based on the divergence theorem (Gauß-Ostrogradsky theorem) is proposed. Using this theorem, the dimension of the integral is reduced by one, i.e. instead of evaluating the integral over the whole domain, only its contour needs to be considered. In the current paper, we present the general principles of the integration method and its implementation. The results for several two-dimensional benchmark problems highlight its properties. The efficiency of the proposed method is compared to conventional spacetree-based integration techniques.
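The dimension reduction described above can be illustrated with a minimal sketch, which is not the authors' quadrature scheme. It assumes a simple polygon and the constant integrand 1, so the domain integral is just the area. Applying the divergence theorem to the field F = (x/2, y/2), whose divergence is 1, turns the area integral into a contour integral that is evaluated edge by edge. The function name `polygon_area_via_contour` is hypothetical.

```python
def polygon_area_via_contour(vertices):
    """Area of a simple polygon computed from its boundary only.

    The divergence theorem applied to F = (x/2, y/2), with div F = 1,
    gives: area = (1/2) * closed contour integral of (x dy - y dx).
    On a straight edge from (x0, y0) to (x1, y1) that integral equals
    x0*y1 - x1*y0, so no interior quadrature points are needed.
    Vertices must be listed counter-clockwise for a positive result.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the contour
        total += x0 * y1 - x1 * y0
    return 0.5 * total


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
right_triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
print(polygon_area_via_contour(unit_square))
print(polygon_area_via_contour(right_triangle))
```

For integrands other than a constant, the same idea requires an antiderivative of the integrand along one coordinate direction, which is where most of the work in a practical scheme lies.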
Dynamical Origin of Highly Efficient Energy Dissipation in Soft Magnetic Nanoparticles for Magnetic Hyperthermia Applications

NASA Astrophysics Data System (ADS)

Kim, Min-Kwan; Sim, Jaegun; Lee, Jae-Hyeok; Kim, Miyoung; Kim, Sang-Koog

2018-05-01

We explore robust magnetization-dynamic behaviors in soft magnetic nanoparticles in single-domain states and find their related high-efficiency energy-dissipation mechanism using finite-element micromagnetic simulations. We also make analytical derivations that provide deeper physical insights into the magnetization dynamics associated with Gilbert damping parameters under applications of time-varying rotating magnetic fields of different strengths and frequencies and static magnetic fields. Furthermore, we find that the mass-specific energy-dissipation rate at resonance in the steady-state regime changes remarkably with the strength of rotating fields and static fields for given damping constants. The associated magnetization dynamics are well interpreted with the help of the numerical calculation of analytically derived explicit forms. The high-efficiency energy-loss power can be obtained using soft magnetic nanoparticles in the single-domain state by tuning the frequency of rotating fields to the resonance frequency; what is more, it is controllable via the rotating and static field strengths for a given intrinsic damping constant. We provide a better and more efficient means of achieving specific loss power that can be implemented in magnetic hyperthermia applications.
Spitzer observatory operations: increasing efficiency in mission operations

NASA Astrophysics Data System (ADS)

Scott, Charles P.; Kahr, Bolinda E.; Sarrel, Marc A.

2006-06-01

This paper explores the hows and whys of the Spitzer Mission Operations System's (MOS) success, efficiency, and affordability in comparison to other observatory-class missions. MOS exploits today's flight, ground, and operations capabilities, embraces automation, and balances both risk and cost. With operational efficiency as the primary goal, MOS maintains a strong control process by translating lessons learned into efficiency improvements, thereby enabling the MOS processes, teams, and procedures to rapidly evolve from concept (through thorough validation) into in-flight implementation. Operational teaming, planning, and execution are designed to enable re-use. Mission changes, unforeseen events, and continuous improvement have oftentimes forced us to learn to fly anew. Collaborative spacecraft operations and remote science and instrument teams have become well integrated, and worked together to improve and optimize each human, machine, and software-system element. Adaptation to tighter spacecraft margins has facilitated continuous operational improvements via automated and autonomous software coupled with improved human analysis. Based upon what we now know and what we need to improve, adapt, or fix, the projected mission lifetime continues to grow, as does the opportunity for numerous scientific discoveries.
The analysis of composite laminated beams using a 2D interpolating meshless technique

NASA Astrophysics Data System (ADS)

Sadek, S. H. M.; Belinha, J.; Parente, M. P. L.; Natal Jorge, R. M.; de Sá, J. M. A. César; Ferreira, A. J. M.

2018-02-01

Laminated composite materials are widely used in engineering construction. Owing to their relatively light weight, these materials are suitable for aerospace, military, marine, and automotive structural applications. To obtain safe and economical structures, the accuracy of the modelling analysis is highly relevant. Since meshless methods have achieved remarkable progress in computational mechanics in recent years, the present work uses one of the most flexible and stable interpolation meshless techniques available in the literature, the Radial Point Interpolation Method (RPIM). Here, a 2D approach is considered to numerically analyse composite laminated beams. Both the meshless formulation and the equilibrium equations governing the studied physical phenomenon are presented in detail. Several benchmark beam examples are studied, and the results are compared with exact solutions available in the literature and with results obtained from commercial finite element software. The results show the efficiency and accuracy of the proposed numerical technique.
Barotropic Tidal Predictions and Validation in a Relocatable Modeling Environment. Revised

NASA Technical Reports Server (NTRS)

Mehra, Avichal; Passi, Ranjit; Kantha, Lakshmi; Payne, Steven; Brahmachari, Shuvobroto

1998-01-01

Under funding from the Office of Naval Research (ONR), and the Naval Oceanographic Office (NAVOCEANO), the Mississippi State University Center for Air Sea Technology (CAST) has been working on developing a Relocatable Modeling Environment (RME) to provide a uniform and unbiased infrastructure for efficiently configuring numerical models in any geographic/oceanic region. Under Naval Oceanographic Office (NAVOCEANO) funding, the model was implemented and tested for NAVOCEANO use. With our current emphasis on ocean tidal modeling, CAST has adopted the Colorado University's numerical ocean model, known as CURReNTSS (Colorado University Rapidly Relocatable Nestable Storm Surge) Model, as the model of choice. During the RME development process, CURReNTSS has been relocated to several coastal oceanic regions, providing excellent results that demonstrate its veracity. This report documents the model validation results and provides a brief description of the Graphical User Interface (GUI).
Embedded-cluster calculations in a numeric atomic orbital density-functional theory framework.

PubMed

Berger, Daniel; Logsdail, Andrew J; Oberhofer, Harald; Farrow, Matthew R; Catlow, C Richard A; Sherwood, Paul; Sokol, Alexey A; Blum, Volker; Reuter, Karsten

2014-07-14

We integrate the all-electron electronic structure code FHI-aims into the general ChemShell package for solid-state embedding quantum and molecular mechanical (QM/MM) calculations. A major undertaking in this integration is the implementation of pseudopotential functionality into FHI-aims to describe cations at the QM/MM boundary through effective core potentials and therewith prevent spurious overpolarization of the electronic density. Based on numeric atomic orbital basis sets, FHI-aims offers particularly efficient access to exact exchange and second order perturbation theory, rendering the established QM/MM setup an ideal tool for hybrid and double-hybrid level density functional theory calculations of solid systems. We illustrate this capability by calculating the reduction potential of Fe in the Fe-substituted ZSM-5 zeolitic framework and the reaction energy profile for (photo-)catalytic water oxidation at TiO2(110).
Embedded-cluster calculations in a numeric atomic orbital density-functional theory framework

DOE Office of Scientific and Technical Information (OSTI.GOV)

Berger, Daniel, E-mail: daniel.berger@ch.tum.de; Oberhofer, Harald; Reuter, Karsten

2014-07-14

We integrate the all-electron electronic structure code FHI-aims into the general ChemShell package for solid-state embedding quantum and molecular mechanical (QM/MM) calculations. A major undertaking in this integration is the implementation of pseudopotential functionality into FHI-aims to describe cations at the QM/MM boundary through effective core potentials and therewith prevent spurious overpolarization of the electronic density. Based on numeric atomic orbital basis sets, FHI-aims offers particularly efficient access to exact exchange and second order perturbation theory, rendering the established QM/MM setup an ideal tool for hybrid and double-hybrid level density functional theory calculations of solid systems. We illustrate this capability by calculating the reduction potential of Fe in the Fe-substituted ZSM-5 zeolitic framework and the reaction energy profile for (photo-)catalytic water oxidation at TiO2(110).
Simulation of two-dimensional turbulent flows in a rotating annulus

NASA Astrophysics Data System (ADS)

Storey, Brian D.

2004-05-01

Rotating water tank experiments have been used to study fundamental processes of atmospheric and geophysical turbulence in a controlled laboratory setting. When these tanks are undergoing strong rotation the forced turbulent flow becomes highly two dimensional along the axis of rotation. An efficient numerical method has been developed for simulating the forced quasi-geostrophic equations in an annular geometry to model current laboratory experiments. The algorithm employs a spectral method with Fourier series and Chebyshev polynomials as basis functions. The algorithm has been implemented on a parallel architecture to allow modelling of a wide range of spatial scales over long integration times. This paper describes the derivation of the model equations, numerical method, testing and performance of the algorithm. Results provide reasonable agreement with the experimental data, indicating that such computations can be used as a predictive tool to design future experiments.
Artificial boundary conditions for certain evolution PDEs with cubic nonlinearity for non-compactly supported initial data

NASA Astrophysics Data System (ADS)

Vaibhav, V.

2011-04-01

The paper addresses the problem of constructing non-reflecting boundary conditions for two types of one dimensional evolution equations, namely, the cubic nonlinear Schrödinger (NLS) equation, $\partial_t u + L u - i\chi |u|^2 u = 0$ with $L \equiv -i\partial_x^2$, and the equation obtained by letting $L \equiv \partial_x^3$. The usual restriction of compact support of the initial data is relaxed by allowing it to have a constant amplitude along with a linear phase variation outside a compact domain. We adapt the pseudo-differential approach developed by Antoine et al. (2006) [5] for the NLS equation to the second type of evolution equation, and further, extend the scheme to the aforementioned class of initial data for both of the equations. In addition, we discuss efficient numerical implementation of our scheme and produce the results of several numerical experiments demonstrating its effectiveness.
Enhancement of a dynamic porous model considering compression-release hysteresis behavior: application to graphite

NASA Astrophysics Data System (ADS)

Jodar, B.; Seisson, G.; Hébert, D.; Bertron, I.; Boustie, M.; Berthe, L.

2016-08-01

Because of their shock wave attenuation properties, porous materials and foams are increasingly used for various applications such as graphite in the aerospace industry and polyurethane (PU) foams in biomedical engineering. For these two materials, the absence of residual compaction after compression and release cycles limits the efficiency of the usual numerical dynamic porous models such as P-α and POREQST. In this paper, we suggest a simple enhancement of the latter in order to take into account the compression-release hysteresis behavior experimentally observed for the considered materials. The new model, named H-POREQST, was implemented into a Lagrangian hydrocode and tested for simulating plate impact experiments at moderate pressure onto a commercial grade of porous graphite (EDM3). It proved to be in far better agreement with experimental data than the original model which encourages us to pursue numerical tests and developments.
Efficiency of jet grout columns and sand-recycled material mixtures for mitigating liquefaction damage

NASA Astrophysics Data System (ADS)

Kerem Ertek, M.; Demir, Gökhan; Köktan, Utku

2017-04-01

Liquefaction is an important seismic phenomenon that has to be assessed, and measures must be taken to reduce the related hazards. There are several ways to assess liquefaction potential analytically, and some constitutive models implemented in FEM software represent the cyclic behaviour of sand, making it possible to observe the shear strain or excess pore pressure ratio, both of which are indicators of liquefaction occurrence. According to various studies in the literature, post-earthquake inspections show that measures such as grouting, piled rafts and sand mixtures with different non-liquefiable materials reduce liquefaction related damage. This paper aims to provide brief information about the effectiveness of jet-grout columns and recycled material-sand mixtures against liquefaction with the help of numerical analyses performed with MIDAS GTS NX software with regard to the generation of shear strains. Key words: liquefaction, numerical analyses, jet-grout, sand mixtures
Numerical Implementation of the Cohesive Soil Bounding Surface Plasticity Model. Volume I.

DTIC Science & Technology

1983-02-01

AD-R24 866 NUMERICAL IMPLEMENTATION OF THE COHESIVE SOIL BOUNDING 1/2 SURFACE PLASTICITY ..(U) CALIFORNIA UNIV DAVIS DEPT OF CIVIL ENGINEERING L R...a study of various numerical means for implementing the bounding surface plasticity model for cohesive soils is presented. A comparison is made of... Plasticity Models 17 3.4 Selection Of Methods For Comparison 17 3.5 Theory 20 3.5.1 Solution Methods 20 3.5.2 Reduction Of The Number Of Equation
A manpower scheduling heuristic for aircraft maintenance application

NASA Astrophysics Data System (ADS)

Sze, San-Nah; Sze, Jeeu-Fong; Chiew, Kang-Leng

2012-09-01

This research studies manpower scheduling for aircraft maintenance, focusing on the in-flight food loading operation. A group of loading teams with flexible shifts is required to deliver and upload packaged meals from the ground kitchen to aircraft in multiple trips. All aircraft must be served within predefined time windows. The scheduling process takes into account various constraints such as meal break allocation, multi-trip traveling and the food exposure time limit. Considering the aircraft movements and the predefined maximum working hours for each loading team, the main objective of this study is to form an efficient roster by assigning a minimum number of loading teams to the aircraft. We propose an insertion based heuristic to generate solutions in a short period of time for large instances. The proposed algorithm constructs trips in several stages because of the numerous constraints. The robustness and efficiency of the algorithm are demonstrated in computational results. The results show that the insertion heuristic outperforms the company's current practice.
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation

NASA Astrophysics Data System (ADS)

Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.

2011-03-01

A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult to treat with our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian based domain decomposition method leads to poor parallel efficiency because of an increase in the number of power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
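As a rough illustration of the kind of iteration mentioned above, the following sketch finds the largest k satisfying F φ = k A φ for a tiny 2x2 problem. It is a generic textbook-style iteration, not the authors' solver: the matrices `a` (loss operator) and `f` (production operator), the function names, and the source-ratio update of k are all assumptions made for illustration.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det]


def matvec(m, v):
    """Product of a 2x2 matrix with a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def dominant_k(a, f, max_iterations=200, tol=1e-12):
    """Iterate A phi_new = F phi / k and update k from the source ratio."""
    phi = [1.0, 1.0]
    k = 1.0
    for _ in range(max_iterations):
        source = matvec(f, phi)
        # One linear solve per iteration: this is the expensive step
        # that a parallel solver would distribute over subdomains.
        new_phi = solve_2x2(a, [s / k for s in source])
        new_source = matvec(f, new_phi)
        k_new = k * sum(new_source) / sum(source)
        converged = abs(k_new - k) < tol
        phi, k = new_phi, k_new
        if converged:
            break
    return k


# Hypothetical operators chosen only so the answer can be checked by hand.
a = [[2.0, -1.0], [-1.0, 2.0]]
f = [[1.0, 0.0], [0.0, 0.5]]
print(dominant_k(a, f))
```

For these illustrative matrices the dominant eigenvalue of A⁻¹F is (1 + 1/√3)/2, which gives a closed-form value to compare against.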
Exploring Neutrino Oscillation Parameter Space with a Monte Carlo Algorithm

NASA Astrophysics Data System (ADS)

Espejel, Hugo; Ernst, David; Cogswell, Bernadette; Latimer, David

2015-04-01

The $\chi^2$ (or likelihood) function for a global analysis of neutrino oscillation data is first calculated as a function of the neutrino mixing parameters. A computational challenge is to obtain the minima or the allowed regions for the mixing parameters. The conventional approach is to calculate the $\chi^2$ (or likelihood) function on a grid for a large number of points, and then marginalize over the likelihood function. As the number of parameters increases with the number of neutrinos, making the calculation numerically efficient becomes necessary. We implement a new Monte Carlo algorithm (D. Foreman-Mackey, D. W. Hogg, D. Lang and J. Goodman, Publications of the Astronomical Society of the Pacific, 125 306 (2013)) to determine its computational efficiency at finding the minima and allowed regions. We examine a realistic example to compare the historical and the new methods.
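To make the contrast with a grid scan concrete, here is a minimal sketch of sampling a likelihood proportional to exp(-χ²/2) with a plain random-walk Metropolis sampler. This is a deliberately simpler technique than the algorithm in the cited reference, and the one-parameter χ² function, its best-fit value 0.3, its width 0.05, and all function names are hypothetical.

```python
import math
import random


def toy_chi2(theta, best_fit=0.3, sigma=0.05):
    """Hypothetical one-parameter chi-square with a single minimum."""
    return ((theta - best_fit) / sigma) ** 2


def metropolis(chi2_fn, start, step, n_steps, rng):
    """Random-walk Metropolis sampling of exp(-chi2 / 2)."""
    samples = []
    theta = start
    current = chi2_fn(theta)
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        candidate = chi2_fn(proposal)
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-(delta chi2) / 2).
        if candidate <= current or rng.random() < math.exp(-0.5 * (candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


rng = random.Random(0)
chain = metropolis(toy_chi2, start=0.0, step=0.05, n_steps=20000, rng=rng)
kept = chain[1000:]  # discard the initial burn-in segment
estimate = sum(kept) / len(kept)
print(estimate)
```

The chain spends its time where χ² is small, so the number of function evaluations does not grow with a grid resolution in every parameter; allowed regions can then be read off from the distribution of the retained samples.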
TRIM—3D: a three-dimensional model for accurate simulation of shallow water flow

USGS Publications Warehouse

Casulli, Vincenzo; Bertolazzi, Enrico; Cheng, Ralph T.

1993-01-01

A semi-implicit finite difference formulation for the numerical solution of three-dimensional tidal circulation is discussed. The governing equations are the three-dimensional Reynolds equations in which the pressure is assumed to be hydrostatic. A minimal degree of implicitness has been introduced in the finite difference formula so that the resulting algorithm permits the use of large time steps at a minimal computational cost. This formulation includes the simulation of flooding and drying of tidal flats, and is fully vectorizable for an efficient implementation on modern vector computers. The high computational efficiency of this method has made it possible to provide the fine details of circulation structure in complex regions that previous studies were unable to obtain. For proper interpretation of the model results suitable interactive graphics is also an essential tool.
Fuzzy control for nonlinear structure with semi-active friction damper

NASA Astrophysics Data System (ADS)

Zhao, Da-Hai; Li, Hong-Nan

2007-04-01

The implementation of a semi-active friction damper for vibration mitigation of a seismic structure generally requires an efficient control strategy. In this paper, fuzzy logic based on the Takagi-Sugeno model is proposed for controlling a semi-active friction damper that is installed on a nonlinear building subjected to strong earthquakes. The continuous Bouc-Wen hysteretic model for the stiffness is used to describe the nonlinear characteristic of the building. The optimal sliding force of the friction damper is determined by nonlinear time history analysis under normal earthquakes. The Takagi-Sugeno fuzzy logic model is employed to adjust the clamping force acting on the friction damper according to the semi-active control strategy. Numerical simulation results demonstrate that the proposed method is very efficient in reducing the peak inter-story drift and acceleration of the nonlinear building structure under earthquake excitations.
Behaviors of susceptible-infected epidemics on scale-free networks with identical infectivity

NASA Astrophysics Data System (ADS)

Zhou, Tao; Liu, Jian-Guo; Bai, Wen-Jie; Chen, Guanrong; Wang, Bing-Hong

2006-11-01

In this paper, we propose a susceptible-infected model with identical infectivity, in which, at every time step, each node can only contact a constant number of neighbors. We implemented this model on scale-free networks, and found that the infected population grows in an exponential form with the time scale proportional to the spreading rate. Furthermore, by numerical simulation, we demonstrated that the targeted immunization of the present model is much less efficient than that of the standard susceptible-infected model. Finally, we investigate a fast spreading strategy when only local information is available. Different from the extensively studied path-finding strategy, the strategy preferring small-degree nodes is more efficient than that preferring large-degree nodes. Our results indicate the existence of an essential relationship between network traffic and network epidemic on scale-free networks.
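The contact rule described above can be sketched as a small simulation. This is an illustrative reading of the abstract rather than the authors' code: the preferential-attachment generator, the parameter names `contacts` and `spread_prob`, and the choice that each infected node samples its contacts uniformly from its neighbors are assumptions.

```python
import random


def barabasi_albert(n, m, rng):
    """Grow a scale-free graph by preferential attachment (adjacency lists)."""
    adjacency = {i: [] for i in range(n)}
    pool = []  # each node appears once per incident edge
    for i in range(m + 1):  # fully connected core of m + 1 nodes
        for j in range(i):
            adjacency[i].append(j)
            adjacency[j].append(i)
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:  # degree-proportional choice of m targets
            chosen.add(rng.choice(pool))
        for target in sorted(chosen):
            adjacency[new].append(target)
            adjacency[target].append(new)
            pool += [new, target]
    return adjacency


def si_identical_infectivity(adjacency, seed_node, contacts, spread_prob, steps, rng):
    """SI dynamics where every infected node contacts a fixed number of neighbors."""
    infected = {seed_node}
    history = [len(infected)]
    for _ in range(steps):
        newly_infected = set()
        for node in sorted(infected):
            neighbors = adjacency[node]
            # Identical infectivity: hubs contact no more neighbors than leaves.
            for nb in rng.sample(neighbors, min(contacts, len(neighbors))):
                if nb not in infected and rng.random() < spread_prob:
                    newly_infected.add(nb)
        infected |= newly_infected
        history.append(len(infected))
    return history


rng = random.Random(1)
graph = barabasi_albert(200, 2, rng)
history = si_identical_infectivity(graph, 0, contacts=1, spread_prob=0.5, steps=30, rng=rng)
print(history)
```

In the standard SI model an infected hub can reach all of its neighbors in one step; capping the number of contacts per step removes that advantage, which is the feature the abstract attributes to the model.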
Three-dimensional near-field MIMO array imaging using range migration techniques.

PubMed

Zhuge, Xiaodong; Yarovoy, Alexander G

2012-06-01

This paper presents a 3-D near-field imaging algorithm that is formulated for 2-D wideband multiple-input-multiple-output (MIMO) imaging array topology. The proposed MIMO range migration technique performs the image reconstruction procedure in the frequency-wavenumber domain. The algorithm is able to completely compensate the curvature of the wavefront in the near-field through a specifically defined interpolation process and provides extremely high computational efficiency by the application of the fast Fourier transform. The implementation aspects of the algorithm and the sampling criteria of a MIMO aperture are discussed. The image reconstruction performance and computational efficiency of the algorithm are demonstrated both with numerical simulations and measurements using 2-D MIMO arrays. Real-time 3-D near-field imaging can be achieved with a real-aperture array by applying the proposed MIMO range migration techniques.

DNS of Flow in a Low-Pressure Turbine Cascade Using a Discontinuous-Galerkin Spectral-Element Method

NASA Technical Reports Server (NTRS)

Garai, Anirban; Diosady, Laslo Tibor; Murman, Scott; Madavan, Nateri

2015-01-01

A new computational capability under development for accurate and efficient high-fidelity direct numerical simulation (DNS) and large eddy simulation (LES) of turbomachinery is described. This capability is based on an entropy-stable Discontinuous-Galerkin spectral-element approach that extends to arbitrarily high orders of spatial and temporal accuracy and is implemented in a computationally efficient manner on a modern high performance computer architecture. A validation study using this method to perform DNS of flow in a low-pressure turbine airfoil cascade is presented. Preliminary results indicate that the method captures the main features of the flow. Discrepancies between the predicted results and the experiments are likely due to the effects of freestream turbulence not being included in the simulation and will be addressed in the final paper.
Tokamak magneto-hydrodynamics and reference magnetic coordinates for simulations of plasma disruptions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zakharov, Leonid E.; Li, Xujing

This paper formulates the Tokamak Magneto-Hydrodynamics (TMHD), initially outlined by X. Li and L. E. Zakharov [Plasma Science and Technology 17(2), 97–104 (2015)] for proper simulations of macroscopic plasma dynamics. The simplest set of magneto-hydrodynamics equations, sufficient for disruption modeling and extendable to more refined physics, is explained in detail. First, the TMHD introduces to 3-D simulations the Reference Magnetic Coordinates (RMC), which are aligned with the magnetic field in the best possible way. The numerical implementation of RMC is adaptive grids. Being consistent with the high anisotropy of the tokamak plasma, RMC allow simulations at realistic, very high plasma electric conductivity. Second, the TMHD splits the equation of motion into an equilibrium equation and the plasma advancing equation. This resolves the four-decade-old problem of Courant limitations of the time step in existing, plasma inertia driven numerical codes. The splitting allows disruption simulations on a relatively slow time scale in comparison with the fast time of ideal MHD instabilities. A new, efficient numerical scheme is proposed for TMHD.
On a numerical method for solving integro-differential equations with variable coefficients with applications in finance

NASA Astrophysics Data System (ADS)

Kudryavtsev, O.; Rodochenko, V.

2018-03-01

We propose a new general numerical method for solving integro-differential equations with variable coefficients. The problem under consideration arises in finance in the context of pricing barrier options in a wide class of stochastic volatility models with jumps. To handle the effect of the correlation between the price and the variance, we use a suitable substitution for processes. Then we construct a Markov-chain approximation for the variation process on small time intervals and apply a maturity randomization technique. The result is a system of boundary problems for integro-differential equations with constant coefficients on the line in each vertex of the chain. We solve the arising problems using a numerical Wiener-Hopf factorization method. The approximate formulae for the factors are efficiently implemented by means of the Fast Fourier Transform. Finally, we use a recurrent procedure that moves backwards in time on the variance tree. We demonstrate the convergence of the method using Monte-Carlo simulations and compare our results with the results obtained by the Wiener-Hopf method with closed-form expressions of the factors.
Prediction and measurements of vibrations from a railway track lying on a peaty ground

NASA Astrophysics Data System (ADS)

Picoux, B.; Rotinat, R.; Regoin, J. P.; Le Houédec, D.

2003-10-01

This paper introduces a two-dimensional model for the response of the ground surface to vibrations generated by railway traffic. A semi-analytical wave propagation model is introduced which is subjected to a set of harmonic moving loads and based on a calculation method of the dynamic stiffness matrix of the ground. In order to model a complete railway system, the effect of a simple track model is taken into account, including rails, sleepers and ballast, especially designed for the study of low vibration frequencies. Priority has been given to a simple formulation based on the principle of spatial Fourier transforms, compatible with good numerical efficiency and yet providing quick solutions. In addition, in situ measurements for a soft soil near a railway track were carried out and will be used to validate the numerical implementation. The numerical and experimental results constitute a significant body of useful data to, on the one hand, characterize the response of the environment of tracks and, on the other hand, appreciate the importance of the speed and weight on the behaviour of the structure.
The novel high-performance 3-D MT inverse solver

NASA Astrophysics Data System (ADS)

Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey

2016-04-01

We present a novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in a multi-language paradigm to make it as efficient, readable and maintainable as possible. Separation of concerns and single responsibility concepts run through the implementation of the solver. As a forward modelling engine, the modern scalable solver extrEMe, based on the contracting integral equation approach, is used. An iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for the (regularized) inverse problem solution, and the adjoint source approach is used to calculate the gradient of the misfit efficiently. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize an inverse domain, the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms, ranging from modern laptops to HPC Piz Daint (6th supercomputer in the world), demonstrate practically linear scalability of the code up to thousands of nodes.
Area and power efficient DCT architecture for image compression

NASA Astrophysics Data System (ADS)

Dhandapani, Vaithiyanathan; Ramachandran, Seshasayanan

2014-12-01

The discrete cosine transform (DCT) is one of the major components in image and video compression systems. The final output of these systems is interpreted by the human visual system (HVS), which is not perfect. The limited perception of human visualization allows the algorithm to be numerically approximate rather than exact. In this paper, we propose a new matrix for discrete cosine transform. The proposed 8 × 8 transformation matrix contains only zeros and ones which requires only adders, thus avoiding the need for multiplication and shift operations. The new class of transform requires only 12 additions, which highly reduces the computational complexity and achieves a performance in image compression that is comparable to that of the existing approximated DCT. Another important aspect of the proposed transform is that it provides an efficient area and power optimization while implementing in hardware. To ensure the versatility of the proposal and to further evaluate the performance and correctness of the structure in terms of speed, area, and power consumption, the model is implemented on Xilinx Virtex 7 field programmable gate array (FPGA) device and synthesized with Cadence® RTL Compiler® using UMC 90 nm standard cell library. The analysis obtained from the implementation indicates that the proposed structure is superior to the existing approximation techniques with a 30% reduction in power and 12% reduction in area.
Object-oriented philosophy in designing adaptive finite-element package for 3D elliptic deferential equations

NASA Astrophysics Data System (ADS)

Zhengyong, R.; Jingtian, T.; Changsheng, L.; Xiao, X.

2007-12-01

Although adaptive finite-element (AFE) analysis is receiving more and more attention in scientific and engineering fields, its efficient implementation remains an open problem because of its more complex procedures. In this paper, we propose a clear C++ framework implementation to show the powerful properties of object-oriented philosophy (OOP) in designing such a complex adaptive procedure. Using the modular features of an OOP language, the whole adaptive system is divided into several separate parts, such as mesh generation or refinement, the a posteriori error estimator, the adaptive strategy, and the final post-processing. After each of these separate modules is designed locally, they are connected to form the framework of the adaptive procedure. Because the framework is based on the general elliptic differential equation, little effort is needed to adapt it to practical simulations. To show the advantages of OOP adaptive design, two numerical examples are tested. The first is the 3D direct current resistivity problem, in which the efficiency of the framework is shown, as only minor additions are needed. In the second, an induced polarization (IP) exploration case, a new adaptive procedure is easily added, which shows the strong extensibility and reusability of the OOP approach. Finally, we believe that, based on this modular adaptive framework implemented with OOP methodology, more advanced adaptive analysis systems will be available in the future.
An online-coupled NWP/ACT model with conserved Lagrangian levels

NASA Astrophysics Data System (ADS)

Sørensen, B.; Kaas, E.; Lauritzen, P. H.

2012-04-01

Numerical weather and climate modelling is under constant development. Semi-implicit semi-Lagrangian (SISL) models have proven to be numerically efficient in both short-range weather forecasts and climate models, due to the ability to use long time steps. Chemical/aerosol feedback mechanisms are becoming more and more relevant in NWP as well as climate models, since the biogenic and anthropogenic emissions can have a direct effect on the dynamics and radiative properties of the atmosphere. To include chemical feedback mechanisms in the NWP models, on-line coupling is crucial. In 3D semi-Lagrangian schemes with quasi-Lagrangian vertical coordinates the Lagrangian levels are remapped to Eulerian model levels each time step. This remapping introduces an undesirable tendency to smooth sharp gradients and creates unphysical numerical diffusion in the vertical distribution. A semi-Lagrangian advection method is introduced; it combines an inherently mass-conserving 2D semi-Lagrangian scheme with a SISL scheme employing both hybrid vertical coordinates and a fully Lagrangian vertical coordinate. This minimizes the vertical diffusion and thus potentially improves the simulation of the vertical profiles of moisture, clouds, and chemical constituents. Since the Lagrangian levels suffer from traditional Lagrangian limitations caused by the convergence and divergence of the flow, remappings to the Eulerian model levels are generally still required - but this need only be applied after a number of time steps - unless dynamic remapping methods are used. For this, several different remapping methods have been implemented. The combined scheme is mass conserving, consistent, and multi-tracer efficient.
Optimal control design of turbo spin‐echo sequences with applications to parallel‐transmit systems

PubMed Central

Hoogduin, Hans; Hajnal, Joseph V.; van den Berg, Cornelis A. T.; Luijten, Peter R.; Malik, Shaihan J.

2016-01-01

Purpose The design of turbo spin‐echo sequences is modeled as a dynamic optimization problem which includes the case of inhomogeneous transmit radiofrequency fields. This problem is efficiently solved by optimal control techniques making it possible to design patient‐specific sequences online. Theory and Methods The extended phase graph formalism is employed to model the signal evolution. The design problem is cast as an optimal control problem and an efficient numerical procedure for its solution is given. The numerical and experimental tests address standard multiecho sequences and pTx configurations. Results Standard, analytically derived flip angle trains are recovered by the numerical optimal control approach. New sequences are designed where constraints on radiofrequency total and peak power are included. In the case of parallel transmit application, the method is able to calculate the optimal echo train for two‐dimensional and three‐dimensional turbo spin echo sequences in the order of 10 s with a single central processing unit (CPU) implementation. The image contrast is maintained through the whole field of view despite inhomogeneities of the radiofrequency fields. Conclusion The optimal control design sheds new light on the sequence design process and makes it possible to design sequences in an online, patient‐specific fashion. Magn Reson Med 77:361–373, 2017. © 2016 The Authors Magnetic Resonance in Medicine published by Wiley Periodicals, Inc. on behalf of International Society for Magnetic Resonance in Medicine PMID:26800383
Mountain bicycle frame testing as an example of practical implementation of hybrid simulation using RTFEM

NASA Astrophysics Data System (ADS)

Mucha, Waldemar; Kuś, Wacław

2018-01-01

The paper presents a practical implementation of hybrid simulation using the Real Time Finite Element Method (RTFEM). Hybrid simulation is a technique for investigating dynamic material and structural properties of mechanical systems by performing numerical analysis and experiment at the same time. It applies to mechanical systems with elements too difficult or impossible to model numerically. These elements are tested experimentally, while the rest of the system is simulated numerically. Data between the experiment and numerical simulation are exchanged in real time. The authors use the Finite Element Method to perform the numerical simulation. The paper presents the general algorithm for hybrid simulation using RTFEM and possible improvements of the algorithm, developed by the authors, for reducing computation time. The paper focuses on the practical implementation of the presented methods, which involves testing of a mountain bicycle frame, where the shock absorber is tested experimentally while the rest of the frame is simulated numerically.
Advanced Applications of Adifor 3.0 for Efficient Calculation of First-and Second-Order CFD Sensitivity Derivatives

NASA Technical Reports Server (NTRS)

Taylor, Arthur C., III

2004-01-01

This final report will document the accomplishments of the work of this project. 1) The incremental-iterative (II) form of the reverse-mode (adjoint) method for computing first-order (FO) aerodynamic sensitivity derivatives (SDs) has been successfully implemented and tested in a 2D CFD code (called ANSERS) using the reverse-mode capability of ADIFOR 3.0. These results compared very well with similar SDs computed via a black-box (BB) application of the reverse-mode capability of ADIFOR 3.0, and also with similar SDs calculated via the method of finite differences. 2) Second-order (SO) SDs have been implemented in the 2D ANSERS code using the very efficient strategy that was originally proposed (but not previously tested) in Reference 3, Appendix A. Furthermore, these SO SDs have been validated for accuracy and computational efficiency. 3) Studies were conducted in Quasi-1D and 2D concerning the smoothness (or lack of smoothness) of the FO and SO SDs for flows with shock waves. The phenomenon is documented in the publications of this study (listed subsequently); however, the specific numerical mechanism which is responsible for this unsmoothness phenomenon was not discovered. 4) The FO and SO derivatives for Quasi-1D and 2D flows were applied to predict aerodynamic design uncertainties, and were also applied in robust design optimization studies.
Architecting the Finite Element Method Pipeline for the GPU.

PubMed

Fu, Zhisong; Lewis, T James; Kirby, Robert M; Whitaker, Ross T

2014-02-01

The finite element method (FEM) is a widely employed numerical technique for approximating the solution of partial differential equations (PDEs) in various science and engineering applications. Many of these applications benefit from fast execution of the FEM pipeline. One way to accelerate the FEM pipeline is by exploiting advances in modern computational hardware, such as the many-core streaming processors like the graphical processing unit (GPU). In this paper, we present the algorithms and data-structures necessary to move the entire FEM pipeline to the GPU. First we propose an efficient GPU-based algorithm to generate local element information and to assemble the global linear system associated with the FEM discretization of an elliptic PDE. To solve the corresponding linear system efficiently on the GPU, we implement a conjugate gradient method preconditioned with a geometry-informed algebraic multi-grid (AMG) method preconditioner. We propose a new fine-grained parallelism strategy, a corresponding multigrid cycling stage and efficient data mapping to the many-core architecture of GPU. Comparison of our on-GPU assembly versus a traditional serial implementation on the CPU achieves up to an 87 × speedup. Focusing on the linear system solver alone, we achieve a speedup of up to 51 × versus use of a comparable state-of-the-art serial CPU linear system solver. Furthermore, the method compares favorably with other GPU-based, sparse, linear solvers.
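The solver stage described in the record above is a preconditioned conjugate gradient iteration. The following is a minimal serial sketch of that iteration only, not the authors' GPU implementation: it swaps in a simple Jacobi (diagonal) preconditioner in place of the geometry-informed AMG preconditioner, and the function names `jacobi_pcg` and `variable_coefficient_matrix`, the dense list-of-lists storage, and the small test matrix are all illustrative assumptions.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (list of lists).

    Assumption: a Jacobi preconditioner stands in for the AMG
    preconditioner mentioned in the abstract.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]   # M^-1 for Jacobi
    x = [0.0] * n
    r = list(b)                                    # residual for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]     # preconditioned residual
    p = list(z)                                    # first search direction
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for iteration in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                    # step length along p
        x = [xi + alpha * pj for xi, pj in zip(x, p)]
        r = [ri - alpha * aj for ri, aj in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:       # relative residual test
            return x, iteration
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pj for zi, pj in zip(z, p)]
        rz = rz_new
    return x, max_iter


def variable_coefficient_matrix(n):
    """Tridiagonal SPD matrix from a 1D diffusion problem with varying k."""
    k = [1.0 + i for i in range(n + 1)]            # hypothetical coefficients
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = k[i] + k[i + 1]
        if i + 1 < n:
            A[i][i + 1] = A[i + 1][i] = -k[i + 1]
    return A


A = variable_coefficient_matrix(8)
x_true = [float(i + 1) for i in range(8)]
b = [sum(A[i][j] * x_true[j] for j in range(8)) for i in range(8)]
x, iterations = jacobi_pcg(A, b)
```

In a setting like the one the abstract describes, only the line computing `z` would change: the diagonal scaling would be replaced by one multigrid cycle applied to the residual.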
Implementing a Flipped Classroom Approach in a University Numerical Methods Mathematics Course

ERIC Educational Resources Information Center

Johnston, Barbara M.

2017-01-01

This paper describes and analyses the implementation of a "flipped classroom" approach, in an undergraduate mathematics course on numerical methods. The approach replaced all the lecture contents by instructor-made videos and was implemented in the consecutive years 2014 and 2015. The sequential case study presented here begins with an…
Dynamic fisheye grids for binary black hole simulations

NASA Astrophysics Data System (ADS)

Zilhão, Miguel; Noble, Scott C.

2014-03-01

We present a new warped gridding scheme adapted to simulating gas dynamics in binary black hole spacetimes. The grid concentrates grid points in the vicinity of each black hole to resolve the smaller scale structures there, and rarefies grid points away from each black hole to keep the overall problem size at a practical level. In this respect, our system can be thought of as a ‘double’ version of the fisheye coordinate system, used before in numerical relativity codes for evolving binary black holes. The gridding scheme is constructed as a mapping between a uniform coordinate system—in which the equations of motion are solved—to the distorted system representing the spatial locations of our grid points. Since we are motivated to eventually use this system for circumbinary disc calculations, we demonstrate how the distorted system can be constructed to asymptote to the typical spherical polar coordinate system, amenable to efficiently simulating orbiting gas flows about central objects with little numerical diffusion. We discuss its implementation in the Harm3d code, tailored to evolve the magnetohydrodynamics equations in curved spacetimes. We evaluate the performance of the system’s implementation in Harm3d with a series of tests, such as the advected magnetic field loop test, magnetized Bondi accretion, and evolutions of hydrodynamic discs about a single black hole and about a binary black hole. Like we have done with Harm3d, this gridding scheme can be implemented in other unigrid codes as a (possibly) simpler alternative to adaptive mesh refinement.
Trajectory errors of different numerical integration schemes diagnosed with the MPTRAC advection module driven by ECMWF operational analyses

NASA Astrophysics Data System (ADS)

Rößler, Thomas; Stein, Olaf; Heng, Yi; Baumeister, Paul; Hoffmann, Lars

2018-02-01

The accuracy of trajectory calculations performed by Lagrangian particle dispersion models (LPDMs) depends on various factors. The optimization of numerical integration schemes used to solve the trajectory equation helps to maximize the computational efficiency of large-scale LPDM simulations. We analyzed global truncation errors of six explicit integration schemes of the Runge-Kutta family, which we implemented in the Massive-Parallel Trajectory Calculations (MPTRAC) advection module. The simulations were driven by wind fields from operational analysis and forecasts of the European Centre for Medium-Range Weather Forecasts (ECMWF) at T1279L137 spatial resolution and 3 h temporal sampling. We defined separate test cases for 15 distinct regions of the atmosphere, covering the polar regions, the midlatitudes, and the tropics in the free troposphere, in the upper troposphere and lower stratosphere (UT/LS) region, and in the middle stratosphere. In total, more than 5000 different transport simulations were performed, covering the months of January, April, July, and October for the years 2014 and 2015. We quantified the accuracy of the trajectories by calculating transport deviations with respect to reference simulations using a fourth-order Runge-Kutta integration scheme with a sufficiently fine time step. Transport deviations were assessed with respect to error limits based on turbulent diffusion. Independent of the numerical scheme, the global truncation errors vary significantly between the different regions. Horizontal transport deviations in the stratosphere are typically an order of magnitude smaller compared with the free troposphere. We found that the truncation errors of the six numerical schemes fall into three distinct groups, which mostly depend on the numerical order of the scheme. Schemes of the same order differ little in accuracy, but some methods need less computational time, which gives them an advantage in efficiency. 
The selection of the integration scheme and the appropriate time step should possibly take into account the typical altitude ranges as well as the total length of the simulations to achieve the most efficient simulations. However, trying to summarize, we recommend the third-order Runge-Kutta method with a time step of 170 s or the midpoint scheme with a time step of 100 s for efficient simulations of up to 10 days of simulation time for the specific ECMWF high-resolution data set considered in this study. Purely stratospheric simulations can use significantly larger time steps of 800 and 1100 s for the midpoint scheme and the third-order Runge-Kutta method, respectively.
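The trade-off between scheme order and time step discussed in the record above can be illustrated on a toy problem. The sketch below is not the MPTRAC code: it assumes a steady solid-body rotation as the wind field (so the exact trajectory is known), uses Kutta's classical third-order method as the third-order scheme (the abstract does not say which variant was used), and the names `wind`, `step_midpoint`, `step_rk3`, `step_rk4` and `transport_deviation` are hypothetical.

```python
import math

OMEGA = 2.0 * math.pi / 86400.0   # one revolution per day (illustrative)


def wind(pos):
    """Velocity of a solid-body rotation at position (x, y)."""
    x, y = pos
    return (-OMEGA * y, OMEGA * x)


def shifted(pos, a, vel):
    """Return pos + a * vel."""
    return (pos[0] + a * vel[0], pos[1] + a * vel[1])


def step_midpoint(pos, dt):
    """Second-order explicit midpoint scheme."""
    k1 = wind(pos)
    k2 = wind(shifted(pos, 0.5 * dt, k1))
    return shifted(pos, dt, k2)


def step_rk3(pos, dt):
    """Kutta's classical third-order Runge-Kutta scheme."""
    k1 = wind(pos)
    k2 = wind(shifted(pos, 0.5 * dt, k1))
    k3 = wind(shifted(shifted(pos, -dt, k1), 2.0 * dt, k2))
    incr = ((k1[0] + 4.0 * k2[0] + k3[0]) / 6.0,
            (k1[1] + 4.0 * k2[1] + k3[1]) / 6.0)
    return shifted(pos, dt, incr)


def step_rk4(pos, dt):
    """Classical fourth-order Runge-Kutta scheme."""
    k1 = wind(pos)
    k2 = wind(shifted(pos, 0.5 * dt, k1))
    k3 = wind(shifted(pos, 0.5 * dt, k2))
    k4 = wind(shifted(pos, dt, k3))
    incr = ((k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
            (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)
    return shifted(pos, dt, incr)


def transport_deviation(step, dt, t_end=86400.0, start=(1000.0, 0.0)):
    """Distance between the numerical and exact end points."""
    n_steps = round(t_end / dt)
    pos = start
    for _ in range(n_steps):
        pos = step(pos, dt)
    angle = OMEGA * n_steps * dt
    exact = (start[0] * math.cos(angle) - start[1] * math.sin(angle),
             start[0] * math.sin(angle) + start[1] * math.cos(angle))
    return math.hypot(pos[0] - exact[0], pos[1] - exact[1])


dev_mid_600 = transport_deviation(step_midpoint, 600.0)
dev_mid_300 = transport_deviation(step_midpoint, 300.0)
dev_rk3_600 = transport_deviation(step_rk3, 600.0)
dev_rk4_600 = transport_deviation(step_rk4, 600.0)
```

Halving the time step should reduce the midpoint deviation by roughly a factor of four, which is the signature of a second-order global truncation error; the higher-order schemes give smaller deviations at the same step but cost more wind evaluations per step.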
Phase-shifting coronagraph

NASA Astrophysics Data System (ADS)

Hénault, François; Carlotti, Alexis; Vérinaud, Christophe

2017-09-01

With the recent commissioning of ground instruments such as SPHERE or GPI and future space observatories like WFIRST-AFTA, coronagraphy should probably become the most efficient tool for identifying and characterizing extrasolar planets in the forthcoming years. Coronagraphic instruments such as phase mask coronagraphs (PMC) are usually based on a phase mask or plate located at the telescope focal plane, spreading the starlight outside the diameter of a Lyot stop that blocks it. In this communication, we investigate the capability of a PMC to act as a phase-shifting wavefront sensor for better control of the achieved star extinction ratio in the presence of the coronagraphic mask. We discuss the two main implementations of the phase-shifting process, either introducing phase shifts in a pupil plane and sensing intensity variations in an image plane, or reciprocally. Conceptual optical designs are described in both cases. Numerical simulations allow for better understanding of the performance and limitations of both options, and for optimizing their fundamental parameters. In particular, they demonstrate that the phase-shifting process is slightly more efficient when implemented in an image plane, and is compatible with the most popular phase masks currently employed, i.e. four-quadrant and vortex phase masks.
A finite element approach to self-consistent field theory calculations of multiblock polymers

NASA Astrophysics Data System (ADS)

Ackerman, David M.; Delaney, Kris; Fredrickson, Glenn H.; Ganapathysubramanian, Baskar

2017-02-01

Self-consistent field theory (SCFT) has proven to be a powerful tool for modeling equilibrium microstructures of soft materials, particularly for multiblock polymers. A very successful approach to numerically solving the SCFT set of equations is based on using a spectral approach. While widely successful, this approach has limitations especially in the context of current technologically relevant applications. These limitations include non-trivial approaches for modeling complex geometries, difficulties in extending to non-periodic domains, as well as non-trivial extensions for spatial adaptivity. As a viable alternative to spectral schemes, we develop a finite element formulation of the SCFT paradigm for calculating equilibrium polymer morphologies. We discuss the formulation and address implementation challenges that ensure accuracy and efficiency. We explore higher order chain contour steppers that are efficiently implemented with Richardson Extrapolation. This approach is highly scalable and suitable for systems with arbitrary shapes. We show spatial and temporal convergence and illustrate scaling on up to 2048 cores. Finally, we illustrate confinement effects for selected complex geometries. This has implications for materials design for nanoscale applications where dimensions are such that equilibrium morphologies dramatically differ from the bulk phases.
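The record above mentions higher-order chain contour steppers built with Richardson extrapolation. As a rough illustration of the idea only, the sketch below applies it to the scalar model equation dq/ds = -w q rather than to the full modified diffusion equation on a finite element mesh; the backward Euler base step, the field value `w`, and the function names are assumptions made for this example.

```python
import math


def backward_euler_step(q, w, ds):
    """One first-order implicit step for dq/ds = -w * q."""
    return q / (1.0 + w * ds)


def richardson_step(q, w, ds, base_step=backward_euler_step, order=1):
    """Combine one full step and two half steps of a base scheme.

    For a base scheme of the given order, the combination cancels the
    leading error term and raises the order by one.
    """
    coarse = base_step(q, w, ds)
    fine = base_step(base_step(q, w, 0.5 * ds), w, 0.5 * ds)
    factor = 2 ** order
    return (factor * fine - coarse) / (factor - 1)


def propagate(step, w, n_steps, s_end=1.0, q0=1.0):
    """Advance q from s = 0 to s = s_end with n_steps equal steps."""
    ds = s_end / n_steps
    q = q0
    for _ in range(n_steps):
        q = step(q, w, ds)
    return q


w = 2.0                                   # hypothetical constant field
exact = math.exp(-w)
err_base_10 = abs(propagate(backward_euler_step, w, 10) - exact)
err_rich_10 = abs(propagate(richardson_step, w, 10) - exact)
err_rich_20 = abs(propagate(richardson_step, w, 20) - exact)
```

The extrapolated stepper needs three base solves per contour step instead of one, so it pays off when the gain in order allows a sufficiently larger step.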
Scalable domain decomposition solvers for stochastic PDEs in high performance computing

DOE PAGES

Desai, Ajit; Khalil, Mohammad; Pettit, Chris; ...

2017-09-21

Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Desai, Ajit; Khalil, Mohammad; Pettit, Chris

Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
Efficient parallel linear scaling construction of the density matrix for Born-Oppenheimer molecular dynamics.

PubMed

Mniszewski, S M; Cawkwell, M J; Wall, M E; Mohd-Yusof, J; Bock, N; Germann, T C; Niklasson, A M N

2015-10-13

We present an algorithm for the calculation of the density matrix that for insulators scales linearly with system size and parallelizes efficiently on multicore, shared memory platforms with small and controllable numerical errors. The algorithm is based on an implementation of the second-order spectral projection (SP2) algorithm [Niklasson, A. M. N. Phys. Rev. B 2002, 66, 155115] in sparse matrix algebra with the ELLPACK-R data format. We illustrate the performance of the algorithm within self-consistent tight binding theory by total energy calculations of gas phase poly(ethylene) molecules and periodic liquid water systems containing up to 15,000 atoms on up to 16 CPU cores. We consider algorithm-specific performance aspects, such as local vs nonlocal memory access and the degree of matrix sparsity. Comparisons to sparse matrix algebra implementations using off-the-shelf libraries on multicore CPUs, graphics processing units (GPUs), and the Intel many integrated core (MIC) architecture are also presented. The accuracy and stability of the algorithm are illustrated with long duration Born-Oppenheimer molecular dynamics simulations of 1000 water molecules and a 303 atom Trp cage protein solvated by 2682 water molecules.
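The core recursion of a second-order spectral projection can be sketched in a few lines. The version below is a dense, serial illustration under several assumptions, not the sparse ELLPACK-R implementation of the record above: the spectral bounds come from Gershgorin circles, each iteration picks whichever of X² or 2X − X² brings the trace closer to the number of occupied states (the exact selection rule of the cited algorithm should be checked against the reference), and the small dimerized-chain Hamiltonian and all function names are made up for the example.

```python
import math


def matmul(A, B):
    """Dense matrix product of two square list-of-lists matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def sp2_density_matrix(H, n_occ, tol=1e-12, max_iter=100):
    """Purify a symmetric H into a projector onto its n_occ lowest states.

    Assumes a gap between occupied and unoccupied states and a spectrum
    that is not a single point (so the Gershgorin width is nonzero).
    """
    n = len(H)
    radii = [sum(abs(H[i][j]) for j in range(n) if j != i) for i in range(n)]
    eps_min = min(H[i][i] - radii[i] for i in range(n))
    eps_max = max(H[i][i] + radii[i] for i in range(n))
    width = eps_max - eps_min
    # Map the spectrum into [0, 1] with occupied states near 1.
    X = [[((eps_max if i == j else 0.0) - H[i][j]) / width
          for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        X2 = matmul(X, X)
        tr_x, tr_x2 = trace(X), trace(X2)
        if abs(tr_x - tr_x2) < tol:          # idempotency error is small
            break
        if abs(tr_x2 - n_occ) < abs(2.0 * tr_x - tr_x2 - n_occ):
            X = X2                           # projection that lowers the trace
        else:
            X = [[2.0 * X[i][j] - X2[i][j] for j in range(n)]
                 for i in range(n)]          # projection that raises the trace
    return X


# Hypothetical 4-site chain with alternating hopping (gapped spectrum).
H = [[0.0, -1.0, 0.0, 0.0],
     [-1.0, 0.0, -0.5, 0.0],
     [0.0, -0.5, 0.0, -1.0],
     [0.0, 0.0, -1.0, 0.0]]
P = sp2_density_matrix(H, n_occ=2)
band_energy = trace(matmul(P, H))            # sum of the two lowest eigenvalues
```

Because the recursion uses only matrix products and additions, sparsity of the Hamiltonian can be exploited when small elements are thresholded, which is presumably where the sparse storage format of the record comes in.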

A finite element approach to self-consistent field theory calculations of multiblock polymers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ackerman, David M.; Delaney, Kris; Fredrickson, Glenn H.

Self-consistent field theory (SCFT) has proven to be a powerful tool for modeling equilibrium microstructures of soft materials, particularly for multiblock polymers. A very successful approach to numerically solving the SCFT set of equations is based on using a spectral approach. While widely successful, this approach has limitations especially in the context of current technologically relevant applications. These limitations include non-trivial approaches for modeling complex geometries, difficulties in extending to non-periodic domains, as well as non-trivial extensions for spatial adaptivity. As a viable alternative to spectral schemes, we develop a finite element formulation of the SCFT paradigm for calculating equilibrium polymer morphologies. We discuss the formulation and address implementation challenges that ensure accuracy and efficiency. We explore higher order chain contour steppers that are efficiently implemented with Richardson Extrapolation. This approach is highly scalable and suitable for systems with arbitrary shapes. We show spatial and temporal convergence and illustrate scaling on up to 2048 cores. Finally, we illustrate confinement effects for selected complex geometries. This has implications for materials design for nanoscale applications where dimensions are such that equilibrium morphologies dramatically differ from the bulk phases.
Modelling multiple cycles of static and dynamic recrystallisation using a fully implicit isotropic material model based on dislocation density

NASA Astrophysics Data System (ADS)

Jansen van Rensburg, Gerhardus J.; Kok, Schalk; Wilke, Daniel N.

2018-03-01

This paper presents the development and numerical implementation of a state variable based thermomechanical material model, intended for use within a fully implicit finite element formulation. Plastic hardening, thermal recovery and multiple cycles of recrystallisation can be tracked for single peak as well as multiple peak recrystallisation response. The numerical implementation of the state variable model extends on a J2 isotropic hypo-elastoplastic modelling framework. The complete numerical implementation is presented as an Abaqus UMAT and linked subroutines. Implementation is discussed with detailed explanation of the derivation and use of various sensitivities, internal state variable management and multiple recrystallisation cycle contributions. A flow chart explaining the proposed numerical implementation is provided as well as verification on the convergence of the material subroutine. The material model is characterised using two high temperature data sets for cobalt and copper. The results of finite element analyses using the material parameter values characterised on the copper data set are also presented.
Advanced numerical methods for three dimensional two-phase flow calculations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Toumi, I.; Caruge, D.

1997-07-01

This paper is devoted to new numerical methods developed for both one and three dimensional two-phase flow calculations. These methods are finite volume numerical methods and are based on the use of Approximate Riemann Solvers concepts to define convective fluxes versus mean cell quantities. The first part of the paper presents the numerical method for a one dimensional hyperbolic two-fluid model including differential terms such as added mass and interface pressure. This numerical solution scheme makes use of the Riemann problem solution to define backward and forward differencing to approximate spatial derivatives. The construction of this approximate Riemann solver uses an extension of Roe's method that has been successfully used to solve gas dynamic equations. As long as the two-fluid model is hyperbolic, this numerical method seems very efficient for the numerical solution of two-phase flow problems. The scheme was applied both to shock tube problems and to standard tests for two-fluid computer codes. The second part describes the numerical method in the three dimensional case. The authors also discuss some improvements made to obtain a fully implicit solution method that provides fast running steady state calculations. Such a scheme is now implemented in a thermal-hydraulic computer code devoted to 3-D steady-state and transient computations. Some results obtained for Pressurised Water Reactors concerning upper plenum calculations and a steady state flow in the core with rod bow effect evaluation are presented. In practice these new numerical methods have proved to be stable on non staggered grids and capable of generating accurate non oscillating solutions for two-phase flow calculations.
Efficient uncertainty quantification in fully-integrated surface and subsurface hydrologic simulations

NASA Astrophysics Data System (ADS)

Miller, K. L.; Berg, S. J.; Davison, J. H.; Sudicky, E. A.; Forsyth, P. A.

2018-01-01

Although high performance computers and advanced numerical methods have made the application of fully-integrated surface and subsurface flow and transport models such as HydroGeoSphere commonplace, run times for large complex basin models can still be on the order of days to weeks, thus limiting the usefulness of traditional workhorse algorithms for uncertainty quantification (UQ) such as Latin Hypercube simulation (LHS) or Monte Carlo simulation (MCS), which generally require thousands of simulations to achieve an acceptable level of accuracy. In this paper we investigate non-intrusive polynomial chaos expansion (PCE) for uncertainty quantification, which, in contrast to random sampling methods (e.g., LHS and MCS), represents a model response of interest as a weighted sum of polynomials over the random inputs. Once a chaos expansion has been constructed, approximating the mean, covariance, probability density function, cumulative distribution function, and other common statistics as well as local and global sensitivity measures is straightforward and computationally inexpensive, thus making PCE an attractive UQ method for hydrologic models with long run times. Our polynomial chaos implementation was validated through comparison with analytical solutions as well as solutions obtained via LHS for simple numerical problems.
It was then used to quantify parametric uncertainty in a series of numerical problems with increasing complexity, including a two-dimensional fully-saturated, steady flow and transient transport problem with six uncertain parameters and one quantity of interest; a one-dimensional variably-saturated column test involving transient flow and transport, four uncertain parameters, and two quantities of interest at 101 spatial locations and five different times each (1010 total); and a three-dimensional fully-integrated surface and subsurface flow and transport problem for a small test catchment involving seven uncertain parameters and three quantities of interest at 241 different times each. Numerical experiments show that polynomial chaos is an effective and robust method for quantifying uncertainty in fully-integrated hydrologic simulations, which provides a rich set of features and is computationally efficient. Our approach has the potential for significant speedup over existing sampling based methods when the number of uncertain model parameters is modest (≤ 20). To our knowledge, this is the first implementation of the algorithm in a comprehensive, fully-integrated, physically-based three-dimensional hydrosystem model.
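The statement in the abstract above that the mean and variance follow cheaply from a chaos expansion can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single standard-normal input, a probabilists' Hermite basis, and projection by Gauss-Hermite quadrature, and the names `pce_coefficients`, `pce_mean_and_variance` and `toy_model` are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He


def pce_coefficients(model, order, num_nodes=None):
    """Project model(xi), xi ~ N(0, 1), onto probabilists' Hermite polynomials.

    The model is only evaluated at quadrature nodes, so it is treated as a
    black box (the "non-intrusive" part). One input only: an assumption.
    """
    if num_nodes is None:
        num_nodes = order + 1
    nodes, weights = He.hermegauss(num_nodes)
    weights = weights / math.sqrt(2.0 * math.pi)  # normalise to a probability measure
    samples = model(nodes)
    coeffs = np.empty(order + 1)
    for k in range(order + 1):
        basis_k = He.hermeval(nodes, [0.0] * k + [1.0])  # He_k at the nodes
        # E[He_k^2] = k!, so dividing by k! gives the expansion coefficient.
        coeffs[k] = np.sum(weights * samples * basis_k) / math.factorial(k)
    return coeffs


def pce_mean_and_variance(coeffs):
    """Mean is the zeroth coefficient; variance is sum_{k>=1} c_k^2 * k!."""
    mean = coeffs[0]
    variance = sum(c**2 * math.factorial(k) for k, c in enumerate(coeffs) if k > 0)
    return mean, variance


def toy_model(xi):
    """Hypothetical stand-in for an expensive simulator."""
    return 1.0 + 2.0 * xi + 3.0 * xi**2


coeffs = pce_coefficients(toy_model, order=2)
mean, variance = pce_mean_and_variance(coeffs)
```

For this toy model the expansion is exact, so the statistics can be checked by hand: the mean is 1 + 3 = 4 and the variance is 4 + 9 * 2 = 22. A multi-parameter problem such as those in the abstract would need a tensor or sparse multivariate basis, which this sketch does not attempt.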
Adaptive Wavelet Modeling of Geophysical Data

NASA Astrophysics Data System (ADS)

Plattner, A.; Maurer, H.; Dahmen, W.; Vorloeper, J.

2009-12-01

Despite the ever-increasing power of modern computers, realistic modeling of complex three-dimensional Earth models is still a challenging task and requires substantial computing resources. The overwhelming majority of current geophysical modeling approaches includes either finite difference or non-adaptive finite element algorithms, and variants thereof. These numerical methods usually require the subsurface to be discretized with a fine mesh to accurately capture the behavior of the physical fields. However, this may result in excessive memory consumption and computing times. A common feature of most of these algorithms is that the modeled data discretizations are independent of the model complexity, which may be wasteful when there are only minor to moderate spatial variations in the subsurface parameters. Recent developments in the theory of adaptive numerical solvers have the potential to overcome this problem. Here, we consider an adaptive wavelet based approach that is applicable to a large scope of problems, also including nonlinear problems. To the best of our knowledge such algorithms have not yet been applied in geophysics. Adaptive wavelet algorithms offer several attractive features: (i) for a given subsurface model, they allow the forward modeling domain to be discretized with a quasi minimal number of degrees of freedom, (ii) sparsity of the associated system matrices is guaranteed, which makes the algorithm memory efficient, and (iii) the modeling accuracy scales linearly with computing time. We have implemented the adaptive wavelet algorithm for solving three-dimensional geoelectric problems. To test its performance, numerical experiments were conducted with a series of conductivity models exhibiting varying degrees of structural complexity. Results were compared with a non-adaptive finite element algorithm, which incorporates an unstructured mesh to best fit subsurface boundaries. 
Such algorithms represent the current state-of-the-art in geoelectrical modeling. An analysis of the numerical accuracy as a function of the number of degrees of freedom revealed that the adaptive wavelet algorithm outperforms the finite element solver for simple and moderately complex models, whereas the results become comparable for models with spatially highly variable electrical conductivities. The linear dependency of the modeling error and the computing time proved to be model-independent. This feature will allow very efficient computations using large-scale models as soon as our experimental code is optimized in terms of its implementation.
Efficient Sample Delay Calculation for 2-D and 3-D Ultrasound Imaging.

PubMed

Ibrahim, Aya; Hager, Pascal A; Bartolini, Andrea; Angiolini, Federico; Arditi, Marcel; Thiran, Jean-Philippe; Benini, Luca; De Micheli, Giovanni

2017-08-01

Ultrasound imaging is a reference medical diagnostic technique, thanks to its blend of versatility, effectiveness, and moderate cost. The core computation of all ultrasound imaging methods is based on simple formulae, except for those required to calculate acoustic propagation delays with high precision and throughput. Unfortunately, advanced three-dimensional (3-D) systems require the calculation or storage of billions of such delay values per frame, which is a challenge. In 2-D systems, this requirement can be four orders of magnitude lower, but efficient computation is still crucial in view of low-power implementations that can be battery-operated, enabling usage in numerous additional scenarios. In this paper, we explore two smart designs of the delay generation function. To quantify their hardware cost, we implement them on FPGA and study their footprint and performance. We evaluate how these architectures scale to different ultrasound applications, from a low-power 2-D system to a next-generation 3-D machine. When using numerical approximations, we demonstrate the ability to generate delay values with sufficient throughput to support 10 000-channel 3-D imaging at up to 30 fps while using 63% of a Virtex 7 FPGA, requiring 24 MB of external memory accessed at about 32 GB/s bandwidth. Alternatively, with similar FPGA occupation, we show an exact calculation method that reaches 24 fps on 1225-channel 3-D imaging and does not require external memory at all. Both designs can be scaled to use a negligible amount of resources for 2-D imaging in low-power applications and for ultrafast 2-D imaging at hundreds of frames per second.
An efficient finite differences method for the computation of compressible, subsonic, unsteady flows past airfoils and panels

NASA Astrophysics Data System (ADS)

Colera, Manuel; Pérez-Saborid, Miguel

2017-09-01

A finite differences scheme is proposed in this work to compute in the time domain the compressible, subsonic, unsteady flow past an aerodynamic airfoil using the linearized potential theory. It improves and extends the original method proposed in this journal by Hariharan, Ping and Scott [1] by considering: (i) a non-uniform mesh, (ii) an implicit time integration algorithm, (iii) a vectorized implementation and (iv) the coupled airfoil dynamics and fluid dynamic loads. First, we have formulated the method for cases in which the airfoil motion is given. The scheme has been tested on well known problems in unsteady aerodynamics -such as the response to a sudden change of the angle of attack and to a harmonic motion of the airfoil- and has been proved to be more accurate and efficient than other finite differences and vortex-lattice methods found in the literature. Secondly, we have coupled our method to the equations governing the airfoil dynamics in order to numerically solve problems where the airfoil motion is unknown a priori as happens, for example, in the cases of the flutter and the divergence of a typical section of a wing or of a flexible panel. Apparently, this is the first self-consistent and easy-to-implement numerical analysis in the time domain of the compressible, linearized coupled dynamics of the (generally flexible) airfoil-fluid system carried out in the literature. The results for the particular case of a rigid airfoil show excellent agreement with those reported by other authors, whereas those obtained for the case of a cantilevered flexible airfoil in compressible flow seem to be original or, at least, not well-known.
The Linearized Bregman Method for Frugal Full-waveform Inversion with Compressive Sensing and Sparsity-promoting

NASA Astrophysics Data System (ADS)

Chai, Xintao; Tang, Genyang; Peng, Ronghua; Liu, Shaoyong

2018-03-01

Full-waveform inversion (FWI) reconstructs the subsurface properties from acquired seismic data via minimization of the misfit between observed and simulated data. However, FWI suffers from considerable computational costs resulting from the numerical solution of the wave equation for each source at each iteration. To reduce the computational burden, supershots can be constructed by combining several sources (aka source encoding), which lowers the number of simulations at each iteration but gives rise to crosstalk artifacts because of interference between the individual sources of the supershot. A modified Gauss-Newton FWI (MGNFWI) approach showed that as long as the difference between the initial and true models permits a sparse representation, the ℓ1-norm constrained model updates suppress subsampling-related artifacts. However, the spectral-projected gradient ℓ1 (SPGℓ1) algorithm employed by MGNFWI is rather complicated, which makes its implementation difficult. To facilitate realistic applications, we adapt a linearized Bregman (LB) method to sparsity-promoting FWI (SPFWI) because of the efficiency and simplicity of LB in the framework of ℓ1-norm constrained optimization problems and compressive sensing. Numerical experiments performed with the BP Salt model, the Marmousi model and the BG Compass model verify the following points. The FWI result with LB solving the ℓ1-norm sparsity-promoting problem for the model update outperforms that generated by solving the ℓ2-norm problem in terms of crosstalk elimination and high-fidelity results. The simpler LB method performs comparably and even superiorly to the complicated SPGℓ1 method in terms of computational efficiency and model quality, making the LB method a viable alternative for realistic implementations of SPFWI.
Jacobi-Gauss-Lobatto collocation method for the numerical solution of 1+1 nonlinear Schrödinger equations

NASA Astrophysics Data System (ADS)

Doha, E. H.; Bhrawy, A. H.; Abdelkawy, M. A.; Van Gorder, Robert A.

2014-03-01

A Jacobi-Gauss-Lobatto collocation (J-GL-C) method, used in combination with the implicit Runge-Kutta method of fourth order, is proposed as a numerical algorithm for the approximation of solutions to nonlinear Schrödinger equations (NLSE) with initial-boundary data in 1+1 dimensions. Our procedure is implemented in two successive steps. In the first one, the J-GL-C is employed for approximating the functional dependence on the spatial variable, using (N-1) nodes of the Jacobi-Gauss-Lobatto interpolation which depends upon two general Jacobi parameters. The resulting equations together with the two-point boundary conditions induce a system of 2(N-1) first-order ordinary differential equations (ODEs) in time. In the second step, the implicit Runge-Kutta method of fourth order is applied to solve this temporal system. The proposed J-GL-C method, used in combination with the implicit Runge-Kutta method of fourth order, is employed to obtain highly accurate numerical approximations to four types of NLSE, including the attractive and repulsive NLSE and a Gross-Pitaevskii equation with space-periodic potential. The numerical results obtained by this algorithm have been compared with various exact solutions in order to demonstrate the accuracy and efficiency of the proposed method. Indeed, for relatively few nodes used, the absolute error in our numerical solutions is sufficiently small.
High Order Discontinuous Galerkin Methods for Convection Dominated Problems with Application to Aeroacoustics

NASA Technical Reports Server (NTRS)

Shu, Chi-Wang

2000-01-01

This project investigates the development of discontinuous Galerkin finite element methods, for general geometry and triangulations, for solving convection dominated problems, with applications to aeroacoustics. On the analysis side, we have studied the efficient and stable discontinuous Galerkin framework for small second derivative terms, for example in Navier-Stokes equations, and also for related equations such as the Hamilton-Jacobi equations. This is a truly local discontinuous formulation where derivatives are considered as new variables. On the applied side, we have implemented and tested the efficiency of different approaches numerically. Related issues in high order ENO and WENO finite difference methods and spectral methods have also been investigated. Jointly with Hu, we have presented a discontinuous Galerkin finite element method for solving the nonlinear Hamilton-Jacobi equations. This method is based on the Runge-Kutta discontinuous Galerkin finite element method for solving conservation laws. The method has the flexibility of treating complicated geometry by using arbitrary triangulation, can achieve high order accuracy with a local, compact stencil, and is suited for efficient parallel implementation. One and two dimensional numerical examples are given to illustrate the capability of the method. Jointly with Hu, we have constructed third and fourth order WENO schemes on two dimensional unstructured meshes (triangles) in the finite volume formulation. The third order schemes are based on a combination of linear polynomials with nonlinear weights, and the fourth order schemes are based on a combination of quadratic polynomials with nonlinear weights. We have addressed several difficult issues associated with high order WENO schemes on unstructured meshes, including the choice of linear and nonlinear weights, what to do with negative weights, etc.
Numerical examples are shown to demonstrate the accuracies and robustness of the methods for shock calculations. Jointly with P. Montarnal, we have used a recently developed energy relaxation theory by Coquel and Perthame and high order weighted essentially non-oscillatory (WENO) schemes to simulate the Euler equations of real gas. The main idea is an energy decomposition of the form epsilon = epsilon(sub 1) + epsilon(sub 2), where epsilon(sub 1) is associated with a simpler pressure law (a gamma-law in this paper) and the nonlinear deviation epsilon(sub 2) is convected with the flow. A relaxation process is performed for each time step to ensure that the original pressure law is satisfied. The necessary characteristic decomposition for the high order WENO schemes is performed on the characteristic fields based on the epsilon(sub 1) gamma-law. The algorithm only calls for the original pressure law once per grid point per time step, without the need to compute its derivatives or any Riemann solvers. Both one and two dimensional numerical examples are shown to illustrate the effectiveness of this approach.
[Hardware Implementation of Numerical Simulation Function of Hodgkin-Huxley Model Neurons Action Potential Based on Field Programmable Gate Array].

PubMed

Wang, Jinlong; Lu, Mai; Hu, Yanwen; Chen, Xiaoqiang; Pan, Qiangqiang

2015-12-01

The neuron is the basic unit of the biological neural system. The Hodgkin-Huxley (HH) model is one of the most realistic neuron models for describing the electrophysiological characteristics of a neuron. Hardware implementation of neurons could provide new research ideas for the clinical treatment of spinal cord injury, bionics and artificial intelligence. In the present study, based on the HH model neuron and the DSP Builder technology, a hardware implementation of a single HH model neuron was completed in a Field Programmable Gate Array (FPGA). The neuron implemented in the FPGA was stimulated by different types of current, the action potential response characteristics were analyzed, and the correlation coefficient between the numerical simulation result and the hardware implementation result was calculated. The results showed that the neuronal action potential response of the FPGA was highly consistent with the numerical simulation result. This work lays the foundation for hardware implementation of neural networks.
Final Report for "Numerical Methods and Studies of High-Speed Reactive and Non-Reactive Flows"

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schwendeman, D W

2002-11-20

The work carried out under this subcontract involved the development and use of an adaptive numerical method for the accurate calculation of high-speed reactive flows on overlapping grids. The flow is modeled by the reactive Euler equations with an assumed equation of state and with various reaction rate models. A numerical method has been developed to solve the nonlinear hyperbolic partial differential equations in the model. The method uses an unsplit, shock-capturing scheme, and uses a Godunov-type scheme to compute fluxes and a Runge-Kutta error control scheme to compute the source term modeling the chemical reactions. An adaptive mesh refinement (AMR) scheme has been implemented in order to locally increase grid resolution. The numerical method uses composite overlapping grids to handle complex flow geometries. The code is part of the "Overture-OverBlown" framework of object-oriented codes [1, 2], and the development has occurred in close collaboration with Bill Henshaw and David Brown, and other members of the Overture team within CASC. During the period of this subcontract, a number of tasks were accomplished, including: (1) an extension of the numerical method to handle "ignition and grow" reaction models and a JWL equation of state; (2) an improvement in the efficiency of the AMR scheme and the error estimator; (3) an addition of a scheme of numerical dissipation designed to suppress numerical oscillations/instabilities near expanding detonations and along grid overlaps; and (4) an exploration of the evolution to detonation in an annulus and of detonation failure in an expanding channel.
Improving finite element results in modeling heart valve mechanics.

PubMed

Earl, Emily; Mohammadi, Hadi

2018-06-01

Finite element analysis is a well-established computational tool which can be used for the analysis of soft tissue mechanics. Due to the structural complexity of the leaflet tissue of the heart valve, the currently available finite element models do not adequately represent the leaflet tissue. A method of addressing this issue is to implement computationally expensive finite element models, characterized by precise constitutive models including high-order and high-density mesh techniques. In this study, we introduce a novel numerical technique that enhances the results obtained from coarse mesh finite element models to provide accuracy comparable to that of fine mesh finite element models while maintaining a relatively low computational cost. Introduced in this study is a method by which the computational expense required to solve linear and nonlinear constitutive models, commonly used in heart valve mechanics simulations, is reduced while continuing to account for large and infinitesimal deformations. This continuum model is developed based on the least square algorithm procedure coupled with the finite difference method adhering to the assumption that the components of the strain tensor are available at all nodes of the finite element mesh model. The suggested numerical technique is easy to implement, practically efficient, and requires less computational time compared to currently available commercial finite element packages such as ANSYS and/or ABAQUS.
Development and Implementation of a Transport Method for the Transport and Reaction Simulation Engine (TaRSE) based on the Godunov-Mixed Finite Element Method

USGS Publications Warehouse

James, Andrew I.; Jawitz, James W.; Munoz-Carpena, Rafael

2009-01-01

A model to simulate transport of materials in surface water and ground water has been developed to numerically approximate solutions to the advection-dispersion equation. This model, known as the Transport and Reaction Simulation Engine (TaRSE), uses an algorithm that incorporates a time-splitting technique where the advective part of the equation is solved separately from the dispersive part. An explicit finite-volume Godunov method is used to approximate the advective part, while a mixed-finite element technique is used to approximate the dispersive part. The dispersive part uses an implicit discretization, which allows it to run stably with a larger time step than the explicit advective step. The potential exists to develop algorithms that run several advective steps, and then one dispersive step that encompasses the time interval of the advective steps. Because the dispersive step is computationally most expensive, schemes can be implemented that are more computationally efficient than non-time-split algorithms. This technique enables scientists to solve problems with high grid Peclet numbers, such as transport problems with sharp solute fronts, without spurious oscillations in the numerical approximation to the solution and with virtually no artificial diffusion.
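The time-splitting idea in the TaRSE abstract above (several explicit advective steps, then one implicit dispersive step spanning the same interval) can be sketched in one dimension. This is an illustration under stated assumptions, not the TaRSE algorithm: first-order upwind stands in for the Godunov method, a backward-Euler finite-difference step stands in for the mixed finite element dispersion solve, the grid is periodic, and the function names are hypothetical.

```python
import numpy as np


def advect_upwind(c, courant):
    """One explicit first-order upwind step for positive velocity, periodic grid."""
    return c - courant * (c - np.roll(c, 1))


def disperse_implicit(c, diffusion_number):
    """One backward-Euler dispersion step: solve (I - r * L) c_new = c.

    L is the periodic second-difference matrix; r = D * dt / dx**2.
    """
    n = c.size
    eye = np.eye(n)
    lap = -2.0 * eye + np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)
    return np.linalg.solve(eye - diffusion_number * lap, c)


def split_step(c, velocity, dispersion, dx, dt_adv, n_sub):
    """n_sub advective sub-steps, then one dispersive step over n_sub * dt_adv."""
    courant = velocity * dt_adv / dx
    if not 0.0 <= courant <= 1.0:
        raise ValueError("explicit advective step requires 0 <= Courant number <= 1")
    for _ in range(n_sub):
        c = advect_upwind(c, courant)
    r = dispersion * (n_sub * dt_adv) / dx**2
    return disperse_implicit(c, r)


# Sharp solute front at a grid Peclet number of velocity * dx / dispersion = 100.
n_cells, dx = 100, 1.0
c0 = np.zeros(n_cells)
c0[10:20] = 1.0
c = c0.copy()
for _ in range(20):
    c = split_step(c, velocity=1.0, dispersion=0.01, dx=dx, dt_adv=0.5, n_sub=4)
```

Both sub-steps conserve mass and are monotone here, so the front advances without overshoot or negative concentrations. First-order upwind is numerically diffusive, however, so this sketch does not reproduce the "virtually no artificial diffusion" property claimed for the scheme in the abstract.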
Smooth Particle Hydrodynamics GPU-Acceleration Tool for Asteroid Fragmentation Simulation

NASA Astrophysics Data System (ADS)

Buruchenko, Sergey K.; Schäfer, Christoph M.; Maindl, Thomas I.

2017-10-01

The impact threat of near-Earth objects (NEOs) is a concern to the global community, as evidenced by the Chelyabinsk event (caused by a 17-m meteorite) in Russia on February 15, 2013 and a near miss by asteroid 2012 DA14 (30 m diameter) on the same day. The expected energy, from either a low-altitude air burst or direct impact, would have severe consequences, especially in populated regions. One method of mitigating this threat is the employment of large kinetic-energy impactors (KEIs). The simulation of asteroid target fragmentation is a challenging task which demands efficient and accurate numerical methods and large computational power. Modern graphics processing units (GPUs) lead to a major increase, of 10 times and more, in the performance of the computation of astrophysical and high velocity impacts. The paper presents a new implementation of the numerical method smooth particle hydrodynamics (SPH) using NVIDIA GPUs and the first astrophysical and high velocity application of the new code. The code allows for a tremendous increase in speed of astrophysical simulations with SPH and self-gravity at low costs for new hardware. We have implemented the SPH equations to model gas, liquids, and elastic and plastic solid bodies, and added a fragmentation model for brittle materials. Self-gravity may be optionally included in the simulations.
A Decentralized Eigenvalue Computation Method for Spectrum Sensing Based on Average Consensus

NASA Astrophysics Data System (ADS)

Mohammadi, Jafar; Limmer, Steffen; Stańczak, Sławomir

2016-07-01

This paper considers eigenvalue estimation for the decentralized inference problem for spectrum sensing. We propose a decentralized eigenvalue computation algorithm based on the power method, which is referred to as the generalized power method (GPM); it is capable of estimating the eigenvalues of a given covariance matrix under certain conditions. Furthermore, we have developed a decentralized implementation of GPM by splitting the iterative operations into local and global computation tasks. The global tasks require data exchange to be performed among the nodes. For this task, we apply an average consensus algorithm to efficiently perform the global computations. As a special case, we consider a structured graph that is a tree with clusters of nodes at its leaves. For an accelerated distributed implementation, we propose to use computation over multiple access channel (CoMAC) as a building block of the algorithm. Numerical simulations are provided to illustrate the performance of the two algorithms.
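The split into local and global tasks described in the abstract above can be illustrated with a plain power iteration whose only network-wide operation is an average, computed by neighbour-to-neighbour consensus. This is a sketch under assumptions and not the authors' GPM: each node is assumed to hold its own symmetric matrix, the target is the dominant eigenvalue of the network average of those matrices, Metropolis weights are one common choice of mixing matrix, and all function names are hypothetical.

```python
import numpy as np


def metropolis_weights(adjacency):
    """Doubly stochastic mixing matrix from a symmetric 0/1 adjacency matrix."""
    n = adjacency.shape[0]
    degree = adjacency.sum(axis=1)
    weights = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                weights[i, j] = 1.0 / (1.0 + max(degree[i], degree[j]))
        weights[i, i] = 1.0 - weights[i].sum()
    return weights


def decentralized_power_method(local_matrices, adjacency,
                               power_iters=100, consensus_iters=60):
    """Estimate the dominant eigenvalue of mean(local_matrices) at every node.

    Row k of `vectors` is node k's private copy of the power iterate.
    """
    num_nodes, dim, _ = local_matrices.shape
    mixing = metropolis_weights(adjacency)
    vectors = np.ones((num_nodes, dim)) / np.sqrt(dim)
    eigenvalues = np.zeros(num_nodes)
    for _ in range(power_iters):
        # Local task: each node multiplies its own matrix by its own iterate.
        products = np.einsum("kij,kj->ki", local_matrices, vectors)
        # Global task: repeated neighbour averaging approximates the network mean.
        for _ in range(consensus_iters):
            products = mixing @ products
        # Local tasks: Rayleigh quotient, then renormalisation.
        eigenvalues = np.einsum("ki,ki->k", vectors, products)
        vectors = products / np.linalg.norm(products, axis=1, keepdims=True)
    return eigenvalues, vectors


# Five nodes on a ring, each with a random positive semidefinite 4 x 4 matrix
# plus a shared rank-one term that guarantees a clear spectral gap.
rng = np.random.default_rng(0)
num_nodes, dim = 5, 4
ring = np.zeros((num_nodes, num_nodes), dtype=int)
for k in range(num_nodes):
    ring[k, (k + 1) % num_nodes] = ring[(k + 1) % num_nodes, k] = 1
factors = rng.standard_normal((num_nodes, dim, dim))
local = np.einsum("kij,klj->kil", factors, factors) / dim
direction = rng.standard_normal(dim)
direction /= np.linalg.norm(direction)
local = local + 10.0 * np.outer(direction, direction)

eigenvalues, _ = decentralized_power_method(local, ring)
reference = np.linalg.eigvalsh(local.mean(axis=0))[-1]
```

Every node ends with nearly the same estimate, which can be compared against `reference`, the value a fusion centre would compute from the averaged matrix. The number of consensus rounds needed per power step depends on the graph's connectivity, which is the cost that schemes such as CoMAC aim to reduce.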
Implementation and application of a gradient enhanced crystal plasticity model

NASA Astrophysics Data System (ADS)

Soyarslan, C.; Perdahcıoǧlu, E. S.; Aşık, E. E.; van den Boogaard, A. H.; Bargmann, S.

2017-10-01

A rate-independent crystal plasticity model is implemented in which description of the hardening of the material is given as a function of the total dislocation density. The evolution of statistically stored dislocations (SSDs) is described using a saturating type evolution law. The evolution of geometrically necessary dislocations (GNDs) on the other hand is described using the gradient of the plastic strain tensor in a non-local manner. The gradient of the incremental plastic strain tensor is computed explicitly during an implicit FE simulation after each converged step. Using the plastic strain tensor stored as state variables at each integration point and an efficient numerical algorithm to find the gradients, the GND density is obtained. This results in a weak coupling of the equilibrium solution and the gradient enhancement. The algorithm is applied to an academic test problem which considers growth of a cylindrical void in a single crystal matrix.
Clinical application of next-generation sequencing for Mendelian diseases.

PubMed

Jamuar, Saumya Shekhar; Tan, Ene-Choo

2015-06-16

Over the past decade, next-generation sequencing (NGS) has led to an exponential increase in our understanding of the genetic basis of Mendelian diseases. NGS allows for the analysis of multiple regions of the genome in one single reaction and has been shown to be a cost-effective and efficient tool in investigating patients with Mendelian diseases. More recently, NGS has been successfully deployed in the clinics, with a reported diagnostic yield of ~25 %. However, recommendations on clinical implementation of NGS are still evolving with numerous key challenges that impede the widespread use of genetics in everyday medicine. These challenges include when to order, on whom to order, what type of test to order, and how to interpret and communicate the results, including incidental findings, to the patient and family. In this review, we discuss these challenges and suggest guidelines on implementing NGS in the routine clinical workflow.
Calculation of stress intensity factors in an isotropic multicracked plate: Part 2: Symbolic/numeric implementation

NASA Technical Reports Server (NTRS)

Arnold, S. M.; Binienda, W. K.; Tan, H. Q.; Xu, M. H.

1992-01-01

Analytical derivations of stress intensity factors (SIF's) of a multicracked plate can be complex and tedious. Recent advances, however, in intelligent application of symbolic computation can overcome these difficulties and provide the means to rigorously and efficiently analyze this class of problems. Here, the symbolic algorithm required to implement the methodology described in Part 1 is presented. The special problem-oriented symbolic functions to derive the fundamental kernels are described, and the associated automatically generated FORTRAN subroutines are given. As a result, a symbolic/FORTRAN package named SYMFRAC, capable of providing accurate SIF's at each crack tip, was developed and validated. Simple illustrative examples using SYMFRAC show the potential of the present approach for predicting the macrocrack propagation path due to existing microcracks in the vicinity of a macrocrack tip, when the influence of the microcrack's location, orientation, size, and interaction are taken into account.
An improved wavelet-Galerkin method for dynamic response reconstruction and parameter identification of shear-type frames

NASA Astrophysics Data System (ADS)

Bu, Haifeng; Wang, Dansheng; Zhou, Pin; Zhu, Hongping

2018-04-01

An improved wavelet-Galerkin (IWG) method based on the Daubechies wavelet is proposed for reconstructing the dynamic responses of shear structures. The proposed method flexibly manages wavelet resolution level according to excitation, thereby avoiding the weakness of the wavelet-Galerkin multiresolution analysis (WGMA) method in terms of resolution and the requirement of external excitation. In this work, IWG is applied to case studies involving single- and n-degree-of-freedom frame structures subjected to a determined discrete excitation. Results demonstrate that IWG performs better than WGMA in terms of accuracy and computation efficiency. Furthermore, a new method for parameter identification, based on IWG and an optimization algorithm, is also developed for shear frame structures, and a simultaneous identification of structural parameters and excitation is implemented. Numerical results demonstrate that the proposed identification method is effective for shear frame structures.

Simulating the dynamic behavior of a vertical axis wind turbine operating in unsteady conditions

NASA Astrophysics Data System (ADS)

Battisti, L.; Benini, E.; Brighenti, A.; Soraperra, G.; Raciti Castelli, M.

2016-09-01

The present work aims at assessing the reliability of a simulation tool capable of computing the unsteady rotational motion and the associated tower oscillations of a variable speed VAWT immersed in a coherent turbulent wind. Since the dynamic behaviour of a variable speed turbine strongly depends on unsteady wind conditions (wind gusts), a steady state approach cannot accurately capture transient-related issues. The simulation platform proposed here is implemented using a lumped mass approach: the drive train is described by resorting to both the polar inertia and the angular position of rotating parts, also considering their speed and acceleration, while the rotor aerodynamics are based on steady experimental curves. The ultimate objective of the presented numerical platform is the simulation of transient phenomena, driven by turbulence, occurring during rotor operation, with the aim of supporting the implementation of efficient and robust control algorithms.
Stochastic DG Placement for Conservation Voltage Reduction Based on Multiple Replications Procedure

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Zhaoyu; Chen, Bokan; Wang, Jianhui

2015-06-01

Conservation voltage reduction (CVR) and distributed-generation (DG) integration are popular strategies implemented by utilities to improve energy efficiency. This paper investigates the interactions between CVR and DG placement to minimize load consumption in distribution networks, while keeping the lowest voltage level within the predefined range. The optimal placement of DG units is formulated as a stochastic optimization problem considering the uncertainty of DG outputs and load consumptions. A sample average approximation algorithm-based technique is developed to solve the formulated problem effectively. A multiple replications procedure is developed to test the stability of the solution and calculate the confidence interval of the gap between the candidate solution and optimal solution. The proposed method has been applied to the IEEE 37-bus distribution test system with different scenarios. The numerical results indicate that the implementations of CVR and DG, if combined, can achieve significant energy savings.
Automatic mesh refinement and parallel load balancing for Fokker-Planck-DSMC algorithm

NASA Astrophysics Data System (ADS)

Küchlin, Stephan; Jenny, Patrick

2018-06-01

Recently, a parallel Fokker-Planck-DSMC algorithm for rarefied gas flow simulation in complex domains at all Knudsen numbers was developed by the authors. Fokker-Planck-DSMC (FP-DSMC) is an augmentation of the classical DSMC algorithm, which mitigates the near-continuum deficiencies in terms of computational cost of pure DSMC. At each time step, based on a local Knudsen number criterion, the discrete DSMC collision operator is dynamically switched to the Fokker-Planck operator, which is based on the integration of continuous stochastic processes in time, and has fixed computational cost per particle, rather than per collision. In this contribution, we present an extension of the previous implementation with automatic local mesh refinement and parallel load-balancing. In particular, we show how the properties of discrete approximations to space-filling curves enable an efficient implementation. Exemplary numerical studies highlight the capabilities of the new code.
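One common way to use a discrete space-filling curve for load balancing is to order cells along the curve and cut the ordered list into contiguous pieces of similar work. The sketch below uses a 2-D Morton (Z-order) key as the curve; the abstract does not say which curve the authors use, and the function names and the particle-count weights are hypothetical.

```python
def morton_key_2d(ix, iy, bits=16):
    """Interleave the bits of integer cell coordinates (ix, iy) into a Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key


def partition_by_curve(cells, weights, n_ranks):
    """Assign cells to ranks as contiguous, similarly weighted segments of the curve.

    cells   : list of (ix, iy) integer cell coordinates
    weights : per-cell work estimate (e.g. particle count), total must be positive
    Returns a list with the owning rank of each cell, in the original cell order.
    """
    order = sorted(range(len(cells)), key=lambda i: morton_key_2d(*cells[i]))
    total = float(sum(weights))
    owner = [0] * len(cells)
    running = 0.0
    for i in order:
        # Place the cell by the midpoint of its weight interval along the curve.
        mid = running + 0.5 * weights[i]
        owner[i] = min(int(mid / total * n_ranks), n_ranks - 1)
        running += weights[i]
    return owner


cells = [(x, y) for x in range(4) for y in range(4)]
weights = [1.0] * len(cells)
owner = partition_by_curve(cells, weights, n_ranks=4)
```

Because neighbouring keys tend to be neighbouring cells, each rank receives a spatially compact block, and rebalancing after refinement only requires recomputing the cut points along the curve.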
Implementation of Kane's Method for a Spacecraft Composed of Multiple Rigid Bodies

NASA Technical Reports Server (NTRS)

Stoneking, Eric T.

2013-01-01

Equations of motion are derived for a general spacecraft composed of rigid bodies connected via rotary (spherical or gimballed) joints in a tree topology. Several supporting concepts are developed in depth. Basis dyads aid in the transition from basis-free vector equations to component-wise equations. Joint partials allow abstraction of 1-DOF, 2-DOF, 3-DOF gimballed and spherical rotational joints to a common notation. The basic building block consisting of an "inner" body and an "outer" body connected by a joint enables efficient organization of arbitrary tree structures. Kane's equation is recast in a form which facilitates systematic assembly of large systems of equations, and exposes a relationship of Kane's equation to Newton and Euler's equations which is obscured by the usual presentation. The resulting system of dynamic equations is of minimum dimension, and is suitable for numerical solution by computer. Implementation is discussed, and illustrative simulation results are presented.
Efficient estimation of the maximum metabolic productivity of batch systems

DOE PAGES

St. John, Peter C.; Crowley, Michael F.; Bomble, Yannick J.

2017-01-31

Production of chemicals from engineered organisms in a batch culture involves an inherent trade-off between productivity, yield, and titer. Existing strategies for strain design typically focus on designing mutations that achieve the highest yield possible while maintaining growth viability. While these methods are computationally tractable, an optimum productivity could be achieved by a dynamic strategy in which the intracellular division of resources is permitted to change with time. New methods for the design and implementation of dynamic microbial processes, both computational and experimental, have therefore been explored to maximize productivity. However, solving for the optimal metabolic behavior under the assumption that all fluxes in the cell are free to vary is a challenging numerical task. Here, previous studies have therefore typically focused on simpler strategies that are more feasible to implement in practice, such as the time-dependent control of a single flux or control variable.
Numerical Approximation of Elasticity Tensor Associated With Green-Naghdi Rate.

PubMed

Liu, Haofei; Sun, Wei

2017-08-01

Objective stress rates are often used in commercial finite element (FE) programs. However, deriving a consistent tangent modulus tensor (also known as elasticity tensor or material Jacobian) associated with the objective stress rates is challenging when complex material models are utilized. In this paper, an approximation method for the tangent modulus tensor associated with the Green-Naghdi rate of the Kirchhoff stress is employed to simplify the evaluation process. The effectiveness of the approach is demonstrated through the implementation of two user-defined fiber-reinforced hyperelastic material models. Comparisons between the approximation method and the closed-form analytical method demonstrate that the former can simplify the material Jacobian evaluation with satisfactory accuracy while retaining its computational efficiency. Moreover, since the approximation method is independent of material models, it can facilitate the implementation of complex material models in FE analysis using shell/membrane elements in Abaqus.
Nonlinear power flow feedback control for improved stability and performance of airfoil sections

DOEpatents

Wilson, David G.; Robinett, III, Rush D.

2013-09-03

A computer-implemented method of determining the pitch stability of an airfoil system, comprising using a computer to numerically integrate a differential equation of motion that includes terms describing PID controller action. In one model, the differential equation characterizes the time-dependent response of the airfoil's pitch angle, α. The computer model calculates limit-cycles of the model, which represent the stability boundaries of the airfoil system. Once the stability boundary is known, feedback control can be implemented, by using, for example, a PID controller to control a feedback actuator. The method allows the PID controller gain constants, K_I, K_p, and K_d, to be optimized. This permits operation closer to the stability boundaries, while preventing the physical apparatus from unintentionally crossing the stability boundaries. Operating closer to the stability boundaries permits greater power efficiencies to be extracted from the airfoil system.
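The idea of numerically integrating a pitch equation with PID terms can be sketched as follows. The patent's actual nonlinear equation of motion is not given in the abstract, so this uses a linear second-order stand-in (inertia, damping, stiffness) with assumed coefficient values; it therefore shows decay or growth of the pitch angle rather than true limit cycles. Gains are named after the K_p, K_I, K_d of the abstract.

```python
def simulate_pitch(Kp, Ki, Kd, alpha0=0.1, t_end=20.0, dt=1e-3,
                   inertia=1.0, damping=0.05, stiffness=1.0):
    """Integrate inertia*a'' + damping*a' + stiffness*a = u with a PID torque u.

    The model is a hypothetical linear stand-in for the airfoil pitch dynamics.
    State: pitch angle a, pitch rate w, integral of the error z; target angle 0.
    Returns the pitch-angle history, one value per time step.
    """
    a, w, z = alpha0, 0.0, 0.0
    history = [a]
    steps = int(round(t_end / dt))
    for _ in range(steps):
        u = -(Kp * a + Ki * z + Kd * w)   # PID action on the error e = a - 0
        acc = (u - damping * w - stiffness * a) / inertia
        w += dt * acc                     # semi-implicit Euler: rate first,
        a += dt * w                       # then angle with the updated rate
        z += dt * a                       # running integral of the error
        history.append(a)
    return history


stable = simulate_pitch(Kp=2.0, Ki=0.5, Kd=1.0)
unstable = simulate_pitch(Kp=0.0, Ki=0.0, Kd=-1.0)  # negative net damping
```

Sweeping the gains and checking whether the response decays or grows gives a crude picture of a stability boundary; the limit-cycle analysis described in the patent would require its nonlinear terms.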
Probabilistic Structural Analysis Methods (PSAM) for select space propulsion system components, part 2

NASA Technical Reports Server (NTRS)

1991-01-01

The technical effort and computer code enhancements performed during the sixth year of the Probabilistic Structural Analysis Methods program are summarized. Various capabilities are described to probabilistically combine structural response and structural resistance to compute component reliability. A library of structural resistance models is implemented in the Numerical Evaluations of Stochastic Structures Under Stress (NESSUS) code that included fatigue, fracture, creep, multi-factor interaction, and other important effects. In addition, a user interface was developed for user-defined resistance models. An accurate and efficient reliability method was developed and was successfully implemented in the NESSUS code to compute component reliability based on user-selected response and resistance models. A risk module was developed to compute component risk with respect to cost, performance, or user-defined criteria. The new component risk assessment capabilities were validated and demonstrated using several examples. Various supporting methodologies were also developed in support of component risk assessment.
A parallel graded-mesh FDTD algorithm for human-antenna interaction problems.

PubMed

Catarinucci, Luca; Tarricone, Luciano

2009-01-01

The finite difference time domain method (FDTD) is frequently used for the numerical solution of a wide variety of electromagnetic (EM) problems and, among them, those concerning human exposure to EM fields. In many practical cases related to the assessment of occupational EM exposure, large simulation domains are modeled and high space resolution adopted, so that strong memory and central processing unit power requirements have to be satisfied. To better afford the computational effort, the use of parallel computing is a winning approach; alternatively, subgridding techniques are often implemented. However, the simultaneous use of subgridding schemes and parallel algorithms is very new. In this paper, an easy-to-implement and highly-efficient parallel graded-mesh (GM) FDTD scheme is proposed and applied to human-antenna interaction problems, demonstrating its appropriateness in dealing with complex occupational tasks and showing its capability to guarantee the advantages of a traditional subgridding technique without affecting the parallel FDTD performance.
Development of a Linearized Unsteady Euler Analysis with Application to Wake/Blade-Row Interactions

NASA Technical Reports Server (NTRS)

Verdon, Joseph M.; Montgomery, Matthew D.; Chuang, H. Andrew

1999-01-01

A three-dimensional, linearized, Euler analysis is being developed to provide a comprehensive and efficient unsteady aerodynamic analysis for predicting the aeroacoustic and aeroelastic responses of axial-flow turbomachinery blading. The mathematical models needed to describe nonlinear and linearized, inviscid, unsteady flows through a blade row operating within a cylindrical annular duct are presented in this report. A numerical model for linearized inviscid unsteady flows, which couples a near-field, implicit, wave-split, finite volume analysis to far-field eigen analyses, is also described. The linearized aerodynamic and numerical models have been implemented into the three-dimensional unsteady flow code, LINFLUX. This code is applied herein to predict unsteady subsonic flows driven by wake or vortical excitations. The intent is to validate the LINFLUX analysis via numerical results for simple benchmark unsteady flows and to demonstrate this analysis via application to a realistic wake/blade-row interaction. Detailed numerical results for a three-dimensional version of the 10th Standard Cascade and a fan exit guide vane indicate that LINFLUX is becoming a reliable and useful unsteady aerodynamic prediction capability that can be applied, in the future, to assess the three-dimensional flow physics important to blade-row, aeroacoustic and aeroelastic responses.
Impact of eliminating fracture intersection nodes in multiphase compositional flow simulation

NASA Astrophysics Data System (ADS)

Walton, Kenneth M.; Unger, Andre J. A.; Ioannidis, Marios A.; Parker, Beth L.

2017-04-01

Algebraic elimination of nodes at discrete fracture intersections via the star-delta technique has proven to be a valuable tool for making multiphase numerical simulations more tractable and efficient. This study examines the assumptions of the star-delta technique and exposes its effects in a 3-D, multiphase context for advective and dispersive/diffusive fluxes. Key issues of relative permeability-saturation-capillary pressure (kr-S-Pc) and capillary barriers at fracture-fracture intersections are discussed. This study uses a multiphase compositional, finite difference numerical model in discrete fracture network (DFN) and discrete fracture-matrix (DFM) modes. It verifies that the numerical model replicates analytical solutions and performs adequately in convergence exercises (conservative and decaying tracer, one and two-phase flow, DFM and DFN domains). The study culminates in simulations of a two-phase laboratory experiment in which a fluid invades a simple fracture intersection. The experiment and simulations evoke different invading fluid flow paths by varying fracture apertures as oil invades water-filled fractures and as water invades air-filled fractures. Results indicate that the node elimination technique as implemented in numerical model correctly reproduces the long-term flow path of the invading fluid, but that short-term temporal effects of the capillary traps and barriers arising from the intersection node are lost.
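For linear, single-phase steady flow, the star-delta elimination reduces to a simple formula: a central node joined to neighbours by conductances g_i is replaced by direct connections with G_ij = g_i g_j / Σ g_k. The sketch below checks that the flux leaving one neighbour is unchanged by the elimination. It is only the textbook linear case with made-up conductances and potentials; the multiphase effects discussed in the abstract are exactly what this simplification leaves out.

```python
def star_to_mesh(g):
    """Replace a star of conductances g[i] (neighbour i to the centre node)
    by pairwise conductances between neighbours: G_ij = g_i * g_j / sum(g)."""
    total = sum(g)
    n = len(g)
    return {(i, j): g[i] * g[j] / total
            for i in range(n) for j in range(i + 1, n)}


def flux_star(g, p, i):
    """Flux leaving neighbour i through the star; the centre potential follows
    from mass balance at the centre node (no storage, no source)."""
    p_centre = sum(gk * pk for gk, pk in zip(g, p)) / sum(g)
    return g[i] * (p[i] - p_centre)


def flux_mesh(G, p, i):
    """Flux leaving neighbour i through the eliminated (mesh) network."""
    total = 0.0
    for (a, b), cond in G.items():
        if a == i:
            total += cond * (p[a] - p[b])
        elif b == i:
            total += cond * (p[b] - p[a])
    return total


g = [1.0, 2.0, 3.0]   # hypothetical fracture-segment conductances
p = [1.0, 0.0, 0.5]   # hypothetical potentials at the three neighbours
G = star_to_mesh(g)
```

Here `flux_star(g, p, 0)` and `flux_mesh(G, p, 0)` agree, which is why the elimination is exact for steady linear flow while any storage or capillary behaviour at the removed node is lost.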
Implementing a GPU-based numerical algorithm for modelling dynamics of a high-speed train

NASA Astrophysics Data System (ADS)

Sytov, E. S.; Bratus, A. S.; Yurchenko, D.

2018-04-01

This paper discusses the initiative of implementing a GPU-based numerical algorithm for studying various phenomena associated with the dynamics of high-speed railway transport. The proposed numerical algorithm for calculating a critical speed of the bogie is based on the first Lyapunov number. The numerical algorithm is validated by analytical results derived for a simple model. A dynamic model of a carriage connected to a new dual-wheelset flexible bogie is studied for linear and dry friction damping. Numerical results obtained by CPU, MPU and GPU approaches are compared, and the appropriateness of these methods is discussed.
Wavelet-based Adaptive Mesh Refinement Method for Global Atmospheric Chemical Transport Modeling

NASA Astrophysics Data System (ADS)

Rastigejev, Y.

2011-12-01

Numerical modeling of global atmospheric chemical transport presents enormous computational difficulties, associated with simulating a wide range of time and spatial scales. These difficulties are exacerbated by the fact that hundreds of chemical species and thousands of chemical reactions are typically used to describe the chemical kinetic mechanism. These computational requirements very often force researchers to use relatively crude quasi-uniform numerical grids with inadequate spatial resolution that introduces significant numerical diffusion into the system. It was shown that this spurious diffusion significantly distorts the pollutant mixing and transport dynamics for typically used grid resolutions. The described numerical difficulties have to be systematically addressed considering that the demand for fast, high-resolution chemical transport models will be exacerbated over the next decade by the need to interpret satellite observations of tropospheric ozone and related species. In this study we offer a dynamically adaptive multilevel Wavelet-based Adaptive Mesh Refinement (WAMR) method for numerical modeling of atmospheric chemical evolution equations. The adaptive mesh refinement is performed by adding and removing finer levels of resolution in the locations of fine scale development and in the locations of smooth solution behavior, respectively. The algorithm is based on mathematically well established wavelet theory. This allows us to provide error estimates of the solution that are used in conjunction with an appropriate threshold criterion to adapt the non-uniform grid. Other essential features of the numerical algorithm include: an efficient wavelet spatial discretization that minimizes the number of degrees of freedom for a prescribed accuracy, a fast algorithm for computing wavelet amplitudes, and efficient and accurate derivative approximations on an irregular grid.
The method has been tested for a variety of benchmark problems including numerical simulation of transpacific traveling pollution plumes. The generated pollution plumes are diluted due to turbulent mixing as they are advected downwind. Despite this dilution, it was recently discovered that pollution plumes in the remote troposphere can preserve their identity as well-defined structures for two weeks or more as they circle the globe. Present Global Chemical Transport Models (CTMs) implemented for quasi-uniform grids are completely incapable of reproducing these layered structures due to high numerical plume dilution caused by numerical diffusion combined with non-uniformity of atmospheric flow. It is shown that WAMR algorithm solutions of accuracy comparable to that of conventional numerical techniques are obtained with more than an order of magnitude reduction in the number of grid points; therefore, the adaptive algorithm is capable of producing accurate results at a relatively low computational cost. The numerical simulations demonstrate that the WAMR algorithm applied to the traveling plume problem accurately reproduces the plume dynamics, unlike conventional numerical methods that utilize quasi-uniform numerical grids.
Three-Dimensional Stereoscopic Tracking Velocimetry and Experimental/Numerical Comparison of Directional Solidification

NASA Technical Reports Server (NTRS)

Lee, David; Ge, Yi; Cha, Soyoung Stephen; Ramachandran, Narayanan; Rose, M. Franklin (Technical Monitor)

2001-01-01

Measurement of three-dimensional (3-D) three-component velocity fields is of great importance in both ground and space experiments for understanding materials processing and fluid physics. The experiments in these fields most likely inhibit the application of conventional planar probes for observing 3-D phenomena. Here, we present the investigation results of stereoscopic tracking velocimetry (STV) for measuring 3-D velocity fields, which include diagnostic technology development, experimental velocity measurement, and comparison with analytical and numerical computation. STV is advantageous in system simplicity for building compact hardware and in software efficiency for continual near-real-time monitoring. It has great freedom in illuminating and observing volumetric fields from arbitrary directions. STV is based on stereoscopic observation of particles seeded in a flow by CCD sensors. In the approach, part of the individual particle images that provide data points is likely to be lost or cause errors when their images overlap and crisscross each other, especially under a high particle density. In order to maximize the valid recovery of data points, neural networks are implemented for these two important processes. For the step of particle overlap decomposition, the back propagation neural network is utilized because of its ability in pattern recognition with pertinent particle image feature parameters. For the step of particle tracking, the Hopfield neural network is employed to find appropriate particle tracks based on global optimization. Our investigation indicates that the neural networks are very efficient and useful for stereoscopically tracking particles. As an initial assessment of the diagnostic technology performance, laminar water jets with and without pulsation are measured. The jet tip velocity profiles are in good agreement with analytical predictions.
Finally, for testing in material processing applications, a simple directional solidification apparatus is built for experimenting with a metal analog of succinonitrile. Its 3-D velocity field at the liquid phase is then measured to be compared with those from numerical computation. Our theoretical, numerical, and experimental investigations have proven STV to be a viable candidate for reliably measuring 3-D flow velocities. With current activities focused on further improving the processing efficiency, overall accuracy, and automation, the eventual efforts of broad experimental applications and concurrent numerical modeling validation will be vital to many areas in fluid flow and materials processing.
The MeqTrees software system and its use for third-generation calibration of radio interferometers

NASA Astrophysics Data System (ADS)

Noordam, J. E.; Smirnov, O. M.

2010-12-01

Context. The formulation of the radio interferometer measurement equation (RIME) for a generic radio telescope by Hamaker et al. has provided us with an elegant mathematical apparatus for better understanding, simulation and calibration of existing and future instruments. The calibration of the new radio telescopes (LOFAR, SKA) would be unthinkable without the RIME formalism, and new software to exploit it. Aims: The MeqTrees software system is designed to implement numerical models, and to solve for arbitrary subsets of their parameters. It may be applied to many problems, but was originally geared towards implementing Measurement Equations in radio astronomy for the purposes of simulation and calibration. The technical goal of MeqTrees is to provide a tool for rapid implementation of such models, while offering performance comparable to hand-written code. We are also pursuing the wider goal of increasing the rate of evolution of radio astronomical software, by offering a tool that facilitates rapid experimentation, and exchange of ideas (and scripts). Methods: MeqTrees is implemented as a Python-based front-end called the meqbrowser, and an efficient (C++-based) computational back-end called the meqserver. Numerical models are defined on the front-end via a Python-based Tree Definition Language (TDL), then rapidly executed on the back-end. The use of TDL facilitates an extremely short turn-around time (hours rather than weeks or months) for experimentation with new ideas. This is also helped by unprecedented visualization capabilities for all final and intermediate results. A flexible data model and a number of important optimizations in the back-end ensure that the numerical performance is comparable to that of hand-written code. Results: MeqTrees is already widely used as the simulation tool for new instruments (LOFAR, SKA) and technologies (focal plane arrays).
It has demonstrated that it can achieve a noise-limited dynamic range in excess of a million, on WSRT data. It is the only package that is specifically designed to handle what we propose to call third-generation calibration (3GC), which is needed for the new generation of giant radio telescopes, but can also improve the calibration of existing instruments.
Implementation of ternary Shor’s algorithm based on vibrational states of an ion in anharmonic potential

NASA Astrophysics Data System (ADS)

Liu, Wei; Chen, Shu-Ming; Zhang, Jian; Wu, Chun-Wang; Wu, Wei; Chen, Ping-Xing

2015-03-01

It is widely believed that Shor’s factoring algorithm provides a driving force to boost the quantum computing research. However, a serious obstacle to its binary implementation is the large number of quantum gates. Non-binary quantum computing is an efficient way to reduce the required number of elemental gates. Here, we propose optimization schemes for Shor’s algorithm implementation and take a ternary version for factorizing 21 as an example. The optimized factorization is achieved by a two-qutrit quantum circuit, which consists of only two single qutrit gates and one ternary controlled-NOT gate. This two-qutrit quantum circuit is then encoded into the nine lower vibrational states of an ion trapped in a weakly anharmonic potential. Optimal control theory (OCT) is employed to derive the manipulation electric field for transferring the encoded states. The ternary Shor’s algorithm can be implemented in one single step. Numerical simulation results show that the accuracy of the state transformations is about 0.9919. Project supported by the National Natural Science Foundation of China (Grant No. 61205108) and the High Performance Computing (HPC) Foundation of National University of Defense Technology, China.
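For orientation, the classical side of Shor's algorithm for N = 21 is short enough to write out. The quantum circuit described above supplies the order r of a base a modulo N; the sketch below finds r by brute force instead and then applies the usual gcd post-processing. The base a = 2 is an assumption chosen for illustration, since the abstract does not state which base the authors use.

```python
from math import gcd


def multiplicative_order(a, n):
    """Smallest r > 0 with a**r % n == 1 (brute force stand-in for the quantum step)."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime")
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r


def factors_from_order(a, n):
    """Classical post-processing of Shor's algorithm for base a and modulus n.

    Returns a sorted pair of non-trivial factors, or None if this base fails
    (odd order, or a**(r/2) congruent to -1 modulo n).
    """
    r = multiplicative_order(a, n)
    if r % 2 == 1:
        return None
    half = pow(a, r // 2, n)
    if half == n - 1:
        return None
    return tuple(sorted((gcd(half - 1, n), gcd(half + 1, n))))


order = multiplicative_order(2, 21)   # powers of 2 mod 21: 2, 4, 8, 16, 11, 1
factors = factors_from_order(2, 21)   # gcd(8 - 1, 21) = 7 and gcd(8 + 1, 21) = 3
```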
DOE Office of Scientific and Technical Information (OSTI.GOV)

Besse, Nicolas; Latu, Guillaume; Ghizzo, Alain

In this paper we present a new method for the numerical solution of the relativistic Vlasov-Maxwell system on a phase-space grid using an adaptive semi-Lagrangian method. The adaptivity is performed through a wavelet multiresolution analysis, which gives a powerful and natural refinement criterion based on the local measurement of the approximation error and regularity of the distribution function. Therefore, the multiscale expansion of the distribution function yields a sparse representation of the data and thus saves memory space and CPU time. We apply this numerical scheme to reduced Vlasov-Maxwell systems arising in laser-plasma physics. Interaction of relativistically strong laser pulses with overdense plasma slabs is investigated. These Vlasov simulations revealed a rich variety of phenomena associated with the fast particle dynamics induced by electromagnetic waves, such as electron trapping, particle acceleration, and electron plasma wavebreaking. However, the wavelet based adaptive method that we developed here does not yield significant improvements compared to Vlasov solvers on a uniform mesh, due to the substantial overhead that the method introduces. Nonetheless, it might be a first step towards more efficient adaptive solvers based on different ideas for the grid refinement or on a more efficient implementation. Here the Vlasov simulations are performed in a two-dimensional phase-space where the development of thin filaments, strongly amplified by relativistic effects, requires an important increase of the total number of points of the phase-space grid as they get finer as time goes on. The adaptive method could be more useful in cases where these thin filaments that need to be resolved are a very small fraction of the hyper-volume, which arises in higher dimensions because of the surface-to-volume scaling and the essentially one-dimensional structure of the filaments.
Moreover, the main way to improve the efficiency of the adaptive method is to increase the local character in phase-space of the numerical scheme, by considering multiscale reconstruction with more compact support and by replacing the semi-Lagrangian method with numerical schemes that are more local in space, such as compact finite difference schemes, the discontinuous Galerkin method or finite element residual schemes, which are well suited for parallel domain decomposition techniques.
SIM_EXPLORE: Software for Directed Exploration of Complex Systems

NASA Technical Reports Server (NTRS)

Burl, Michael; Wang, Esther; Enke, Brian; Merline, William J.

2013-01-01

Physics-based numerical simulation codes are widely used in science and engineering to model complex systems that would be infeasible to study otherwise. While such codes may provide the highest- fidelity representation of system behavior, they are often so slow to run that insight into the system is limited. Trying to understand the effects of inputs on outputs by conducting an exhaustive grid-based sweep over the input parameter space is simply too time-consuming. An alternative approach called "directed exploration" has been developed to harvest information from numerical simulators more efficiently. The basic idea is to employ active learning and supervised machine learning to choose cleverly at each step which simulation trials to run next based on the results of previous trials. SIM_EXPLORE is a new computer program that uses directed exploration to explore efficiently complex systems represented by numerical simulations. The software sequentially identifies and runs simulation trials that it believes will be most informative given the results of previous trials. The results of new trials are incorporated into the software's model of the system behavior. The updated model is then used to pick the next round of new trials. This process, implemented as a closed-loop system wrapped around existing simulation code, provides a means to improve the speed and efficiency with which a set of simulations can yield scientifically useful results. The software focuses on the case in which the feedback from the simulation trials is binary-valued, i.e., the learner is only informed of the success or failure of the simulation trial to produce a desired output. The software offers a number of choices for the supervised learning algorithm (the method used to model the system behavior given the results so far) and a number of choices for the active learning strategy (the method used to choose which new simulation trials to run given the current behavior model). 
The software also makes use of the LEGION distributed computing framework to leverage the power of a set of compute nodes. The approach has been demonstrated on a planetary science application in which numerical simulations are used to study the formation of asteroid families.
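The closed loop described above (fit a model to the trials so far, pick the most informative next trial, run it, repeat) can be sketched in a few lines. This is not SIM_EXPLORE's algorithm: the learner here is a one-nearest-neighbour rule, the selection strategy is a simple uncertainty heuristic, and the "simulator" is a hypothetical threshold function on a single input parameter. It only illustrates how binary success/failure feedback can steer trials toward the boundary between outcomes.

```python
def run_directed_exploration(simulate, candidates, seeds, n_trials):
    """Actively choose which simulation trials to run, using binary feedback.

    simulate(x) : expensive trial, returns True (success) or False (failure)
    candidates  : pool of input values that may be tried
    seeds       : initial inputs to run before active selection starts
    Returns a dict mapping every tried input to its outcome.
    """
    labeled = {x: simulate(x) for x in seeds}
    for _ in range(n_trials):
        pool = [x for x in candidates if x not in labeled]
        succ = [x for x, ok in labeled.items() if ok]
        fail = [x for x, ok in labeled.items() if not ok]
        if not pool or not succ or not fail:
            break

        def ambiguity(x):
            # A 1-NN model is least certain where the nearest success and the
            # nearest failure are about equally far away.
            d_succ = min(abs(x - s) for s in succ)
            d_fail = min(abs(x - f) for f in fail)
            return abs(d_succ - d_fail)

        x_next = min(pool, key=ambiguity)
        labeled[x_next] = simulate(x_next)
    return labeled


candidates = [i / 100 for i in range(101)]
results = run_directed_exploration(
    simulate=lambda x: x > 0.625,   # hypothetical hidden success boundary
    candidates=candidates,
    seeds=[0.0, 1.0],
    n_trials=10,
)
largest_failure = max(x for x, ok in results.items() if not ok)
smallest_success = min(x for x, ok in results.items() if ok)
```

With 12 trials in total the boundary is bracketed between adjacent grid values, whereas an exhaustive sweep of the same grid would need 101 runs.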
Stability of finite difference numerical simulations of acoustic logging-while-drilling with different perfectly matched layer schemes

NASA Astrophysics Data System (ADS)

Wang, Hua; Tao, Guo; Shang, Xue-Feng; Fang, Xin-Ding; Burns, Daniel R.

2013-12-01

In acoustic logging-while-drilling (ALWD) finite difference in time domain (FDTD) simulations, a large drill collar occupies most of the fluid-filled borehole and divides the borehole fluid into two thin fluid columns (radius ~27 mm). Fine grids and large computational models are required to model the thin fluid region between the tool and the formation. As a result, a small time step and more iterations are needed, which increases the cumulative numerical error. Furthermore, due to the high impedance contrast between the drill collar and the fluid in the borehole (the difference is >30 times), the stability and efficiency of the perfectly matched layer (PML) scheme are critical to simulating complicated wave modes accurately. In this paper, we compared four different PML implementations in a staggered grid finite difference in time domain (FDTD) scheme in the ALWD simulation, including field-splitting PML (SPML), multiaxial PML (M-PML), non-splitting PML (NPML), and complex frequency-shifted PML (CFS-PML). The comparison indicated that NPML and CFS-PML can absorb the guided wave reflection from the computational boundaries more efficiently than SPML and M-PML. For large simulation times, SPML, M-PML, and NPML are numerically unstable. However, the stability of M-PML can be improved further to some extent. Based on the analysis, we proposed that the CFS-PML method be used in FDTD to eliminate the numerical instability and to improve the efficiency of absorption in the PML layers for LWD modeling. The optimal values of CFS-PML parameters in the LWD simulation were investigated based on thousands of 3D simulations. For typical LWD cases, the best maximum value of the quadratic damping profile was obtained using one d0. The optimal parameter space for the maximum value of the linear frequency-shifted factor (α0) and the scaling factor (β0) depended on the thickness of the PML layer.
For typical formations, if the PML thickness is 10 grid points, the global error can be reduced to <1% using the optimal PML parameters, and the error will decrease as the PML thickness increases.
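The three CFS-PML parameters named above enter through the complex coordinate-stretching function, which in its usual form is s(ω) = β + d / (α + iω). The sketch below evaluates that function for per-cell profiles: d is quadratic in depth and α is linear, as the abstract states, while the direction of the α ramp, the quadratic β ramp and all numerical values are assumptions made for illustration rather than the paper's tuned choices.

```python
def cfs_pml_profiles(n_layers, d0, alpha0, beta0):
    """Per-cell CFS-PML parameters (d, alpha, beta) across a layer of n_layers cells.

    d     : quadratic damping profile with maximum value d0 at the outer edge
    alpha : linear frequency-shift profile, alpha0 at the interface, 0 at the
            outer edge (assumed direction)
    beta  : scaling profile rising from 1 to beta0 (assumed quadratic)
    """
    profiles = []
    for i in range(n_layers):
        xi = (i + 0.5) / n_layers          # normalized depth into the layer
        d = d0 * xi ** 2
        alpha = alpha0 * (1.0 - xi)
        beta = 1.0 + (beta0 - 1.0) * xi ** 2
        profiles.append((d, alpha, beta))
    return profiles


def stretch(d, alpha, beta, omega):
    """Complex frequency-shifted stretching function s = beta + d / (alpha + i*omega)."""
    return beta + d / (alpha + 1j * omega)


omega = 2.0 * 3.141592653589793 * 10e3     # hypothetical 10 kHz source
layer = cfs_pml_profiles(10, d0=500.0, alpha0=100.0, beta0=2.0)
s_values = [stretch(d, a, b, omega) for d, a, b in layer]
```

Setting α0 = 0 and β0 = 1 recovers the classical stretching 1 + d/(iω), which makes clear that the frequency shift α is what keeps s bounded at low frequencies.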
A direct method for unfolding the resolution function from measurements of neutron induced reactions

NASA Astrophysics Data System (ADS)

Žugec, P.; Colonna, N.; Sabate-Gilarte, M.; Vlachoudis, V.; Massimi, C.; Lerendegui-Marco, J.; Stamatopoulos, A.; Bacak, M.; Warren, S. G.; n_TOF Collaboration

2017-12-01

The paper explores the numerical stability and the computational efficiency of a direct method for unfolding the resolution function from measurements of neutron induced reactions. A detailed resolution function formalism is laid out, followed by an overview of challenges present in a practical implementation of the method. A special matrix storage scheme is developed in order to facilitate the memory management of the resolution function matrix and to increase the computational efficiency of the matrix multiplication and decomposition procedures. Due to its admirable computational properties, a Cholesky decomposition is at the heart of the unfolding procedure. With the smallest but necessary modification of the matrix to be decomposed, the method is successfully applied to systems of size 10^5 × 10^5. However, the amplification of the uncertainties during the direct inversion procedures limits the applicability of the method to high-precision measurements of neutron induced reactions.
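As a reminder of why a Cholesky decomposition is attractive here, the sketch below factors a symmetric positive definite matrix A = L Lᵀ and solves A x = b by one forward and one backward substitution. It uses dense Python lists and a made-up 2 × 2 system; the paper's special storage scheme and the matrix modification it mentions are not reproduced.

```python
from math import sqrt


def cholesky(A):
    """Return lower-triangular L with A = L * L^T for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = A[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = sqrt(pivot)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def cholesky_solve(A, b):
    """Solve A x = b via L y = b (forward) and L^T x = y (backward)."""
    L = cholesky(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


A = [[4.0, 2.0], [2.0, 3.0]]   # hypothetical symmetric positive definite matrix
b = [10.0, 8.0]
x = cholesky_solve(A, b)
```

The failed-pivot check is where a matrix that is only nearly positive definite would be rejected, which is presumably why some modification of the matrix is needed before decomposition in practice.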

Design and construction of an impulse turbine

NASA Astrophysics Data System (ADS)

Hernández, E.

2013-11-01

An impulse turbine has been constructed for use in the Hydraulic Machines program of the Faculty of Mechanical Engineering at the Universidad Pontificia Bolivariana, sede Bucaramanga. For construction of the impulse turbine (Pelton), detailed plans were drawn up taking into account the design and implementation of the fundamental equations of hydraulic turbomachinery. The experimental data gave a maximum mechanical efficiency of 0.6 ± 0.03 at a water flow of 2.1 l/s. The maximum overall efficiency was 0.23 ± 0.02 at a water flow of 0.83 l/s. The design parameter used was a power of 1 kW. A needle-type regulator was built as the flow regulator and performed well, and the bucket (vane) model was built on a CNC (Computer Numerical Control) machine. Aluminium was used for the impeller and blades because of its chemical and physical characteristics, and the casing was manufactured in acrylic.
Algorithmically scalable block preconditioner for fully implicit shallow-water equations in CAM-SE

DOE PAGES

Lott, P. Aaron; Woodward, Carol S.; Evans, Katherine J.

2014-10-19

Performing accurate and efficient numerical simulation of global atmospheric climate models is challenging due to the disparate length and time scales over which physical processes interact. Implicit solvers enable the physical system to be integrated with a time step commensurate with processes being studied. The dominant cost of an implicit time step is the ancillary linear system solves, so we have developed a preconditioner aimed at improving the efficiency of these linear system solves. Our preconditioner is based on an approximate block factorization of the linearized shallow-water equations and has been implemented within the spectral element dynamical core within the Community Atmospheric Model (CAM-SE). Furthermore, in this paper we discuss the development and scalability of the preconditioner for a suite of test cases with the implicit shallow-water solver within CAM-SE.
Hybrid discrete/continuum algorithms for stochastic reaction networks

DOE PAGES

Safta, Cosmin; Sargsyan, Khachik; Debusschere, Bert; ...

2014-10-22

Direct solutions of the Chemical Master Equation (CME) governing Stochastic Reaction Networks (SRNs) are generally prohibitively expensive due to excessive numbers of possible discrete states in such systems. To enhance computational efficiency we develop a hybrid approach where the evolution of states with low molecule counts is treated with the discrete CME model while that of states with large molecule counts is modeled by the continuum Fokker-Planck equation. The Fokker-Planck equation is discretized using a 2nd order finite volume approach with appropriate treatment of flux components to avoid negative probability values. The numerical construction at the interface between the discrete and continuum regions implements the transfer of probability reaction by reaction according to the stoichiometry of the system. As a result, the performance of this novel hybrid approach is explored for a two-species circadian model with computational efficiency gains of about one order of magnitude.
Incorporating extrinsic noise into the stochastic simulation of biochemical reactions: A comparison of approaches

NASA Astrophysics Data System (ADS)

Thanh, Vo Hong; Marchetti, Luca; Reali, Federico; Priami, Corrado

2018-02-01

The stochastic simulation algorithm (SSA) has been widely used for simulating biochemical reaction networks. SSA is able to capture the inherent intrinsic noise of the biological system, which is due to the discreteness of species populations and to the randomness of their reciprocal interactions. However, SSA does not consider other sources of heterogeneity in biochemical reaction systems, which are referred to as extrinsic noise. Here, we extend two simulation approaches, namely, the integration-based method and the rejection-based method, to take extrinsic noise into account by allowing the reaction propensities to vary in a time- and state-dependent manner. For both methods, new efficient implementations are introduced and their efficiency and applicability to biological models are investigated. Our numerical results suggest that the rejection-based method performs better than the integration-based method when extrinsic noise is considered.
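For orientation, the baseline SSA that the paper extends can be sketched as the standard Gillespie direct method. This is a minimal sketch of the textbook algorithm with fixed, state-dependent propensities. It is not the integration-based or rejection-based extension described in the abstract, and the reaction system in the usage lines (A <-> B with made-up rate constants) is hypothetical.

```python
import random

def ssa(x0, stoich, propensities, t_end, rng):
    """Gillespie direct method.

    x0           -- initial species counts
    stoich       -- one state-change tuple per reaction
    propensities -- one function of the state per reaction
    Returns the trajectory as a list of (time, state) pairs.
    """
    t, x = 0.0, list(x0)
    path = [(t, tuple(x))]
    while True:
        a = [f(x) for f in propensities]
        a0 = sum(a)
        if a0 <= 0.0:
            break                      # no reaction can fire any more
        t += rng.expovariate(a0)       # exponential waiting time
        if t > t_end:
            break
        # Pick reaction j with probability a[j] / a0.
        r = rng.random() * a0
        acc = 0.0
        for j, aj in enumerate(a):
            acc += aj
            if r < acc:
                break
        x = [xi + d for xi, d in zip(x, stoich[j])]
        path.append((t, tuple(x)))
    return path

# Hypothetical reversible isomerisation A <-> B.
rng = random.Random(0)
stoich = [(-1, 1), (1, -1)]
rates = [lambda x: 1.0 * x[0], lambda x: 0.5 * x[1]]
trajectory = ssa((50, 0), stoich, rates, 5.0, rng)
```

Extrinsic noise in the sense of the abstract would make the propensities depend on time as well as on the state, which is exactly the case this fixed-propensity sketch does not cover.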
Operating manual for the miniservo-control tester

USGS Publications Warehouse

Rapp, W.L.

1986-01-01

Ever since the implementation of servo-control units (regular and minimodels) with manometers at U. S. Geological Survey streamflow stations, the need for an effective and efficient servo-control unit tester has been paramount among field personnel. In numerous cases, servo-control unit failures were blamed on battery failures and vice versa. There was no valid instrument to definitively identify cause of failure, let alone properly diagnose the servo-control/manometer system. In 1983, two servo-control unit testers were developed and fabricated. One was mechanical in fabrication, operation, and serviceability; the other was electronic. The testers were extensively used and evaluated in Maine, Ohio, Kansas, and Louisiana under a wide range of environmental conditions. The consensus to integrate the best aspects of both testers into one instrument allowed the Survey to finally solve its long-time need for an effective, efficient servo-control unit tester. (USGS)
An Effective Evolutionary Approach for Bicriteria Shortest Path Routing Problems

NASA Astrophysics Data System (ADS)

Lin, Lin; Gen, Mitsuo

The routing problem is one of the important research issues in the communication network field. In this paper, we consider a bicriteria shortest path routing (bSPR) model dedicated to calculating nondominated paths for (1) the minimum total cost and (2) the minimum transmission delay. To solve this bSPR problem, we propose a new multiobjective genetic algorithm (moGA) with three components: (1) an efficient chromosome representation using the priority-based encoding method; (2) a new operator for auto-tuning the GA parameters, which adaptively regulates exploration and exploitation based on the change in the average fitness of parents and offspring at each generation; and (3) an interactive adaptive-weight fitness assignment mechanism that assigns weights to each objective and combines the weighted objectives into a single objective function. Numerical experiments on network design problems of various scales show the effectiveness and efficiency of our approach in comparison with recent research.
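A priority-based chromosome stores one priority value per node, and a path is decoded by repeatedly stepping to the unvisited neighbour with the highest priority. The sketch below shows that decoding step and the evaluation of the two objectives. The exact decoding rule used in the paper may differ, and the small network, priorities, costs and delays are hypothetical.

```python
def decode_path(priority, adjacency, source, target):
    """Decode a priority-based chromosome into a path.

    Starting at source, always move to the unvisited neighbour with
    the highest priority. Returns None if a dead end is reached.
    """
    path = [source]
    visited = {source}
    node = source
    while node != target:
        candidates = [v for v in adjacency[node] if v not in visited]
        if not candidates:
            return None
        node = max(candidates, key=lambda v: priority[v])
        visited.add(node)
        path.append(node)
    return path

def path_objectives(path, cost, delay):
    """Return (total cost, total delay) summed over the path's edges."""
    edges = list(zip(path, path[1:]))
    return sum(cost[e] for e in edges), sum(delay[e] for e in edges)

# Hypothetical 4-node network.
adjacency = {1: [2, 3], 2: [4], 3: [4], 4: []}
priority = {1: 4, 2: 1, 3: 3, 4: 2}
cost = {(1, 2): 1, (1, 3): 4, (2, 4): 5, (3, 4): 1}
delay = {(1, 2): 3, (1, 3): 1, (2, 4): 1, (3, 4): 2}

path = decode_path(priority, adjacency, 1, 4)
objectives = path_objectives(path, cost, delay)
```

Because any permutation of priorities decodes to some walk through the graph, crossover and mutation can act on the priority vector directly without having to repair broken paths, apart from dead ends.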
PLUM: Parallel Load Balancing for Adaptive Unstructured Meshes

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Biswas, Rupak; Saini, Subhash (Technical Monitor)

1998-01-01

Mesh adaption is a powerful tool for efficient unstructured-grid computations but causes load imbalance among processors on a parallel machine. We present a novel method called PLUM to dynamically balance the processor workloads with a global view. This paper presents the implementation and integration of all major components within our dynamic load balancing strategy for adaptive grid calculations. Mesh adaption, repartitioning, processor assignment, and remapping are critical components of the framework that must be accomplished rapidly and efficiently so as not to cause a significant overhead to the numerical simulation. A data redistribution model is also presented that predicts the remapping cost on the SP2. This model is required to determine whether the gain from a balanced workload distribution offsets the cost of data movement. Results presented in this paper demonstrate that PLUM is an effective dynamic load balancing strategy which remains viable on a large number of processors.
Efficient isoparametric integration over arbitrary space-filling Voronoi polyhedra for electronic structure calculations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alam, Aftab; Khan, S. N.; Wilson, Brian G.

2011-07-06

A numerically efficient, accurate, and easily implemented integration scheme over convex Voronoi polyhedra (VP) is presented for use in ab initio electronic-structure calculations. We combine a weighted Voronoi tessellation with isoparametric integration via Gauss-Legendre quadratures to provide rapidly convergent VP integrals for a variety of integrands, including those with a Coulomb singularity. We showcase the capability of our approach by first applying it to an analytic charge-density model achieving machine-precision accuracy with expected convergence properties in milliseconds. For contrast, we compare our results to those using shape-functions and show our approach is greater than 10^5 times faster and 10^7 times more accurate. Furthermore, a weighted Voronoi tessellation also allows for a physics-based partitioning of space that guarantees convex, space-filling VP while reflecting accurate atomic size and site charges, as we show within KKR methods applied to Fe-Pd alloys.
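The idea of isoparametric Gauss-Legendre integration can be illustrated in two dimensions: a convex quadrilateral is mapped onto the reference square [-1, 1] x [-1, 1] with bilinear shape functions, and the integrand times the Jacobian of the map is integrated with a tensor-product Gauss-Legendre rule. This is a simplified sketch only. The paper works with three-dimensional polyhedra and singular integrands, neither of which is handled here, and the function name is hypothetical.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1].
GL_NODES = (-math.sqrt(0.6), 0.0, math.sqrt(0.6))
GL_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)

def integrate_quad(f, corners):
    """Integrate f(x, y) over a convex quadrilateral.

    corners -- four (x, y) vertices in counter-clockwise order.
    """
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    total = 0.0
    for xi, wi in zip(GL_NODES, GL_WEIGHTS):
        for eta, wj in zip(GL_NODES, GL_WEIGHTS):
            # Bilinear shape functions and their derivatives.
            shape = (0.25 * (1 - xi) * (1 - eta), 0.25 * (1 + xi) * (1 - eta),
                     0.25 * (1 + xi) * (1 + eta), 0.25 * (1 - xi) * (1 + eta))
            d_xi = (-0.25 * (1 - eta), 0.25 * (1 - eta),
                    0.25 * (1 + eta), -0.25 * (1 + eta))
            d_eta = (-0.25 * (1 - xi), -0.25 * (1 + xi),
                     0.25 * (1 + xi), 0.25 * (1 - xi))
            # Physical coordinates of the quadrature point.
            x = sum(n * c for n, c in zip(shape, xs))
            y = sum(n * c for n, c in zip(shape, ys))
            # Jacobian determinant of the reference-to-physical map.
            dx_dxi = sum(d * c for d, c in zip(d_xi, xs))
            dy_dxi = sum(d * c for d, c in zip(d_xi, ys))
            dx_deta = sum(d * c for d, c in zip(d_eta, xs))
            dy_deta = sum(d * c for d, c in zip(d_eta, ys))
            jac = dx_dxi * dy_deta - dx_deta * dy_dxi
            total += wi * wj * f(x, y) * jac
    return total

# Area of a trapezoid obtained by integrating the constant 1.
trapezoid = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 2.0)]
area = integrate_quad(lambda x, y: 1.0, trapezoid)
```

For polynomial integrands of low degree the rule is exact up to rounding, which is why the area and the first moment of the trapezoid are reproduced to machine precision.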
Suppressing spectral diffusion of emitted photons with optical pulses

DOE PAGES

Fotso, H. F.; Feiguin, A. E.; Awschalom, D. D.; ...

2016-01-22

In many quantum architectures the solid-state qubits, such as quantum dots or color centers, are interfaced via emitted photons. However, the frequency of photons emitted by solid-state systems exhibits slow uncontrollable fluctuations over time (spectral diffusion), creating a serious problem for implementation of the photon-mediated protocols. Here we show that a sequence of optical pulses applied to the solid-state emitter can stabilize the emission line at the desired frequency. We demonstrate efficiency, robustness, and feasibility of the method analytically and numerically. Taking nitrogen-vacancy center in diamond as an example, we show that only several pulses, with the width of 1 ns, separated by few ns (which is not difficult to achieve) can suppress spectral diffusion. As a result, our method provides a simple and robust way to greatly improve the efficiency of photon-mediated entanglement and/or coupling to photonic cavities for solid-state qubits.
Device research task (processing and high-efficiency solar cells)

NASA Technical Reports Server (NTRS)

1986-01-01

This task has been expanded since the last 25th Project Integration Meeting (PIM) to include process research in addition to device research. The objective of this task is to assist the Flat-plate Solar Array (FSA) Project in meeting its near- and long-term goals by identifying and implementing research in the areas of device physics, device structures, measurement techniques, material-device interactions, and cell processing. The research efforts of this task are described and reflect the diversity of device research being conducted. All of the contracts being reported are either completed or near completion and culminate the device research efforts of the FSA Project. Optimization methods and silicon solar cell numerical models, carrier transport and recombination parameters in heavily doped silicon, development and analysis of silicon solar cells of near 20% efficiency, and SiNx passivation of silicon surfaces are discussed.
Investigation of a Parabolic Iterative Solver for Three-dimensional Configurations

NASA Technical Reports Server (NTRS)

Nark, Douglas M.; Watson, Willie R.; Mani, Ramani

2007-01-01

A parabolic iterative solution procedure is investigated that seeks to extend the parabolic approximation used within the internal propagation module of the duct noise propagation and radiation code CDUCT-LaRC. The governing convected Helmholtz equation is split into a set of coupled equations governing propagation in the positive and negative directions. The proposed method utilizes an iterative procedure to solve the coupled equations in an attempt to account for possible reflections from internal bifurcations, impedance discontinuities, and duct terminations. A geometry consistent with the NASA Langley Curved Duct Test Rig is considered and the effects of acoustic treatment and non-anechoic termination are included. Two numerical implementations are studied and preliminary results indicate that improved accuracy in predicted amplitude and phase can be obtained for modes at a cut-off ratio of 1.7. Further predictions for modes at a cut-off ratio of 1.1 show improvement in predicted phase at the expense of increased amplitude error. Possible methods of improvement are suggested based on analytic and numerical analysis. It is hoped that coupling the parabolic iterative approach with less efficient, high fidelity finite element approaches will ultimately provide the capability to perform efficient, higher fidelity acoustic calculations within complex 3-D geometries for impedance eduction and noise propagation and radiation predictions.
A Gauss-Newton full-waveform inversion in PML-truncated domains using scalar probing waves

NASA Astrophysics Data System (ADS)

Pakravan, Alireza; Kang, Jun Won; Newtson, Craig M.

2017-12-01

This study considers the characterization of subsurface shear wave velocity profiles in semi-infinite media using scalar waves. Using surficial responses caused by probing waves, a reconstruction of the material profile is sought using a Gauss-Newton full-waveform inversion method in a two-dimensional domain truncated by perfectly matched layer (PML) wave-absorbing boundaries. The PML is introduced to limit the semi-infinite extent of the half-space and to prevent reflections from the truncated boundaries. A hybrid unsplit-field PML is formulated in the inversion framework to enable more efficient wave simulations than with a fully mixed PML. The full-waveform inversion method is based on a constrained optimization framework that is implemented using Karush-Kuhn-Tucker (KKT) optimality conditions to minimize the objective functional augmented by PML-endowed wave equations via Lagrange multipliers. The KKT conditions consist of state, adjoint, and control problems, and are solved iteratively to update the shear wave velocity profile of the PML-truncated domain. Numerical examples show that the developed Gauss-Newton inversion method is accurate enough and more efficient than another inversion method. The algorithm's performance is demonstrated by the numerical examples including the case of noisy measurement responses and the case of reduced number of sources and receivers.
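The Gauss-Newton update at the core of such an inversion can be shown on a toy problem. The sketch below fits a single decay parameter m in the model d(t) = exp(-m t) to data by repeatedly solving the linearised least-squares problem. It only illustrates the update m <- m - (J^T r) / (J^T J). The PML-truncated wave equations, the adjoint solves and the KKT system of the paper are not represented, and the model and data are hypothetical.

```python
import math

def gauss_newton(times, data, m0, iters=20, tol=1e-12):
    """Fit m in d(t) = exp(-m * t) by Gauss-Newton iterations."""
    m = m0
    for _ in range(iters):
        # Residual between predicted and measured responses.
        r = [math.exp(-m * t) - d for t, d in zip(times, data)]
        # Jacobian of the prediction with respect to m.
        j = [-t * math.exp(-m * t) for t in times]
        jtj = sum(v * v for v in j)            # Gauss-Newton Hessian approximation
        jtr = sum(a * b for a, b in zip(j, r))  # gradient of 0.5 * |r|^2
        step = -jtr / jtj
        m += step
        if abs(step) < tol:
            break
    return m

# Synthetic, noise-free data generated with a "true" parameter of 1.3.
times = [0.5, 1.0, 1.5, 2.0]
data = [math.exp(-1.3 * t) for t in times]
estimate = gauss_newton(times, data, m0=0.5)
```

In a full-waveform setting the scalar products above become large linear solves, and the Jacobian is never formed explicitly but applied through forward and adjoint wave simulations.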
Bayesian block-diagonal variable selection and model averaging

PubMed Central

Papaspiliopoulos, O.; Rossell, D.

2018-01-01

Summary: We propose a scalable algorithmic framework for exact Bayesian variable selection and model averaging in linear models under the assumption that the Gram matrix is block-diagonal, and as a heuristic for exploring the model space for general designs. In block-diagonal designs our approach returns the most probable model of any given size without resorting to numerical integration. The algorithm also provides a novel and efficient solution to the frequentist best subset selection problem for block-diagonal designs. Posterior probabilities for any number of models are obtained by evaluating a single one-dimensional integral, and other quantities of interest such as variable inclusion probabilities and model-averaged regression estimates are obtained by an adaptive, deterministic one-dimensional numerical integration. The overall computational cost scales linearly with the number of blocks, which can be processed in parallel, and exponentially with the block size, rendering it most adequate in situations where predictors are organized in many moderately-sized blocks. For general designs, we approximate the Gram matrix by a block-diagonal matrix using spectral clustering and propose an iterative algorithm that capitalizes on the block-diagonal algorithms to explore efficiently the model space. All methods proposed in this paper are implemented in the R library mombf. PMID:29861501
Automatic red eye correction and its quality metric

NASA Astrophysics Data System (ADS)

Safonov, Ilia V.; Rychagov, Michael N.; Kang, KiMin; Kim, Sang Ho

2008-01-01

Red eye artifacts are a troublesome defect of amateur photos. Correcting red eyes during printing without user intervention and making photos more pleasant for an observer are important tasks. A novel, efficient technique for automatic correction of red eyes, aimed at photo printers, is proposed. The algorithm is independent of face orientation and is capable of detecting paired red eyes as well as single red eyes. The approach is based on the application of 3D tables with typicalness levels for red eyes and human skin tones, and on directional edge detection filters for processing of the redness image. Machine learning is applied for feature selection. For classification of red eye regions, a cascade of classifiers including a Gentle AdaBoost committee of Classification and Regression Trees (CART) is applied. The retouching stage includes desaturation, darkening and blending with the initial image. Several versions of the implementation are possible, trading off detection and correction quality, processing time, and memory volume. A numeric quality criterion for automatic red eye correction is proposed. This quality metric is constructed by applying the Analytic Hierarchy Process (AHP) to consumer opinions about correction outcomes. The proposed numeric metric helped to choose algorithm parameters via an optimization procedure. Experimental results demonstrate high accuracy and efficiency of the proposed algorithm in comparison with existing solutions.
Optimizing Approximate Weighted Matching on Nvidia Kepler K40

DOE Office of Scientific and Technical Information (OSTI.GOV)

Naim, Md; Manne, Fredrik; Halappanavar, Mahantesh

Matching is a fundamental graph problem with numerous applications in science and engineering. While algorithms for computing optimal matchings are difficult to parallelize, approximation algorithms on the other hand generally compute high quality solutions and are amenable to parallelization. In this paper, we present efficient implementations of the current best algorithm for half-approximate weighted matching, the Suitor algorithm, on Nvidia Kepler K-40 platform. We develop four variants of the algorithm that exploit hardware features to address key challenges for a GPU implementation. We also experiment with different combinations of work assigned to a warp. Using an exhaustive set of 269 inputs, we demonstrate that the new implementation outperforms the previous best GPU algorithm by 10 to 100× for over 100 instances, and from 100 to 1000× for 15 instances. We also demonstrate up to 20× speedup relative to 2 threads, and up to 5× relative to 16 threads on Intel Xeon platform with 16 cores for the same algorithm. The new algorithms and implementations provided in this paper will have a direct impact on several applications that repeatedly use matching as a key compute kernel. Further, algorithm designs and insights provided in this paper will benefit other researchers implementing graph algorithms on modern GPU architectures.
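As background, the Suitor algorithm can be sketched serially: every vertex proposes to its heaviest neighbour whose current best offer is lighter, and a vertex that loses its standing proposal immediately looks for a new partner. The sketch below is a plain sequential version under the assumption of distinct positive edge weights (ties would need an explicit tie-breaking rule). It is not one of the four GPU variants from the paper, and the graph is hypothetical.

```python
def suitor_matching(adj):
    """Serial Suitor sketch for half-approximate weighted matching.

    adj -- symmetric dict {u: {v: weight}} with distinct positive weights.
    Returns a set of frozenset({u, v}) pairs.
    """
    suitor = {u: None for u in adj}   # best proposer seen by each vertex
    offer = {u: 0.0 for u in adj}     # weight of that proposal
    for u in adj:
        current = u
        while current is not None:
            # Heaviest neighbour that would accept current's proposal.
            partner, heaviest = None, 0.0
            for v, w in adj[current].items():
                if w > heaviest and w > offer[v]:
                    partner, heaviest = v, w
            if partner is None:
                break
            displaced = suitor[partner]
            suitor[partner] = current
            offer[partner] = heaviest
            current = displaced       # a displaced vertex proposes again
    # Matched pairs are exactly the mutual suitors.
    return {frozenset((u, v)) for u, v in suitor.items()
            if v is not None and suitor[v] == u}

# Hypothetical path graph 0 - 1 - 2 - 3 with weights 1, 3, 2.
graph = {0: {1: 1.0}, 1: {0: 1.0, 2: 3.0}, 2: {1: 3.0, 3: 2.0}, 3: {2: 2.0}}
matching = suitor_matching(graph)
```

On this path the heaviest edge (1, 2) is matched and both end vertices stay unmatched, which is the same result a greedy heaviest-edge-first matching would give.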
Implementation of the Jacobian-free Newton-Krylov method for solving the first-order ice sheet momentum balance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Salinger, Andy; Evans, Katherine J; Lemieux, Jean-Francois

2011-01-01

We have implemented the Jacobian-free Newton-Krylov (JFNK) method for solving the first-order ice sheet momentum equation in order to improve the numerical performance of the Community Ice Sheet Model (CISM), the land ice component of the Community Earth System Model (CESM). Our JFNK implementation is based on significant re-use of existing code. For example, our physics-based preconditioner uses the original Picard linear solver in CISM. For several test cases spanning a range of geometries and boundary conditions, our JFNK implementation is 1.84-3.62 times more efficient than the standard Picard solver in CISM. Importantly, this computational gain of JFNK over the Picard solver increases when refining the grid. Global convergence of the JFNK solver has been significantly improved by rescaling the equation for the basal boundary condition and through the use of an inexact Newton method. While a diverse set of test cases show that our JFNK implementation is usually robust, for some problems it may fail to converge with increasing resolution (as does the Picard solver). Globalization through parameter continuation did not remedy this problem and future work to improve robustness will explore a combination of Picard and JFNK and the use of homotopy methods.
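The "Jacobian-free" part of JFNK rests on one approximation: the Krylov solver only needs Jacobian-vector products, and these can be estimated from two residual evaluations as J(u) v ≈ (F(u + eps v) - F(u)) / eps. The sketch below shows only that product, on a made-up two-equation system. The residual function, the choice of eps and all names are assumptions for illustration and are unrelated to the CISM code.

```python
import math
import sys

def residual(u):
    """Hypothetical nonlinear residual F(u), used only for illustration."""
    x, y = u
    return [x * x + y * y - 4.0, math.exp(x) + y - 1.0]

def jfnk_matvec(f, u, v, fu=None):
    """Approximate J(u) v by a forward difference, without forming J."""
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_v == 0.0:
        return [0.0] * len(u)
    norm_u = math.sqrt(sum(c * c for c in u))
    # A common heuristic: perturbation scaled by sqrt(machine epsilon).
    eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm_u) / norm_v
    if fu is None:
        fu = f(u)
    f_pert = f([a + eps * b for a, b in zip(u, v)])
    return [(p - q) / eps for p, q in zip(f_pert, fu)]

# Compare against the analytic Jacobian [[2x, 2y], [exp(x), 1]].
u = [1.0, -1.5]
v = [0.3, -0.2]
approx = jfnk_matvec(residual, u, v)
exact = [2 * u[0] * v[0] + 2 * u[1] * v[1], math.exp(u[0]) * v[0] + v[1]]
```

Inside a Newton iteration this product would be handed to a Krylov method such as GMRES, so the cost per Krylov iteration is one extra residual evaluation and no Jacobian storage.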
Prevention of work disability due to musculoskeletal disorders: the challenge of implementing evidence.

PubMed

Loisel, Patrick; Buchbinder, Rachelle; Hazard, Rowland; Keller, Robert; Scheel, Inger; van Tulder, Maurits; Webster, Barbara

2005-12-01

The process of returning disabled workers to work presents numerous challenges. In spite of the growing evidence regarding work disability prevention, little uptake of this evidence has been observed. One reason for limited dissemination of evidence is the complexity of the problem, as it is subject to multiple legal, administrative, social, political, and cultural challenges. A literature review and collection of experts' opinion is presented, on the current evidence for work disability prevention, and barriers to evidence implementation. Recommendations are presented for enhancing implementation of research results. The current evidence regarding work disability prevention shows that some clinical interventions (advice to return to modified work and graded activity programs) and some non-clinical interventions (at a service and policy/community level but not at a practice level) are effective in reducing work absenteeism. Implementation of evidence in work disability is a major challenge because intervention recommendations are often imprecise and not yet practical for immediate use, many barriers exist, and many stakeholders are involved. Future studies should involve all relevant stakeholders and aim at developing new strategies that are effective, efficient, and have a potential for successful implementation. These studies should be based upon a clearer conceptualization of the broader context and inter-relationships that determine return to work outcomes.
Conjugate gradient minimisation approach to generating holographic traps for ultracold atoms.

PubMed

Harte, Tiffany; Bruce, Graham D; Keeling, Jonathan; Cassettari, Donatella

2014-11-03

Direct minimisation of a cost function can in principle provide a versatile and highly controllable route to computational hologram generation. Here we show that the careful design of cost functions, combined with numerically efficient conjugate gradient minimisation, establishes a practical method for the generation of holograms for a wide range of target light distributions. This results in a guided optimisation process, with a crucial advantage illustrated by the ability to circumvent optical vortex formation during hologram calculation. We demonstrate the implementation of the conjugate gradient method for both discrete and continuous intensity distributions and discuss its applicability to optical trapping of ultracold atoms.
An adiabatic linearized path integral approach for quantum time-correlation functions II: a cumulant expansion method for improving convergence.

PubMed

Causo, Maria Serena; Ciccotti, Giovanni; Bonella, Sara; Vuilleumier, Rodolphe

2006-08-17

Linearized mixed quantum-classical simulations are a promising approach for calculating time-correlation functions. At the moment, however, they suffer from some numerical problems that may compromise their efficiency and reliability in applications to realistic condensed-phase systems. In this paper, we present a method that improves upon the convergence properties of the standard algorithm for linearized calculations by implementing a cumulant expansion of the relevant averages. The effectiveness of the new approach is tested by applying it to the challenging computation of the diffusion of an excess electron in a metal-molten salt solution.
Asynchronous variational integration using continuous assumed gradient elements.

PubMed

Wolff, Sebastian; Bucher, Christian

2013-03-01

Asynchronous variational integration (AVI) is a tool which improves the numerical efficiency of explicit time stepping schemes when applied to finite element meshes with local spatial refinement. This is achieved by associating an individual time step length to each spatial domain. Furthermore, long-term stability is ensured by its variational structure. This article presents AVI in the context of finite elements based on a weakened weak form (W2) Liu (2009) [1], exemplified by continuous assumed gradient elements Wolff and Bucher (2011) [2]. The article presents the main ideas of the modified AVI, gives implementation notes and a recipe for estimating the critical time step.
