parallel numerical simulation: Topics by Science.gov

Sample records for parallel numerical simulation

A 3D staggered-grid finite difference scheme for poroelastic wave equation

NASA Astrophysics Data System (ADS)

Zhang, Yijie; Gao, Jinghuai

2014-10-01

Three dimensional numerical modeling has been a viable tool for understanding wave propagation in real media. The poroelastic media can better describe the phenomena of hydrocarbon reservoirs than acoustic and elastic media. However, the numerical modeling in 3D poroelastic media demands significantly more computational capacity, including both computational time and memory. In this paper, we present a 3D poroelastic staggered-grid finite difference (SFD) scheme. During the procedure, parallel computing is implemented to reduce the computational time. Parallelization is based on domain decomposition, and communication between processors is performed using message passing interface (MPI). Parallel analysis shows that the parallelized SFD scheme significantly improves the simulation efficiency and 3D decomposition in domain is the most efficient. We also analyze the numerical dispersion and stability condition of the 3D poroelastic SFD method. Numerical results show that the 3D numerical simulation can provide a real description of wave propagation.
A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

Bui, Trong T.

1999-01-01

A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

NASA Astrophysics Data System (ADS)

Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

2015-07-01

The numerical simulation of multiphase flow and reactive transport in the porous media on complex subsurface problem is a computationally intensive application. To meet the increasingly computational requirements, this paper presents a parallel computing method and architecture. Derived from TOUGHREACT that is a well-established code for simulating subsurface multi-phase flow and reactive transport problems, we developed a high performance computing THC-MP based on massive parallel computer, which extends greatly on the computational capability for the original code. The domain decomposition method was applied to the coupled numerical computing procedure in the THC-MP. We designed the distributed data structure, implemented the data initialization and exchange between the computing nodes and the core solving module using the hybrid parallel iterative and direct solver. Numerical accuracy of the THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from the parallel computing and sequential computing (original code). Execution efficiency and code scalability were examined through field scale carbon sequestration applications on the multicore cluster. The results demonstrate successfully the enhanced performance using the THC-MP on parallel computing facilities.
A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions: FULLY COUPLED PARALLEL SIMULATION OF HYDRAULIC FRACTURES IN 3-D

DOE PAGES

Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...

2016-09-18

This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.
A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions: FULLY COUPLED PARALLEL SIMULATION OF HYDRAULIC FRACTURES IN 3-D

DOE Office of Scientific and Technical Information (OSTI.GOV)

Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.

This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs

NASA Astrophysics Data System (ADS)

Yokota, Rio; Barba, L. A.; Narumi, Tetsu; Yasuoka, Kenji

2013-03-01

This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 40963 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as the numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube

NASA Technical Reports Server (NTRS)

Joslin, Ronald D.; Zubair, Mohammad

1993-01-01

The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube is documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can effectively be parallelized on a distributed-memory parallel machine. By increasing the number of processors nearly ideal linear speedups are achieved with nonoptimized routines; slower than linear speedups are achieved with optimized (machine dependent library) routines. This slower than linear speedup results because the Fast Fourier Transform (FFT) routine dominates the computational cost and because the routine indicates less than ideal speedups. However with the machine-dependent routines the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise wall-normal and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the amount of Cray supercomputer single processor time to complete a comparable simulation; however it is estimated that a subgrid-scale model which reduces the required number of grid points and becomes a large-eddy simulation (PSLES) would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.
A scalable parallel black oil simulator on distributed memory parallel computers

NASA Astrophysics Data System (ADS)

Wang, Kun; Liu, Hui; Chen, Zhangxin

2015-11-01

This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.
Parallel processing for nonlinear dynamics simulations of structures including rotating bladed-disk assemblies

NASA Technical Reports Server (NTRS)

Hsieh, Shang-Hsien

1993-01-01

The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.
Acoustic simulation in architecture with parallel algorithm

NASA Astrophysics Data System (ADS)

Li, Xiaohong; Zhang, Xinrong; Li, Dan

2004-03-01

In allusion to complexity of architecture environment and Real-time simulation of architecture acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in scene is solved with this method. And then the impulse response between sources and receivers at frequency segment, which are calculated with multi-process, are combined into whole frequency response. The numerical experiment shows that parallel arithmetic can improve the acoustic simulating efficiency of complex scene.
Development of a parallel FE simulator for modeling the whole trans-scale failure process of rock from meso- to engineering-scale

NASA Astrophysics Data System (ADS)

Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao

2017-01-01

Multi-scale high-resolution modeling of rock failure process is a powerful means in modern rock mechanics studies to reveal the complex failure mechanism and to evaluate engineering risks. However, multi-scale continuous modeling of rock, from deformation, damage to failure, has raised high requirements on the design, implementation scheme and computation capacity of the numerical software system. This study is aimed at developing the parallel finite element procedure, a parallel rock failure process analysis (RFPA) simulator that is capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties, deal with and represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of representative volume element by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on a Windows - Linux interactive platform. A numerical model is built to test the parallel performance of FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, and field-scale net fracture spacing and engineering-scale rock slope examples, respectively. The simulation results indicate that relatively high speedup and computation efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In laboratory-scale simulation, the well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In field-scale simulation, the formation process of net fracture spacing from initiation, propagation to saturation can be revealed completely. In engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. It is shown that the parallel FE simulator developed in this study is an efficient tool for modeling the whole trans-scale failure process of rock from meso- to engineering-scale.
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

Moitra, Stuti; Gatski, Thomas B.

1997-01-01

A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
Surrogates for numerical simulations; optimization of eddy-promoter heat exchangers

NASA Technical Reports Server (NTRS)

Patera, Anthony T.; Patera, Anthony

1993-01-01

Although the advent of fast and inexpensive parallel computers has rendered numerous previously intractable calculations feasible, many numerical simulations remain too resource-intensive to be directly inserted in engineering optimization efforts. An attractive alternative to direct insertion considers models for computational systems: the expensive simulation is evoked only to construct and validate a simplified, input-output model; this simplified input-output model then serves as a simulation surrogate in subsequent engineering optimization studies. A simple 'Bayesian-validated' statistical framework for the construction, validation, and purposive application of static computer simulation surrogates is presented. As an example, dissipation-transport optimization of laminar-flow eddy-promoter heat exchangers are considered: parallel spectral element Navier-Stokes calculations serve to construct and validate surrogates for the flowrate and Nusselt number; these surrogates then represent the originating Navier-Stokes equations in the ensuing design process.
A method for data handling numerical results in parallel OpenFOAM simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anton, Alin; Muntean, Sebastian

Parallel computational fluid dynamics simulations produce vast amount of numerical result data. This paper introduces a method for reducing the size of the data by replaying the interprocessor traffic. The results are recovered only in certain regions of interest configured by the user. A known test case is used for several mesh partitioning scenarios using the OpenFOAM toolkit{sup ®}[1]. The space savings obtained with classic algorithms remain constant for more than 60 Gb of floating point data. Our method is most efficient on large simulation meshes and is much better suited for compressing large scale simulation results than the regular algorithms.
A parallelization method for time periodic steady state in simulation of radio frequency sheath dynamics

NASA Astrophysics Data System (ADS)

Kwon, Deuk-Chul; Shin, Sung-Sik; Yu, Dong-Hun

2017-10-01

In order to reduce the computing time in simulation of radio frequency (rf) plasma sources, various numerical schemes were developed. It is well known that the upwind, exponential, and power-law schemes can efficiently overcome the limitation on the grid size for fluid transport simulations of high density plasma discharges. Also, the semi-implicit method is a well-known numerical scheme to overcome on the simulation time step. However, despite remarkable advances in numerical techniques and computing power over the last few decades, efficient multi-dimensional modeling of low temperature plasma discharges has remained a considerable challenge. In particular, there was a difficulty on parallelization in time for the time periodic steady state problems such as capacitively coupled plasma discharges and rf sheath dynamics because values of plasma parameters in previous time step are used to calculate new values each time step. Therefore, we present a parallelization method for the time periodic steady state problems by using period-slices. In order to evaluate the efficiency of the developed method, one-dimensional fluid simulations are conducted for describing rf sheath dynamics. The result shows that speedup can be achieved by using a multithreading method.
Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2

NASA Technical Reports Server (NTRS)

Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad

1995-01-01

The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that PSDNS approach can effectively be parallelized on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs to match the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the computational cost is dominated by FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 M ops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32-nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32 node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.
A Parallel Numerical Algorithm To Solve Linear Systems Of Equations Emerging From 3D Radiative Transfer

NASA Astrophysics Data System (ADS)

Wichert, Viktoria; Arkenberg, Mario; Hauschildt, Peter H.

2016-10-01

Highly resolved state-of-the-art 3D atmosphere simulations will remain computationally extremely expensive for years to come. In addition to the need for more computing power, rethinking coding practices is necessary. We take a dual approach by introducing especially adapted, parallel numerical methods and correspondingly parallelizing critical code passages. In the following, we present our respective work on PHOENIX/3D. With new parallel numerical algorithms, there is a big opportunity for improvement when iteratively solving the system of equations emerging from the operator splitting of the radiative transfer equation J = ΛS. The narrow-banded approximate Λ-operator Λ* , which is used in PHOENIX/3D, occurs in each iteration step. By implementing a numerical algorithm which takes advantage of its characteristic traits, the parallel code's efficiency is further increased and a speed-up in computational time can be achieved.
Scalability study of parallel spatial direct numerical simulation code on IBM SP1 parallel supercomputer

NASA Technical Reports Server (NTRS)

Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad

1994-01-01

The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances that are associated with laminar-to-turbulent in three-dimensional boundary-layer flows are computed with the PS-DNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing

NASA Astrophysics Data System (ADS)

Decyk, V. K.; Dauger, D. E.

We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-incell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the main stream of computing.
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.

1991-01-01

A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification-all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

Large-scale three-dimensional phase-field simulations for phase coarsening at ultrahigh volume fraction on high-performance architectures

NASA Astrophysics Data System (ADS)

Yan, Hui; Wang, K. G.; Jones, Jim E.

2016-06-01

A parallel algorithm for large-scale three-dimensional phase-field simulations of phase coarsening is developed and implemented on high-performance architectures. From the large-scale simulations, a new kinetics in phase coarsening in the region of ultrahigh volume fraction is found. The parallel implementation is capable of harnessing the greater computer power available from high-performance architectures. The parallelized code enables increase in three-dimensional simulation system size up to a 5123 grid cube. Through the parallelized code, practical runtime can be achieved for three-dimensional large-scale simulations, and the statistical significance of the results from these high resolution parallel simulations are greatly improved over those obtainable from serial simulations. A detailed performance analysis on speed-up and scalability is presented, showing good scalability which improves with increasing problem size. In addition, a model for prediction of runtime is developed, which shows a good agreement with actual run time from numerical tests.
Symplectic molecular dynamics simulations on specially designed parallel computers.

PubMed

Borstnik, Urban; Janezic, Dusanka

2005-01-01

We have developed a computer program for molecular dynamics (MD) simulation that implements the Split Integration Symplectic Method (SISM) and is designed to run on specialized parallel computers. The MD integration is performed by the SISM, which analytically treats high-frequency vibrational motion and thus enables the use of longer simulation time steps. The low-frequency motion is treated numerically on specially designed parallel computers, which decreases the computational time of each simulation time step. The combination of these approaches means that less time is required and fewer steps are needed and so enables fast MD simulations. We study the computational performance of MD simulation of molecular systems on specialized computers and provide a comparison to standard personal computers. The combination of the SISM with two specialized parallel computers is an effective way to increase the speed of MD simulations up to 16-fold over a single PC processor.
Acceleration of Radiance for Lighting Simulation by Using Parallel Computing with OpenCL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zuo, Wangda; McNeil, Andrew; Wetter, Michael

2011-09-06

We report on the acceleration of annual daylighting simulations for fenestration systems in the Radiance ray-tracing program. The algorithm was optimized to reduce both the redundant data input/output operations and the floating-point operations. To further accelerate the simulation speed, the calculation for matrix multiplications was implemented using parallel computing on a graphics processing unit. We used OpenCL, which is a cross-platform parallel programming language. Numerical experiments show that the combination of the above measures can speed up the annual daylighting simulations 101.7 times or 28.6 times when the sky vector has 146 or 2306 elements, respectively.
ANNarchy: a code generation approach to neural simulations on parallel hardware

PubMed Central

Vitay, Julien; Dinkelbach, Helge Ü.; Hamker, Fred H.

2015-01-01

Many modern neural simulators focus on the simulation of networks of spiking neurons on parallel hardware. Another important framework in computational neuroscience, rate-coded neural networks, is mostly difficult or impossible to implement using these simulators. We present here the ANNarchy (Artificial Neural Networks architect) neural simulator, which allows to easily define and simulate rate-coded and spiking networks, as well as combinations of both. The interface in Python has been designed to be close to the PyNN interface, while the definition of neuron and synapse models can be specified using an equation-oriented mathematical description similar to the Brian neural simulator. This information is used to generate C++ code that will efficiently perform the simulation on the chosen parallel hardware (multi-core system or graphical processing unit). Several numerical methods are available to transform ordinary differential equations into an efficient C++code. We compare the parallel performance of the simulator to existing solutions. PMID:26283957
Numerical study of the interaction between a head fire and a backfire propagating in grassland.

Treesearch

Dominique Morvan; Sofiane Meradji; William Mell

2011-01-01

One of the objectives of this paper was to simulate numerically the interaction between two line fires ignited in a grassland, on a flat terrain, perpendicularly to the wind direction, in such a way that the two fire fronts (a head fire and a backfire) propagated in opposite directions parallel to the wind. The numerical simulations were conducted in 3-0 using the new...
Numerical simulation of h-adaptive immersed boundary method for freely falling disks

NASA Astrophysics Data System (ADS)

Zhang, Pan; Xia, Zhenhua; Cai, Qingdong

2018-05-01

In this work, a freely falling disk with aspect ratio 1/10 is directly simulated by using an adaptive numerical model implemented on a parallel computation framework JASMIN. The adaptive numerical model is a combination of the h-adaptive mesh refinement technique and the implicit immersed boundary method (IBM). Our numerical results agree well with the experimental results in all of the six degrees of freedom of the disk. Furthermore, very similar vortex structures observed in the experiment were also obtained.
Efficient, massively parallel eigenvalue computation

NASA Technical Reports Server (NTRS)

Huo, Yan; Schreiber, Robert

1993-01-01

In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
Extension and Validation of a Hybrid Particle-Finite Element Method for Hypervelocity Impact Simulation. Chapter 2

NASA Technical Reports Server (NTRS)

Fahrenthold, Eric P.; Shivarama, Ravishankar

2004-01-01

The hybrid particle-finite element method of Fahrenthold and Horban, developed for the simulation of hypervelocity impact problems, has been extended to include new formulations of the particle-element kinematics, additional constitutive models, and an improved numerical implementation. The extended formulation has been validated in three dimensional simulations of published impact experiments. The test cases demonstrate good agreement with experiment, good parallel speedup, and numerical convergence of the simulation results.
Numerical Analysis of Dusty-Gas Flows

NASA Astrophysics Data System (ADS)

Saito, T.

2002-02-01

This paper presents the development of a numerical code for simulating unsteady dusty-gas flows including shock and rarefaction waves. The numerical results obtained for a shock tube problem are used for validating the accuracy and performance of the code. The code is then extended for simulating two-dimensional problems. Since the interactions between the gas and particle phases are calculated with the operator splitting technique, we can choose numerical schemes independently for the different phases. A semi-analytical method is developed for the dust phase, while the TVD scheme of Harten and Yee is chosen for the gas phase. Throughout this study, computations are carried out on SGI Origin2000, a parallel computer with multiple of RISC based processors. The efficient use of the parallel computer system is an important issue and the code implementation on Origin2000 is also described. Flow profiles of both the gas and solid particles behind the steady shock wave are calculated by integrating the steady conservation equations. The good agreement between the pseudo-stationary solutions and those from the current numerical code validates the numerical approach and the actual coding. The pseudo-stationary shock profiles can also be used as initial conditions of unsteady multidimensional simulations.
Scalable Domain Decomposed Monte Carlo Particle Transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

O'Brien, Matthew Joseph

2013-12-05

In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.
Parallel 3-D numerical simulation of dielectric barrier discharge plasma actuators

NASA Astrophysics Data System (ADS)

Houba, Tomas

Dielectric barrier discharge plasma actuators have shown promise in a range of applications including flow control, sterilization and ozone generation. Developing numerical models of plasma actuators is of great importance, because a high-fidelity parallel numerical model allows new design configurations to be tested rapidly. Additionally, it provides a better understanding of the plasma actuator physics which is useful for further innovation. The physics of plasma actuators is studied numerically. A loosely coupled approach is utilized for the coupling of the plasma to the neutral fluid. The state of the art in numerical plasma modeling is advanced by the development of a parallel, three-dimensional, first-principles model with detailed air chemistry. The model incorporates 7 charged species and 18 reactions, along with a solution of the electron energy equation. To the author's knowledge, a parallel three-dimensional model of a gas discharge with a detailed air chemistry model and the solution of electron energy is unique. Three representative geometries are studied using the gas discharge model. The discharge of gas between two parallel electrodes is used to validate the air chemistry model developed for the gas discharge code. The gas discharge model is then applied to the discharge produced by placing a dc powered wire and grounded plate electrodes in a channel. Finally, a three-dimensional simulation of gas discharge produced by electrodes placed inside a riblet is carried out. The body force calculated with the gas discharge model is loosely coupled with a fluid model to predict the induced flow inside the riblet.
An efficient spectral method for the simulation of dynamos in Cartesian geometry and its implementation on massively parallel computers

NASA Astrophysics Data System (ADS)

Stellmach, Stephan; Hansen, Ulrich

2008-05-01

Numerical simulations of the process of convection and magnetic field generation in planetary cores still fail to reach geophysically realistic control parameter values. Future progress in this field depends crucially on efficient numerical algorithms which are able to take advantage of the newest generation of parallel computers. Desirable features of simulation algorithms include (1) spectral accuracy, (2) an operation count per time step that is small and roughly proportional to the number of grid points, (3) memory requirements that scale linear with resolution, (4) an implicit treatment of all linear terms including the Coriolis force, (5) the ability to treat all kinds of common boundary conditions, and (6) reasonable efficiency on massively parallel machines with tens of thousands of processors. So far, algorithms for fully self-consistent dynamo simulations in spherical shells do not achieve all these criteria simultaneously, resulting in strong restrictions on the possible resolutions. In this paper, we demonstrate that local dynamo models in which the process of convection and magnetic field generation is only simulated for a small part of a planetary core in Cartesian geometry can achieve the above goal. We propose an algorithm that fulfills the first five of the above criteria and demonstrate that a model implementation of our method on an IBM Blue Gene/L system scales impressively well for up to O(104) processors. This allows for numerical simulations at rather extreme parameter values.
A hybrid parallel framework for the cellular Potts model simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, Yi; He, Kejing; Dong, Shoubin

2009-01-01

The Cellular Potts Model (CPM) has been widely used for biological simulations. However, most current implementations are either sequential or approximated, which can't be used for large scale complex 3D simulation. In this paper we present a hybrid parallel framework for CPM simulations. The time-consuming POE solving, cell division, and cell reaction operation are distributed to clusters using the Message Passing Interface (MPI). The Monte Carlo lattice update is parallelized on shared-memory SMP system using OpenMP. Because the Monte Carlo lattice update is much faster than the POE solving and SMP systems are more and more common, this hybrid approachmore » achieves good performance and high accuracy at the same time. Based on the parallel Cellular Potts Model, we studied the avascular tumor growth using a multiscale model. The application and performance analysis show that the hybrid parallel framework is quite efficient. The hybrid parallel CPM can be used for the large scale simulation ({approx}10{sup 8} sites) of complex collective behavior of numerous cells ({approx}10{sup 6}).« less
Hybrid Particle-Element Simulation of Impact on Composite Orbital Debris Shields

NASA Technical Reports Server (NTRS)

Fahrenthold, Eric P.

2004-01-01

This report describes the development of new numerical methods and new constitutive models for the simulation of hypervelocity impact effects on spacecraft. The research has included parallel implementation of the numerical methods and material models developed under the project. Validation work has included both one dimensional simulations, for comparison with exact solutions, and three dimensional simulations of published hypervelocity impact experiments. The validated formulations have been applied to simulate impact effects in a velocity and kinetic energy regime outside the capabilities of current experimental methods. The research results presented here allow for the expanded use of numerical simulation, as a complement to experimental work, in future design of spacecraft for hypervelocity impact effects.
Numerical propulsion system simulation: An interdisciplinary approach

NASA Technical Reports Server (NTRS)

Nichols, Lester D.; Chamis, Christos C.

1991-01-01

The tremendous progress being made in computational engineering and the rapid growth in computing power that is resulting from parallel processing now make it feasible to consider the use of computer simulations to gain insights into the complex interactions in aerospace propulsion systems and to evaluate new concepts early in the design process before a commitment to hardware is made. Described here is a NASA initiative to develop a Numerical Propulsion System Simulation (NPSS) capability.
Numerical propulsion system simulation - An interdisciplinary approach

NASA Technical Reports Server (NTRS)

Nichols, Lester D.; Chamis, Christos C.

1991-01-01

The tremendous progress being made in computational engineering and the rapid growth in computing power that is resulting from parallel processing now make it feasible to consider the use of computer simulations to gain insights into the complex interactions in aerospace propulsion systems and to evaluate new concepts early in the design process before a commitment to hardware is made. Described here is a NASA initiative to develop a Numerical Propulsion System Simulation (NPSS) capability.
Parallel implementation of the particle simulation method with dynamic load balancing: Toward realistic geodynamical simulation

NASA Astrophysics Data System (ADS)

Furuichi, M.; Nishiura, D.

2015-12-01

Fully Lagrangian methods such as Smoothed Particle Hydrodynamics (SPH) and Discrete Element Method (DEM) have been widely used to solve the continuum and particles motions in the computational geodynamics field. These mesh-free methods are suitable for the problems with the complex geometry and boundary. In addition, their Lagrangian nature allows non-diffusive advection useful for tracking history dependent properties (e.g. rheology) of the material. These potential advantages over the mesh-based methods offer effective numerical applications to the geophysical flow and tectonic processes, which are for example, tsunami with free surface and floating body, magma intrusion with fracture of rock, and shear zone pattern generation of granular deformation. In order to investigate such geodynamical problems with the particle based methods, over millions to billion particles are required for the realistic simulation. Parallel computing is therefore important for handling such huge computational cost. An efficient parallel implementation of SPH and DEM methods is however known to be difficult especially for the distributed-memory architecture. Lagrangian methods inherently show workload imbalance problem for parallelization with the fixed domain in space, because particles move around and workloads change during the simulation. Therefore dynamic load balance is key technique to perform the large scale SPH and DEM simulation. In this work, we present the parallel implementation technique of SPH and DEM method utilizing dynamic load balancing algorithms toward the high resolution simulation over large domain using the massively parallel super computer system. Our method utilizes the imbalances of the executed time of each MPI process as the nonlinear term of parallel domain decomposition and minimizes them with the Newton like iteration method. In order to perform flexible domain decomposition in space, the slice-grid algorithm is used. Numerical tests show that our approach is suitable for solving the particles with different calculation costs (e.g. boundary particles) as well as the heterogeneous computer architecture. We analyze the parallel efficiency and scalability on the super computer systems (K-computer, Earth simulator 3, etc.).
LASER APPLICATIONS AND OTHER TOPICS IN QUANTUM ELECTRONICS: Application of the stochastic parallel gradient descent algorithm for numerical simulation and analysis of the coherent summation of radiation from fibre amplifiers

NASA Astrophysics Data System (ADS)

Zhou, Pu; Wang, Xiaolin; Li, Xiao; Chen, Zilum; Xu, Xiaojun; Liu, Zejin

2009-10-01

Coherent summation of fibre laser beams, which can be scaled to a relatively large number of elements, is simulated by using the stochastic parallel gradient descent (SPGD) algorithm. The applicability of this algorithm for coherent summation is analysed and its optimisaton parameters and bandwidth limitations are studied.
Gyrokinetic continuum simulation of turbulence in a straight open-field-line plasma

DOE PAGES

Shi, E. L.; Hammett, G. W.; Stoltzfus-Dueck, T.; ...

2017-05-29

Here, five-dimensional gyrokinetic continuum simulations of electrostatic plasma turbulence in a straight, open-field-line geometry have been performed using a full- discontinuous-Galerkin approach implemented in the Gkeyll code. While various simplifications have been used for now, such as long-wavelength approximations in the gyrokinetic Poisson equation and the Hamiltonian, these simulations include the basic elements of a fusion-device scrape-off layer: localised sources to model plasma outflow from the core, cross-field turbulent transport, parallel flow along magnetic field lines, and parallel losses at the limiter or divertor with sheath-model boundary conditions. The set of sheath-model boundary conditions used in the model allows currentsmore » to flow through the walls. In addition to details of the numerical approach, results from numerical simulations of turbulence in the Large Plasma Device, a linear device featuring straight magnetic field lines, are presented.« less
A conservative scheme for electromagnetic simulation of magnetized plasmas with kinetic electrons

NASA Astrophysics Data System (ADS)

Bao, J.; Lin, Z.; Lu, Z. X.

2018-02-01

A conservative scheme has been formulated and verified for gyrokinetic particle simulations of electromagnetic waves and instabilities in magnetized plasmas. An electron continuity equation derived from the drift kinetic equation is used to time advance the electron density perturbation by using the perturbed mechanical flow calculated from the parallel vector potential, and the parallel vector potential is solved by using the perturbed canonical flow from the perturbed distribution function. In gyrokinetic particle simulations using this new scheme, the shear Alfvén wave dispersion relation in the shearless slab and continuum damping in the sheared cylinder have been recovered. The new scheme overcomes the stringent requirement in the conventional perturbative simulation method that perpendicular grid size needs to be as small as electron collisionless skin depth even for the long wavelength Alfvén waves. The new scheme also avoids the problem in the conventional method that an unphysically large parallel electric field arises due to the inconsistency between electrostatic potential calculated from the perturbed density and vector potential calculated from the perturbed canonical flow. Finally, the gyrokinetic particle simulations of the Alfvén waves in sheared cylinder have superior numerical properties compared with the fluid simulations, which suffer from numerical difficulties associated with singular mode structures.

MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harrison, Robert J.; Beylkin, Gregory; Bischoff, Florian A.

2016-01-01

MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
Xyce Parallel Electronic Simulator : users' guide, version 2.0.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hoekstra, Robert John; Waters, Lon J.; Rankin, Eric Lamont

2004-06-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator capable of simulating electrical circuits at a variety of abstraction levels. Primarily, Xyce has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability the current state-of-the-art in the following areas: {sm_bullet} Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers. {sm_bullet} Improved performance for allmore » numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-art algorithms and novel techniques. {sm_bullet} Device models which are specifically tailored to meet Sandia's needs, including many radiation-aware devices. {sm_bullet} A client-server or multi-tiered operating model wherein the numerical kernel can operate independently of the graphical user interface (GUI). {sm_bullet} Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing of computing platforms. These include serial, shared-memory and distributed-memory parallel implementation - which allows it to run efficiently on the widest possible number parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. One feature required by designers is the ability to add device models, many specific to the needs of Sandia, to the code. To this end, the device package in the Xyce These input formats include standard analytical models, behavioral models look-up Parallel Electronic Simulator is designed to support a variety of device model inputs. tables, and mesh-level PDE device models. Combined with this flexible interface is an architectural design that greatly simplifies the addition of circuit models. One of the most important feature of Xyce is in providing a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia now has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods) research and development can be performed. Ultimately, these capabilities are migrated to end users.« less
Substructured multibody molecular dynamics.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grest, Gary Stephen; Stevens, Mark Jackson; Plimpton, Steven James

2006-11-01

We have enhanced our parallel molecular dynamics (MD) simulation software LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator, lammps.sandia.gov) to include many new features for accelerated simulation including articulated rigid body dynamics via coupling to the Rensselaer Polytechnic Institute code POEMS (Parallelizable Open-source Efficient Multibody Software). We use new features of the LAMMPS software package to investigate rhodopsin photoisomerization, and water model surface tension and capillary waves at the vapor-liquid interface. Finally, we motivate the recipes of MD for practitioners and researchers in numerical analysis and computational mechanics.
Explicit finite-difference simulation of optical integrated devices on massive parallel computers.

PubMed

Sterkenburgh, T; Michels, R M; Dress, P; Franke, H

1997-02-20

An explicit method for the numerical simulation of optical integrated circuits by means of the finite-difference time-domain (FDTD) method is presented. This method, based on an explicit solution of Maxwell's equations, is well established in microwave technology. Although the simulation areas are small, we verified the behavior of three interesting problems, especially nonparaxial problems, with typical aspects of integrated optical devices. Because numerical losses are within acceptable limits, we suggest the use of the FDTD method to achieve promising quantitative simulation results.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Shi, E. L.; Hammett, G. W.; Stoltzfus-Dueck, T.

Here, five-dimensional gyrokinetic continuum simulations of electrostatic plasma turbulence in a straight, open-field-line geometry have been performed using a full- discontinuous-Galerkin approach implemented in the Gkeyll code. While various simplifications have been used for now, such as long-wavelength approximations in the gyrokinetic Poisson equation and the Hamiltonian, these simulations include the basic elements of a fusion-device scrape-off layer: localised sources to model plasma outflow from the core, cross-field turbulent transport, parallel flow along magnetic field lines, and parallel losses at the limiter or divertor with sheath-model boundary conditions. The set of sheath-model boundary conditions used in the model allows currentsmore » to flow through the walls. In addition to details of the numerical approach, results from numerical simulations of turbulence in the Large Plasma Device, a linear device featuring straight magnetic field lines, are presented.« less
Rapid Prediction of Unsteady Three-Dimensional Viscous Flows in Turbopump Geometries

NASA Technical Reports Server (NTRS)

Dorney, Daniel J.

1998-01-01

A program is underway to improve the efficiency of a three-dimensional Navier-Stokes code and generalize it for nozzle and turbopump geometries. Code modifications will include the implementation of parallel processing software, incorporating new physical models and generalizing the multi-block capability to allow the simultaneous simulation of nozzle and turbopump configurations. The current report contains details of code modifications, numerical results of several flow simulations and the status of the parallelization effort.
The 3-D numerical simulation research of vacuum injector for linear induction accelerator

NASA Astrophysics Data System (ADS)

Liu, Dagang; Xie, Mengjun; Tang, Xinbing; Liao, Shuqing

2017-01-01

Simulation method for voltage in-feed and electron injection of vacuum injector is given, and verification of the simulated voltage and current is carried out. The numerical simulation for the magnetic field of solenoid is implemented, and a comparative analysis is conducted between the simulation results and experimental results. A semi-implicit difference algorithm is adopted to suppress the numerical noise, and a parallel acceleration algorithm is used for increasing the computation speed. The RMS emittance calculation method of the beam envelope equations is analyzed. In addition, the simulated results of RMS emittance are compared with the experimental data. Finally, influences of the ferromagnetic rings on the radial and axial magnetic fields of solenoid as well as the emittance of beam are studied.
Parallel 3D Finite Element Numerical Modelling of DC Electron Guns

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prudencio, E.; Candel, A.; Ge, L.

2008-02-04

In this paper we present Gun3P, a parallel 3D finite element application that the Advanced Computations Department at the Stanford Linear Accelerator Center is developing for the analysis of beam formation in DC guns and beam transport in klystrons. Gun3P is targeted specially to complex geometries that cannot be described by 2D models and cannot be easily handled by finite difference discretizations. Its parallel capability allows simulations with more accuracy and less processing time than packages currently available. We present simulation results for the L-band Sheet Beam Klystron DC gun, in which case Gun3P is able to reduce simulation timemore » from days to some hours.« less
Numerical Simulation of the Flow over a Segment-Conical Body on the Basis of Reynolds Equations

NASA Astrophysics Data System (ADS)

Egorov, I. V.; Novikov, A. V.; Palchekovskaya, N. V.

2018-01-01

Numerical simulation was used to study the 3D supersonic flow over a segment-conical body similar in shape to the ExoMars space vehicle. The nonmonotone behavior of the normal force acting on the body placed in a supersonic gas flow was analyzed depending on the angle of attack. The simulation was based on the numerical solution of the unsteady Reynolds-averaged Navier-Stokes equations with a two-parameter differential turbulence model. The solution of the problem was obtained using the in-house solver HSFlow with an efficient parallel algorithm intended for multiprocessor super computers.
Obtaining identical results with double precision global accuracy on different numbers of processors in parallel particle Monte Carlo simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cleveland, Mathew A., E-mail: cleveland7@llnl.gov; Brunner, Thomas A.; Gentile, Nicholas A.

2013-10-15

We describe and compare different approaches for achieving numerical reproducibility in photon Monte Carlo simulations. Reproducibility is desirable for code verification, testing, and debugging. Parallelism creates a unique problem for achieving reproducibility in Monte Carlo simulations because it changes the order in which values are summed. This is a numerical problem because double precision arithmetic is not associative. Parallel Monte Carlo, both domain replicated and decomposed simulations, will run their particles in a different order during different runs of the same simulation because the non-reproducibility of communication between processors. In addition, runs of the same simulation using different domain decompositionsmore » will also result in particles being simulated in a different order. In [1], a way of eliminating non-associative accumulations using integer tallies was described. This approach successfully achieves reproducibility at the cost of lost accuracy by rounding double precision numbers to fewer significant digits. This integer approach, and other extended and reduced precision reproducibility techniques, are described and compared in this work. Increased precision alone is not enough to ensure reproducibility of photon Monte Carlo simulations. Non-arbitrary precision approaches require a varying degree of rounding to achieve reproducibility. For the problems investigated in this work double precision global accuracy was achievable by using 100 bits of precision or greater on all unordered sums which where subsequently rounded to double precision at the end of every time-step.« less
MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation

DOE PAGES

Harrison, Robert J.; Beylkin, Gregory; Bischoff, Florian A.; ...

2016-01-01

We present MADNESS (multiresolution adaptive numerical environment for scientific simulation) that is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision that are based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
PRATHAM: Parallel Thermal Hydraulics Simulations using Advanced Mesoscopic Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Joshi, Abhijit S; Jain, Prashant K; Mudrich, Jaime A

2012-01-01

At the Oak Ridge National Laboratory, efforts are under way to develop a 3D, parallel LBM code called PRATHAM (PaRAllel Thermal Hydraulic simulations using Advanced Mesoscopic Methods) to demonstrate the accuracy and scalability of LBM for turbulent flow simulations in nuclear applications. The code has been developed using FORTRAN-90, and parallelized using the message passing interface MPI library. Silo library is used to compact and write the data files, and VisIt visualization software is used to post-process the simulation data in parallel. Both the single relaxation time (SRT) and multi relaxation time (MRT) LBM schemes have been implemented in PRATHAM.more » To capture turbulence without prohibitively increasing the grid resolution requirements, an LES approach [5] is adopted allowing large scale eddies to be numerically resolved while modeling the smaller (subgrid) eddies. In this work, a Smagorinsky model has been used, which modifies the fluid viscosity by an additional eddy viscosity depending on the magnitude of the rate-of-strain tensor. In LBM, this is achieved by locally varying the relaxation time of the fluid.« less
Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM

NASA Astrophysics Data System (ADS)

de La Fuente Marcos, Carlos; Barge, Pierre; de La Fuente Marcos, Raúl

2002-03-01

We describe a parallel version of our high-order-accuracy particle-mesh code for the simulation of collisionless protoplanetary disks. We use this code to carry out a massively parallel, two-dimensional, time-dependent, numerical simulation, which includes dust particles, to study the potential role of large-scale, gaseous vortices in protoplanetary disks. This noncollisional problem is easy to parallelize on message-passing multicomputer architectures. We performed the simulations on a cache-coherent nonuniform memory access Origin 2000 machine, using both the parallel virtual machine (PVM) and message-passing interface (MPI) message-passing libraries. Our performance analysis suggests that, for our problem, PVM is about 25% faster than MPI. Using PVM and MPI made it possible to reduce CPU time and increase code performance. This allows for simulations with a large number of particles (N ~ 105-106) in reasonable CPU times. The performances of our implementation of the pa! rallel code on an Origin 2000 supercomputer are presented and discussed. They exhibit very good speedup behavior and low load unbalancing. Our results confirm that giant gaseous vortices can play a dominant role in giant planet formation.
An Investigation of the Flow Physics of Acoustic Liners by Direct Numerical Simulation

NASA Technical Reports Server (NTRS)

Watson, Willie R. (Technical Monitor); Tam, Christopher

2004-01-01

This report concentrates on reporting the effort and status of work done on three dimensional (3-D) simulation of a multi-hole resonator in an impedance tube. This work is coordinated with a parallel experimental effort to be carried out at the NASA Langley Research Center. The outline of this report is as follows : 1. Preliminary consideration. 2. Computation model. 3. Mesh design and parallel computing. 4. Visualization. 5. Status of computer code development. 1. Preliminary Consideration.
Streaming parallel GPU acceleration of large-scale filter-based spiking neural networks.

PubMed

Slażyński, Leszek; Bohte, Sander

2012-01-01

The arrival of graphics processing (GPU) cards suitable for massively parallel computing promises affordable large-scale neural network simulation previously only available at supercomputing facilities. While the raw numbers suggest that GPUs may outperform CPUs by at least an order of magnitude, the challenge is to develop fine-grained parallel algorithms to fully exploit the particulars of GPUs. Computation in a neural network is inherently parallel and thus a natural match for GPU architectures: given inputs, the internal state for each neuron can be updated in parallel. We show that for filter-based spiking neurons, like the Spike Response Model, the additive nature of membrane potential dynamics enables additional update parallelism. This also reduces the accumulation of numerical errors when using single precision computation, the native precision of GPUs. We further show that optimizing simulation algorithms and data structures to the GPU's architecture has a large pay-off: for example, matching iterative neural updating to the memory architecture of the GPU speeds up this simulation step by a factor of three to five. With such optimizations, we can simulate in better-than-realtime plausible spiking neural networks of up to 50 000 neurons, processing over 35 million spiking events per second.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2OO, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, 3) incorporate newly developed algorithms into actual simulation packages. The work plan has well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) Adopting a mathematical geometry which has a better capacity to describe the fluid, (2) Using compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
Tough2{_}MP: A parallel version of TOUGH2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Keni; Wu, Yu-Shu; Ding, Chris

2003-04-09

TOUGH2{_}MP is a massively parallel version of TOUGH2. It was developed for running on distributed-memory parallel computers to simulate large simulation problems that may not be solved by the standard, single-CPU TOUGH2 code. The new code implements an efficient massively parallel scheme, while preserving the full capacity and flexibility of the original TOUGH2 code. The new software uses the METIS software package for grid partitioning and AZTEC software package for linear-equation solving. The standard message-passing interface is adopted for communication among processors. Numerical performance of the current version code has been tested on CRAY-T3E and IBM RS/6000 SP platforms. Inmore » addition, the parallel code has been successfully applied to real field problems of multi-million-cell simulations for three-dimensional multiphase and multicomponent fluid and heat flow, as well as solute transport. In this paper, we will review the development of the TOUGH2{_}MP, and discuss the basic features, modules, and their applications.« less
Massive parallel 3D PIC simulation of negative ion extraction

NASA Astrophysics Data System (ADS)

Revel, Adrien; Mochalskyy, Serhiy; Montellano, Ivar Mauricio; Wünderlich, Dirk; Fantz, Ursel; Minea, Tiberiu

2017-09-01

The 3D PIC-MCC code ONIX is dedicated to modeling Negative hydrogen/deuterium Ion (NI) extraction and co-extraction of electrons from radio-frequency driven, low pressure plasma sources. It provides valuable insight on the complex phenomena involved in the extraction process. In previous calculations, a mesh size larger than the Debye length was used, implying numerical electron heating. Important steps have been achieved in terms of computation performance and parallelization efficiency allowing successful massive parallel calculations (4096 cores), imperative to resolve the Debye length. In addition, the numerical algorithms have been improved in terms of grid treatment, i.e., the electric field near the complex geometry boundaries (plasma grid) is calculated more accurately. The revised model preserves the full 3D treatment, but can take advantage of a highly refined mesh. ONIX was used to investigate the role of the mesh size, the re-injection scheme for lost particles (extracted or wall absorbed), and the electron thermalization process on the calculated extracted current and plasma characteristics. It is demonstrated that all numerical schemes give the same NI current distribution for extracted ions. Concerning the electrons, the pair-injection technique is found well-adapted to simulate the sheath in front of the plasma grid.
Multi-dimensional high order essentially non-oscillatory finite difference methods in generalized coordinates

NASA Technical Reports Server (NTRS)

Shu, Chi-Wang

1992-01-01

The nonlinear stability of compact schemes for shock calculations is investigated. In recent years compact schemes were used in various numerical simulations including direct numerical simulation of turbulence. However to apply them to problems containing shocks, one has to resolve the problem of spurious numerical oscillation and nonlinear instability. A framework to apply nonlinear limiting to a local mean is introduced. The resulting scheme can be proven total variation (1D) or maximum norm (multi D) stable and produces nice numerical results in the test cases. The result is summarized in the preprint entitled 'Nonlinearly Stable Compact Schemes for Shock Calculations', which was submitted to SIAM Journal on Numerical Analysis. Research was continued on issues related to two and three dimensional essentially non-oscillatory (ENO) schemes. The main research topics include: parallel implementation of ENO schemes on Connection Machines; boundary conditions; shock interaction with hydrogen bubbles, a preparation for the full combustion simulation; and direct numerical simulation of compressible sheared turbulence.
Numerical characteristics of quantum computer simulation

NASA Astrophysics Data System (ADS)

Chernyavskiy, A.; Khamitov, K.; Teplov, A.; Voevodin, V.; Voevodin, Vl.

2016-12-01

The simulation of quantum circuits is significantly important for the implementation of quantum information technologies. The main difficulty of such modeling is the exponential growth of dimensionality, thus the usage of modern high-performance parallel computations is relevant. As it is well known, arbitrary quantum computation in circuit model can be done by only single- and two-qubit gates, and we analyze the computational structure and properties of the simulation of such gates. We investigate the fact that the unique properties of quantum nature lead to the computational properties of the considered algorithms: the quantum parallelism make the simulation of quantum gates highly parallel, and on the other hand, quantum entanglement leads to the problem of computational locality during simulation. We use the methodology of the AlgoWiki project (algowiki-project.org) to analyze the algorithm. This methodology consists of theoretical (sequential and parallel complexity, macro structure, and visual informational graph) and experimental (locality and memory access, scalability and more specific dynamic characteristics) parts. Experimental part was made by using the petascale Lomonosov supercomputer (Moscow State University, Russia). We show that the simulation of quantum gates is a good base for the research and testing of the development methods for data intense parallel software, and considered methodology of the analysis can be successfully used for the improvement of the algorithms in quantum information science.

Analysis of Serial and Parallel Algorithms for Use in Molecular Dynamics.. Review and Proposals

NASA Astrophysics Data System (ADS)

Mazzone, A. M.

This work analyzes the stability and accuracy of multistep methods, either for serial or parallel calculations, applied to molecular dynamics simulations. Numerical testing is made by evaluating the equilibrium configurations of mono-elemental crystalline lattices of metallic and semiconducting type (Ag and Si, respectively) and of a cubic CuY compound.
Development and application of numerical techniques for general-relativistic magnetohydrodynamics simulations of black hole accretion

NASA Astrophysics Data System (ADS)

White, Christopher Joseph

We describe the implementation of sophisticated numerical techniques for general-relativistic magnetohydrodynamics simulations in the Athena++ code framework. Improvements over many existing codes include the use of advanced Riemann solvers and of staggered-mesh constrained transport. Combined with considerations for computational performance and parallel scalability, these allow us to investigate black hole accretion flows with unprecedented accuracy. The capability of the code is demonstrated by exploring magnetically arrested disks.
Stress orientation and fracturing during three-dimensional buckling: Numerical simulation and application to chocolate-tablet structures in folded turbidites, SW Portugal

NASA Astrophysics Data System (ADS)

Reber, J. E.; Schmalholz, S. M.; Burg, J.-P.

2010-10-01

Two orthogonal sets of veins, both orthogonal to bedding, form chocolate tablet structures on the limbs of folded quartzwackes of Carboniferous turbidites in SW Portugal. Structural observations suggest that (1) mode 1 fractures transverse to the fold axes formed while fold amplitudes were small and limbs were under layer-subparallel compression and (2) mode 1 fractures parallel to the fold axes formed while fold amplitudes were large and limbs were brought to be under layer-subparallel tension. We performed two- and three-dimensional numerical simulations investigating the evolution of stress orientations during viscous folding to test whether and how these two successive sets of fractures were related to folding. We employed ellipses and ellipsoids for the visualization and quantification of the local stress field. The numerical simulations show a change in the orientation of the local σ1 direction by almost 90° with respect to the bedding plane in the fold limbs. The coeval σ3 direction rotates from parallel to the fold axis at low fold amplitudes to orthogonal to the fold axis at high fold amplitudes. The stress orientation changes faster in multilayers than in single-layers. The numerical simulations are consistent with observation and provide a mechanical interpretation for the formation of the chocolate tablet structures through consecutive sets of fractures on rotating limbs of folded competent layers.
Depiction of interfacial morphology in impact welded Ti/Cu bimetallic systems using smoothed particle hydrodynamics

NASA Astrophysics Data System (ADS)

Nassiri, Ali; Vivek, Anupam; Abke, Tim; Liu, Bert; Lee, Taeseon; Daehn, Glenn

2017-06-01

Numerical simulations of high-velocity impact welding are extremely challenging due to the coupled physics and highly dynamic nature of the process. Thus, conventional mesh-based numerical methodologies are not able to accurately model the process owing to the excessive mesh distortion close to the interface of two welded materials. A simulation platform was developed using smoothed particle hydrodynamics, implemented in a parallel architecture on a supercomputer. Then, the numerical simulations were compared to experimental tests conducted by vaporizing foil actuator welding. The close correspondence of the experiment and modeling in terms of interface characteristics allows the prediction of local temperature and strain distributions, which are not easily measured.
Why not make a PC cluster of your own? 5. AppleSeed: A Parallel Macintosh Cluster for Scientific Computing

NASA Astrophysics Data System (ADS)

Decyk, Viktor K.; Dauger, Dean E.

We have constructed a parallel cluster consisting of Apple Macintosh G4 computers running both Classic Mac OS as well as the Unix-based Mac OS X, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. Unlike other Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
Numerical simulation of groundwater flow in strongly anisotropic aquifers using multiple-point flux approximation method

NASA Astrophysics Data System (ADS)

Lin, S. T.; Liou, T. S.

2017-12-01

Numerical simulation of groundwater flow in anisotropic aquifers usually suffers from the lack of accuracy of calculating groundwater flux across grid blocks. Conventional two-point flux approximation (TPFA) can only obtain the flux normal to the grid interface but completely neglects the one parallel to it. Furthermore, the hydraulic gradient in a grid block estimated from TPFA can only poorly represent the hydraulic condition near the intersection of grid blocks. These disadvantages are further exacerbated when the principal axes of hydraulic conductivity, global coordinate system, and grid boundary are not parallel to one another. In order to refine the estimation the in-grid hydraulic gradient, several multiple-point flux approximation (MPFA) methods have been developed for two-dimensional groundwater flow simulations. For example, the MPFA-O method uses the hydraulic head at the junction node as an auxiliary variable which is then eliminated using the head and flux continuity conditions. In this study, a three-dimensional MPFA method will be developed for numerical simulation of groundwater flow in three-dimensional and strongly anisotropic aquifers. This new MPFA method first discretizes the simulation domain into hexahedrons. Each hexahedron is further decomposed into a certain number of tetrahedrons. The 2D MPFA-O method is then extended to these tetrahedrons, using the unknown head at the intersection of hexahedrons as an auxiliary variable along with the head and flux continuity conditions to solve for the head at the center of each hexahedron. Numerical simulations using this new MPFA method have been successfully compared with those obtained from a modified version of TOUGH2.
A survey of parallel programming tools

NASA Technical Reports Server (NTRS)

Cheng, Doreen Y.

1991-01-01

This survey examines 39 parallel programming tools. Focus is placed on those tool capabilites needed for parallel scientific programming rather than for general computer science. The tools are classified with current and future needs of Numerical Aerodynamic Simulator (NAS) in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions, tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.
Computational time analysis of the numerical solution of 3D electrostatic Poisson's equation

NASA Astrophysics Data System (ADS)

Kamboh, Shakeel Ahmed; Labadin, Jane; Rigit, Andrew Ragai Henri; Ling, Tech Chaw; Amur, Khuda Bux; Chaudhary, Muhammad Tayyab

2015-05-01

3D Poisson's equation is solved numerically to simulate the electric potential in a prototype design of electrohydrodynamic (EHD) ion-drag micropump. Finite difference method (FDM) is employed to discretize the governing equation. The system of linear equations resulting from FDM is solved iteratively by using the sequential Jacobi (SJ) and sequential Gauss-Seidel (SGS) methods, simulation results are also compared to examine the difference between the results. The main objective was to analyze the computational time required by both the methods with respect to different grid sizes and parallelize the Jacobi method to reduce the computational time. In common, the SGS method is faster than the SJ method but the data parallelism of Jacobi method may produce good speedup over SGS method. In this study, the feasibility of using parallel Jacobi (PJ) method is attempted in relation to SGS method. MATLAB Parallel/Distributed computing environment is used and a parallel code for SJ method is implemented. It was found that for small grid size the SGS method remains dominant over SJ method and PJ method while for large grid size both the sequential methods may take nearly too much processing time to converge. Yet, the PJ method reduces computational time to some extent for large grid sizes.
Parallel Simulation of Unsteady Turbulent Flames

NASA Technical Reports Server (NTRS)

Menon, Suresh

1996-01-01

Time-accurate simulation of turbulent flames in high Reynolds number flows is a challenging task since both fluid dynamics and combustion must be modeled accurately. To numerically simulate this phenomenon, very large computer resources (both time and memory) are required. Although current vector supercomputers are capable of providing adequate resources for simulations of this nature, the high cost and their limited availability, makes practical use of such machines less than satisfactory. At the same time, the explicit time integration algorithms used in unsteady flow simulations often possess a very high degree of parallelism, making them very amenable to efficient implementation on large-scale parallel computers. Under these circumstances, distributed memory parallel computers offer an excellent near-term solution for greatly increased computational speed and memory, at a cost that may render the unsteady simulations of the type discussed above more feasible and affordable.This paper discusses the study of unsteady turbulent flames using a simulation algorithm that is capable of retaining high parallel efficiency on distributed memory parallel architectures. Numerical studies are carried out using large-eddy simulation (LES). In LES, the scales larger than the grid are computed using a time- and space-accurate scheme, while the unresolved small scales are modeled using eddy viscosity based subgrid models. This is acceptable for the moment/energy closure since the small scales primarily provide a dissipative mechanism for the energy transferred from the large scales. However, for combustion to occur, the species must first undergo mixing at the small scales and then come into molecular contact. Therefore, global models cannot be used. Recently, a new model for turbulent combustion was developed, in which the combustion is modeled, within the subgrid (small-scales) using a methodology that simulates the mixing and the molecular transport and the chemical kinetics within each LES grid cell. Finite-rate kinetics can be included without any closure and this approach actually provides a means to predict the turbulent rates and the turbulent flame speed. The subgrid combustion model requires resolution of the local time scales associated with small-scale mixing, molecular diffusion and chemical kinetics and, therefore, within each grid cell, a significant amount of computations must be carried out before the large-scale (LES resolved) effects are incorporated. Therefore, this approach is uniquely suited for parallel processing and has been implemented on various systems such as: Intel Paragon, IBM SP-2, Cray T3D and SGI Power Challenge (PC) using the system independent Message Passing Interface (MPI) compiler. In this paper, timing data on these machines is reported along with some characteristic results.
Spatial and temporal accuracy of asynchrony-tolerant finite difference schemes for partial differential equations at extreme scales

NASA Astrophysics Data System (ADS)

Kumari, Komal; Donzis, Diego

2017-11-01

Highly resolved computational simulations on massively parallel machines are critical in understanding the physics of a vast number of complex phenomena in nature governed by partial differential equations. Simulations at extreme levels of parallelism present many challenges with communication between processing elements (PEs) being a major bottleneck. In order to fully exploit the computational power of exascale machines one needs to devise numerical schemes that relax global synchronizations across PEs. This asynchronous computations, however, have a degrading effect on the accuracy of standard numerical schemes.We have developed asynchrony-tolerant (AT) schemes that maintain order of accuracy despite relaxed communications. We show, analytically and numerically, that these schemes retain their numerical properties with multi-step higher order temporal Runge-Kutta schemes. We also show that for a range of optimized parameters,the computation time and error for AT schemes is less than their synchronous counterpart. Stability of the AT schemes which depends upon history and random nature of delays, are also discussed. Support from NSF is gratefully acknowledged.
High Performance Parallel Processing (HPPP) Finite Element Simulation of Fluid Structure Interactions Final Report CRADA No. TC-0824-94-A

DOE Office of Scientific and Technical Information (OSTI.GOV)

Couch, R.; Ziegler, D. P.

This project was a muki-partner CRADA. This was a partnership between Alcoa and LLNL. AIcoa developed a system of numerical simulation modules that provided accurate and efficient threedimensional modeling of combined fluid dynamics and structural response.
Xyce parallel electronic simulator : users' guide.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.

2011-05-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-artmore » algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.« less
Flow of a Gas Turbine Engine Low-Pressure Subsystem Simulated

NASA Technical Reports Server (NTRS)

Veres, Joseph P.

1997-01-01

The NASA Lewis Research Center is managing a task to numerically simulate overnight, on a parallel computing testbed, the aerodynamic flow in the complete low-pressure subsystem (LPS) of a gas turbine engine. The model solves the three-dimensional Navier- Stokes flow equations through all the components within the LPS, as well as the external flow around the engine nacelle. The LPS modeling task is being performed by Allison Engine Company under the Small Engine Technology contract. The large computer simulation was evaluated on networked computer systems using 8, 16, and 32 processors, with the parallel computing efficiency reaching 75 percent when 16 processors were used.
A parallel direct-forcing fictitious domain method for simulating microswimmers

NASA Astrophysics Data System (ADS)

Gao, Tong; Lin, Zhaowu

2017-11-01

We present a 3D parallel direct-forcing fictitious domain method for simulating swimming micro-organisms at small Reynolds numbers. We treat the motile micro-swimmers as spherical rigid particles using the ``Squirmer'' model. The particle dynamics are solved on the moving Larangian meshes that overlay upon a fixed Eulerian mesh for solving the fluid motion, and the momentum exchange between the two phases is resolved by distributing pseudo body-forces over the particle interior regions which constrain the background fictitious fluids to follow the particle movement. While the solid and fluid subproblems are solved separately, no inner-iterations are required to enforce numerical convergence. We demonstrate the accuracy and robustness of the method by comparing our results with the existing analytical and numerical studies for various cases of single particle dynamics and particle-particle interactions. We also perform a series of numerical explorations to obtain statistical and rheological measurements to characterize the dynamics and structures of Squirmer suspensions. NSF DMS 1619960.
Real-world hydrologic assessment of a fully-distributed hydrological model in a parallel computing environment

NASA Astrophysics Data System (ADS)

Vivoni, Enrique R.; Mascaro, Giuseppe; Mniszewski, Susan; Fasel, Patricia; Springer, Everett P.; Ivanov, Valeriy Y.; Bras, Rafael L.

2011-10-01

SummaryA major challenge in the use of fully-distributed hydrologic models has been the lack of computational capabilities for high-resolution, long-term simulations in large river basins. In this study, we present the parallel model implementation and real-world hydrologic assessment of the Triangulated Irregular Network (TIN)-based Real-time Integrated Basin Simulator (tRIBS). Our parallelization approach is based on the decomposition of a complex watershed using the channel network as a directed graph. The resulting sub-basin partitioning divides effort among processors and handles hydrologic exchanges across boundaries. Through numerical experiments in a set of nested basins, we quantify parallel performance relative to serial runs for a range of processors, simulation complexities and lengths, and sub-basin partitioning methods, while accounting for inter-run variability on a parallel computing system. In contrast to serial simulations, the parallel model speed-up depends on the variability of hydrologic processes. Load balancing significantly improves parallel speed-up with proportionally faster runs as simulation complexity (domain resolution and channel network extent) increases. The best strategy for large river basins is to combine a balanced partitioning with an extended channel network, with potential savings through a lower TIN resolution. Based on these advances, a wider range of applications for fully-distributed hydrologic models are now possible. This is illustrated through a set of ensemble forecasts that account for precipitation uncertainty derived from a statistical downscaling model.
N-MODY: a code for collisionless N-body simulations in modified Newtonian dynamics.

NASA Astrophysics Data System (ADS)

Londrillo, P.; Nipoti, C.

We describe the numerical code N-MODY, a parallel particle-mesh code for collisionless N-body simulations in modified Newtonian dynamics (MOND). N-MODY is based on a numerical potential solver in spherical coordinates that solves the non-linear MOND field equation, and is ideally suited to simulate isolated stellar systems. N-MODY can be used also to compute the MOND potential of arbitrary static density distributions. A few applications of N-MODY indicate that some astrophysically relevant dynamical processes are profoundly different in MOND and in Newtonian gravity with dark matter.
Nonlinear whistler waves

NASA Astrophysics Data System (ADS)

Vasko, I.; Agapitov, O. V.; Mozer, F.; Bonnell, J. W.; Krasnoselskikh, V.; Artemyev, A.; Drake, J. F.

2017-12-01

Chorus waves observed in the Earth inner magnetosphere sometimes exhibit significantly distorted (nonharmonic) parallel electric field waveform. In spectrograms these waveform features show up as overtones of chorus wave. In this work we show that the chorus wave parallel electric field is distorted due to finite temperature of electrons. The distortion of the parallel electric field is described analytically and reproduced in the numerical fluid simulations. Due to this effect the chorus energy is transferred to higher frequencies making possible efficient scattering of low ( a few keV) energy electrons.
Proceedings of the 14th International Conference on the Numerical Simulation of Plasmas

NASA Astrophysics Data System (ADS)

Partial Contents are as follows: Numerical Simulations of the Vlasov-Maxwell Equations by Coupled Particle-Finite Element Methods on Unstructured Meshes; Electromagnetic PIC Simulations Using Finite Elements on Unstructured Grids; Modelling Travelling Wave Output Structures with the Particle-in-Cell Code CONDOR; SST--A Single-Slice Particle Simulation Code; Graphical Display and Animation of Data Produced by Electromagnetic, Particle-in-Cell Codes; A Post-Processor for the PEST Code; Gray Scale Rendering of Beam Profile Data; A 2D Electromagnetic PIC Code for Distributed Memory Parallel Computers; 3-D Electromagnetic PIC Simulation on the NRL Connection Machine; Plasma PIC Simulations on MIMD Computers; Vlasov-Maxwell Algorithm for Electromagnetic Plasma Simulation on Distributed Architectures; MHD Boundary Layer Calculation Using the Vortex Method; and Eulerian Codes for Plasma Simulations.
Numerical simulation of phenomenon on zonal disintegration in deep underground mining in case of unsupported roadway

NASA Astrophysics Data System (ADS)

Han, Fengshan; Wu, Xinli; Li, Xia; Zhu, Dekang

2018-02-01

Zonal disintegration phenomenon was found in deep mining roadway surrounding rock. It seriously affects the safety of mining and underground engineering and it may lead to the occurrence of natural disasters. in deep mining roadway surrounding rock, tectonic stress in deep mining roadway rock mass, horizontal stress is much greater than the vertical stress, When the direction of maximum principal stress is parallel to the axis of the roadway in deep mining, this is the main reasons for Zonal disintegration phenomenon. Using ABAQUS software to numerical simulation of the three-dimensional model of roadway rupture formation process systematically, and the study shows that when The Direction of maximum main stress in deep underground mining is along the roadway axial direction, Zonal disintegration phenomenon in deep underground mining is successfully reproduced by our numerical simulation..numerical simulation shows that using ABAQUA simulation can reproduce Zonal disintegration phenomenon and the formation process of damage of surrounding rock can be reproduced. which have important engineering practical significance.
[Correlation of substrate structure and hydraulic characteristics in subsurface flow constructed wetlands].

PubMed

Bai, Shao-Yuan; Song, Zhi-Xin; Ding, Yan-Li; You, Shao-Hong; He, Shan

2014-02-01

The correlation of substrate structure and hydraulic characteristics was studied by numerical simulation combined with experimental method. The numerical simulation results showed that the permeability coefficient of matrix had a great influence on hydraulic efficiency in subsurface flow constructed wetlands. The filler with a high permeability coefficient had a worse flow field distribution in the constructed wetland with single layer structure. The layered substrate structure with the filler permeability coefficient increased from surface to bottom could avoid the short-circuited flow and dead-zones, and thus, increased the hydraulic efficiency. Two parallel pilot-scale constructed wetlands were built according to the numerical simulation results, and tracer experiments were conducted to validate the simulation results. The tracer experiment result showed that hydraulic characteristics in the layered constructed wetland were obviously better than that in the single layer system, and the substrate effective utilization rates were 0.87 and 0.49, respectively. It was appeared that numerical simulation would be favorable for substrate structure optimization in subsurface flow constructed wetlands.

A Dynamic Finite Element Method for Simulating the Physics of Faults Systems

NASA Astrophysics Data System (ADS)

Saez, E.; Mora, P.; Gross, L.; Weatherley, D.

2004-12-01

We introduce a dynamic Finite Element method using a novel high level scripting language to describe the physical equations, boundary conditions and time integration scheme. The library we use is the parallel Finley library: a finite element kernel library, designed for solving large-scale problems. It is incorporated as a differential equation solver into a more general library called escript, based on the scripting language Python. This library has been developed to facilitate the rapid development of 3D parallel codes, and is optimised for the Australian Computational Earth Systems Simulator Major National Research Facility (ACcESS MNRF) supercomputer, a 208 processor SGI Altix with a peak performance of 1.1 TFlops. Using the scripting approach we obtain a parallel FE code able to take advantage of the computational efficiency of the Altix 3700. We consider faults as material discontinuities (the displacement, velocity, and acceleration fields are discontinuous at the fault), with elastic behavior. The stress continuity at the fault is achieved naturally through the expression of the fault interactions in the weak formulation. The elasticity problem is solved explicitly in time, using the Saint Verlat scheme. Finally, we specify a suitable frictional constitutive relation and numerical scheme to simulate fault behaviour. Our model is based on previous work on modelling fault friction and multi-fault systems using lattice solid-like models. We adapt the 2D model for simulating the dynamics of parallel fault systems described to the Finite-Element method. The approach uses a frictional relation along faults that is slip and slip-rate dependent, and the numerical integration approach introduced by Mora and Place in the lattice solid model. In order to illustrate the new Finite Element model, single and multi-fault simulation examples are presented.
Parallel scalability and efficiency of vortex particle method for aeroelasticity analysis of bluff bodies

NASA Astrophysics Data System (ADS)

Tolba, Khaled Ibrahim; Morgenthal, Guido

2018-01-01

This paper presents an analysis of the scalability and efficiency of a simulation framework based on the vortex particle method. The code is applied for the numerical aerodynamic analysis of line-like structures. The numerical code runs on multicore CPU and GPU architectures using OpenCL framework. The focus of this paper is the analysis of the parallel efficiency and scalability of the method being applied to an engineering test case, specifically the aeroelastic response of a long-span bridge girder at the construction stage. The target is to assess the optimal configuration and the required computer architecture, such that it becomes feasible to efficiently utilise the method within the computational resources available for a regular engineering office. The simulations and the scalability analysis are performed on a regular gaming type computer.
Feasibility of using the Massively Parallel Processor for large eddy simulations and other Computational Fluid Dynamics applications

NASA Technical Reports Server (NTRS)

Bruno, John

1984-01-01

The results of an investigation into the feasibility of using the MPP for direct and large eddy simulations of the Navier-Stokes equations is presented. A major part of this study was devoted to the implementation of two of the standard numerical algorithms for CFD. These implementations were not run on the Massively Parallel Processor (MPP) since the machine delivered to NASA Goddard does not have sufficient capacity. Instead, a detailed implementation plan was designed and from these were derived estimates of the time and space requirements of the algorithms on a suitably configured MPP. In addition, other issues related to the practical implementation of these algorithms on an MPP-like architecture were considered; namely, adaptive grid generation, zonal boundary conditions, the table lookup problem, and the software interface. Performance estimates show that the architectural components of the MPP, the Staging Memory and the Array Unit, appear to be well suited to the numerical algorithms of CFD. This combined with the prospect of building a faster and larger MMP-like machine holds the promise of achieving sustained gigaflop rates that are required for the numerical simulations in CFD.
A parallel direct numerical simulation of dust particles in a turbulent flow

NASA Astrophysics Data System (ADS)

Nguyen, H. V.; Yokota, R.; Stenchikov, G.; Kocurek, G.

2012-04-01

Due to their effects on radiation transport, aerosols play an important role in the global climate. Mineral dust aerosol is a predominant natural aerosol in the desert and semi-desert regions of the Middle East and North Africa (MENA). The Arabian Peninsula is one of the three predominant source regions on the planet "exporting" dust to almost the entire world. Mineral dust aerosols make up about 50% of the tropospheric aerosol mass and therefore produces a significant impact on the Earth's climate and the atmospheric environment, especially in the MENA region that is characterized by frequent dust storms and large aerosol generation. Understanding the mechanisms of dust emission, transport and deposition is therefore essential for correctly representing dust in numerical climate prediction. In this study we present results of numerical simulations of dust particles in a turbulent flow to study the interaction between dust and the atmosphere. Homogenous and passive dust particles in the boundary layers are entrained and advected under the influence of a turbulent flow. Currently no interactions between particles are included. Turbulence is resolved through direct numerical simulation using a parallel incompressible Navier-Stokes flow solver. Model output provides information on particle trajectories, turbulent transport of dust and effects of gravity on dust motion, which will be used to compare with the wind tunnel experiments at University of Texas at Austin. Results of testing of parallel efficiency and scalability is provided. Future versions of the model will include air-particle momentum exchanges, varying particle sizes and saltation effect. The results will be used for interpreting wind tunnel and field experiments and for improvement of dust generation parameterizations in meteorological models.
The Space-Time Conservative Schemes for Large-Scale, Time-Accurate Flow Simulations with Tetrahedral Meshes

NASA Technical Reports Server (NTRS)

Venkatachari, Balaji Shankar; Streett, Craig L.; Chang, Chau-Lyan; Friedlander, David J.; Wang, Xiao-Yen; Chang, Sin-Chung

2016-01-01

Despite decades of development of unstructured mesh methods, high-fidelity time-accurate simulations are still predominantly carried out on structured, or unstructured hexahedral meshes by using high-order finite-difference, weighted essentially non-oscillatory (WENO), or hybrid schemes formed by their combinations. In this work, the space-time conservation element solution element (CESE) method is used to simulate several flow problems including supersonic jet/shock interaction and its impact on launch vehicle acoustics, and direct numerical simulations of turbulent flows using tetrahedral meshes. This paper provides a status report for the continuing development of the space-time conservation element solution element (CESE) numerical and software framework under the Revolutionary Computational Aerosciences (RCA) project. Solution accuracy and large-scale parallel performance of the numerical framework is assessed with the goal of providing a viable paradigm for future high-fidelity flow physics simulations.
Longitudinal train dynamics: an overview

NASA Astrophysics Data System (ADS)

Wu, Qing; Spiryagin, Maksym; Cole, Colin

2016-12-01

This paper discusses the evolution of longitudinal train dynamics (LTD) simulations, which covers numerical solvers, vehicle connection systems, air brake systems, wagon dumper systems and locomotives, resistance forces and gravitational components, vehicle in-train instabilities, and computing schemes. A number of potential research topics are suggested, such as modelling of friction, polymer, and transition characteristics for vehicle connection simulations, studies of wagon dumping operations, proper modelling of vehicle in-train instabilities, and computing schemes for LTD simulations. Evidence shows that LTD simulations have evolved with computing capabilities. Currently, advanced component models that directly describe the working principles of the operation of air brake systems, vehicle connection systems, and traction systems are available. Parallel computing is a good solution to combine and simulate all these advanced models. Parallel computing can also be used to conduct three-dimensional long train dynamics simulations.
The numerical simulation of a high-speed axial flow compressor

NASA Technical Reports Server (NTRS)

Mulac, Richard A.; Adamczyk, John J.

1991-01-01

The advancement of high-speed axial-flow multistage compressors is impeded by a lack of detailed flow-field information. Recent development in compressor flow modeling and numerical simulation have the potential to provide needed information in a timely manner. The development of a computer program is described to solve the viscous form of the average-passage equation system for multistage turbomachinery. Programming issues such as in-core versus out-of-core data storage and CPU utilization (parallelization, vectorization, and chaining) are addressed. Code performance is evaluated through the simulation of the first four stages of a five-stage, high-speed, axial-flow compressor. The second part addresses the flow physics which can be obtained from the numerical simulation. In particular, an examination of the endwall flow structure is made, and its impact on blockage distribution assessed.
Visual Computing Environment

NASA Technical Reports Server (NTRS)

Lawrence, Charles; Putt, Charles W.

1997-01-01

The Visual Computing Environment (VCE) is a NASA Lewis Research Center project to develop a framework for intercomponent and multidisciplinary computational simulations. Many current engineering analysis codes simulate various aspects of aircraft engine operation. For example, existing computational fluid dynamics (CFD) codes can model the airflow through individual engine components such as the inlet, compressor, combustor, turbine, or nozzle. Currently, these codes are run in isolation, making intercomponent and complete system simulations very difficult to perform. In addition, management and utilization of these engineering codes for coupled component simulations is a complex, laborious task, requiring substantial experience and effort. To facilitate multicomponent aircraft engine analysis, the CFD Research Corporation (CFDRC) is developing the VCE system. This system, which is part of NASA's Numerical Propulsion Simulation System (NPSS) program, can couple various engineering disciplines, such as CFD, structural analysis, and thermal analysis. The objectives of VCE are to (1) develop a visual computing environment for controlling the execution of individual simulation codes that are running in parallel and are distributed on heterogeneous host machines in a networked environment, (2) develop numerical coupling algorithms for interchanging boundary conditions between codes with arbitrary grid matching and different levels of dimensionality, (3) provide a graphical interface for simulation setup and control, and (4) provide tools for online visualization and plotting. VCE was designed to provide a distributed, object-oriented environment. Mechanisms are provided for creating and manipulating objects, such as grids, boundary conditions, and solution data. This environment includes parallel virtual machine (PVM) for distributed processing. Users can interactively select and couple any set of codes that have been modified to run in a parallel distributed fashion on a cluster of heterogeneous workstations. A scripting facility allows users to dictate the sequence of events that make up the particular simulation.
Performance evaluation of GPU parallelization, space-time adaptive algorithms, and their combination for simulating cardiac electrophysiology.

PubMed

Sachetto Oliveira, Rafael; Martins Rocha, Bernardo; Burgarelli, Denise; Meira, Wagner; Constantinides, Christakis; Weber Dos Santos, Rodrigo

2018-02-01

The use of computer models as a tool for the study and understanding of the complex phenomena of cardiac electrophysiology has attained increased importance nowadays. At the same time, the increased complexity of the biophysical processes translates into complex computational and mathematical models. To speed up cardiac simulations and to allow more precise and realistic uses, 2 different techniques have been traditionally exploited: parallel computing and sophisticated numerical methods. In this work, we combine a modern parallel computing technique based on multicore and graphics processing units (GPUs) and a sophisticated numerical method based on a new space-time adaptive algorithm. We evaluate each technique alone and in different combinations: multicore and GPU, multicore and GPU and space adaptivity, multicore and GPU and space adaptivity and time adaptivity. All the techniques and combinations were evaluated under different scenarios: 3D simulations on slabs, 3D simulations on a ventricular mouse mesh, ie, complex geometry, sinus-rhythm, and arrhythmic conditions. Our results suggest that multicore and GPU accelerate the simulations by an approximate factor of 33×, whereas the speedups attained by the space-time adaptive algorithms were approximately 48. Nevertheless, by combining all the techniques, we obtained speedups that ranged between 165 and 498. The tested methods were able to reduce the execution time of a simulation by more than 498× for a complex cellular model in a slab geometry and by 165× in a realistic heart geometry simulating spiral waves. The proposed methods will allow faster and more realistic simulations in a feasible time with no significant loss of accuracy. Copyright © 2017 John Wiley & Sons, Ltd.
Numerical Simulation of Transit-Time Ultrasonic Flowmeters by a Direct Approach.

PubMed

Luca, Adrian; Marchiano, Regis; Chassaing, Jean-Camille

2016-06-01

This paper deals with the development of a computational code for the numerical simulation of wave propagation through domains with a complex geometry consisting in both solids and moving fluids. The emphasis is on the numerical simulation of ultrasonic flowmeters (UFMs) by modeling the wave propagation in solids with the equations of linear elasticity (ELE) and in fluids with the linearized Euler equations (LEEs). This approach requires high performance computing because of the high number of degrees of freedom and the long propagation distances. Therefore, the numerical method should be chosen with care. In order to minimize the numerical dissipation which may occur in this kind of configuration, the numerical method employed here is the nodal discontinuous Galerkin (DG) method. Also, this method is well suited for parallel computing. To speed up the code, almost all the computational stages have been implemented to run on graphical processing unit (GPU) by using the compute unified device architecture (CUDA) programming model from NVIDIA. This approach has been validated and then used for the two-dimensional simulation of gas UFMs. The large contrast of acoustic impedance characteristic to gas UFMs makes their simulation a real challenge.
Vortex-induced vibration of two parallel risers: Experimental test and numerical simulation

NASA Astrophysics Data System (ADS)

Huang, Weiping; Zhou, Yang; Chen, Haiming

2016-04-01

The vortex-induced vibration of two identical rigidly mounted risers in a parallel arrangement was studied using Ansys- CFX and model tests. The vortex shedding and force were recorded to determine the effect of spacing on the two-degree-of-freedom oscillation of the risers. CFX was used to study the single riser and two parallel risers in 2-8 D spacing considering the coupling effect. Because of the limited width of water channel, only three different riser spacings, 2 D, 3 D, and 4 D, were tested to validate the characteristics of the two parallel risers by comparing to the numerical simulation. The results indicate that the lift force changes significantly with the increase in spacing, and in the case of 3 D spacing, the lift force of the two parallel risers reaches the maximum. The vortex shedding of the risers in 3 D spacing shows that a variable velocity field with the same frequency as the vortex shedding is generated in the overlapped area, thus equalizing the period of drag force to that of lift force. It can be concluded that the interaction between the two parallel risers is significant when the risers are brought to a small distance between them because the trajectory of riser changes from oval to curve 8 as the spacing is increased. The phase difference of lift force between the two risers is also different as the spacing changes.
Advances in Parallelization for Large Scale Oct-Tree Mesh Generation

NASA Technical Reports Server (NTRS)

O'Connell, Matthew; Karman, Steve L.

2015-01-01

Despite great advancements in the parallelization of numerical simulation codes over the last 20 years, it is still common to perform grid generation in serial. Generating large scale grids in serial often requires using special "grid generation" compute machines that can have more than ten times the memory of average machines. While some parallel mesh generation techniques have been proposed, generating very large meshes for LES or aeroacoustic simulations is still a challenging problem. An automated method for the parallel generation of very large scale off-body hierarchical meshes is presented here. This work enables large scale parallel generation of off-body meshes by using a novel combination of parallel grid generation techniques and a hybrid "top down" and "bottom up" oct-tree method. Meshes are generated using hardware commonly found in parallel compute clusters. The capability to generate very large meshes is demonstrated by the generation of off-body meshes surrounding complex aerospace geometries. Results are shown including a one billion cell mesh generated around a Predator Unmanned Aerial Vehicle geometry, which was generated on 64 processors in under 45 minutes.
Design of object-oriented distributed simulation classes

NASA Technical Reports Server (NTRS)

Schoeffler, James D. (Principal Investigator)

1995-01-01

Distributed simulation of aircraft engines as part of a computer aided design package is being developed by NASA Lewis Research Center for the aircraft industry. The project is called NPSS, an acronym for 'Numerical Propulsion Simulation System'. NPSS is a flexible object-oriented simulation of aircraft engines requiring high computing speed. It is desirable to run the simulation on a distributed computer system with multiple processors executing portions of the simulation in parallel. The purpose of this research was to investigate object-oriented structures such that individual objects could be distributed. The set of classes used in the simulation must be designed to facilitate parallel computation. Since the portions of the simulation carried out in parallel are not independent of one another, there is the need for communication among the parallel executing processors which in turn implies need for their synchronization. Communication and synchronization can lead to decreased throughput as parallel processors wait for data or synchronization signals from other processors. As a result of this research, the following have been accomplished. The design and implementation of a set of simulation classes which result in a distributed simulation control program have been completed. The design is based upon MIT 'Actor' model of a concurrent object and uses 'connectors' to structure dynamic connections between simulation components. Connectors may be dynamically created according to the distribution of objects among machines at execution time without any programming changes. Measurements of the basic performance have been carried out with the result that communication overhead of the distributed design is swamped by the computation time of modules unless modules have very short execution times per iteration or time step. An analytical performance model based upon queuing network theory has been designed and implemented. Its application to realistic configurations has not been carried out.
Design of Object-Oriented Distributed Simulation Classes

NASA Technical Reports Server (NTRS)

Schoeffler, James D.

1995-01-01

Distributed simulation of aircraft engines as part of a computer aided design package being developed by NASA Lewis Research Center for the aircraft industry. The project is called NPSS, an acronym for "Numerical Propulsion Simulation System". NPSS is a flexible object-oriented simulation of aircraft engines requiring high computing speed. It is desirable to run the simulation on a distributed computer system with multiple processors executing portions of the simulation in parallel. The purpose of this research was to investigate object-oriented structures such that individual objects could be distributed. The set of classes used in the simulation must be designed to facilitate parallel computation. Since the portions of the simulation carried out in parallel are not independent of one another, there is the need for communication among the parallel executing processors which in turn implies need for their synchronization. Communication and synchronization can lead to decreased throughput as parallel processors wait for data or synchronization signals from other processors. As a result of this research, the following have been accomplished. The design and implementation of a set of simulation classes which result in a distributed simulation control program have been completed. The design is based upon MIT "Actor" model of a concurrent object and uses "connectors" to structure dynamic connections between simulation components. Connectors may be dynamically created according to the distribution of objects among machines at execution time without any programming changes. Measurements of the basic performance have been carried out with the result that communication overhead of the distributed design is swamped by the computation time of modules unless modules have very short execution times per iteration or time step. An analytical performance model based upon queuing network theory has been designed and implemented. Its application to realistic configurations has not been carried out.
The CP-PACS parallel computer

NASA Astrophysics Data System (ADS)

Ukawa, Akira

1998-05-01

The CP-PACS computer is a massively parallel computer consisting of 2048 processing units and having a peak speed of 614 GFLOPS and 128 GByte of main memory. It was developed over the four years from 1992 to 1996 at the Center for Computational Physics, University of Tsukuba, for large-scale numerical simulations in computational physics, especially those of lattice QCD. The CP-PACS computer has been in full operation for physics computations since October 1996. In this article we describe the chronology of the development, the hardware and software characteristics of the computer, and its performance for lattice QCD simulations.
High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

2014-11-14

Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbormore » points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.« less
Development and parallelization of a direct numerical simulation to study the formation and transport of nanoparticle clusters in a viscous fluid

NASA Astrophysics Data System (ADS)

Sloan, Gregory James

The direct numerical simulation (DNS) offers the most accurate approach to modeling the behavior of a physical system, but carries an enormous computation cost. There exists a need for an accurate DNS to model the coupled solid-fluid system seen in targeted drug delivery (TDD), nanofluid thermal energy storage (TES), as well as other fields where experiments are necessary, but experiment design may be costly. A parallel DNS can greatly reduce the large computation times required, while providing the same results and functionality of the serial counterpart. A D2Q9 lattice Boltzmann method approach was implemented to solve the fluid phase. The use of domain decomposition with message passing interface (MPI) parallelism resulted in an algorithm that exhibits super-linear scaling in testing, which may be attributed to the caching effect. Decreased performance on a per-node basis for a fixed number of processes confirms this observation. A multiscale approach was implemented to model the behavior of nanoparticles submerged in a viscous fluid, and used to examine the mechanisms that promote or inhibit clustering. Parallelization of this model using a masterworker algorithm with MPI gives less-than-linear speedup for a fixed number of particles and varying number of processes. This is due to the inherent inefficiency of the master-worker approach. Lastly, these separate simulations are combined, and two-way coupling is implemented between the solid and fluid.
A Fast MHD Code for Gravitationally Stratified Media using Graphical Processing Units: SMAUG

NASA Astrophysics Data System (ADS)

Griffiths, M. K.; Fedun, V.; Erdélyi, R.

2015-03-01

Parallelization techniques have been exploited most successfully by the gaming/graphics industry with the adoption of graphical processing units (GPUs), possessing hundreds of processor cores. The opportunity has been recognized by the computational sciences and engineering communities, who have recently harnessed successfully the numerical performance of GPUs. For example, parallel magnetohydrodynamic (MHD) algorithms are important for numerical modelling of highly inhomogeneous solar, astrophysical and geophysical plasmas. Here, we describe the implementation of SMAUG, the Sheffield Magnetohydrodynamics Algorithm Using GPUs. SMAUG is a 1-3D MHD code capable of modelling magnetized and gravitationally stratified plasma. The objective of this paper is to present the numerical methods and techniques used for porting the code to this novel and highly parallel compute architecture. The methods employed are justified by the performance benchmarks and validation results demonstrating that the code successfully simulates the physics for a range of test scenarios including a full 3D realistic model of wave propagation in the solar atmosphere.
Performance analysis of a parallel Monte Carlo code for simulating solar radiative transfer in cloudy atmospheres using CUDA-enabled NVIDIA GPU

NASA Astrophysics Data System (ADS)

Russkova, Tatiana V.

2017-11-01

One tool to improve the performance of Monte Carlo methods for numerical simulation of light transport in the Earth's atmosphere is the parallel technology. A new algorithm oriented to parallel execution on the CUDA-enabled NVIDIA graphics processor is discussed. The efficiency of parallelization is analyzed on the basis of calculating the upward and downward fluxes of solar radiation in both a vertically homogeneous and inhomogeneous models of the atmosphere. The results of testing the new code under various atmospheric conditions including continuous singlelayered and multilayered clouds, and selective molecular absorption are presented. The results of testing the code using video cards with different compute capability are analyzed. It is shown that the changeover of computing from conventional PCs to the architecture of graphics processors gives more than a hundredfold increase in performance and fully reveals the capabilities of the technology used.
Simulating electron wave dynamics in graphene superlattices exploiting parallel processing advantages

NASA Astrophysics Data System (ADS)

Rodrigues, Manuel J.; Fernandes, David E.; Silveirinha, Mário G.; Falcão, Gabriel

2018-01-01

This work introduces a parallel computing framework to characterize the propagation of electron waves in graphene-based nanostructures. The electron wave dynamics is modeled using both "microscopic" and effective medium formalisms and the numerical solution of the two-dimensional massless Dirac equation is determined using a Finite-Difference Time-Domain scheme. The propagation of electron waves in graphene superlattices with localized scattering centers is studied, and the role of the symmetry of the microscopic potential in the electron velocity is discussed. The computational methodologies target the parallel capabilities of heterogeneous multi-core CPU and multi-GPU environments and are built with the OpenCL parallel programming framework which provides a portable, vendor agnostic and high throughput-performance solution. The proposed heterogeneous multi-GPU implementation achieves speedup ratios up to 75x when compared to multi-thread and multi-core CPU execution, reducing simulation times from several hours to a couple of minutes.

Interactive simulator for e-Learning environments: a teaching software for health care professionals.

PubMed

De Lazzari, Claudio; Genuini, Igino; Pisanelli, Domenico M; D'Ambrosi, Alessandra; Fedele, Francesco

2014-12-18

There is an established tradition of cardiovascular simulation tools, but the application of this kind of technology in the e-Learning arena is a novel approach. This paper presents an e-Learning environment aimed at teaching the interaction of cardiovascular and lung systems to health-care professionals. Heart-lung interaction must be analyzed while assisting patients with severe respiratory problems or with heart failure in intensive care unit. Such patients can be assisted by mechanical ventilatory assistance or by thoracic artificial lung."In silico" cardiovascular simulator was experimented during a training course given to graduate students of the School of Specialization in Cardiology at 'Sapienza' University in Rome.The training course employed CARDIOSIM©: a numerical simulator of the cardiovascular system. Such simulator is able to reproduce pathophysiological conditions of patients affected by cardiovascular and/or lung disease. In order to study the interactions among the cardiovascular system, the natural lung and the thoracic artificial lung (TAL), the numerical model of this device has been implemented. After having reproduced a patient's pathological condition, TAL model was applied in parallel and hybrid model during the training course.Results obtained during the training course show that TAL parallel assistance reduces right ventricular end systolic (diastolic) volume, but increases left ventricular end systolic (diastolic) volume. The percentage changes induced by hybrid TAL assistance on haemodynamic variables are lower than those produced by parallel assistance. Only in the case of the mean pulmonary arterial pressure, there is a percentage reduction which, in case of hybrid assistance, is greater (about 40%) than in case of parallel assistance (20-30%).At the end of the course, a short questionnaire was submitted to students in order to assess the quality of the course. The feedback obtained was positive, showing good results with respect to the degree of students' learning and the ease of use of the software simulator.
Numerical Prediction of CCV in a PFI Engine using a Parallel LES Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ameen, Muhsin M; Mirzaeian, Mohsen; Millo, Federico

Cycle-to-cycle variability (CCV) is detrimental to IC engine operation and can lead to partial burn, misfire, and knock. Predicting CCV numerically is extremely challenging due to two key reasons. Firstly, high-fidelity methods such as large eddy simulation (LES) are required to accurately resolve the incylinder turbulent flowfield both spatially and temporally. Secondly, CCV is experienced over long timescales and hence the simulations need to be performed for hundreds of consecutive cycles. Ameen et al. (Int. J. Eng. Res., 2017) developed a parallel perturbation model (PPM) approach to dissociate this long time-scale problem into several shorter timescale problems. The strategy ismore » to perform multiple single-cycle simulations in parallel by effectively perturbing the initial velocity field based on the intensity of the in-cylinder turbulence. This strategy was demonstrated for motored engine and it was shown that the mean and variance of the in-cylinder flowfield was captured reasonably well by this approach. In the present study, this PPM approach is extended to simulate the CCV in a fired port-fuel injected (PFI) SI engine. Two operating conditions are considered – a medium CCV operating case corresponding to 2500 rpm and 16 bar BMEP and a low CCV case corresponding to 4000 rpm and 12 bar BMEP. The predictions from this approach are also shown to be similar to the consecutive LES cycles. Both the consecutive and PPM LES cycles are observed to under-predict the variability in the early stage of combustion. The parallel approach slightly underpredicts the cyclic variability at all stages of combustion as compared to the consecutive LES cycles. However, it is shown that the parallel approach is able to predict the coefficient of variation (COV) of the in-cylinder pressure and burn rate related parameters with sufficient accuracy, and is also able to predict the qualitative trends in CCV with changing operating conditions. The convergence of the statistics predicted by the PPM approach with respect to the number of consecutive cycles required for each parallel simulation is also investigated. It is shown that this new approach is able to give accurate predictions of the CCV in fired engines in less than one-tenth of the time required for the conventional approach of simulating consecutive engine cycles.« less
Current and planned numerical development for improving computing performance for long duration and/or low pressure transients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faydide, B.

1997-07-01

This paper presents the current and planned numerical development for improving computing performance in case of Cathare applications needing real time, like simulator applications. Cathare is a thermalhydraulic code developed by CEA (DRN), IPSN, EDF and FRAMATOME for PWR safety analysis. First, the general characteristics of the code are presented, dealing with physical models, numerical topics, and validation strategy. Then, the current and planned applications of Cathare in the field of simulators are discussed. Some of these applications were made in the past, using a simplified and fast-running version of Cathare (Cathare-Simu); the status of the numerical improvements obtained withmore » Cathare-Simu is presented. The planned developments concern mainly the Simulator Cathare Release (SCAR) project which deals with the use of the most recent version of Cathare inside simulators. In this frame, the numerical developments are related with the speed up of the calculation process, using parallel processing and improvement of code reliability on a large set of NPP transients.« less
Numerical simulation of the Earth satellites motion using parallel computing. accounting of weak disturbances. (Russian Title: Прогнозирование движения ИСЗ с использованием параллельных вычислений. учет слабых возмущений)

NASA Astrophysics Data System (ADS)

Chuvashov, I. N.

2010-12-01

The features of high-precision numerical simulation of the Earth satellite motion using parallel computing are discussed on example the implementation of the cluster "Skiff Cyberia" software complex "Numerical model of the motion of system satellites". It is shown that the use of 128 bit word length allows considering weak perturbations from the high-order harmonics in the expansion of the geopotential and the effect of strain geopotential harmonics arising due to the combination of tidal perturbations associated with exposure to the moon and sun on the solid Earth and its oceans.
A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rajbhandari, Samyam; Kim, Jinsung; Krishnamoorthy, Sriram

This paper describes the design and implementation of a layered domain-specific compiler to support MADNESS---Multiresolution ADaptive Numerical Environment for Scientific Simulation. MADNESS is a high-level software environment for the solution of integral and differential equations in many dimensions, using adaptive and fast harmonic analysis methods with guaranteed precision. MADNESS uses k-d trees to represent spatial functions and implements operators like addition, multiplication, differentiation, and integration on the numerical representation of functions. The MADNESS runtime system provides global namespace support and a task-based execution model including futures. MADNESS is currently deployed on massively parallel supercomputers and has enabled many science advances.more » Due to the highly irregular and statically unpredictable structure of the k-d trees representing the spatial functions encountered in MADNESS applications, only purely runtime approaches to optimization have previously been implemented in the MADNESS framework. This paper describes a layered domain-specific compiler developed to address some performance bottlenecks in MADNESS. The newly developed static compile-time optimizations, in conjunction with the MADNESS runtime support, enable significant performance improvement for the MADNESS framework.« less
Collisions between quasi-parallel shocks

NASA Technical Reports Server (NTRS)

Cargill, Peter J.

1991-01-01

The collision between pairs of quasi-parallel shocks is examined using hybrid numerical simulations. In the interaction, the two shocks are transmitted through each other leaving behind a hot plasma with a population of particles with energies in excess of 40 E0, where E0 is the kinetic energy of particles in the shock frame prior to the collision. The energization is more efficient for quasi-parallel shocks than parallel shocks. Collisions between shocks of equal strengths are more efficient than those that are unequal. The results are of importance for phenomena during the impulsive phase of solar flares, in the distant solar wind and at planetary bow shocks.
Two-dimensional numerical simulation of a Stirling engine heat exchanger

NASA Technical Reports Server (NTRS)

Ibrahim, Mounir; Tew, Roy C.; Dudenhoefer, James E.

1989-01-01

The first phase of an effort to develop multidimensional models of Stirling engine components is described. The ultimate goal is to model an entire engine working space. Parallel plate and tubular heat exchanger models are described, with emphasis on the central part of the channel (i.e., ignoring hydrodynamic and thermal end effects). The model assumes laminar, incompressible flow with constant thermophysical properties. In addition, a constant axial temperature gradient is imposed. The governing equations describing the model have been solved using the Crack-Nicloson finite-difference scheme. Model predictions are compared with analytical solutions for oscillating/reversing flow and heat transfer in order to check numerical accuracy. Excellent agreement is obtained for flow both in circular tubes and between parallel plates. The computational heat transfer results are in good agreement with the analytical heat transfer results for parallel plates.
Re-forming supercritical quasi-parallel shocks. II - Mechanism for wave generation and front re-formation

NASA Technical Reports Server (NTRS)

Winske, D.; Thomas, V. A.; Omidi, N.; Quest, K. B.

1990-01-01

This paper continues the study of Thomas et al. (1990) in which hybrid simulations of quasi-parallel shocks were performed in one and two spatial dimensions. To identify the wave generation processes, the electromagnetic structure of the shock is examined by performing a number of one-dimensional hybrid simulations of quasi-parallel shocks for various upstream conditions. In addition, numerical experiments were carried out in which the backstreaming ions were removed from calculations to show their fundamental importance in reformation process. The calculations show that the waves are excited before ions can propagate far enough upstream to generate resonant modes. At some later times, the waves are regenerated at the leading edge of the interface, with properties like those of their initial interactions.
Numerical Simulation of Flow Field Within Parallel Plate Plastometer

NASA Technical Reports Server (NTRS)

Antar, Basil N.

2002-01-01

Parallel Plate Plastometer (PPP) is a device commonly used for measuring the viscosity of high polymers at low rates of shear in the range 10(exp 4) to 10(exp 9) poises. This device is being validated for use in measuring the viscosity of liquid glasses at high temperatures having similar ranges for the viscosity values. PPP instrument consists of two similar parallel plates, both in the range of 1 inch in diameter with the upper plate being movable while the lower one is kept stationary. Load is applied to the upper plate by means of a beam connected to shaft attached to the upper plate. The viscosity of the fluid is deduced from measuring the variation of the plate separation, h, as a function of time when a specified fixed load is applied on the beam. Operating plate speeds measured with the PPP is usually in the range of 10.3 cm/s or lower. The flow field within the PPP can be simulated using the equations of motion of fluid flow for this configuration. With flow speeds in the range quoted above the flow field between the two plates is certainly incompressible and laminar. Such flows can be easily simulated using numerical modeling with computational fluid dynamics (CFD) codes. We present below the mathematical model used to simulate this flow field and also the solutions obtained for the flow using a commercially available finite element CFD code.
A joint numerical and experimental study of the jet of an aircraft engine installation with advanced techniques

NASA Astrophysics Data System (ADS)

Brunet, V.; Molton, P.; Bézard, H.; Deck, S.; Jacquin, L.

2012-01-01

This paper describes the results obtained during the European Union JEDI (JEt Development Investigations) project carried out in cooperation between ONERA and Airbus. The aim of these studies was first to acquire a complete database of a modern-type engine jet installation set under a wall-to-wall swept wing in various transonic flow conditions. Interactions between the engine jet, the pylon, and the wing were studied thanks to ¤advanced¥ measurement techniques. In parallel, accurate Reynolds-averaged Navier Stokes (RANS) simulations were carried out from simple ones with the Spalart Allmaras model to more complex ones like the DRSM-SSG (Differential Reynolds Stress Modef of Speziale Sarkar Gatski) turbulence model. In the end, Zonal-Detached Eddy Simulations (Z-DES) were also performed to compare different simulation techniques. All numerical results are accurately validated thanks to the experimental database acquired in parallel. This complete and complex study of modern civil aircraft engine installation allowed many upgrades in understanding and simulation methods to be obtained. Furthermore, a setup for engine jet installation studies has been validated for possible future works in the S3Ch transonic research wind-tunnel. The main conclusions are summed up in this paper.
High-Order Methods for Incompressible Fluid Flow

NASA Astrophysics Data System (ADS)

Deville, M. O.; Fischer, P. F.; Mund, E. H.

2002-08-01

High-order numerical methods provide an efficient approach to simulating many physical problems. This book considers the range of mathematical, engineering, and computer science topics that form the foundation of high-order numerical methods for the simulation of incompressible fluid flows in complex domains. Introductory chapters present high-order spatial and temporal discretizations for one-dimensional problems. These are extended to multiple space dimensions with a detailed discussion of tensor-product forms, multi-domain methods, and preconditioners for iterative solution techniques. Numerous discretizations of the steady and unsteady Stokes and Navier-Stokes equations are presented, with particular sttention given to enforcement of imcompressibility. Advanced discretizations. implementation issues, and parallel and vector performance are considered in the closing sections. Numerous examples are provided throughout to illustrate the capabilities of high-order methods in actual applications.
A parallel finite element simulator for ion transport through three-dimensional ion channel systems.

PubMed

Tu, Bin; Chen, Minxin; Xie, Yan; Zhang, Linbo; Eisenberg, Bob; Lu, Benzhuo

2013-09-15

A parallel finite element simulator, ichannel, is developed for ion transport through three-dimensional ion channel systems that consist of protein and membrane. The coordinates of heavy atoms of the protein are taken from the Protein Data Bank and the membrane is represented as a slab. The simulator contains two components: a parallel adaptive finite element solver for a set of Poisson-Nernst-Planck (PNP) equations that describe the electrodiffusion process of ion transport, and a mesh generation tool chain for ion channel systems, which is an essential component for the finite element computations. The finite element method has advantages in modeling irregular geometries and complex boundary conditions. We have built a tool chain to get the surface and volume mesh for ion channel systems, which consists of a set of mesh generation tools. The adaptive finite element solver in our simulator is implemented using the parallel adaptive finite element package Parallel Hierarchical Grid (PHG) developed by one of the authors, which provides the capability of doing large scale parallel computations with high parallel efficiency and the flexibility of choosing high order elements to achieve high order accuracy. The simulator is applied to a real transmembrane protein, the gramicidin A (gA) channel protein, to calculate the electrostatic potential, ion concentrations and I - V curve, with which both primitive and transformed PNP equations are studied and their numerical performances are compared. To further validate the method, we also apply the simulator to two other ion channel systems, the voltage dependent anion channel (VDAC) and α-Hemolysin (α-HL). The simulation results agree well with Brownian dynamics (BD) simulation results and experimental results. Moreover, because ionic finite size effects can be included in PNP model now, we also perform simulations using a size-modified PNP (SMPNP) model on VDAC and α-HL. It is shown that the size effects in SMPNP can effectively lead to reduced current in the channel, and the results are closer to BD simulation results. Copyright © 2013 Wiley Periodicals, Inc.
Computer Science Techniques Applied to Parallel Atomistic Simulation

NASA Astrophysics Data System (ADS)

Nakano, Aiichiro

1998-03-01

Recent developments in parallel processing technology and multiresolution numerical algorithms have established large-scale molecular dynamics (MD) simulations as a new research mode for studying materials phenomena such as fracture. However, this requires large system sizes and long simulated times. We have developed: i) Space-time multiresolution schemes; ii) fuzzy-clustering approach to hierarchical dynamics; iii) wavelet-based adaptive curvilinear-coordinate load balancing; iv) multilevel preconditioned conjugate gradient method; and v) spacefilling-curve-based data compression for parallel I/O. Using these techniques, million-atom parallel MD simulations are performed for the oxidation dynamics of nanocrystalline Al. The simulations take into account the effect of dynamic charge transfer between Al and O using the electronegativity equalization scheme. The resulting long-range Coulomb interaction is calculated efficiently with the fast multipole method. Results for temperature and charge distributions, residual stresses, bond lengths and bond angles, and diffusivities of Al and O will be presented. The oxidation of nanocrystalline Al is elucidated through immersive visualization in virtual environments. A unique dual-degree education program at Louisiana State University will also be discussed in which students can obtain a Ph.D. in Physics & Astronomy and a M.S. from the Department of Computer Science in five years. This program fosters interdisciplinary research activities for interfacing High Performance Computing and Communications with large-scale atomistic simulations of advanced materials. This work was supported by NSF (CAREER Program), ARO, PRF, and Louisiana LEQSF.
The Numerical Technique for the Landslide Tsunami Simulations Based on Navier-Stokes Equations

NASA Astrophysics Data System (ADS)

Kozelkov, A. S.

2017-12-01

The paper presents an integral technique simulating all phases of a landslide-driven tsunami. The technique is based on the numerical solution of the system of Navier-Stokes equations for multiphase flows. The numerical algorithm uses a fully implicit approximation method, in which the equations of continuity and momentum conservation are coupled through implicit summands of pressure gradient and mass flow. The method we propose removes severe restrictions on the time step and allows simulation of tsunami propagation to arbitrarily large distances. The landslide origin is simulated as an individual phase being a Newtonian fluid with its own density and viscosity and separated from the water and air phases by an interface. The basic formulas of equation discretization and expressions for coefficients are presented, and the main steps of the computation procedure are described in the paper. To enable simulations of tsunami propagation across wide water areas, we propose a parallel algorithm of the technique implementation, which employs an algebraic multigrid method. The implementation of the multigrid method is based on the global level and cascade collection algorithms that impose no limitations on the paralleling scale and make this technique applicable to petascale systems. We demonstrate the possibility of simulating all phases of a landslide-driven tsunami, including its generation, propagation and uprush. The technique has been verified against the problems supported by experimental data. The paper describes the mechanism of incorporating bathymetric data to simulate tsunamis in real water areas of the world ocean. Results of comparison with the nonlinear dispersion theory, which has demonstrated good agreement, are presented for the case of a historical tsunami of volcanic origin on the Montserrat Island in the Caribbean Sea.
Real-time dynamic simulation of the Cassini spacecraft using DARTS. Part 2: Parallel/vectorized real-time implementation

NASA Technical Reports Server (NTRS)

Fijany, A.; Roberts, J. A.; Jain, A.; Man, G. K.

1993-01-01

Part 1 of this paper presented the requirements for the real-time simulation of Cassini spacecraft along with some discussion of the DARTS algorithm. Here, in Part 2 we discuss the development and implementation of parallel/vectorized DARTS algorithm and architecture for real-time simulation. Development of the fast algorithms and architecture for real-time hardware-in-the-loop simulation of spacecraft dynamics is motivated by the fact that it represents a hard real-time problem, in the sense that the correctness of the simulation depends on both the numerical accuracy and the exact timing of the computation. For a given model fidelity, the computation should be computed within a predefined time period. Further reduction in computation time allows increasing the fidelity of the model (i.e., inclusion of more flexible modes) and the integration routine.
OpenACC performance for simulating 2D radial dambreak using FVM HLLE flux

NASA Astrophysics Data System (ADS)

Gunawan, P. H.; Pahlevi, M. R.

2018-03-01

The aim of this paper is to investigate the performances of openACC platform for computing 2D radial dambreak. Here, the shallow water equation will be used to describe and simulate 2D radial dambreak with finite volume method (FVM) using HLLE flux. OpenACC is a parallel computing platform based on GPU cores. Indeed, from this research this platform is used to minimize computational time on the numerical scheme performance. The results show the using OpenACC, the computational time is reduced. For the dry and wet radial dambreak simulations using 2048 grids, the computational time of parallel is obtained 575.984 s and 584.830 s respectively for both simulations. These results show the successful of OpenACC when they are compared with the serial time of dry and wet radial dambreak simulations which are collected 28047.500 s and 29269.40 s respectively.
Perspectives on the Future of CFD

NASA Technical Reports Server (NTRS)

Kwak, Dochan

2000-01-01

This viewgraph presentation gives an overview of the future of computational fluid dynamics (CFD), which in the past has pioneered the field of flow simulation. Over time CFD has progressed as computing power. Numerical methods have been advanced as CPU and memory capacity increases. Complex configurations are routinely computed now and direct numerical simulations (DNS) and large eddy simulations (LES) are used to study turbulence. As the computing resources changed to parallel and distributed platforms, computer science aspects such as scalability (algorithmic and implementation) and portability and transparent codings have advanced. Examples of potential future (or current) challenges include risk assessment, limitations of the heuristic model, and the development of CFD and information technology (IT) tools.
Terascale High-Fidelity Simulations of Turbulent Combustion with Detailed Chemistry: Spray Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rutland, Christopher J.

2009-04-26

The Terascale High-Fidelity Simulations of Turbulent Combustion (TSTC) project is a multi-university collaborative effort to develop a high-fidelity turbulent reacting flow simulation capability utilizing terascale, massively parallel computer technology. The main paradigm of the approach is direct numerical simulation (DNS) featuring the highest temporal and spatial accuracy, allowing quantitative observations of the fine-scale physics found in turbulent reacting flows as well as providing a useful tool for development of sub-models needed in device-level simulations. Under this component of the TSTC program the simulation code named S3D, developed and shared with coworkers at Sandia National Laboratories, has been enhanced with newmore » numerical algorithms and physical models to provide predictive capabilities for turbulent liquid fuel spray dynamics. Major accomplishments include improved fundamental understanding of mixing and auto-ignition in multi-phase turbulent reactant mixtures and turbulent fuel injection spray jets.« less
GPAW - massively parallel electronic structure calculations with Python-based software.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Enkovaara, J.; Romero, N.; Shende, S.

2011-01-01

Electronic structure calculations are a widely used tool in materials science and large consumer of supercomputing resources. Traditionally, the software packages for these kind of simulations have been implemented in compiled languages, where Fortran in its different versions has been the most popular choice. While dynamic, interpreted languages, such as Python, can increase the effciency of programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most of the productivity enhancing features together with a good numerical performance. We have used thismore » approach in implementing an electronic structure simulation software GPAW using the combination of Python and C programming languages. While the chosen approach works well in standard workstations and Unix environments, massively parallel supercomputing systems can present some challenges in porting, debugging and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges it is possible to obtain good numerical performance and good parallel scalability with Python based software.« less
MRXCAT: Realistic numerical phantoms for cardiovascular magnetic resonance

PubMed Central

2014-01-01

Background Computer simulations are important for validating novel image acquisition and reconstruction strategies. In cardiovascular magnetic resonance (CMR), numerical simulations need to combine anatomical information and the effects of cardiac and/or respiratory motion. To this end, a framework for realistic CMR simulations is proposed and its use for image reconstruction from undersampled data is demonstrated. Methods The extended Cardiac-Torso (XCAT) anatomical phantom framework with various motion options was used as a basis for the numerical phantoms. Different tissue, dynamic contrast and signal models, multiple receiver coils and noise are simulated. Arbitrary trajectories and undersampled acquisition can be selected. The utility of the framework is demonstrated for accelerated cine and first-pass myocardial perfusion imaging using k-t PCA and k-t SPARSE. Results MRXCAT phantoms allow for realistic simulation of CMR including optional cardiac and respiratory motion. Example reconstructions from simulated undersampled k-t parallel imaging demonstrate the feasibility of simulated acquisition and reconstruction using the presented framework. Myocardial blood flow assessment from simulated myocardial perfusion images highlights the suitability of MRXCAT for quantitative post-processing simulation. Conclusion The proposed MRXCAT phantom framework enables versatile and realistic simulations of CMR including breathhold and free-breathing acquisitions. PMID:25204441

Two-dimensional numerical simulation of a Stirling engine heat exchanger

NASA Technical Reports Server (NTRS)

Ibrahim, Mounir B.; Tew, Roy C.; Dudenhoefer, James E.

1989-01-01

The first phase of an effort to develop multidimensional models of Stirling engine components is described; the ultimate goal is to model an entire engine working space. More specifically, parallel plate and tubular heat exchanger models with emphasis on the central part of the channel (i.e., ignoring hydrodynamic and thermal end effects) are described. The model assumes: laminar, incompressible flow with constant thermophysical properties. In addition, a constant axial temperature gradient is imposed. The governing equations, describing the model, were solved using Crank-Nicloson finite-difference scheme. Model predictions were compared with analytical solutions for oscillating/reversing flow and heat transfer in order to check numerical accuracy. Excellent agreement was obtained for the model predictions with analytical solutions available for both flow in circular tubes and between parallel plates. Also the heat transfer computational results are in good agreement with the heat transfer analytical results for parallel plates.
Quadrature transmit array design using single-feed circularly polarized patch antenna for parallel transmission in MR imaging.

PubMed

Pang, Yong; Yu, Baiying; Vigneron, Daniel B; Zhang, Xiaoliang

2014-02-01

Quadrature coils are often desired in MR applications because they can improve MR sensitivity and also reduce excitation power. In this work, we propose, for the first time, a quadrature array design strategy for parallel transmission at 298 MHz using single-feed circularly polarized (CP) patch antenna technique. Each array element is a nearly square ring microstrip antenna and is fed at a point on the diagonal of the antenna to generate quadrature magnetic fields. Compared with conventional quadrature coils, the single-feed structure is much simple and compact, making the quadrature coil array design practical. Numerical simulations demonstrate that the decoupling between elements is better than -35 dB for all the elements and the RF fields are homogeneous with deep penetration and quadrature behavior in the area of interest. Bloch equation simulation is also performed to simulate the excitation procedure by using an 8-element quadrature planar patch array to demonstrate its feasibility in parallel transmission at the ultrahigh field of 7 Tesla.
A parallel finite element procedure for contact-impact problems using edge-based smooth triangular element and GPU

NASA Astrophysics Data System (ADS)

Cai, Yong; Cui, Xiangyang; Li, Guangyao; Liu, Wenyang

2018-04-01

The edge-smooth finite element method (ES-FEM) can improve the computational accuracy of triangular shell elements and the mesh partition efficiency of complex models. In this paper, an approach is developed to perform explicit finite element simulations of contact-impact problems with a graphical processing unit (GPU) using a special edge-smooth triangular shell element based on ES-FEM. Of critical importance for this problem is achieving finer-grained parallelism to enable efficient data loading and to minimize communication between the device and host. Four kinds of parallel strategies are then developed to efficiently solve these ES-FEM based shell element formulas, and various optimization methods are adopted to ensure aligned memory access. Special focus is dedicated to developing an approach for the parallel construction of edge systems. A parallel hierarchy-territory contact-searching algorithm (HITA) and a parallel penalty function calculation method are embedded in this parallel explicit algorithm. Finally, the program flow is well designed, and a GPU-based simulation system is developed, using Nvidia's CUDA. Several numerical examples are presented to illustrate the high quality of the results obtained with the proposed methods. In addition, the GPU-based parallel computation is shown to significantly reduce the computing time.
Three-Dimensional Numerical Simulation on Triaxial Failure Mechanical Behavior of Rock-Like Specimen Containing Two Unparallel Fissures

NASA Astrophysics Data System (ADS)

Huang, Yan-Hua; Yang, Sheng-Qi; Zhao, Jian

2016-12-01

A three-dimensional particle flow code (PFC3D) was used for a systematic numerical simulation of the strength failure and cracking behavior of rock-like material specimens containing two unparallel fissures under conventional triaxial compression. The micro-parameters of the parallel bond model were first calibrated using the laboratory results of intact specimens and then validated from the experimental results of pre-fissured specimens under triaxial compression. Numerically simulated stress-strain curves, strength and deformation parameters and macro-failure modes of pre-fissured specimens were all in good agreement with the experimental results. The relationship between stress and the micro-crack numbers was summarized. Crack initiation, propagation and coalescence process of pre-fissured specimens were analyzed in detail. Finally, horizontal and vertical cross sections of numerical specimens were derived from PFC3D. A detailed analysis to reveal the internal damage behavior of rock under triaxial compression was carried out. The experimental and simulated results are expected to improve the understanding of the strength failure and cracking behavior of fractured rock under triaxial compression.
Theoretical analysis and Vsim simulation of a low-voltage high-efficiency 250 GHz gyrotron

NASA Astrophysics Data System (ADS)

An, Chenxiang; Zhang, Dian; Zhang, Jun; Zhong, Huihuang

2018-02-01

Low-voltage, high-frequency gyrotrons with hundreds of watts of power are useful in radar, magnetic resonance spectroscopy and plasma diagnostic applications. In this paper, a 10 kV, 478 W, 250 GHz gyrotron with an efficiency of nearly 40% and a pitch ratio of 1.5 was designed through linear and nonlinear numerical analyses and Vsim particle-in-cell (PIC) simulation. Vsim is a highly efficient parallel PIC code, but it has seldom been used to carry out electron beam wave interaction simulations of gyro-devices. The setting up of the parameters required for the Vsim simulations of the gyrotron is presented. The results of Vsim simulations agree well with that of nonlinear numerical calculation. The commercial software Vsim7.2 completed the 3D gyrotron simulation in 80 h using a 20 core, 2.2 GHz personal computer with 256 GBytes of memory.
Direct numerical simulation of instabilities in parallel flow with spherical roughness elements

NASA Technical Reports Server (NTRS)

Deanna, R. G.

1992-01-01

Results from a direct numerical simulation of laminar flow over a flat surface with spherical roughness elements using a spectral-element method are given. The numerical simulation approximates roughness as a cellular pattern of identical spheres protruding from a smooth wall. Periodic boundary conditions on the domain's horizontal faces simulate an infinite array of roughness elements extending in the streamwise and spanwise directions, which implies the parallel-flow assumption, and results in a closed domain. A body force, designed to yield the horizontal Blasius velocity in the absence of roughness, sustains the flow. Instabilities above a critical Reynolds number reveal negligible oscillations in the recirculation regions behind each sphere and in the free stream, high-amplitude oscillations in the layer directly above the spheres, and a mean profile with an inflection point near the sphere's crest. The inflection point yields an unstable layer above the roughness (where U''(y) is less than 0) and a stable region within the roughness (where U''(y) is greater than 0). Evidently, the instability begins when the low-momentum or wake region behind an element, being the region most affected by disturbances (purely numerical in this case), goes unstable and moves. In compressible flow with periodic boundaries, this motion sends disturbances to all regions of the domain. In the unstable layer just above the inflection point, the disturbances grow while being carried downstream with a propagation speed equal to the local mean velocity; they do not grow amid the low energy region near the roughness patch. The most amplified disturbance eventually arrives at the next roughness element downstream, perturbing its wake and inducing a global response at a frequency governed by the streamwise spacing between spheres and the mean velocity of the most amplified layer.
Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

NASA Technical Reports Server (NTRS)

Morgan, Philip E.

2004-01-01

This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Lark-Eddy Turbulent FLow Simulations Using Massively Parallel Computers" and "Devleop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications. The discussion of Scalable High Performance Computing reports on three objectives: validate, access scalability, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulations (DNS) and Large Eddy Simulation (LES) problems; and Investigate and develop a high-order Reynolds averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to be able to effectively model antenna problems; utilize lessons learned in high-order/spectral solution of swirling 3D jets to apply to solving electromagnetics project; transition a high-order fluids code, FDL3DI, to be able to solve Maxwell's Equations using compact-differencing; develop and demonstrate improved radiation absorbing boundary conditions for high-order CEM; and extend high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Hybrid parallelization of the XTOR-2F code for the simulation of two-fluid MHD instabilities in tokamaks

NASA Astrophysics Data System (ADS)

Marx, Alain; Lütjens, Hinrich

2017-03-01

A hybrid MPI/OpenMP parallel version of the XTOR-2F code [Lütjens and Luciani, J. Comput. Phys. 229 (2010) 8130] solving the two-fluid MHD equations in full tokamak geometry by means of an iterative Newton-Krylov matrix-free method has been developed. The present work shows that the code has been parallelized significantly despite the numerical profile of the problem solved by XTOR-2F, i.e. a discretization with pseudo-spectral representations in all angular directions, the stiffness of the two-fluid stability problem in tokamaks, and the use of a direct LU decomposition to invert the physical pre-conditioner at every Krylov iteration of the solver. The execution time of the parallelized version is an order of magnitude smaller than the sequential one for low resolution cases, with an increasing speedup when the discretization mesh is refined. Moreover, it allows to perform simulations with higher resolutions, previously forbidden because of memory limitations.
Terascale direct numerical simulations of turbulent combustion using S3D

NASA Astrophysics Data System (ADS)

Chen, J. H.; Choudhary, A.; de Supinski, B.; DeVries, M.; Hawkes, E. R.; Klasky, S.; Liao, W. K.; Ma, K. L.; Mellor-Crummey, J.; Podhorszki, N.; Sankaran, R.; Shende, S.; Yoo, C. S.

2009-01-01

Computational science is paramount to the understanding of underlying processes in internal combustion engines of the future that will utilize non-petroleum-based alternative fuels, including carbon-neutral biofuels, and burn in new combustion regimes that will attain high efficiency while minimizing emissions of particulates and nitrogen oxides. Next-generation engines will likely operate at higher pressures, with greater amounts of dilution and utilize alternative fuels that exhibit a wide range of chemical and physical properties. Therefore, there is a significant role for high-fidelity simulations, direct numerical simulations (DNS), specifically designed to capture key turbulence-chemistry interactions in these relatively uncharted combustion regimes, and in particular, that can discriminate the effects of differences in fuel properties. In DNS, all of the relevant turbulence and flame scales are resolved numerically using high-order accurate numerical algorithms. As a consequence terascale DNS are computationally intensive, require massive amounts of computing power and generate tens of terabytes of data. Recent results from terascale DNS of turbulent flames are presented here, illustrating its role in elucidating flame stabilization mechanisms in a lifted turbulent hydrogen/air jet flame in a hot air coflow, and the flame structure of a fuel-lean turbulent premixed jet flame. Computing at this scale requires close collaborations between computer and combustion scientists to provide optimized scaleable algorithms and software for terascale simulations, efficient collective parallel I/O, tools for volume visualization of multiscale, multivariate data and automating the combustion workflow. The enabling computer science, applied to combustion science, is also required in many other terascale physics and engineering simulations. In particular, performance monitoring is used to identify the performance of key kernels in the DNS code, S3D and especially memory intensive loops in the code. Through the careful application of loop transformations, data reuse in cache is exploited thereby reducing memory bandwidth needs, and hence, improving S3D's nodal performance. To enhance collective parallel I/O in S3D, an MPI-I/O caching design is used to construct a two-stage write-behind method for improving the performance of write-only operations. The simulations generate tens of terabytes of data requiring analysis. Interactive exploration of the simulation data is enabled by multivariate time-varying volume visualization. The visualization highlights spatial and temporal correlations between multiple reactive scalar fields using an intuitive user interface based on parallel coordinates and time histogram. Finally, an automated combustion workflow is designed using Kepler to manage large-scale data movement, data morphing, and archival and to provide a graphical display of run-time diagnostics.
A Parallel 2D Numerical Simulation of Tumor Cells Necrosis by Local Hyperthermia

NASA Astrophysics Data System (ADS)

Reis, R. F.; Loureiro, F. S.; Lobosco, M.

2014-03-01

Hyperthermia has been widely used in cancer treatment to destroy tumors. The main idea of the hyperthermia is to heat a specific region like a tumor so that above a threshold temperature the tumor cells are destroyed. This can be accomplished by many heat supply techniques and the use of magnetic nanoparticles that generate heat when an alternating magnetic field is applied has emerged as a promise technique. In the present paper, the Pennes bioheat transfer equation is adopted to model the thermal tumor ablation in the context of magnetic nanoparticles. Numerical simulations are carried out considering different injection sites for the nanoparticles in an attempt to achieve better hyperthermia conditions. Explicit finite difference method is employed to solve the equations. However, a large amount of computation is required for this purpose. Therefore, this work also presents an initial attempt to improve performance using OpenMP, a parallel programming API. Experimental results were quite encouraging: speedups around 35 were obtained on a 64-core machine.
Electromagnetically induced disintegration and polarization plane rotation of laser pulses

NASA Astrophysics Data System (ADS)

Parshkov, Oleg M.; Budyak, Victoria V.; Kochetkova, Anastasia E.

2017-04-01

The numerical simulation results of disintegration effect of linear polarized shot probe pulses of electromagnetically induced transparency in the counterintuitive superposed linear polarized control field are presented. It is shown, that this disintegration occurs, if linear polarizations of interacting pulses are not parallel or mutually perpendicular. In case of weak input probe field the polarization of one probe pulse in the medium is parallel, whereas the polarization of another probe pulse is perpendicular to polarization direction of input control radiation. The concerned effect is analogous to the effect, which must to take place when short laser pulse propagates along main axes of biaxial crystal because of group velocity of normal mod difference. The essential difference of probe pulse disintegration and linear process in biaxial crystal is that probe pulse preserves linear polarization in all stages of propagation. The numerical simulation is performed for scheme of degenerated quantum transitions between 3P0 , 3P01 and 3P2 energy levels of 208Pb isotope.
Numerical simulation of deformation and fracture of space protective shell structures from concrete and fiber concrete under pulse loading

NASA Astrophysics Data System (ADS)

Radchenko, P. A.; Batuev, S. P.; Radchenko, A. V.; Plevkov, V. S.

2015-11-01

This paper presents results of numerical simulation of interaction between aircraft Boeing 747-400 and protective shell of nuclear power plant. The shell is presented as complex multilayered cellular structure comprising layers of concrete and fiber concrete bonded with steel trusses. Numerical simulation was held three-dimensionally using the author's algorithm and software taking into account algorithms for building grids of complex geometric objects and parallel computations. The dynamics of stress-strain state and fracture of structure were studied. Destruction is described using two-stage model that allows taking into account anisotropy of elastic and strength properties of concrete and fiber concrete. It is shown that wave processes initiate destruction of shell cellular structure—cells start to destruct in unloading wave, originating after output of compression wave to the free surfaces of cells.
Evaluation of mechanical deformation and distributive magnetic loads with different mechanical constraints in two parallel conducting bars

NASA Astrophysics Data System (ADS)

Lee, Ho-Young; Lee, Se-Hee

2017-08-01

Mechanical deformation, bending deformation, and distributive magnetic loads were evaluated numerically and experimentally for conducting materials excited with high current. Until now, many research works have extensively studied the area of magnetic force and mechanical deformation by using coupled approaches such as multiphysics solvers. In coupled analysis for magnetoelastic problems, some articles and commercial software have presented the resultant mechanical deformation and stress on the body. To evaluate the mechanical deformation, the Lorentz force density method (LZ) and the Maxwell stress tensor method (MX) have been widely used for conducting materials. However, it is difficult to find any experimental verification regarding mechanical deformation or bending deformation due to magnetic force density. Therefore, we compared our numerical results to those from experiments with two parallel conducting bars to verify our numerical setup for bending deformation. Before showing this, the basic and interesting coupled simulation was conducted to test the mechanical deformations by the LZ (body force density) and the MX (surface force density) methods. This resulted in MX gave the same total force as LZ, but the local force distribution in MX introduced an incorrect mechanical deformation in the simulation of a solid conductor.
An innovative hybrid 3D analytic-numerical model for air breathing parallel channel counter-flow PEM fuel cells.

PubMed

Tavčar, Gregor; Katrašnik, Tomaž

2014-01-01

The parallel straight channel PEM fuel cell model presented in this paper extends the innovative hybrid 3D analytic-numerical (HAN) approach previously published by the authors with capabilities to address ternary diffusion systems and counter-flow configurations. The model's core principle is modelling species transport by obtaining a 2D analytic solution for species concentration distribution in the plane perpendicular to the cannel gas-flow and coupling consecutive 2D solutions by means of a 1D numerical pipe-flow model. Electrochemical and other nonlinear phenomena are coupled to the species transport by a routine that uses derivative approximation with prediction-iteration. The latter is also the core of the counter-flow computation algorithm. A HAN model of a laboratory test fuel cell is presented and evaluated against a professional 3D CFD simulation tool showing very good agreement between results of the presented model and those of the CFD simulation. Furthermore, high accuracy results are achieved at moderate computational times, which is owed to the semi-analytic nature and to the efficient computational coupling of electrochemical kinetics and species transport.
A software tool for modeling and simulation of numerical P systems.

PubMed

Buiu, Catalin; Arsene, Octavian; Cipu, Corina; Patrascu, Monica

2011-03-01

A P system represents a distributed and parallel bio-inspired computing model in which basic data structures are multi-sets or strings. Numerical P systems have been recently introduced and they use numerical variables and local programs (or evolution rules), usually in a deterministic way. They may find interesting applications in areas such as computational biology, process control or robotics. The first simulator of numerical P systems (SNUPS) has been designed, implemented and made available to the scientific community by the authors of this paper. SNUPS allows a wide range of applications, from modeling and simulation of ordinary differential equations, to the use of membrane systems as computational blocks of cognitive architectures, and as controllers for autonomous mobile robots. This paper describes the functioning of a numerical P system and presents an overview of SNUPS capabilities together with an illustrative example. SNUPS is freely available to researchers as a standalone application and may be downloaded from a dedicated website, http://snups.ics.pub.ro/, which includes an user manual and sample membrane structures. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
High Performance Input/Output for Parallel Computer Systems

NASA Technical Reports Server (NTRS)

Ligon, W. B.

1996-01-01

The goal of our project is to study the I/O characteristics of parallel applications used in Earth Science data processing systems such as Regional Data Centers (RDCs) or EOSDIS. Our approach is to study the runtime behavior of typical programs and the effect of key parameters of the I/O subsystem both under simulation and with direct experimentation on parallel systems. Our three year activity has focused on two items: developing a test bed that facilitates experimentation with parallel I/O, and studying representative programs from the Earth science data processing application domain. The Parallel Virtual File System (PVFS) has been developed for use on a number of platforms including the Tiger Parallel Architecture Workbench (TPAW) simulator, The Intel Paragon, a cluster of DEC Alpha workstations, and the Beowulf system (at CESDIS). PVFS provides considerable flexibility in configuring I/O in a UNIX- like environment. Access to key performance parameters facilitates experimentation. We have studied several key applications fiom levels 1,2 and 3 of the typical RDC processing scenario including instrument calibration and navigation, image classification, and numerical modeling codes. We have also considered large-scale scientific database codes used to organize image data.
Simulation of dilute polymeric fluids in a three-dimensional contraction using a multiscale FENE model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Griebel, M., E-mail: griebel@ins.uni-bonn.de, E-mail: ruettgers@ins.uni-bonn.de; Rüttgers, A., E-mail: griebel@ins.uni-bonn.de, E-mail: ruettgers@ins.uni-bonn.de

The multiscale FENE model is applied to a 3D square-square contraction flow problem. For this purpose, the stochastic Brownian configuration field method (BCF) has been coupled with our fully parallelized three-dimensional Navier-Stokes solver NaSt3DGPF. The robustness of the BCF method enables the numerical simulation of high Deborah number flows for which most macroscopic methods suffer from stability issues. The results of our simulations are compared with that of experimental measurements from literature and show a very good agreement. In particular, flow phenomena such as a strong vortex enhancement, streamline divergence and a flow inversion for highly elastic flows are reproduced.more » Due to their computational complexity, our simulations require massively parallel computations. Using a domain decomposition approach with MPI, the implementation achieves excellent scale-up results for up to 128 processors.« less
Analysis of Plane-Parallel Electron Beam Propagation in Different Media by Numerical Simulation Methods

NASA Astrophysics Data System (ADS)

Miloichikova, I. A.; Bespalov, V. I.; Krasnykh, A. A.; Stuchebrov, S. G.; Cherepennikov, Yu. M.; Dusaev, R. R.

2018-04-01

Simulation by the Monte Carlo method is widely used to calculate the character of ionizing radiation interaction with substance. A wide variety of programs based on the given method allows users to choose the most suitable package for solving computational problems. In turn, it is important to know exactly restrictions of numerical systems to avoid gross errors. Results of estimation of the feasibility of application of the program PCLab (Computer Laboratory, version 9.9) for numerical simulation of the electron energy distribution absorbed in beryllium, aluminum, gold, and water for industrial, research, and clinical beams are presented. The data obtained using programs ITS and Geant4 being the most popular software packages for solving the given problems and the program PCLab are presented in the graphic form. A comparison and an analysis of the results obtained demonstrate the feasibility of application of the program PCLab for simulation of the absorbed energy distribution and dose of electrons in various materials for energies in the range 1-20 MeV.
Direct numerical simulation of turbulent flow in a rotating square duct

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dai, Yi-Jun; Huang, Wei-Xi, E-mail: hwx@tsinghua.edu.cn; Xu, Chun-Xiao

A fully developed turbulent flow in a rotating straight square duct is simulated by direct numerical simulations at Re{sub τ} = 300 and 0 ≤ Ro{sub τ} ≤ 40. The rotating axis is parallel to two opposite walls of the duct and normal to the main flow. Variations of the turbulence statistics with the rotation rate are presented, and a comparison with the rotating turbulent channel flow is discussed. Rich secondary flow patterns in the cross section are observed by varying the rotation rate. The appearance of a pair of additional vortices above the pressure wall is carefully examined, andmore » the underlying mechanism is explained according to the budget analysis of the mean momentum equations.« less
A polymorphic reconfigurable emulator for parallel simulation

NASA Technical Reports Server (NTRS)

Parrish, E. A., Jr.; Mcvey, E. S.; Cook, G.

1980-01-01

Microprocessor and arithmetic support chip technology was applied to the design of a reconfigurable emulator for real time flight simulation. The system developed consists of master control system to perform all man machine interactions and to configure the hardware to emulate a given aircraft, and numerous slave compute modules (SCM) which comprise the parallel computational units. It is shown that all parts of the state equations can be worked on simultaneously but that the algebraic equations cannot (unless they are slowly varying). Attempts to obtain algorithms that will allow parellel updates are reported. The word length and step size to be used in the SCM's is determined and the architecture of the hardware and software is described.

Plasma Jet Simulations Using a Generalized Ohm's Law

NASA Technical Reports Server (NTRS)

Ebersohn, Frans; Shebalin, John V.; Girimaji, Sharath S.

2012-01-01

Plasma jets are important physical phenomena in astrophysics and plasma propulsion devices. A currently proposed dual jet plasma propulsion device to be used for ISS experiments strongly resembles a coronal loop and further draws a parallel between these physical systems [1]. To study plasma jets we use numerical methods that solve the compressible MHD equations using the generalized Ohm s law [2]. Here, we will discuss the crucial underlying physics of these systems along with the numerical procedures we utilize to study them. Recent results from our numerical experiments will be presented and discussed.
Astrophysical N-body Simulations Using Hierarchical Tree Data Structures

NASA Astrophysics Data System (ADS)

Warren, M. S.; Salmon, J. K.

The authors report on recent large astrophysical N-body simulations executed on the Intel Touchstone Delta system. They review the astrophysical motivation and the numerical techniques and discuss steps taken to parallelize these simulations. The methods scale as O(N log N), for large values of N, and also scale linearly with the number of processors. The performance sustained for a duration of 67 h, was between 5.1 and 5.4 Gflop/s on a 512-processor system.
Queueing Network Models for Parallel Processing of Task Systems: an Operational Approach

NASA Technical Reports Server (NTRS)

Mak, Victor W. K.

1986-01-01

Computer performance modeling of possibly complex computations running on highly concurrent systems is considered. Earlier works in this area either dealt with a very simple program structure or resulted in methods with exponential complexity. An efficient procedure is developed to compute the performance measures for series-parallel-reducible task systems using queueing network models. The procedure is based on the concept of hierarchical decomposition and a new operational approach. Numerical results for three test cases are presented and compared to those of simulations.
Ballistic Deficits for Ionization Chamber Pulses in Pulse Shaping Amplifiers

NASA Astrophysics Data System (ADS)

Kumar, G. Anil; Sharma, S. L.; Choudhury, R. K.

2007-04-01

In order to understand the dependence of the ballistic deficit on the shape of rising portion of the voltage pulse at the input of a pulse shaping amplifier, we have estimated the ballistic deficits for the pulses from a two-electrode parallel plate ionization chamber as well as for the pulses from a gridded parallel plate ionization chamber. These estimations have been made using numerical integration method when the pulses are processed through the CR-RCn (n=1-6) shaping network as well as when the pulses are processed through the complex shaping network of the ORTEC Model 472 spectroscopic amplifier. Further, we have made simulations to see the effect of ballistic deficit on the pulse-height spectra under different conditions. We have also carried out measurements of the ballistic deficits for the pulses from a two-electrode parallel plate ionization chamber as well as for the pulses from a gridded parallel plate ionization chamber when these pulses are processed through the ORTEC 572 linear amplifier having a simple CR-RC shaping network. The reasonable matching of the simulated ballistic deficits with the experimental ballistic deficits for the CR-RC shaping network clearly establishes the validity of the simulation technique
The island dynamics model on parallel quadtree grids

NASA Astrophysics Data System (ADS)

Mistani, Pouria; Guittet, Arthur; Bochkov, Daniil; Schneider, Joshua; Margetis, Dionisios; Ratsch, Christian; Gibou, Frederic

2018-05-01

We introduce an approach for simulating epitaxial growth by use of an island dynamics model on a forest of quadtree grids, and in a parallel environment. To this end, we use a parallel framework introduced in the context of the level-set method. This framework utilizes: discretizations that achieve a second-order accurate level-set method on non-graded adaptive Cartesian grids for solving the associated free boundary value problem for surface diffusion; and an established library for the partitioning of the grid. We consider the cases with: irreversible aggregation, which amounts to applying Dirichlet boundary conditions at the island boundary; and an asymmetric (Ehrlich-Schwoebel) energy barrier for attachment/detachment of atoms at the island boundary, which entails the use of a Robin boundary condition. We provide the scaling analyses performed on the Stampede supercomputer and numerical examples that illustrate the capability of our methodology to efficiently simulate different aspects of epitaxial growth. The combination of adaptivity and parallelism in our approach enables simulations that are several orders of magnitude faster than those reported in the recent literature and, thus, provides a viable framework for the systematic study of mound formation on crystal surfaces.
Numerical Simulation of Internal Heat Transfer Phenomena Occurring During De-Icing of Aircraft Components

NASA Technical Reports Server (NTRS)

DeWitt, Keneth J.

1996-01-01

An experimental study to determine the convective heat transfer coefficient from castings made from ice-roughened plates is reported. A corresponding topic, 'Measurements of the Convective Heat Transfer Coefficient from Ice Roughened Surfaces in Parallel and Accelerated Flows,' is presented.
Numerical experiments with flows of elongated granules

NASA Technical Reports Server (NTRS)

Elrod, Harold G.; Brewe, David E.

1992-01-01

Theory and numerical results are given for a program simulating two dimensional granular flow (1) between two infinite, counter-moving, parallel, roughened walls, and (2) for an infinitely wide slider. Each granule is simulated by a central repulsive force field ratcheted with force restitution factor to introduce dissipation. Transmission of angular momentum between particles occurs via Coulomb friction. The effect of granular hardness is explored. Gaps from 7 to 28 particle diameters are investigated, with solid fractions ranging from 0.2 to 0.9. Among features observed are: slip flow at boundaries, coagulation at high densities, and gross fluctuation in surface stress. A videotape has been prepared to demonstrate the foregoing effects.
Long-time atomistic simulations with the Parallel Replica Dynamics method

NASA Astrophysics Data System (ADS)

Perez, Danny

Molecular Dynamics (MD) -- the numerical integration of atomistic equations of motion -- is a workhorse of computational materials science. Indeed, MD can in principle be used to obtain any thermodynamic or kinetic quantity, without introducing any approximation or assumptions beyond the adequacy of the interaction potential. It is therefore an extremely powerful and flexible tool to study materials with atomistic spatio-temporal resolution. These enviable qualities however come at a steep computational price, hence limiting the system sizes and simulation times that can be achieved in practice. While the size limitation can be efficiently addressed with massively parallel implementations of MD based on spatial decomposition strategies, allowing for the simulation of trillions of atoms, the same approach usually cannot extend the timescales much beyond microseconds. In this article, we discuss an alternative parallel-in-time approach, the Parallel Replica Dynamics (ParRep) method, that aims at addressing the timescale limitation of MD for systems that evolve through rare state-to-state transitions. We review the formal underpinnings of the method and demonstrate that it can provide arbitrarily accurate results for any definition of the states. When an adequate definition of the states is available, ParRep can simulate trajectories with a parallel speedup approaching the number of replicas used. We demonstrate the usefulness of ParRep by presenting different examples of materials simulations where access to long timescales was essential to access the physical regime of interest and discuss practical considerations that must be addressed to carry out these simulations. Work supported by the United States Department of Energy (U.S. DOE), Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division.
A study of the parallel algorithm for large-scale DC simulation of nonlinear systems

NASA Astrophysics Data System (ADS)

Cortés Udave, Diego Ernesto; Ogrodzki, Jan; Gutiérrez de Anda, Miguel Angel

Newton-Raphson DC analysis of large-scale nonlinear circuits may be an extremely time consuming process even if sparse matrix techniques and bypassing of nonlinear models calculation are used. A slight decrease in the time required for this task may be enabled on multi-core, multithread computers if the calculation of the mathematical models for the nonlinear elements as well as the stamp management of the sparse matrix entries are managed through concurrent processes. This numerical complexity can be further reduced via the circuit decomposition and parallel solution of blocks taking as a departure point the BBD matrix structure. This block-parallel approach may give a considerable profit though it is strongly dependent on the system topology and, of course, on the processor type. This contribution presents the easy-parallelizable decomposition-based algorithm for DC simulation and provides a detailed study of its effectiveness.
New Parallel Algorithms for Landscape Evolution Model

NASA Astrophysics Data System (ADS)

Jin, Y.; Zhang, H.; Shi, Y.

2017-12-01

Most landscape evolution models (LEM) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize due to the computation of drainage area for each node, which needs huge amount of communication if run in parallel. In order to overcome this difficulty, we developed two parallel algorithms for LEM with a stream net. One algorithm handles the partition of grid with traditional methods and applies an efficient global reduction algorithm to do the computation of drainage areas and transport rates for the stream net; the other algorithm is based on a new partition algorithm, which partitions the nodes in catchments between processes first, and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take the advantage of massive computing techniques, and numerical experiments show that they are both adequate to handle large scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.
Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Möckel, M.; Wiedemann, C.; Flegel, S.; Gelhaus, J.; Vörsmann, P.; Klinkrad, H.; Krag, H.

2011-07-01

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction to OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Moeckel, Marek; Wiedemann, Carsten; Flegel, Sven Kevin; Gelhaus, Johannes; Klinkrad, Heiner; Krag, Holger; Voersmann, Peter

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction of OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
High accuracy mantle convection simulation through modern numerical methods - II: realistic models and problems

NASA Astrophysics Data System (ADS)

Heister, Timo; Dannberg, Juliane; Gassmöller, Rene; Bangerth, Wolfgang

2017-08-01

Computations have helped elucidate the dynamics of Earth's mantle for several decades already. The numerical methods that underlie these simulations have greatly evolved within this time span, and today include dynamically changing and adaptively refined meshes, sophisticated and efficient solvers, and parallelization to large clusters of computers. At the same time, many of the methods - discussed in detail in a previous paper in this series - were developed and tested primarily using model problems that lack many of the complexities that are common to the realistic models our community wants to solve today. With several years of experience solving complex and realistic models, we here revisit some of the algorithm designs of the earlier paper and discuss the incorporation of more complex physics. In particular, we re-consider time stepping and mesh refinement algorithms, evaluate approaches to incorporate compressibility, and discuss dealing with strongly varying material coefficients, latent heat, and how to track chemical compositions and heterogeneities. Taken together and implemented in a high-performance, massively parallel code, the techniques discussed in this paper then allow for high resolution, 3-D, compressible, global mantle convection simulations with phase transitions, strongly temperature dependent viscosity and realistic material properties based on mineral physics data.
Ef: Software for Nonrelativistic Beam Simulation by Particle-in-Cell Algorithm

NASA Astrophysics Data System (ADS)

Boytsov, A. Yu.; Bulychev, A. A.

2018-04-01

Understanding of particle dynamics is crucial in construction of electron guns, ion sources and other types of nonrelativistic beam devices. Apart from external guiding and focusing systems, a prominent role in evolution of such low-energy beams is played by particle-particle interaction. Numerical simulations taking into account these effects are typically accomplished by a well-known particle-in-cell method. In practice, for convenient work a simulation program should not only implement this method, but also support parallelization, provide integration with CAD systems and allow access to details of the simulation algorithm. To address the formulated requirements, development of a new open source code - Ef - has been started. It's current features and main functionality are presented. Comparison with several analytical models demonstrates good agreement between the numerical results and the theory. Further development plans are discussed.
Multidimensional Simulation Applied to Water Resources Management

NASA Astrophysics Data System (ADS)

Camara, A. S.; Ferreira, F. C.; Loucks, D. P.; Seixas, M. J.

1990-09-01

A framework for an integrated decision aiding simulation (IDEAS) methodology using numerical, linguistic, and pictorial entities and operations is introduced. IDEAS relies upon traditional numerical formulations, logical rules to handle linguistic entities with linguistic values, and a set of pictorial operations. Pictorial entities are defined by their shape, size, color, and position. Pictorial operators include reproduction (copy of a pictorial entity), mutation (expansion, rotation, translation, change in color), fertile encounters (intersection, reunion), and sterile encounters (absorption). Interaction between numerical, linguistic, and pictorial entities is handled through logical rules or a simplified vector calculus operation. This approach is shown to be applicable to various environmental and water resources management analyses using a model to assess the impacts of an oil spill. Future developments, including IDEAS implementation on parallel processing machines, are also discussed.
Numerical simulation of two-phase filtration in the near well bore zone

NASA Astrophysics Data System (ADS)

Maksat, Kalimoldayev; Kalipa, Kuspanova; Kulyash, Baisalbayeva; Orken, Mamyrbayev; Assel, Abdildayeva

2018-04-01

On the basis of the fundamental laws of energy conservation, nonstationary processes of filtration of two-phase liquids in multilayered reservoirs in the near well bore zone are considered. Number of reservoirs, fluid pressure in the given reservoirs, reservoir permeability, oil viscosity, etc. are taken into account upon that. Plane-parallel flow and axisymmetric cases have been studied. In the numerical solution, non-structured meshes are used. Closer to the well, the meshes thicken. The integration step over time is defined by the generalized Courant inequality. As a result, there are no large oscillations in the numerical solutions obtained. Oil production rates, Poisson's ratios, D-diameters of the well, filter height, filter permeability, and cumulative thickness of the filter cake and the area have been taken as the main inputs in numerical simulation of non-stationary processes of two-phase filtration.
User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Earth Sciences Division; Zhang, Keni; Zhang, Keni

TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator ismore » to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code, The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.« less
Quark structure of static correlators in high temperature QCD

NASA Astrophysics Data System (ADS)

Bernard, Claude; DeGrand, Thomas A.; DeTar, Carleton; Gottlieb, Steven; Krasnitz, A.; Ogilvie, Michael C.; Sugar, R. L.; Toussaint, D.

1992-07-01

We present results of numerical simulations of quantum chromodynamics at finite temperature with two flavors of Kogut-Susskind quarks on the Intel iPSC/860 parallel processor. We investigate the properties of the objects whose exchange gives static screening lengths by reconstructing their correlated quark-antiquark structure.
Modeling Flue Pipes: Subsonic Flow, Lattice Boltzmann, and Parallel Distributed Computers.

DTIC Science & Technology

1995-01-01

Abstract The problem of simulating the hydrodynamics and the acoustic waves inside wind musical instruments such as the recorder, the organ, and the ute...inside wind musical instruments such as the recorder, the organ, and the ute is considered. The problem is attacked by developing suitable local...applications such as the simulation of uid dynamics inside wind musical instruments. In the past, he has also worked on numerical methods for ordinary di
Numerical simulation of disperse particle flows on a graphics processing unit

NASA Astrophysics Data System (ADS)

Sierakowski, Adam J.

In both nature and technology, we commonly encounter solid particles being carried within fluid flows, from dust storms to sediment erosion and from food processing to energy generation. The motion of uncountably many particles in highly dynamic flow environments characterizes the tremendous complexity of such phenomena. While methods exist for the full-scale numerical simulation of such systems, current computational capabilities require the simplification of the numerical task with significant approximation using closure models widely recognized as insufficient. There is therefore a fundamental need for the investigation of the underlying physical processes governing these disperse particle flows. In the present work, we develop a new tool based on the Physalis method for the first-principles numerical simulation of thousands of particles (a small fraction of an entire disperse particle flow system) in order to assist in the search for new reduced-order closure models. We discuss numerous enhancements to the efficiency and stability of the Physalis method, which introduces the influence of spherical particles to a fixed-grid incompressible Navier-Stokes flow solver using a local analytic solution to the flow equations. Our first-principles investigation demands the modeling of unresolved length and time scales associated with particle collisions. We introduce a collision model alongside Physalis, incorporating lubrication effects and proposing a new nonlinearly damped Hertzian contact model. By reproducing experimental studies from the literature, we document extensive validation of the methods. We discuss the implementation of our methods for massively parallel computation using a graphics processing unit (GPU). We combine Eulerian grid-based algorithms with Lagrangian particle-based algorithms to achieve computational throughput up to 90 times faster than the legacy implementation of Physalis for a single central processing unit. By avoiding all data communication between the GPU and the host system during the simulation, we utilize with great efficacy the GPU hardware with which many high performance computing systems are currently equipped. We conclude by looking forward to the future of Physalis with multi-GPU parallelization in order to perform resolved disperse flow simulations of more than 100,000 particles and further advance the development of reduced-order closure models.

User's guide of TOUGH2-EGS-MP: A Massively Parallel Simulator with Coupled Geomechanics for Fluid and Heat Flow in Enhanced Geothermal Systems VERSION 1.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xiong, Yi; Fakcharoenphol, Perapon; Wang, Shihao

2013-12-01

TOUGH2-EGS-MP is a parallel numerical simulation program coupling geomechanics with fluid and heat flow in fractured and porous media, and is applicable for simulation of enhanced geothermal systems (EGS). TOUGH2-EGS-MP is based on the TOUGH2-MP code, the massively parallel version of TOUGH2. In TOUGH2-EGS-MP, the fully-coupled flow-geomechanics model is developed from linear elastic theory for thermo-poro-elastic systems and is formulated in terms of mean normal stress as well as pore pressure and temperature. Reservoir rock properties such as porosity and permeability depend on rock deformation, and the relationships between these two, obtained from poro-elasticity theories and empirical correlations, are incorporatedmore » into the simulation. This report provides the user with detailed information on the TOUGH2-EGS-MP mathematical model and instructions for using it for Thermal-Hydrological-Mechanical (THM) simulations. The mathematical model includes the fluid and heat flow equations, geomechanical equation, and discretization of those equations. In addition, the parallel aspects of the code, such as domain partitioning and communication between processors, are also included. Although TOUGH2-EGS-MP has the capability for simulating fluid and heat flows coupled with geomechanical effects, it is up to the user to select the specific coupling process, such as THM or only TH, in a simulation. There are several example problems illustrating applications of this program. These example problems are described in detail and their input data are presented. Their results demonstrate that this program can be used for field-scale geothermal reservoir simulation in porous and fractured media with fluid and heat flow coupled with geomechanical effects.« less
Formation of Electrostatic Potential Drops in the Auroral Zone

NASA Technical Reports Server (NTRS)

Schriver, D.; Ashour-Abdalla, M.; Richard, R. L.

2001-01-01

In order to examine the self-consistent formation of large-scale quasi-static parallel electric fields in the auroral zone on a micro/meso scale, a particle in cell simulation has been developed. The code resolves electron Debye length scales so that electron micro-processes are included and a variable grid scheme is used such that the overall length scale of the simulation is of the order of an Earth radii along the magnetic field. The simulation is electrostatic and includes the magnetic mirror force, as well as two types of plasmas, a cold dense ionospheric plasma and a warm tenuous magnetospheric plasma. In order to study the formation of parallel electric fields in the auroral zone, different magnetospheric ion and electron inflow boundary conditions are used to drive the system. It has been found that for conditions in the primary (upward) current region an upward directed quasi-static electric field can form across the system due to magnetic mirroring of the magnetospheric ions and electrons at different altitudes. For conditions in the return (downward) current region it is shown that a quasi-static parallel electric field in the opposite sense of that in the primary current region is formed, i.e., the parallel electric field is directed earthward. The conditions for how these different electric fields can be formed are discussed using satellite observations and numerical simulations.
Experimental Validation of Numerical Simulations for an Acoustic Liner in Grazing Flow

NASA Technical Reports Server (NTRS)

Tam, Christopher K. W.; Pastouchenko, Nikolai N.; Jones, Michael G.; Watson, Willie R.

2013-01-01

A coordinated experimental and numerical simulation effort is carried out to improve our understanding of the physics of acoustic liners in a grazing flow as well our computational aeroacoustics (CAA) method prediction capability. A numerical simulation code based on advanced CAA methods is developed. In a parallel effort, experiments are performed using the Grazing Flow Impedance Tube at the NASA Langley Research Center. In the experiment, a liner is installed in the upper wall of a rectangular flow duct with a 2 inch by 2.5 inch cross section. Spatial distribution of sound pressure levels and relative phases are measured on the wall opposite the liner in the presence of a Mach 0.3 grazing flow. The computer code is validated by comparing computed results with experimental measurements. Good agreements are found. The numerical simulation code is then used to investigate the physical properties of the acoustic liner. It is shown that an acoustic liner can produce self-noise in the presence of a grazing flow and that a feedback acoustic resonance mechanism is responsible for the generation of this liner self-noise. In addition, the same mechanism also creates additional liner drag. An estimate, based on numerical simulation data, indicates that for a resonant liner with a 10% open area ratio, the drag increase would be about 4% of the turbulent boundary layer drag over a flat wall.
Implementation of a partitioned algorithm for simulation of large CSI problems

NASA Technical Reports Server (NTRS)

Alvin, Kenneth F.; Park, K. C.

1991-01-01

The implementation of a partitioned numerical algorithm for determining the dynamic response of coupled structure/controller/estimator finite-dimensional systems is reviewed. The partitioned approach leads to a set of coupled first and second-order linear differential equations which are numerically integrated with extrapolation and implicit step methods. The present software implementation, ACSIS, utilizes parallel processing techniques at various levels to optimize performance on a shared-memory concurrent/vector processing system. A general procedure for the design of controller and filter gains is also implemented, which utilizes the vibration characteristics of the structure to be solved. Also presented are: example problems; a user's guide to the software; the procedures and algorithm scripts; a stability analysis for the algorithm; and the source code for the parallel implementation.
Numerical investigation of heat transfer in parallel channels with water at supercritical pressure.

PubMed

Shitsi, Edward; Kofi Debrah, Seth; Yao Agbodemegbe, Vincent; Ampomah-Amoako, Emmanuel

2017-11-01

Thermal phenomena such as heat transfer enhancement, heat transfer deterioration, and flow instability observed at supercritical pressures as a result of fluid property variations have the potential to affect the safety of design and operation of Supercritical Water-cooled Reactor SCWR, and also challenge the capabilities of both heat transfer correlations and Computational Fluid Dynamics CFD physical models. These phenomena observed at supercritical pressures need to be thoroughly investigated. An experimental study was carried out by Xi to investigate flow instability in parallel channels at supercritical pressures under different mass flow rates, pressures, and axial power shapes. Experimental data on flow instability at inlet of the heated channels were obtained but no heat transfer data along the axial length was obtained. This numerical study used 3D numerical tool STAR-CCM+ to investigate heat transfer at supercritical pressures along the axial lengths of the parallel channels with water ahead of experimental data. Homogeneous axial power shape HAPS was adopted and the heating powers adopted in this work were below the experimental threshold heating powers obtained for HAPS by Xi. The results show that the Fluid Centre-line Temperature FCLT increased linearly below and above the PCT region, but flattened at the PCT region for all the system parameters considered. The inlet temperature, heating power, pressure, gravity and mass flow rate have effects on WT (wall temperature) values in the NHT (normal heat transfer), EHT (enhanced heat transfer), DHT (deteriorated heat transfer) and recovery from DHT regions. While variation of all other system parameters in the EHT and PCT regions showed no significant difference in the WT and FCLT values respectively, the WT and FCLT values respectively increased with pressure in these regions. For most of the system parameters considered, the FCLT and WT values obtained in the two channels were nearly the same. The numerical study was not quantitatively compared with experimental data along the axial lengths of the parallel channels, but it was observed that the numerical tool STAR-CCM+ adopted was able to capture the trends for NHT, EHT, DHT and recovery from DHT regions. The heating powers used for the various simulations were below the experimentally observed threshold heating powers, but heat transfer deterioration HTD was observed, confirming the previous finding that HTD could occur before the occurrence of unstable behavior at supercritical pressures. For purposes of comparing the results of numerical simulations with experimental data, the heat transfer data on temperature oscillations obtained at the outlet of the heated channels and instability boundary results obtained at the inlet of the heated channels were compared. The numerical results obtained quite well agree with the experimental data. This work calls for provision of experimental data on heat transfer in parallel channels at supercritical pressures for validation of similar numerical studies.
Long range forecasts of the Northern Hemisphere anomalies with antecedent sea surface temperature patterns

NASA Technical Reports Server (NTRS)

Kung, Ernest C.

1994-01-01

The contract research has been conducted in the following three major areas: analysis of numerical simulations and parallel observations of atmospheric blocking, diagnosis of the lower boundary heating and the response of the atmospheric circulation, and comprehensive assessment of long-range forecasting with numerical and regression methods. The essential scientific and developmental purpose of this contract research is to extend our capability of numerical weather forecasting by the comprehensive general circulation model. The systematic work as listed above is thus geared to developing a technological basis for future NASA long-range forecasting.
A Computational Framework for Efficient Low Temperature Plasma Simulations

NASA Astrophysics Data System (ADS)

Verma, Abhishek Kumar; Venkattraman, Ayyaswamy

2016-10-01

Over the past years, scientific computing has emerged as an essential tool for the investigation and prediction of low temperature plasmas (LTP) applications which includes electronics, nanomaterial synthesis, metamaterials etc. To further explore the LTP behavior with greater fidelity, we present a computational toolbox developed to perform LTP simulations. This framework will allow us to enhance our understanding of multiscale plasma phenomenon using high performance computing tools mainly based on OpenFOAM FVM distribution. Although aimed at microplasma simulations, the modular framework is able to perform multiscale, multiphysics simulations of physical systems comprises of LTP. Some salient introductory features are capability to perform parallel, 3D simulations of LTP applications on unstructured meshes. Performance of the solver is tested based on numerical results assessing accuracy and efficiency of benchmarks for problems in microdischarge devices. Numerical simulation of microplasma reactor at atmospheric pressure with hemispherical dielectric coated electrodes will be discussed and hence, provide an overview of applicability and future scope of this framework.
N-MODY: A Code for Collisionless N-body Simulations in Modified Newtonian Dynamics

NASA Astrophysics Data System (ADS)

Londrillo, Pasquale; Nipoti, Carlo

2011-02-01

N-MODY is a parallel particle-mesh code for collisionless N-body simulations in modified Newtonian dynamics (MOND). N-MODY is based on a numerical potential solver in spherical coordinates that solves the non-linear MOND field equation, and is ideally suited to simulate isolated stellar systems. N-MODY can be used also to compute the MOND potential of arbitrary static density distributions. A few applications of N-MODY indicate that some astrophysically relevant dynamical processes are profoundly different in MOND and in Newtonian gravity with dark matter.
Multidisciplinary propulsion simulation using NPSS

NASA Technical Reports Server (NTRS)

Claus, Russell W.; Evans, Austin L.; Follen, Gregory J.

1992-01-01

The current status of the Numerical Propulsion System Simulation (NPSS) program, a cooperative effort of NASA, industry, and universities to reduce the cost and time of advanced technology propulsion system development, is reviewed. The technologies required for this program include (1) interdisciplinary analysis to couple the relevant disciplines, such as aerodynamics, structures, heat transfer, combustion, acoustics, controls, and materials; (2) integrated systems analysis; (3) a high-performance computing platform, including massively parallel processing; and (4) a simulation environment providing a user-friendly interface. Several research efforts to develop these technologies are discussed.
Progress in the Simulation of Steady and Time-Dependent Flows with 3D Parallel Unstructured Cartesian Methods

NASA Technical Reports Server (NTRS)

Aftosmis, M. J.; Berger, M. J.; Murman, S. M.; Kwak, Dochan (Technical Monitor)

2002-01-01

The proposed paper will present recent extensions in the development of an efficient Euler solver for adaptively-refined Cartesian meshes with embedded boundaries. The paper will focus on extensions of the basic method to include solution adaptation, time-dependent flow simulation, and arbitrary rigid domain motion. The parallel multilevel method makes use of on-the-fly parallel domain decomposition to achieve extremely good scalability on large numbers of processors, and is coupled with an automatic coarse mesh generation algorithm for efficient processing by a multigrid smoother. Numerical results are presented demonstrating parallel speed-ups of up to 435 on 512 processors. Solution-based adaptation may be keyed off truncation error estimates using tau-extrapolation or a variety of feature detection based refinement parameters. The multigrid method is extended to for time-dependent flows through the use of a dual-time approach. The extension to rigid domain motion uses an Arbitrary Lagrangian-Eulerlarian (ALE) formulation, and results will be presented for a variety of two- and three-dimensional example problems with both simple and complex geometry.
Flame-Generated Vorticity Production in Premixed Flame-Vortex Interactions

NASA Technical Reports Server (NTRS)

Patnaik, G.; Kailasanath, K.

2003-01-01

In this study, we use detailed time-dependent, multi-dimensional numerical simulations to investigate the relative importance of the processes leading to FGV in flame-vortex interactions in normal gravity and microgravity and to determine if the production of vorticity in flames in gravity is the same as that in zero gravity except for the contribution of the gravity term. The numerical simulations will be performed using the computational model developed at NRL, FLAME3D. FLAME3D is a parallel, multi-dimensional (either two- or three-dimensional) flame model based on FLIC2D, which has been used extensively to study the structure and stability of premixed hydrogen and methane flames.
NAS technical summaries. Numerical aerodynamic simulation program, March 1992 - February 1993

NASA Technical Reports Server (NTRS)

1994-01-01

NASA created the Numerical Aerodynamic Simulation (NAS) Program in 1987 to focus resources on solving critical problems in aeroscience and related disciplines by utilizing the power of the most advanced supercomputers available. The NAS Program provides scientists with the necessary computing power to solve today's most demanding computational fluid dynamics problems and serves as a pathfinder in integrating leading-edge supercomputing technologies, thus benefitting other supercomputer centers in government and industry. The 1992-93 operational year concluded with 399 high-speed processor projects and 91 parallel projects representing NASA, the Department of Defense, other government agencies, private industry, and universities. This document provides a glimpse at some of the significant scientific results for the year.
Parallel adaptive discontinuous Galerkin approximation for thin layer avalanche modeling

NASA Astrophysics Data System (ADS)

Patra, A. K.; Nichita, C. C.; Bauer, A. C.; Pitman, E. B.; Bursik, M.; Sheridan, M. F.

2006-08-01

This paper describes the development of highly accurate adaptive discontinuous Galerkin schemes for the solution of the equations arising from a thin layer type model of debris flows. Such flows have wide applicability in the analysis of avalanches induced by many natural calamities, e.g. volcanoes, earthquakes, etc. These schemes are coupled with special parallel solution methodologies to produce a simulation tool capable of very high-order numerical accuracy. The methodology successfully replicates cold rock avalanches at Mount Rainier, Washington and hot volcanic particulate flows at Colima Volcano, Mexico.
Parallel Computational Fluid Dynamics: Current Status and Future Requirements

NASA Technical Reports Server (NTRS)

Simon, Horst D.; VanDalsem, William R.; Dagum, Leonardo; Kutler, Paul (Technical Monitor)

1994-01-01

One or the key objectives of the Applied Research Branch in the Numerical Aerodynamic Simulation (NAS) Systems Division at NASA Allies Research Center is the accelerated introduction of highly parallel machines into a full operational environment. In this report we discuss the performance results obtained from the implementation of some computational fluid dynamics (CFD) applications on the Connection Machine CM-2 and the Intel iPSC/860. We summarize some of the experiences made so far with the parallel testbed machines at the NAS Applied Research Branch. Then we discuss the long term computational requirements for accomplishing some of the grand challenge problems in computational aerosciences. We argue that only massively parallel machines will be able to meet these grand challenge requirements, and we outline the computer science and algorithm research challenges ahead.
Plasma Physics Calculations on a Parallel Macintosh Cluster

NASA Astrophysics Data System (ADS)

Decyk, Viktor; Dauger, Dean; Kokelaar, Pieter

2000-03-01

We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 MFlops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
Plasma Physics Calculations on a Parallel Macintosh Cluster

NASA Astrophysics Data System (ADS)

Decyk, Viktor K.; Dauger, Dean E.; Kokelaar, Pieter R.

We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 Mflops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
Large-scale large eddy simulation of nuclear reactor flows: Issues and perspectives

DOE Office of Scientific and Technical Information (OSTI.GOV)

Merzari, Elia; Obabko, Aleks; Fischer, Paul

Numerical simulation has been an intrinsic part of nuclear engineering research since its inception. In recent years a transition is occurring toward predictive, first-principle-based tools such as computational fluid dynamics. Even with the advent of petascale computing, however, such tools still have significant limitations. In the present work some of these issues, and in particular the presence of massive multiscale separation, are discussed, as well as some of the research conducted to mitigate them. Petascale simulations at high fidelity (large eddy simulation/direct numerical simulation) were conducted with the massively parallel spectral element code Nek5000 on a series of representative problems.more » These simulations shed light on the requirements of several types of simulation: (1) axial flow around fuel rods, with particular attention to wall effects; (2) natural convection in the primary vessel; and (3) flow in a rod bundle in the presence of spacing devices. Finally, the focus of the work presented here is on the lessons learned and the requirements to perform these simulations at exascale. Additional physical insight gained from these simulations is also emphasized.« less
Large-scale large eddy simulation of nuclear reactor flows: Issues and perspectives

DOE PAGES

Merzari, Elia; Obabko, Aleks; Fischer, Paul; ...

2016-11-03

Numerical simulation has been an intrinsic part of nuclear engineering research since its inception. In recent years a transition is occurring toward predictive, first-principle-based tools such as computational fluid dynamics. Even with the advent of petascale computing, however, such tools still have significant limitations. In the present work some of these issues, and in particular the presence of massive multiscale separation, are discussed, as well as some of the research conducted to mitigate them. Petascale simulations at high fidelity (large eddy simulation/direct numerical simulation) were conducted with the massively parallel spectral element code Nek5000 on a series of representative problems.more » These simulations shed light on the requirements of several types of simulation: (1) axial flow around fuel rods, with particular attention to wall effects; (2) natural convection in the primary vessel; and (3) flow in a rod bundle in the presence of spacing devices. Finally, the focus of the work presented here is on the lessons learned and the requirements to perform these simulations at exascale. Additional physical insight gained from these simulations is also emphasized.« less
The formation of quasi-parallel shocks. [in space, solar and astrophysical plasmas

NASA Technical Reports Server (NTRS)

Cargill, Peter J.

1991-01-01

In a collisionless plasma, the coupling between a piston and the plasma must take place through either laminar or turbulent electromagnetic fields. Of the three types of coupling (laminar, Larmor and turbulent), shock formation in the parallel regime is dominated by the latter and in the quasi-parallel regime by a combination of all three, depending on the piston. In the quasi-perpendicular regime, there is usually a good separation between piston and shock. This is not true in the quasi-parallel and parallel regime. Hybrid numerical simulations for hot plasma pistons indicate that when the electrons are hot, a shock forms, but does not cleanly decouple from the piston. For hot ion pistons, no shock forms in the parallel limit: in the quasi-parallel case, a shock forms, but there is severe contamination from hot piston ions. These results suggest that the properties of solar and astrophysical shocks, such as particle acceleration, cannot be readily separated from their driving mechanism.
Kubo number and magnetic field line diffusion coefficient for anisotropic magnetic turbulence.

PubMed

Pommois, P; Veltri, P; Zimbardo, G

2001-06-01

The magnetic field line diffusion coefficients Dx and D(y) are obtained by numerical simulations in the case that all the magnetic turbulence correlation lengths l(x), l(y), and l(z) are different. We find that the variety of numerical results can be organized in terms of the Kubo number, the definition of which is extended from R=(deltaB/B(0))(l(parallel)/l(perpendicular)) to R=(deltaB/B(0))(l(z)/l(x)), for l(x) > or = l(y). Here, l(parallel) (l(perpendicular)) is the correlation length along (perpendicular to) the average field B(0)=B(0)ê(z). We have anomalous, non-Gaussian transport for R less, similar 0.1, in which case the mean square deviation scales nonlinearly with time. For R greater, similar 1 we have several Gaussian regimes: an almost quasilinear regime for 0.1 less, similar R less, similar 1, an intermediate, transition regime for 1 less, similar R less, similar 10, and a percolative regime for R greater, similar 10. An analytical form of the diffusion coefficient is proposed, D(i)=D(deltaBl(z)/B(0)l(x))(mu)(l(i)/l(x))(nu)l(2)(x)/l(z), which well describes the numerical simulation results in the quasilinear, intermediate, and percolative regimes.

Modeling of fatigue crack induced nonlinear ultrasonics using a highly parallelized explicit local interaction simulation approach

NASA Astrophysics Data System (ADS)

Shen, Yanfeng; Cesnik, Carlos E. S.

2016-04-01

This paper presents a parallelized modeling technique for the efficient simulation of nonlinear ultrasonics introduced by the wave interaction with fatigue cracks. The elastodynamic wave equations with contact effects are formulated using an explicit Local Interaction Simulation Approach (LISA). The LISA formulation is extended to capture the contact-impact phenomena during the wave damage interaction based on the penalty method. A Coulomb friction model is integrated into the computation procedure to capture the stick-slip contact shear motion. The LISA procedure is coded using the Compute Unified Device Architecture (CUDA), which enables the highly parallelized supercomputing on powerful graphic cards. Both the explicit contact formulation and the parallel feature facilitates LISA's superb computational efficiency over the conventional finite element method (FEM). The theoretical formulations based on the penalty method is introduced and a guideline for the proper choice of the contact stiffness is given. The convergence behavior of the solution under various contact stiffness values is examined. A numerical benchmark problem is used to investigate the new LISA formulation and results are compared with a conventional contact finite element solution. Various nonlinear ultrasonic phenomena are successfully captured using this contact LISA formulation, including the generation of nonlinear higher harmonic responses. Nonlinear mode conversion of guided waves at fatigue cracks is also studied.
Effect of alignment of easy axes on dynamic magnetization of immobilized magnetic nanoparticles

NASA Astrophysics Data System (ADS)

Yoshida, Takashi; Matsugi, Yuki; Tsujimura, Naotaka; Sasayama, Teruyoshi; Enpuku, Keiji; Viereck, Thilo; Schilling, Meinhard; Ludwig, Frank

2017-04-01

In some biomedical applications of magnetic nanoparticles (MNPs), the particles are physically immobilized. In this study, we explore the effect of the alignment of the magnetic easy axes on the dynamic magnetization of immobilized MNPs under an AC excitation field. We prepared three immobilized MNP samples: (1) a sample in which easy axes are randomly oriented, (2) a parallel-aligned sample in which easy axes are parallel to the AC field, and (3) an orthogonally aligned sample in which easy axes are perpendicular to the AC field. First, we show that the parallel-aligned sample has the largest hysteresis in the magnetization curve and the largest harmonic magnetization spectra, followed by the randomly oriented and orthogonally aligned samples. For example, 1.6-fold increase was observed in the area of the hysteresis loop of the parallel-aligned sample compared to that of the randomly oriented sample. To quantitatively discuss the experimental results, we perform a numerical simulation based on a Fokker-Planck equation, in which probability distributions for the directions of the easy axes are taken into account in simulating the prepared MNP samples. We obtained quantitative agreement between experiment and simulation. These results indicate that the dynamic magnetization of immobilized MNPs is significantly affected by the alignment of the easy axes.
Multiphase three-dimensional direct numerical simulation of a rotating impeller with code Blue

NASA Astrophysics Data System (ADS)

Kahouadji, Lyes; Shin, Seungwon; Chergui, Jalel; Juric, Damir; Craster, Richard V.; Matar, Omar K.

2017-11-01

The flow driven by a rotating impeller inside an open fixed cylindrical cavity is simulated using code Blue, a solver for massively-parallel simulations of fully three-dimensional multiphase flows. The impeller is composed of four blades at a 45° inclination all attached to a central hub and tube stem. In Blue, solid forms are constructed through the definition of immersed objects via a distance function that accounts for the object's interaction with the flow for both single and two-phase flows. We use a moving frame technique for imposing translation and/or rotation. The variation of the Reynolds number, the clearance, and the tank aspect ratio are considered, and we highlight the importance of the confinement ratio (blade radius versus the tank radius) in the mixing process. Blue uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of a hybrid front-tracking/level-set method designed complex interfacial topological changes. Parallel GMRES and multigrid iterative solvers are applied to the linear systems arising from the implicit solution for the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across fluid phases. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).
Three-dimensional Hybrid Simulation Study of Anisotropic Turbulence in the Proton Kinetic Regime

NASA Astrophysics Data System (ADS)

Vasquez, Bernard J.; Markovskii, Sergei A.; Chandran, Benjamin D. G.

2014-06-01

Three-dimensional numerical hybrid simulations with particle protons and quasi-neutralizing fluid electrons are conducted for a freely decaying turbulence that is anisotropic with respect to the background magnetic field. The turbulence evolution is determined by both the combined root-mean-square (rms) amplitude for fluctuating proton bulk velocity and magnetic field and by the ratio of perpendicular to parallel wavenumbers. This kind of relationship had been considered in the past with regard to interplanetary turbulence. The fluctuations nonlinearly evolve to a turbulent phase whose net wave vector anisotropy is usually more perpendicular than the initial one, irrespective of the initial ratio of perpendicular to parallel wavenumbers. Self-similar anisotropy evolution is found as a function of the rms amplitude and parallel wavenumber. Proton heating rates in the turbulent phase vary strongly with the rms amplitude but only weakly with the initial wave vector anisotropy. Even in the limit where wave vectors are confined to the plane perpendicular to the background magnetic field, the heating rate remains close to the corresponding case with finite parallel wave vector components. Simulation results obtained as a function of proton plasma to background magnetic pressure ratio β p in the range 0.1-0.5 show that the wave vector anisotropy also weakly depends on β p .
Modeling the nanoscale viscoelasticity of fluids by bridging non-Markovian fluctuating hydrodynamics and molecular dynamics simulations

NASA Astrophysics Data System (ADS)

Voulgarakis, Nikolaos K.; Satish, Siddarth; Chu, Jhih-Wei

2009-12-01

A multiscale computational method is developed to model the nanoscale viscoelasticity of fluids by bridging non-Markovian fluctuating hydrodynamics (FHD) and molecular dynamics (MD) simulations. To capture the elastic responses that emerge at small length scales, we attach an additional rheological model parallel to the macroscopic constitutive equation of a fluid. The widely used linear Maxwell model is employed as a working choice; other models can be used as well. For a fluid that is Newtonian in the macroscopic limit, this approach results in a parallel Newtonian-Maxwell model. For water, argon, and an ionic liquid, the power spectrum of momentum field autocorrelation functions of the parallel Newtonian-Maxwell model agrees very well with those calculated from all-atom MD simulations. To incorporate thermal fluctuations, we generalize the equations of FHD to work with non-Markovian rheological models and colored noise. The fluctuating stress tensor (white noise) is integrated in time in the same manner as its dissipative counterpart and numerical simulations indicate that this approach accurately preserves the set temperature in a FHD simulation. By mapping position and velocity vectors in the molecular representation onto field variables, we bridge the non-Markovian FHD with atomistic MD simulations. Through this mapping, we quantitatively determine the transport coefficients of the parallel Newtonian-Maxwell model for water and argon from all-atom MD simulations. For both fluids, a significant enhancement in elastic responses is observed as the wave number of hydrodynamic modes is reduced to a few nanometers. The mapping from particle to field representations and the perturbative strategy of developing constitutive equations provide a useful framework for modeling the nanoscale viscoelasticity of fluids.
Tempest - Efficient Computation of Atmospheric Flows Using High-Order Local Discretization Methods

NASA Astrophysics Data System (ADS)

Ullrich, P. A.; Guerra, J. E.

2014-12-01

The Tempest Framework composes several compact numerical methods to easily facilitate intercomparison of atmospheric flow calculations on the sphere and in rectangular domains. This framework includes the implementations of Spectral Elements, Discontinuous Galerkin, Flux Reconstruction, and Hybrid Finite Element methods with the goal of achieving optimal accuracy in the solution of atmospheric problems. Several advantages of this approach are discussed such as: improved pressure gradient calculation, numerical stability by vertical/horizontal splitting, arbitrary order of accuracy, etc. The local numerical discretization allows for high performance parallel computation and efficient inclusion of parameterizations. These techniques are used in conjunction with a non-conformal, locally refined, cubed-sphere grid for global simulations and standard Cartesian grids for simulations at the mesoscale. A complete implementation of the methods described is demonstrated in a non-hydrostatic setting.
Limits to high-speed simulations of spiking neural networks using general-purpose computers.

PubMed

Zenke, Friedemann; Gerstner, Wulfram

2014-01-01

To understand how the central nervous system performs computations using recurrent neuronal circuitry, simulations have become an indispensable tool for theoretical neuroscience. To study neuronal circuits and their ability to self-organize, increasing attention has been directed toward synaptic plasticity. In particular spike-timing-dependent plasticity (STDP) creates specific demands for simulations of spiking neural networks. On the one hand a high temporal resolution is required to capture the millisecond timescale of typical STDP windows. On the other hand network simulations have to evolve over hours up to days, to capture the timescale of long-term plasticity. To do this efficiently, fast simulation speed is the crucial ingredient rather than large neuron numbers. Using different medium-sized network models consisting of several thousands of neurons and off-the-shelf hardware, we compare the simulation speed of the simulators: Brian, NEST and Neuron as well as our own simulator Auryn. Our results show that real-time simulations of different plastic network models are possible in parallel simulations in which numerical precision is not a primary concern. Even so, the speed-up margin of parallelism is limited and boosting simulation speeds beyond one tenth of real-time is difficult. By profiling simulation code we show that the run times of typical plastic network simulations encounter a hard boundary. This limit is partly due to latencies in the inter-process communications and thus cannot be overcome by increased parallelism. Overall, these results show that to study plasticity in medium-sized spiking neural networks, adequate simulation tools are readily available which run efficiently on small clusters. However, to run simulations substantially faster than real-time, special hardware is a prerequisite.
NAS Applications and Advanced Algorithms

NASA Technical Reports Server (NTRS)

Bailey, David H.; Biswas, Rupak; VanDerWijngaart, Rob; Kutler, Paul (Technical Monitor)

1997-01-01

This paper examines the applications most commonly run on the supercomputers at the Numerical Aerospace Simulation (NAS) facility. It analyzes the extent to which such applications are fundamentally oriented to vector computers, and whether or not they can be efficiently implemented on hierarchical memory machines, such as systems with cache memories and highly parallel, distributed memory systems.
Gyrokinetic simulation of ITG modes in a three-mode coupling model

NASA Astrophysics Data System (ADS)

Jenkins, Thomas G.; Lee, W. W.

2004-11-01

A three-mode coupling model of ITG modes with adiabatic electrons is studied both analytically and numerically in 2-dimensional slab geometry using the gyrokinetic formalism. It can be shown analytically that the (quasilinear) saturation amplitude of the waves in the system should be enhanced by the inclusion of the parallel velocity nonlinearity in the governing gyrokinetic equation. The effect of this (frequently neglected) nonlinearity on the steady-state transport properties of the plasma is studied numerically using standard gyrokinetic particle simulation techniques. The balance [1] between various steady-state transport properties of the model (particle and heat flux, entropy production, and collisional dissipation) is examined. Effects resulting from the inclusion of nonadiabatic electrons in the model are also considered numerically, making use of the gyrokinetic split-weight scheme [2] in the simulations. [1] W. W. Lee and W. M. Tang, Phys. Fluids 31, 612 (1988). [2] I. Manuilskiy and W. W. Lee, Phys. Plasmas 7, 1381 (2000).
Numerical approaches to combustion modeling. Progress in Astronautics and Aeronautics. Vol. 135

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oran, E.S.; Boris, J.P.

1991-01-01

Various papers on numerical approaches to combustion modeling are presented. The topics addressed include; ab initio quantum chemistry for combustion; rate coefficient calculations for combustion modeling; numerical modeling of combustion of complex hydrocarbons; combustion kinetics and sensitivity analysis computations; reduction of chemical reaction models; length scales in laminar and turbulent flames; numerical modeling of laminar diffusion flames; laminar flames in premixed gases; spectral simulations of turbulent reacting flows; vortex simulation of reacting shear flow; combustion modeling using PDF methods. Also considered are: supersonic reacting internal flow fields; studies of detonation initiation, propagation, and quenching; numerical modeling of heterogeneous detonations, deflagration-to-detonationmore » transition to reactive granular materials; toward a microscopic theory of detonations in energetic crystals; overview of spray modeling; liquid drop behavior in dense and dilute clusters; spray combustion in idealized configurations: parallel drop streams; comparisons of deterministic and stochastic computations of drop collisions in dense sprays; ignition and flame spread across solid fuels; numerical study of pulse combustor dynamics; mathematical modeling of enclosure fires; nuclear systems.« less
A parallel graded-mesh FDTD algorithm for human-antenna interaction problems.

PubMed

Catarinucci, Luca; Tarricone, Luciano

2009-01-01

The finite difference time domain method (FDTD) is frequently used for the numerical solution of a wide variety of electromagnetic (EM) problems and, among them, those concerning human exposure to EM fields. In many practical cases related to the assessment of occupational EM exposure, large simulation domains are modeled and high space resolution adopted, so that strong memory and central processing unit power requirements have to be satisfied. To better afford the computational effort, the use of parallel computing is a winning approach; alternatively, subgridding techniques are often implemented. However, the simultaneous use of subgridding schemes and parallel algorithms is very new. In this paper, an easy-to-implement and highly-efficient parallel graded-mesh (GM) FDTD scheme is proposed and applied to human-antenna interaction problems, demonstrating its appropriateness in dealing with complex occupational tasks and showing its capability to guarantee the advantages of a traditional subgridding technique without affecting the parallel FDTD performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bassi, Gabriele; Blednykh, Alexei; Smalyuk, Victor

A novel algorithm for self-consistent simulations of long-range wakefield effects has been developed and applied to the study of both longitudinal and transverse coupled-bunch instabilities at NSLS-II. The algorithm is implemented in the new parallel tracking code space (self-consistent parallel algorithm for collective effects) discussed in the paper. The code is applicable for accurate beam dynamics simulations in cases where both bunch-to-bunch and intrabunch motions need to be taken into account, such as chromatic head-tail effects on the coupled-bunch instability of a beam with a nonuniform filling pattern, or multibunch and single-bunch effects of a passive higher-harmonic cavity. The numericalmore » simulations have been compared with analytical studies. For a beam with an arbitrary filling pattern, intensity-dependent complex frequency shifts have been derived starting from a system of coupled Vlasov equations. The analytical formulas and numerical simulations confirm that the analysis is reduced to the formulation of an eigenvalue problem based on the known formulas of the complex frequency shifts for the uniform filling pattern case.« less
Parallel database search and prime factorization with magnonic holographic memory devices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Khitun, Alexander

In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements allowing the producing of phase patterns of an arbitrary form. The latter makes it possible to code logic states into the phases of propagating waves and exploitmore » wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may results in a significant speedup over the conventional digital logic circuits in special task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constrains of the spin wave approach are also discussed.« less
Parallel database search and prime factorization with magnonic holographic memory devices

NASA Astrophysics Data System (ADS)

Khitun, Alexander

2015-12-01

In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements allowing the producing of phase patterns of an arbitrary form. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may results in a significant speedup over the conventional digital logic circuits in special task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constrains of the spin wave approach are also discussed.
A numerical analysis of plasma non-uniformity in the parallel plate VHF-CCP and the comparison among various model

NASA Astrophysics Data System (ADS)

Sawada, Ikuo

2012-10-01

We measured the radial distribution of electron density in a 200 mm parallel plate CCP and compared it with results from numerical simulations. The experiments were conducted with pure Ar gas with pressures ranging from 15 to 100 mTorr and 60 MHz applied at the top electrode with powers from 500 to 2000W. The measured electron profile is peaked in the center, and the relative non-uniformity is higher at 100 mTorr than at 15 mTorr. We compare the experimental results with simulations with both HPEM and Monte-Carlo/PIC codes. In HPEM simulations, we used either fluid or electron Monte-Carlo module, and the Poisson or the Electromagnetic solver. None of the models were able to duplicate the experimental results quantitatively. However, HPEM with the electron Monte-Carlo module and PIC qualitatively matched the experimental results. We will discuss the results from these models and how they illuminate the mechanism of enhanced electron central peak.[4pt] [1] T. Oshita, M. Matsukuma, S.Y. Kang, I. Sawada: The effect of non-uniform RF voltage in a CCP discharge, The 57^th JSAP Spring Meeting 2010[4pt] [2] I. Sawada, K. Matsuzaki, S.Y. Kang, T. Ohshita, M. Kawakami, S. Segawa: 1-st IC-PLANTS, 2008
MHD code using multi graphical processing units: SMAUG+

NASA Astrophysics Data System (ADS)

Gyenge, N.; Griffiths, M. K.; Erdélyi, R.

2018-01-01

This paper introduces the Sheffield Magnetohydrodynamics Algorithm Using GPUs (SMAUG+), an advanced numerical code for solving magnetohydrodynamic (MHD) problems, using multi-GPU systems. Multi-GPU systems facilitate the development of accelerated codes and enable us to investigate larger model sizes and/or more detailed computational domain resolutions. This is a significant advancement over the parent single-GPU MHD code, SMAUG (Griffiths et al., 2015). Here, we demonstrate the validity of the SMAUG + code, describe the parallelisation techniques and investigate performance benchmarks. The initial configuration of the Orszag-Tang vortex simulations are distributed among 4, 16, 64 and 100 GPUs. Furthermore, different simulation box resolutions are applied: 1000 × 1000, 2044 × 2044, 4000 × 4000 and 8000 × 8000 . We also tested the code with the Brio-Wu shock tube simulations with model size of 800 employing up to 10 GPUs. Based on the test results, we observed speed ups and slow downs, depending on the granularity and the communication overhead of certain parallel tasks. The main aim of the code development is to provide massively parallel code without the memory limitation of a single GPU. By using our code, the applied model size could be significantly increased. We demonstrate that we are able to successfully compute numerically valid and large 2D MHD problems.
Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

PubMed

Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

2016-08-05

The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to the density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. The functions to perform large scale geometry optimization and molecular dynamics with DC-DFTB potential energy surface are implemented to the program called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of the DC-DFTB-K program, a single-point energy gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Parallel replica dynamics method for bistable stochastic reaction networks: Simulation and sensitivity analysis

NASA Astrophysics Data System (ADS)

Wang, Ting; Plecháč, Petr

2017-12-01

Stochastic reaction networks that exhibit bistable behavior are common in systems biology, materials science, and catalysis. Sampling of stationary distributions is crucial for understanding and characterizing the long-time dynamics of bistable stochastic dynamical systems. However, simulations are often hindered by the insufficient sampling of rare transitions between the two metastable regions. In this paper, we apply the parallel replica method for a continuous time Markov chain in order to improve sampling of the stationary distribution in bistable stochastic reaction networks. The proposed method uses parallel computing to accelerate the sampling of rare transitions. Furthermore, it can be combined with the path-space information bounds for parametric sensitivity analysis. With the proposed methodology, we study three bistable biological networks: the Schlögl model, the genetic switch network, and the enzymatic futile cycle network. We demonstrate the algorithmic speedup achieved in these numerical benchmarks. More significant acceleration is expected when multi-core or graphics processing unit computer architectures and programming tools such as CUDA are employed.
Optimal mapping of irregular finite element domains to parallel processors

NASA Technical Reports Server (NTRS)

Flower, J.; Otto, S.; Salama, M.

1987-01-01

Mapping the solution domain of n-finite elements into N-subdomains that may be processed in parallel by N-processors is an optimal one if the subdomain decomposition results in a well-balanced workload distribution among the processors. The problem is discussed in the context of irregular finite element domains as an important aspect of the efficient utilization of the capabilities of emerging multiprocessor computers. Finding the optimal mapping is an intractable combinatorial optimization problem, for which a satisfactory approximate solution is obtained here by analogy to a method used in statistical mechanics for simulating the annealing process in solids. The simulated annealing analogy and algorithm are described, and numerical results are given for mapping an irregular two-dimensional finite element domain containing a singularity onto the Hypercube computer.
Design of neurophysiologically motivated structures of time-pulse coded neurons

NASA Astrophysics Data System (ADS)

Krasilenko, Vladimir G.; Nikolsky, Alexander I.; Lazarev, Alexander A.; Lobodzinska, Raisa F.

2009-04-01

The common methodology of biologically motivated concept of building of processing sensors systems with parallel input and picture operands processing and time-pulse coding are described in paper. Advantages of such coding for creation of parallel programmed 2D-array structures for the next generation digital computers which require untraditional numerical systems for processing of analog, digital, hybrid and neuro-fuzzy operands are shown. The optoelectronic time-pulse coded intelligent neural elements (OETPCINE) simulation results and implementation results of a wide set of neuro-fuzzy logic operations are considered. The simulation results confirm engineering advantages, intellectuality, circuit flexibility of OETPCINE for creation of advanced 2D-structures. The developed equivalentor-nonequivalentor neural element has power consumption of 10mW and processing time about 10...100us.

Fast Particle Methods for Multiscale Phenomena Simulations

NASA Technical Reports Server (NTRS)

Koumoutsakos, P.; Wray, A.; Shariff, K.; Pohorille, Andrew

2000-01-01

We are developing particle methods oriented at improving computational modeling capabilities of multiscale physical phenomena in : (i) high Reynolds number unsteady vortical flows, (ii) particle laden and interfacial flows, (iii)molecular dynamics studies of nanoscale droplets and studies of the structure, functions, and evolution of the earliest living cell. The unifying computational approach involves particle methods implemented in parallel computer architectures. The inherent adaptivity, robustness and efficiency of particle methods makes them a multidisciplinary computational tool capable of bridging the gap of micro-scale and continuum flow simulations. Using efficient tree data structures, multipole expansion algorithms, and improved particle-grid interpolation, particle methods allow for simulations using millions of computational elements, making possible the resolution of a wide range of length and time scales of these important physical phenomena.The current challenges in these simulations are in : [i] the proper formulation of particle methods in the molecular and continuous level for the discretization of the governing equations [ii] the resolution of the wide range of time and length scales governing the phenomena under investigation. [iii] the minimization of numerical artifacts that may interfere with the physics of the systems under consideration. [iv] the parallelization of processes such as tree traversal and grid-particle interpolations We are conducting simulations using vortex methods, molecular dynamics and smooth particle hydrodynamics, exploiting their unifying concepts such as : the solution of the N-body problem in parallel computers, highly accurate particle-particle and grid-particle interpolations, parallel FFT's and the formulation of processes such as diffusion in the context of particle methods. This approach enables us to transcend among seemingly unrelated areas of research.
High Performance Parallel Analysis of Coupled Problems for Aircraft Propulsion

NASA Technical Reports Server (NTRS)

Felippa, C. A.; Farhat, C.; Lanteri, S.; Maman, N.; Piperno, S.; Gumaste, U.

1994-01-01

In order to predict the dynamic response of a flexible structure in a fluid flow, the equations of motion of the structure and the fluid must be solved simultaneously. In this paper, we present several partitioned procedures for time-integrating this focus coupled problem and discuss their merits in terms of accuracy, stability, heterogeneous computing, I/O transfers, subcycling, and parallel processing. All theoretical results are derived for a one-dimensional piston model problem with a compressible flow, because the complete three-dimensional aeroelastic problem is difficult to analyze mathematically. However, the insight gained from the analysis of the coupled piston problem and the conclusions drawn from its numerical investigation are confirmed with the numerical simulation of the two-dimensional transient aeroelastic response of a flexible panel in a transonic nonlinear Euler flow regime.
Biospark: scalable analysis of large numerical datasets from biological simulations and experiments using Hadoop and Spark.

PubMed

Klein, Max; Sharma, Rati; Bohrer, Chris H; Avelis, Cameron M; Roberts, Elijah

2017-01-15

Data-parallel programming techniques can dramatically decrease the time needed to analyze large datasets. While these methods have provided significant improvements for sequencing-based analyses, other areas of biological informatics have not yet adopted them. Here, we introduce Biospark, a new framework for performing data-parallel analysis on large numerical datasets. Biospark builds upon the open source Hadoop and Spark projects, bringing domain-specific features for biology. Source code is licensed under the Apache 2.0 open source license and is available at the project website: https://www.assembla.com/spaces/roberts-lab-public/wiki/Biospark CONTACT: eroberts@jhu.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Self-consistent Simulations and Analysis of the Coupled-Bunch Instability for Arbitrary Multi-Bunch Configurations

DOE PAGES

Bassi, Gabriele; Blednykh, Alexei; Smalyuk, Victor

2016-02-24

A novel algorithm for self-consistent simulations of long-range wakefield effects has been developed and applied to the study of both longitudinal and transverse coupled-bunch instabilities at NSLS-II. The algorithm is implemented in the new parallel tracking code space (self-consistent parallel algorithm for collective effects) discussed in the paper. The code is applicable for accurate beam dynamics simulations in cases where both bunch-to-bunch and intrabunch motions need to be taken into account, such as chromatic head-tail effects on the coupled-bunch instability of a beam with a nonuniform filling pattern, or multibunch and single-bunch effects of a passive higher-harmonic cavity. The numericalmore » simulations have been compared with analytical studies. For a beam with an arbitrary filling pattern, intensity-dependent complex frequency shifts have been derived starting from a system of coupled Vlasov equations. The analytical formulas and numerical simulations confirm that the analysis is reduced to the formulation of an eigenvalue problem based on the known formulas of the complex frequency shifts for the uniform filling pattern case.« less
Unsteady numerical simulations of the stability and dynamics of flames

NASA Technical Reports Server (NTRS)

Kailasanath, K.; Patnaik, G.; Oran, E. S.

1995-01-01

In this report we describe the research performed at the Naval Research Laboratory in support of the NASA Microgravity Science and Applications Program over the past three years (from Feb. 1992) with emphasis on the work performed since the last microgravity combustion workshop. The primary objective of our research is to develop an understanding of the differences in the structure, stability, dynamics and extinction of flames in earth gravity and in microgravity environments. Numerical simulations, in which the various physical and chemical processes can be independently controlled, can significantly advance our understanding of these differences. Therefore, our approach is to use detailed time-dependent, multi-dimensional, multispecies numerical models to perform carefully designed computational experiments. The basic issues we have addressed, a general description of the numerical approach, and a summary of the results are described in this report. More detailed discussions are available in the papers published which are referenced herein. Some of the basic issues we have addressed recently are (1) the relative importance of wall losses and gravity on the extinguishment of downward-propagating flames; (2) the role of hydrodynamic instabilities in the formation of cellular flames; (3) effects of gravity on burner-stabilized flames, and (4) effects of radiative losses and chemical-kinetics on flames near flammability limits. We have also expanded our efforts to include hydrocarbon flames in addition to hydrogen flames and to perform simulations in support of other on-going efforts in the microgravity combustion sciences program. Modeling hydrocarbon flames typically involves a larger number of species and a much larger number of reactions when compared to hydrogen. In addition, more complex radiation models may also be needed. In order to efficiently compute such complex flames recent developments in parallel computing have been utilized to develop a state-of-the-art parallel flame code. This is discussed below in some detail after a brief discussion of the numerical models.
Parallel FEM Simulation of Electromechanics in the Heart

NASA Astrophysics Data System (ADS)

Xia, Henian; Wong, Kwai; Zhao, Xiaopeng

2011-11-01

Cardiovascular disease is the leading cause of death in America. Computer simulation of complicated dynamics of the heart could provide valuable quantitative guidance for diagnosis and treatment of heart problems. In this paper, we present an integrated numerical model which encompasses the interaction of cardiac electrophysiology, electromechanics, and mechanoelectrical feedback. The model is solved by finite element method on a Linux cluster and the Cray XT5 supercomputer, kraken. Dynamical influences between the effects of electromechanics coupling and mechanic-electric feedback are shown.
Calculation of heat sink around cracks formed under pulsed heat load

NASA Astrophysics Data System (ADS)

Lazareva, G. G.; Arakcheev, A. S.; Kandaurov, I. V.; Kasatov, A. A.; Kurkuchekov, V. V.; Maksimova, A. G.; Popov, V. A.; Shoshin, A. A.; Snytnikov, A. V.; Trunev, Yu A.; Vasilyev, A. A.; Vyacheslavov, L. N.

2017-10-01

The experimental and numerical simulations of the conditions causing the intensive erosion and expected to be realized infusion reactor were carried out. The influence of relevant pulsed heat loads to tungsten was simulated using a powerful electron beam source in BINP. The mechanical destruction, melting and splashing of the material were observed. The laboratory experiments are accompanied by computational ones. Computational experiment allowed to quantitatively describe the overheating near the cracks, caused by parallel to surface cracks.
Computationally efficient multibody simulations

NASA Technical Reports Server (NTRS)

Ramakrishnan, Jayant; Kumar, Manoj

1994-01-01

Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and implementational standpoint. Order(n) approaches provide a new formulation of the equations of motion eliminating the assembly and numerical inversion of a system mass matrix as required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.
Multilevel summation method for electrostatic force evaluation.

PubMed

Hardy, David J; Wu, Zhe; Phillips, James C; Stone, John E; Skeel, Robert D; Schulten, Klaus

2015-02-10

The multilevel summation method (MSM) offers an efficient algorithm utilizing convolution for evaluating long-range forces arising in molecular dynamics simulations. Shifting the balance of computation and communication, MSM provides key advantages over the ubiquitous particle–mesh Ewald (PME) method, offering better scaling on parallel computers and permitting more modeling flexibility, with support for periodic systems as does PME but also for semiperiodic and nonperiodic systems. The version of MSM available in the simulation program NAMD is described, and its performance and accuracy are compared with the PME method. The accuracy feasible for MSM in practical applications reproduces PME results for water property calculations of density, diffusion constant, dielectric constant, surface tension, radial distribution function, and distance-dependent Kirkwood factor, even though the numerical accuracy of PME is higher than that of MSM. Excellent agreement between MSM and PME is found also for interface potentials of air–water and membrane–water interfaces, where long-range Coulombic interactions are crucial. Applications demonstrate also the suitability of MSM for systems with semiperiodic and nonperiodic boundaries. For this purpose, simulations have been performed with periodic boundaries along directions parallel to a membrane surface but not along the surface normal, yielding membrane pore formation induced by an imbalance of charge across the membrane. Using a similar semiperiodic boundary condition, ion conduction through a graphene nanopore driven by an ion gradient has been simulated. Furthermore, proteins have been simulated inside a single spherical water droplet. Finally, parallel scalability results show the ability of MSM to outperform PME when scaling a system of modest size (less than 100 K atoms) to over a thousand processors, demonstrating the suitability of MSM for large-scale parallel simulation.
Experimental and Numerical Study of Nozzle Plume Impingement on Spacecraft Surfaces

NASA Astrophysics Data System (ADS)

Ketsdever, A. D.; Lilly, T. C.; Gimelshein, S. F.; Alexeenko, A. A.

2005-05-01

An experimental and numerical effort was undertaken to assess the effects of a cold gas (To=300K) nozzle plume impinging on a simulated spacecraft surface. The nozzle flow impingement is investigated experimentally using a nano-Newton resolution force balance and numerically using the Direct Simulation Monte Carlo (DSMC) numerical technique. The Reynolds number range investigated in this study is from 0.5 to approximately 900 using helium and nitrogen propellants. The thrust produced by the nozzle was first assessed on a force balance to provide a baseline case. Subsequently, an aluminum plate was attached to the same force balance at various angles from 0° (parallel to the plume flow) to 10°. For low Reynolds number helium flow, a 16.5% decrease in thrust was measured for the plate at 0° relative to the free plume expansion case. For low Reynolds number nitrogen flow, the difference was found to be 12%. The thrust degradation was found to decrease at higher Reynolds numbers and larger plate angles.
The Oceanographic Multipurpose Software Environment (OMUSE v1.0)

NASA Astrophysics Data System (ADS)

Pelupessy, Inti; van Werkhoven, Ben; van Elteren, Arjen; Viebahn, Jan; Candy, Adam; Portegies Zwart, Simon; Dijkstra, Henk

2017-08-01

In this paper we present the Oceanographic Multipurpose Software Environment (OMUSE). OMUSE aims to provide a homogeneous environment for existing or newly developed numerical ocean simulation codes, simplifying their use and deployment. In this way, numerical experiments that combine ocean models representing different physics or spanning different ranges of physical scales can be easily designed. Rapid development of simulation models is made possible through the creation of simple high-level scripts. The low-level core of the abstraction in OMUSE is designed to deploy these simulations efficiently on heterogeneous high-performance computing resources. Cross-verification of simulation models with different codes and numerical methods is facilitated by the unified interface that OMUSE provides. Reproducibility in numerical experiments is fostered by allowing complex numerical experiments to be expressed in portable scripts that conform to a common OMUSE interface. Here, we present the design of OMUSE as well as the modules and model components currently included, which range from a simple conceptual quasi-geostrophic solver to the global circulation model POP (Parallel Ocean Program). The uniform access to the codes' simulation state and the extensive automation of data transfer and conversion operations aids the implementation of model couplings. We discuss the types of couplings that can be implemented using OMUSE. We also present example applications that demonstrate the straightforward model initialization and the concurrent use of data analysis tools on a running model. We give examples of multiscale and multiphysics simulations by embedding a regional ocean model into a global ocean model and by coupling a surface wave propagation model with a coastal circulation model.
Topography Modeling in Atmospheric Flows Using the Immersed Boundary Method

NASA Technical Reports Server (NTRS)

Ackerman, A. S.; Senocak, I.; Mansour, N. N.; Stevens, D. E.

2004-01-01

Numerical simulation of flow over complex geometry needs accurate and efficient computational methods. Different techniques are available to handle complex geometry. The unstructured grid and multi-block body-fitted grid techniques have been widely adopted for complex geometry in engineering applications. In atmospheric applications, terrain fitted single grid techniques have found common use. Although these are very effective techniques, their implementation, coupling with the flow algorithm, and efficient parallelization of the complete method are more involved than a Cartesian grid method. The grid generation can be tedious and one needs to pay special attention in numerics to handle skewed cells for conservation purposes. Researchers have long sought for alternative methods to ease the effort involved in simulating flow over complex geometry.
Modulated heat pulse propagation and partial transport barriers in chaotic magnetic fields

DOE PAGES

del-Castillo-Negrete, Diego; Blazevski, Daniel

2016-04-01

Direct numerical simulations of the time dependent parallel heat transport equation modeling heat pulses driven by power modulation in 3-dimensional chaotic magnetic fields are presented. The numerical method is based on the Fourier formulation of a Lagrangian-Green's function method that provides an accurate and efficient technique for the solution of the parallel heat transport equation in the presence of harmonic power modulation. The numerical results presented provide conclusive evidence that even in the absence of magnetic flux surfaces, chaotic magnetic field configurations with intermediate levels of stochasticity exhibit transport barriers to modulated heat pulse propagation. In particular, high-order islands and remnants of destroyed flux surfaces (Cantori) act as partial barriers that slow down or even stop the propagation of heat waves at places where the magnetic field connection length exhibits a strong gradient. The key parameter ismore » $$\\gamma=\\sqrt{\\omega/2 \\chi_\\parallel}$$ that determines the length scale, $$1/\\gamma$$, of the heat wave penetration along the magnetic field line. For large perturbation frequencies, $$\\omega \\gg 1$$, or small parallel thermal conductivities, $$\\chi_\\parallel \\ll 1$$, parallel heat transport is strongly damped and the magnetic field partial barriers act as robust barriers where the heat wave amplitude vanishes and its phase speed slows down to a halt. On the other hand, in the limit of small $$\\gamma$$, parallel heat transport is largely unimpeded, global transport is observed and the radial amplitude and phase speed of the heat wave remain finite. Results on modulated heat pulse propagation in fully stochastic fields and across magnetic islands are also presented. In qualitative agreement with recent experiments in LHD and DIII-D, it is shown that the elliptic (O) and hyperbolic (X) points of magnetic islands have a direct impact on the spatio-temporal dependence of the amplitude and the time delay of modulated heat pulses.« less
Percutaneous dilational tracheostomy (PDT) and prevention of blood aspiration with superimposed high-frequency jet ventilation (SHFJV) using the tracheotomy-endoscope (TED): results of numerical and experimental simulations.

PubMed

Nowak, Andreas; Langebach, Robin; Klemm, Eckart; Heller, Winfried

2012-04-01

We describe an innovative computer-based method for the analysis of gas flow using a modified airway management technique to perform percutaneous dilatational tracheotomy (PDT) with a rigid tracheotomy endoscope (TED). A test lung was connected via an artificial trachea with the tracheotomy endoscope and ventilated using superimposed high-frequency jet ventilation. Red packed cells were instilled during the puncture phase of a simulated percutaneous tracheotomy in a trachea model and migration of the red packed cells during breathing was continuously measured. Simultaneously, the calculation of the gas-flow within the endoscope was numerically simulated. In the experimental study, no backflow of blood occurred during the use of superimposed high-frequency jet ventilation (SHFJV) from the trachea into the endoscope nor did any transportation of blood into the lower respiratory tract occur. In parallel, the numerical simulations of the openings of TED show almost positive volume flows. Under the conditions investigated there is no risk of blood aspiration during PDT using the TED and simultaneous ventilation with SHFJV. In addition, no risk of impairment of endoscopic visibility exists through a backflow of blood into the TED. The method of numerical simulation offers excellent insight into the fluid flow even under highly transient conditions like jet ventilation.
A parallel domain decomposition-based implicit method for the Cahn–Hilliard–Cook phase-field equation in 3D

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zheng, Xiang; Yang, Chao; State Key Laboratory of Computer Science, Chinese Academy of Sciences, Beijing 100190

2015-03-15

We present a numerical algorithm for simulating the spinodal decomposition described by the three dimensional Cahn–Hilliard–Cook (CHC) equation, which is a fourth-order stochastic partial differential equation with a noise term. The equation is discretized in space and time based on a fully implicit, cell-centered finite difference scheme, with an adaptive time-stepping strategy designed to accelerate the progress to equilibrium. At each time step, a parallel Newton–Krylov–Schwarz algorithm is used to solve the nonlinear system. We discuss various numerical and computational challenges associated with the method. The numerical scheme is validated by a comparison with an explicit scheme of high accuracymore » (and unreasonably high cost). We present steady state solutions of the CHC equation in two and three dimensions. The effect of the thermal fluctuation on the spinodal decomposition process is studied. We show that the existence of the thermal fluctuation accelerates the spinodal decomposition process and that the final steady morphology is sensitive to the stochastic noise. We also show the evolution of the energies and statistical moments. In terms of the parallel performance, it is found that the implicit domain decomposition approach scales well on supercomputers with a large number of processors.« less
Hard-sphere-like dynamics in highly concentrated alpha-crystallin suspensions

DOE PAGES

Vodnala, Preeti; Karunaratne, Nuwan; Lurio, Laurence; ...

2018-02-02

The dynamics of concentrated suspensions of the eye-lens protein alpha crystallin have been measured using x-ray photon correlation spectroscopy. Measurements were made at wave vectors corresponding to the first peak in the hard-sphere structure factor and volume fractions close to the critical volume fraction for the glass transition. Langevin dynamics simulations were also performed in parallel to the experiments. The intermediate scattering function f(q,τ) could be fit using a stretched exponential decay for both experiments and numerical simulations. The measured relaxation times show good agreement with simulations for polydisperse hard-sphere colloids.
Hard-sphere-like dynamics in highly concentrated alpha-crystallin suspensions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vodnala, Preeti; Karunaratne, Nuwan; Lurio, Laurence

The dynamics of concentrated suspensions of the eye-lens protein alpha crystallin have been measured using x-ray photon correlation spectroscopy. Measurements were made at wave vectors corresponding to the first peak in the hard-sphere structure factor and volume fractions close to the critical volume fraction for the glass transition. Langevin dynamics simulations were also performed in parallel to the experiments. The intermediate scattering function f(q,τ) could be fit using a stretched exponential decay for both experiments and numerical simulations. The measured relaxation times show good agreement with simulations for polydisperse hard-sphere colloids.
Hard-sphere-like dynamics in highly concentrated alpha-crystallin suspensions

NASA Astrophysics Data System (ADS)

Vodnala, Preeti; Karunaratne, Nuwan; Lurio, Laurence; Thurston, George M.; Vega, Michael; Gaillard, Elizabeth; Narayanan, Suresh; Sandy, Alec; Zhang, Qingteng; Dufresne, Eric M.; Foffi, Giuseppe; Grybos, Pawel; Kmon, Piotr; Maj, Piotr; Szczygiel, Robert

2018-02-01

The dynamics of concentrated suspensions of the eye-lens protein alpha crystallin have been measured using x-ray photon correlation spectroscopy. Measurements were made at wave vectors corresponding to the first peak in the hard-sphere structure factor and volume fractions close to the critical volume fraction for the glass transition. Langevin dynamics simulations were also performed in parallel to the experiments. The intermediate scattering function f (q ,τ ) could be fit using a stretched exponential decay for both experiments and numerical simulations. The measured relaxation times show good agreement with simulations for polydisperse hard-sphere colloids.
Massive Exploration of Perturbed Conditions of the Blood Coagulation Cascade through GPU Parallelization

PubMed Central

Cazzaniga, Paolo; Nobile, Marco S.; Besozzi, Daniela; Bellini, Matteo; Mauri, Giancarlo

2014-01-01

The introduction of general-purpose Graphics Processing Units (GPUs) is boosting scientific applications in Bioinformatics, Systems Biology, and Computational Biology. In these fields, the use of high-performance computing solutions is motivated by the need of performing large numbers of in silico analysis to study the behavior of biological systems in different conditions, which necessitate a computing power that usually overtakes the capability of standard desktop computers. In this work we present coagSODA, a CUDA-powered computational tool that was purposely developed for the analysis of a large mechanistic model of the blood coagulation cascade (BCC), defined according to both mass-action kinetics and Hill functions. coagSODA allows the execution of parallel simulations of the dynamics of the BCC by automatically deriving the system of ordinary differential equations and then exploiting the numerical integration algorithm LSODA. We present the biological results achieved with a massive exploration of perturbed conditions of the BCC, carried out with one-dimensional and bi-dimensional parameter sweep analysis, and show that GPU-accelerated parallel simulations of this model can increase the computational performances up to a 181× speedup compared to the corresponding sequential simulations. PMID:25025072
3D multiphysics modeling of superconducting cavities with a massively parallel simulation suite

NASA Astrophysics Data System (ADS)

Kononenko, Oleksiy; Adolphsen, Chris; Li, Zenghai; Ng, Cho-Kuen; Rivetta, Claudio

2017-10-01

Radiofrequency cavities based on superconducting technology are widely used in particle accelerators for various applications. The cavities usually have high quality factors and hence narrow bandwidths, so the field stability is sensitive to detuning from the Lorentz force and external loads, including vibrations and helium pressure variations. If not properly controlled, the detuning can result in a serious performance degradation of a superconducting accelerator, so an understanding of the underlying detuning mechanisms can be very helpful. Recent advances in the simulation suite ace3p have enabled realistic multiphysics characterization of such complex accelerator systems on supercomputers. In this paper, we present the new capabilities in ace3p for large-scale 3D multiphysics modeling of superconducting cavities, in particular, a parallel eigensolver for determining mechanical resonances, a parallel harmonic response solver to calculate the response of a cavity to external vibrations, and a numerical procedure to decompose mechanical loads, such as from the Lorentz force or piezoactuators, into the corresponding mechanical modes. These capabilities have been used to do an extensive rf-mechanical analysis of dressed TESLA-type superconducting cavities. The simulation results and their implications for the operational stability of the Linac Coherent Light Source-II are discussed.

Application of computational physics within Northrop

NASA Technical Reports Server (NTRS)

George, M. W.; Ling, R. T.; Mangus, J. F.; Thompkins, W. T.

1987-01-01

An overview of Northrop programs in computational physics is presented. These programs depend on access to today's supercomputers, such as the Numerical Aerodynamical Simulator (NAS), and future growth on the continuing evolution of computational engines. Descriptions here are concentrated on the following areas: computational fluid dynamics (CFD), computational electromagnetics (CEM), computer architectures, and expert systems. Current efforts and future directions in these areas are presented. The impact of advances in the CFD area is described, and parallels are drawn to analagous developments in CEM. The relationship between advances in these areas and the development of advances (parallel) architectures and expert systems is also presented.
Numerical simulations of the flow with the prescribed displacement of the airfoil and comparison with experiment

NASA Astrophysics Data System (ADS)

Řidký, V.; Šidlof, P.; Vlček, V.

2013-04-01

The work is devoted to comparing measured data with the results of numerical simulations. As mathematical model was used mathematical model whitout turbulence for incompressible flow In the experiment was observed the behavior of designed NACA0015 airfoil in airflow. For the numerical solution was used OpenFOAM computational package, this is open-source software based on finite volume method. In the numerical solution is prescribed displacement of the airfoil, which corresponds to the experiment. The velocity at a point close to the airfoil surface is compared with the experimental data obtained from interferographic measurements of the velocity field. Numerical solution is computed on a 3D mesh composed of about 1 million ortogonal hexahedron elements. The time step is limited by the Courant number. Parallel computations are run on supercomputers of the CIV at Technical University in Prague (HAL and FOX) and on a computer cluster of the Faculty of Mechatronics of Liberec (HYDRA). Run time is fixed at five periods, the results from the fifth periods and average value for all periods are then be compared with experiment.
Numerical solution of the Navier-Stokes equations by discontinuous Galerkin method

NASA Astrophysics Data System (ADS)

Krasnov, M. M.; Kuchugov, P. A.; E Ladonkina, M.; E Lutsky, A.; Tishkin, V. F.

2017-02-01

Detailed unstructured grids and numerical methods of high accuracy are frequently used in the numerical simulation of gasdynamic flows in areas with complex geometry. Galerkin method with discontinuous basis functions or Discontinuous Galerkin Method (DGM) works well in dealing with such problems. This approach offers a number of advantages inherent to both finite-element and finite-difference approximations. Moreover, the present paper shows that DGM schemes can be viewed as Godunov method extension to piecewise-polynomial functions. As is known, DGM involves significant computational complexity, and this brings up the question of ensuring the most effective use of all the computational capacity available. In order to speed up the calculations, operator programming method has been applied while creating the computational module. This approach makes possible compact encoding of mathematical formulas and facilitates the porting of programs to parallel architectures, such as NVidia CUDA and Intel Xeon Phi. With the software package, based on DGM, numerical simulations of supersonic flow past solid bodies has been carried out. The numerical results are in good agreement with the experimental ones.
Supercomputer modeling of flow past hypersonic flight vehicles

NASA Astrophysics Data System (ADS)

Ermakov, M. K.; Kryukov, I. A.

2017-02-01

A software platform for MPI-based parallel solution of the Navier-Stokes (Euler) equations for viscous heat-conductive compressible perfect gas on 3-D unstructured meshes is developed. The discretization and solution of the Navier-Stokes equations are constructed on generalized S.K. Godunov’s method and the second order approximation in space and time. Developed software platform allows to carry out effectively flow past hypersonic flight vehicles simulations for the Mach numbers 6 and higher, and numerical meshes with up to 1 billion numerical cells and with up to 128 processors.
Study on the wind field and pollutant dispersion in street canyons using a stable numerical method.

PubMed

Xia, Ji-Yang; Leung, Dennis Y C

2005-01-01

A stable finite element method for the time dependent Navier-Stokes equations was used for studying the wind flow and pollutant dispersion within street canyons. A three-step fractional method was used to solve the velocity field and the pressure field separately from the governing equations. The Streamline Upwind Petrov-Galerkin (SUPG) method was used to get stable numerical results. Numerical oscillation was minimized and satisfactory results can be obtained for flows at high Reynolds numbers. Simulating the flow over a square cylinder within a wide range of Reynolds numbers validates the wind field model. The Strouhal numbers obtained from the numerical simulation had a good agreement with those obtained from experiment. The wind field model developed in the present study is applied to simulate more complex flow phenomena in street canyons with two different building configurations. The results indicated that the flow at rooftop of buildings might not be assumed parallel to the ground as some numerical modelers did. A counter-clockwise rotating vortex may be found in street canyons with an inflow from the left to right. In addition, increasing building height can increase velocity fluctuations in the street canyon under certain circumstances, which facilitate pollutant dispersion. At high Reynolds numbers, the flow regimes in street canyons do not change with inflow velocity.
Numerical aspects and implementation of a two-layer zonal wall model for LES of compressible turbulent flows on unstructured meshes

NASA Astrophysics Data System (ADS)

Park, George Ilhwan; Moin, Parviz

2016-01-01

This paper focuses on numerical and practical aspects associated with a parallel implementation of a two-layer zonal wall model for large-eddy simulation (LES) of compressible wall-bounded turbulent flows on unstructured meshes. A zonal wall model based on the solution of unsteady three-dimensional Reynolds-averaged Navier-Stokes (RANS) equations on a separate near-wall grid is implemented in an unstructured, cell-centered finite-volume LES solver. The main challenge in its implementation is to couple two parallel, unstructured flow solvers for efficient boundary data communication and simultaneous time integrations. A coupling strategy with good load balancing and low processors underutilization is identified. Face mapping and interpolation procedures at the coupling interface are explained in detail. The method of manufactured solution is used for verifying the correct implementation of solver coupling, and parallel performance of the combined wall-modeled LES (WMLES) solver is investigated. The method has successfully been applied to several attached and separated flows, including a transitional flow over a flat plate and a separated flow over an airfoil at an angle of attack.
Parallel numerical modeling of hybrid-dimensional compositional non-isothermal Darcy flows in fractured porous media

NASA Astrophysics Data System (ADS)

Xing, F.; Masson, R.; Lopez, S.

2017-09-01

This paper introduces a new discrete fracture model accounting for non-isothermal compositional multiphase Darcy flows and complex networks of fractures with intersecting, immersed and non-immersed fractures. The so called hybrid-dimensional model using a 2D model in the fractures coupled with a 3D model in the matrix is first derived rigorously starting from the equi-dimensional matrix fracture model. Then, it is discretized using a fully implicit time integration combined with the Vertex Approximate Gradient (VAG) finite volume scheme which is adapted to polyhedral meshes and anisotropic heterogeneous media. The fully coupled systems are assembled and solved in parallel using the Single Program Multiple Data (SPMD) paradigm with one layer of ghost cells. This strategy allows for a local assembly of the discrete systems. An efficient preconditioner is implemented to solve the linear systems at each time step and each Newton type iteration of the simulation. The numerical efficiency of our approach is assessed on different meshes, fracture networks, and physical settings in terms of parallel scalability, nonlinear convergence and linear convergence.
Multiple piezo-patch energy harvesters integrated to a thin plate with AC-DC conversion: analytical modeling and numerical validation

NASA Astrophysics Data System (ADS)

Aghakhani, Amirreza; Basdogan, Ipek; Erturk, Alper

2016-04-01

Plate-like components are widely used in numerous automotive, marine, and aerospace applications where they can be employed as host structures for vibration based energy harvesting. Piezoelectric patch harvesters can be easily attached to these structures to convert the vibrational energy to the electrical energy. Power output investigations of these harvesters require accurate models for energy harvesting performance evaluation and optimization. Equivalent circuit modeling of the cantilever-based vibration energy harvesters for estimation of electrical response has been proposed in recent years. However, equivalent circuit formulation and analytical modeling of multiple piezo-patch energy harvesters integrated to thin plates including nonlinear circuits has not been studied. In this study, equivalent circuit model of multiple parallel piezoelectric patch harvesters together with a resistive load is built in electronic circuit simulation software SPICE and voltage frequency response functions (FRFs) are validated using the analytical distributedparameter model. Analytical formulation of the piezoelectric patches in parallel configuration for the DC voltage output is derived while the patches are connected to a standard AC-DC circuit. The analytic model is based on the equivalent load impedance approach for piezoelectric capacitance and AC-DC circuit elements. The analytic results are validated numerically via SPICE simulations. Finally, DC power outputs of the harvesters are computed and compared with the peak power amplitudes in the AC output case.
Numerical study of remote detection outside the magnet with travelling wave Magnetic Resonance Imaging at 3T

NASA Astrophysics Data System (ADS)

López, M.; Vázquez, F.; Solís-Nájera, S.; Rodriguez, A. O.

2015-01-01

The use of the travelling wave approach for high magnetic field magnetic resonance imaging has been used recently with very promising results. This approach offer images one with greater field-of-view and a reasonable signal-to-noise ratio using a circular waveguide. This scheme has been proved to be successful at 7 T and 9.4 T with whole-body imager. Images have also been acquired with clinical magnetic resonance imaging systems whose resonant frequencies were 64 MHz and 128 MHz. These results motivated the use of remote detection of the magnetic resonance signal using a parallel-plate waveguide together with 3 T clinical scanners, to acquired human leg images. The cut-off frequency of this waveguide is zero for the principal mode, allowing us to overcome the barrier of transmitting waves at lower frequency than 300 MHz or 7 T for protons. These motivated the study of remote detection outside the actual magnet. We performed electromagnetic field simulations of a parallel-plate waveguide and a phantom. The signal transmission was done at 128 MHz and using a circular surface coil located almost 200 cm away for the magnet isocentre. Numerical simulations demonstrated that the magnetic field of the principal mode propagate inside a waveguide outside the magnet. Numerical results were compared with previous experimental-acquired image data under similar conditions.
A One Dimensional, Time Dependent Inlet/Engine Numerical Simulation for Aircraft Propulsion Systems

NASA Technical Reports Server (NTRS)

Garrard, Doug; Davis, Milt, Jr.; Cole, Gary

1999-01-01

The NASA Lewis Research Center (LeRC) and the Arnold Engineering Development Center (AEDC) have developed a closely coupled computer simulation system that provides a one dimensional, high frequency inlet/engine numerical simulation for aircraft propulsion systems. The simulation system, operating under the LeRC-developed Application Portable Parallel Library (APPL), closely coupled a supersonic inlet with a gas turbine engine. The supersonic inlet was modeled using the Large Perturbation Inlet (LAPIN) computer code, and the gas turbine engine was modeled using the Aerodynamic Turbine Engine Code (ATEC). Both LAPIN and ATEC provide a one dimensional, compressible, time dependent flow solution by solving the one dimensional Euler equations for the conservation of mass, momentum, and energy. Source terms are used to model features such as bleed flows, turbomachinery component characteristics, and inlet subsonic spillage while unstarted. High frequency events, such as compressor surge and inlet unstart, can be simulated with a high degree of fidelity. The simulation system was exercised using a supersonic inlet with sixty percent of the supersonic area contraction occurring internally, and a GE J85-13 turbojet engine.
Parallel traveling-wave MRI: a feasibility study.

PubMed

Pang, Yong; Vigneron, Daniel B; Zhang, Xiaoliang

2012-04-01

Traveling-wave magnetic resonance imaging utilizes far fields of a single-piece patch antenna in the magnet bore to generate radio frequency fields for imaging large-size samples, such as the human body. In this work, the feasibility of applying the "traveling-wave" technique to parallel imaging is studied using microstrip patch antenna arrays with both the numerical analysis and experimental tests. A specific patch array model is built and each array element is a microstrip patch antenna. Bench tests show that decoupling between two adjacent elements is better than -26-dB while matching of each element reaches -36-dB, demonstrating excellent isolation performance and impedance match capability. The sensitivity patterns are simulated and g-factors are calculated for both unloaded and loaded cases. The results on B 1- sensitivity patterns and g-factors demonstrate the feasibility of the traveling-wave parallel imaging. Simulations also suggest that different array configuration such as patch shape, position and orientation leads to different sensitivity patterns and g-factor maps, which provides a way to manipulate B(1) fields and improve the parallel imaging performance. The proposed method is also validated by using 7T MR imaging experiments. Copyright © 2011 Wiley-Liss, Inc.
3D Staggered-Grid Finite-Difference Simulation of Acoustic Waves in Turbulent Moving Media

NASA Astrophysics Data System (ADS)

Symons, N. P.; Aldridge, D. F.; Marlin, D.; Wilson, D. K.; Sullivan, P.; Ostashev, V.

2003-12-01

Acoustic wave propagation in a three-dimensional heterogeneous moving atmosphere is accurately simulated with a numerical algorithm recently developed under the DOD Common High Performance Computing Software Support Initiative (CHSSI). Sound waves within such a dynamic environment are mathematically described by a set of four, coupled, first-order partial differential equations governing small-amplitude fluctuations in pressure and particle velocity. The system is rigorously derived from fundamental principles of continuum mechanics, ideal-fluid constitutive relations, and reasonable assumptions that the ambient atmospheric motion is adiabatic and divergence-free. An explicit, time-domain, finite-difference (FD) numerical scheme is used to solve the system for both pressure and particle velocity wavefields. The atmosphere is characterized by 3D gridded models of sound speed, mass density, and the three components of the wind velocity vector. Dependent variables are stored on staggered spatial and temporal grids, and centered FD operators possess 2nd-order and 4th-order space/time accuracy. Accurate sound wave simulation is achieved provided grid intervals are chosen appropriately. The gridding must be fine enough to reduce numerical dispersion artifacts to an acceptable level and maintain stability. The algorithm is designed to execute on parallel computational platforms by utilizing a spatial domain-decomposition strategy. Currently, the algorithm has been validated on four different computational platforms, and parallel scalability of approximately 85% has been demonstrated. Comparisons with analytic solutions for uniform and vertically stratified wind models indicate that the FD algorithm generates accurate results with either a vanishing pressure or vanishing vertical-particle velocity boundary condition. Simulations are performed using a kinematic turbulence wind profile developed with the quasi-wavelet method. In addition, preliminary results are presented using high-resolution 3D dynamic turbulent flowfields generated by a large-eddy simulation model of a stably stratified planetary boundary layer. Sandia National Laboratories is a operated by Sandia Corporation, a Lockheed Martin Company, for the USDOE under contract 94-AL85000.
Parallel methodology to capture cyclic variability in motored engines

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ameen, Muhsin M.; Yang, Xiaofeng; Kuo, Tang-Wei

2016-07-28

Numerical prediction of of cycle-to-cycle variability (CCV) in SI engines is extremely challenging for two key reasons: (i) high-fidelity methods such as large eddy simulation (LES) are require to accurately capture the in-cylinder turbulent flowfield, and (ii) CCV is experienced over long timescales and hence the simulations need to be performed for hundreds of consecutive cycles. In this study, a new methodology is proposed to dissociate this long time-scale problem into several shorter time-scale problems, which can considerably reduce the computational time without sacrificing the fidelity of the simulations. The strategy is to perform multiple single-cycle simulations in parallel bymore » effectively perturbing the simulation parameters such as the initial and boundary conditions. It is shown that by perturbing the initial velocity field effectively based on the intensity of the in-cylinder turbulence, the mean and variance of the in-cylinder flowfield is captured reasonably well. Adding perturbations in the initial pressure field and the boundary pressure improves the predictions. It is shown that this new approach is able to give accurate predictions of the flowfield statistics in less than one-tenth of time required for the conventional approach of simulating consecutive engine cycles.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Popovich, P.; Carter, T. A.; Friedman, B.

Numerical simulation of plasma turbulence in the Large Plasma Device (LAPD) [W. Gekelman, H. Pfister, Z. Lucky et al., Rev. Sci. Instrum. 62, 2875 (1991)] is presented. The model, implemented in the BOUndary Turbulence code [M. Umansky, X. Xu, B. Dudson et al., Contrib. Plasma Phys. 180, 887 (2009)], includes three-dimensional (3D) collisional fluid equations for plasma density, electron parallel momentum, and current continuity, and also includes the effects of ion-neutral collisions. In nonlinear simulations using measured LAPD density profiles but assuming constant temperature profile for simplicity, self-consistent evolution of instabilities and nonlinearly generated zonal flows results in a saturatedmore » turbulent state. Comparisons of these simulations with measurements in LAPD plasmas reveal good qualitative and reasonable quantitative agreement, in particular in frequency spectrum, spatial correlation, and amplitude probability distribution function of density fluctuations. For comparison with LAPD measurements, the plasma density profile in simulations is maintained either by direct azimuthal averaging on each time step, or by adding particle source/sink function. The inferred source/sink values are consistent with the estimated ionization source and parallel losses in LAPD. These simulations lay the groundwork for more a comprehensive effort to test fluid turbulence simulation against LAPD data.« less
The TeraShake Computational Platform for Large-Scale Earthquake Simulations

NASA Astrophysics Data System (ADS)

Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, wellvalidated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulations phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.
3D Lagrangian VPM: simulations of the near-wake of an actuator disc and horizontal axis wind turbine

NASA Astrophysics Data System (ADS)

Berdowski, T.; Ferreira, C.; Walther, J.

2016-09-01

The application of a 3-dimensional Lagrangian vortex particle method has been assessed for modelling the near-wake of an axisymmetrical actuator disc and 3-bladed horizontal axis wind turbine with prescribed circulation from the MEXICO (Model EXperiments In COntrolled conditions) experiment. The method was developed in the framework of the open- source Parallel Particle-Mesh library for handling the efficient data-parallelism on a CPU (Central Processing Unit) cluster, and utilized a O(N log N)-type fast multipole method for computational acceleration. Simulations with the actuator disc resulted in a wake expansion, velocity deficit profile, and induction factor that showed a close agreement with theoretical, numerical, and experimental results from literature. Also the shear layer expansion was present; the Kelvin-Helmholtz instability in the shear layer was triggered due to the round-off limitations of a numerical method, but this instability was delayed to beyond 1 diameter downstream due to the particle smoothing. Simulations with the 3-bladed turbine demonstrated that a purely 3-dimensional flow representation is challenging to model with particles. The manifestation of local complex flow structures of highly stretched vortices made the simulation unstable, but this was successfully counteracted by the application of a particle strength exchange scheme. The axial and radial velocity profile over the near wake have been compared to that of the original MEXICO experiment, which showed close agreement between results.
Parallelization of Rocket Engine System Software (Press)

NASA Technical Reports Server (NTRS)

Cezzar, Ruknet

1996-01-01

The main goal is to assess parallelization requirements for the Rocket Engine Numeric Simulator (RENS) project which, aside from gathering information on liquid-propelled rocket engines and setting forth requirements, involve a large FORTRAN based package at NASA Lewis Research Center and TDK software developed by SUBR/UWF. The ultimate aim is to develop, test, integrate, and suitably deploy a family of software packages on various aspects and facets of rocket engines using liquid-propellants. At present, all project efforts by the funding agency, NASA Lewis Research Center, and the HBCU participants are disseminated over the internet using world wide web home pages. Considering obviously expensive methods of actual field trails, the benefits of software simulators are potentially enormous. When realized, these benefits will be analogous to those provided by numerous CAD/CAM packages and flight-training simulators. According to the overall task assignments, Hampton University's role is to collect all available software, place them in a common format, assess and evaluate, define interfaces, and provide integration. Most importantly, the HU's mission is to see to it that the real-time performance is assured. This involves source code translations, porting, and distribution. The porting will be done in two phases: First, place all software on Cray XMP platform using FORTRAN. After testing and evaluation on the Cray X-MP, the code will be translated to C + + and ported to the parallel nCUBE platform. At present, we are evaluating another option of distributed processing over local area networks using Sun NFS, Ethernet, TCP/IP. Considering the heterogeneous nature of the present software (e.g., first started as an expert system using LISP machines) which now involve FORTRAN code, the effort is expected to be quite challenging.
A New Numerical Scheme for Cosmic-Ray Transport

NASA Astrophysics Data System (ADS)

Jiang, Yan-Fei; Oh, S. Peng

2018-02-01

Numerical solutions of the cosmic-ray (CR) magnetohydrodynamic equations are dogged by a powerful numerical instability, which arises from the constraint that CRs can only stream down their gradient. The standard cure is to regularize by adding artificial diffusion. Besides introducing ad hoc smoothing, this has a significant negative impact on either computational cost or complexity and parallel scalings. We describe a new numerical algorithm for CR transport, with close parallels to two-moment methods for radiative transfer under the reduced speed of light approximation. It stably and robustly handles CR streaming without any artificial diffusion. It allows for both isotropic and field-aligned CR streaming and diffusion, with arbitrary streaming and diffusion coefficients. CR transport is handled explicitly, while source terms are handled implicitly. The overall time step scales linearly with resolution (even when computing CR diffusion) and has a perfect parallel scaling. It is given by the standard Courant condition with respect to a constant maximum velocity over the entire simulation domain. The computational cost is comparable to that of solving the ideal MHD equation. We demonstrate the accuracy and stability of this new scheme with a wide variety of tests, including anisotropic streaming and diffusion tests, CR-modified shocks, CR-driven blast waves, and CR transport in multiphase media. The new algorithm opens doors to much more ambitious and hitherto intractable calculations of CR physics in galaxies and galaxy clusters. It can also be applied to other physical processes with similar mathematical structure, such as saturated, anisotropic heat conduction.
A nonrecursive order N preconditioned conjugate gradient: Range space formulation of MDOF dynamics

NASA Technical Reports Server (NTRS)

Kurdila, Andrew J.

1990-01-01

While excellent progress has been made in deriving algorithms that are efficient for certain combinations of system topologies and concurrent multiprocessing hardware, several issues must be resolved to incorporate transient simulation in the control design process for large space structures. Specifically, strategies must be developed that are applicable to systems with numerous degrees of freedom. In addition, the algorithms must have a growth potential in that they must also be amenable to implementation on forthcoming parallel system architectures. For mechanical system simulation, this fact implies that algorithms are required that induce parallelism on a fine scale, suitable for the emerging class of highly parallel processors; and transient simulation methods must be automatically load balancing for a wider collection of system topologies and hardware configurations. These problems are addressed by employing a combination range space/preconditioned conjugate gradient formulation of multi-degree-of-freedom dynamics. The method described has several advantages. In a sequential computing environment, the method has the features that: by employing regular ordering of the system connectivity graph, an extremely efficient preconditioner can be derived from the 'range space metric', as opposed to the system coefficient matrix; because of the effectiveness of the preconditioner, preliminary studies indicate that the method can achieve performance rates that depend linearly upon the number of substructures, hence the title 'Order N'; and the method is non-assembling. Furthermore, the approach is promising as a potential parallel processing algorithm in that the method exhibits a fine parallel granularity suitable for a wide collection of combinations of physical system topologies/computer architectures; and the method is easily load balanced among processors, and does not rely upon system topology to induce parallelism.
Global kinetic simulations of neoclassical toroidal viscosity in low-collisional perturbed tokamak plasmas

NASA Astrophysics Data System (ADS)

Matsuoka, Seikichi; Idomura, Yasuhiro; Satake, Shinsuke

2017-10-01

The neoclassical toroidal viscosity (NTV) caused by a non-axisymmetric magnetic field perturbation is numerically studied using two global kinetic simulations with different numerical approaches. Both simulations reproduce similar collisionality ( νb*) dependencies over wide νb * ranges. It is demonstrated that resonant structures in the velocity space predicted by the conventional superbanana-plateau theory exist in the small banana width limit, while the resonances diminish when the banana width becomes large. It is also found that fine scale structures are generated in the velocity space as νb* decreases in the large banana width simulations, leading to the νb* -dependency of the NTV. From the analyses of the particle orbit, it is found that the finite k∥ mode structure along the bounce motion appears owing to the finite orbit width, and it suffers from bounce phase mixing, suggesting the generation of the fine scale structures by the similar mechanism as the parallel phase mixing of passing particles.

Propagation of acoustic shock waves between parallel rigid boundaries and into shadow zones

DOE Office of Scientific and Technical Information (OSTI.GOV)

Desjouy, C., E-mail: cyril.desjouy@gmail.com; Ollivier, S.; Dragna, D.

2015-10-28

The study of acoustic shock propagation in complex environments is of great interest for urban acoustics, but also for source localization, an underlying problematic in military applications. To give a better understanding of the phenomenon taking place during the propagation of acoustic shocks, laboratory-scale experiments and numerical simulations were performed to study the propagation of weak shock waves between parallel rigid boundaries, and into shadow zones created by corners. In particular, this work focuses on the study of the local interactions taking place between incident, reflected, and diffracted waves according to the geometry in both regular or irregular – alsomore » called Von Neumann – regimes of reflection. In this latter case, an irregular reflection can lead to the formation of a Mach stem that can modify the spatial distribution of the acoustic pressure. Short duration acoustic shock waves were produced by a 20 kilovolts electric spark source and a schlieren optical method was used to visualize the incident shockfront and the reflection/diffraction patterns. Experimental results are compared to numerical simulations based on the high-order finite difference solution of the two dimensional Navier-Stokes equations.« less
A parallel program for numerical simulation of discrete fracture network and groundwater flow

NASA Astrophysics Data System (ADS)

Huang, Ting-Wei; Liou, Tai-Sheng; Kalatehjari, Roohollah

2017-04-01

The ability of modeling fluid flow in Discrete Fracture Network (DFN) is critical to various applications such as exploration of reserves in geothermal and petroleum reservoirs, geological sequestration of carbon dioxide and final disposal of spent nuclear fuels. Although several commerical or acdametic DFN flow simulators are already available (e.g., FracMan and DFNWORKS), challenges in terms of computational efficiency and three-dimensional visualization still remain, which therefore motivates this study for developing a new DFN and flow simulator. A new DFN and flow simulator, DFNbox, was written in C++ under a cross-platform software development framework provided by Qt. DFNBox integrates the following capabilities into a user-friendly drop-down menu interface: DFN simulation and clipping, 3D mesh generation, fracture data analysis, connectivity analysis, flow path analysis and steady-state grounwater flow simulation. All three-dimensional visualization graphics were developed using the free OpenGL API. Similar to other DFN simulators, fractures are conceptualized as random point process in space, with stochastic characteristics represented by orientation, size, transmissivity and aperture. Fracture meshing was implemented by Delaunay triangulation for visualization but not flow simulation purposes. Boundary element method was used for flow simulations such that only unknown head or flux along exterior and interection bounaries are needed for solving the flow field in the DFN. Parallel compuation concept was taken into account in developing DFNbox for calculations that such concept is possible. For example, the time-consuming seqential code for fracture clipping calculations has been completely replaced by a highly efficient parallel one. This can greatly enhance compuational efficiency especially on multi-thread platforms. Furthermore, DFNbox have been successfully tested in Windows and Linux systems with equally-well performance.
Parallel software for lattice N = 4 supersymmetric Yang-Mills theory

NASA Astrophysics Data System (ADS)

Schaich, David; DeGrand, Thomas

2015-05-01

We present new parallel software, SUSY LATTICE, for lattice studies of four-dimensional N = 4 supersymmetric Yang-Mills theory with gauge group SU(N). The lattice action is constructed to exactly preserve a single supersymmetry charge at non-zero lattice spacing, up to additional potential terms included to stabilize numerical simulations. The software evolved from the MILC code for lattice QCD, and retains a similar large-scale framework despite the different target theory. Many routines are adapted from an existing serial code (Catterall and Joseph, 2012), which SUSY LATTICE supersedes. This paper provides an overview of the new parallel software, summarizing the lattice system, describing the applications that are currently provided and explaining their basic workflow for non-experts in lattice gauge theory. We discuss the parallel performance of the code, and highlight some notable aspects of the documentation for those interested in contributing to its future development.
Evaluation of a two-dimensional numerical model for air quality simulation in a street canyon

NASA Astrophysics Data System (ADS)

Okamoto, Shin `Ichi; Lin, Fu Chi; Yamada, Hiroaki; Shiozawa, Kiyoshige

For many urban areas, the most severe air pollution caused by automobile emissions appears along a road surrounded by tall buildings: the so=called street canyon. A practical two-dimensional numerical model has been developed to be applied to this kind of road structure. This model contains two submodels: a wind-field model and a diffusion model based on a Monte Carlo particle scheme. In order to evaluate the predictive performance of this model, an air quality simulation was carried out at three trunk roads in the Tokyo metropolitan area: Nishi-Shimbashi, Aoyama and Kanda-Nishikicho (using SF 6 as a tracer and NO x measurement). Since this model has two-dimensional properties and cannot be used for the parallel wind condition, the perpendicular wind condition was selected for the simulation. The correlation coefficients for the SF 6 and NO x data in Aoyama were 0.67 and 0.62, respectively. When predictive performance of this model is compared with other models, this model is comparable to the SRI model, and superior to the APPS three-dimensional numerical model.
TMVOC-MP: a parallel numerical simulator for Three-PhaseNon-isothermal Flows of Multicomponent Hydrocarbon Mixtures inporous/fractured media

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Keni; Yamamoto, Hajime; Pruess, Karsten

2008-02-15

TMVOC-MP is a massively parallel version of the TMVOC code (Pruess and Battistelli, 2002), a numerical simulator for three-phase non-isothermal flow of water, gas, and a multicomponent mixture of volatile organic chemicals (VOCs) in multidimensional heterogeneous porous/fractured media. TMVOC-MP was developed by introducing massively parallel computing techniques into TMVOC. It retains the physical process model of TMVOC, designed for applications to contamination problems that involve hydrocarbon fuels or organic solvents in saturated and unsaturated zones. TMVOC-MP can model contaminant behavior under 'natural' environmental conditions, as well as for engineered systems, such as soil vapor extraction, groundwater pumping, or steam-assisted sourcemore » remediation. With its sophisticated parallel computing techniques, TMVOC-MP can handle much larger problems than TMVOC, and can be much more computationally efficient. TMVOC-MP models multiphase fluid systems containing variable proportions of water, non-condensible gases (NCGs), and water-soluble volatile organic chemicals (VOCs). The user can specify the number and nature of NCGs and VOCs. There are no intrinsic limitations to the number of NCGs or VOCs, although the arrays for fluid components are currently dimensioned as 20, accommodating water plus 19 components that may be either NCGs or VOCs. Among them, NCG arrays are dimensioned as 10. The user may select NCGs from a data bank provided in the software. The currently available choices include O{sub 2}, N{sub 2}, CO{sub 2}, CH{sub 4}, ethane, ethylene, acetylene, and air (a pseudo-component treated with properties averaged from N{sub 2} and O{sub 2}). Thermophysical property data of VOCs can be selected from a chemical data bank, included with TMVOC-MP, that provides parameters for 26 commonly encountered chemicals. Users also can input their own data for other fluids. The fluid components may partition (volatilize and/or dissolve) among gas, aqueous, and NAPL phases. Any combination of the three phases may present, and phases may appear and disappear in the course of a simulation. In addition, VOCs may be adsorbed by the porous medium, and may biodegrade according to a simple half-life model. Detailed discussion of physical processes, assumptions, and fluid properties used in TMVOC-MP can be found in the TMVOC user's guide (Pruess and Battistelli, 2002). TMVOC-MP was developed based on the parallel framework of the TOUGH2-MP code (Zhang et al. 2001, Wu et al. 2002). It uses the MPI (Message Passing Forum, 1994) for parallel implementation. A domain decomposition approach is adopted for the parallelization. The code partitions a simulation domain, defined by an unstructured grid, using partitioning algorithm from the METIS software package (Karypsis and Kumar, 1998). In parallel simulation, each processor is in charge of one part of the simulation domain for assembling mass and energy balance equations, solving linear equation systems, updating thermophysical properties, and performing other local computations. The local linear-equation systems are solved in parallel by multiple processors with the Aztec linear solver package (Tuminaro et al., 1999). Although each processor solves the linearized equations of subdomains independently, the entire linear equation system is solved together by all processors collaboratively via communication between neighboring processors during each iteration. Detailed discussion of the prototype of the data-exchange scheme can be found in Elmroth et al. (2001). In addition, FORTRAN 90 features are introduced to TMVOC-MP, such as dynamic memory allocation, array operation, matrix manipulation, and replacing 'common blocks' (used in the original TMVOC) with modules. All new subroutines are written in FORTRAN 90. Program units imported from the original TMVOC remain in standard FORTRAN 77. This report provides a quick starting guide for using the TMVOC-MP program. We suppose that the users have basic knowledge of using the original TMVOC code. The users can find the detailed technical description of the physical processes modeled, and the mathematical and numerical methods in the user's guide for TMVOC (Pruess and Battistelli, 2002).« less
Visualization of Octree Adaptive Mesh Refinement (AMR) in Astrophysical Simulations

NASA Astrophysics Data System (ADS)

Labadens, M.; Chapon, D.; Pomaréde, D.; Teyssier, R.

2012-09-01

Computer simulations are important in current cosmological research. Those simulations run in parallel on thousands of processors, and produce huge amount of data. Adaptive mesh refinement is used to reduce the computing cost while keeping good numerical accuracy in regions of interest. RAMSES is a cosmological code developed by the Commissariat à l'énergie atomique et aux énergies alternatives (English: Atomic Energy and Alternative Energies Commission) which uses Octree adaptive mesh refinement. Compared to grid based AMR, the Octree AMR has the advantage to fit very precisely the adaptive resolution of the grid to the local problem complexity. However, this specific octree data type need some specific software to be visualized, as generic visualization tools works on Cartesian grid data type. This is why the PYMSES software has been also developed by our team. It relies on the python scripting language to ensure a modular and easy access to explore those specific data. In order to take advantage of the High Performance Computer which runs the RAMSES simulation, it also uses MPI and multiprocessing to run some parallel code. We would like to present with more details our PYMSES software with some performance benchmarks. PYMSES has currently two visualization techniques which work directly on the AMR. The first one is a splatting technique, and the second one is a custom ray tracing technique. Both have their own advantages and drawbacks. We have also compared two parallel programming techniques with the python multiprocessing library versus the use of MPI run. The load balancing strategy has to be smartly defined in order to achieve a good speed up in our computation. Results obtained with this software are illustrated in the context of a massive, 9000-processor parallel simulation of a Milky Way-like galaxy.
A parallel computing engine for a class of time critical processes.

PubMed

Nabhan, T M; Zomaya, A Y

1997-01-01

This paper focuses on the efficient parallel implementation of systems of numerically intensive nature over loosely coupled multiprocessor architectures. These analytical models are of significant importance to many real-time systems that have to meet severe time constants. A parallel computing engine (PCE) has been developed in this work for the efficient simplification and the near optimal scheduling of numerical models over the different cooperating processors of the parallel computer. First, the analytical system is efficiently coded in its general form. The model is then simplified by using any available information (e.g., constant parameters). A task graph representing the interconnections among the different components (or equations) is generated. The graph can then be compressed to control the computation/communication requirements. The task scheduler employs a graph-based iterative scheme, based on the simulated annealing algorithm, to map the vertices of the task graph onto a Multiple-Instruction-stream Multiple-Data-stream (MIMD) type of architecture. The algorithm uses a nonanalytical cost function that properly considers the computation capability of the processors, the network topology, the communication time, and congestion possibilities. Moreover, the proposed technique is simple, flexible, and computationally viable. The efficiency of the algorithm is demonstrated by two case studies with good results.
Rigorous vector wave propagation for arbitrary flat media

NASA Astrophysics Data System (ADS)

Bos, Steven P.; Haffert, Sebastiaan Y.; Keller, Christoph U.

2017-08-01

Precise modelling of the (off-axis) point spread function (PSF) to identify geometrical and polarization aberrations is important for many optical systems. In order to characterise the PSF of the system in all Stokes parameters, an end-to-end simulation of the system has to be performed in which Maxwell's equations are rigorously solved. We present the first results of a python code that we are developing to perform multiscale end-to-end wave propagation simulations that include all relevant physics. Currently we can handle plane-parallel near- and far-field vector diffraction effects of propagating waves in homogeneous isotropic and anisotropic materials, refraction and reflection of flat parallel surfaces, interference effects in thin films and unpolarized light. We show that the code has a numerical precision on the order of 10-16 for non-absorbing isotropic and anisotropic materials. For absorbing materials the precision is on the order of 10-8. The capabilities of the code are demonstrated by simulating a converging beam reflecting from a flat aluminium mirror at normal incidence.
A signature of anisotropic cosmic-ray transport in the gamma-ray sky

NASA Astrophysics Data System (ADS)

Cerri, Silvio Sergio; Gaggero, Daniele; Vittino, Andrea; Evoli, Carmelo; Grasso, Dario

2017-10-01

A crucial process in Galactic cosmic-ray (CR) transport is the spatial diffusion due to the interaction with the interstellar turbulent magnetic field. Usually, CR diffusion is assumed to be uniform and isotropic all across the Galaxy. However, this picture is clearly inaccurate: several data-driven and theoretical arguments, as well as dedicated numerical simulations, show that diffusion exhibits highly anisotropic properties with respect to the direction of a background (ordered) magnetic field (i.e., parallel or perpendicular to it). In this paper we focus on a recently discovered anomaly in the hadronic CR spectrum inferred by the Fermi-LAT gamma-ray data at different positions in the Galaxy, i.e. the progressive hardening of the proton slope at low Galactocentric radii. We propose the idea that this feature can be interpreted as a signature of anisotropic diffusion in the complex Galactic magnetic field: in particular, the harder slope in the inner Galaxy is due, in our scenario, to the parallel diffusive escape along the poloidal component of the large-scale, regular, magnetic field. We implement this idea in a numerical framework, based on the DRAGON code, and perform detailed numerical tests on the accuracy of our setup. We discuss how the effect proposed depends on the relevant free parameters involved. Based on low-energy extrapolation of the few focused numerical simulations aimed at determining the scalings of the anisotropic diffusion coefficients, we finally present a set of plausible models that reproduce the behavior of the CR proton slopes inferred by gamma-ray data.
A signature of anisotropic cosmic-ray transport in the gamma-ray sky

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cerri, Silvio Sergio; Grasso, Dario; Gaggero, Daniele

A crucial process in Galactic cosmic-ray (CR) transport is the spatial diffusion due to the interaction with the interstellar turbulent magnetic field. Usually, CR diffusion is assumed to be uniform and isotropic all across the Galaxy. However, this picture is clearly inaccurate: several data-driven and theoretical arguments, as well as dedicated numerical simulations, show that diffusion exhibits highly anisotropic properties with respect to the direction of a background (ordered) magnetic field (i.e., parallel or perpendicular to it). In this paper we focus on a recently discovered anomaly in the hadronic CR spectrum inferred by the Fermi-LAT gamma-ray data at differentmore » positions in the Galaxy, i.e. the progressive hardening of the proton slope at low Galactocentric radii. We propose the idea that this feature can be interpreted as a signature of anisotropic diffusion in the complex Galactic magnetic field: in particular, the harder slope in the inner Galaxy is due, in our scenario, to the parallel diffusive escape along the poloidal component of the large-scale, regular, magnetic field. We implement this idea in a numerical framework, based on the DRAGON code, and perform detailed numerical tests on the accuracy of our setup. We discuss how the effect proposed depends on the relevant free parameters involved. Based on low-energy extrapolation of the few focused numerical simulations aimed at determining the scalings of the anisotropic diffusion coefficients, we finally present a set of plausible models that reproduce the behavior of the CR proton slopes inferred by gamma-ray data.« less
FOLDER: A numerical tool to simulate the development of structures in layered media

NASA Astrophysics Data System (ADS)

Adamuszek, Marta; Dabrowski, Marcin; Schmid, Daniel W.

2015-04-01

FOLDER is a numerical toolbox for modelling deformation in layered media during layer parallel shortening or extension in two dimensions. FOLDER builds on MILAMIN [1], a finite element method based mechanical solver, with a range of utilities included from the MUTILS package [2]. Numerical mesh is generated using the Triangle software [3]. The toolbox includes features that allow for: 1) designing complex structures such as multi-layer stacks, 2) accurately simulating large-strain deformation of linear and non-linear viscous materials, 3) post-processing of various physical fields such as velocity (total and perturbing), rate of deformation, finite strain, stress, deviatoric stress, pressure, apparent viscosity. FOLDER is designed to ensure maximum flexibility to configure model geometry, define material parameters, specify range of numerical parameters in simulations and choose the plotting options. FOLDER is an open source MATLAB application and comes with a user friendly graphical interface. The toolbox additionally comprises an educational application that illustrates various analytical solutions of growth rates calculated for the cases of folding and necking of a single layer with interfaces perturbed with a single sinusoidal waveform. We further derive two novel analytical expressions for the growth rate in the cases of folding and necking of a linear viscous layer embedded in a linear viscous medium of a finite thickness. We use FOLDER to test the accuracy of single-layer folding simulations using various 1) spatial and temporal resolutions, 2) time integration schemes, and 3) iterative algorithms for non-linear materials. The accuracy of the numerical results is quantified by: 1) comparing them to analytical solution, if available, or 2) running convergence tests. As a result, we provide a map of the most optimal choice of grid size, time step, and number of iterations to keep the results of the numerical simulations below a given error for a given time integration scheme. We also demonstrate that Euler and Leapfrog time integration schemes are not recommended for any practical use. Finally, the capabilities of the toolbox are illustrated based on two examples: 1) shortening of a synthetic multi-layer sequence and 2) extension of a folded quartz vein embedded in phyllite from Sprague Upper Reservoir (example discussed by Sherwin and Chapple [4]). The latter example demonstrates that FOLDER can be successfully used for reverse modelling and mechanical restoration. [1] Dabrowski, M., Krotkiewski, M., and Schmid, D. W., 2008, MILAMIN: MATLAB-based finite element method solver for large problems. Geochemistry Geophysics Geosystems, vol. 9. [2] Krotkiewski, M. and Dabrowski M., 2010 Parallel symmetric sparse matrix-vector product on scalar multi-core cpus. Parallel Computing, 36(4):181-198 [3] Shewchuk, J. R., 1996, Triangle: Engineering a 2D Quality Mesh Generator and Delaunay Triangulator, In: Applied Computational Geometry: Towards Geometric Engineering'' (Ming C. Lin and Dinesh Manocha, editors), Vol. 1148 of Lecture Notes in Computer Science, pp. 203-222, Springer-Verlag, Berlin [4] Sherwin, J.A., Chapple, W.M., 1968. Wavelengths of single layer folds - a Comparison between theory and Observation. American Journal of Science 266 (3), p. 167-179
Parallel Adjective High-Order CFD Simulations Characterizing SOFIA Cavity Acoustics

NASA Technical Reports Server (NTRS)

Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak

2016-01-01

This paper presents large-scale MPI-parallel computational uid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft fuselage of a Boeing 747SP. These simulations focus on how the unsteady ow eld inside and over the cavity interferes with the optical path and mounting structure of the telescope. A temporally fourth-order accurate Runge-Kutta, and spatially fth-order accurate WENO- 5Z scheme was used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh re nement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32k CPU cores and 4 billion compu- tational cells shows excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregular numerical cost associated with blocks con- taining boundaries. Limits to scaling beyond 32k cores are identi ed, and targeted code optimizations are discussed.
Parallel Adaptive High-Order CFD Simulations Characterizing SOFIA Cavitiy Acoustics

NASA Technical Reports Server (NTRS)

Barad, Michael F.; Brehm, Christoph; Kiris, Cetin C.; Biswas, Rupak

2015-01-01

This paper presents large-scale MPI-parallel computational uid dynamics simulations for the Stratospheric Observatory for Infrared Astronomy (SOFIA). SOFIA is an airborne, 2.5-meter infrared telescope mounted in an open cavity in the aft fuselage of a Boeing 747SP. These simulations focus on how the unsteady ow eld inside and over the cavity interferes with the optical path and mounting structure of the telescope. A tempo- rally fourth-order accurate Runge-Kutta, and a spatially fth-order accurate WENO-5Z scheme were used to perform implicit large eddy simulations. An immersed boundary method provides automated gridding for complex geometries and natural coupling to a block-structured Cartesian adaptive mesh re nement framework. Strong scaling studies using NASA's Pleiades supercomputer with up to 32k CPU cores and 4 billion compu- tational cells shows excellent scaling. Dynamic load balancing based on execution time on individual AMR blocks addresses irregular numerical cost associated with blocks con- taining boundaries. Limits to scaling beyond 32k cores are identi ed, and targeted code optimizations are discussed.
Parallel implementation of a Lagrangian-based model on an adaptive mesh in C++: Application to sea-ice

NASA Astrophysics Data System (ADS)

Samaké, Abdoulaye; Rampal, Pierre; Bouillon, Sylvain; Ólason, Einar

2017-12-01

We present a parallel implementation framework for a new dynamic/thermodynamic sea-ice model, called neXtSIM, based on the Elasto-Brittle rheology and using an adaptive mesh. The spatial discretisation of the model is done using the finite-element method. The temporal discretisation is semi-implicit and the advection is achieved using either a pure Lagrangian scheme or an Arbitrary Lagrangian Eulerian scheme (ALE). The parallel implementation presented here focuses on the distributed-memory approach using the message-passing library MPI. The efficiency and the scalability of the parallel algorithms are illustrated by the numerical experiments performed using up to 500 processor cores of a cluster computing system. The performance obtained by the proposed parallel implementation of the neXtSIM code is shown being sufficient to perform simulations for state-of-the-art sea ice forecasting and geophysical process studies over geographical domain of several millions squared kilometers like the Arctic region.
Efficient Machine Learning Approach for Optimizing Scientific Computing Applications on Emerging HPC Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arumugam, Kamesh

Efficient parallel implementations of scientific applications on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. This requires - exploiting the data parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, load balancing, and efficient memory transfer between different devices. It is relatively easy to meet these requirements for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of these applications employ irregular algorithms which exhibit data-dependent control-ow and irregular memory accesses. Furthermore,more » these applications are often iterative with dependency between steps, and thus making it hard to parallelize across steps. As a result, parallelism in these applications is often limited to a single step. Numerical simulation of charged particles beam dynamics is one such application where the distribution of work and memory access pattern at each time step is irregular. Applications with these properties tend to present significant branch and memory divergence, load imbalance between different processor cores, and poor compute and memory utilization. Prior research on parallelizing such irregular applications have been focused around optimizing the irregular, data-dependent memory accesses and control-ow during a single step of the application independent of the other steps, with the assumption that these patterns are completely unpredictable. We observed that the structure of computation leading to control-ow divergence and irregular memory accesses in one step is similar to that in the next step. It is possible to predict this structure in the current step by observing the computation structure of previous steps. In this dissertation, we present novel machine learning based optimization techniques to address the parallel implementation challenges of such irregular applications on different HPC architectures. In particular, we use supervised learning to predict the computation structure and use it to address the control-ow and memory access irregularities in the parallel implementation of such applications on GPUs, Xeon Phis, and heterogeneous architectures composed of multi-core CPUs with GPUs or Xeon Phis. We use numerical simulation of charged particles beam dynamics simulation as a motivating example throughout the dissertation to present our new approach, though they should be equally applicable to a wide range of irregular applications. The machine learning approach presented here use predictive analytics and forecasting techniques to adaptively model and track the irregular memory access pattern at each time step of the simulation to anticipate the future memory access pattern. Access pattern forecasts can then be used to formulate optimization decisions during application execution which improves the performance of the application at a future time step based on the observations from earlier time steps. In heterogeneous architectures, forecasts can also be used to improve the memory performance and resource utilization of all the processing units to deliver a good aggregate performance. We used these optimization techniques and anticipation strategy to design a cache-aware, memory efficient parallel algorithm to address the irregularities in the parallel implementation of charged particles beam dynamics simulation on different HPC architectures. Experimental result using a diverse mix of HPC architectures shows that our approach in using anticipation strategy is effective in maximizing data reuse, ensuring workload balance, minimizing branch and memory divergence, and in improving resource utilization.« less
Applying Parallel Adaptive Methods with GeoFEST/PYRAMID to Simulate Earth Surface Crustal Dynamics

NASA Technical Reports Server (NTRS)

Norton, Charles D.; Lyzenga, Greg; Parker, Jay; Glasscoe, Margaret; Donnellan, Andrea; Li, Peggy

2006-01-01

This viewgraph presentation reviews the use Adaptive Mesh Refinement (AMR) in simulating the Crustal Dynamics of Earth's Surface. AMR simultaneously improves solution quality, time to solution, and computer memory requirements when compared to generating/running on a globally fine mesh. The use of AMR in simulating the dynamics of the Earth's Surface is spurred by future proposed NASA missions, such as InSAR for Earth surface deformation and other measurements. These missions will require support for large-scale adaptive numerical methods using AMR to model observations. AMR was chosen because it has been successful in computation fluid dynamics for predictive simulation of complex flows around complex structures.
Helical vortices generated by flapping wings of bumblebees

NASA Astrophysics Data System (ADS)

Engels, Thomas; Kolomenskiy, Dmitry; Schneider, Kai; Farge, Marie; Lehmann, Fritz-Olaf; Sesterhenn, Jörn

2018-02-01

High resolution direct numerical simulations of rotating and flapping bumblebee wings are presented and their aerodynamics is studied focusing on the role of leading edge vortices and the associated helicity production. We first study the flow generated by only one rotating bumblebee wing in circular motion with 45◦ angle of attack. We then consider a model bumblebee flying in a numerical wind tunnel, which is tethered and has rigid wings flapping with a prescribed generic motion. The inflow condition of the wind varies from laminar to strongly turbulent regimes. Massively parallel simulations show that inflow turbulence does not significantly alter the wings’ leading edge vortex, which enhances lift production. Finally, we focus on studying the helicity of the generated vortices and analyze their contribution at different scales using orthogonal wavelets.
High Performance Parallel Computational Nanotechnology

NASA Technical Reports Server (NTRS)

Saini, Subhash; Craw, James M. (Technical Monitor)

1995-01-01

At a recent press conference, NASA Administrator Dan Goldin encouraged NASA Ames Research Center to take a lead role in promoting research and development of advanced, high-performance computer technology, including nanotechnology. Manufacturers of leading-edge microprocessors currently perform large-scale simulations in the design and verification of semiconductor devices and microprocessors. Recently, the need for this intensive simulation and modeling analysis has greatly increased, due in part to the ever-increasing complexity of these devices, as well as the lessons of experiences such as the Pentium fiasco. Simulation, modeling, testing, and validation will be even more important for designing molecular computers because of the complex specification of millions of atoms, thousands of assembly steps, as well as the simulation and modeling needed to ensure reliable, robust and efficient fabrication of the molecular devices. The software for this capacity does not exist today, but it can be extrapolated from the software currently used in molecular modeling for other applications: semi-empirical methods, ab initio methods, self-consistent field methods, Hartree-Fock methods, molecular mechanics; and simulation methods for diamondoid structures. In as much as it seems clear that the application of such methods in nanotechnology will require powerful, highly powerful systems, this talk will discuss techniques and issues for performing these types of computations on parallel systems. We will describe system design issues (memory, I/O, mass storage, operating system requirements, special user interface issues, interconnects, bandwidths, and programming languages) involved in parallel methods for scalable classical, semiclassical, quantum, molecular mechanics, and continuum models; molecular nanotechnology computer-aided designs (NanoCAD) techniques; visualization using virtual reality techniques of structural models and assembly sequences; software required to control mini robotic manipulators for positional control; scalable numerical algorithms for reliability, verifications and testability. There appears no fundamental obstacle to simulating molecular compilers and molecular computers on high performance parallel computers, just as the Boeing 777 was simulated on a computer before manufacturing it.
Parameter Estimation of Fractional-Order Chaotic Systems by Using Quantum Parallel Particle Swarm Optimization Algorithm

PubMed Central

Huang, Yu; Guo, Feng; Li, Yongling; Liu, Yufeng

2015-01-01

Parameter estimation for fractional-order chaotic systems is an important issue in fractional-order chaotic control and synchronization and could be essentially formulated as a multidimensional optimization problem. A novel algorithm called quantum parallel particle swarm optimization (QPPSO) is proposed to solve the parameter estimation for fractional-order chaotic systems. The parallel characteristic of quantum computing is used in QPPSO. This characteristic increases the calculation of each generation exponentially. The behavior of particles in quantum space is restrained by the quantum evolution equation, which consists of the current rotation angle, individual optimal quantum rotation angle, and global optimal quantum rotation angle. Numerical simulation based on several typical fractional-order systems and comparisons with some typical existing algorithms show the effectiveness and efficiency of the proposed algorithm. PMID:25603158
Particle in cell/Monte Carlo collision analysis of the problem of identification of impurities in the gas by the plasma electron spectroscopy method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kusoglu Sarikaya, C.; Rafatov, I., E-mail: rafatov@metu.edu.tr; Kudryavtsev, A. A.

2016-06-15

The work deals with the Particle in Cell/Monte Carlo Collision (PIC/MCC) analysis of the problem of detection and identification of impurities in the nonlocal plasma of gas discharge using the Plasma Electron Spectroscopy (PLES) method. For this purpose, 1d3v PIC/MCC code for numerical simulation of glow discharge with nonlocal electron energy distribution function is developed. The elastic, excitation, and ionization collisions between electron-neutral pairs and isotropic scattering and charge exchange collisions between ion-neutral pairs and Penning ionizations are taken into account. Applicability of the numerical code is verified under the Radio-Frequency capacitively coupled discharge conditions. The efficiency of the codemore » is increased by its parallelization using Open Message Passing Interface. As a demonstration of the PLES method, parallel PIC/MCC code is applied to the direct current glow discharge in helium doped with a small amount of argon. Numerical results are consistent with the theoretical analysis of formation of nonlocal EEDF and existing experimental data.« less

A microkernel design for component-based parallel numerical software systems.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, S.

1999-01-13

What is the minimal software infrastructure and what type of conventions are needed to simplify development of sophisticated parallel numerical application codes using a variety of software components that are not necessarily available as source code? We propose an opaque object-based model where the objects are dynamically loadable from the file system or network. The microkernel required to manage such a system needs to include, at most: (1) a few basic services, namely--a mechanism for loading objects at run time via dynamic link libraries, and consistent schemes for error handling and memory management; and (2) selected methods that all objectsmore » share, to deal with object life (destruction, reference counting, relationships), and object observation (viewing, profiling, tracing). We are experimenting with these ideas in the context of extensible numerical software within the ALICE (Advanced Large-scale Integrated Computational Environment) project, where we are building the microkernel to manage the interoperability among various tools for large-scale scientific simulations. This paper presents some preliminary observations and conclusions from our work with microkernel design.« less
The Navier-Stokes computer

NASA Technical Reports Server (NTRS)

Nosenchuck, D. M.; Littman, M. G.

1986-01-01

The Navier-Stokes computer (NSC) has been developed for solving problems in fluid mechanics involving complex flow simulations that require more speed and capacity than provided by current and proposed Class VI supercomputers. The machine is a parallel processing supercomputer with several new architectural elements which can be programmed to address a wide range of problems meeting the following criteria: (1) the problem is numerically intensive, and (2) the code makes use of long vectors. A simulation of two-dimensional nonsteady viscous flows is presented to illustrate the architecture, programming, and some of the capabilities of the NSC.
PARALLEL PERTURBATION MODEL FOR CYCLE TO CYCLE VARIABILITY PPM4CCV

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ameen, Muhsin Mohammed; Som, Sibendu

This code consists of a Fortran 90 implementation of the parallel perturbation model to compute cyclic variability in spark ignition (SI) engines. Cycle-to-cycle variability (CCV) is known to be detrimental to SI engine operation resulting in partial burn and knock, and result in an overall reduction in the reliability of the engine. Numerical prediction of cycle-to-cycle variability (CCV) in SI engines is extremely challenging for two key reasons: (i) high-fidelity methods such as large eddy simulation (LES) are required to accurately capture the in-cylinder turbulent flow field, and (ii) CCV is experienced over long timescales and hence the simulations needmore » to be performed for hundreds of consecutive cycles. In the new technique, the strategy is to perform multiple parallel simulations, each of which encompasses 2-3 cycles, by effectively perturbing the simulation parameters such as the initial and boundary conditions. The PPM4CCV code is a pre-processing code and can be coupled with any engine CFD code. PPM4CCV was coupled with Converge CFD code and a 10-time speedup was demonstrated over the conventional multi-cycle LES in predicting the CCV for a motored engine. Recently, the model is also being applied to fired engines including port fuel injected (PFI) and direct injection spark ignition engines and the preliminary results are very encouraging.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bailey, David H.

The NAS Parallel Benchmarks (NPB) are a suite of parallel computer performance benchmarks. They were originally developed at the NASA Ames Research Center in 1991 to assess high-end parallel supercomputers. Although they are no longer used as widely as they once were for comparing high-end system performance, they continue to be studied and analyzed a great deal in the high-performance computing community. The acronym 'NAS' originally stood for the Numerical Aeronautical Simulation Program at NASA Ames. The name of this organization was subsequently changed to the Numerical Aerospace Simulation Program, and more recently to the NASA Advanced Supercomputing Center, althoughmore » the acronym remains 'NAS.' The developers of the original NPB suite were David H. Bailey, Eric Barszcz, John Barton, David Browning, Russell Carter, LeoDagum, Rod Fatoohi, Samuel Fineberg, Paul Frederickson, Thomas Lasinski, Rob Schreiber, Horst Simon, V. Venkatakrishnan and Sisira Weeratunga. The original NAS Parallel Benchmarks consisted of eight individual benchmark problems, each of which focused on some aspect of scientific computing. The principal focus was in computational aerophysics, although most of these benchmarks have much broader relevance, since in a much larger sense they are typical of many real-world scientific computing applications. The NPB suite grew out of the need for a more rational procedure to select new supercomputers for acquisition by NASA. The emergence of commercially available highly parallel computer systems in the late 1980s offered an attractive alternative to parallel vector supercomputers that had been the mainstay of high-end scientific computing. However, the introduction of highly parallel systems was accompanied by a regrettable level of hype, not only on the part of the commercial vendors but even, in some cases, by scientists using the systems. As a result, it was difficult to discern whether the new systems offered any fundamental performance advantage over vector supercomputers, and, if so, which of the parallel offerings would be most useful in real-world scientific computation. In part to draw attention to some of the performance reporting abuses prevalent at the time, the present author wrote a humorous essay 'Twelve Ways to Fool the Masses,' which described in a light-hearted way a number of the questionable ways in which both vendor marketing people and scientists were inflating and distorting their performance results. All of this underscored the need for an objective and scientifically defensible measure to compare performance on these systems.« less
Numerical propulsion system simulation

NASA Technical Reports Server (NTRS)

Lytle, John K.; Remaklus, David A.; Nichols, Lester D.

1990-01-01

The cost of implementing new technology in aerospace propulsion systems is becoming prohibitively expensive. One of the major contributors to the high cost is the need to perform many large scale system tests. Extensive testing is used to capture the complex interactions among the multiple disciplines and the multiple components inherent in complex systems. The objective of the Numerical Propulsion System Simulation (NPSS) is to provide insight into these complex interactions through computational simulations. This will allow for comprehensive evaluation of new concepts early in the design phase before a commitment to hardware is made. It will also allow for rapid assessment of field-related problems, particularly in cases where operational problems were encountered during conditions that would be difficult to simulate experimentally. The tremendous progress taking place in computational engineering and the rapid increase in computing power expected through parallel processing make this concept feasible within the near future. However it is critical that the framework for such simulations be put in place now to serve as a focal point for the continued developments in computational engineering and computing hardware and software. The NPSS concept which is described will provide that framework.
pypet: A Python Toolkit for Data Management of Parameter Explorations

PubMed Central

Meyer, Robert; Obermayer, Klaus

2016-01-01

pypet (Python parameter exploration toolkit) is a new multi-platform Python toolkit for managing numerical simulations. Sampling the space of model parameters is a key aspect of simulations and numerical experiments. pypet is designed to allow easy and arbitrary sampling of trajectories through a parameter space beyond simple grid searches. pypet collects and stores both simulation parameters and results in a single HDF5 file. This collective storage allows fast and convenient loading of data for further analyses. pypet provides various additional features such as multiprocessing and parallelization of simulations, dynamic loading of data, integration of git version control, and supervision of experiments via the electronic lab notebook Sumatra. pypet supports a rich set of data formats, including native Python types, Numpy and Scipy data, Pandas DataFrames, and BRIAN(2) quantities. Besides these formats, users can easily extend the toolkit to allow customized data types. pypet is a flexible tool suited for both short Python scripts and large scale projects. pypet's various features, especially the tight link between parameters and results, promote reproducible research in computational neuroscience and simulation-based disciplines. PMID:27610080
pypet: A Python Toolkit for Data Management of Parameter Explorations.

PubMed

Meyer, Robert; Obermayer, Klaus

2016-01-01

pypet (Python parameter exploration toolkit) is a new multi-platform Python toolkit for managing numerical simulations. Sampling the space of model parameters is a key aspect of simulations and numerical experiments. pypet is designed to allow easy and arbitrary sampling of trajectories through a parameter space beyond simple grid searches. pypet collects and stores both simulation parameters and results in a single HDF5 file. This collective storage allows fast and convenient loading of data for further analyses. pypet provides various additional features such as multiprocessing and parallelization of simulations, dynamic loading of data, integration of git version control, and supervision of experiments via the electronic lab notebook Sumatra. pypet supports a rich set of data formats, including native Python types, Numpy and Scipy data, Pandas DataFrames, and BRIAN(2) quantities. Besides these formats, users can easily extend the toolkit to allow customized data types. pypet is a flexible tool suited for both short Python scripts and large scale projects. pypet's various features, especially the tight link between parameters and results, promote reproducible research in computational neuroscience and simulation-based disciplines.
GPU-accelerated Red Blood Cells Simulations with Transport Dissipative Particle Dynamics.

PubMed

Blumers, Ansel L; Tang, Yu-Hang; Li, Zhen; Li, Xuejin; Karniadakis, George E

2017-08-01

Mesoscopic numerical simulations provide a unique approach for the quantification of the chemical influences on red blood cell functionalities. The transport Dissipative Particles Dynamics (tDPD) method can lead to such effective multiscale simulations due to its ability to simultaneously capture mesoscopic advection, diffusion, and reaction. In this paper, we present a GPU-accelerated red blood cell simulation package based on a tDPD adaptation of our red blood cell model, which can correctly recover the cell membrane viscosity, elasticity, bending stiffness, and cross-membrane chemical transport. The package essentially processes all computational workloads in parallel by GPU, and it incorporates multi-stream scheduling and non-blocking MPI communications to improve inter-node scalability. Our code is validated for accuracy and compared against the CPU counterpart for speed. Strong scaling and weak scaling are also presented to characterizes scalability. We observe a speedup of 10.1 on one GPU over all 16 cores within a single node, and a weak scaling efficiency of 91% across 256 nodes. The program enables quick-turnaround and high-throughput numerical simulations for investigating chemical-driven red blood cell phenomena and disorders.
High order parallel numerical schemes for solving incompressible flows

NASA Technical Reports Server (NTRS)

Lin, Avi; Milner, Edward J.; Liou, May-Fun; Belch, Richard A.

1992-01-01

The use of parallel computers for numerically solving flow fields has gained much importance in recent years. This paper introduces a new high order numerical scheme for computational fluid dynamics (CFD) specifically designed for parallel computational environments. A distributed MIMD system gives the flexibility of treating different elements of the governing equations with totally different numerical schemes in different regions of the flow field. The parallel decomposition of the governing operator to be solved is the primary parallel split. The primary parallel split was studied using a hypercube like architecture having clusters of shared memory processors at each node. The approach is demonstrated using examples of simple steady state incompressible flows. Future studies should investigate the secondary split because, depending on the numerical scheme that each of the processors applies and the nature of the flow in the specific subdomain, it may be possible for a processor to seek better, or higher order, schemes for its particular subcase.
Simulation of Foam Impact Effects on Components of the Space Shuttle Thermal Protection System. Chapter 7

NASA Technical Reports Server (NTRS)

Fahrenthold, Eric P.; Park, Young-Keun

2004-01-01

A series of three dimensional simulations has been performed to investigate analytically the effect of insulating foam impacts on ceramic tile and reinforced carbon-carbon components of the Space Shuttle thermal protection system. The simulations employed a hybrid particle-finite element method and a parallel code developed for use in spacecraft design applications. The conclusions suggested by the numerical study are in general consistent with experiment. The results emphasize the need for additional material testing work on the dynamic mechanical response of thermal protection system materials, and additional impact experiments for use in validating computational models of impact effects.
Analytical theory of coherent synchrotron radiation wakefield of short bunches shielded by conducting parallel plates

NASA Astrophysics Data System (ADS)

Stupakov, Gennady; Zhou, Demin

2016-04-01

We develop a general model of coherent synchrotron radiation (CSR) impedance with shielding provided by two parallel conducting plates. This model allows us to easily reproduce all previously known analytical CSR wakes and to expand the analysis to situations not explored before. It reduces calculations of the impedance to taking integrals along the trajectory of the beam. New analytical results are derived for the radiation impedance with shielding for the following orbits: a kink, a bending magnet, a wiggler of finite length, and an infinitely long wiggler. All our formulas are benchmarked against numerical simulations with the CSRZ computer code.
Graphics processing unit based computation for NDE applications

NASA Astrophysics Data System (ADS)

Nahas, C. A.; Rajagopal, Prabhu; Balasubramaniam, Krishnan; Krishnamurthy, C. V.

2012-05-01

Advances in parallel processing in recent years are helping to improve the cost of numerical simulation. Breakthroughs in Graphical Processing Unit (GPU) based computation now offer the prospect of further drastic improvements. The introduction of 'compute unified device architecture' (CUDA) by NVIDIA (the global technology company based in Santa Clara, California, USA) has made programming GPUs for general purpose computing accessible to the average programmer. Here we use CUDA to develop parallel finite difference schemes as applicable to two problems of interest to NDE community, namely heat diffusion and elastic wave propagation. The implementations are for two-dimensions. Performance improvement of the GPU implementation against serial CPU implementation is then discussed.
Analytical theory of coherent synchrotron radiation wakefield of short bunches shielded by conducting parallel plates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stupakov, Gennady; Zhou, Demin

2016-04-21

We develop a general model of coherent synchrotron radiation (CSR) impedance with shielding provided by two parallel conducting plates. This model allows us to easily reproduce all previously known analytical CSR wakes and to expand the analysis to situations not explored before. It reduces calculations of the impedance to taking integrals along the trajectory of the beam. New analytical results are derived for the radiation impedance with shielding for the following orbits: a kink, a bending magnet, a wiggler of finite length, and an infinitely long wiggler. All our formulas are benchmarked against numerical simulations with the CSRZ computer code.
RT DDA: A hybrid method for predicting the scattering properties by densely packed media

NASA Astrophysics Data System (ADS)

Ramezan Pour, B.; Mackowski, D.

2017-12-01

The most accurate approaches to predicting the scattering properties of particulate media are based on exact solutions of the Maxwell's equations (MEs), such as the T-matrix and discrete dipole methods. Applying these techniques for optically thick targets is challenging problem due to the large-scale computations and are usually substituted by phenomenological radiative transfer (RT) methods. On the other hand, the RT technique is of questionable validity in media with large particle packing densities. In recent works, we used numerically exact ME solvers to examine the effects of particle concentration on the polarized reflection properties of plane parallel random media. The simulations were performed for plane parallel layers of wavelength-sized spherical particles, and results were compared with RT predictions. We have shown that RTE results monotonically converge to the exact solution as the particle volume fraction becomes smaller and one can observe a nearly perfect fit for packing densities of 2%-5%. This study describes the hybrid technique composed of exact and numerical scalar RT methods. The exact methodology in this work is the plane parallel discrete dipole approximation whereas the numerical method is based on the adding and doubling method. This approach not only decreases the computational time owing to the RT method but also includes the interference and multiple scattering effects, so it may be applicable to large particle density conditions.
Mid-IR colloidal quantum dot detectors enhanced by optical nano-antennas

NASA Astrophysics Data System (ADS)

Yifat, Yuval; Ackerman, Matthew; Guyot-Sionnest, Philippe

2017-01-01

We report the fabrication of a colloidal quantum dot based photodetector designed for the 3-5 μm mid infrared wavelength range incorporated with optical nano-antenna arrays to enhance the photocurrent. The fabricated arrays exhibit a resonant behavior dependent on the length of the nano-antenna rods, in good agreement with numerical simulation. The device exhibits a three-fold increase in the spectral photoresponse compared to a photodetector device without antennas, and the resonance is polarized parallel to the antenna orientation. We numerically estimate the device quantum efficiency and investigate its bias dependence.
GRC RBCC Concept Multidisciplinary Analysis

NASA Technical Reports Server (NTRS)

Suresh, Ambady

2001-01-01

This report outlines the GRC RBCC Concept for Multidisciplinary Analysis. The multidisciplinary coupling procedure is presented, along with technique validations and axisymmetric multidisciplinary inlet and structural results. The NPSS (Numerical Propulsion System Simulation) test bed developments and code parallelization are also presented. These include milestones and accomplishments, a discussion of running R4 fan application on the PII cluster as compared to other platforms, and the National Combustor Code speedup.
Numerical Modeling of Internal Flow Aerodynamics. Part 2: Unsteady Flows

DTIC Science & Technology

2004-01-01

fluid- structure coupling, ...). • • • • • Prediction: in this simulation, we want to assess the effect of a change in SRM geometry, propellant...surface reaches the structure ). The third characteristic time describes the slow evolution of the internal geometry. The last characteristic time...incorporates fluid- structure coupling facility, and is parallel. MOPTI® manages exchanges between two principal computational modules: • • A varying
3D multiphysics modeling of superconducting cavities with a massively parallel simulation suite

DOE PAGES

Kononenko, Oleksiy; Adolphsen, Chris; Li, Zenghai; ...

2017-10-10

Radiofrequency cavities based on superconducting technology are widely used in particle accelerators for various applications. The cavities usually have high quality factors and hence narrow bandwidths, so the field stability is sensitive to detuning from the Lorentz force and external loads, including vibrations and helium pressure variations. If not properly controlled, the detuning can result in a serious performance degradation of a superconducting accelerator, so an understanding of the underlying detuning mechanisms can be very helpful. Recent advances in the simulation suite ace3p have enabled realistic multiphysics characterization of such complex accelerator systems on supercomputers. In this paper, we presentmore » the new capabilities in ace3p for large-scale 3D multiphysics modeling of superconducting cavities, in particular, a parallel eigensolver for determining mechanical resonances, a parallel harmonic response solver to calculate the response of a cavity to external vibrations, and a numerical procedure to decompose mechanical loads, such as from the Lorentz force or piezoactuators, into the corresponding mechanical modes. These capabilities have been used to do an extensive rf-mechanical analysis of dressed TESLA-type superconducting cavities. Furthermore, the simulation results and their implications for the operational stability of the Linac Coherent Light Source-II are discussed.« less
3D multiphysics modeling of superconducting cavities with a massively parallel simulation suite

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kononenko, Oleksiy; Adolphsen, Chris; Li, Zenghai

Radiofrequency cavities based on superconducting technology are widely used in particle accelerators for various applications. The cavities usually have high quality factors and hence narrow bandwidths, so the field stability is sensitive to detuning from the Lorentz force and external loads, including vibrations and helium pressure variations. If not properly controlled, the detuning can result in a serious performance degradation of a superconducting accelerator, so an understanding of the underlying detuning mechanisms can be very helpful. Recent advances in the simulation suite ace3p have enabled realistic multiphysics characterization of such complex accelerator systems on supercomputers. In this paper, we presentmore » the new capabilities in ace3p for large-scale 3D multiphysics modeling of superconducting cavities, in particular, a parallel eigensolver for determining mechanical resonances, a parallel harmonic response solver to calculate the response of a cavity to external vibrations, and a numerical procedure to decompose mechanical loads, such as from the Lorentz force or piezoactuators, into the corresponding mechanical modes. These capabilities have been used to do an extensive rf-mechanical analysis of dressed TESLA-type superconducting cavities. Furthermore, the simulation results and their implications for the operational stability of the Linac Coherent Light Source-II are discussed.« less
A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers

NASA Technical Reports Server (NTRS)

Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)

1997-01-01

The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from the robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages implemented in ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages are identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains using up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft simulations achieve 75 percent perfect load-balanced executions using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on-going study progresses.

On the stability of nongyrotropic ion populations - A first (analytic and simulation) assessment

NASA Technical Reports Server (NTRS)

Brinca, A. L.; Borda De Agua, L.; Winske, D.

1993-01-01

The wave and dispersion equations for perturbations propagating parallel to an ambient magnetic field in magnetoplasmas with nongyrotropic ion populations show, in general, the occurrence of coupling between the parallel (left- and right-hand circularly polarized electromagnetic and longitudinal electrostatic) eigenmodes of the associated gyrotropic medium. These interactions provide a means to driving linearly one mode with free-energy sources of other modes in homogeneous media. Different types of nongyrotropy bring about distinct classes of coupling. The stability of a hydrogen magnetoplasma with anisotropic, nongyrotropic protons that only couple the electromagnetic modes to each other is investigated analytically (via solution of the derived dispersion equation) and numerically (via simulation with a hybrid code). Nongyrotropy enhances growth and enlarges the unstable spectral range relative to the corresponding gyrotropic situation. The relevance of the properties of nongyrotropic populations to space plasma environments is also discussed.
Sub-domain decomposition methods and computational controls for multibody dynamical systems. [of spacecraft structures

NASA Technical Reports Server (NTRS)

Menon, R. G.; Kurdila, A. J.

1992-01-01

This paper presents a concurrent methodology to simulate the dynamics of flexible multibody systems with a large number of degrees of freedom. A general class of open-loop structures is treated and a redundant coordinate formulation is adopted. A range space method is used in which the constraint forces are calculated using a preconditioned conjugate gradient method. By using a preconditioner motivated by the regular ordering of the directed graph of the structures, it is shown that the method is order N in the total number of coordinates of the system. The overall formulation has the advantage that it permits fine parallelization and does not rely on system topology to induce concurrency. It can be efficiently implemented on the present generation of parallel computers with a large number of processors. Validation of the method is presented via numerical simulations of space structures incorporating large number of flexible degrees of freedom.
Numerical and experimental simulation of linear shear piezoelectric phased arrays for structural health monitoring

NASA Astrophysics Data System (ADS)

Wang, Wentao; Zhang, Hui; Lynch, Jerome P.; Cesnik, Carlos E. S.; Li, Hui

2017-04-01

A novel d36-type piezoelectric wafer fabricated from lead magnesium niobate-lead titanate (PMN-PT) is explored for the generation of in-plane horizontal shear waves in plate structures. The study focuses on the development of a linear phased array (PA) of PMN-PT wafers to improve the damage detection capabilities of a structural health monitoring (SHM) system. An attractive property of in-plane horizontal shear waves is that they are nondispersive yet sensitive to damage. This study characterizes the directionality of body waves (Lamb and horizontal shear) created by a single PMN-PT wafer bonded to the surface of a metallic plate structure. Second, a linear PA is designed from PMN-PT wafers to steer and focus Lamb and horizontal shear waves in a plate structure. Numerical studies are conducted to explore the capabilities of a PMN-PT-based PA to detect damage in aluminum plates. Numerical simulations are conducted using the Local Interaction Simulation Approach (LISA) implemented on a parallelized graphical processing unit (GPU) for high-speed execution. Numerical studies are further validated using experimental tests conducted with a linear PA. The study confirms the ability of an PMN-PT phased array to accurately detect and localize damage in aluminum plates.
3D numerical simulations of multiphase continental rifting

NASA Astrophysics Data System (ADS)

Naliboff, J.; Glerum, A.; Brune, S.

2017-12-01

Observations of rifted margin architecture suggest continental breakup occurs through multiple phases of extension with distinct styles of deformation. The initial rifting stages are often characterized by slow extension rates and distributed normal faulting in the upper crust decoupled from deformation in the lower crust and mantle lithosphere. Further rifting marks a transition to higher extension rates and coupling between the crust and mantle lithosphere, with deformation typically focused along large-scale detachment faults. Significantly, recent detailed reconstructions and high-resolution 2D numerical simulations suggest that rather than remaining focused on a single long-lived detachment fault, deformation in this phase may progress toward lithospheric breakup through a complex process of fault interaction and development. The numerical simulations also suggest that an initial phase of distributed normal faulting can play a key role in the development of these complex fault networks and the resulting finite deformation patterns. Motivated by these findings, we will present 3D numerical simulations of continental rifting that examine the role of temporal increases in extension velocity on rifted margin structure. The numerical simulations are developed with the massively parallel finite-element code ASPECT. While originally designed to model mantle convection using advanced solvers and adaptive mesh refinement techniques, ASPECT has been extended to model visco-plastic deformation that combines a Drucker Prager yield criterion with non-linear dislocation and diffusion creep. To promote deformation localization, the internal friction angle and cohesion weaken as a function of accumulated plastic strain. Rather than prescribing a single zone of weakness to initiate deformation, an initial random perturbation of the plastic strain field combined with rapid strain weakening produces distributed normal faulting at relatively slow rates of extension in both 2D and 3D simulations. Our presentation will focus on both the numerical assumptions required to produce these results and variations in 3D rifted margin architecture arising from a transition from slow to rapid rates of extension.
An efficient finite element method for simulation of droplet spreading on a topologically rough surface

NASA Astrophysics Data System (ADS)

Luo, Li; Wang, Xiao-Ping; Cai, Xiao-Chuan

2017-11-01

We study numerically the dynamics of a three-dimensional droplet spreading on a rough solid surface using a phase-field model consisting of the coupled Cahn-Hilliard and Navier-Stokes equations with a generalized Navier boundary condition (GNBC). An efficient finite element method on unstructured meshes is introduced to cope with the complex geometry of the solid surfaces. We extend the GNBC to surfaces with complex geometry by including its weak form along different normal and tangential directions in the finite element formulation. The semi-implicit time discretization scheme results in a decoupled system for the phase function, the velocity, and the pressure. In addition, a mass compensation algorithm is introduced to preserve the mass of the droplet. To efficiently solve the decoupled systems, we present a highly parallel solution strategy based on domain decomposition techniques. We validate the newly developed solution method through extensive numerical experiments, particularly for those phenomena that can not be achieved by two-dimensional simulations. On a surface with circular posts, we study how wettability of the rough surface depends on the geometry of the posts. The contact line motion for a droplet spreading over some periodic rough surfaces are also efficiently computed. Moreover, we study the spreading process of an impacting droplet on a microstructured surface, a qualitative agreement is achieved between the numerical and experimental results. The parallel performance suggests that the proposed solution algorithm is scalable with over 4,000 processors cores with tens of millions of unknowns.
Programmable logic construction kits for hyper-real-time neuronal modeling.

PubMed

Guerrero-Rivera, Ruben; Morrison, Abigail; Diesmann, Markus; Pearce, Tim C

2006-11-01

Programmable logic designs are presented that achieve exact integration of leaky integrate-and-fire soma and dynamical synapse neuronal models and incorporate spike-time dependent plasticity and axonal delays. Highly accurate numerical performance has been achieved by modifying simpler forward-Euler-based circuitry requiring minimal circuit allocation, which, as we show, behaves equivalently to exact integration. These designs have been implemented and simulated at the behavioral and physical device levels, demonstrating close agreement with both numerical and analytical results. By exploiting finely grained parallelism and single clock cycle numerical iteration, these designs achieve simulation speeds at least five orders of magnitude faster than the nervous system, termed here hyper-real-time operation, when deployed on commercially available field-programmable gate array (FPGA) devices. Taken together, our designs form a programmable logic construction kit of commonly used neuronal model elements that supports the building of large and complex architectures of spiking neuron networks for real-time neuromorphic implementation, neurophysiological interfacing, or efficient parameter space investigations.
Analysis of Time Filters in Multistep Methods

NASA Astrophysics Data System (ADS)

Hurl, Nicholas

Geophysical ow simulations have evolved sophisticated implicit-explicit time stepping methods (based on fast-slow wave splittings) followed by time filters to control any unstable models that result. Time filters are modular and parallel. Their effect on stability of the overall process has been tested in numerous simulations, but never analyzed. Stability is proven herein for the Crank-Nicolson Leapfrog (CNLF) method with the Robert-Asselin (RA) time filter and for the Crank-Nicolson Leapfrog method with the Robert-Asselin-Williams (RAW) time filter for systems by energy methods. We derive an equivalent multistep method for CNLF+RA and CNLF+RAW and stability regions are obtained. The time step restriction for energy stability of CNLF+RA is smaller than CNLF and CNLF+RAW time step restriction is even smaller. Numerical tests find that RA and RAW add numerical dissipation. This thesis also shows that all modes of the Crank-Nicolson Leap Frog (CNLF) method are asymptotically stable under the standard timestep condition.
Efficient Simulation of Compressible, Viscous Fluids using Multi-rate Time Integration

NASA Astrophysics Data System (ADS)

Mikida, Cory; Kloeckner, Andreas; Bodony, Daniel

2017-11-01

In the numerical simulation of problems of compressible, viscous fluids with single-rate time integrators, the global timestep used is limited to that of the finest mesh point or fastest physical process. This talk discusses the application of multi-rate Adams-Bashforth (MRAB) integrators to an overset mesh framework to solve compressible viscous fluid problems of varying scale with improved efficiency, with emphasis on the strategy of timescale separation and the application of the resulting numerical method to two sample problems: subsonic viscous flow over a cylinder and a viscous jet in crossflow. The results presented indicate the numerical efficacy of MRAB integrators, outline a number of outstanding code challenges, demonstrate the expected reduction in time enabled by MRAB, and emphasize the need for proper load balancing through spatial decomposition in order for parallel runs to achieve the predicted time-saving benefit. This material is based in part upon work supported by the Department of Energy, National Nuclear Security Administration, under Award Number DE-NA0002374.
Experimental and Numerical Study on the Tensile Behaviour of UACS/Al Fibre Metal Laminate

NASA Astrophysics Data System (ADS)

Xue, Jia; Wang, Wen-Xue; Zhang, Jia-Zhen; Wu, Su-Jun; Li, Hang

2015-10-01

A new fibre metal laminate fabricated with aluminium sheets and unidirectionally arrayed chopped strand (UACS) plies is proposed. The UACS ply is made by cutting parallel slits into a unidirectional carbon fibre prepreg. The UACS/Al laminate may be viewed as aluminium laminate reinforced by highly aligned, discontinuous carbon fibres. The tensile behaviour of UACS/Al laminate, including thermal residual stress and failure progression, is investigated through experiments and numerical simulation. Finite element analysis was used to simulate the onset and propagation of intra-laminar fractures occurring within slits of the UACS plies and delamination along the interfaces. The finite element models feature intra-laminar cohesive elements inserted into the slits and inter-laminar cohesive elements inserted at the interfaces. Good agreement are obtained between experimental results and finite element analysis, and certain limitations of the finite element models are observed and discussed. The combined experimental and numerical studies provide a detailed understanding of the tensile behaviour of UACS/Al laminates.
Bursting reconnection of the two co-rotating current loops

NASA Astrophysics Data System (ADS)

Bulanov, Sergei; Sokolov, Igor; Sakai, Jun-Ichi

2000-10-01

Two parallel plasma filaments carrying electric current (current loops) are considered. The Ampere force induces the filaments' coalescence, which is accompanied by the reconnection of the poloidal magnetic field. Initially the loops rotate along the axii of symmetry. Each of the two loops would be in equilibrium in the absence of the other one. The dynamics of the reconnection is numerically simulated using high-resolution numerical scheme for low-resistive magneto-hydrodynamics. The results of numerical simulation are presented in the form of computer movies. The results show that the rotation strongly modifies the reconnection process, resulting in quasi-periodic (bursting) appearance and disappearance of a current sheet. Fast sliding motion of the plasma along the current sheet is a significant element of the complicated structure of reconnection (current-vortex sheet). The magnetic surfaces in the overal flow are strongly rippled by slow magnetosonic perturbations, so that the specific spiral structures form. This should result in the particle transport enhancement.
Investigating nonlinear distortion in the photopolymer materials

NASA Astrophysics Data System (ADS)

Malallah, Ra'ed; Cassidy, Derek; Muniraj, Inbarasan; Zhao, Liang; Ryle, James P.; Sheridan, John T.

2017-05-01

Propagation and diffraction of a light beam through nonlinear materials are effectively compensated by the effect of selftrapping. The laser beam propagating through photo-sensitive polymer PVA/AA can generate a waveguide of higher refractive index in direction of the light propagation. In order to investigate this phenomenon occurring in light-sensitive photopolymer media, the behaviour of a single light beam focused on the front surface of photopolymer bulk is investigated. As part of this work the self-bending of parallel beams separated in spaces during self-writing waveguides are studied. It is shown that there is strong correlation between the intensity of the input beams and their separation distance and the resulting deformation of waveguide trajectory during channels formation. This self-channeling can be modelled numerically using a three-dimension model to describe what takes place inside the volume of a photopolymer media. Corresponding numerical simulations show good agreement with experimental observations, which confirm the validity of the numerical model that was used to simulate these experiments.
High-Performance Parallel Analysis of Coupled Problems for Aircraft Propulsion

NASA Technical Reports Server (NTRS)

Felippa, C. A.; Farhat, C.; Park, K. C.; Gumaste, U.; Chen, P.-S.; Lesoinne, M.; Stern, P.

1997-01-01

Applications are described of high-performance computing methods to the numerical simulation of complete jet engines. The methodology focuses on the partitioned analysis of the interaction of the gas flow with a flexible structure and with the fluid mesh motion driven by structural displacements. The latter is treated by a ALE technique that models the fluid mesh motion as that of a fictitious mechanical network laid along the edges of near-field elements. New partitioned analysis procedures to treat this coupled three-component problem were developed. These procedures involved delayed corrections and subcycling, and have been successfully tested on several massively parallel computers, including the iPSC-860, Paragon XP/S and the IBM SP2. The NASA-sponsored ENG10 program was used for the global steady state analysis of the whole engine. This program uses a regular FV-multiblock-grid discretization in conjunction with circumferential averaging to include effects of blade forces, loss, combustor heat addition, blockage, bleeds and convective mixing. A load-balancing preprocessor for parallel versions of ENG10 was developed as well as the capability for the first full 3D aeroelastic simulation of a multirow engine stage. This capability was tested on the IBM SP2 parallel supercomputer at NASA Ames.
Software Engineering Support of the Third Round of Scientific Grand Challenge Investigations: Earth System Modeling Software Framework Survey

NASA Technical Reports Server (NTRS)

Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn; Zukor, Dorothy (Technical Monitor)

2002-01-01

One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task, both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation. while maintaining high performance across numerous supercomputer and workstation architectures. This document surveys numerous software frameworks for potential use in Earth science modeling. Several frameworks are evaluated in depth, including Parallel Object-Oriented Methods and Applications (POOMA), Cactus (from (he relativistic physics community), Overture, Goddard Earth Modeling System (GEMS), the National Center for Atmospheric Research Flux Coupler, and UCLA/UCB Distributed Data Broker (DDB). Frameworks evaluated in less detail include ROOT, Parallel Application Workspace (PAWS), and Advanced Large-Scale Integrated Computational Environment (ALICE). A host of other frameworks and related tools are referenced in this context. The frameworks are evaluated individually and also compared with each other.
Graphics processing unit (GPU)-based computation of heat conduction in thermally anisotropic solids

NASA Astrophysics Data System (ADS)

Nahas, C. A.; Balasubramaniam, Krishnan; Rajagopal, Prabhu

2013-01-01

Numerical modeling of anisotropic media is a computationally intensive task since it brings additional complexity to the field problem in such a way that the physical properties are different in different directions. Largely used in the aerospace industry because of their lightweight nature, composite materials are a very good example of thermally anisotropic media. With advancements in video gaming technology, parallel processors are much cheaper today and accessibility to higher-end graphical processing devices has increased dramatically over the past couple of years. Since these massively parallel GPUs are very good in handling floating point arithmetic, they provide a new platform for engineers and scientists to accelerate their numerical models using commodity hardware. In this paper we implement a parallel finite difference model of thermal diffusion through anisotropic media using the NVIDIA CUDA (Compute Unified device Architecture). We use the NVIDIA GeForce GTX 560 Ti as our primary computing device which consists of 384 CUDA cores clocked at 1645 MHz with a standard desktop pc as the host platform. We compare the results from standard CPU implementation for its accuracy and speed and draw implications for simulation using the GPU paradigm.
Moving magnets in a micromagnetic finite-difference framework

NASA Astrophysics Data System (ADS)

Rissanen, Ilari; Laurson, Lasse

2018-05-01

We present a method and an implementation for smooth linear motion in a finite-difference-based micromagnetic simulation code, to be used in simulating magnetic friction and other phenomena involving moving microscale magnets. Our aim is to accurately simulate the magnetization dynamics and relative motion of magnets while retaining high computational speed. To this end, we combine techniques for fast scalar potential calculation and cubic b-spline interpolation, parallelizing them on a graphics processing unit (GPU). The implementation also includes the possibility of explicitly simulating eddy currents in the case of conducting magnets. We test our implementation by providing numerical examples of stick-slip motion of thin films pulled by a spring and the effect of eddy currents on the switching time of magnetic nanocubes.
Time domain simulation of harmonic ultrasound images and beam patterns in 3D using the k-space pseudospectral method.

PubMed

Treeby, Bradley E; Tumen, Mustafa; Cox, B T

2011-01-01

A k-space pseudospectral model is developed for the fast full-wave simulation of nonlinear ultrasound propagation through heterogeneous media. The model uses a novel equation of state to account for nonlinearity in addition to power law absorption. The spectral calculation of the spatial gradients enables a significant reduction in the number of required grid nodes compared to finite difference methods. The model is parallelized using a graphical processing unit (GPU) which allows the simulation of individual ultrasound scan lines using a 256 x 256 x 128 voxel grid in less than five minutes. Several numerical examples are given, including the simulation of harmonic ultrasound images and beam patterns using a linear phased array transducer.
Aircraft Engine Systems

NASA Technical Reports Server (NTRS)

Veres, Joseph

2001-01-01

This report outlines the detailed simulation of Aircraft Turbofan Engine. The objectives were to develop a detailed flow model of a full turbofan engine that runs on parallel workstation clusters overnight and to develop an integrated system of codes for combustor design and analysis to enable significant reduction in design time and cost. The model will initially simulate the 3-D flow in the primary flow path including the flow and chemistry in the combustor, and ultimately result in a multidisciplinary model of the engine. The overnight 3-D simulation capability of the primary flow path in a complete engine will enable significant reduction in the design and development time of gas turbine engines. In addition, the NPSS (Numerical Propulsion System Simulation) multidisciplinary integration and analysis are discussed.
High performance computation of radiative transfer equation using the finite element method

NASA Astrophysics Data System (ADS)

Badri, M. A.; Jolivet, P.; Rousseau, B.; Favennec, Y.

2018-05-01

This article deals with an efficient strategy for numerically simulating radiative transfer phenomena using distributed computing. The finite element method alongside the discrete ordinate method is used for spatio-angular discretization of the monochromatic steady-state radiative transfer equation in an anisotropically scattering media. Two very different methods of parallelization, angular and spatial decomposition methods, are presented. To do so, the finite element method is used in a vectorial way. A detailed comparison of scalability, performance, and efficiency on thousands of processors is established for two- and three-dimensional heterogeneous test cases. Timings show that both algorithms scale well when using proper preconditioners. It is also observed that our angular decomposition scheme outperforms our domain decomposition method. Overall, we perform numerical simulations at scales that were previously unattainable by standard radiative transfer equation solvers.
Numerical Simulations of Free Surface Magnetohydrodynamic Flows

NASA Astrophysics Data System (ADS)

Samulyak, Roman; Glimm, James; Oh, Wonho; Prykarpatskyy, Yarema

2003-11-01

We have developed a numerical algorithm and performed simulations of magnetohydrodynamic (MHD) free surface flows. The corresponding system of MHD equations is a system of strongly coupled hyperbolic and parabolic/elliptic equations in moving and geometrically complex domains. The hyperbolic system is solved using the front tracking technique for the free fluid interface. Parallel algorithms for solving elliptic and parabolic equations are based on a finite element discretization on moving grids dynamically conforming to fluid interfaces. The method has been implemented as an MHD extension of the FronTier code. The code has been applied for modeling the behavior of lithium and mercury jets in magnetic fields, laser ablation plumes, and the Richtmyer-Meshkov instability of a liquid mercury jet interacting with a high energy proton pulse in a strong magnetic field. Such an instability occurs in the target for the Muon Collider.
Numerical analysis of hydrodynamics in a rotor-stator reactor for biodiesel synthesis

NASA Astrophysics Data System (ADS)

Wen, Zhuqing; Petera, Jerzy

2016-06-01

A rotor-stator spinning disk reactor for intensified biodiesel synthesis is described and numerically simulated. The reactor consists of two flat disks, located coaxially and parallel to each other with a gap ranging from 0.1 mm to 0.2 mm between the disks. The upper disk is located on a rotating shaft while the lower disk is stationary. The feed liquids, triglycerides (TG) and methanol are introduced coaxially along the center line of rotating disk and stationary disk, respectively. Fluid hydrodynamics in the reactor for synthesis of biodiesel from TG and methanol in the presence of a sodium hydroxide catalyst are simulated, using convection-diffusion-reaction species transport model by the CFD software ANSYS©Fluent v. 13.0. The effects of upper disk's spinning speed, gap size and flow rates at inlets are evaluated.

Modeling of fracture of protective concrete structures under impact loads

NASA Astrophysics Data System (ADS)

Radchenko, P. A.; Batuev, S. P.; Radchenko, A. V.; Plevkov, V. S.

2015-10-01

This paper presents results of numerical simulation of interaction between a Boeing 747-400 aircraft and the protective shell of a nuclear power plant. The shell is presented as a complex multilayered cellular structure consisting of layers of concrete and fiber concrete bonded with steel trusses. Numerical simulation was performed three-dimensionally using the original algorithm and software taking into account algorithms for building grids of complex geometric objects and parallel computations. Dynamics of the stress-strain state and fracture of the structure were studied. Destruction is described using a two-stage model that allows taking into account anisotropy of elastic and strength properties of concrete and fiber concrete. It is shown that wave processes initiate destruction of the cellular shell structure; cells start to destruct in an unloading wave originating after the compression wave arrival at free cell surfaces.
Job Management Requirements for NAS Parallel Systems and Clusters

NASA Technical Reports Server (NTRS)

Saphir, William; Tanner, Leigh Ann; Traversat, Bernard

1995-01-01

A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high performance computing industry. Newer job management systems offer new functionality but do not solve fundamental problems. We address some of the main issues in resource allocation and job scheduling we have encountered on two parallel computers - a 160-node IBM SP2 and a cluster of 20 high performance workstations located at the Numerical Aerodynamic Simulation facility. We describe the requirements for resource allocation and job management that are necessary to provide a production supercomputing environment on these machines, prioritizing according to difficulty and importance, and advocating a return to fundamental issues.
Coupled electromechanical model of the heart: Parallel finite element formulation.

PubMed

Lafortune, Pierre; Arís, Ruth; Vázquez, Mariano; Houzeaux, Guillaume

2012-01-01

In this paper, a highly parallel coupled electromechanical model of the heart is presented and assessed. The parallel-coupled model is thoroughly discussed, with scalability proven up to hundreds of cores. This work focuses on the mechanical part, including the constitutive model (proposing some modifications to pre-existent models), the numerical scheme and the coupling strategy. The model is next assessed through two examples. First, the simulation of a small piece of cardiac tissue is used to introduce the main features of the coupled model and calibrate its parameters against experimental evidence. Then, a more realistic problem is solved using those parameters, with a mesh of the Oxford ventricular rabbit model. The results of both examples demonstrate the capability of the model to run efficiently in hundreds of processors and to reproduce some basic characteristic of cardiac deformation.
Hierarchical fractional-step approximations and parallel kinetic Monte Carlo algorithms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arampatzis, Giorgos, E-mail: garab@math.uoc.gr; Katsoulakis, Markos A., E-mail: markos@math.umass.edu; Plechac, Petr, E-mail: plechac@math.udel.edu

2012-10-01

We present a mathematical framework for constructing and analyzing parallel algorithms for lattice kinetic Monte Carlo (KMC) simulations. The resulting algorithms have the capacity to simulate a wide range of spatio-temporal scales in spatially distributed, non-equilibrium physiochemical processes with complex chemistry and transport micro-mechanisms. Rather than focusing on constructing exactly the stochastic trajectories, our approach relies on approximating the evolution of observables, such as density, coverage, correlations and so on. More specifically, we develop a spatial domain decomposition of the Markov operator (generator) that describes the evolution of all observables according to the kinetic Monte Carlo algorithm. This domain decompositionmore » corresponds to a decomposition of the Markov generator into a hierarchy of operators and can be tailored to specific hierarchical parallel architectures such as multi-core processors or clusters of Graphical Processing Units (GPUs). Based on this operator decomposition, we formulate parallel Fractional step kinetic Monte Carlo algorithms by employing the Trotter Theorem and its randomized variants; these schemes, (a) are partially asynchronous on each fractional step time-window, and (b) are characterized by their communication schedule between processors. The proposed mathematical framework allows us to rigorously justify the numerical and statistical consistency of the proposed algorithms, showing the convergence of our approximating schemes to the original serial KMC. The approach also provides a systematic evaluation of different processor communicating schedules. We carry out a detailed benchmarking of the parallel KMC schemes using available exact solutions, for example, in Ising-type systems and we demonstrate the capabilities of the method to simulate complex spatially distributed reactions at very large scales on GPUs. Finally, we discuss work load balancing between processors and propose a re-balancing scheme based on probabilistic mass transport methods.« less
Dynamical features and electric field strengths of double layers driven by currents. [in auroras

NASA Technical Reports Server (NTRS)

Singh, N.; Thiemann, H.; Schunk, R. W.

1985-01-01

In recent years, a number of papers have been concerned with 'ion-acoustic' double layers. In the present investigation, results from numerical simulations are presented to show that the shapes and forms of current-driven double layers evolve dynamically with the fluctuations in the current through the plasma. It is shown that double layers with a potential dip can form even without the excitation of ion-acoustic modes. Double layers in two-and one-half-dimensional simulations are discussed, taking into account the simulation technique, the spatial and temporal features of plasma, and the dynamical behavior of the parallel potential distribution. Attention is also given to double layers in one-dimensional simulations, and electrical field strengths predicted by two-and one-half-dimensional simulations.
Lattice Boltzmann model for numerical relativity.

PubMed

Ilseven, E; Mendoza, M

2016-02-01

In the Z4 formulation, Einstein equations are written as a set of flux conservative first-order hyperbolic equations that resemble fluid dynamics equations. Based on this formulation, we construct a lattice Boltzmann model for numerical relativity and validate it with well-established tests, also known as "apples with apples." Furthermore, we find that by increasing the relaxation time, we gain stability at the cost of losing accuracy, and by decreasing the lattice spacings while keeping a constant numerical diffusivity, the accuracy and stability of our simulations improve. Finally, in order to show the potential of our approach, a linear scaling law for parallelization with respect to number of CPU cores is demonstrated. Our model represents the first step in using lattice kinetic theory to solve gravitational problems.
Numerical Model for the Study of the Strength and Failure Modes of Rock Containing Non-Persistent Joints

NASA Astrophysics Data System (ADS)

Vergara, Maximiliano R.; Van Sint Jan, Michel; Lorig, Loren

2016-04-01

The mechanical behavior of rock containing parallel non-persistent joint sets was studied using a numerical model. The numerical analysis was performed using the discrete element software UDEC. The use of fictitious joints allowed the inclusion of non-persistent joints in the model domain and simulating the progressive failure due to propagation of existing fractures. The material and joint mechanical parameters used in the model were obtained from experimental results. The results of the numerical model showed good agreement with the strength and failure modes observed in the laboratory. The results showed the large anisotropy in the strength resulting from variation of the joint orientation. Lower strength of the specimens was caused by the coalescence of fractures belonging to parallel joint sets. A correlation was found between geometrical parameters of the joint sets and the contribution of the joint sets strength in the global strength of the specimen. The results suggest that for the same dip angle with respect to the principal stresses; the uniaxial strength depends primarily on the joint spacing and the angle between joints tips and less on the length of the rock bridges (persistency). A relation between joint geometrical parameters was found from which the resulting failure mode can be predicted.
Numerical models for fluid-grains interactions: opportunities and limitations

NASA Astrophysics Data System (ADS)

Esteghamatian, Amir; Rahmani, Mona; Wachs, Anthony

2017-06-01

In the framework of a multi-scale approach, we develop numerical models for suspension flows. At the micro scale level, we perform particle-resolved numerical simulations using a Distributed Lagrange Multiplier/Fictitious Domain approach. At the meso scale level, we use a two-way Euler/Lagrange approach with a Gaussian filtering kernel to model fluid-solid momentum transfer. At both the micro and meso scale levels, particles are individually tracked in a Lagrangian way and all inter-particle collisions are computed by a Discrete Element/Soft-sphere method. The previous numerical models have been extended to handle particles of arbitrary shape (non-spherical, angular and even non-convex) as well as to treat heat and mass transfer. All simulation tools are fully-MPI parallel with standard domain decomposition and run on supercomputers with a satisfactory scalability on up to a few thousands of cores. The main asset of multi scale analysis is the ability to extend our comprehension of the dynamics of suspension flows based on the knowledge acquired from the high-fidelity micro scale simulations and to use that knowledge to improve the meso scale model. We illustrate how we can benefit from this strategy for a fluidized bed, where we introduce a stochastic drag force model derived from micro-scale simulations to recover the proper level of particle fluctuations. Conversely, we discuss the limitations of such modelling tools such as their limited ability to capture lubrication forces and boundary layers in highly inertial flows. We suggest ways to overcome these limitations in order to enhance further the capabilities of the numerical models.
A Parallel Numerical Micromagnetic Code Using FEniCS

NASA Astrophysics Data System (ADS)

Nagy, L.; Williams, W.; Mitchell, L.

2013-12-01

Many problems in the geosciences depend on understanding the ability of magnetic minerals to provide stable paleomagnetic recordings. Numerical micromagnetic modelling allows us to calculate the domain structures found in naturally occurring magnetic materials. However the computational cost rises exceedingly quickly with respect to the size and complexity of the geometries that we wish to model. This problem is compounded by the fact that the modern processor design no longer focuses on the speed at which calculations are performed, but rather on the number of computational units amongst which we may distribute our calculations. Consequently to better exploit modern computational resources our micromagnetic simulations must "go parallel". We present a parallel and scalable micromagnetics code written using FEniCS. FEniCS is a multinational collaboration involving several institutions (University of Cambridge, University of Chicago, The Simula Research Laboratory, etc.) that aims to provide a set of tools for writing scientific software; in particular software that employs the finite element method. The advantages of this approach are the leveraging of pre-existing projects from the world of scientific computing (PETSc, Trilinos, Metis/Parmetis, etc.) and exposing these so that researchers may pose problems in a manner closer to the mathematical language of their domain. Our code provides a scriptable interface (in Python) that allows users to not only run micromagnetic models in parallel, but also to perform pre/post processing of data.
Direct numerical simulation of steady state, three dimensional, laminar flow around a wall mounted cube

NASA Astrophysics Data System (ADS)

Liakos, Anastasios; Malamataris, Nikolaos A.

2014-05-01

The topology and evolution of flow around a surface mounted cubical object in three dimensional channel flow is examined for low to moderate Reynolds numbers. Direct numerical simulations were performed via a home made parallel finite element code. The computational domain has been designed according to actual laboratory experiment conditions. Analysis of the results is performed using the three dimensional theory of separation. Our findings indicate that a tornado-like vortex by the side of the cube is present for all Reynolds numbers for which flow was simulated. A horseshoe vortex upstream from the cube was formed at Reynolds number approximately 1266. Pressure distributions are shown along with three dimensional images of the tornado-like vortex and the horseshoe vortex at selected Reynolds numbers. Finally, and in accordance to previous work, our results indicate that the upper limit for the Reynolds number for which steady state results are physically realizable is roughly 2000.
SPH simulation of free surface flow over a sharp-crested weir

NASA Astrophysics Data System (ADS)

Ferrari, Angela

2010-03-01

In this paper the numerical simulation of a free surface flow over a sharp-crested weir is presented. Since in this case the usual shallow water assumptions are not satisfied, we propose to solve the problem using the full weakly compressible Navier-Stokes equations with the Tait equation of state for water. The numerical method used consists of the new meshless Smooth Particle Hydrodynamics (SPH) formulation proposed by Ferrari et al. (2009) [8], that accurately tracks the free surface profile and provides monotone pressure fields. Thus, the unsteady evolution of the complex moving material interface (free surface) can been properly solved. The simulations involving about half a million of fluid particles have been run in parallel on two of the most powerful High Performance Computing (HPC) facilities in Europe. The validation of the results has been carried out analysing the pressure field and comparing the free surface profiles obtained with the SPH scheme with experimental measurements available in literature [18]. A very good quantitative agreement has been obtained.
Universality of (2+1)-dimensional restricted solid-on-solid models

NASA Astrophysics Data System (ADS)

Kelling, Jeffrey; Ódor, Géza; Gemming, Sibylle

2016-08-01

Extensive dynamical simulations of restricted solid-on-solid models in D =2 +1 dimensions have been done using parallel multisurface algorithms implemented on graphics cards. Numerical evidence is presented that these models exhibit Kardar-Parisi-Zhang surface growth scaling, irrespective of the step heights N . We show that by increasing N the corrections to scaling increase, thus smaller step-sized models describe better the asymptotic, long-wave-scaling behavior.
Characteristics of the air supply envelop of the cooled flooded air jet

NASA Astrophysics Data System (ADS)

Timofeevskiy, A. L.; Sulin, A. B.; Ryabova, T. N.; Neganov, D. V.

2017-08-01

The characteristics of a plane-parallel non-isothermal airflow (which is fed into the room in the form of a flooded jet) were investigated,. The temperature and velocity fields were measured experimentally in the cross section of the supply air flare. The results of the theoretical calculation and numerical simulation of temperature and velocity profiles were compared with experimental data in a flat cooled supply jet.
Flexible parallel implicit modelling of coupled thermal-hydraulic-mechanical processes in fractured rocks

NASA Astrophysics Data System (ADS)

Cacace, Mauro; Jacquey, Antoine B.

2017-09-01

Theory and numerical implementation describing groundwater flow and the transport of heat and solute mass in fully saturated fractured rocks with elasto-plastic mechanical feedbacks are developed. In our formulation, fractures are considered as being of lower dimension than the hosting deformable porous rock and we consider their hydraulic and mechanical apertures as scaling parameters to ensure continuous exchange of fluid mass and energy within the fracture-solid matrix system. The coupled system of equations is implemented in a new simulator code that makes use of a Galerkin finite-element technique. The code builds on a flexible, object-oriented numerical framework (MOOSE, Multiphysics Object Oriented Simulation Environment) which provides an extensive scalable parallel and implicit coupling to solve for the multiphysics problem. The governing equations of groundwater flow, heat and mass transport, and rock deformation are solved in a weak sense (either by classical Newton-Raphson or by free Jacobian inexact Newton-Krylow schemes) on an underlying unstructured mesh. Nonlinear feedbacks among the active processes are enforced by considering evolving fluid and rock properties depending on the thermo-hydro-mechanical state of the system and the local structure, i.e. degree of connectivity, of the fracture system. A suite of applications is presented to illustrate the flexibility and capability of the new simulator to address problems of increasing complexity and occurring at different spatial (from centimetres to tens of kilometres) and temporal scales (from minutes to hundreds of years).
Interface COMSOL-PHREEQC (iCP), an efficient numerical framework for the solution of coupled multiphysics and geochemistry

NASA Astrophysics Data System (ADS)

Nardi, Albert; Idiart, Andrés; Trinchero, Paolo; de Vries, Luis Manuel; Molinero, Jorge

2014-08-01

This paper presents the development, verification and application of an efficient interface, denoted as iCP, which couples two standalone simulation programs: the general purpose Finite Element framework COMSOL Multiphysics® and the geochemical simulator PHREEQC. The main goal of the interface is to maximize the synergies between the aforementioned codes, providing a numerical platform that can efficiently simulate a wide number of multiphysics problems coupled with geochemistry. iCP is written in Java and uses the IPhreeqc C++ dynamic library and the COMSOL Java-API. Given the large computational requirements of the aforementioned coupled models, special emphasis has been placed on numerical robustness and efficiency. To this end, the geochemical reactions are solved in parallel by balancing the computational load over multiple threads. First, a benchmark exercise is used to test the reliability of iCP regarding flow and reactive transport. Then, a large scale thermo-hydro-chemical (THC) problem is solved to show the code capabilities. The results of the verification exercise are successfully compared with those obtained using PHREEQC and the application case demonstrates the scalability of a large scale model, at least up to 32 threads.
Numerically based design of an orifice plate flowmetering system for human respiratory flow monitoring.

PubMed

Fortuna, A O; Gurd, J R

1999-01-01

During certain medical procedures, it is important to continuously measure the respiratory flow of a patient, as lack of proper ventilation can cause brain damage and ultimately death. The monitoring of the ventilatory condition of a patient is usually performed with the aid of flowmeters. However, water and other secretions present in the expired air can build up and ultimately block a traditional, restriction-based flowmeter; by using an orifice plate flowmeter, such blockages are minimized. This paper describes the design of an orifice plate flowmetering system including, especially, a description of the numerical and computational techniques adopted in order to simulate human respiratory and sinusoidal air flow across various possible designs for the orifice plate flowmeter device. Parallel computation and multigrid techniques were employed in order to reduce execution time. The simulated orifice plate was later built and tested under unsteady sinusoidal flows. Experimental tests show reasonable agreement with the numerical simulation, thereby reinforcing the general hypothesis that computational exploration of the design space is sufficiently accurate to allow designers of such systems to use this in preference to the more traditional, mechanical prototyping techniques.
Numerical modelling of orthogonal cutting: application to woodworking with a bench plane.

PubMed

Nairn, John A

2016-06-06

A numerical model for orthogonal cutting using the material point method was applied to woodcutting using a bench plane. The cutting process was modelled by accounting for surface energy associated with wood fracture toughness for crack growth parallel to the grain. By using damping to deal with dynamic crack propagation and modelling all contact between wood and the plane, simulations could initiate chip formation and proceed into steady-state chip propagation including chip curling. Once steady-state conditions were achieved, the cutting forces became constant and could be determined as a function of various simulation variables. The modelling details included a cutting tool, the tool's rake and grinding angles, a chip breaker, a base plate and a mouth opening between the base plate and the tool. The wood was modelled as an anisotropic elastic-plastic material. The simulations were verified by comparison to an analytical model and then used to conduct virtual experiments on wood planing. The virtual experiments showed interactions between depth of cut, chip breaker location and mouth opening. Additional simulations investigated the role of tool grinding angle, tool sharpness and friction.
An Improved Simulation of the Diurnally Varying Street Canyon Flow

NASA Astrophysics Data System (ADS)

Yaghoobian, Neda; Kleissl, Jan; Paw U, Kyaw Tha

2012-11-01

The impact of diurnal variation of temperature distribution over building and ground surfaces on the wind flow and scalar transport in street canyons is numerically investigated using the PArallelized LES Model (PALM). The Temperature of Urban Facets Indoor-Outdoor Building Energy Simulator (TUF-IOBES) is used for predicting urban surface heat fluxes as boundary conditions for a modified version of PALM. TUF-IOBES dynamically simulates indoor and outdoor building surface temperatures and heat fluxes in an urban area taking into account weather conditions, indoor heat sources, building and urban material properties, composition of the building envelope (e.g. windows, insulation), and HVAC equipment. Temperature (and heat flux) distribution over urban surfaces of the 3-D raster-type geometry of TUF-IOBES makes it possible to provide realistic, high resolution boundary conditions for the numerical simulation of flow and scalar transport in an urban canopy. Compared to some previous analyses using uniformly distributed thermal forcing associated with urban surfaces, the present analysis shows that resolving non-uniform thermal forcings can provide more detailed and realistic patterns of the local air flow and pollutant dispersion in urban canyons.
Simulation of dispersion in layered coastal aquifer systems

USGS Publications Warehouse

Reilly, T.E.

1990-01-01

A density-dependent solute-transport formulation is used to examine ground-water flow in layered coastal aquifers. The numerical experiments indicate that although the transition zone may be thought of as an impermeable 'sharp' interface with freshwater flow parallel to the transition zone in homogeneous aquifers, this is not the case for layered systems. Freshwater can discharge through the transition zone in the confining units. Further, for the best simulation of layered coastal aquifer systems, either a flow-direction-dependent dispersion formulation is required, or the dispersivities must change spatially to reflect the tight thin confining unit. ?? 1990.
Computational Modeling of the Dolphin Kick in Competitive Swimming

NASA Astrophysics Data System (ADS)

Loebbeck, A.; Mark, R.; Bhanot, G.

2005-11-01

Numerical simulations are being used to study the fluid dynamics of the dolphin kick in competitive swimming. This stroke is performed underwater after starts and turns and involves an undulatory motion of the body. Highly detailed laser body scans of elite swimmers are used and the kinematics of the dolphin kick is recreated from videos of Olympic level swimmers. We employ a parallelized immersed boundary method to simulate the flow associated with this stroke in all its complexity. The simulations provide a first of its kind glimpse of the fluid and vortex dynamics associated with this stroke and hydrodynamic force computations allow us to gain a better understanding of the thrust producing mechanisms.

Some recent developments of the immersed interface method for flow simulation

NASA Astrophysics Data System (ADS)

Xu, Sheng

2017-11-01

The immersed interface method is a general methodology for solving PDEs subject to interfaces. In this talk, I will give an overview of some recent developments of the method toward the enhancement of its robustness for flow simulation. In particular, I will present with numerical results how to capture boundary conditions on immersed rigid objects, how to adopt interface triangulation in the method, and how to parallelize the method for flow with moving objects. With these developments, the immersed interface method can achieve accurate and efficient simulation of a flow involving multiple moving complex objects. Thanks to NSF for the support of this work under Grant NSF DMS 1320317.
Characterizing the velocity of a wandering black hole and properties of the surrounding medium using convolutional neural networks

NASA Astrophysics Data System (ADS)

González, J. A.; Guzmán, F. S.

2018-03-01

We present a method for estimating the velocity of a wandering black hole and the equation of state for the gas around it based on a catalog of numerical simulations. The method uses machine-learning methods based on convolutional neural networks applied to the classification of images resulting from numerical simulations. Specifically we focus on the supersonic velocity regime and choose the direction of the black hole to be parallel to its spin. We build a catalog of 900 simulations by numerically solving Euler's equations onto the fixed space-time background of a black hole, for two parameters: the adiabatic index Γ with values in the range [1.1, 5 /3 ], and the asymptotic relative velocity of the black hole with respect to the surroundings v∞, with values within [0.2 ,0.8 ]c . For each simulation we produce a 2D image of the gas density once the process of accretion has approached a stationary regime. The results obtained show that the implemented convolutional neural networks are able to correctly classify the adiabatic index 87.78% of the time within an uncertainty of ±0.0284 , while the prediction of the velocity is correct 96.67% of the time within an uncertainty of ±0.03 c . We expect that this combination of a massive number of numerical simulations and machine-learning methods will help us analyze more complicated scenarios related to future high-resolution observations of black holes, like those from the Event Horizon Telescope.
Development of a fully implicit particle-in-cell scheme for gyrokinetic electromagnetic turbulence simulation in XGC1

NASA Astrophysics Data System (ADS)

Ku, Seung-Hoe; Hager, R.; Chang, C. S.; Chacon, L.; Chen, G.; EPSI Team

2016-10-01

The cancelation problem has been a long-standing issue for long wavelengths modes in electromagnetic gyrokinetic PIC simulations in toroidal geometry. As an attempt of resolving this issue, we implemented a fully implicit time integration scheme in the full-f, gyrokinetic PIC code XGC1. The new scheme - based on the implicit Vlasov-Darwin PIC algorithm by G. Chen and L. Chacon - can potentially resolve cancelation problem. The time advance for the field and the particle equations is space-time-centered, with particle sub-cycling. The resulting system of equations is solved by a Picard iteration solver with fixed-point accelerator. The algorithm is implemented in the parallel velocity formalism instead of the canonical parallel momentum formalism. XGC1 specializes in simulating the tokamak edge plasma with magnetic separatrix geometry. A fully implicit scheme could be a way to accurate and efficient gyrokinetic simulations. We will test if this numerical scheme overcomes the cancelation problem, and reproduces the dispersion relation of Alfven waves and tearing modes in cylindrical geometry. Funded by US DOE FES and ASCR, and computing resources provided by OLCF through ALCC.
Event-based hydrological modeling for detecting dominant hydrological process and suitable model strategy for semi-arid catchments

NASA Astrophysics Data System (ADS)

Huang, Pengnian; Li, Zhijia; Chen, Ji; Li, Qiaoling; Yao, Cheng

2016-11-01

To simulate the hydrological processes in semi-arid areas properly is still challenging. This study assesses the impact of different modeling strategies on simulating flood processes in semi-arid catchments. Four classic hydrological models, TOPMODEL, XINANJIANG (XAJ), SAC-SMA and TANK, were selected and applied to three semi-arid catchments in North China. Based on analysis and comparison of the simulation results of these classic models, four new flexible models were constructed and used to further investigate the suitability of various modeling strategies for semi-arid environments. Numerical experiments were also designed to examine the performances of the models. The results show that in semi-arid catchments a suitable model needs to include at least one nonlinear component to simulate the main process of surface runoff generation. If there are more than two nonlinear components in the hydrological model, they should be arranged in parallel, rather than in series. In addition, the results show that the parallel nonlinear components should be combined by multiplication rather than addition. Moreover, this study reveals that the key hydrological process over semi-arid catchments is the infiltration excess surface runoff, a non-linear component.
Numerical modelling of the formation of fibrous bedding-parallel veins

NASA Astrophysics Data System (ADS)

Torremans, Koen; Muchez, Philippe; Sintubin, Manuel

2014-05-01

Bedding-parallel veins with a fibrous infill oriented orthogonal to the vein wall, are often observed in fine-grained metasedimentary sequences. Several mechanisms have been proposed for their formation, mostly with respect to effects of fluid overpressures and anisotropy of the host-rock fabric in order to explain the inferred extensional failure with sub-vertical opening. Abundant pre-folding, bedding-parallel fibrous dolomite veins are found associated with the Nkana-Mindola stratiform Cu-Co deposit in Zambia. The goal of this study is to better understand the formation mechanisms of these veins and to explain their particular spatial and thickness distribution, with respect to failure of transversely isotropic rocks. The spatial distribution and thickness variation of these veins was quantified during a field campaign in thirteen line transects perpendicular to undeformed veins in underground crosscuts. The fibrous dolomite veins studied are not related to lithological contrasts, but to a strong bedding-parallel shaly fabric, typical for the black shale facies of the Copperbelt Orebody Member. The host rock can hence be considered as transversely isotropic. Growth morphologies vary from antitaxial with a pronounced median surface to asymmetric syntaxial, always with small but quantifiable growth competition. A microstructural fabric study reveals that the undeformed dolomite veins show low-tortuosity vein walls and quantifiable growth competition. Here, we use a Discrete Element Method numerical modelling approach with ESyS-Particle (http://launchpad.net/esys-particle) to simulate the observed properties of the veins. Calibrated numerical specimens with a transversely isotropic matrix are repeatedly brought to failure under constant strain rates by changing the effective strain rates at model boundaries. After each fracture event, fractures in the numerical model are filled with cohesive vein material and the experiment is repeated. By systematically varying stress states, fluid pressures and mechanical properties of materials (host rock, vein infill and interface), we attempt to reproduce the characteristics of spatial distribution and thickness variation of the veins. Four parameter sets of mechanical micro-properties are defined in the models, essentially yielding (1) a competent and (2) incompetent matrix, (3) a vein material and (4) a vein-matrix interface. Each combination of parameters and particle packings is calibrated to fit a predetermined Mohr-Coulomb type failure envelope, via an automated calibration procedure. Preliminary tests already show that by varying these parameters, we are able to simulate realistically distributed cracking through crack-seal processes. Different types of veins and vein generations can be modelled, ranging from single veins, over crack-seal veins to anastomosing veins, by varying the mechanical strength of competent and incompetent matrix, vein and interface material. Further results of this approach will be presented. We will discuss our results with respect to mechanisms proposed in the literature for bedding-parallel, fibrous veins in metasedimentary rock sequences.
Numerical Test of Analytical Theories for Perpendicular Diffusion in Small Kubo Number Turbulence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Heusen, M.; Shalchi, A., E-mail: husseinm@myumanitoba.ca, E-mail: andreasm4@yahoo.com

In the literature, one can find various analytical theories for perpendicular diffusion of energetic particles interacting with magnetic turbulence. Besides quasi-linear theory, there are different versions of the nonlinear guiding center (NLGC) theory and the unified nonlinear transport (UNLT) theory. For turbulence with high Kubo numbers, such as two-dimensional turbulence or noisy reduced magnetohydrodynamic turbulence, the aforementioned nonlinear theories provide similar results. For slab and small Kubo number turbulence, however, this is not the case. In the current paper, we compare different linear and nonlinear theories with each other and test-particle simulations for a noisy slab model corresponding to smallmore » Kubo number turbulence. We show that UNLT theory agrees very well with all performed test-particle simulations. In the limit of long parallel mean free paths, the perpendicular mean free path approaches asymptotically the quasi-linear limit as predicted by the UNLT theory. For short parallel mean free paths we find a Rechester and Rosenbluth type of scaling as predicted by UNLT theory as well. The original NLGC theory disagrees with all performed simulations regardless what the parallel mean free path is. The random ballistic interpretation of the NLGC theory agrees much better with the simulations, but compared to UNLT theory the agreement is inferior. We conclude that for this type of small Kubo number turbulence, only the latter theory allows for an accurate description of perpendicular diffusion.« less
Advancing MODFLOW Applying the Derived Vector Space Method

NASA Astrophysics Data System (ADS)

Herrera, G. S.; Herrera, I.; Lemus-García, M.; Hernandez-Garcia, G. D.

2015-12-01

The most effective domain decomposition methods (DDM) are non-overlapping DDMs. Recently a new approach, the DVS-framework, based on an innovative discretization method that uses a non-overlapping system of nodes (the derived-nodes), was introduced and developed by I. Herrera et al. [1, 2]. Using the DVS-approach a group of four algorithms, referred to as the 'DVS-algorithms', which fulfill the DDM-paradigm (i.e. the solution of global problems is obtained by resolution of local problems exclusively) has been derived. Such procedures are applicable to any boundary-value problem, or system of such equations, for which a standard discretization method is available and then software with a high degree of parallelization can be constructed. In a parallel talk, in this AGU Fall Meeting, Ismael Herrera will introduce the general DVS methodology. The application of the DVS-algorithms has been demonstrated in the solution of several boundary values problems of interest in Geophysics. Numerical examples for a single-equation, for the cases of symmetric, non-symmetric and indefinite problems were demonstrated before [1,2]. For these problems DVS-algorithms exhibited significantly improved numerical performance with respect to standard versions of DDM algorithms. In view of these results our research group is in the process of applying the DVS method to a widely used simulator for the first time, here we present the advances of the application of this method for the parallelization of MODFLOW. Efficiency results for a group of tests will be presented. References [1] I. Herrera, L.M. de la Cruz and A. Rosas-Medina. Non overlapping discretization methods for partial differential equations, Numer Meth Part D E, (2013). [2] Herrera, I., & Contreras Iván "An Innovative Tool for Effectively Applying Highly Parallelized Software To Problems of Elasticity". Geofísica Internacional, 2015 (In press)
Analytical theory of coherent synchrotron radiation wakefield of short bunches shielded by conducting parallel plates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stupakov, Gennady; Zhou, Demin

2016-04-21

We develop a general model of coherent synchrotron radiation (CSR) impedance with shielding provided by two parallel conducting plates. This model allows us to easily reproduce all previously known analytical CSR wakes and to expand the analysis to situations not explored before. It reduces calculations of the impedance to taking integrals along the trajectory of the beam. New analytical results are derived for the radiation impedance with shielding for the following orbits: a kink, a bending magnet, a wiggler of finite length, and an infinitely long wiggler. Furthermore, all our formulas are benchmarked against numerical simulations with the CSRZ computermore » code.« less
Salinity transfer in double diffusive convection bounded by two parallel plates

NASA Astrophysics Data System (ADS)

Yang, Yantao; van der Poel, Erwin P.; Ostilla-Monico, Rodolfo; Sun, Chao; Verzicco, Roberto; Grossmann, Siegfried; Lohse, Detlef

2014-11-01

The double diffusive convection (DDC) is the convection flow with the fluid density affected by two different components. In this study we numerically investigate DDC between two parallel plates with no-slip boundary conditions. The top plate has higher salinity and temperature than the lower one. Thus the flow is driven by the salinity difference and stabilised by the temperature difference. Our simulations are compared with the experiments by Hage and Tilgner (Phys. Fluids 22, 076603 (2010)) for several sets of parameters. Reasonable agreement is achieved for the salinity flux and its dependence on the salinity Rayleigh number. For all parameters considered, salt fingers emerge and extend through the entire domain height. The thermal Rayleigh number shows minor influence on the salinity flux although it does affect the Reynolds number. We apply the Grossmann-Lohse theory for Rayleigh-Bénard flow to the current problem without introducing any new coefficients. The theory successfully predicts the salinity flux with respect to the scaling for both the numerical and experimental results.
A conservative scheme of drift kinetic electrons for gyrokinetic simulation of kinetic-MHD processes in toroidal plasmas

NASA Astrophysics Data System (ADS)

Bao, J.; Liu, D.; Lin, Z.

2017-10-01

A conservative scheme of drift kinetic electrons for gyrokinetic simulations of kinetic-magnetohydrodynamic processes in toroidal plasmas has been formulated and verified. Both vector potential and electron perturbed distribution function are decomposed into adiabatic part with analytic solution and non-adiabatic part solved numerically. The adiabatic parallel electric field is solved directly from the electron adiabatic response, resulting in a high degree of accuracy. The consistency between electrostatic potential and parallel vector potential is enforced by using the electron continuity equation. Since particles are only used to calculate the non-adiabatic response, which is used to calculate the non-adiabatic vector potential through Ohm's law, the conservative scheme minimizes the electron particle noise and mitigates the cancellation problem. Linear dispersion relations of the kinetic Alfvén wave and the collisionless tearing mode in cylindrical geometry have been verified in gyrokinetic toroidal code simulations, which show that the perpendicular grid size can be larger than the electron collisionless skin depth when the mode wavelength is longer than the electron skin depth.
Numerical Analysis of Ginzburg-Landau Models for Superconductivity.

NASA Astrophysics Data System (ADS)

Coskun, Erhan

Thin film conventional, as well as High T _{c} superconductors of various geometric shapes placed under both uniform and variable strength magnetic field are studied using the universially accepted macroscopic Ginzburg-Landau model. A series of new theoretical results concerning the properties of solution is presented using the semi -discrete time-dependent Ginzburg-Landau equations, staggered grid setup and natural boundary conditions. Efficient serial algorithms including a novel adaptive algorithm is developed and successfully implemented for solving the governing highly nonlinear parabolic system of equations. Refinement technique used in the adaptive algorithm is based on modified forward Euler method which was also developed by us to ease the restriction on time step size for stability considerations. Stability and convergence properties of forward and modified forward Euler schemes are studied. Numerical simulations of various recent physical experiments of technological importance such as vortes motion and pinning are performed. The numerical code for solving time-dependent Ginzburg-Landau equations is parallelized using BlockComm -Chameleon and PCN. The parallel code was run on the distributed memory multiprocessors intel iPSC/860, IBM-SP1 and cluster of Sun Sparc workstations, all located at Mathematics and Computer Science Division, Argonne National Laboratory.
Modulated heat pulse propagation and partial transport barriers in chaotic magnetic fields

DOE Office of Scientific and Technical Information (OSTI.GOV)

Castillo-Negrete, Diego del; Blazevski, Daniel

2016-04-15

Direct numerical simulations of the time dependent parallel heat transport equation modeling heat pulses driven by power modulation in three-dimensional chaotic magnetic fields are presented. The numerical method is based on the Fourier formulation of a Lagrangian-Green's function method that provides an accurate and efficient technique for the solution of the parallel heat transport equation in the presence of harmonic power modulation. The numerical results presented provide conclusive evidence that even in the absence of magnetic flux surfaces, chaotic magnetic field configurations with intermediate levels of stochasticity exhibit transport barriers to modulated heat pulse propagation. In particular, high-order islands andmore » remnants of destroyed flux surfaces (Cantori) act as partial barriers that slow down or even stop the propagation of heat waves at places where the magnetic field connection length exhibits a strong gradient. Results on modulated heat pulse propagation in fully stochastic fields and across magnetic islands are also presented. In qualitative agreement with recent experiments in large helical device and DIII-D, it is shown that the elliptic (O) and hyperbolic (X) points of magnetic islands have a direct impact on the spatio-temporal dependence of the amplitude of modulated heat pulses.« less
Lattice dynamics calculations based on density-functional perturbation theory in real space

NASA Astrophysics Data System (ADS)

Shang, Honghui; Carbogno, Christian; Rinke, Patrick; Scheffler, Matthias

2017-06-01

A real-space formalism for density-functional perturbation theory (DFPT) is derived and applied for the computation of harmonic vibrational properties in molecules and solids. The practical implementation using numeric atom-centered orbitals as basis functions is demonstrated exemplarily for the all-electron Fritz Haber Institute ab initio molecular simulations (FHI-aims) package. The convergence of the calculations with respect to numerical parameters is carefully investigated and a systematic comparison with finite-difference approaches is performed both for finite (molecules) and extended (periodic) systems. Finally, the scaling tests and scalability tests on massively parallel computer systems demonstrate the computational efficiency.
A parallel adaptive mesh refinement algorithm

NASA Technical Reports Server (NTRS)

Quirk, James J.; Hanebutte, Ulf R.

1993-01-01

Over recent years, Adaptive Mesh Refinement (AMR) algorithms which dynamically match the local resolution of the computational grid to the numerical solution being sought have emerged as powerful tools for solving problems that contain disparate length and time scales. In particular, several workers have demonstrated the effectiveness of employing an adaptive, block-structured hierarchical grid system for simulations of complex shock wave phenomena. Unfortunately, from the parallel algorithm developer's viewpoint, this class of scheme is quite involved; these schemes cannot be distilled down to a small kernel upon which various parallelizing strategies may be tested. However, because of their block-structured nature such schemes are inherently parallel, so all is not lost. In this paper we describe the method by which Quirk's AMR algorithm has been parallelized. This method is built upon just a few simple message passing routines and so it may be implemented across a broad class of MIMD machines. Moreover, the method of parallelization is such that the original serial code is left virtually intact, and so we are left with just a single product to support. The importance of this fact should not be underestimated given the size and complexity of the original algorithm.
Large eddy simulations and direct numerical simulations of high speed turbulent reacting flows

NASA Technical Reports Server (NTRS)

Givi, P.; Frankel, S. H.; Adumitroaie, V.; Sabini, G.; Madnia, C. K.

1993-01-01

The primary objective of this research is to extend current capabilities of Large Eddy Simulations (LES) and Direct Numerical Simulations (DNS) for the computational analyses of high speed reacting flows. Our efforts in the first two years of this research have been concentrated on a priori investigations of single-point Probability Density Function (PDF) methods for providing subgrid closures in reacting turbulent flows. In the efforts initiated in the third year, our primary focus has been on performing actual LES by means of PDF methods. The approach is based on assumed PDF methods and we have performed extensive analysis of turbulent reacting flows by means of LES. This includes simulations of both three-dimensional (3D) isotropic compressible flows and two-dimensional reacting planar mixing layers. In addition to these LES analyses, some work is in progress to assess the extent of validity of our assumed PDF methods. This assessment is done by making detailed companions with recent laboratory data in predicting the rate of reactant conversion in parallel reacting shear flows. This report provides a summary of our achievements for the first six months of the third year of this program.
Edge gyrokinetic theory and continuum simulations

NASA Astrophysics Data System (ADS)

Xu, X. Q.; Xiong, Z.; Dorr, M. R.; Hittinger, J. A.; Bodi, K.; Candy, J.; Cohen, B. I.; Cohen, R. H.; Colella, P.; Kerbel, G. D.; Krasheninnikov, S.; Nevins, W. M.; Qin, H.; Rognlien, T. D.; Snyder, P. B.; Umansky, M. V.

2007-08-01

The following results are presented from the development and application of TEMPEST, a fully nonlinear (full-f) five-dimensional (3d2v) gyrokinetic continuum edge-plasma code. (1) As a test of the interaction of collisions and parallel streaming, TEMPEST is compared with published analytic and numerical results for endloss of particles confined by combined electrostatic and magnetic wells. Good agreement is found over a wide range of collisionality, confining potential and mirror ratio, and the required velocity space resolution is modest. (2) In a large-aspect-ratio circular geometry, excellent agreement is found for a neoclassical equilibrium with parallel ion flow in the banana regime with zero temperature gradient and radial electric field. (3) The four-dimensional (2d2v) version of the code produces the first self-consistent simulation results of collisionless damping of geodesic acoustic modes and zonal flow (Rosenbluth-Hinton residual) with Boltzmann electrons using a full-f code. The electric field is also found to agree with the standard neoclassical expression for steep density and ion temperature gradients in the plateau regime. In divertor geometry, it is found that the endloss of particles and energy induces parallel flow stronger than the core neoclassical predictions in the SOL.
Parallel high-precision orbit propagation using the modified Picard-Chebyshev method

NASA Astrophysics Data System (ADS)

Koblick, Darin C.

2012-03-01

The modified Picard-Chebyshev method, when run in parallel, is thought to be more accurate and faster than the most efficient sequential numerical integration techniques when applied to orbit propagation problems. Previous experiments have shown that the modified Picard-Chebyshev method can have up to a one order magnitude speedup over the 12th order Runge-Kutta-Nystrom method. For this study, the evaluation of the accuracy and computational time of the modified Picard-Chebyshev method, using the Java Astrodynamics Toolkit high-precision force model, is conducted to assess its runtime performance. Simulation results of the modified Picard-Chebyshev method, implemented in MATLAB and the MATLAB Parallel Computing Toolbox, are compared against the most efficient first and second order Ordinary Differential Equation (ODE) solvers. A total of six processors were used to assess the runtime performance of the modified Picard-Chebyshev method. It was found that for all orbit propagation test cases, where the gravity model was simulated to be of higher degree and order (above 225 to increase computational overhead), the modified Picard-Chebyshev method was faster, by as much as a factor of two, than the other ODE solvers which were tested.
Direct Numerical Simulation of Turbulent Flow Over Complex Bathymetry

NASA Astrophysics Data System (ADS)

Yue, L.; Hsu, T. J.

2017-12-01

Direct numerical simulation (DNS) is regarded as a powerful tool in the investigation of turbulent flow featured with a wide range of time and spatial scales. With the application of coordinate transformation in a pseudo-spectral scheme, a parallelized numerical modeling system was created aiming at simulating flow over complex bathymetry with high numerical accuracy and efficiency. The transformed governing equations were integrated in time using a third-order low-storage Runge-Kutta method. For spatial discretization, the discrete Fourier expansion was adopted in the streamwise and spanwise direction, enforcing the periodic boundary condition in both directions. The Chebyshev expansion on Chebyshev-Gauss-Lobatto points was used in the wall-normal direction, assuming there is no-slip on top and bottom walls. The diffusion terms were discretized with a Crank-Nicolson scheme, while the advection terms dealiased with the 2/3 rule were discretized with an Adams-Bashforth scheme. In the prediction step, the velocity was calculated in physical domain by solving the resulting linear equation directly. However, the extra terms introduced by coordinate transformation impose a strict limitation to time step and an iteration method was applied to overcome this restriction in the correction step for pressure by solving the Helmholtz equation. The numerical solver is written in object-oriented C++ programing language utilizing Armadillo linear algebra library for matrix computation. Several benchmarking cases in laminar and turbulent flow were carried out to verify/validate the numerical model and very good agreements are achieved. Ongoing work focuses on implementing sediment transport capability for multiple sediment classes and parameterizations for flocculation processes.
Time-Accurate Local Time Stepping and High-Order Time CESE Methods for Multi-Dimensional Flows Using Unstructured Meshes

NASA Technical Reports Server (NTRS)

Chang, Chau-Lyan; Venkatachari, Balaji Shankar; Cheng, Gary

2013-01-01

With the wide availability of affordable multiple-core parallel supercomputers, next generation numerical simulations of flow physics are being focused on unsteady computations for problems involving multiple time scales and multiple physics. These simulations require higher solution accuracy than most algorithms and computational fluid dynamics codes currently available. This paper focuses on the developmental effort for high-fidelity multi-dimensional, unstructured-mesh flow solvers using the space-time conservation element, solution element (CESE) framework. Two approaches have been investigated in this research in order to provide high-accuracy, cross-cutting numerical simulations for a variety of flow regimes: 1) time-accurate local time stepping and 2) highorder CESE method. The first approach utilizes consistent numerical formulations in the space-time flux integration to preserve temporal conservation across the cells with different marching time steps. Such approach relieves the stringent time step constraint associated with the smallest time step in the computational domain while preserving temporal accuracy for all the cells. For flows involving multiple scales, both numerical accuracy and efficiency can be significantly enhanced. The second approach extends the current CESE solver to higher-order accuracy. Unlike other existing explicit high-order methods for unstructured meshes, the CESE framework maintains a CFL condition of one for arbitrarily high-order formulations while retaining the same compact stencil as its second-order counterpart. For large-scale unsteady computations, this feature substantially enhances numerical efficiency. Numerical formulations and validations using benchmark problems are discussed in this paper along with realistic examples.
Parallelization of Rocket Engine Simulator Software (PRESS)

NASA Technical Reports Server (NTRS)

Cezzar, Ruknet

1997-01-01

Parallelization of Rocket Engine System Software (PRESS) project is part of a collaborative effort with Southern University at Baton Rouge (SUBR), University of West Florida (UWF), and Jackson State University (JSU). The second-year funding, which supports two graduate students enrolled in our new Master's program in Computer Science at Hampton University and the principal investigator, have been obtained for the period from October 19, 1996 through October 18, 1997. The key part of the interim report was new directions for the second year funding. This came about from discussions during Rocket Engine Numeric Simulator (RENS) project meeting in Pensacola on January 17-18, 1997. At that time, a software agreement between Hampton University and NASA Lewis Research Center had already been concluded. That agreement concerns off-NASA-site experimentation with PUMPDES/TURBDES software. Before this agreement, during the first year of the project, another large-scale FORTRAN-based software, Two-Dimensional Kinetics (TDK), was being used for translation to an object-oriented language and parallelization experiments. However, that package proved to be too complex and lacking sufficient documentation for effective translation effort to the object-oriented C + + source code. The focus, this time with better documented and more manageable PUMPDES/TURBDES package, was still on translation to C + + with design improvements. At the RENS Meeting, however, the new impetus for the RENS projects in general, and PRESS in particular, has shifted in two important ways. One was closer alignment with the work on Numerical Propulsion System Simulator (NPSS) through cooperation and collaboration with LERC ACLU organization. The other was to see whether and how NASA's various rocket design software can be run over local and intra nets without any radical efforts for redesign and translation into object-oriented source code. There were also suggestions that the Fortran based code be encapsulated in C + + code thereby facilitating reuse without undue development effort. The details are covered in the aforementioned section of the interim report filed on April 28, 1997.

Continuum Edge Gyrokinetic Theory and Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, X Q; Xiong, Z; Dorr, M R

The following results are presented from the development and application of TEMPEST, a fully nonlinear (full-f) five dimensional (3d2v) gyrokinetic continuum edge-plasma code. (1) As a test of the interaction of collisions and parallel streaming, TEMPEST is compared with published analytic and numerical results for endloss of particles confined by combined electrostatic and magnetic wells. Good agreement is found over a wide range of collisionality, confining potential, and mirror ratio; and the required velocity space resolution is modest. (2) In a large-aspect-ratio circular geometry, excellent agreement is found for a neoclassical equilibrium with parallel ion flow in the banana regimemore » with zero temperature gradient and radial electric field. (3) The four-dimensional (2d2v) version of the code produces the first self-consistent simulation results of collisionless damping of geodesic acoustic modes and zonal flow (Rosenbluth-Hinton residual) with Boltzmann electrons using a full-f code. The electric field is also found to agree with the standard neoclassical expression for steep density and ion temperature gradients in the banana regime. In divertor geometry, it is found that the endloss of particles and energy induces parallel flow stronger than the core neoclassical predictions in the SOL. (5) Our 5D gyrokinetic formulation yields a set of nonlinear electrostatic gyrokinetic equations that are for both neoclassical and turbulence simulations.« less
Numerical simulations of internal wave generation by convection in water.

PubMed

Lecoanet, Daniel; Le Bars, Michael; Burns, Keaton J; Vasil, Geoffrey M; Brown, Benjamin P; Quataert, Eliot; Oishi, Jeffrey S

2015-06-01

Water's density maximum at 4°C makes it well suited to study internal gravity wave excitation by convection: an increasing temperature profile is unstable to convection below 4°C, but stably stratified above 4°C. We present numerical simulations of a waterlike fluid near its density maximum in a two-dimensional domain. We successfully model the damping of waves in the simulations using linear theory, provided we do not take the weak damping limit typically used in the literature. To isolate the physical mechanism exciting internal waves, we use the spectral code dedalus to run several simplified model simulations of our more detailed simulation. We use data from the full simulation as source terms in two simplified models of internal-wave excitation by convection: bulk excitation by convective Reynolds stresses, and interface forcing via the mechanical oscillator effect. We find excellent agreement between the waves generated in the full simulation and the simplified simulation implementing the bulk excitation mechanism. The interface forcing simulations overexcite high-frequency waves because they assume the excitation is by the "impulsive" penetration of plumes, which spreads energy to high frequencies. However, we find that the real excitation is instead by the "sweeping" motion of plumes parallel to the interface. Our results imply that the bulk excitation mechanism is a very accurate heuristic for internal-wave generation by convection.
Physics Computing '92: Proceedings of the 4th International Conference

NASA Astrophysics Data System (ADS)

de Groot, Robert A.; Nadrchal, Jaroslav

1993-04-01

The Table of Contents for the book is as follows: * Preface * INVITED PAPERS * Ab Initio Theoretical Approaches to the Structural, Electronic and Vibrational Properties of Small Clusters and Fullerenes: The State of the Art * Neural Multigrid Methods for Gauge Theories and Other Disordered Systems * Multicanonical Monte Carlo Simulations * On the Use of the Symbolic Language Maple in Physics and Chemistry: Several Examples * Nonequilibrium Phase Transitions in Catalysis and Population Models * Computer Algebra, Symmetry Analysis and Integrability of Nonlinear Evolution Equations * The Path-Integral Quantum Simulation of Hydrogen in Metals * Digital Optical Computing: A New Approach of Systolic Arrays Based on Coherence Modulation of Light and Integrated Optics Technology * Molecular Dynamics Simulations of Granular Materials * Numerical Implementation of a K.A.M. Algorithm * Quasi-Monte Carlo, Quasi-Random Numbers and Quasi-Error Estimates * What Can We Learn from QMC Simulations * Physics of Fluctuating Membranes * Plato, Apollonius, and Klein: Playing with Spheres * Steady States in Nonequilibrium Lattice Systems * CONVODE: A REDUCE Package for Differential Equations * Chaos in Coupled Rotators * Symplectic Numerical Methods for Hamiltonian Problems * Computer Simulations of Surfactant Self Assembly * High-dimensional and Very Large Cellular Automata for Immunological Shape Space * A Review of the Lattice Boltzmann Method * Electronic Structure of Solids in the Self-interaction Corrected Local-spin-density Approximation * Dedicated Computers for Lattice Gauge Theory Simulations * Physics Education: A Survey of Problems and Possible Solutions * Parallel Computing and Electronic-Structure Theory * High Precision Simulation Techniques for Lattice Field Theory * CONTRIBUTED PAPERS * Case Study of Microscale Hydrodynamics Using Molecular Dynamics and Lattice Gas Methods * Computer Modelling of the Structural and Electronic Properties of the Supported Metal Catalysis * Ordered Particle Simulations for Serial and MIMD Parallel Computers * "NOLP" -- Program Package for Laser Plasma Nonlinear Optics * Algorithms to Solve Nonlinear Least Square Problems * Distribution of Hydrogen Atoms in Pd-H Computed by Molecular Dynamics * A Ray Tracing of Optical System for Protein Crystallography Beamline at Storage Ring-SIBERIA-2 * Vibrational Properties of a Pseudobinary Linear Chain with Correlated Substitutional Disorder * Application of the Software Package Mathematica in Generalized Master Equation Method * Linelist: An Interactive Program for Analysing Beam-foil Spectra * GROMACS: A Parallel Computer for Molecular Dynamics Simulations * GROMACS Method of Virial Calculation Using a Single Sum * The Interactive Program for the Solution of the Laplace Equation with the Elimination of Singularities for Boundary Functions * Random-Number Generators: Testing Procedures and Comparison of RNG Algorithms * Micro-TOPIC: A Tokamak Plasma Impurities Code * Rotational Molecular Scattering Calculations * Orthonormal Polynomial Method for Calibrating of Cryogenic Temperature Sensors * Frame-based System Representing Basis of Physics * The Role of Massively Data-parallel Computers in Large Scale Molecular Dynamics Simulations * Short-range Molecular Dynamics on a Network of Processors and Workstations * An Algorithm for Higher-order Perturbation Theory in Radiative Transfer Computations * Hydrostochastics: The Master Equation Formulation of Fluid Dynamics * HPP Lattice Gas on Transputers and Networked Workstations * Study on the Hysteresis Cycle Simulation Using Modeling with Different Functions on Intervals * Refined Pruning Techniques for Feed-forward Neural Networks * Random Walk Simulation of the Motion of Transient Charges in Photoconductors * The Optical Hysteresis in Hydrogenated Amorphous Silicon * Diffusion Monte Carlo Analysis of Modern Interatomic Potentials for He * A Parallel Strategy for Molecular Dynamics Simulations of Polar Liquids on Transputer Arrays * Distribution of Ions Reflected on Rough Surfaces * The Study of Step Density Distribution During Molecular Beam Epitaxy Growth: Monte Carlo Computer Simulation * Towards a Formal Approach to the Construction of Large-scale Scientific Applications Software * Correlated Random Walk and Discrete Modelling of Propagation through Inhomogeneous Media * Teaching Plasma Physics Simulation * A Theoretical Determination of the Au-Ni Phase Diagram * Boson and Fermion Kinetics in One-dimensional Lattices * Computational Physics Course on the Technical University * Symbolic Computations in Simulation Code Development and Femtosecond-pulse Laser-plasma Interaction Studies * Computer Algebra and Integrated Computing Systems in Education of Physical Sciences * Coordinated System of Programs for Undergraduate Physics Instruction * Program Package MIRIAM and Atomic Physics of Extreme Systems * High Energy Physics Simulation on the T_Node * The Chapman-Kolmogorov Equation as Representation of Huygens' Principle and the Monolithic Self-consistent Numerical Modelling of Lasers * Authoring System for Simulation Developments * Molecular Dynamics Study of Ion Charge Effects in the Structure of Ionic Crystals * A Computational Physics Introductory Course * Computer Calculation of Substrate Temperature Field in MBE System * Multimagnetical Simulation of the Ising Model in Two and Three Dimensions * Failure of the CTRW Treatment of the Quasicoherent Excitation Transfer * Implementation of a Parallel Conjugate Gradient Method for Simulation of Elastic Light Scattering * Algorithms for Study of Thin Film Growth * Algorithms and Programs for Physics Teaching in Romanian Technical Universities * Multicanonical Simulation of 1st order Transitions: Interface Tension of the 2D 7-State Potts Model * Two Numerical Methods for the Calculation of Periodic Orbits in Hamiltonian Systems * Chaotic Behavior in a Probabilistic Cellular Automata? * Wave Optics Computing by a Networked-based Vector Wave Automaton * Tensor Manipulation Package in REDUCE * Propagation of Electromagnetic Pulses in Stratified Media * The Simple Molecular Dynamics Model for the Study of Thermalization of the Hot Nucleon Gas * Electron Spin Polarization in PdCo Alloys Calculated by KKR-CPA-LSD Method * Simulation Studies of Microscopic Droplet Spreading * A Vectorizable Algorithm for the Multicolor Successive Overrelaxation Method * Tetragonality of the CuAu I Lattice and Its Relation to Electronic Specific Heat and Spin Susceptibility * Computer Simulation of the Formation of Metallic Aggregates Produced by Chemical Reactions in Aqueous Solution * Scaling in Growth Models with Diffusion: A Monte Carlo Study * The Nucleus as the Mesoscopic System * Neural Network Computation as Dynamic System Simulation * First-principles Theory of Surface Segregation in Binary Alloys * Data Smooth Approximation Algorithm for Estimating the Temperature Dependence of the Ice Nucleation Rate * Genetic Algorithms in Optical Design * Application of 2D-FFT in the Study of Molecular Exchange Processes by NMR * Advanced Mobility Model for Electron Transport in P-Si Inversion Layers * Computer Simulation for Film Surfaces and its Fractal Dimension * Parallel Computation Techniques and the Structure of Catalyst Surfaces * Educational SW to Teach Digital Electronics and the Corresponding Text Book * Primitive Trinomials (Mod 2) Whose Degree is a Mersenne Exponent * Stochastic Modelisation and Parallel Computing * Remarks on the Hybrid Monte Carlo Algorithm for the ∫4 Model * An Experimental Computer Assisted Workbench for Physics Teaching * A Fully Implicit Code to Model Tokamak Plasma Edge Transport * EXPFIT: An Interactive Program for Automatic Beam-foil Decay Curve Analysis * Mapping Technique for Solving General, 1-D Hamiltonian Systems * Freeway Traffic, Cellular Automata, and Some (Self-Organizing) Criticality * Photonuclear Yield Analysis by Dynamic Programming * Incremental Representation of the Simply Connected Planar Curves * Self-convergence in Monte Carlo Methods * Adaptive Mesh Technique for Shock Wave Propagation * Simulation of Supersonic Coronal Streams and Their Interaction with the Solar Wind * The Nature of Chaos in Two Systems of Ordinary Nonlinear Differential Equations * Considerations of a Window-shopper * Interpretation of Data Obtained by RTP 4-Channel Pulsed Radar Reflectometer Using a Multi Layer Perceptron * Statistics of Lattice Bosons for Finite Systems * Fractal Based Image Compression with Affine Transformations * Algorithmic Studies on Simulation Codes for Heavy-ion Reactions * An Energy-Wise Computer Simulation of DNA-Ion-Water Interactions Explains the Abnormal Structure of Poly[d(A)]:Poly[d(T)] * Computer Simulation Study of Kosterlitz-Thouless-Like Transitions * Problem-oriented Software Package GUN-EBT for Computer Simulation of Beam Formation and Transport in Technological Electron-Optical Systems * Parallelization of a Boundary Value Solver and its Application in Nonlinear Dynamics * The Symbolic Classification of Real Four-dimensional Lie Algebras * Short, Singular Pulses Generation by a Dye Laser at Two Wavelengths Simultaneously * Quantum Monte Carlo Simulations of the Apex-Oxygen-Model * Approximation Procedures for the Axial Symmetric Static Einstein-Maxwell-Higgs Theory * Crystallization on a Sphere: Parallel Simulation on a Transputer Network * FAMULUS: A Software Product (also) for Physics Education * MathCAD vs. FAMULUS -- A Brief Comparison * First-principles Dynamics Used to Study Dissociative Chemisorption * A Computer Controlled System for Crystal Growth from Melt * A Time Resolved Spectroscopic Method for Short Pulsed Particle Emission * Green's Function Computation in Radiative Transfer Theory * Random Search Optimization Technique for One-criteria and Multi-criteria Problems * Hartley Transform Applications to Thermal Drift Elimination in Scanning Tunneling Microscopy * Algorithms of Measuring, Processing and Interpretation of Experimental Data Obtained with Scanning Tunneling Microscope * Time-dependent Atom-surface Interactions * Local and Global Minima on Molecular Potential Energy Surfaces: An Example of N3 Radical * Computation of Bifurcation Surfaces * Symbolic Computations in Quantum Mechanics: Energies in Next-to-solvable Systems * A Tool for RTP Reactor and Lamp Field Design * Modelling of Particle Spectra for the Analysis of Solid State Surface * List of Participants
Blob dynamics in TORPEX poloidal null configurations

NASA Astrophysics Data System (ADS)

Shanahan, B. W.; Dudson, B. D.

2016-12-01

3D blob dynamics are simulated in X-point magnetic configurations in the TORPEX device via a non-field-aligned coordinate system, using an isothermal model which evolves density, vorticity, parallel velocity and parallel current density. By modifying the parallel gradient operator to include perpendicular perturbations from poloidal field coils, numerical singularities associated with field aligned coordinates are avoided. A comparison with a previously developed analytical model (Avino 2016 Phys. Rev. Lett. 116 105001) is performed and an agreement is found with minimal modification. Experimental comparison determines that the null region can cause an acceleration of filaments due to increasing connection length, but this acceleration is small relative to other effects, which we quantify. Experimental measurements (Avino 2016 Phys. Rev. Lett. 116 105001) are reproduced, and the dominant acceleration mechanism is identified as that of a developing dipole in a moving background. Contributions from increasing connection length close to the null point are a small correction.
Performance of multi-hop parallel free-space optical communication over gamma-gamma fading channel with pointing errors.

PubMed

Gao, Zhengguang; Liu, Hongzhan; Ma, Xiaoping; Lu, Wei

2016-11-10

Multi-hop parallel relaying is considered in a free-space optical (FSO) communication system deploying binary phase-shift keying (BPSK) modulation under the combined effects of a gamma-gamma (GG) distribution and misalignment fading. Based on the best path selection criterion, the cumulative distribution function (CDF) of this cooperative random variable is derived. Then the performance of this optical mesh network is analyzed in detail. A Monte Carlo simulation is also conducted to demonstrate the effectiveness of the results for the average bit error rate (ABER) and outage probability. The numerical result proves that it needs a smaller average transmitted optical power to achieve the same ABER and outage probability when using the multi-hop parallel network in FSO links. Furthermore, the system use of more number of hops and cooperative paths can improve the quality of the communication.
Computational Issues in Damping Identification for Large Scale Problems

NASA Technical Reports Server (NTRS)

Pilkey, Deborah L.; Roe, Kevin P.; Inman, Daniel J.

1997-01-01

Two damping identification methods are tested for efficiency in large-scale applications. One is an iterative routine, and the other a least squares method. Numerical simulations have been performed on multiple degree-of-freedom models to test the effectiveness of the algorithm and the usefulness of parallel computation for the problems. High Performance Fortran is used to parallelize the algorithm. Tests were performed using the IBM-SP2 at NASA Ames Research Center. The least squares method tested incurs high communication costs, which reduces the benefit of high performance computing. This method's memory requirement grows at a very rapid rate meaning that larger problems can quickly exceed available computer memory. The iterative method's memory requirement grows at a much slower pace and is able to handle problems with 500+ degrees of freedom on a single processor. This method benefits from parallelization, and significant speedup can he seen for problems of 100+ degrees-of-freedom.
Automatic mesh refinement and parallel load balancing for Fokker-Planck-DSMC algorithm

NASA Astrophysics Data System (ADS)

Küchlin, Stephan; Jenny, Patrick

2018-06-01

Recently, a parallel Fokker-Planck-DSMC algorithm for rarefied gas flow simulation in complex domains at all Knudsen numbers was developed by the authors. Fokker-Planck-DSMC (FP-DSMC) is an augmentation of the classical DSMC algorithm, which mitigates the near-continuum deficiencies in terms of computational cost of pure DSMC. At each time step, based on a local Knudsen number criterion, the discrete DSMC collision operator is dynamically switched to the Fokker-Planck operator, which is based on the integration of continuous stochastic processes in time, and has fixed computational cost per particle, rather than per collision. In this contribution, we present an extension of the previous implementation with automatic local mesh refinement and parallel load-balancing. In particular, we show how the properties of discrete approximations to space-filling curves enable an efficient implementation. Exemplary numerical studies highlight the capabilities of the new code.
Reducing neural network training time with parallel processing

NASA Technical Reports Server (NTRS)

Rogers, James L., Jr.; Lamarsh, William J., II

1995-01-01

Obtaining optimal solutions for engineering design problems is often expensive because the process typically requires numerous iterations involving analysis and optimization programs. Previous research has shown that a near optimum solution can be obtained in less time by simulating a slow, expensive analysis with a fast, inexpensive neural network. A new approach has been developed to further reduce this time. This approach decomposes a large neural network into many smaller neural networks that can be trained in parallel. Guidelines are developed to avoid some of the pitfalls when training smaller neural networks in parallel. These guidelines allow the engineer: to determine the number of nodes on the hidden layer of the smaller neural networks; to choose the initial training weights; and to select a network configuration that will capture the interactions among the smaller neural networks. This paper presents results describing how these guidelines are developed.
Numerical investigation for the impact of CO2 geologic sequestration on regional groundwater flow

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yamamoto, H.; Zhang, K.; Karasaki, K.

Large-scale storage of carbon dioxide in saline aquifers may cause considerable pressure perturbation and brine migration in deep rock formations, which may have a significant influence on the regional groundwater system. With the help of parallel computing techniques, we conducted a comprehensive, large-scale numerical simulation of CO{sub 2} geologic storage that predicts not only CO{sub 2} migration, but also its impact on regional groundwater flow. As a case study, a hypothetical industrial-scale CO{sub 2} injection in Tokyo Bay, which is surrounded by the most heavily industrialized area in Japan, was considered, and the impact of CO{sub 2} injection on near-surfacemore » aquifers was investigated, assuming relatively high seal-layer permeability (higher than 10 microdarcy). A regional hydrogeological model with an area of about 60 km x 70 km around Tokyo Bay was discretized into about 10 million gridblocks. To solve the high-resolution model efficiently, we used a parallelized multiphase flow simulator TOUGH2-MP/ECO2N on a world-class high performance supercomputer in Japan, the Earth Simulator. In this simulation, CO{sub 2} was injected into a storage aquifer at about 1 km depth under Tokyo Bay from 10 wells, at a total rate of 10 million tons/year for 100 years. Through the model, we can examine regional groundwater pressure buildup and groundwater migration to the land surface. The results suggest that even if containment of CO{sub 2} plume is ensured, pressure buildup on the order of a few bars can occur in the shallow confined aquifers over extensive regions, including urban inlands.« less
A fully non-linear multi-species Fokker–Planck–Landau collision operator for simulation of fusion plasma

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hager, Robert, E-mail: rhager@pppl.gov; Yoon, E.S., E-mail: yoone@rpi.edu; Ku, S., E-mail: sku@pppl.gov

2016-06-15

Fusion edge plasmas can be far from thermal equilibrium and require the use of a non-linear collision operator for accurate numerical simulations. In this article, the non-linear single-species Fokker–Planck–Landau collision operator developed by Yoon and Chang (2014) [9] is generalized to include multiple particle species. The finite volume discretization used in this work naturally yields exact conservation of mass, momentum, and energy. The implementation of this new non-linear Fokker–Planck–Landau operator in the gyrokinetic particle-in-cell codes XGC1 and XGCa is described and results of a verification study are discussed. Finally, the numerical techniques that make our non-linear collision operator viable onmore » high-performance computing systems are described, including specialized load balancing algorithms and nested OpenMP parallelization. The collision operator's good weak and strong scaling behavior are shown.« less
A fully non-linear multi-species Fokker–Planck–Landau collision operator for simulation of fusion plasma

DOE PAGES

Hager, Robert; Yoon, E. S.; Ku, S.; ...

2016-04-04

Fusion edge plasmas can be far from thermal equilibrium and require the use of a non-linear collision operator for accurate numerical simulations. The non-linear single-species Fokker–Planck–Landau collision operator developed by Yoon and Chang (2014) [9] is generalized to include multiple particle species. Moreover, the finite volume discretization used in this work naturally yields exact conservation of mass, momentum, and energy. The implementation of this new non-linear Fokker–Planck–Landau operator in the gyrokinetic particle-in-cell codes XGC1 and XGCa is described and results of a verification study are discussed. Finally, the numerical techniques that make our non-linear collision operator viable on high-performance computingmore » systems are described, including specialized load balancing algorithms and nested OpenMP parallelization. As a result, the collision operator's good weak and strong scaling behavior are shown.« less
Numerical analysis of hydrodynamics in a rotor-stator reactor for biodiesel synthesis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wen, Zhuqing; Petera, Jerzy

A rotor-stator spinning disk reactor for intensified biodiesel synthesis is described and numerically simulated. The reactor consists of two flat disks, located coaxially and parallel to each other with a gap ranging from 0.1 mm to 0.2 mm between the disks. The upper disk is located on a rotating shaft while the lower disk is stationary. The feed liquids, triglycerides (TG) and methanol are introduced coaxially along the center line of rotating disk and stationary disk, respectively. Fluid hydrodynamics in the reactor for synthesis of biodiesel from TG and methanol in the presence of a sodium hydroxide catalyst are simulated, using convection-diffusion-reactionmore » species transport model by the CFD software ANSYS©Fluent v. 13.0. The effects of upper disk’s spinning speed, gap size and flow rates at inlets are evaluated.« less
GPU-accelerated simulations of isolated black holes

NASA Astrophysics Data System (ADS)

Lewis, Adam G. M.; Pfeiffer, Harald P.

2018-05-01

We present a port of the numerical relativity code SpEC which is capable of running on NVIDIA GPUs. Since this code must be maintained in parallel with SpEC itself, a primary design consideration is to perform as few explicit code changes as possible. We therefore rely on a hierarchy of automated porting strategies. At the highest level we use TLoops, a C++ library of our design, to automatically emit CUDA code equivalent to tensorial expressions written into C++ source using a syntax similar to analytic calculation. Next, we trace out and cache explicit matrix representations of the numerous linear transformations in the SpEC code, which allows these to be performed on the GPU using pre-existing matrix-multiplication libraries. We port the few remaining important modules by hand. In this paper we detail the specifics of our port, and present benchmarks of it simulating isolated black hole spacetimes on several generations of NVIDIA GPU.
Modelling and simulation of “Free Cooling” process applied to building construction

NASA Astrophysics Data System (ADS)

Ousegui, A.; Asbik, M.

2018-05-01

Thermal energy storage systems (TES), using phase change material (PCM) in building walls, consists a hot topic within the research community currently. In the present work, a numerical model is developed to simulate free cooling of air-PCM heat exchanger in both charging and discharging steps. The studied case is taken from experimental work. The domain consists in two parallel plates made of Paraffin as PCM, separate by a gap where air circulates. The flow and temperature can be adjusted. The goal is to calculate the temperature of the air at the outlet, in order to analyse the performance of the device. A good agreement was founded between experimental and numerical results. The analysis of the influence of the flow rate on the efficiency of the process confirms a previous works, that the heating flow rate should be higher than cooling one.
Simulation of two-dimensional turbulent flows in a rotating annulus

NASA Astrophysics Data System (ADS)

Storey, Brian D.

2004-05-01

Rotating water tank experiments have been used to study fundamental processes of atmospheric and geophysical turbulence in a controlled laboratory setting. When these tanks are undergoing strong rotation the forced turbulent flow becomes highly two dimensional along the axis of rotation. An efficient numerical method has been developed for simulating the forced quasi-geostrophic equations in an annular geometry to model current laboratory experiments. The algorithm employs a spectral method with Fourier series and Chebyshev polynomials as basis functions. The algorithm has been implemented on a parallel architecture to allow modelling of a wide range of spatial scales over long integration times. This paper describes the derivation of the model equations, numerical method, testing and performance of the algorithm. Results provide reasonable agreement with the experimental data, indicating that such computations can be used as a predictive tool to design future experiments.
Pure quasi-P wave equation and numerical solution in 3D TTI media

NASA Astrophysics Data System (ADS)

Zhang, Jian-Min; He, Bing-Shou; Tang, Huai-Gu

2017-03-01

Based on the pure quasi-P wave equation in transverse isotropic media with a vertical symmetry axis (VTI media), a quasi-P wave equation is obtained in transverse isotropic media with a tilted symmetry axis (TTI media). This is achieved using projection transformation, which rotates the direction vector in the coordinate system of observation toward the direction vector for the coordinate system in which the z-component is parallel to the symmetry axis of the TTI media. The equation has a simple form, is easily calculated, is not influenced by the pseudo-shear wave, and can be calculated reliably when δ is greater than ɛ. The finite difference method is used to solve the equation. In addition, a perfectly matched layer (PML) absorbing boundary condition is obtained for the equation. Theoretical analysis and numerical simulation results with forward modeling prove that the equation can accurately simulate a quasi-P wave in TTI medium.
Performance Optimization of Marine Science and Numerical Modeling on HPC Cluster

PubMed Central

Yang, Dongdong; Yang, Hailong; Wang, Luming; Zhou, Yucong; Zhang, Zhiyuan; Wang, Rui; Liu, Yi

2017-01-01

Marine science and numerical modeling (MASNUM) is widely used in forecasting ocean wave movement, through simulating the variation tendency of the ocean wave. Although efforts have been devoted to improve the performance of MASNUM from various aspects by existing work, there is still large space unexplored for further performance improvement. In this paper, we aim at improving the performance of propagation solver and data access during the simulation, in addition to the efficiency of output I/O and load balance. Our optimizations include several effective techniques such as the algorithm redesign, load distribution optimization, parallel I/O and data access optimization. The experimental results demonstrate that our approach achieves higher performance compared to the state-of-the-art work, about 3.5x speedup without degrading the prediction accuracy. In addition, the parameter sensitivity analysis shows our optimizations are effective under various topography resolutions and output frequencies. PMID:28045972
EIT forward problem parallel simulation environment with anisotropic tissue and realistic electrode models.

PubMed

De Marco, Tommaso; Ries, Florian; Guermandi, Marco; Guerrieri, Roberto

2012-05-01

Electrical impedance tomography (EIT) is an imaging technology based on impedance measurements. To retrieve meaningful insights from these measurements, EIT relies on detailed knowledge of the underlying electrical properties of the body. This is obtained from numerical models of current flows therein. The nonhomogeneous and anisotropic electric properties of human tissues make accurate modeling and simulation very challenging, leading to a tradeoff between physical accuracy and technical feasibility, which at present severely limits the capabilities of EIT. This work presents a complete algorithmic flow for an accurate EIT modeling environment featuring high anatomical fidelity with a spatial resolution equal to that provided by an MRI and a novel realistic complete electrode model implementation. At the same time, we demonstrate that current graphics processing unit (GPU)-based platforms provide enough computational power that a domain discretized with five million voxels can be numerically modeled in about 30 s.
Modeling of fracture of protective concrete structures under impact loads

DOE Office of Scientific and Technical Information (OSTI.GOV)

Radchenko, P. A., E-mail: radchenko@live.ru; Batuev, S. P.; Radchenko, A. V.

This paper presents results of numerical simulation of interaction between a Boeing 747-400 aircraft and the protective shell of a nuclear power plant. The shell is presented as a complex multilayered cellular structure consisting of layers of concrete and fiber concrete bonded with steel trusses. Numerical simulation was performed three-dimensionally using the original algorithm and software taking into account algorithms for building grids of complex geometric objects and parallel computations. Dynamics of the stress-strain state and fracture of the structure were studied. Destruction is described using a two-stage model that allows taking into account anisotropy of elastic and strength propertiesmore » of concrete and fiber concrete. It is shown that wave processes initiate destruction of the cellular shell structure; cells start to destruct in an unloading wave originating after the compression wave arrival at free cell surfaces.« less
ISCFD Nagoya 1989 - International Symposium on Computational Fluid Dynamics, 3rd, Nagoya, Japan, Aug. 28-31, 1989, Technical Papers

NASA Astrophysics Data System (ADS)

Recent advances in computational fluid dynamics are discussed in reviews and reports. Topics addressed include large-scale LESs for turbulent pipe and channel flows, numerical solutions of the Euler and Navier-Stokes equations on parallel computers, multigrid methods for steady high-Reynolds-number flow past sudden expansions, finite-volume methods on unstructured grids, supersonic wake flow on a blunt body, a grid-characteristic method for multidimensional gas dynamics, and CIC numerical simulation of a wave boundary layer. Consideration is given to vortex simulations of confined two-dimensional jets, supersonic viscous shear layers, spectral methods for compressible flows, shock-wave refraction at air/water interfaces, oscillatory flow in a two-dimensional collapsible channel, the growth of randomness in a spatially developing wake, and an efficient simplex algorithm for the finite-difference and dynamic linear-programming method in optimal potential control.

Plasma dynamics on current-carrying magnetic flux tubes

NASA Technical Reports Server (NTRS)

Swift, Daniel W.

1992-01-01

A 1D numerical simulation is used to investigate the evolution of a plasma in a current-carrying magnetic flux tube of variable cross section. A large potential difference, parallel to the magnetic field, is applied across the domain. The result is that density minimum tends to deepen, primarily in the cathode end, and the entire potential drop becomes concentrated across the region of density minimum. The evolution of the simulation shows some sensitivity to particle boundary conditions, but the simulations inevitably evolve into a final state with a nearly stationary double layer near the cathode end. The simulation results are at sufficient variance with observations that it appears unlikely that auroral electrons can be explained by a simple process of acceleration through a field-aligned potential drop.
Parallel language constructs for tensor product computations on loosely coupled architectures

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Van Rosendale, John

1989-01-01

A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. The authors focus on tensor product array computations, a simple but important class of numerical algorithms. They consider first the problem of programming one-dimensional kernel routines, such as parallel tridiagonal solvers, and then look at how such parallel kernels can be combined to form parallel tensor product algorithms.
Experimental and Numerical Study on the Strength of Aluminum Extrusion Welding.

PubMed

Bingöl, Sedat; Bozacı, Atilla

2015-07-17

The quality of extrusion welding in the extruded hollow shapes is influenced significantly by the pressure and effective stress under which the material is being joined inside the welding chamber. However, extrusion welding was not accounted for in the past by the developers of finite element software packages. In this study, the strength of hollow extrusion profile with seam weld produced at different ram speeds was investigated experimentally and numerically. The experiments were performed on an extruded hollow aluminum profile which was suitable to obtain the tensile tests specimens from its seam weld's region at both parallel to extrusion direction and perpendicular to extrusion direction. A new numerical modeling approach, which was recently proposed in literature, was used for numerical analyses of the study. The simulation results performed at different ram speeds were compared with the experimental results, and a good agreement was obtained.
Coordinate Systems, Numerical Objects and Algorithmic Operations of Computational Experiment in Fluid Mechanics

NASA Astrophysics Data System (ADS)

Degtyarev, Alexander; Khramushin, Vasily

2016-02-01

The paper deals with the computer implementation of direct computational experiments in fluid mechanics, constructed on the basis of the approach developed by the authors. The proposed approach allows the use of explicit numerical scheme, which is an important condition for increasing the effciency of the algorithms developed by numerical procedures with natural parallelism. The paper examines the main objects and operations that let you manage computational experiments and monitor the status of the computation process. Special attention is given to a) realization of tensor representations of numerical schemes for direct simulation; b) realization of representation of large particles of a continuous medium motion in two coordinate systems (global and mobile); c) computing operations in the projections of coordinate systems, direct and inverse transformation in these systems. Particular attention is paid to the use of hardware and software of modern computer systems.
Meso-modeling of Carbon Fiber Composite for Crash Safety Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Shih-Po; Chen, Yijung; Zeng, Danielle

2017-04-06

In the conventional approach, the material properties for crash safety simulations are typically obtained from standard coupon tests, where the test results only provide single layer material properties used in crash simulations. However, the lay-up effects for the failure behaviors of the real structure were not considered in numerical simulations. Hence, there was discrepancy between the crash simulations and experimental tests. Consequently, an intermediate stage is required for accurate predictions. Some component tests are required to correlate the material models in the intermediate stage. In this paper, a Mazda Tube under high-impact velocity is chosen as an example for themore » crash safety analysis. The tube consists of 24 layers of uni-directional (UD) carbon fiber composite materials, in which 4 layers are perpendicular to, while the other layers are parallel to the impact direction. An LS-DYNA meso-model was constructed with orthotropic material models counting for the single-layer material behaviors. Between layers, a node-based tie-break contact was used for modeling the delamination of the composite material. Since fiber directions are not single-oriented, the lay-up effects could be an important effect. From the first numerical trial, premature material failure occurred due to the use of material parameters obtained directly from the coupon tests. Some parametric studies were conducted to identify the cause of the numerical instability. The finding is that the material failure strength used in the numerical model needs to be enlarged to stabilize the numerical model. Some hypothesis was made to provide the foundation for enlarging the failure strength and the corresponding experiments will be conducted to validate the hypothesis.« less
Parallelized modelling and solution scheme for hierarchically scaled simulations

NASA Technical Reports Server (NTRS)

Padovan, Joe

1995-01-01

This two-part paper presents the results of a benchmarked analytical-numerical investigation into the operational characteristics of a unified parallel processing strategy for implicit fluid mechanics formulations. This hierarchical poly tree (HPT) strategy is based on multilevel substructural decomposition. The Tree morphology is chosen to minimize memory, communications and computational effort. The methodology is general enough to apply to existing finite difference (FD), finite element (FEM), finite volume (FV) or spectral element (SE) based computer programs without an extensive rewrite of code. In addition to finding large reductions in memory, communications, and computational effort associated with a parallel computing environment, substantial reductions are generated in the sequential mode of application. Such improvements grow with increasing problem size. Along with a theoretical development of general 2-D and 3-D HPT, several techniques for expanding the problem size that the current generation of computers are capable of solving, are presented and discussed. Among these techniques are several interpolative reduction methods. It was found that by combining several of these techniques that a relatively small interpolative reduction resulted in substantial performance gains. Several other unique features/benefits are discussed in this paper. Along with Part 1's theoretical development, Part 2 presents a numerical approach to the HPT along with four prototype CFD applications. These demonstrate the potential of the HPT strategy.
A synthetic method for atmospheric diffusion simulation and environmental impact assessment of accidental pollution in the chemical industry in a WEBGIS context.

PubMed

Ni, Haochen; Rui, Yikang; Wang, Jiechen; Cheng, Liang

2014-09-05

The chemical industry poses a potential security risk to factory personnel and neighboring residents. In order to mitigate prospective damage, a synthetic method must be developed for an emergency response. With the development of environmental numeric simulation models, model integration methods, and modern information technology, many Decision Support Systems (DSSs) have been established. However, existing systems still have limitations, in terms of synthetic simulation and network interoperation. In order to resolve these limitations, the matured simulation model for chemical accidents was integrated into the WEB Geographic Information System (WEBGIS) platform. The complete workflow of the emergency response, including raw data (meteorology information, and accident information) management, numeric simulation of different kinds of accidents, environmental impact assessments, and representation of the simulation results were achieved. This allowed comprehensive and real-time simulation of acute accidents in the chemical industry. The main contribution of this paper is that an organizational mechanism of the model set, based on the accident type and pollutant substance; a scheduling mechanism for the parallel processing of multi-accident-type, multi-accident-substance, and multi-simulation-model; and finally a presentation method for scalar and vector data on the web browser on the integration of a WEB Geographic Information System (WEBGIS) platform. The outcomes demonstrated that this method could provide effective support for deciding emergency responses of acute chemical accidents.
A Synthetic Method for Atmospheric Diffusion Simulation and Environmental Impact Assessment of Accidental Pollution in the Chemical Industry in a WEBGIS Context

PubMed Central

Ni, Haochen; Rui, Yikang; Wang, Jiechen; Cheng, Liang

2014-01-01

The chemical industry poses a potential security risk to factory personnel and neighboring residents. In order to mitigate prospective damage, a synthetic method must be developed for an emergency response. With the development of environmental numeric simulation models, model integration methods, and modern information technology, many Decision Support Systems (DSSs) have been established. However, existing systems still have limitations, in terms of synthetic simulation and network interoperation. In order to resolve these limitations, the matured simulation model for chemical accidents was integrated into the WEB Geographic Information System (WEBGIS) platform. The complete workflow of the emergency response, including raw data (meteorology information, and accident information) management, numeric simulation of different kinds of accidents, environmental impact assessments, and representation of the simulation results were achieved. This allowed comprehensive and real-time simulation of acute accidents in the chemical industry. The main contribution of this paper is that an organizational mechanism of the model set, based on the accident type and pollutant substance; a scheduling mechanism for the parallel processing of multi-accident-type, multi-accident-substance, and multi-simulation-model; and finally a presentation method for scalar and vector data on the web browser on the integration of a WEB Geographic Information System (WEBGIS) platform. The outcomes demonstrated that this method could provide effective support for deciding emergency responses of acute chemical accidents. PMID:25198686
A 3-D Finite-Volume Non-hydrostatic Icosahedral Model (NIM)

NASA Astrophysics Data System (ADS)

Lee, Jin

2014-05-01

The Nonhydrostatic Icosahedral Model (NIM) formulates the latest numerical innovation of the three-dimensional finite-volume control volume on the quasi-uniform icosahedral grid suitable for ultra-high resolution simulations. NIM's modeling goal is to improve numerical accuracy for weather and climate simulations as well as to utilize the state-of-art computing architecture such as massive parallel CPUs and GPUs to deliver routine high-resolution forecasts in timely manner. NIM dynamic corel innovations include: * A local coordinate system remapped spherical surface to plane for numerical accuracy (Lee and MacDonald, 2009), * Grid points in a table-driven horizontal loop that allow any horizontal point sequence (A.E. MacDonald, et al., 2010), * Flux-Corrected Transport formulated on finite-volume operators to maintain conservative positive definite transport (J.-L, Lee, ET. Al., 2010), *Icosahedral grid optimization (Wang and Lee, 2011), * All differentials evaluated as three-dimensional finite-volume integrals around the control volume. The three-dimensional finite-volume solver in NIM is designed to improve pressure gradient calculation and orographic precipitation over complex terrain. NIM dynamical core has been successfully verified with various non-hydrostatic benchmark test cases such as internal gravity wave, and mountain waves in Dynamical Cores Model Inter-comparisons Projects (DCMIP). Physical parameterizations suitable for NWP are incorporated into NIM dynamical core and successfully tested with multimonth aqua-planet simulations. Recently, NIM has started real data simulations using GFS initial conditions. Results from the idealized tests as well as real-data simulations will be shown in the conference.
Direct numerical simulation of vacillation in convection induced by centrifugal buoyancy

NASA Astrophysics Data System (ADS)

Pitz, Diogo B.; Marxen, Olaf; Chew, John W.

2017-11-01

Flows induced by centrifugal buoyancy occur in industrial systems, such as in the compressor cavities of gas turbines, as well as in flows of geophysical interest. In this numerical study we use direct numerical simulation (DNS) to investigate the transition between the steady waves regime, which is characterized by great regularity, to the vacillation regime, which is critical to understand transition to the fully turbulent regime. From previous work it is known that the onset of convection occurs in the form of pairs of nearly-circular rolls which span the entire axial length of the cavity, with small deviations near the parallel, no-slip end walls. When non-linearity sets in triadic interactions occur and, depending on the value of the centrifugal Rayleigh number, the flow is dominated by either a single mode and its harmonics or by broadband effects if turbulence develops. In this study we increase the centrifugal Rayleigh number progressively and investigate mode interactions during the vacillation regime which eventually lead to chaotic motion. Diogo B. Pitz acknowledges the financial support from the Capes foundation through the Science without Borders program.
Notes on the KIVA-2 software and chemically reactive fluid mechanics

NASA Astrophysics Data System (ADS)

Holst, M. J.

1992-09-01

Working notes regarding the mechanics of chemically reactive fluids with sprays, and their numerical simulation with the KIVA-2 software are presented. KIVA-2 is a large FORTRAN program developed at Los Alamos National Laboratory for internal combustion engine simulation. It is our hope that these notes summarize some of the necessary background material in fluid mechanics and combustion, explain the numerical methods currently used in KIVA-2 and similar combustion codes, and provide an outline of the overall structure of KIVA-2 as a representative combustion program, in order to aid the researcher in the task of implementing KIVA-2 or a similar combustion code on a massively parallel computer. The notes are organized into three parts as follows. In Part 1, a brief introduction to continuum mechanics, to fluid mechanics, and to the mechanics of chemically reactive fluids with sprays is presented. In Part 2, a close look at the governing equations of KIVA-2 is taken, and the methods employed in the numerical solution of these equations is discussed. Some conclusions are drawn and some observations are made in Part 3.
Discontinuous Galerkin Methods and High-Speed Turbulent Flows

NASA Astrophysics Data System (ADS)

Atak, Muhammed; Larsson, Johan; Munz, Claus-Dieter

2014-11-01

Discontinuous Galerkin methods gain increasing importance within the CFD community as they combine arbitrary high order of accuracy in complex geometries with parallel efficiency. Particularly the discontinuous Galerkin spectral element method (DGSEM) is a promising candidate for both the direct numerical simulation (DNS) and large eddy simulation (LES) of turbulent flows due to its excellent scaling attributes. In this talk, we present a DNS of a compressible turbulent boundary layer along a flat plate at a free-stream Mach number of M = 2.67 and assess the computational efficiency of the DGSEM at performing high-fidelity simulations of both transitional and turbulent boundary layers. We compare the accuracy of the results as well as the computational performance to results using a high order finite difference method.
Large eddy simulation of hydrodynamic cavitation

NASA Astrophysics Data System (ADS)

Bhatt, Mrugank; Mahesh, Krishnan

2017-11-01

Large eddy simulation is used to study sheet to cloud cavitation over a wedge. The mixture of water and water vapor is represented using a homogeneous mixture model. Compressible Navier-Stokes equations for mixture quantities along with transport equation for vapor mass fraction employing finite rate mass transfer between the two phases, are solved using the numerical method of Gnanaskandan and Mahesh. The method is implemented on unstructured grid with parallel MPI capabilities. Flow over a wedge is simulated at Re = 200 , 000 and the performance of the homogeneous mixture model is analyzed in predicting different regimes of sheet to cloud cavitation; namely, incipient, transitory and periodic, as observed in the experimental investigation of Harish et al.. This work is supported by the Office of Naval Research.
Study of talcum charging status in parallel plate electrostatic separator based on particle trajectory analysis

NASA Astrophysics Data System (ADS)

Yunxiao, CAO; Zhiqiang, WANG; Jinjun, WANG; Guofeng, LI

2018-05-01

Electrostatic separation has been extensively used in mineral processing, and has the potential to separate gangue minerals from raw talcum ore. As for electrostatic separation, the particle charging status is one of important influence factors. To describe the talcum particle charging status in a parallel plate electrostatic separator accurately, this paper proposes a modern images processing method. Based on the actual trajectories obtained from sequence images of particle movement and the analysis of physical forces applied on a charged particle, a numerical model is built, which could calculate the charge-to-mass ratios represented as the charging status of particle and simulate the particle trajectories. The simulated trajectories agree well with the experimental results obtained by images processing. In addition, chemical composition analysis is employed to reveal the relationship between ferrum gangue mineral content and charge-to-mass ratios. Research results show that the proposed method is effective for describing the particle charging status in electrostatic separation.
GPU accelerated particle visualization with Splotch

NASA Astrophysics Data System (ADS)

Rivi, M.; Gheller, C.; Dykes, T.; Krokos, M.; Dolag, K.

2014-07-01

Splotch is a rendering algorithm for exploration and visual discovery in particle-based datasets coming from astronomical observations or numerical simulations. The strengths of the approach are production of high quality imagery and support for very large-scale datasets through an effective mix of the OpenMP and MPI parallel programming paradigms. This article reports our experiences in re-designing Splotch for exploiting emerging HPC architectures nowadays increasingly populated with GPUs. A performance model is introduced to guide our re-factoring of Splotch. A number of parallelization issues are discussed, in particular relating to race conditions and workload balancing, towards achieving optimal performances. Our implementation was accomplished by using the CUDA programming paradigm. Our strategy is founded on novel schemes achieving optimized data organization and classification of particles. We deploy a reference cosmological simulation to present performance results on acceleration gains and scalability. We finally outline our vision for future work developments including possibilities for further optimizations and exploitation of hybrid systems and emerging accelerators.
Stochastic coalescence in finite systems: an algorithm for the numerical solution of the multivariate master equation.

NASA Astrophysics Data System (ADS)

Alfonso, Lester; Zamora, Jose; Cruz, Pedro

2015-04-01

The stochastic approach to coagulation considers the coalescence process going in a system of a finite number of particles enclosed in a finite volume. Within this approach, the full description of the system can be obtained from the solution of the multivariate master equation, which models the evolution of the probability distribution of the state vector for the number of particles of a given mass. Unfortunately, due to its complexity, only limited results were obtained for certain type of kernels and monodisperse initial conditions. In this work, a novel numerical algorithm for the solution of the multivariate master equation for stochastic coalescence that works for any type of kernels and initial conditions is introduced. The performance of the method was checked by comparing the numerically calculated particle mass spectrum with analytical solutions obtained for the constant and sum kernels, with an excellent correspondence between the analytical and numerical solutions. In order to increase the speedup of the algorithm, software parallelization techniques with OpenMP standard were used, along with an implementation in order to take advantage of new accelerator technologies. Simulations results show an important speedup of the parallelized algorithms. This study was funded by a grant from Consejo Nacional de Ciencia y Tecnologia de Mexico SEP-CONACYT CB-131879. The authors also thanks LUFAC® Computacion SA de CV for CPU time and all the support provided.
Parallelization and automatic data distribution for nuclear reactor simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liebrock, L.M.

1997-07-01

Detailed attempts at realistic nuclear reactor simulations currently take many times real time to execute on high performance workstations. Even the fastest sequential machine can not run these simulations fast enough to ensure that the best corrective measure is used during a nuclear accident to prevent a minor malfunction from becoming a major catastrophe. Since sequential computers have nearly reached the speed of light barrier, these simulations will have to be run in parallel to make significant improvements in speed. In physical reactor plants, parallelism abounds. Fluids flow, controls change, and reactions occur in parallel with only adjacent components directlymore » affecting each other. These do not occur in the sequentialized manner, with global instantaneous effects, that is often used in simulators. Development of parallel algorithms that more closely approximate the real-world operation of a reactor may, in addition to speeding up the simulations, actually improve the accuracy and reliability of the predictions generated. Three types of parallel architecture (shared memory machines, distributed memory multicomputers, and distributed networks) are briefly reviewed as targets for parallelization of nuclear reactor simulation. Various parallelization models (loop-based model, shared memory model, functional model, data parallel model, and a combined functional and data parallel model) are discussed along with their advantages and disadvantages for nuclear reactor simulation. A variety of tools are introduced for each of the models. Emphasis is placed on the data parallel model as the primary focus for two-phase flow simulation. Tools to support data parallel programming for multiple component applications and special parallelization considerations are also discussed.« less
High performance Python for direct numerical simulations of turbulent flows

NASA Astrophysics Data System (ADS)

Mortensen, Mikael; Langtangen, Hans Petter

2016-06-01

Direct Numerical Simulations (DNS) of the Navier Stokes equations is an invaluable research tool in fluid dynamics. Still, there are few publicly available research codes and, due to the heavy number crunching implied, available codes are usually written in low-level languages such as C/C++ or Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS code that nearly matches the performance of C++ for thousands of processors and billions of unknowns. We also describe a version optimized through Cython, that is found to match the speed of C++. The solvers are written from scratch in Python, both the mesh, the MPI domain decomposition, and the temporal integrators. The solvers have been verified and benchmarked on the Shaheen supercomputer at the KAUST supercomputing laboratory, and we are able to show very good scaling up to several thousand cores. A very important part of the implementation is the mesh decomposition (we implement both slab and pencil decompositions) and 3D parallel Fast Fourier Transforms (FFT). The mesh decomposition and FFT routines have been implemented in Python using serial FFT routines (either NumPy, pyFFTW or any other serial FFT module), NumPy array manipulations and with MPI communications handled by MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT in Python for a slab mesh decomposition using 4 lines of compact Python code, for which the parallel performance on Shaheen is found to be slightly better than similar routines provided through the FFTW library. For a pencil mesh decomposition 7 lines of code is required to execute a transform.
Parallelization of Lower-Upper Symmetric Gauss-Seidel Method for Chemically Reacting Flow

NASA Technical Reports Server (NTRS)

Yoon, Seokkwan; Jost, Gabriele; Chang, Sherry

2005-01-01

Development of technologies for exploration of the solar system has revived an interest in computational simulation of chemically reacting flows since planetary probe vehicles exhibit non-equilibrium phenomena during the atmospheric entry of a planet or a moon as well as the reentry to the Earth. Stability in combustion is essential for new propulsion systems. Numerical solution of real-gas flows often increases computational work by an order-of-magnitude compared to perfect gas flow partly because of the increased complexity of equations to solve. Recently, as part of Project Columbia, NASA has integrated a cluster of interconnected SGI Altix systems to provide a ten-fold increase in current supercomputing capacity that includes an SGI Origin system. Both the new and existing machines are based on cache coherent non-uniform memory access architecture. Lower-Upper Symmetric Gauss-Seidel (LU-SGS) relaxation method has been implemented into both perfect and real gas flow codes including Real-Gas Aerodynamic Simulator (RGAS). However, the vectorized RGAS code runs inefficiently on cache-based shared-memory machines such as SGI system. Parallelization of a Gauss-Seidel method is nontrivial due to its sequential nature. The LU-SGS method has been vectorized on an oblique plane in INS3D-LU code that has been one of the base codes for NAS Parallel benchmarks. The oblique plane has been called a hyperplane by computer scientists. It is straightforward to parallelize a Gauss-Seidel method by partitioning the hyperplanes once they are formed. Another way of parallelization is to schedule processors like a pipeline using software. Both hyperplane and pipeline methods have been implemented using openMP directives. The present paper reports the performance of the parallelized RGAS code on SGI Origin and Altix systems.
Magnus-induced ratchet effects for skyrmions interacting with asymmetric substrates

NASA Astrophysics Data System (ADS)

Reichhardt, C.; Ray, D.; Olson Reichhardt, C. J.

2015-07-01

We show using numerical simulations that pronounced ratchet effects can occur for ac driven skyrmions moving over asymmetric quasi-one-dimensional substrates. We find a new type of ratchet effect called a Magnus-induced transverse ratchet that arises when the ac driving force is applied perpendicular rather than parallel to the asymmetry direction of the substrate. This transverse ratchet effect only occurs when the Magnus term is finite, and the threshold ac amplitude needed to induce it decreases as the Magnus term becomes more prominent. Ratcheting skyrmions follow ordered orbits in which the net displacement parallel to the substrate asymmetry direction is quantized. Skyrmion ratchets represent a new ac current-based method for controlling skyrmion positions and motion for spintronic applications.

Parallel Work of CO2 Ejectors Installed in a Multi-Ejector Module of Refrigeration System

NASA Astrophysics Data System (ADS)

Bodys, Jakub; Palacz, Michal; Haida, Michal; Smolka, Jacek; Nowak, Andrzej J.; Banasiak, Krzysztof; Hafner, Armin

2016-09-01

A performance analysis on of fixed ejectors installed in a multi-ejector module in a CO2 refrigeration system is presented in this study. The serial and the parallel work of four fixed-geometry units that compose the multi-ejector pack was carried out. The executed numerical simulations were performed with the use of validated Homogeneous Equilibrium Model (HEM). The computational tool ejectorPL for typical transcritical parameters at the motive nozzle were used in all the tests. A wide range of the operating conditions for supermarket applications in three different European climate zones were taken into consideration. The obtained results present the high and stable performance of all the ejectors in the multi-ejector pack.
Hypersonic Boundary Layer Instability Over a Corner

NASA Technical Reports Server (NTRS)

Balakumar, Ponnampalam; Zhao, Hong-Wu; McClinton, Charles (Technical Monitor)

2001-01-01

A boundary-layer transition study over a compression corner was conducted under a hypersonic flow condition. Due to the discontinuities in boundary layer flow, the full Navier-Stokes equations were solved to simulate the development of disturbance in the boundary layer. A linear stability analysis and PSE method were used to get the initial disturbance for parallel and non-parallel flow respectively. A 2-D code was developed to solve the full Navier-stokes by using WENO(weighted essentially non-oscillating) scheme. The given numerical results show the evolution of the linear disturbance for the most amplified disturbance in supersonic and hypersonic flow over a compression ramp. The nonlinear computations also determined the minimal amplitudes necessary to cause transition at a designed location.
Data assimilation using a GPU accelerated path integral Monte Carlo approach

NASA Astrophysics Data System (ADS)

Quinn, John C.; Abarbanel, Henry D. I.

2011-09-01

The answers to data assimilation questions can be expressed as path integrals over all possible state and parameter histories. We show how these path integrals can be evaluated numerically using a Markov Chain Monte Carlo method designed to run in parallel on a graphics processing unit (GPU). We demonstrate the application of the method to an example with a transmembrane voltage time series of a simulated neuron as an input, and using a Hodgkin-Huxley neuron model. By taking advantage of GPU computing, we gain a parallel speedup factor of up to about 300, compared to an equivalent serial computation on a CPU, with performance increasing as the length of the observation time used for data assimilation increases.
Ab initio molecular simulations with numeric atom-centered orbitals

NASA Astrophysics Data System (ADS)

Blum, Volker; Gehrke, Ralf; Hanke, Felix; Havu, Paula; Havu, Ville; Ren, Xinguo; Reuter, Karsten; Scheffler, Matthias

2009-11-01

We describe a complete set of algorithms for ab initio molecular simulations based on numerically tabulated atom-centered orbitals (NAOs) to capture a wide range of molecular and materials properties from quantum-mechanical first principles. The full algorithmic framework described here is embodied in the Fritz Haber Institute "ab initio molecular simulations" (FHI-aims) computer program package. Its comprehensive description should be relevant to any other first-principles implementation based on NAOs. The focus here is on density-functional theory (DFT) in the local and semilocal (generalized gradient) approximations, but an extension to hybrid functionals, Hartree-Fock theory, and MP2/GW electron self-energies for total energies and excited states is possible within the same underlying algorithms. An all-electron/full-potential treatment that is both computationally efficient and accurate is achieved for periodic and cluster geometries on equal footing, including relaxation and ab initio molecular dynamics. We demonstrate the construction of transferable, hierarchical basis sets, allowing the calculation to range from qualitative tight-binding like accuracy to meV-level total energy convergence with the basis set. Since all basis functions are strictly localized, the otherwise computationally dominant grid-based operations scale as O(N) with system size N. Together with a scalar-relativistic treatment, the basis sets provide access to all elements from light to heavy. Both low-communication parallelization of all real-space grid based algorithms and a ScaLapack-based, customized handling of the linear algebra for all matrix operations are possible, guaranteeing efficient scaling (CPU time and memory) up to massively parallel computer systems with thousands of CPUs.
OpenSWPC: an open-source integrated parallel simulation code for modeling seismic wave propagation in 3D heterogeneous viscoelastic media

NASA Astrophysics Data System (ADS)

Maeda, Takuto; Takemura, Shunsuke; Furumura, Takashi

2017-07-01

We have developed an open-source software package, Open-source Seismic Wave Propagation Code (OpenSWPC), for parallel numerical simulations of seismic wave propagation in 3D and 2D (P-SV and SH) viscoelastic media based on the finite difference method in local-to-regional scales. This code is equipped with a frequency-independent attenuation model based on the generalized Zener body and an efficient perfectly matched layer for absorbing boundary condition. A hybrid-style programming using OpenMP and the Message Passing Interface (MPI) is adopted for efficient parallel computation. OpenSWPC has wide applicability for seismological studies and great portability to allowing excellent performance from PC clusters to supercomputers. Without modifying the code, users can conduct seismic wave propagation simulations using their own velocity structure models and the necessary source representations by specifying them in an input parameter file. The code has various modes for different types of velocity structure model input and different source representations such as single force, moment tensor and plane-wave incidence, which can easily be selected via the input parameters. Widely used binary data formats, the Network Common Data Form (NetCDF) and the Seismic Analysis Code (SAC) are adopted for the input of the heterogeneous structure model and the outputs of the simulation results, so users can easily handle the input/output datasets. All codes are written in Fortran 2003 and are available with detailed documents in a public repository.[Figure not available: see fulltext.
Parallel Implementation of a High Order Implicit Collocation Method for the Heat Equation

NASA Technical Reports Server (NTRS)

Kouatchou, Jules; Halem, Milton (Technical Monitor)

2000-01-01

We combine a high order compact finite difference approximation and collocation techniques to numerically solve the two dimensional heat equation. The resulting method is implicit arid can be parallelized with a strategy that allows parallelization across both time and space. We compare the parallel implementation of the new method with a classical implicit method, namely the Crank-Nicolson method, where the parallelization is done across space only. Numerical experiments are carried out on the SGI Origin 2000.
Solvent-assisted multistage nonequilibrium electron transfer in rigid supramolecular systems: Diabatic free energy surfaces and algorithms for numerical simulations

NASA Astrophysics Data System (ADS)

Feskov, Serguei V.; Ivanov, Anatoly I.

2018-03-01

An approach to the construction of diabatic free energy surfaces (FESs) for ultrafast electron transfer (ET) in a supramolecule with an arbitrary number of electron localization centers (redox sites) is developed, supposing that the reorganization energies for the charge transfers and shifts between all these centers are known. Dimensionality of the coordinate space required for the description of multistage ET in this supramolecular system is shown to be equal to N - 1, where N is the number of the molecular centers involved in the reaction. The proposed algorithm of FES construction employs metric properties of the coordinate space, namely, relation between the solvent reorganization energy and the distance between the two FES minima. In this space, the ET reaction coordinate zn n' associated with electron transfer between the nth and n'th centers is calculated through the projection to the direction, connecting the FES minima. The energy-gap reaction coordinates zn n' corresponding to different ET processes are not in general orthogonal so that ET between two molecular centers can create nonequilibrium distribution, not only along its own reaction coordinate but along other reaction coordinates too. This results in the influence of the preceding ET steps on the kinetics of the ensuing ET. It is important for the ensuing reaction to be ultrafast to proceed in parallel with relaxation along the ET reaction coordinates. Efficient algorithms for numerical simulation of multistage ET within the stochastic point-transition model are developed. The algorithms are based on the Brownian simulation technique with the recrossing-event detection procedure. The main advantages of the numerical method are (i) its computational complexity is linear with respect to the number of electronic states involved and (ii) calculations can be naturally parallelized up to the level of individual trajectories. The efficiency of the proposed approach is demonstrated for a model supramolecular system involving four redox centers.
Development of a Robust and Efficient Parallel Solver for Unsteady Turbomachinery Flows

NASA Technical Reports Server (NTRS)

West, Jeff; Wright, Jeffrey; Thakur, Siddharth; Luke, Ed; Grinstead, Nathan

2012-01-01

The traditional design and analysis practice for advanced propulsion systems relies heavily on expensive full-scale prototype development and testing. Over the past decade, use of high-fidelity analysis and design tools such as CFD early in the product development cycle has been identified as one way to alleviate testing costs and to develop these devices better, faster and cheaper. In the design of advanced propulsion systems, CFD plays a major role in defining the required performance over the entire flight regime, as well as in testing the sensitivity of the design to the different modes of operation. Increased emphasis is being placed on developing and applying CFD models to simulate the flow field environments and performance of advanced propulsion systems. This necessitates the development of next generation computational tools which can be used effectively and reliably in a design environment. The turbomachinery simulation capability presented here is being developed in a computational tool called Loci-STREAM [1]. It integrates proven numerical methods for generalized grids and state-of-the-art physical models in a novel rule-based programming framework called Loci [2] which allows: (a) seamless integration of multidisciplinary physics in a unified manner, and (b) automatic handling of massively parallel computing. The objective is to be able to routinely simulate problems involving complex geometries requiring large unstructured grids and complex multidisciplinary physics. An immediate application of interest is simulation of unsteady flows in rocket turbopumps, particularly in cryogenic liquid rocket engines. The key components of the overall methodology presented in this paper are the following: (a) high fidelity unsteady simulation capability based on Detached Eddy Simulation (DES) in conjunction with second-order temporal discretization, (b) compliance with Geometric Conservation Law (GCL) in order to maintain conservative property on moving meshes for second-order time-stepping scheme, (c) a novel cloud-of-points interpolation method (based on a fast parallel kd-tree search algorithm) for interfaces between turbomachinery components in relative motion which is demonstrated to be highly scalable, and (d) demonstrated accuracy and parallel scalability on large grids (approx 250 million cells) in full turbomachinery geometries.
Numerical Prediction of Non-Reacting and Reacting Flow in a Model Gas Turbine Combustor

NASA Technical Reports Server (NTRS)

Davoudzadeh, Farhad; Liu, Nan-Suey

2005-01-01

The three-dimensional, viscous, turbulent, reacting and non-reacting flow characteristics of a model gas turbine combustor operating on air/methane are simulated via an unstructured and massively parallel Reynolds-Averaged Navier-Stokes (RANS) code. This serves to demonstrate the capabilities of the code for design and analysis of real combustor engines. The effects of some design features of combustors are examined. In addition, the computed results are validated against experimental data.
QCD thermodynamics with two flavors of quarks[1

NASA Astrophysics Data System (ADS)

MIMD lattice Computations (MILC) Collaboration

We present results of numerical simulations of quantum chromodynamics at finite temperature on the Intel iPSC/860 parallel processor. We performed calculations with two flavors of Kogut-Susskind quarks and of Wilson quarks on 6 × 12 3 lattices in order to study the crossover from the low temperature hadronic regime to the high temperature regime. We investigate the properties of the objects whose exchange gives static screening lengths be reconstructing their correlated quark-antiquark structure.
Lattice Boltzmann model for simulation of magnetohydrodynamics

NASA Technical Reports Server (NTRS)

Chen, Shiyi; Chen, Hudong; Martinez, Daniel; Matthaeus, William

1991-01-01

A numerical method, based on a discrete Boltzmann equation, is presented for solving the equations of magnetohydrodynamics (MHD). The algorithm provides advantages similar to the cellular automaton method in that it is local and easily adapted to parallel computing environments. Because of much lower noise levels and less stringent requirements on lattice size, the method appears to be more competitive with traditional solution methods. Examples show that the model accurately reproduces both linear and nonlinear MHD phenomena.
Problems Related to Parallelization of CFD Algorithms on GPU, Multi-GPU and Hybrid Architectures

NASA Astrophysics Data System (ADS)

Biazewicz, Marek; Kurowski, Krzysztof; Ludwiczak, Bogdan; Napieraia, Krystyna

2010-09-01

Computational Fluid Dynamics (CFD) is one of the branches of fluid mechanics, which uses numerical methods and algorithms to solve and analyze fluid flows. CFD is used in various domains, such as oil and gas reservoir uncertainty analysis, aerodynamic body shapes optimization (e.g. planes, cars, ships, sport helmets, skis), natural phenomena analysis, numerical simulation for weather forecasting or realistic visualizations. CFD problem is very complex and needs a lot of computational power to obtain the results in a reasonable time. We have implemented a parallel application for two-dimensional CFD simulation with a free surface approximation (MAC method) using new hardware architectures, in particular multi-GPU and hybrid computing environments. For this purpose we decided to use NVIDIA graphic cards with CUDA environment due to its simplicity of programming and good computations performance. We used finite difference discretization of Navier-Stokes equations, where fluid is propagated over an Eulerian Grid. In this model, the behavior of the fluid inside the cell depends only on the properties of local, surrounding cells, therefore it is well suited for the GPU-based architecture. In this paper we demonstrate how to use efficiently the computing power of GPUs for CFD. Additionally, we present some best practices to help users analyze and improve the performance of CFD applications executed on GPU. Finally, we discuss various challenges around the multi-GPU implementation on the example of matrix multiplication.
Simulations of relativistic quantum plasmas using real-time lattice scalar QED

NASA Astrophysics Data System (ADS)

Shi, Yuan; Xiao, Jianyuan; Qin, Hong; Fisch, Nathaniel J.

2018-05-01

Real-time lattice quantum electrodynamics (QED) provides a unique tool for simulating plasmas in the strong-field regime, where collective plasma scales are not well separated from relativistic-quantum scales. As a toy model, we study scalar QED, which describes self-consistent interactions between charged bosons and electromagnetic fields. To solve this model on a computer, we first discretize the scalar-QED action on a lattice, in a way that respects geometric structures of exterior calculus and U(1)-gauge symmetry. The lattice scalar QED can then be solved, in the classical-statistics regime, by advancing an ensemble of statistically equivalent initial conditions in time, using classical field equations obtained by extremizing the discrete action. To demonstrate the capability of our numerical scheme, we apply it to two example problems. The first example is the propagation of linear waves, where we recover analytic wave dispersion relations using numerical spectrum. The second example is an intense laser interacting with a one-dimensional plasma slab, where we demonstrate natural transition from wakefield acceleration to pair production when the wave amplitude exceeds the Schwinger threshold. Our real-time lattice scheme is fully explicit and respects local conservation laws, making it reliable for long-time dynamics. The algorithm is readily parallelized using domain decomposition, and the ensemble may be computed using quantum parallelism in the future.
3D simulations of early blood vessel formation

NASA Astrophysics Data System (ADS)

Cavalli, F.; Gamba, A.; Naldi, G.; Semplice, M.; Valdembri, D.; Serini, G.

2007-08-01

Blood vessel networks form by spontaneous aggregation of individual cells migrating toward vascularization sites (vasculogenesis). A successful theoretical model of two-dimensional experimental vasculogenesis has been recently proposed, showing the relevance of percolation concepts and of cell cross-talk (chemotactic autocrine loop) to the understanding of this self-aggregation process. Here we study the natural 3D extension of the computational model proposed earlier, which is relevant for the investigation of the genuinely three-dimensional process of vasculogenesis in vertebrate embryos. The computational model is based on a multidimensional Burgers equation coupled with a reaction diffusion equation for a chemotactic factor and a mass conservation law. The numerical approximation of the computational model is obtained by high order relaxed schemes. Space and time discretization are performed by using TVD schemes and, respectively, IMEX schemes. Due to the computational costs of realistic simulations, we have implemented the numerical algorithm on a cluster for parallel computation. Starting from initial conditions mimicking the experimentally observed ones, numerical simulations produce network-like structures qualitatively similar to those observed in the early stages of in vivo vasculogenesis. We develop the computation of critical percolative indices as a robust measure of the network geometry as a first step towards the comparison of computational and experimental data.
Parallelized direct execution simulation of message-passing parallel programs

NASA Technical Reports Server (NTRS)

Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

1994-01-01

As massively parallel computers proliferate, there is growing interest in findings ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing computers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
Suppressing epidemics on networks by exploiting observer nodes.

PubMed

Takaguchi, Taro; Hasegawa, Takehisa; Yoshida, Yuichi

2014-07-01

To control infection spreading on networks, we investigate the effect of observer nodes that recognize infection in a neighboring node and make the rest of the neighbor nodes immune. We numerically show that random placement of observer nodes works better on networks with clustering than on locally treelike networks, implying that our model is promising for realistic social networks. The efficiency of several heuristic schemes for observer placement is also examined for synthetic and empirical networks. In parallel with numerical simulations of epidemic dynamics, we also show that the effect of observer placement can be assessed by the size of the largest connected component of networks remaining after removing observer nodes and links between their neighboring nodes.
International Symposium on Numerical Methods in Engineering, 5th, Ecole Polytechnique Federale de Lausanne, Switzerland, Sept. 11-15, 1989, Proceedings. Volumes 1 & 2

NASA Astrophysics Data System (ADS)

Gruber, Ralph; Periaux, Jaques; Shaw, Richard Paul

Recent advances in computational mechanics are discussed in reviews and reports. Topics addressed include spectral superpositions on finite elements for shear banding problems, strain-based finite plasticity, numerical simulation of hypersonic viscous continuum flow, constitutive laws in solid mechanics, dynamics problems, fracture mechanics and damage tolerance, composite plates and shells, contact and friction, metal forming and solidification, coupling problems, and adaptive FEMs. Consideration is given to chemical flows, convection problems, free boundaries and artificial boundary conditions, domain-decomposition and multigrid methods, combustion and thermal analysis, wave propagation, mixed and hybrid FEMs, integral-equation methods, optimization, software engineering, and vector and parallel computing.
Suppressing epidemics on networks by exploiting observer nodes

NASA Astrophysics Data System (ADS)

Takaguchi, Taro; Hasegawa, Takehisa; Yoshida, Yuichi

2014-07-01

To control infection spreading on networks, we investigate the effect of observer nodes that recognize infection in a neighboring node and make the rest of the neighbor nodes immune. We numerically show that random placement of observer nodes works better on networks with clustering than on locally treelike networks, implying that our model is promising for realistic social networks. The efficiency of several heuristic schemes for observer placement is also examined for synthetic and empirical networks. In parallel with numerical simulations of epidemic dynamics, we also show that the effect of observer placement can be assessed by the size of the largest connected component of networks remaining after removing observer nodes and links between their neighboring nodes.
Mathematical and Numerical Aspects of the Adaptive Fast Multipole Poisson-Boltzmann Solver

DOE PAGES

Zhang, Bo; Lu, Benzhuo; Cheng, Xiaolin; ...

2013-01-01

This paper summarizes the mathematical and numerical theories and computational elements of the adaptive fast multipole Poisson-Boltzmann (AFMPB) solver. We introduce and discuss the following components in order: the Poisson-Boltzmann model, boundary integral equation reformulation, surface mesh generation, the nodepatch discretization approach, Krylov iterative methods, the new version of fast multipole methods (FMMs), and a dynamic prioritization technique for scheduling parallel operations. For each component, we also remark on feasible approaches for further improvements in efficiency, accuracy and applicability of the AFMPB solver to large-scale long-time molecular dynamics simulations. Lastly, the potential of the solver is demonstrated with preliminary numericalmore » results.« less
A High-Resolution Capability for Large-Eddy Simulation of Jet Flows

NASA Technical Reports Server (NTRS)

DeBonis, James R.

2011-01-01

A large-eddy simulation (LES) code that utilizes high-resolution numerical schemes is described and applied to a compressible jet flow. The code is written in a general manner such that the accuracy/resolution of the simulation can be selected by the user. Time discretization is performed using a family of low-dispersion Runge-Kutta schemes, selectable from first- to fourth-order. Spatial discretization is performed using central differencing schemes. Both standard schemes, second- to twelfth-order (3 to 13 point stencils) and Dispersion Relation Preserving schemes from 7 to 13 point stencils are available. The code is written in Fortran 90 and uses hybrid MPI/OpenMP parallelization. The code is applied to the simulation of a Mach 0.9 jet flow. Four-stage third-order Runge-Kutta time stepping and the 13 point DRP spatial discretization scheme of Bogey and Bailly are used. The high resolution numerics used allows for the use of relatively sparse grids. Three levels of grid resolution are examined, 3.5, 6.5, and 9.2 million points. Mean flow, first-order turbulent statistics and turbulent spectra are reported. Good agreement with experimental data for mean flow and first-order turbulent statistics is shown.

A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor

NASA Technical Reports Server (NTRS)

Rao, Hariprasad Nannapaneni

1989-01-01

The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing.
Hairpin vortices in turbulent boundary layers

NASA Astrophysics Data System (ADS)

Eitel-Amor, G.; Örlü, R.; Schlatter, P.; Flores, O.

2015-02-01

The present work presents a number of parallel and spatially developing simulations of boundary layers to address the question of whether hairpin vortices are a dominant feature of near-wall turbulence, and which role they play during transition. In the first part, the parent-offspring regeneration mechanism is investigated in parallel (temporal) simulations of a single hairpin vortex introduced in a mean shear flow corresponding to either turbulent channels or boundary layers (Reτ ≲ 590). The effect of a turbulent background superimposed on the mean flow is considered by using an eddy viscosity computed from resolved simulations. Tracking the vortical structure downstream, it is found that secondary hairpins are only created shortly after initialization, with all rotational structures decaying for later times. For hairpins in a clean (laminar) environment, the decay is relatively slow, while hairpins in weak turbulent environments (10% of νt) dissipate after a couple of eddy turnover times. In the second part, the role of hairpin vortices in laminar-turbulent transition is studied using simulations of spatial boundary layers tripped by hairpin vortices. These vortices are generated by means of specific volumetric forces representing an ejection event, creating a synthetic turbulent boundary layer initially dominated by hairpin-like vortices. These hairpins are advected towards the wake region of the boundary layer, while a sinusoidal instability of the streaks near the wall results in rapid development of a turbulent boundary layer. For Reθ > 400, the boundary layer is fully developed, with no evidence of hairpin vortices reaching into the wall region. The results from both the parallel and spatial simulations strongly suggest that the regeneration process is rather short-lived and may not sustain once a turbulent background is developed. From the transitional flow simulations, it is conjectured that the forest of hairpins reported in former direct numerical simulation studies is reminiscent of the transitional boundary layer and may not be connected to some aspects of the dynamics of the fully developed wall-bounded turbulence.
Symmetry Breaking by Parallel Flow Shear

NASA Astrophysics Data System (ADS)

Li, Jiacong; Diamond, Patrick

2015-11-01

Plasma rotation is important in reducing turbulent transport, suppressing MHD instabilities, and is beneficial to confinement. Intrinsic rotation without an external momentum input is of interest for its plausible application on ITER. k∥ spectrum asymmetry is required for residual Reynolds stress that drives the intrinsic rotation. Parallel flows are reported in linear devices without magnetic shear. In CSDX, parallel flows are mostly peaked in the core [Thakur et al., 2014]; more robust flows and reversed profiles are seen in PANTA [Oldenburger, et al. 2012]. A novel mechanism for symmetry breaking in momentum transport is proposed. Magnetic shear or mean flow profile are not required. A seed parallel flow shear (PFS) sets the sign of residual stress by selecting certain modes to grow faster. The resulted spectrum imbalance leads to a nonzero residual stress, which further drives a parallel flow with ∇n as the free energy source, adding to the shear until saturated by diffusion. Balanced flow gradient is set by Π∥Res /χϕ . Residual stress is calculated for ITG turbulence and collisional drift wave turbulence where electron-ion and electron-neutral collisions are discussed and compared. Numerical simulation is proposed for testing the effect of PFS.
Least squares reconstruction of non-linear RF phase encoded MR data.

PubMed

Salajeghe, Somaie; Babyn, Paul; Sharp, Jonathan C; Sarty, Gordon E

2016-09-01

The numerical feasibility of reconstructing MRI signals generated by RF coils that produce B1 fields with a non-linearly varying spatial phase is explored. A global linear spatial phase variation of B1 is difficult to produce from current confined to RF coils. Here we use regularized least squares inversion, in place of the usual Fourier transform, to reconstruct signals generated in B1 fields with non-linear phase variation. RF encoded signals were simulated for three RF coil configurations: ideal linear, parallel conductors and, circular coil pairs. The simulated signals were reconstructed by Fourier transform and by regularized least squares. The Fourier reconstruction of simulated RF encoded signals from the parallel conductor coil set showed minor distortions over the reconstruction of signals from the ideal linear coil set but the Fourier reconstruction of signals from the circular coil set produced severe geometric distortion. Least squares inversion in all cases produced reconstruction errors comparable to the Fourier reconstruction of the simulated signal from the ideal linear coil set. MRI signals encoded in B1 fields with non-linearly varying spatial phase may be accurately reconstructed using regularized least squares thus pointing the way to the use of simple RF coil designs for RF encoded MRI. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.
Parallel simulation today

NASA Technical Reports Server (NTRS)

Nicol, David; Fujimoto, Richard

1992-01-01

This paper surveys topics that presently define the state of the art in parallel simulation. Included in the tutorial are discussions on new protocols, mathematical performance analysis, time parallelism, hardware support for parallel simulation, load balancing algorithms, and dynamic memory management for optimistic synchronization.
An unified numerical simulation of seismic ground motion, ocean acoustics, coseismic deformations and tsunamis of 2011 Tohoku earthquake

NASA Astrophysics Data System (ADS)

Maeda, T.; Furumura, T.; Noguchi, S.; Takemura, S.; Iwai, K.; Lee, S.; Sakai, S.; Shinohara, M.

2011-12-01

The fault rupture of the 2011 Tohoku (Mw9.0) earthquake spread approximately 550 km by 260 km with a long source rupture duration of ~200 s. For such large earthquake with a complicated source rupture process the radiation of seismic wave from the source rupture and initiation of tsunami due to the coseismic deformation is considered to be very complicated. In order to understand such a complicated process of seismic wave, coseismic deformation and tsunami, we proposed a unified approach for total modeling of earthquake induced phenomena in a single numerical scheme based on a finite-difference method simulation (Maeda and Furumura, 2011). This simulation model solves the equation of motion of based on the linear elastic theory with equilibrium between quasi-static pressure and gravity in the water column. The height of tsunami is obtained from this simulation as a vertical displacement of ocean surface. In order to simulate seismic waves, ocean acoustics, coseismic deformations, and tsunami from the 2011 Tohoku earthquake, we assembled a high-resolution 3D heterogeneous subsurface structural model of northern Japan. The area of simulation is 1200 km x 800 km and 120 km in depth, which have been discretized with grid interval of 1 km in horizontal directions and 0.25 km in vertical direction, respectively. We adopt a source-rupture model proposed by Lee et al. (2011) which is obtained by the joint inversion of teleseismic, near-field strong motion, and coseismic deformation. For conducting such a large-scale simulation, we fully parallelized our simulation code based on a domain-partitioning procedure which achieved a good speed-up by parallel computing up to 8192 core processors with parallel efficiency of 99.839%. The simulation result demonstrates clearly the process in which the seismic wave radiates from the complicated source rupture over the fault plane and propagating in heterogeneous structure of northern Japan. Then, generation of tsunami from coseismic ground deformation at sea floor due to the earthquake and propagation is also well demonstrated . The simulation also demonstrates that a very large slip up to 40 m at shallow plate boundary near the trench pushes up sea floor with source rupture propagation, and the highly elevated sea surface gradually start propagation as tsunamis due to the gravity. The result of simulation of vertical-component displacement waveform matches the ocean-bottom pressure gauge record which is installed just above the source fault area (Maeda et al., 2011) very consistently. Strong reverberation of the ocean-acoustic waves between sea surface and sea bottom particularly near the Japan Trench for long time after the source rupture ends is confirmed in the present simulation. Accordingly, long wavetrains of high-frequency ocean acoustic waves is developed and overlap to later tsunami waveforms as we found in the observations.
Implementing Parquet equations using HPX

NASA Astrophysics Data System (ADS)

Kellar, Samuel; Wagle, Bibek; Yang, Shuxiang; Tam, Ka-Ming; Kaiser, Hartmut; Moreno, Juana; Jarrell, Mark

A new C++ runtime system (HPX) enables simulations of complex systems to run more efficiently on parallel and heterogeneous systems. This increased efficiency allows for solutions to larger simulations of the parquet approximation for a system with impurities. The relevancy of the parquet equations depends upon the ability to solve systems which require long runs and large amounts of memory. These limitations, in addition to numerical complications arising from stability of the solutions, necessitate running on large distributed systems. As the computational resources trend towards the exascale and the limitations arising from computational resources vanish efficiency of large scale simulations becomes a focus. HPX facilitates efficient simulations through intelligent overlapping of computation and communication. Simulations such as the parquet equations which require the transfer of large amounts of data should benefit from HPX implementations. Supported by the the NSF EPSCoR Cooperative Agreement No. EPS-1003897 with additional support from the Louisiana Board of Regents.
Computer simulations of austenite decomposition of microalloyed 700 MPa steel during cooling

NASA Astrophysics Data System (ADS)

Pohjonen, Aarne; Paananen, Joni; Mourujärvi, Juho; Manninen, Timo; Larkiola, Jari; Porter, David

2018-05-01

We present computer simulations of austenite decomposition to ferrite and bainite during cooling. The phase transformation model is based on Johnson-Mehl-Avrami-Kolmogorov type equations. The model is parameterized by numerical fitting to continuous cooling data obtained with Gleeble thermo-mechanical simulator and it can be used for calculation of the transformation behavior occurring during cooling along any cooling path. The phase transformation model has been coupled with heat conduction simulations. The model includes separate parameters to account for the incubation stage and for the kinetics after the transformation has started. The incubation time is calculated with inversion of the CCT transformation start time. For heat conduction simulations we employed our own parallelized 2-dimensional finite difference code. In addition, the transformation model was also implemented as a subroutine in commercial finite-element software Abaqus which allows for the use of the model in various engineering applications.
Propulsion of helical flagella near boundaries

NASA Astrophysics Data System (ADS)

Rodenborn, Bruce; Giesbrecht, Grant; Ni, Katha; Vock, Isaac

The presence of nearby boundaries is known to have dramatic effects on the swimming behavior of microorganisms because of the no-slip condition at the boundary. Microorganisms that use a helical flagellum experience forces both along the axis of the helix and in the direction perpendicular to the axis. These low Reynolds number boundary effects have primarily been studied using live bacteria and using numerical simulations. However, small scale measurements give limited information about the forces and torques on the microorganisms. Furthermore, numerical studies are approximate because they have generally used Stokeslet-based simulations with image Stokeslets to represent the effects of the boundaries. Instead, we directly measure the propulsion of macroscopic helical flagella with diameter 12 mm using a fluid with viscosity 105 times that of water to ensure the Reynolds number in the experiments is much less than unity, just as for bacteria. We measure the parallel and perpendicular forces as a function of boundary distance to determine the nonzero elements of the propulsive matrix for axial rotation near a boundary. We then compare our results to the theory and simulations of Lauga et al. and to biological measurements.
Direct numerical simulation of steady state, three dimensional, laminar flow around a wall mounted cube

NASA Astrophysics Data System (ADS)

Liakos, Anastasios; Malamataris, Nikolaos

2014-11-01

The topology and evolution of flow around a surface mounted cubical object in three dimensional channel flow is examined for low to moderate Reynolds numbers. Direct numerical simulations were performed via a home made parallel finite element code. The computational domain has been designed according to actual laboratory experimental conditions. Analysis of the results is performed using the three dimensional theory of separation. Our findings indicate that a tornado-like vortex by the side of the cube is present for all Reynolds numbers for which flow was simulated. A horse-shoe vortex upstream from the cube was formed at Reynolds number approximately 1266. Pressure distributions are shown along with three dimensional images of the tornado-like vortex and the horseshoe vortex at selected Reynolds numbers. Finally, and in accordance to previous work, our results indicate that the upper limit for the Reynolds number for which steady state results are physically realizable is roughly 2000. Financial support of author NM from the Office of Naval Research Global (ONRG-VSP, N62909-13-1-V016) is acknowledged.
OH PLIF Visualization of a Premixed Ethylene-fueled Dual-Mode Scramjet Combustor

NASA Technical Reports Server (NTRS)

Cantu, Luca M. L.; Gallo, Emanuela C. A.; Cutler, Andrew D.; Danehy, Paul M.; Johansen, Craig T.; Rockwell, Robert D.; Goyne, Christopher P.; McDaniel, James C.

2016-01-01

Hydroxyl radical (OH) planar induced laser fluorescence (PLIF) measurements have been performed in a small-scale scramjet combustor at the University of Virginia Aerospace Research Laboratory at nominal simulated Mach 5 enthalpy. OH lines were carefully chosen to have fluorescent signal that is independent of pressure and temperature but linear with mole fraction. The OH PLIF signal was imaged in planes orthogonal to and parallel to the freestream flow at different equivalence ratios. Flameout limits were tested and identified. Instantaneous planar images were recorded and analyzed to compare the results with width increased dual-pump enhanced coherent anti-Stokes Raman spectroscopy (WIDECARS) measurements in the same facility and large eddy simulation/Reynolds average Navier-Stokes (LES/RANS) numerical simulation. The flame angle was found to be approximately 10 degrees for several different conditions, which is in agreement with numerical predictions and measurements using WIDECARS. Finally, a comparison between NO PLIF non-combustion cases and OH PLIF combustion cases is provided: the comparison reveals that the dominant effect of flame propagation is freestream turbulence rather than heat release and concentration gradients.
Development of the US3D Code for Advanced Compressible and Reacting Flow Simulations

NASA Technical Reports Server (NTRS)

Candler, Graham V.; Johnson, Heath B.; Nompelis, Ioannis; Subbareddy, Pramod K.; Drayna, Travis W.; Gidzak, Vladimyr; Barnhardt, Michael D.

2015-01-01

Aerothermodynamics and hypersonic flows involve complex multi-disciplinary physics, including finite-rate gas-phase kinetics, finite-rate internal energy relaxation, gas-surface interactions with finite-rate oxidation and sublimation, transition to turbulence, large-scale unsteadiness, shock-boundary layer interactions, fluid-structure interactions, and thermal protection system ablation and thermal response. Many of the flows have a large range of length and time scales, requiring large computational grids, implicit time integration, and large solution run times. The University of Minnesota NASA US3D code was designed for the simulation of these complex, highly-coupled flows. It has many of the features of the well-established DPLR code, but uses unstructured grids and has many advanced numerical capabilities and physical models for multi-physics problems. The main capabilities of the code are described, the physical modeling approaches are discussed, the different types of numerical flux functions and time integration approaches are outlined, and the parallelization strategy is overviewed. Comparisons between US3D and the NASA DPLR code are presented, and several advanced simulations are presented to illustrate some of novel features of the code.
Simulation of quasi-static hydraulic fracture propagation in porous media with XFEM

NASA Astrophysics Data System (ADS)

Juan-Lien Ramirez, Alina; Neuweiler, Insa; Löhnert, Stefan

2015-04-01

Hydraulic fracturing is the injection of a fracking fluid at high pressures into the underground. Its goal is to create and expand fracture networks to increase the rock permeability. It is a technique used, for example, for oil and gas recovery and for geothermal energy extraction, since higher rock permeability improves production. Many physical processes take place when it comes to fracking; rock deformation, fluid flow within the fractures, as well as into and through the porous rock. All these processes are strongly coupled, what makes its numerical simulation rather challenging. We present a 2D numerical model that simulates the hydraulic propagation of an embedded fracture quasi-statically in a poroelastic, fully saturated material. Fluid flow within the porous rock is described by Darcy's law and the flow within the fracture is approximated by a parallel plate model. Additionally, the effect of leak-off is taken into consideration. The solid component of the porous medium is assumed to be linear elastic and the propagation criteria are given by the energy release rate and the stress intensity factors [1]. The used numerical method for the spatial discretization is the eXtended Finite Element Method (XFEM) [2]. It is based on the standard Finite Element Method, but introduces additional degrees of freedom and enrichment functions to describe discontinuities locally in a system. Through them the geometry of the discontinuity (e.g. a fracture) becomes independent of the mesh allowing it to move freely through the domain without a mesh-adapting step. With this numerical model we are able to simulate hydraulic fracture propagation with different initial fracture geometries and material parameters. Results from these simulations will also be presented. References [1] D. Gross and T. Seelig. Fracture Mechanics with an Introduction to Micromechanics. Springer, 2nd edition, (2011) [2] T. Belytschko and T. Black. Elastic crack growth in finite elements with minimal remeshing. Int. J. Numer. Meth. Engng. 45, 601-620, (1999)
Scalable direct Vlasov solver with discontinuous Galerkin method on unstructured mesh.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, J.; Ostroumov, P. N.; Mustapha, B.

2010-12-01

This paper presents the development of parallel direct Vlasov solvers with discontinuous Galerkin (DG) method for beam and plasma simulations in four dimensions. Both physical and velocity spaces are in two dimesions (2P2V) with unstructured mesh. Contrary to the standard particle-in-cell (PIC) approach for kinetic space plasma simulations, i.e., solving Vlasov-Maxwell equations, direct method has been used in this paper. There are several benefits to solving a Vlasov equation directly, such as avoiding noise associated with a finite number of particles and the capability to capture fine structure in the plasma. The most challanging part of a direct Vlasov solvermore » comes from higher dimensions, as the computational cost increases as N{sup 2d}, where d is the dimension of the physical space. Recently, due to the fast development of supercomputers, the possibility has become more realistic. Many efforts have been made to solve Vlasov equations in low dimensions before; now more interest has focused on higher dimensions. Different numerical methods have been tried so far, such as the finite difference method, Fourier Spectral method, finite volume method, and spectral element method. This paper is based on our previous efforts to use the DG method. The DG method has been proven to be very successful in solving Maxwell equations, and this paper is our first effort in applying the DG method to Vlasov equations. DG has shown several advantages, such as local mass matrix, strong stability, and easy parallelization. These are particularly suitable for Vlasov equations. Domain decomposition in high dimensions has been used for parallelization; these include a highly scalable parallel two-dimensional Poisson solver. Benchmark results have been shown and simulation results will be reported.« less
The firehose instability during multiple reconnection in the Earth's magnetotail

NASA Astrophysics Data System (ADS)

Alexandrova, Alexandra; Divin, Andrey; Retino, Alessandro; Deca, Jan; Catapano, Filomena; Cozzani, Giulia

2017-04-01

We found unique events in the Cluster spacecraft observations of the Earth's magnetotail which correspond to the case of multiple reconnection sites. The ion temperature anisotropy of more energized ions in the direction parallel to the magnetic field, rather than in the perpendicular direction, is observed in the region of dynamical interaction between two active X-lines. The magnetic field and plasma parameters associated with the anisotropy correspond to the firehose instability conditions. We discuss possible scenarios of development of the firehose instability in multiple reconnection by comparing the observations with numerical simulations. Conventional Particle-in-Cell simulations of 2D magnetic reconnection starting from Harris equilibria are performed using implicit PIC code iPIC3D [Markidis, 2010]. At earlier stages the evolution creates fronts which push the weakly magnetized current sheet plasma away from the X-line. Fronts accelerate and reflect particles, producing parallel ion beams and increasing parallel ion temperature ahead of the front. If multiple X-lines are present, then the counterstreaming ion beams appear inside the original current sheet between colliding reconnection jet fronts. For large enough parallel ion pressure anisotropy, the firehose-like mode is excited inside the original current sheet with a flapping-like appearance along the X GSM direction but not Y GSM (current) direction. One should note that our simulations do not include the Bz magnetic field component (normal to the current sheet), hence ion beams cannot escape into the lobes and the whole region between two colliding fronts is unstable to firehose-like instability. In the Earth's magnetotail such configuration likely occurs when two active X-lines are close enough to each other, similar to a few cases we found in the Cluster observations.
OpenGeoSys: Performance-Oriented Computational Methods for Numerical Modeling of Flow in Large Hydrogeological Systems

NASA Astrophysics Data System (ADS)

Naumov, D.; Fischer, T.; Böttcher, N.; Watanabe, N.; Walther, M.; Rink, K.; Bilke, L.; Shao, H.; Kolditz, O.

2014-12-01

OpenGeoSys (OGS) is a scientific open source code for numerical simulation of thermo-hydro-mechanical-chemical processes in porous and fractured media. Its basic concept is to provide a flexible numerical framework for solving multi-field problems for applications in geoscience and hydrology as e.g. for CO2 storage applications, geothermal power plant forecast simulation, salt water intrusion, water resources management, etc. Advances in computational mathematics have revolutionized the variety and nature of the problems that can be addressed by environmental scientists and engineers nowadays and an intensive code development in the last years enables in the meantime the solutions of much larger numerical problems and applications. However, solving environmental processes along the water cycle at large scales, like for complete catchment or reservoirs, stays computationally still a challenging task. Therefore, we started a new OGS code development with focus on execution speed and parallelization. In the new version, a local data structure concept improves the instruction and data cache performance by a tight bundling of data with an element-wise numerical integration loop. Dedicated analysis methods enable the investigation of memory-access patterns in the local and global assembler routines, which leads to further data structure optimization for an additional performance gain. The concept is presented together with a technical code analysis of the recent development and a large case study including transient flow simulation in the unsaturated / saturated zone of the Thuringian Syncline, Germany. The analysis is performed on a high-resolution mesh (up to 50M elements) with embedded fault structures.
Parallelization of elliptic solver for solving 1D Boussinesq model

NASA Astrophysics Data System (ADS)

Tarwidi, D.; Adytia, D.

2018-03-01

In this paper, a parallel implementation of an elliptic solver in solving 1D Boussinesq model is presented. Numerical solution of Boussinesq model is obtained by implementing a staggered grid scheme to continuity, momentum, and elliptic equation of Boussinesq model. Tridiagonal system emerging from numerical scheme of elliptic equation is solved by cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of parallel program, large number of grids is varied from 28 to 214. Two test cases of numerical experiment, i.e. propagation of solitary and standing wave, are proposed to evaluate the parallel program. The numerical results are verified with analytical solution of solitary and standing wave. The best speedup of solitary and standing wave test cases is about 2.07 with 214 of grids and 1.86 with 213 of grids, respectively, which are executed by using 8 threads. Moreover, the best efficiency of parallel program is 76.2% and 73.5% for solitary and standing wave test cases, respectively.
3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sofronov, I.D.; Voronin, B.L.; Butnev, O.I.

1997-12-31

The aim of the work performed is to develop a 3D parallel program for numerical calculation of gas dynamics problem with heat conductivity on distributed memory computational systems (CS), satisfying the condition of numerical result independence from the number of processors involved. Two basically different approaches to the structure of massive parallel computations have been developed. The first approach uses the 3D data matrix decomposition reconstructed at temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shareable memory. The second approach is based on using a 3D data matrix decomposition not reconstructed during a temporal cycle.more » The program was developed on 8-processor CS MP-3 made in VNIIEF and was adapted to a massive parallel CS Meiko-2 in LLNL by joint efforts of VNIIEF and LLNL staffs. A large number of numerical experiments has been carried out with different number of processors up to 256 and the efficiency of parallelization has been evaluated in dependence on processor number and their parameters.« less
Effects of parallel magnetic field on electrocodeposition behavior of Fe/nano-Si particles composite electroplating

NASA Astrophysics Data System (ADS)

Zhou, Pengwei; Zhong, Yunbo; Wang, Huai; Long, Qiong; Li, Fu; Sun, Zongqian; Dong, Licheng; Fan, Lijun

2013-10-01

The influence of an external parallel strong parallel magnetic field (respect to current) on the electrocodeposition of nano-silicon particles into an iron matrix has been studied in this paper. Test results show that magnetic field has a great influence on the distribution of silicon, as well as the surface morphology and the thickness of the composite coatings. When no magnetic field was applied, a high current density was needed to get high concentration of silicon particles, while that could be easily obtained at a low current density with a 2 T parallel magnetic field. However, Owing to the unevenness of the current density J-distribution on the surface of the electrode in 8 T, the thicker and rougher composite deposits appear in the edge region (L or R region), and the thinner and smoother ones appear in the middle region (M). Meanwhile, the distribution curve of silicon content looks like a “pan” along the center line of coatings. A possible mechanism combining to the numerical simulation results was suggested out to illustrate the obtained experiment results.
Iterative load-balancing method with multigrid level relaxation for particle simulation with short-range interactions

NASA Astrophysics Data System (ADS)

Furuichi, Mikito; Nishiura, Daisuke

2017-10-01

We developed dynamic load-balancing algorithms for Particle Simulation Methods (PSM) involving short-range interactions, such as Smoothed Particle Hydrodynamics (SPH), Moving Particle Semi-implicit method (MPS), and Discrete Element method (DEM). These are needed to handle billions of particles modeled in large distributed-memory computer systems. Our method utilizes flexible orthogonal domain decomposition, allowing the sub-domain boundaries in the column to be different for each row. The imbalances in the execution time between parallel logical processes are treated as a nonlinear residual. Load-balancing is achieved by minimizing the residual within the framework of an iterative nonlinear solver, combined with a multigrid technique in the local smoother. Our iterative method is suitable for adjusting the sub-domain frequently by monitoring the performance of each computational process because it is computationally cheaper in terms of communication and memory costs than non-iterative methods. Numerical tests demonstrated the ability of our approach to handle workload imbalances arising from a non-uniform particle distribution, differences in particle types, or heterogeneous computer architecture which was difficult with previously proposed methods. We analyzed the parallel efficiency and scalability of our method using Earth simulator and K-computer supercomputer systems.

Synchronization Of Parallel Discrete Event Simulations

NASA Technical Reports Server (NTRS)

Steinman, Jeffrey S.

1992-01-01

Adaptive, parallel, discrete-event-simulation-synchronization algorithm, Breathing Time Buckets, developed in Synchronous Parallel Environment for Emulation and Discrete Event Simulation (SPEEDES) operating system. Algorithm allows parallel simulations to process events optimistically in fluctuating time cycles that naturally adapt while simulation in progress. Combines best of optimistic and conservative synchronization strategies while avoiding major disadvantages. Algorithm processes events optimistically in time cycles adapting while simulation in progress. Well suited for modeling communication networks, for large-scale war games, for simulated flights of aircraft, for simulations of computer equipment, for mathematical modeling, for interactive engineering simulations, and for depictions of flows of information.
Numerical simulation of weakly ionized hypersonic flow over reentry capsules

NASA Astrophysics Data System (ADS)

Scalabrin, Leonardo C.

The mathematical and numerical formulation employed in the development of a new multi-dimensional Computational Fluid Dynamics (CFD) code for the simulation of weakly ionized hypersonic flows in thermo-chemical non-equilibrium over reentry configurations is presented. The flow is modeled using the Navier-Stokes equations modified to include finite-rate chemistry and relaxation rates to compute the energy transfer between different energy modes. The set of equations is solved numerically by discretizing the flowfield using unstructured grids made of any mixture of quadrilaterals and triangles in two-dimensions or hexahedra, tetrahedra, prisms and pyramids in three-dimensions. The partial differential equations are integrated on such grids using the finite volume approach. The fluxes across grid faces are calculated using a modified form of the Steger-Warming Flux Vector Splitting scheme that has low numerical dissipation inside boundary layers. The higher order extension of inviscid fluxes in structured grids is generalized in this work to be used in unstructured grids. Steady state solutions are obtained by integrating the solution over time implicitly. The resulting sparse linear system is solved by using a point implicit or by a line implicit method in which a tridiagonal matrix is assembled by using lines of cells that are formed starting at the wall. An algorithm that assembles these lines using completely general unstructured grids is developed. The code is parallelized to allow simulation of computationally demanding problems. The numerical code is successfully employed in the simulation of several hypersonic entry flows over space capsules as part of its validation process. Important quantities for the aerothermodynamics design of capsules such as aerodynamic coefficients and heat transfer rates are compared to available experimental and flight test data and other numerical results yielding very good agreement. A sensitivity analysis of predicted radiative heating of a space capsule to several thermo-chemical non-equilibrium models is also performed.
Structural dynamics of possible late-stage intermediates in folding of quadruplex DNA studied by molecular simulations

PubMed Central

Stadlbauer, Petr; Krepl, Miroslav; Cheatham, Thomas E.; Koča, Jaroslav; Šponer, Jiří

2013-01-01

Explicit solvent molecular dynamics simulations have been used to complement preceding experimental and computational studies of folding of guanine quadruplexes (G-DNA). We initiate early stages of unfolding of several G-DNAs by simulating them under no-salt conditions and then try to fold them back using standard excess salt simulations. There is a significant difference between G-DNAs with all-anti parallel stranded stems and those with stems containing mixtures of syn and anti guanosines. The most natural rearrangement for all-anti stems is a vertical mutual slippage of the strands. This leads to stems with reduced numbers of tetrads during unfolding and a reduction of strand slippage during refolding. The presence of syn nucleotides prevents mutual strand slippage; therefore, the antiparallel and hybrid quadruplexes initiate unfolding via separation of the individual strands. The simulations confirm the capability of G-DNA molecules to adopt numerous stable locally and globally misfolded structures. The key point for a proper individual folding attempt appears to be correct prior distribution of syn and anti nucleotides in all four G-strands. The results suggest that at the level of individual molecules, G-DNA folding is an extremely multi-pathway process that is slowed by numerous misfolding arrangements stabilized on highly variable timescales. PMID:23700306
Coupling of in-situ X-ray Microtomography Observations with Discrete Element Simulations-Application to Powder Sintering

NASA Astrophysics Data System (ADS)

Olmos, L.; Bouvard, D.; Martin, C. L.; Bellet, D.; Di Michiel, M.

2009-06-01

The sintering of both a powder with a wide particle size distribution (0-63 μm) and of a powder with artificially created pores is investigated by coupling in situ X-ray microtomography observations with Discrete Element simulations. The micro structure evolution of the copper particles is observed by microtomography all along a typical sintering cycle at 1050° C at the European Synchrotron Research Facilities (ESRF, Grenoble, France). A quantitative analysis of the 3D images provides original data on interparticle indentation, coordination and particle displacements throughout sintering. In parallel, the sintering of similar powder systems has been simulated with a discrete element code which incorporates appropriate sintering contact laws from the literature. The initial numerical packing is generated directly from the 3D microtomography images or alternatively from a random set of particles with the same size distribution. The comparison between the information drawn from the simulations and the one obtained by tomography leads to the conclusion that the first method is not satisfactory because real particles are not perfectly spherical as the numerical ones. On the opposite the packings built with the second method show sintering behaviors close to the behaviors of real materials, although particle rearrangement is underestimated by DEM simulations.
Plastohydrodynamic drawing and coating of stainless steel wire using a tapered bore die of no metal to metal contact

NASA Astrophysics Data System (ADS)

Hasan, S.; Basmage, O.; Stokes, J. T.; Hashmi, M. S. J.

2018-05-01

A review of wire coating studies using plasto-hydrodynamic pressure shows that most of the works were carried out by conducting experiments simultaneously with simulation analysis based upon Bernoulli's principle and Euler and Navier-Stokes (N-S) equations. These characteristics relate to the domain of Computational Fluid Dynamics (CFD) which is an interdisciplinary topic (Fluid Mechanics, Numerical Analysis of Fluid flow and Computer Science). This research investigates two aspects: (i) simulation work and (ii) experimentation. A mathematical model was developed to investigate the flow pattern of the molten polymer and pressure distribution within the wire-drawing dies, assessment of polymer coating thickness on the coated wires and speed of coating on the wires at the outlet of the drawing dies, without deploying any pressurizing pump. In addition to a physical model which was developed within ANSYS™ environment through the simulation design of ANSYS™ Workbench. The design was customized to simulate the process of wire-coating on the fine stainless-steel wires using drawing dies having different bore geometries such as: stepped parallel bore, tapered bore and combined parallel and tapered bore. The convergence of the designed CFD model and numerical and physical solution parameters for simulation were dynamically monitored for the viscous flow of the polypropylene (PP) polymer. Simulation results were validated against experimental results and used to predict the ideal bore shape to produce a thin coating on stainless wires with different diameter. Simulation studies confirmed that a specific speed should be attained by the stainless-steel wires while passing through the drawing dies. It has been observed that all the speed values within specific speed range did not produce a coating thickness having the desired coating characteristic features. Therefore, some optimization of the experimental set up through design of experiments (Stat-Ease) was applied to validate the results. Further rapid solidification of the viscous coating on the wires was targeted so that the coated wires do not stick to the winding spool after the coating process.
Exploiting multi-scale parallelism for large scale numerical modelling of laser wakefield accelerators

NASA Astrophysics Data System (ADS)

Fonseca, R. A.; Vieira, J.; Fiuza, F.; Davidson, A.; Tsung, F. S.; Mori, W. B.; Silva, L. O.

2013-12-01

A new generation of laser wakefield accelerators (LWFA), supported by the extreme accelerating fields generated in the interaction of PW-Class lasers and underdense targets, promises the production of high quality electron beams in short distances for multiple applications. Achieving this goal will rely heavily on numerical modelling to further understand the underlying physics and identify optimal regimes, but large scale modelling of these scenarios is computationally heavy and requires the efficient use of state-of-the-art petascale supercomputing systems. We discuss the main difficulties involved in running these simulations and the new developments implemented in the OSIRIS framework to address these issues, ranging from multi-dimensional dynamic load balancing and hybrid distributed/shared memory parallelism to the vectorization of the PIC algorithm. We present the results of the OASCR Joule Metric program on the issue of large scale modelling of LWFA, demonstrating speedups of over 1 order of magnitude on the same hardware. Finally, scalability to over ˜106 cores and sustained performance over ˜2 P Flops is demonstrated, opening the way for large scale modelling of LWFA scenarios.
Influence of fast advective flows on pattern formation of Dictyostelium discoideum

PubMed Central

Bae, Albert; Zykov, Vladimir; Bodenschatz, Eberhard

2018-01-01

We report experimental and numerical results on pattern formation of self-organizing Dictyostelium discoideum cells in a microfluidic setup under a constant buffer flow. The external flow advects the signaling molecule cyclic adenosine monophosphate (cAMP) downstream, while the chemotactic cells attached to the solid substrate are not transported with the flow. At high flow velocities, elongated cAMP waves are formed that cover the whole length of the channel and propagate both parallel and perpendicular to the flow direction. While the wave period and transverse propagation velocity are constant, parallel wave velocity and the wave width increase linearly with the imposed flow. We also observe that the acquired wave shape is highly dependent on the wave generation site and the strength of the imposed flow. We compared the wave shape and velocity with numerical simulations performed using a reaction-diffusion model and found excellent agreement. These results are expected to play an important role in understanding the process of pattern formation and aggregation of D. discoideum that may experience fluid flows in its natural habitat. PMID:29590179
Numerical modelling of strain in lava tubes

NASA Astrophysics Data System (ADS)

Merle, Olivier

The strain within lava tubes is described in terms of pipe flow. Strain is partitioned into three components: (a) two simple shear components acting from top to bottom and from side to side of a rectangular tube in transverse section; and (b) a pure shear component corresponding to vertical shortening in a deflating flow and horizontal compression in an inflating flow. The sense of shear of the two simple shear components is reversed on either side of a central zone of no shear. Results of numerical simulations of strain within lava tubes reveal a concentric pattern of flattening planes in section normal to the flow direction. The central node is a zone of low strain, which increases toward the lateral borders. Sections parallel to the flow show obliquity of the flattening plane to the flow axis, constituting an imbrication. The strain ellipsoid is generally of plane strain type, but can be of constriction or flattening type if thinning (i.e. deflating flow) or thickening (i.e. inflating flow) is superimposed on the simple shear regime. The strain pattern obtained from numerical simulation is then compared with several patterns recently described in natural lava flows. It is shown that the strain pattern revealed by AMS studies or crystal preferred orientations is remarkably similar to the numerical simulation. However, some departure from the model is found in AMS measurements. This may indicate inherited strain recorded during early stages of the flow or some limitation of the AMS technique.
Increasing horizontal resolution in numerical weather prediction and climate simulations: illusion or panacea?

PubMed

Wedi, Nils P

2014-06-28

The steady path of doubling the global horizontal resolution approximately every 8 years in numerical weather prediction (NWP) at the European Centre for Medium Range Weather Forecasts may be substantially altered with emerging novel computing architectures. It coincides with the need to appropriately address and determine forecast uncertainty with increasing resolution, in particular, when convective-scale motions start to be resolved. Blunt increases in the model resolution will quickly become unaffordable and may not lead to improved NWP forecasts. Consequently, there is a need to accordingly adjust proven numerical techniques. An informed decision on the modelling strategy for harnessing exascale, massively parallel computing power thus also requires a deeper understanding of the sensitivity to uncertainty--for each part of the model--and ultimately a deeper understanding of multi-scale interactions in the atmosphere and their numerical realization in ultra-high-resolution NWP and climate simulations. This paper explores opportunities for substantial increases in the forecast efficiency by judicious adjustment of the formal accuracy or relative resolution in the spectral and physical space. One path is to reduce the formal accuracy by which the spectral transforms are computed. The other pathway explores the importance of the ratio used for the horizontal resolution in gridpoint space versus wavenumbers in spectral space. This is relevant for both high-resolution simulations as well as ensemble-based uncertainty estimation. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures

PubMed Central

Colli Franzone, Piero; Pavarino, Luca F.; Scacchi, Simone

2018-01-01

We introduce and study some scalable domain decomposition preconditioners for cardiac electro-mechanical 3D simulations on parallel HPC (High Performance Computing) architectures. The electro-mechanical model of the cardiac tissue is composed of four coupled sub-models: (1) the static finite elasticity equations for the transversely isotropic deformation of the cardiac tissue; (2) the active tension model describing the dynamics of the intracellular calcium, cross-bridge binding and myofilament tension; (3) the anisotropic Bidomain model describing the evolution of the intra- and extra-cellular potentials in the deforming cardiac tissue; and (4) the ionic membrane model describing the dynamics of ionic currents, gating variables, ionic concentrations and stretch-activated channels. This strongly coupled electro-mechanical model is discretized in time with a splitting semi-implicit technique and in space with isoparametric finite elements. The resulting scalable parallel solver is based on Multilevel Additive Schwarz preconditioners for the solution of the Bidomain system and on BDDC preconditioned Newton-Krylov solvers for the non-linear finite elasticity system. The results of several 3D parallel simulations show the scalability of both linear and non-linear solvers and their application to the study of both physiological excitation-contraction cardiac dynamics and re-entrant waves in the presence of different mechano-electrical feedbacks. PMID:29674971
Three-dimensional finite amplitude electroconvection in dielectric liquids

NASA Astrophysics Data System (ADS)

Luo, Kang; Wu, Jian; Yi, Hong-Liang; Tan, He-Ping

2018-02-01

Charge injection induced electroconvection in a dielectric liquid lying between two parallel plates is numerically simulated in three dimensions (3D) using a unified lattice Boltzmann method (LBM). Cellular flow patterns and their subcritical bifurcation phenomena of 3D electroconvection are numerically investigated for the first time. A unit conversion is also derived to connect the LBM system to the real physical system. The 3D LBM codes are validated by three carefully chosen cases and all results are found to be highly consistent with the analytical solutions or other numerical studies. For strong injection, the steady state roll, polygon, and square flow patterns are observed under different initial disturbances. Numerical results show that the hexagonal cell with the central region being empty of charge and centrally downward flow is preferred in symmetric systems under random initial disturbance. For weak injection, the numerical results show that the flow directly passes from the motionless state to turbulence once the system loses its linear stability. In addition, the numerically predicted linear and finite amplitude stability criteria of different flow patterns are discussed.
Kranc: a Mathematica package to generate numerical codes for tensorial evolution equations

NASA Astrophysics Data System (ADS)

Husa, Sascha; Hinder, Ian; Lechner, Christiane

2006-06-01

We present a suite of Mathematica-based computer-algebra packages, termed "Kranc", which comprise a toolbox to convert certain (tensorial) systems of partial differential evolution equations to parallelized C or Fortran code for solving initial boundary value problems. Kranc can be used as a "rapid prototyping" system for physicists or mathematicians handling very complicated systems of partial differential equations, but through integration into the Cactus computational toolkit we can also produce efficient parallelized production codes. Our work is motivated by the field of numerical relativity, where Kranc is used as a research tool by the authors. In this paper we describe the design and implementation of both the Mathematica packages and the resulting code, we discuss some example applications, and provide results on the performance of an example numerical code for the Einstein equations. Program summaryTitle of program: Kranc Catalogue identifier: ADXS_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADXS_v1_0 Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland Distribution format: tar.gz Computer for which the program is designed and others on which it has been tested: General computers which run Mathematica (for code generation) and Cactus (for numerical simulations), tested under Linux Programming language used: Mathematica, C, Fortran 90 Memory required to execute with typical data: This depends on the number of variables and gridsize, the included ADM example requires 4308 KB Has the code been vectorized or parallelized: The code is parallelized based on the Cactus framework. Number of bytes in distributed program, including test data, etc.: 1 578 142 Number of lines in distributed program, including test data, etc.: 11 711 Nature of physical problem: Solution of partial differential equations in three space dimensions, which are formulated as an initial value problem. In particular, the program is geared towards handling very complex tensorial equations as they appear, e.g., in numerical relativity. The worked out examples comprise the Klein-Gordon equations, the Maxwell equations, and the ADM formulation of the Einstein equations. Method of solution: The method of numerical solution is finite differencing and method of lines time integration, the numerical code is generated through a high level Mathematica interface. Restrictions on the complexity of the program: Typical numerical relativity applications will contain up to several dozen evolution variables and thousands of source terms, Cactus applications have shown scaling up to several thousand processors and grid sizes exceeding 500 3. Typical running time: This depends on the number of variables and the grid size: the included ADM example takes approximately 100 seconds on a 1600 MHz Intel Pentium M processor. Unusual features of the program: based on Mathematica and Cactus
Xyce Parallel Electronic Simulator - Users' Guide Version 2.1.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hutchinson, Scott A; Hoekstra, Robert J.; Russo, Thomas V.

This manual describes the use of theXyceParallel Electronic Simulator.Xycehasbeen designed as a SPICE-compatible, high-performance analog circuit simulator, andhas been written to support the simulation needs of the Sandia National Laboratorieselectrical designers. This development has focused on improving capability over thecurrent state-of-the-art in the following areas:%04Capability to solve extremely large circuit problems by supporting large-scale par-allel computing platforms (up to thousands of processors). Note that this includessupport for most popular parallel and serial computers.%04Improved performance for all numerical kernels (e.g., time integrator, nonlinearand linear solvers) through state-of-the-art algorithms and novel techniques.%04Device models which are specifically tailored to meet Sandia's needs, includingmanymore » radiation-aware devices.3 XyceTMUsers' Guide%04Object-oriented code design and implementation using modern coding practicesthat ensure that theXyceParallel Electronic Simulator will be maintainable andextensible far into the future.Xyceis a parallel code in the most general sense of the phrase - a message passingparallel implementation - which allows it to run efficiently on the widest possible numberof computing platforms. These include serial, shared-memory and distributed-memoryparallel as well as heterogeneous platforms. Careful attention has been paid to thespecific nature of circuit-simulation problems to ensure that optimal parallel efficiencyis achieved as the number of processors grows.The development ofXyceprovides a platform for computational research and de-velopment aimed specifically at the needs of the Laboratory. WithXyce, Sandia hasan %22in-house%22 capability with which both new electrical (e.g., device model develop-ment) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms)research and development can be performed. As a result,Xyceis a unique electricalsimulation capability, designed to meet the unique needs of the laboratory.4 XyceTMUsers' GuideAcknowledgementsThe authors would like to acknowledge the entire Sandia National Laboratories HPEMS(High Performance Electrical Modeling and Simulation) team, including Steve Wix, CarolynBogdan, Regina Schells, Ken Marx, Steve Brandon and Bill Ballard, for their support onthis project. We also appreciate very much the work of Jim Emery, Becky Arnold and MikeWilliamson for the help in reviewing this document.Lastly, a very special thanks to Hue Lai for typesetting this document with LATEX.TrademarksThe information herein is subject to change without notice.Copyrightc 2002-2003 Sandia Corporation. All rights reserved.XyceTMElectronic Simulator andXyceTMtrademarks of Sandia Corporation.Orcad, Orcad Capture, PSpice and Probe are registered trademarks of Cadence DesignSystems, Inc.Silicon Graphics, the Silicon Graphics logo and IRIX are registered trademarks of SiliconGraphics, Inc.Microsoft, Windows and Windows 2000 are registered trademark of Microsoft Corporation.Solaris and UltraSPARC are registered trademarks of Sun Microsystems Corporation.Medici, DaVinci and Taurus are registered trademarks of Synopsys Corporation.HP and Alpha are registered trademarks of Hewlett-Packard company.Amtec and TecPlot are trademarks of Amtec Engineering, Inc.Xyce's expression library is based on that inside Spice 3F5 developed by the EECS De-partment at the University of California.All other trademarks are property of their respective owners.ContactsBug Reportshttp://tvrusso.sandia.gov/bugzillaEmailxyce-support%40sandia.govWorld Wide Webhttp://www.cs.sandia.gov/xyce5 XyceTMUsers' GuideThis page is left intentionally blank6« less
Analysis of multimode fiber bundles for endoscopic spectral-domain optical coherence tomography

PubMed Central

Risi, Matthew D.; Makhlouf, Houssine; Rouse, Andrew R.; Gmitro, Arthur F.

2016-01-01

A theoretical analysis of the use of a fiber bundle in spectral-domain optical coherence tomography (OCT) systems is presented. The fiber bundle enables a flexible endoscopic design and provides fast, parallelized acquisition of the OCT data. However, the multimode characteristic of the fibers in the fiber bundle affects the depth sensitivity of the imaging system. A description of light interference in a multimode fiber is presented along with numerical simulations and experimental studies to illustrate the theoretical analysis. PMID:25967012
Interaction of non-radially symmetric camphor particles

NASA Astrophysics Data System (ADS)

Ei, Shin-Ichiro; Kitahata, Hiroyuki; Koyano, Yuki; Nagayama, Masaharu

2018-03-01

In this study, the interaction between two non-radially symmetric camphor particles is theoretically investigated and the equation describing the motion is derived as an ordinary differential system for the locations and the rotations. In particular, slightly modified non-radially symmetric cases from radial symmetry are extensively investigated and explicit motions are obtained. For example, it is theoretically shown that elliptically deformed camphor particles interact so as to be parallel with major axes. Such predicted motions are also checked by real experiments and numerical simulations.
Calculation of transonic aileron buzz

NASA Technical Reports Server (NTRS)

Steger, J. L.; Bailey, H. E.

1979-01-01

An implicit finite-difference computer code that uses a two-layer algebraic eddy viscosity model and exact geometric specification of the airfoil has been used to simulate transonic aileron buzz. The calculated results, which were performed on both the Illiac IV parallel computer processor and the Control Data 7600 computer, are in essential agreement with the original expository wind-tunnel data taken in the Ames 16-Foot Wind Tunnel just after World War II. These results and a description of the pertinent numerical techniques are included.
Generating performance portable geoscientific simulation code with Firedrake (Invited)

NASA Astrophysics Data System (ADS)

Ham, D. A.; Bercea, G.; Cotter, C. J.; Kelly, P. H.; Loriant, N.; Luporini, F.; McRae, A. T.; Mitchell, L.; Rathgeber, F.

2013-12-01

This presentation will demonstrate how a change in simulation programming paradigm can be exploited to deliver sophisticated simulation capability which is far easier to programme than are conventional models, is capable of exploiting different emerging parallel hardware, and is tailored to the specific needs of geoscientific simulation. Geoscientific simulation represents a grand challenge computational task: many of the largest computers in the world are tasked with this field, and the requirements of resolution and complexity of scientists in this field are far from being sated. However, single thread performance has stalled, even sometimes decreased, over the last decade, and has been replaced by ever more parallel systems: both as conventional multicore CPUs and in the emerging world of accelerators. At the same time, the needs of scientists to couple ever-more complex dynamics and parametrisations into their models makes the model development task vastly more complex. The conventional approach of writing code in low level languages such as Fortran or C/C++ and then hand-coding parallelism for different platforms by adding library calls and directives forces the intermingling of the numerical code with its implementation. This results in an almost impossible set of skill requirements for developers, who must simultaneously be domain science experts, numericists, software engineers and parallelisation specialists. Even more critically, it requires code to be essentially rewritten for each emerging hardware platform. Since new platforms are emerging constantly, and since code owners do not usually control the procurement of the supercomputers on which they must run, this represents an unsustainable development load. The Firedrake system, conversely, offers the developer the opportunity to write PDE discretisations in the high-level mathematical language UFL from the FEniCS project (http://fenicsproject.org). Non-PDE model components, such as parametrisations, can be written as short C kernels operating locally on the underlying mesh, with no explicit parallelism. The executable code is then generated in C, CUDA or OpenCL and executed in parallel on the target architecture. The system also offers features of special relevance to the geosciences. In particular, the large scale separation between the vertical and horizontal directions in many geoscientific processes can be exploited to offer the flexibility of unstructured meshes in the horizontal direction, without the performance penalty usually associated with those methods.
Numerical analysis of ion wind flow using space charge for optimal design

NASA Astrophysics Data System (ADS)

Ko, Han Seo; Shin, Dong Ho; Baek, Soo Hong

2014-11-01

Ion wind flow has been widly studied for its advantages of a micro fluidic device. However, it is very difficult to predict the performance of the ion wind flow for various conditions because of its complicated electrohydrodynamic phenomena. Thus, a reliable numerical modeling is required to design an otimal ion wind generator and calculate velocity of the ion wind for the proper performance. In this study, the numerical modeling of the ion wind has been modified and newly defined to calculate the veloctiy of the ion wind flow by combining three basic models such as electrostatics, electrodynamics and fluid dynamics. The model has included presence of initial space charges to calculate transfer energy between space charges and air gas molecules using a developed space charge correlation. The simulation has been performed for a geometry of a pin to parallel plate electrode. Finally, the results of the simulation have been compared with the experimental data for the ion wind velocity to confirm the accuracy of the modified numerical modeling and to obtain the optimal design of the ion wind generator. This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korean government (MEST) (No. 2013R1A2A2A01068653).
Airbreathing Propulsion System Analysis Using Multithreaded Parallel Processing

NASA Technical Reports Server (NTRS)

Schunk, Richard Gregory; Chung, T. J.; Rodriguez, Pete (Technical Monitor)

2000-01-01

In this paper, parallel processing is used to analyze the mixing, and combustion behavior of hypersonic flow. Preliminary work for a sonic transverse hydrogen jet injected from a slot into a Mach 4 airstream in a two-dimensional duct combustor has been completed [Moon and Chung, 1996]. Our aim is to extend this work to three-dimensional domain using multithreaded domain decomposition parallel processing based on the flowfield-dependent variation theory. Numerical simulations of chemically reacting flows are difficult because of the strong interactions between the turbulent hydrodynamic and chemical processes. The algorithm must provide an accurate representation of the flowfield, since unphysical flowfield calculations will lead to the faulty loss or creation of species mass fraction, or even premature ignition, which in turn alters the flowfield information. Another difficulty arises from the disparity in time scales between the flowfield and chemical reactions, which may require the use of finite rate chemistry. The situations are more complex when there is a disparity in length scales involved in turbulence. In order to cope with these complicated physical phenomena, it is our plan to utilize the flowfield-dependent variation theory mentioned above, facilitated by large eddy simulation. Undoubtedly, the proposed computation requires the most sophisticated computational strategies. The multithreaded domain decomposition parallel processing will be necessary in order to reduce both computational time and storage. Without special treatments involved in computer engineering, our attempt to analyze the airbreathing combustion appears to be difficult, if not impossible.
Reactive Transport Modeling of Induced Calcite Precipitation Reaction Fronts in Porous Media Using A Parallel, Fully Coupled, Fully Implicit Approach

NASA Astrophysics Data System (ADS)

Guo, L.; Huang, H.; Gaston, D.; Redden, G. D.; Fox, D. T.; Fujita, Y.

2010-12-01

Inducing mineral precipitation in the subsurface is one potential strategy for immobilizing trace metal and radionuclide contaminants. Generating mineral precipitates in situ can be achieved by manipulating chemical conditions, typically through injection or in situ generation of reactants. How these reactants transport, mix and react within the medium controls the spatial distribution and composition of the resulting mineral phases. Multiple processes, including fluid flow, dispersive/diffusive transport of reactants, biogeochemical reactions and changes in porosity-permeability, are tightly coupled over a number of scales. Numerical modeling can be used to investigate the nonlinear coupling effects of these processes which are quite challenging to explore experimentally. Many subsurface reactive transport simulators employ a de-coupled or operator-splitting approach where transport equations and batch chemistry reactions are solved sequentially. However, such an approach has limited applicability for biogeochemical systems with fast kinetics and strong coupling between chemical reactions and medium properties. A massively parallel, fully coupled, fully implicit Reactive Transport simulator (referred to as “RAT”) based on a parallel multi-physics object-oriented simulation framework (MOOSE) has been developed at the Idaho National Laboratory. Within this simulator, systems of transport and reaction equations can be solved simultaneously in a fully coupled, fully implicit manner using the Jacobian Free Newton-Krylov (JFNK) method with additional advanced computing capabilities such as (1) physics-based preconditioning for solution convergence acceleration, (2) massively parallel computing and scalability, and (3) adaptive mesh refinements for 2D and 3D structured and unstructured mesh. The simulator was first tested against analytical solutions, then applied to simulating induced calcium carbonate mineral precipitation in 1D columns and 2D flow cells as analogs to homogeneous and heterogeneous porous media, respectively. In 1D columns, calcium carbonate mineral precipitation was driven by urea hydrolysis catalyzed by urease enzyme, and in 2D flow cells, calcium carbonate mineral forming reactants were injected sequentially, forming migrating reaction fronts that are typically highly nonuniform. The RAT simulation results for the spatial and temporal distributions of precipitates, reaction rates and major species in the system, and also for changes in porosity and permeability, were compared to both laboratory experimental data and computational results obtained using other reactive transport simulators. The comparisons demonstrate the ability of RAT to simulate complex nonlinear systems and the advantages of fully coupled approaches, over de-coupled methods, for accurate simulation of complex, dynamic processes such as engineered mineral precipitation in subsurface environments.

Meshless collocation methods for the numerical solution of elliptic boundary valued problems the rotational shallow water equations on the sphere

NASA Astrophysics Data System (ADS)

Blakely, Christopher D.

This dissertation thesis has three main goals: (1) To explore the anatomy of meshless collocation approximation methods that have recently gained attention in the numerical analysis community; (2) Numerically demonstrate why the meshless collocation method should clearly become an attractive alternative to standard finite-element methods due to the simplicity of its implementation and its high-order convergence properties; (3) Propose a meshless collocation method for large scale computational geophysical fluid dynamics models. We provide numerical verification and validation of the meshless collocation scheme applied to the rotational shallow-water equations on the sphere and demonstrate computationally that the proposed model can compete with existing high performance methods for approximating the shallow-water equations such as the SEAM (spectral-element atmospheric model) developed at NCAR. A detailed analysis of the parallel implementation of the model, along with the introduction of parallel algorithmic routines for the high-performance simulation of the model will be given. We analyze the programming and computational aspects of the model using Fortran 90 and the message passing interface (mpi) library along with software and hardware specifications and performance tests. Details from many aspects of the implementation in regards to performance, optimization, and stabilization will be given. In order to verify the mathematical correctness of the algorithms presented and to validate the performance of the meshless collocation shallow-water model, we conclude the thesis with numerical experiments on some standardized test cases for the shallow-water equations on the sphere using the proposed method.
Lung assist devices influence cardio-energetic parameters: Numerical simulation study.

PubMed

De Lazzari, C; Quatember, B; Recheis, W; Mayr, M; Demertzis, S; Allasia, G; De Rossi, A; Cavoretto, R; Venturino, E; Genuini, I

2015-08-01

We aim at an analysis of the effects mechanical ventilators (MVs) and thoracic artificial lungs (TALs) will have on the cardiovascular system, especially on important quantities, such as left and right ventricular external work (EW), pressure-volume area (PVA) and cardiac mechanical efficiency (CME). Our analyses are based on simulation studies which were carried out by using our CARDIOSIM(©) software simulator. At first, we carried out simulation studies of patients undergoing mechanical ventilation (MV) without a thoracic artificial lung (TAL). Subsequently, we conducted simulation studies of patients who had been provided with a TAL, but did not undergo MV. We aimed at describing the patient's physiological characteristics and their variations with time, such as EW, PVA, CME, cardiac output (CO) and mean pulmonary arterial/venous pressure (PAP/PVP). We were starting with a simulation run under well-defined initial conditions which was followed by simulation runs for a wide range of mean intrathoracic pressure settings. Our simulations of MV without TAL showed that for mean intrathoracic pressure settings from negative (-4 mmHg) to positive (+5 mmHg) values, the left and right ventricular EW and PVA, right ventricular CME and CO decreased, whereas left ventricular CME and the PAP increased. The simulation studies of patients with a TAL, comprised all the usual TAL arrangements, viz. configurations "in series" and in parallel with the natural lung and, moreover, hybrid configurations. The main objective of the simulation studies was, as before, the assessment of the hemodynamic response to the application of a TAL. We could for instance show that, in case of an "in series" configuration, a reduction (an increase) in left (right) ventricular EW and PVA values occurred, whereas the best performance in terms of CO can be achieved in the case of an in parallel configuration.
A parallel electrostatic Particle-in-Cell method on unstructured tetrahedral grids for large-scale bounded collisionless plasma simulations

NASA Astrophysics Data System (ADS)

Averkin, Sergey N.; Gatsonis, Nikolaos A.

2018-06-01

An unstructured electrostatic Particle-In-Cell (EUPIC) method is developed on arbitrary tetrahedral grids for simulation of plasmas bounded by arbitrary geometries. The electric potential in EUPIC is obtained on cell vertices from a finite volume Multi-Point Flux Approximation of Gauss' law using the indirect dual cell with Dirichlet, Neumann and external circuit boundary conditions. The resulting matrix equation for the nodal potential is solved with a restarted generalized minimal residual method (GMRES) and an ILU(0) preconditioner algorithm, parallelized using a combination of node coloring and level scheduling approaches. The electric field on vertices is obtained using the gradient theorem applied to the indirect dual cell. The algorithms for injection, particle loading, particle motion, and particle tracking are parallelized for unstructured tetrahedral grids. The algorithms for the potential solver, electric field evaluation, loading, scatter-gather algorithms are verified using analytic solutions for test cases subject to Laplace and Poisson equations. Grid sensitivity analysis examines the L2 and L∞ norms of the relative error in potential, field, and charge density as a function of edge-averaged and volume-averaged cell size. Analysis shows second order of convergence for the potential and first order of convergence for the electric field and charge density. Temporal sensitivity analysis is performed and the momentum and energy conservation properties of the particle integrators in EUPIC are examined. The effects of cell size and timestep on heating, slowing-down and the deflection times are quantified. The heating, slowing-down and the deflection times are found to be almost linearly dependent on number of particles per cell. EUPIC simulations of current collection by cylindrical Langmuir probes in collisionless plasmas show good comparison with previous experimentally validated numerical results. These simulations were also used in a parallelization efficiency investigation. Results show that the EUPIC has efficiency of more than 80% when the simulation is performed on a single CPU from a non-uniform memory access node and the efficiency is decreasing as the number of threads further increases. The EUPIC is applied to the simulation of the multi-species plasma flow over a geometrically complex CubeSat in Low Earth Orbit. The EUPIC potential and flowfield distribution around the CubeSat exhibit features that are consistent with previous simulations over simpler geometrical bodies.
Estimating non-circular motions in barred galaxies using numerical N-body simulations

NASA Astrophysics Data System (ADS)

Randriamampandry, T. H.; Combes, F.; Carignan, C.; Deg, N.

2015-12-01

The observed velocities of the gas in barred galaxies are a combination of the azimuthally averaged circular velocity and non-circular motions, primarily caused by gas streaming along the bar. These non-circular flows must be accounted for before the observed velocities can be used in mass modelling. In this work, we examine the performance of the tilted-ring method and the DISKFIT algorithm for transforming velocity maps of barred spiral galaxies into rotation curves (RCs) using simulated data. We find that the tilted-ring method, which does not account for streaming motions, under-/overestimates the circular motions when the bar is parallel/perpendicular to the projected major axis. DISKFIT, which does include streaming motions, is limited to orientations where the bar is not aligned with either the major or minor axis of the image. Therefore, we propose a method of correcting RCs based on numerical simulations of galaxies. We correct the RC derived from the tilted-ring method based on a numerical simulation of a galaxy with similar properties and projections as the observed galaxy. Using observations of NGC 3319, which has a bar aligned with the major axis, as a test case, we show that the inferred mass models from the uncorrected and corrected RCs are significantly different. These results show the importance of correcting for the non-circular motions and demonstrate that new methods of accounting for these motions are necessary as current methods fail for specific bar alignments.
PHAST: Protein-like heteropolymer analysis by statistical thermodynamics

NASA Astrophysics Data System (ADS)

Frigori, Rafael B.

2017-06-01

PHAST is a software package written in standard Fortran, with MPI and CUDA extensions, able to efficiently perform parallel multicanonical Monte Carlo simulations of single or multiple heteropolymeric chains, as coarse-grained models for proteins. The outcome data can be straightforwardly analyzed within its microcanonical Statistical Thermodynamics module, which allows for computing the entropy, caloric curve, specific heat and free energies. As a case study, we investigate the aggregation of heteropolymers bioinspired on Aβ25-33 fragments and their cross-seeding with IAPP20-29 isoforms. Excellent parallel scaling is observed, even under numerically difficult first-order like phase transitions, which are properly described by the built-in fully reconfigurable force fields. Still, the package is free and open source, this shall motivate users to readily adapt it to specific purposes.
Phase space simulation of collisionless stellar systems on the massively parallel processor

NASA Technical Reports Server (NTRS)

White, Richard L.

1987-01-01

A numerical technique for solving the collisionless Boltzmann equation describing the time evolution of a self gravitating fluid in phase space was implemented on the Massively Parallel Processor (MPP). The code performs calculations for a two dimensional phase space grid (with one space and one velocity dimension). Some results from calculations are presented. The execution speed of the code is comparable to the speed of a single processor of a Cray-XMP. Advantages and disadvantages of the MPP architecture for this type of problem are discussed. The nearest neighbor connectivity of the MPP array does not pose a significant obstacle. Future MPP-like machines should have much more local memory and easier access to staging memory and disks in order to be effective for this type of problem.
The accuracy of the compressible Reynolds equation for predicting the local pressure in gas-lubricated textured parallel slider bearings

PubMed Central

Qiu, Mingfeng; Bailey, Brian N.; Stoll, Rob

2014-01-01

The validity of the compressible Reynolds equation to predict the local pressure in a gas-lubricated, textured parallel slider bearing is investigated. The local bearing pressure is numerically simulated using the Reynolds equation and the Navier-Stokes equations for different texture geometries and operating conditions. The respective results are compared and the simplifying assumptions inherent in the application of the Reynolds equation are quantitatively evaluated. The deviation between the local bearing pressure obtained with the Reynolds equation and the Navier-Stokes equations increases with increasing texture aspect ratio, because a significant cross-film pressure gradient and a large velocity gradient in the sliding direction develop in the lubricant film. Inertia is found to be negligible throughout this study. PMID:25049440
The high performance parallel algorithm for Unified Gas-Kinetic Scheme

NASA Astrophysics Data System (ADS)

Li, Shiyi; Li, Qibing; Fu, Song; Xu, Jinxiu

2016-11-01

A high performance parallel algorithm for UGKS is developed to simulate three-dimensional flows internal and external on arbitrary grid system. The physical domain and velocity domain are divided into different blocks and distributed according to the two-dimensional Cartesian topology with intra-communicators in physical domain for data exchange and other intra-communicators in velocity domain for sum reduction to moment integrals. Numerical results of three-dimensional cavity flow and flow past a sphere agree well with the results from the existing studies and validate the applicability of the algorithm. The scalability of the algorithm is tested both on small (1-16) and large (729-5832) scale processors. The tested speed-up ratio is near linear ashind thus the efficiency is around 1, which reveals the good scalability of the present algorithm.
Temporal parallelization of edge plasma simulations using the parareal algorithm and the SOLPS code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Samaddar, Debasmita; Coster, D. P.; Bonnin, X.

We show that numerical modelling of edge plasma physics may be successfully parallelized in time. The parareal algorithm has been employed for this purpose and the SOLPS code package coupling the B2.5 finite-volume fluid plasma solver with the kinetic Monte-Carlo neutral code Eirene has been used as a test bed. The complex dynamics of the plasma and neutrals in the scrape-off layer (SOL) region makes this a unique application. It is demonstrated that a significant computational gain (more than an order of magnitude) may be obtained with this technique. The use of the IPS framework for event-based parareal implementation optimizesmore » resource utilization and has been shown to significantly contribute to the computational gain.« less
Temporal parallelization of edge plasma simulations using the parareal algorithm and the SOLPS code

DOE PAGES

Samaddar, Debasmita; Coster, D. P.; Bonnin, X.; ...

2017-07-31

We show that numerical modelling of edge plasma physics may be successfully parallelized in time. The parareal algorithm has been employed for this purpose and the SOLPS code package coupling the B2.5 finite-volume fluid plasma solver with the kinetic Monte-Carlo neutral code Eirene has been used as a test bed. The complex dynamics of the plasma and neutrals in the scrape-off layer (SOL) region makes this a unique application. It is demonstrated that a significant computational gain (more than an order of magnitude) may be obtained with this technique. The use of the IPS framework for event-based parareal implementation optimizesmore » resource utilization and has been shown to significantly contribute to the computational gain.« less
Magnus-induced ratchet effects for skyrmions interacting with asymmetric substrates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reichhardt, C.; Ray, D.; Reichhardt, C. J. Olson

2015-07-31

We show using numerical simulations that pronounced ratchet effects can occur for ac driven skyrmions moving over asymmetric quasi-one-dimensional substrates. We find a new type of ratchet effect called a Magnus-induced transverse ratchet that arises when the ac driving force is applied perpendicular rather than parallel to the asymmetry direction of the substrate. This transverse ratchet effect only occurs when the Magnus term is finite, and the threshold ac amplitude needed to induce it decreases as the Magnus term becomes more prominent. Ratcheting skyrmions follow ordered orbits in which the net displacement parallel to the substrate asymmetry direction is quantized.more » As a result, skyrmion ratchets represent a new ac current-based method for controlling skyrmion positions and motion for spintronic applications.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Jalas, S.; Dornmair, I.; Lehe, R.

Particle in Cell (PIC) simulations are a widely used tool for the investigation of both laser- and beam-driven plasma acceleration. It is a known issue that the beam quality can be artificially degraded by numerical Cherenkov radiation (NCR) resulting primarily from an incorrectly modeled dispersion relation. Pseudo-spectral solvers featuring infinite order stencils can strongly reduce NCR - or even suppress it - and are therefore well suited to correctly model the beam properties. For efficient parallelization of the PIC algorithm, however, localized solvers are inevitable. Arbitrary order pseudo-spectral methods provide this needed locality. Yet, these methods can again be pronemore » to NCR. Here in this paper, we show that acceptably low solver orders are sufficient to correctly model the physics of interest, while allowing for parallel computation by domain decomposition.« less
Implementation of Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) Software

DTIC Science & Technology

2015-08-01

Atomic/Molecular Massively Parallel Simulator ( LAMMPS ) Software by N Scott Weingarten and James P Larentzos Approved for...Massively Parallel Simulator ( LAMMPS ) Software by N Scott Weingarten Weapons and Materials Research Directorate, ARL James P Larentzos Engility...Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator ( LAMMPS ) Software 5a. CONTRACT NUMBER 5b
Petascale computation of multi-physics seismic simulations

NASA Astrophysics Data System (ADS)

Gabriel, Alice-Agnes; Madden, Elizabeth H.; Ulrich, Thomas; Wollherr, Stephanie; Duru, Kenneth C.

2017-04-01

Capturing the observed complexity of earthquake sources in concurrence with seismic wave propagation simulations is an inherently multi-scale, multi-physics problem. In this presentation, we present simulations of earthquake scenarios resolving high-detail dynamic rupture evolution and high frequency ground motion. The simulations combine a multitude of representations of model complexity; such as non-linear fault friction, thermal and fluid effects, heterogeneous fault stress and fault strength initial conditions, fault curvature and roughness, on- and off-fault non-elastic failure to capture dynamic rupture behavior at the source; and seismic wave attenuation, 3D subsurface structure and bathymetry impacting seismic wave propagation. Performing such scenarios at the necessary spatio-temporal resolution requires highly optimized and massively parallel simulation tools which can efficiently exploit HPC facilities. Our up to multi-PetaFLOP simulations are performed with SeisSol (www.seissol.org), an open-source software package based on an ADER-Discontinuous Galerkin (DG) scheme solving the seismic wave equations in velocity-stress formulation in elastic, viscoelastic, and viscoplastic media with high-order accuracy in time and space. Our flux-based implementation of frictional failure remains free of spurious oscillations. Tetrahedral unstructured meshes allow for complicated model geometry. SeisSol has been optimized on all software levels, including: assembler-level DG kernels which obtain 50% peak performance on some of the largest supercomputers worldwide; an overlapping MPI-OpenMP parallelization shadowing the multiphysics computations; usage of local time stepping; parallel input and output schemes and direct interfaces to community standard data formats. All these factors enable aim to minimise the time-to-solution. The results presented highlight the fact that modern numerical methods and hardware-aware optimization for modern supercomputers are essential to further our understanding of earthquake source physics and complement both physic-based ground motion research and empirical approaches in seismic hazard analysis. Lastly, we will conclude with an outlook on future exascale ADER-DG solvers for seismological applications.
Numerical grid generation in computational field simulations. Volume 1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Soni, B.K.; Thompson, J.F.; Haeuser, J.

1996-12-31

To enhance the CFS technology to its next level of applicability (i.e., to create acceptance of CFS in an integrated product and process development involving multidisciplinary optimization) the basic requirements are: rapid turn-around time, reliable and accurate simulation, affordability and appropriate linkage to other engineering disciplines. In response to this demand, there has been a considerable growth in the grid generation related research activities involving automization, parallel processing, linkage with the CAD-CAM systems, CFS with dynamic motion and moving boundaries, strategies and algorithms associated with multi-block structured, unstructured, hybrid, hexahedral, and Cartesian grids, along with its applicability to various disciplinesmore » including biomedical, semiconductor, geophysical, ocean modeling, and multidisciplinary optimization.« less
COLA with scale-dependent growth: applications to screened modified gravity models

NASA Astrophysics Data System (ADS)

Winther, Hans A.; Koyama, Kazuya; Manera, Marc; Wright, Bill S.; Zhao, Gong-Bo

2017-08-01

We present a general parallelized and easy-to-use code to perform numerical simulations of structure formation using the COLA (COmoving Lagrangian Acceleration) method for cosmological models that exhibit scale-dependent growth at the level of first and second order Lagrangian perturbation theory. For modified gravity theories we also include screening using a fast approximate method that covers all the main examples of screening mechanisms in the literature. We test the code by comparing it to full simulations of two popular modified gravity models, namely f(R) gravity and nDGP, and find good agreement in the modified gravity boost-factors relative to ΛCDM even when using a fairly small number of COLA time steps.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Winther, Hans A.; Koyama, Kazuya; Wright, Bill S.

We present a general parallelized and easy-to-use code to perform numerical simulations of structure formation using the COLA (COmoving Lagrangian Acceleration) method for cosmological models that exhibit scale-dependent growth at the level of first and second order Lagrangian perturbation theory. For modified gravity theories we also include screening using a fast approximate method that covers all the main examples of screening mechanisms in the literature. We test the code by comparing it to full simulations of two popular modified gravity models, namely f ( R ) gravity and nDGP, and find good agreement in the modified gravity boost-factors relative tomore » ΛCDM even when using a fairly small number of COLA time steps.« less
A New Code SORD for Simulation of Polarized Light Scattering in the Earth Atmosphere

NASA Technical Reports Server (NTRS)

Korkin, Sergey; Lyapustin, Alexei; Sinyuk, Aliaksandr; Holben, Brent

2016-01-01

We report a new publicly available radiative transfer (RT) code for numerical simulation of polarized light scattering in plane-parallel atmosphere of the Earth. Using 44 benchmark tests, we prove high accuracy of the new RT code, SORD (Successive ORDers of scattering). We describe capabilities of SORD and show run time for each test on two different machines. At present, SORD is supposed to work as part of the Aerosol Robotic NETwork (AERONET) inversion algorithm. For natural integration with the AERONET software, SORD is coded in Fortran 90/95. The code is available by email request from the corresponding (first) author or from ftp://climate1.gsfc.nasa.gov/skorkin/SORD/.
NDL-v2.0: A new version of the numerical differentiation library for parallel architectures

NASA Astrophysics Data System (ADS)

Hadjidoukas, P. E.; Angelikopoulos, P.; Voglis, C.; Papageorgiou, D. G.; Lagaris, I. E.

2014-07-01

We present a new version of the numerical differentiation library (NDL) used for the numerical estimation of first and second order partial derivatives of a function by finite differencing. In this version we have restructured the serial implementation of the code so as to achieve optimal task-based parallelization. The pure shared-memory parallelization of the library has been based on the lightweight OpenMP tasking model allowing for the full extraction of the available parallelism and efficient scheduling of multiple concurrent library calls. On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library. The new MPI implementation of NDL provides optimal performance in terms of function calls and, furthermore, supports asynchronous execution of multiple library calls within legacy MPI programs. In addition, a Python interface has been implemented for all cases, exporting the functionality of our library to sequential Python codes. Catalog identifier: AEDG_v2_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEDG_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 63036 No. of bytes in distributed program, including test data, etc.: 801872 Distribution format: tar.gz Programming language: ANSI Fortran-77, ANSI C, Python. Computer: Distributed systems (clusters), shared memory systems. Operating system: Linux, Unix. Has the code been vectorized or parallelized?: Yes. RAM: The library uses O(N) internal storage, N being the dimension of the problem. It can use up to O(N2) internal storage for Hessian calculations, if a task throttling factor has not been set by the user. Classification: 4.9, 4.14, 6.5. Catalog identifier of previous version: AEDG_v1_0 Journal reference of previous version: Comput. Phys. Comm. 180(2009)1404 Does the new version supersede the previous version?: Yes Nature of problem: The numerical estimation of derivatives at several accuracy levels is a common requirement in many computational tasks, such as optimization, solution of nonlinear systems, and sensitivity analysis. For a large number of scientific and engineering applications, the underlying functions correspond to simulation codes for which analytical estimation of derivatives is difficult or almost impossible. A parallel implementation that exploits systems with multiple CPUs is very important for large scale and computationally expensive problems. Solution method: Finite differencing is used with a carefully chosen step that minimizes the sum of the truncation and round-off errors. The parallel versions employ both OpenMP and MPI libraries. Reasons for new version: The updated version was motivated by our endeavors to extend a parallel Bayesian uncertainty quantification framework [1], by incorporating higher order derivative information as in most state-of-the-art stochastic simulation methods such as Stochastic Newton MCMC [2] and Riemannian Manifold Hamiltonian MC [3]. The function evaluations are simulations with significant time-to-solution, which also varies with the input parameters such as in [1, 4]. The runtime of the N-body-type of problem changes considerably with the introduction of a longer cut-off between the bodies. In the first version of the library, the OpenMP-parallel subroutines spawn a new team of threads and distribute the function evaluations with a PARALLEL DO directive. This limits the functionality of the library as multiple concurrent calls require nested parallelism support from the OpenMP environment. Therefore, either their function evaluations will be serialized or processor oversubscription is likely to occur due to the increased number of OpenMP threads. In addition, the Hessian calculations include two explicit parallel regions that compute first the diagonal and then the off-diagonal elements of the array. Due to the barrier between the two regions, the parallelism of the calculations is not fully exploited. These issues have been addressed in the new version by first restructuring the serial code and then running the function evaluations in parallel using OpenMP tasks. Although the MPI-parallel implementation of the first version is capable of fully exploiting the task parallelism of the PNDL routines, it does not utilize the caching mechanism of the serial code and, therefore, performs some redundant function evaluations in the Hessian and Jacobian calculations. This can lead to: (a) higher execution times if the number of available processors is lower than the total number of tasks, and (b) significant energy consumption due to wasted processor cycles. Overcoming these drawbacks, which become critical as the time of a single function evaluation increases, was the primary goal of this new version. Due to the code restructure, the MPI-parallel implementation (and the OpenMP-parallel in accordance) avoids redundant calls, providing optimal performance in terms of the number of function evaluations. Another limitation of the library was that the library subroutines were collective and synchronous calls. In the new version, each MPI process can issue any number of subroutines for asynchronous execution. We introduce two library calls that provide global and local task synchronizations, similarly to the BARRIER and TASKWAIT directives of OpenMP. The new MPI-implementation is based on TORC, a new tasking library for multicore clusters [5-7]. TORC improves the portability of the software, as it relies exclusively on the POSIX-Threads and MPI programming interfaces. It allows MPI processes to utilize multiple worker threads, offering a hybrid programming and execution environment similar to MPI+OpenMP, in a completely transparent way. Finally, to further improve the usability of our software, a Python interface has been implemented on top of both the OpenMP and MPI versions of the library. This allows sequential Python codes to exploit shared and distributed memory systems. Summary of revisions: The revised code improves the performance of both parallel (OpenMP and MPI) implementations. The functionality and the user-interface of the MPI-parallel version have been extended to support the asynchronous execution of multiple PNDL calls, issued by one or multiple MPI processes. A new underlying tasking library increases portability and allows MPI processes to have multiple worker threads. For both implementations, an interface to the Python programming language has been added. Restrictions: The library uses only double precision arithmetic. The MPI implementation assumes the homogeneity of the execution environment provided by the operating system. Specifically, the processes of a single MPI application must have identical address space and a user function resides at the same virtual address. In addition, address space layout randomization should not be used for the application. Unusual features: The software takes into account bound constraints, in the sense that only feasible points are used to evaluate the derivatives, and given the level of the desired accuracy, the proper formula is automatically employed. Running time: Running time depends on the function's complexity. The test run took 23 ms for the serial distribution, 25 ms for the OpenMP with 2 threads, 53 ms and 1.01 s for the MPI parallel distribution using 2 threads and 2 processes respectively and yield-time for idle workers equal to 10 ms. References: [1] P. Angelikopoulos, C. Paradimitriou, P. Koumoutsakos, Bayesian uncertainty quantification and propagation in molecular dynamics simulations: a high performance computing framework, J. Chem. Phys 137 (14). [2] H.P. Flath, L.C. Wilcox, V. Akcelik, J. Hill, B. van Bloemen Waanders, O. Ghattas, Fast algorithms for Bayesian uncertainty quantification in large-scale linear inverse problems based on low-rank partial Hessian approximations, SIAM J. Sci. Comput. 33 (1) (2011) 407-432. [3] M. Girolami, B. Calderhead, Riemann manifold Langevin and Hamiltonian Monte Carlo methods, J. R. Stat. Soc. Ser. B (Stat. Methodol.) 73 (2) (2011) 123-214. [4] P. Angelikopoulos, C. Paradimitriou, P. Koumoutsakos, Data driven, predictive molecular dynamics for nanoscale flow simulations under uncertainty, J. Phys. Chem. B 117 (47) (2013) 14808-14816. [5] P.E. Hadjidoukas, E. Lappas, V.V. Dimakopoulos, A runtime library for platform-independent task parallelism, in: PDP, IEEE, 2012, pp. 229-236. [6] C. Voglis, P.E. Hadjidoukas, D.G. Papageorgiou, I. Lagaris, A parallel hybrid optimization algorithm for fitting interatomic potentials, Appl. Soft Comput. 13 (12) (2013) 4481-4492. [7] P.E. Hadjidoukas, C. Voglis, V.V. Dimakopoulos, I. Lagaris, D.G. Papageorgiou, Supporting adaptive and irregular parallelism for non-linear numerical optimization, Appl. Math. Comput. 231 (2014) 544-559.
Parallel and orthogonal stimulus in ultradiluted neural networks

NASA Astrophysics Data System (ADS)

Sobral, G. A., Jr.; Vieira, V. M.; Lyra, M. L.; da Silva, C. R.

2006-10-01

Extending a model due to Derrida, Gardner, and Zippelius, we have studied the recognition ability of an extreme and asymmetrically diluted version of the Hopfield model for associative memory by including the effect of a stimulus in the dynamics of the system. We obtain exact results for the dynamic evolution of the average network superposition. The stimulus field was considered as proportional to the overlapping of the state of the system with a particular stimulated pattern. Two situations were analyzed, namely, the external stimulus acting on the initialization pattern (parallel stimulus) and the external stimulus acting on a pattern orthogonal to the initialization one (orthogonal stimulus). In both cases, we obtained the complete phase diagram in the parameter space composed of the stimulus field, thermal noise, and network capacity. Our results show that the system improves its recognition ability for parallel stimulus. For orthogonal stimulus two recognition phases emerge with the system locking at the initialization or stimulated pattern. We confront our analytical results with numerical simulations for the noiseless case T=0 .

The quasiperpendicular environment of large magnetic pulses in Earth's quasiparallel foreshock - ISEE 1 and 2 observations

NASA Technical Reports Server (NTRS)

Greenstadt, E. W.; Moses, S. L.; Coroniti, F. V.; Farris, M. H.; Russell, C. T.

1993-01-01

ULF waves in Earth's foreshock cause the instantaneous angle theta-B(n) between the upstream magnetic field and the shock normal to deviate from its average value. Close to the quasi-parallel (Q-parallel) shock, the transverse components of the waves become so large that the orientation of the field to the normal becomes quasi-perpendicular (Q-perpendicular) during applicable phases of each wave cycle. Large upstream pulses of B were observed completely enclosed in excursions of Theta-B(n) into the Q-perpendicular range. A recent numerical simulation included Theta-B(n) among the parameters examined in Q-parallel runs, and described a similar coincidence as intrinsic to a stage in development of the reformation process of such shocks. Thus, the natural environment of the Q-perpendicular section of Earth's bow shock seems to include an identifiable class of enlarged magnetic pulses for which local Q-perpendicular geometry is a necessary association.
Broadband ground motion simulation using a paralleled hybrid approach of Frequency Wavenumber and Finite Difference method

NASA Astrophysics Data System (ADS)

Chen, M.; Wei, S.

2016-12-01

The serious damage of Mexico City caused by the 1985 Michoacan earthquake 400 km away indicates that urban areas may be affected by remote earthquakes. To asses earthquake risk of urban areas imposed by distant earthquakes, we developed a hybrid Frequency Wavenumber (FK) and Finite Difference (FD) code implemented with MPI, since the computation of seismic wave propagation from a distant earthquake using a single numerical method (e.g. Finite Difference, Finite Element or Spectral Element) is very expensive. In our approach, we compute the incident wave field (ud) at the boundaries of the excitation box, which surrounding the local structure, using a paralleled FK method (Zhu and Rivera, 2002), and compute the total wave field (u) within the excitation box using a parallelled 2D FD method. We apply perfectly matched layer (PML) absorbing condition to the diffracted wave field (u-ud). Compared to previous Generalized Ray Theory and Finite Difference (Wen and Helmberger, 1998), Frequency Wavenumber and Spectral Element (Tong et al., 2014), and Direct Solution Method and Spectral Element hybrid method (Monteiller et al., 2013), our absorbing boundary condition dramatically suppress the numerical noise. The MPI implementation of our method can greatly speed up the calculation. Besides, our hybrid method also has a potential use in high resolution array imaging similar to Tong et al. (2014).
Measurement and numerical simulation of the changes in the open-loop transfer function in hearing aid as a function telephone handset proximity

NASA Astrophysics Data System (ADS)

Daigle, Gilles A.; Stinson, Michael R.

2002-11-01

The presence of a nearby object (telephone handset, cupped hand, etc.) can cause acoustical feedback to occur in a hearing aid. The object reflects or scatters additional sound energy to the microphone position causing the open-loop transfer function (OLTF) to increase. Feedback can occur when the OLTF>0 dB. To investigate this problem, measurements of the OLTF were made for three hearing aids (BTE, ITC, ITE) mounted on a KEMAR manikin. A telephone handset, positioned initially in a typical user position, was translated to positions between 0 and 100 mm away from the pinna, repeatibly, using a linear translation system. Changes of up to 15 dB or more were observed as the handset moved, particularly for positions within 20 mm of the pinna. In parallel, numerical simulations were made using a boundary element method. Computed changes in OLTF were consistent with the measured changes.
Massively Parallel Real-Time TDDFT Simulations of Electronic Stopping Processes

NASA Astrophysics Data System (ADS)

Yost, Dillon; Lee, Cheng-Wei; Draeger, Erik; Correa, Alfredo; Schleife, Andre; Kanai, Yosuke

Electronic stopping describes transfer of kinetic energy from fast-moving charged particles to electrons, producing massive electronic excitations in condensed matter. Understanding this phenomenon for ion irradiation has implications in modern technologies, ranging from nuclear reactors, to semiconductor devices for aerospace missions, to proton-based cancer therapy. Recent advances in high-performance computing allow us to achieve an accurate parameter-free description of these phenomena through numerical simulations. Here we discuss results from our recently-developed large-scale real-time TDDFT implementation for electronic stopping processes in important example materials such as metals, semiconductors, liquid water, and DNA. We will illustrate important insight into the physics underlying electronic stopping and we discuss current limitations of our approach both regarding physical and numerical approximations. This work is supported by the DOE through the INCITE awards and by the NSF. Part of this work was performed under the auspices of U.S. DOE by LLNL under Contract DE-AC52-07NA27344.
Highly Scalable Asynchronous Computing Method for Partial Differential Equations: A Path Towards Exascale

NASA Astrophysics Data System (ADS)

Konduri, Aditya

Many natural and engineering systems are governed by nonlinear partial differential equations (PDEs) which result in a multiscale phenomena, e.g. turbulent flows. Numerical simulations of these problems are computationally very expensive and demand for extreme levels of parallelism. At realistic conditions, simulations are being carried out on massively parallel computers with hundreds of thousands of processing elements (PEs). It has been observed that communication between PEs as well as their synchronization at these extreme scales take up a significant portion of the total simulation time and result in poor scalability of codes. This issue is likely to pose a bottleneck in scalability of codes on future Exascale systems. In this work, we propose an asynchronous computing algorithm based on widely used finite difference methods to solve PDEs in which synchronization between PEs due to communication is relaxed at a mathematical level. We show that while stability is conserved when schemes are used asynchronously, accuracy is greatly degraded. Since message arrivals at PEs are random processes, so is the behavior of the error. We propose a new statistical framework in which we show that average errors drop always to first-order regardless of the original scheme. We propose new asynchrony-tolerant schemes that maintain accuracy when synchronization is relaxed. The quality of the solution is shown to depend, not only on the physical phenomena and numerical schemes, but also on the characteristics of the computing machine. A novel algorithm using remote memory access communications has been developed to demonstrate excellent scalability of the method for large-scale computing. Finally, we present a path to extend this method in solving complex multi-scale problems on Exascale machines.
GPU-based simulation of optical propagation through turbulence for active and passive imaging

NASA Astrophysics Data System (ADS)

Monnier, Goulven; Duval, François-Régis; Amram, Solène

2014-10-01

IMOTEP is a GPU-based (Graphical Processing Units) software relying on a fast parallel implementation of Fresnel diffraction through successive phase screens. Its applications include active imaging, laser telemetry and passive imaging through turbulence with anisoplanatic spatial and temporal fluctuations. Thanks to parallel implementation on GPU, speedups ranging from 40X to 70X are achieved. The present paper gives a brief overview of IMOTEP models, algorithms, implementation and user interface. It then focuses on major improvements recently brought to the anisoplanatic imaging simulation method. Previously, we took advantage of the computational power offered by the GPU to develop a simulation method based on large series of deterministic realisations of the PSF distorted by turbulence. The phase screen propagation algorithm, by reproducing higher moments of the incident wavefront distortion, provides realistic PSFs. However, we first used a coarse gaussian model to fit the numerical PSFs and characterise there spatial statistics through only 3 parameters (two-dimensional displacements of centroid and width). Meanwhile, this approach was unable to reproduce the effects related to the details of the PSF structure, especially the "speckles" leading to prominent high-frequency content in short-exposure images. To overcome this limitation, we recently implemented a new empirical model of the PSF, based on Principal Components Analysis (PCA), ought to catch most of the PSF complexity. The GPU implementation allows estimating and handling efficiently the numerous (up to several hundreds) principal components typically required under the strong turbulence regime. A first demanding computational step involves PCA, phase screen propagation and covariance estimates. In a second step, realistic instantaneous images, fully accounting for anisoplanatic effects, are quickly generated. Preliminary results are presented.
ITG-TEM turbulence simulation with bounce-averaged kinetic electrons in tokamak geometry

NASA Astrophysics Data System (ADS)

Kwon, Jae-Min; Qi, Lei; Yi, S.; Hahm, T. S.

2017-06-01

We develop a novel numerical scheme to simulate electrostatic turbulence with kinetic electron responses in magnetically confined toroidal plasmas. Focusing on ion gyro-radius scale turbulences with slower frequencies than the time scales for electron parallel motions, we employ and adapt the bounce-averaged kinetic equation to model trapped electrons for nonlinear turbulence simulation with Coulomb collisions. Ions are modeled by employing the gyrokinetic equation. The newly developed scheme is implemented on a global δf particle in cell code gKPSP. By performing linear and nonlinear simulations, it is demonstrated that the new scheme can reproduce key physical properties of Ion Temperature Gradient (ITG) and Trapped Electron Mode (TEM) instabilities, and resulting turbulent transport. The overall computational cost of kinetic electrons using this novel scheme is limited to 200%-300% of the cost for simulations with adiabatic electrons. Therefore the new scheme allows us to perform kinetic simulations with trapped electrons very efficiently in magnetized plasmas.
The evolving energy budget of accretionary wedges

NASA Astrophysics Data System (ADS)

McBeck, Jessica; Cooke, Michele; Maillot, Bertrand; Souloumiac, Pauline

2017-04-01

The energy budget of evolving accretionary systems reveals how deformational processes partition energy as faults slip, topography uplifts, and layer-parallel shortening produces distributed off-fault deformation. The energy budget provides a quantitative framework for evaluating the energetic contribution or consumption of diverse deformation mechanisms. We investigate energy partitioning in evolving accretionary prisms by synthesizing data from physical sand accretion experiments and numerical accretion simulations. We incorporate incremental strain fields and cumulative force measurements from two suites of experiments to design numerical simulations that represent accretionary wedges with stronger and weaker detachment faults. One suite of the physical experiments includes a basal glass bead layer and the other does not. Two physical experiments within each suite implement different boundary conditions (stable base versus moving base configuration). Synthesizing observations from the differing base configurations reduces the influence of sidewall friction because the force vector produced by sidewall friction points in opposite directions depending on whether the base is fixed or moving. With the numerical simulations, we calculate the energy budget at two stages of accretion: at the maximum force preceding the development of the first thrust pair, and at the minimum force following the development of the pair. To identify the appropriate combination of material and fault properties to apply in the simulations, we systematically vary the Young's modulus and the fault static and dynamic friction coefficients in numerical accretion simulations, and identify the set of parameters that minimizes the misfit between the normal force measured on the physical backwall and the numerically simulated force. Following this derivation of the appropriate material and fault properties, we calculate the components of the work budget in the numerical simulations and in the simulated increments of the physical experiments. The work budget components of the physical experiments are determined from backwall force measurements and incremental velocity fields calculated via digital image correlation. Comparison of the energy budget preceding and following the development of the first thrust pair quantifies the tradeoff of work done in distributed deformation and work expended in frictional slip due to the development of the first backthrust and forethrust. In both the numerical and physical experiments, after the pair develops internal work decreases at the expense of frictional work, which increases. Despite the increase in frictional work, the total external work of the system decreases, revealing that accretion faulting leads to gains in efficiency. Comparison of the energy budget of the accretion experiments and simulations with the strong and weak detachments indicate that when the detachment is strong, the total energy consumed in frictional sliding and internal deformation is larger than when the detachment is relatively weak.
Algorithmic trends in computational fluid dynamics; The Institute for Computer Applications in Science and Engineering (ICASE)/LaRC Workshop, NASA Langley Research Center, Hampton, VA, US, Sep. 15-17, 1991

NASA Technical Reports Server (NTRS)

Hussaini, M. Y. (Editor); Kumar, A. (Editor); Salas, M. D. (Editor)

1993-01-01

The purpose here is to assess the state of the art in the areas of numerical analysis that are particularly relevant to computational fluid dynamics (CFD), to identify promising new developments in various areas of numerical analysis that will impact CFD, and to establish a long-term perspective focusing on opportunities and needs. Overviews are given of discretization schemes, computational fluid dynamics, algorithmic trends in CFD for aerospace flow field calculations, simulation of compressible viscous flow, and massively parallel computation. Also discussed are accerelation methods, spectral and high-order methods, multi-resolution and subcell resolution schemes, and inherently multidimensional schemes.
RIACS

NASA Technical Reports Server (NTRS)

Oliger, Joseph

1997-01-01

Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible navier-stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project.
Model and particle-in-cell simulation of ion energy distribution in collisionless sheath

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Zhuwen, E-mail: zzwwdxy@gznc.edu.cn; Key Laboratory of Photoelectron Materials Design and Simulation in Guizhou Province, Guiyang 550018; Scientific Research Innovation Team in Plasma and Functional Thin Film Materials in Guizhou Province, Guiyang 550018

2015-06-15

In this paper, we propose a self-consistent theoretical model, which is described by the ion energy distributions (IEDs) in collisionless sheaths, and the analytical results for different combined dc/radio frequency (rf) capacitive coupled plasma discharge cases, including sheath voltage errors analysis, are compared with the results of numerical simulations using a one-dimensional plane-parallel particle-in-cell (PIC) simulation. The IEDs in collisionless sheaths are performed on combination of dc/rf voltage sources electrodes discharge using argon as the process gas. The incident ions on the grounded electrode are separated, according to their different radio frequencies, and dc voltages on a separated electrode, themore » IEDs, and widths of energy in sheath and the plasma sheath thickness are discussed. The IEDs, the IED widths, and sheath voltages by the theoretical model are investigated and show good agreement with PIC simulations.« less
Real-space grids and the Octopus code as tools for the development of new simulation approaches for electronic systems

NASA Astrophysics Data System (ADS)

Andrade, Xavier; Strubbe, David; De Giovannini, Umberto; Larsen, Ask Hjorth; Oliveira, Micael J. T.; Alberdi-Rodriguez, Joseba; Varas, Alejandro; Theophilou, Iris; Helbig, Nicole; Verstraete, Matthieu J.; Stella, Lorenzo; Nogueira, Fernando; Aspuru-Guzik, Alán; Castro, Alberto; Marques, Miguel A. L.; Rubio, Angel

Real-space grids are a powerful alternative for the simulation of electronic systems. One of the main advantages of the approach is the flexibility and simplicity of working directly in real space where the different fields are discretized on a grid, combined with competitive numerical performance and great potential for parallelization. These properties constitute a great advantage at the time of implementing and testing new physical models. Based on our experience with the Octopus code, in this article we discuss how the real-space approach has allowed for the recent development of new ideas for the simulation of electronic systems. Among these applications are approaches to calculate response properties, modeling of photoemission, optimal control of quantum systems, simulation of plasmonic systems, and the exact solution of the Schr\\"odinger equation for low-dimensionality systems.
Advanced computational simulations of water waves interacting with wave energy converters

NASA Astrophysics Data System (ADS)

Pathak, Ashish; Freniere, Cole; Raessi, Mehdi

2017-03-01

Wave energy converter (WEC) devices harness the renewable ocean wave energy and convert it into useful forms of energy, e.g. mechanical or electrical. This paper presents an advanced 3D computational framework to study the interaction between water waves and WEC devices. The computational tool solves the full Navier-Stokes equations and considers all important effects impacting the device performance. To enable large-scale simulations in fast turnaround times, the computational solver was developed in an MPI parallel framework. A fast multigrid preconditioned solver is introduced to solve the computationally expensive pressure Poisson equation. The computational solver was applied to two surface-piercing WEC geometries: bottom-hinged cylinder and flap. Their numerically simulated response was validated against experimental data. Additional simulations were conducted to investigate the applicability of Froude scaling in predicting full-scale WEC response from the model experiments.
Numerical simulation of a mini PEMFC stack

NASA Astrophysics Data System (ADS)

Liu, Zhixiang; Mao, Zongqiang; Wang, Cheng; Zhuge, Weilin; Zhang, Yangjun

Fuel cell modeling and simulation has aroused much attention recently because it can probe transport and reaction mechanism. In this paper, a computational fuel cell dynamics (CFCD) method was applied to simulate a proton exchange membrane fuel cell (PEMFC) stack for the first time. The air cooling mini fuel cell stack consisted of six cells, in which the active area was 8 cm 2 (2 cm × 4 cm). With reasonable simplification, the computational elements were effectively reduced and allowed a simulation which could be conducted on a personal computer without large-scale parallel computation. The results indicated that the temperature gradient inside the fuel cell stack was determined by the flow rate of the cooling air. If the air flow rate is too low, the stack could not be effectively cooled and the temperature will rise to a range that might cause unstable stack operation.
3D simulation of floral oil storage in the scopa of South American insects

NASA Astrophysics Data System (ADS)

Ruettgers, Alexander; Griebel, Michael; Pastrik, Lars; Schmied, Heiko; Wittmann, Dieter; Scherrieble, Andreas; Dinkelmann, Albrecht; Stegmaier, Thomas; InstituteNumerical Simulation Team; Institute of Crop Science; Resource Conservation Team; Institute of Textile Technology; Process Engineering Team

2014-11-01

Several species of bees in South America possess structures to store and transport floral oils. By using closely spaced hairs at their back legs, the so called scopa, these bees can absorb and release oil droplets without loss. The high efficiency of this process is a matter of ongoing research. Basing on recent x-ray microtomography scans from the scopa of these bees at the Institute of Textile Technology and Process Engineering Denkendorf, we build a three-dimensional computer model. Using NaSt3DGPF, a two-phase flow solver developed at the Institute for Numerical Simulation of the University of Bonn, we perform massively parallel flow simulations with the complex micro-CT data. In this talk, we discuss the results of our simulations and the transfer of the x-ray measurement into a computer model. This research was funded under GR 1144/18-1 by the Deutsche Forschungsgemeinschaft (DFG).
High-speed extended-term time-domain simulation for online cascading analysis of power system

NASA Astrophysics Data System (ADS)

Fu, Chuan

A high-speed extended-term (HSET) time domain simulator (TDS), intended to become a part of an energy management system (EMS), has been newly developed for use in online extended-term dynamic cascading analysis of power systems. HSET-TDS includes the following attributes for providing situational awareness of high-consequence events: (i) online analysis, including n-1 and n-k events, (ii) ability to simulate both fast and slow dynamics for 1-3 hours in advance, (iii) inclusion of rigorous protection-system modeling, (iv) intelligence for corrective action ID, storage, and fast retrieval, and (v) high-speed execution. Very fast on-line computational capability is the most desired attribute of this simulator. Based on the process of solving algebraic differential equations describing the dynamics of power system, HSET-TDS seeks to develop computational efficiency at each of the following hierarchical levels, (i) hardware, (ii) strategies, (iii) integration methods, (iv) nonlinear solvers, and (v) linear solver libraries. This thesis first describes the Hammer-Hollingsworth 4 (HH4) implicit integration method. Like the trapezoidal rule, HH4 is symmetrically A-Stable but it possesses greater high-order precision (h4 ) than the trapezoidal rule. Such precision enables larger integration steps and therefore improves simulation efficiency for variable step size implementations. This thesis provides the underlying theory on which we advocate use of HH4 over other numerical integration methods for power system time-domain simulation. Second, motivated by the need to perform high speed extended-term time domain simulation (HSET-TDS) for on-line purposes, this thesis presents principles for designing numerical solvers of differential algebraic systems associated with power system time-domain simulation, including DAE construction strategies (Direct Solution Method), integration methods(HH4), nonlinear solvers(Very Dishonest Newton), and linear solvers(SuperLU). We have implemented a design appropriate for HSET-TDS, and we compare it to various solvers, including the commercial grade PSSE program, with respect to computational efficiency and accuracy, using as examples the New England 39 bus system, the expanded 8775 bus system, and PJM 13029 buses system. Third, we have explored a stiffness-decoupling method, intended to be part of parallel design of time domain simulation software for super computers. The stiffness-decoupling method is able to combine the advantages of implicit methods (A-stability) and explicit method(less computation). With the new stiffness detection method proposed herein, the stiffness can be captured. The expanded 975 buses system is used to test simulation efficiency. Finally, several parallel strategies for super computer deployment to simulate power system dynamics are proposed and compared. Design A partitions the task via scale with the stiffness decoupling method, waveform relaxation, and parallel linear solver. Design B partitions the task via the time axis using a highly precise integration method, the Kuntzmann-Butcher Method - order 8 (KB8). The strategy of partitioning events is designed to partition the whole simulation via the time axis through a simulated sequence of cascading events. For all strategies proposed, a strategy of partitioning cascading events is recommended, since the sub-tasks for each processor are totally independent, and therefore minimum communication time is needed.
High-Performance Parallel Analysis of Coupled Problems for Aircraft Propulsion

NASA Technical Reports Server (NTRS)

Felippa, C. A.; Farhat, C.; Park, K. C.; Gumaste, U.; Chen, P.-S.; Lesoinne, M.; Stern, P.

1996-01-01

This research program dealt with the application of high-performance computing methods to the numerical simulation of complete jet engines. The program was initiated in January 1993 by applying two-dimensional parallel aeroelastic codes to the interior gas flow problem of a bypass jet engine. The fluid mesh generation, domain decomposition and solution capabilities were successfully tested. Attention was then focused on methodology for the partitioned analysis of the interaction of the gas flow with a flexible structure and with the fluid mesh motion driven by these structural displacements. The latter is treated by a ALE technique that models the fluid mesh motion as that of a fictitious mechanical network laid along the edges of near-field fluid elements. New partitioned analysis procedures to treat this coupled three-component problem were developed during 1994 and 1995. These procedures involved delayed corrections and subcycling, and have been successfully tested on several massively parallel computers, including the iPSC-860, Paragon XP/S and the IBM SP2. For the global steady-state axisymmetric analysis of a complete engine we have decided to use the NASA-sponsored ENG10 program, which uses a regular FV-multiblock-grid discretization in conjunction with circumferential averaging to include effects of blade forces, loss, combustor heat addition, blockage, bleeds and convective mixing. A load-balancing preprocessor tor parallel versions of ENG10 was developed. During 1995 and 1996 we developed the capability tor the first full 3D aeroelastic simulation of a multirow engine stage. This capability was tested on the IBM SP2 parallel supercomputer at NASA Ames. Benchmark results were presented at the 1196 Computational Aeroscience meeting.
Computational study of noise in a large signal transduction network.

PubMed

Intosalmi, Jukka; Manninen, Tiina; Ruohonen, Keijo; Linne, Marja-Leena

2011-06-21

Biochemical systems are inherently noisy due to the discrete reaction events that occur in a random manner. Although noise is often perceived as a disturbing factor, the system might actually benefit from it. In order to understand the role of noise better, its quality must be studied in a quantitative manner. Computational analysis and modeling play an essential role in this demanding endeavor. We implemented a large nonlinear signal transduction network combining protein kinase C, mitogen-activated protein kinase, phospholipase A2, and β isoform of phospholipase C networks. We simulated the network in 300 different cellular volumes using the exact Gillespie stochastic simulation algorithm and analyzed the results in both the time and frequency domain. In order to perform simulations in a reasonable time, we used modern parallel computing techniques. The analysis revealed that time and frequency domain characteristics depend on the system volume. The simulation results also indicated that there are several kinds of noise processes in the network, all of them representing different kinds of low-frequency fluctuations. In the simulations, the power of noise decreased on all frequencies when the system volume was increased. We concluded that basic frequency domain techniques can be applied to the analysis of simulation results produced by the Gillespie stochastic simulation algorithm. This approach is suited not only to the study of fluctuations but also to the study of pure noise processes. Noise seems to have an important role in biochemical systems and its properties can be numerically studied by simulating the reacting system in different cellular volumes. Parallel computing techniques make it possible to run massive simulations in hundreds of volumes and, as a result, accurate statistics can be obtained from computational studies. © 2011 Intosalmi et al; licensee BioMed Central Ltd.
A Numerical Investigation of Two-Different Drosophila Forward Flight Modes

NASA Astrophysics Data System (ADS)

Sahin, Mehmet; Dilek, Ezgi; Erzincanli, Belkis

2016-11-01

The parallel large-scale unstructured finite volume method based on an Arbitrary Lagrangian-Eulerian (ALE) formulation has been applied in order to investigate the near wake structure of Drosophila in forward flight. DISTENE MeshGems-Hexa algorithm based on the octree method is used to generate the all hexahedral mesh for the wing-body combination. The mesh deformation algorithm is based on the indirect radial basis function (RBF) method at each time level while avoiding remeshing in order to enhance numerical robustness. The large-scale numerical simulations are carried out for a flapping Drosophila in forward flight. In the first case, the wing tip-path plane is tilted forward to generate forward force. In the second case, paddling wing motion is used to generate the forward fore. The λ2-criterion proposed by Jeong and Hussain (1995) is used for investigating the time variation of the Eulerian coherent structures in the near wake. The present simulations reveal highly detailed near wake topology for a hovering Drosophila. This is very useful in terms of understanding physics in biological flights which can provide a very useful tool for designing bio-inspired MAVs.
Numerical Simulation of the Vortex-Induced Vibration of A Curved Flexible Riser in Shear Flow

NASA Astrophysics Data System (ADS)

Zhu, Hong-jun; Lin, Peng-zhi

2018-06-01

A series of fully three-dimensional (3D) numerical simulations of flow past a free-to-oscillate curved flexible riser in shear flow were conducted at Reynolds number of 185-1015. The numerical results obtained by the two-way fluid-structure interaction (FSI) simulations are in good agreement with the experimental results reported in the earlier study. It is further found that the frequency transition is out of phase not only in the inline (IL) and crossflow (CF) directions but also along the span direction. The mode competition leads to the non-zero nodes of the rootmean- square (RMS) amplitude and the relatively chaotic trajectories. The fluid-structure interaction is to some extent reflected by the transverse velocity of the ambient fluid, which reaches the maximum value when the riser reaches the equilibrium position. Moreover, the local maximum transverse velocities occur at the peak CF amplitudes, and the values are relatively large when the vibration is in the resonance regions. The 3D vortex columns are shed nearly parallel to the axis of the curved flexible riser. As the local Reynolds number increases from 0 at the bottom of the riser to the maximum value at the top, the wake undergoes a transition from a two-dimensional structure to a 3D one. More irregular small-scale vortices appeared at the wake region of the riser, undergoing large amplitude responses.

Flood predictions using the parallel version of distributed numerical physical rainfall-runoff model TOPKAPI

NASA Astrophysics Data System (ADS)

Boyko, Oleksiy; Zheleznyak, Mark

2015-04-01

The original numerical code TOPKAPI-IMMS of the distributed rainfall-runoff model TOPKAPI ( Todini et al, 1996-2014) is developed and implemented in Ukraine. The parallel version of the code has been developed recently to be used on multiprocessors systems - multicore/processors PC and clusters. Algorithm is based on binary-tree decomposition of the watershed for the balancing of the amount of computation for all processors/cores. Message passing interface (MPI) protocol is used as a parallel computing framework. The numerical efficiency of the parallelization algorithms is demonstrated for the case studies for the flood predictions of the mountain watersheds of the Ukrainian Carpathian regions. The modeling results is compared with the predictions based on the lumped parameters models.
A novel feedback algorithm for simulating controlled dynamics and confinement in the advanced reversed-field pinch

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dahlin, J.-E.; Scheffel, J.

2005-06-15

In the advanced reversed-field pinch (RFP), the current density profile is externally controlled to diminish tearing instabilities. Thus the scaling of energy confinement time with plasma current and density is improved substantially as compared to the conventional RFP. This may be numerically simulated by introducing an ad hoc electric field, adjusted to generate a tearing mode stable parallel current density profile. In the present work a current profile control algorithm, based on feedback of the fluctuating electric field in Ohm's law, is introduced into the resistive magnetohydrodynamic code DEBSP [D. D. Schnack and D. C. Baxter, J. Comput. Phys. 55,more » 485 (1984); D. D. Schnack, D. C. Barnes, Z. Mikic, D. S. Marneal, E. J. Caramana, and R. A. Nebel, Comput. Phys. Commun. 43, 17 (1986)]. The resulting radial magnetic field is decreased considerably, causing an increase in energy confinement time and poloidal {beta}. It is found that the parallel current density profile spontaneously becomes hollow, and that a formation, being related to persisting resistive g modes, appears close to the reversal surface.« less
Interfacial properties in a discrete model for tumor growth

NASA Astrophysics Data System (ADS)

Moglia, Belén; Guisoni, Nara; Albano, Ezequiel V.

2013-03-01

We propose and study, by means of Monte Carlo numerical simulations, a minimal discrete model for avascular tumor growth, which can also be applied for the description of cell cultures in vitro. The interface of the tumor is self-affine and its width can be characterized by the following exponents: (i) the growth exponent β=0.32(2) that governs the early time regime, (ii) the roughness exponent α=0.49(2) related to the fluctuations in the stationary regime, and (iii) the dynamic exponent z=α/β≃1.49(2), which measures the propagation of correlations in the direction parallel to the interface, e.g., ξ∝t1/z, where ξ is the parallel correlation length. Therefore, the interface belongs to the Kardar-Parisi-Zhang universality class, in agreement with recent experiments of cell cultures in vitro. Furthermore, density profiles of the growing cells are rationalized in terms of traveling waves that are solutions of the Fisher-Kolmogorov equation. In this way, we achieved excellent agreement between the simulation results of the discrete model and the continuous description of the growth front of the culture or tumor.
Scalable Nonlinear Solvers for Fully Implicit Coupled Nuclear Fuel Modeling. Final Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cai, Xiao-Chuan; Keyes, David; Yang, Chao

2014-09-29

The focus of the project is on the development and customization of some highly scalable domain decomposition based preconditioning techniques for the numerical solution of nonlinear, coupled systems of partial differential equations (PDEs) arising from nuclear fuel simulations. These high-order PDEs represent multiple interacting physical fields (for example, heat conduction, oxygen transport, solid deformation), each is modeled by a certain type of Cahn-Hilliard and/or Allen-Cahn equations. Most existing approaches involve a careful splitting of the fields and the use of field-by-field iterations to obtain a solution of the coupled problem. Such approaches have many advantages such as ease of implementationmore » since only single field solvers are needed, but also exhibit disadvantages. For example, certain nonlinear interactions between the fields may not be fully captured, and for unsteady problems, stable time integration schemes are difficult to design. In addition, when implemented on large scale parallel computers, the sequential nature of the field-by-field iterations substantially reduces the parallel efficiency. To overcome the disadvantages, fully coupled approaches have been investigated in order to obtain full physics simulations.« less
Instrumentation, performance visualization, and debugging tools for multiprocessors

NASA Technical Reports Server (NTRS)

Yan, Jerry C.; Fineman, Charles E.; Hontalas, Philip J.

1991-01-01

The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessor architectures. However, without effective means to monitor (and visualize) program execution, debugging, and tuning parallel programs becomes intractably difficult as program complexity increases with the number of processors. Research on performance evaluation tools for multiprocessors is being carried out at ARC. Besides investigating new techniques for instrumenting, monitoring, and presenting the state of parallel program execution in a coherent and user-friendly manner, prototypes of software tools are being incorporated into the run-time environments of various hardware testbeds to evaluate their impact on user productivity. Our current tool set, the Ames Instrumentation Systems (AIMS), incorporates features from various software systems developed in academia and industry. The execution of FORTRAN programs on the Intel iPSC/860 can be automatically instrumented and monitored. Performance data collected in this manner can be displayed graphically on workstations supporting X-Windows. We have successfully compared various parallel algorithms for computational fluid dynamics (CFD) applications in collaboration with scientists from the Numerical Aerodynamic Simulation Systems Division. By performing these comparisons, we show that performance monitors and debuggers such as AIMS are practical and can illuminate the complex dynamics that occur within parallel programs.
Simulation Exploration through Immersive Parallel Planes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brunhart-Lupo, Nicholas J; Bush, Brian W; Gruchalla, Kenny M

We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinate's mapping the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, eachmore » individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used to both explore existing data as well as to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.« less
Simulation Exploration through Immersive Parallel Planes: Preprint

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brunhart-Lupo, Nicholas; Bush, Brian W.; Gruchalla, Kenny

We present a visualization-driven simulation system that tightly couples systems dynamics simulations with an immersive virtual environment to allow analysts to rapidly develop and test hypotheses in a high-dimensional parameter space. To accomplish this, we generalize the two-dimensional parallel-coordinates statistical graphic as an immersive 'parallel-planes' visualization for multivariate time series emitted by simulations running in parallel with the visualization. In contrast to traditional parallel coordinate's mapping the multivariate dimensions onto coordinate axes represented by a series of parallel lines, we map pairs of the multivariate dimensions onto a series of parallel rectangles. As in the case of parallel coordinates, eachmore » individual observation in the dataset is mapped to a polyline whose vertices coincide with its coordinate values. Regions of the rectangles can be 'brushed' to highlight and select observations of interest: a 'slider' control allows the user to filter the observations by their time coordinate. In an immersive virtual environment, users interact with the parallel planes using a joystick that can select regions on the planes, manipulate selection, and filter time. The brushing and selection actions are used to both explore existing data as well as to launch additional simulations corresponding to the visually selected portions of the input parameter space. As soon as the new simulations complete, their resulting observations are displayed in the virtual environment. This tight feedback loop between simulation and immersive analytics accelerates users' realization of insights about the simulation and its output.« less
A path-level exact parallelization strategy for sequential simulation

NASA Astrophysics Data System (ADS)

Peredo, Oscar F.; Baeza, Daniel; Ortiz, Julián M.; Herrero, José R.

2018-01-01

Sequential Simulation is a well known method in geostatistical modelling. Following the Bayesian approach for simulation of conditionally dependent random events, Sequential Indicator Simulation (SIS) method draws simulated values for K categories (categorical case) or classes defined by K different thresholds (continuous case). Similarly, Sequential Gaussian Simulation (SGS) method draws simulated values from a multivariate Gaussian field. In this work, a path-level approach to parallelize SIS and SGS methods is presented. A first stage of re-arrangement of the simulation path is performed, followed by a second stage of parallel simulation for non-conflicting nodes. A key advantage of the proposed parallelization method is to generate identical realizations as with the original non-parallelized methods. Case studies are presented using two sequential simulation codes from GSLIB: SISIM and SGSIM. Execution time and speedup results are shown for large-scale domains, with many categories and maximum kriging neighbours in each case, achieving high speedup results in the best scenarios using 16 threads of execution in a single machine.
Study of Magnetic Damping Effect on Convection and Solidification Under G-Jitter Conditions

NASA Technical Reports Server (NTRS)

Li, Ben Q.; deGroh, H. C., III

1999-01-01

As shown by NASA resources dedicated to measuring residual gravity (SAMS and OARE systems), g-jitter is a critical issue affecting space experiments on solidification processing of materials. This study aims to provide, through extensive numerical simulations and ground based experiments, an assessment of the use of magnetic fields in combination with microgravity to reduce the g-jitter induced convective flows in space processing systems. We have so far completed asymptotic analyses based on the analytical solutions for g-jitter driven flow and magnetic field damping effects for a simple one-dimensional parallel plate configuration, and developed both 2-D and 3-D numerical models for g-jitter driven flows in simple solidification systems with and without presence of an applied magnetic field. Numerical models have been checked with the analytical solutions and have been applied to simulate the convective flows and mass transfer using both synthetic g-jitter functions and the g-jitter data taken from space flight. Some useful findings have been obtained from the analyses and the modeling results. Some key points may be summarized as follows: (1) the amplitude of the oscillating velocity decreases at a rate inversely proportional to the g-jitter frequency and with an increase in the applied magnetic field; (2) the induced flow approximately oscillates at the same frequency as the affecting g-jitter, but out of a phase angle; (3) the phase angle is a complicated function of geometry, applied magnetic field, temperature gradient and frequency; (4) g-jitter driven flows exhibit a complex fluid flow pattern evolving in time; (5) the damping effect is more effective for low frequency flows; and (6) the applied magnetic field helps to reduce the variation of solutal distribution along the solid-liquid interface. Work in progress includes numerical simulations and ground-based measurements. Both 2-D and 3-D numerical simulations are being continued to obtain further information on g-jitter driven flows and magnetic field effects. A physical model for ground-based measurements is completed and some measurements of the oscillating convection are being taken on the physical model. The comparison of the measurements with numerical simulations is in progress. Additional work planned in the project will also involve extending the 2-D numerical model to include the solidification phenomena with the presence of both g-jitter and magnetic fields.
Incorporation of Three-dimensional Radiative Transfer into a Very High Resolution Simulation of Horizontally Inhomogeneous Clouds

NASA Astrophysics Data System (ADS)

Ishida, H.; Ota, Y.; Sekiguchi, M.; Sato, Y.

2016-12-01

A three-dimensional (3D) radiative transfer calculation scheme is developed to estimate horizontal transport of radiation energy in a very high resolution (with the order of 10 m in spatial grid) simulation of cloud evolution, especially for horizontally inhomogeneous clouds such as shallow cumulus and stratocumulus. Horizontal radiative transfer due to inhomogeneous clouds seems to cause local heating/cooling in an atmosphere with a fine spatial scale. It is, however, usually difficult to estimate the 3D effects, because the 3D radiative transfer often needs a large resource for computation compared to a plane-parallel approximation. This study attempts to incorporate a solution scheme that explicitly solves the 3D radiative transfer equation into a numerical simulation, because this scheme has an advantage in calculation for a sequence of time evolution (i.e., the scene at a time is little different from that at the previous time step). This scheme is also appropriate to calculation of radiation with strong absorption, such as the infrared regions. For efficient computation, this scheme utilizes several techniques, e.g., the multigrid method for iteration solution, and a correlated-k distribution method refined for efficient approximation of the wavelength integration. For a case study, the scheme is applied to an infrared broadband radiation calculation in a broken cloud field generated with a large eddy simulation model. The horizontal transport of infrared radiation, which cannot be estimated by the plane-parallel approximation, and its variation in time can be retrieved. The calculation result elucidates that the horizontal divergences and convergences of infrared radiation flux are not negligible, especially at the boundaries of clouds and within optically thin clouds, and the radiative cooling at lateral boundaries of clouds may reduce infrared radiative heating in clouds. In a future work, the 3D effects on radiative heating/cooling will be able to be included into atmospheric numerical models.
SIAM Conference on Parallel Processing for Scientific Computing, 4th, Chicago, IL, Dec. 11-13, 1989, Proceedings

NASA Technical Reports Server (NTRS)

Dongarra, Jack (Editor); Messina, Paul (Editor); Sorensen, Danny C. (Editor); Voigt, Robert G. (Editor)

1990-01-01

Attention is given to such topics as an evaluation of block algorithm variants in LAPACK and presents a large-grain parallel sparse system solver, a multiprocessor method for the solution of the generalized Eigenvalue problem on an interval, and a parallel QR algorithm for iterative subspace methods on the CM2. A discussion of numerical methods includes the topics of asynchronous numerical solutions of PDEs on parallel computers, parallel homotopy curve tracking on a hypercube, and solving Navier-Stokes equations on the Cedar Multi-Cluster system. A section on differential equations includes a discussion of a six-color procedure for the parallel solution of elliptic systems using the finite quadtree structure, data parallel algorithms for the finite element method, and domain decomposition methods in aerodynamics. Topics dealing with massively parallel computing include hypercube vs. 2-dimensional meshes and massively parallel computation of conservation laws. Performance and tools are also discussed.
A New Attempt of 2-D Numerical Ice Flow Model to Reconstruct Paleoclimate from Mountain Glaciers

NASA Astrophysics Data System (ADS)

Candaş, Adem; Akif Sarıkaya, Mehmet

2017-04-01

A new two dimensional (2D) numerical ice flow model is generated to simulate the steady-state glacier extent for a wide range of climate conditions. The simulation includes the flow of ice enforced by the annual mass balance gradient of a valley glacier. The annual mass balance is calculated by the difference of the net accumulation and ablation of snow and (or) ice. The generated model lets users to compare the simulated and field observed ice extent of paleoglaciers. As a result, model results provide the conditions about the past climates since simulated ice extent is a function of predefined climatic conditions. To predict the glacier shape and distribution in two dimension, time dependent partial differential equation (PDE) is solved. Thus, a 2D glacier flow model code is constructed in MATLAB and a finite difference method is used to solve this equation. On the other hand, Parallel Ice Sheet Model (PISM) is used to regenerate paleoglaciers in the same area where the MATLAB code is applied. We chose the Mount Dedegöl, an extensively glaciated mountain in SW Turkey, to apply both models. Model results will be presented and discussed in this presentation. This study was supported by TÜBİTAK 114Y548 project.
Multi-scale simulations of space problems with iPIC3D

NASA Astrophysics Data System (ADS)

Lapenta, Giovanni; Bettarini, Lapo; Markidis, Stefano

The implicit Particle-in-Cell method for the computer simulation of space plasma, and its im-plementation in a three-dimensional parallel code, called iPIC3D, are presented. The implicit integration in time of the Vlasov-Maxwell system removes the numerical stability constraints and enables kinetic plasma simulations at magnetohydrodynamics scales. Simulations of mag-netic reconnection in plasma are presented to show the effectiveness of the algorithm. In particular we will show a number of simulations done for large scale 3D systems using the physical mass ratio for Hydrogen. Most notably one simulation treats kinetically a box of tens of Earth radii in each direction and was conducted using about 16000 processors of the Pleiades NASA computer. The work is conducted in collaboration with the MMS-IDS theory team from University of Colorado (M. Goldman, D. Newman and L. Andersson). Reference: Stefano Markidis, Giovanni Lapenta, Rizwan-uddin Multi-scale simulations of plasma with iPIC3D Mathematics and Computers in Simulation, Available online 17 October 2009, http://dx.doi.org/10.1016/j.matcom.2009.08.038
A compositional reservoir simulator on distributed memory parallel computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rame, M.; Delshad, M.

1995-12-31

This paper presents the application of distributed memory parallel computes to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/960 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. Amore » portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes the porting to new parallel platforms straight forward. Results of the distributed memory computing performance of Parallel simulator are presented for field scale applications such as tracer flood and polymer flood. A comparison of the wall-clock times for same problems on a vector supercomputer is also presented.« less
Hydrodynamic Simulations of Giant Impacts

NASA Astrophysics Data System (ADS)

Reinhardt, Christian; Stadel, Joachim

2013-07-01

We studied the basic numerical aspects of giant impacts using Smoothed Particles Hydrodynamics (SPH), which has been used in most of the prior studies conducted in this area (e.g., Benz, Canup). Our main goal was to modify the massive parallel, multi-stepping code GASOLINE widely used in cosmological simulations so that it can properly simulate the behavior of condensed materials such as granite or iron using the Tillotson equation of state. GASOLINE has been used to simulate hundreds of millions of particles for ideal gas physics so that using several millions of particles in condensed material simulations seems possible. In order to focus our attention of the numerical aspects of the problem we neglected the internal structure of the protoplanets and modelled them as homogenous (isothermal) granite spheres. For the energy balance we only considered PdV work and shock heating of the material during the impact (neglected cooling of the material). Starting at a low resolution of 2048 particles for the target and the impactor we run several simulations for different impact parameters and impact velocities and successfully reproduced the main features of the pioneering work of Benz from 1986. The impact sends a shock wave through both bodies heating the target and disrupting the remaining impactor. As in prior simulations material is ejected from the collision. How much, and whether it leaves the system or survives in an orbit for a longer time, depends on the initial conditions but also on resolution. Increasing the resolution (to 1.2x10⁶ particles) results in both a much clearer shock wave and deformation of the bodies during the impact and a more compact and detailed "arm" like structure of the ejected material. Currently we are investigating some numerical issues we encountered and are implementing differentiated models, making one step closer to more realistic protoplanets in such giant impact simulations.
A Novel Method for Characterization of Superconductors: Physical Measurements and Modeling of Thin Films

NASA Technical Reports Server (NTRS)

Kim, B. F.; Moorjani, K.; Phillips, T. E.; Adrian, F. J.; Bohandy, J.; Dolecek, Q. E.

1993-01-01

A method for characterization of granular superconducting thin films has been developed which encompasses both the morphological state of the sample and its fabrication process parameters. The broad scope of this technique is due to the synergism between experimental measurements and their interpretation using numerical simulation. Two novel technologies form the substance of this system: the magnetically modulated resistance method for characterizing superconductors; and a powerful new computer peripheral, the Parallel Information Processor card, which provides enhanced computing capability for PC computers. This enhancement allows PC computers to operate at speeds approaching that of supercomputers. This makes atomic scale simulations possible on low cost machines. The present development of this system involves the integration of these two technologies using mesoscale simulations of thin film growth. A future stage of development will incorporate atomic scale modeling.
Dynamic load balance scheme for the DSMC algorithm

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Jin; Geng, Xiangren; Jiang, Dingwu

The direct simulation Monte Carlo (DSMC) algorithm, devised by Bird, has been used over a wide range of various rarified flow problems in the past 40 years. While the DSMC is suitable for the parallel implementation on powerful multi-processor architecture, it also introduces a large load imbalance across the processor array, even for small examples. The load imposed on a processor by a DSMC calculation is determined to a large extent by the total of simulator particles upon it. Since most flows are impulsively started with initial distribution of particles which is surely quite different from the steady state, themore » total of simulator particles will change dramatically. The load balance based upon an initial distribution of particles will break down as the steady state of flow is reached. The load imbalance and huge computational cost of DSMC has limited its application to rarefied or simple transitional flows. In this paper, by taking advantage of METIS, a software for partitioning unstructured graphs, and taking the total of simulator particles in each cell as a weight information, the repartitioning based upon the principle that each processor handles approximately the equal total of simulator particles has been achieved. The computation must pause several times to renew the total of simulator particles in each processor and repartition the whole domain again. Thus the load balance across the processors array holds in the duration of computation. The parallel efficiency can be improved effectively. The benchmark solution of a cylinder submerged in hypersonic flow has been simulated numerically. Besides, hypersonic flow past around a complex wing-body configuration has also been simulated. The results have displayed that, for both of cases, the computational time can be reduced by about 50%.« less
Heat loads on poloidal and toroidal edges of castellated plasma-facing components in COMPASS

NASA Astrophysics Data System (ADS)

Dejarnac, R.; Corre, Y.; Vondracek, P.; Gaspar, J.; Gauthier, E.; Gunn, J. P.; Komm, M.; Gardarein, J.-L.; Horacek, J.; Hron, M.; Matejicek, J.; Pitts, R. A.; Panek, R.

2018-06-01

Dedicated experiments have been performed in the COMPASS tokamak to thoroughly study the power deposition processes occurring on poloidal and toroidal edges of castellated plasma-facing components in tokamaks during steady-state L-mode conditions. Surface temperatures measured by a high resolution infra-red camera are compared with reconstructed synthetic data from a 2D thermal model using heat flux profiles derived from both the optical approximation and 2D particle-in-cell (PIC) simulations. In the case of poloidal leading edges, when the contribution from local radiation is taken into account, the parallel heat flux deduced from unperturbed, upstream measurements is fully consistent with the observed temperature increase at the leading edges of various heights, respecting power balance assuming simple projection of the parallel flux density. Smoothing of the heat flux deposition profile due to finite ion Larmor radius predicted by the PIC simulations is found to be weak and the power deposition on misaligned poloidal edges is better described by the optical approximation. This is consistent with an electron-dominated regime associated with a non-ambipolar parallel current flow. In the case of toroidal gap edges, the different contributions of the total incoming flux along the gap have been observed experimentally for the first time. They confirm the results of recent numerical studies performed for ITER showing that in specific cases the heat deposition does not necessarily follow the optical approximation. Indeed, ions can spiral onto the magnetically shadowed toroidal edge. Particle-in-cell simulations emphasize again the role played by local non-ambipolarity in the deposition pattern.
Earthquake Rupture Dynamics using Adaptive Mesh Refinement and High-Order Accurate Numerical Methods

NASA Astrophysics Data System (ADS)

Kozdon, J. E.; Wilcox, L.

2013-12-01

Our goal is to develop scalable and adaptive (spatial and temporal) numerical methods for coupled, multiphysics problems using high-order accurate numerical methods. To do so, we are developing an opensource, parallel library known as bfam (available at http://bfam.in). The first application to be developed on top of bfam is an earthquake rupture dynamics solver using high-order discontinuous Galerkin methods and summation-by-parts finite difference methods. In earthquake rupture dynamics, wave propagation in the Earth's crust is coupled to frictional sliding on fault interfaces. This coupling is two-way, required the simultaneous simulation of both processes. The use of laboratory-measured friction parameters requires near-fault resolution that is 4-5 orders of magnitude higher than that needed to resolve the frequencies of interest in the volume. This, along with earlier simulations using a low-order, finite volume based adaptive mesh refinement framework, suggest that adaptive mesh refinement is ideally suited for this problem. The use of high-order methods is motivated by the high level of resolution required off the fault in earlier the low-order finite volume simulations; we believe this need for resolution is a result of the excessive numerical dissipation of low-order methods. In bfam spatial adaptivity is handled using the p4est library and temporal adaptivity will be accomplished through local time stepping. In this presentation we will present the guiding principles behind the library as well as verification of code against the Southern California Earthquake Center dynamic rupture code validation test problems.
High-Resolution Numerical Simulation and Analysis of Mach Reflection Structures in Detonation Waves in Low-Pressure H 2 –O 2 –Ar Mixtures: A Summary of Results Obtained with the Adaptive Mesh Refinement Framework AMROC

DOE PAGES

Deiterding, Ralf

2011-01-01

Numerical simulation can be key to the understanding of the multidimensional nature of transient detonation waves. However, the accurate approximation of realistic detonations is demanding as a wide range of scales needs to be resolved. This paper describes a successful solution strategy that utilizes logically rectangular dynamically adaptive meshes. The hydrodynamic transport scheme and the treatment of the nonequilibrium reaction terms are sketched. A ghost fluid approach is integrated into the method to allow for embedded geometrically complex boundaries. Large-scale parallel simulations of unstable detonation structures of Chapman-Jouguet detonations in low-pressure hydrogen-oxygen-argon mixtures demonstrate the efficiency of the described techniquesmore » in practice. In particular, computations of regular cellular structures in two and three space dimensions and their development under transient conditions, that is, under diffraction and for propagation through bends are presented. Some of the observed patterns are classified by shock polar analysis, and a diagram of the transition boundaries between possible Mach reflection structures is constructed.« less

Some links on this page may take you to non-federal websites. Their policies may differ from this site.