parallel direct numerical: Topics by Science.gov

Sample records for parallel direct numerical

Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

Moitra, Stuti; Gatski, Thomas B.

1997-01-01

A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube

NASA Technical Reports Server (NTRS)

Joslin, Ronald D.; Zubair, Mohammad

1993-01-01

The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube are documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can effectively be parallelized on a distributed-memory parallel machine. By increasing the number of processors, nearly ideal linear speedups are achieved with nonoptimized routines; slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the Fast Fourier Transform (FFT) routine dominates the computational cost and because the routine itself yields less than ideal speedups. However, with the machine-dependent routines, the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise, wall-normal, and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the amount of Cray supercomputer single-processor time to complete a comparable simulation; however, it is estimated that a subgrid-scale model, which reduces the required number of grid points and turns the approach into a large-eddy simulation (PSLES), would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.
Numerical study of the interaction between a head fire and a backfire propagating in grassland.

Treesearch

Dominique Morvan; Sofiane Meradji; William Mell

2011-01-01

One of the objectives of this paper was to simulate numerically the interaction between two line fires ignited in a grassland, on flat terrain, perpendicular to the wind direction, in such a way that the two fire fronts (a head fire and a backfire) propagated in opposite directions parallel to the wind. The numerical simulations were conducted in 3-D using the new...
A parallel time integrator for noisy nonlinear oscillatory systems

NASA Astrophysics Data System (ADS)

Subber, Waad; Sarkar, Abhijit

2018-06-01

In this paper, we adapt a parallel time integration scheme to track the trajectories of noisy nonlinear dynamical systems. Specifically, we formulate a parallel algorithm to generate the sample path of a nonlinear oscillator defined by stochastic differential equations (SDEs) using the so-called parareal method for ordinary differential equations (ODEs). The presence of a Wiener process in SDEs causes difficulties in the direct application of any numerical integration technique for ODEs, including the parareal algorithm. The parallel implementation of the algorithm involves two SDE solvers, namely a fine-level scheme to integrate the system in parallel and a coarse-level scheme to generate and correct the required initial conditions to start the fine-level integrators. For the numerical illustration, a randomly excited Duffing oscillator is investigated in order to study the performance of the stochastic parallel algorithm with respect to a range of system parameters. The distributed implementation of the algorithm exploits the Message Passing Interface (MPI).
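The fine-level and coarse-level roles described in this abstract can be illustrated with a minimal parareal sketch. It is not the authors' implementation: it treats a deterministic ODE rather than an SDE, uses explicit Euler for both propagators, and runs the "parallel" fine solves in an ordinary loop instead of over MPI. The names `euler`, `parareal`, `coarse_steps`, and `fine_steps` are hypothetical.

```python
def euler(f, y, t0, t1, steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with explicit Euler."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        y = y + h * f(t, y)
        t += h
    return y


def parareal(f, y0, t_start, t_end, n_slices, n_iters,
             coarse_steps=1, fine_steps=100):
    """Parareal iteration over n_slices time slices (illustrative only).

    Returns the approximate solution at the slice boundaries.
    """
    ts = [t_start + (t_end - t_start) * n / n_slices
          for n in range(n_slices + 1)]

    def coarse(y, a, b):
        return euler(f, y, a, b, coarse_steps)

    def fine(y, a, b):
        return euler(f, y, a, b, fine_steps)

    # Initial guess: one cheap sequential sweep with the coarse solver.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n], ts[n], ts[n + 1]))

    for _ in range(n_iters):
        # These solves start from the previous iterate, so each slice is
        # independent and could be assigned to its own process.
        f_old = [fine(u[n], ts[n], ts[n + 1]) for n in range(n_slices)]
        g_old = [coarse(u[n], ts[n], ts[n + 1]) for n in range(n_slices)]

        # Sequential correction sweep: new coarse value plus the
        # fine-minus-coarse defect from the previous iterate.
        new_u = [y0]
        for n in range(n_slices):
            g_new = coarse(new_u[n], ts[n], ts[n + 1])
            new_u.append(g_new + f_old[n] - g_old[n])
        u = new_u
    return u
```

After k iterations the first k slice boundaries agree with a serial fine-level integration, so the method pays off only when it converges in far fewer iterations than there are slices.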
THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

NASA Astrophysics Data System (ADS)

Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

2015-07-01

The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Starting from TOUGHREACT, a well-established code for simulating subsurface multiphase flow and reactive transport problems, we developed THC-MP, a high-performance computing code for massively parallel computers that greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structure, implemented the data initialization and exchange between the computing nodes, and implemented the core solving module using the hybrid parallel iterative and direct solver. The numerical accuracy of THC-MP was verified on a CO2 injection-induced reactive transport problem by comparing the results obtained from parallel computing with those from sequential computing (original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results demonstrate the enhanced performance of THC-MP on parallel computing facilities.
Surrogates for numerical simulations; optimization of eddy-promoter heat exchangers

NASA Technical Reports Server (NTRS)

Patera, Anthony T.; Patera, Anthony

1993-01-01

Although the advent of fast and inexpensive parallel computers has rendered numerous previously intractable calculations feasible, many numerical simulations remain too resource-intensive to be directly inserted in engineering optimization efforts. An attractive alternative to direct insertion considers models for computational systems: the expensive simulation is invoked only to construct and validate a simplified input-output model; this simplified input-output model then serves as a simulation surrogate in subsequent engineering optimization studies. A simple 'Bayesian-validated' statistical framework for the construction, validation, and purposive application of static computer simulation surrogates is presented. As an example, dissipation-transport optimization of laminar-flow eddy-promoter heat exchangers is considered: parallel spectral element Navier-Stokes calculations serve to construct and validate surrogates for the flowrate and Nusselt number; these surrogates then represent the originating Navier-Stokes equations in the ensuing design process.
A Parallel Compact Multi-Dimensional Numerical Algorithm with Aeroacoustics Applications

NASA Technical Reports Server (NTRS)

Povitsky, Alex; Morris, Philip J.

1999-01-01

In this study we propose a novel method to parallelize high-order compact numerical algorithms for the solution of three-dimensional PDEs (Partial Differential Equations) in a space-time domain. For this numerical integration most of the computer time is spent in computation of spatial derivatives at each stage of the Runge-Kutta temporal update. The most efficient direct method to compute spatial derivatives on a serial computer is a version of Gaussian elimination for narrow linear banded systems known as the Thomas algorithm. In a straightforward pipelined implementation of the Thomas algorithm, processors are idle due to the forward and backward recurrences of the algorithm. To utilize processors during this time, we propose to use them for either non-local data-independent computations, solving lines in the next spatial direction, or local data-dependent computations by the Runge-Kutta method. To achieve this goal, control of processor communication and computations by a static schedule is adopted. Thus, our parallel code is driven by a communication and computation schedule instead of the usual "creative programming" approach. The obtained parallelization speed-up of the novel algorithm is about twice as much as that for the standard pipelined algorithm and close to that for the explicit DRP algorithm.
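The forward and backward recurrences mentioned in this abstract are easiest to see in a serial version of the Thomas algorithm. The sketch below is a textbook form for a single tridiagonal system, not the pipelined or scheduled variant the authors propose; the function name `thomas_solve` and the argument layout are assumptions made for illustration.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused),
    diag[i] multiplies x[i], and upper[i] multiplies x[i+1]
    (upper[-1] is unused). No pivoting is performed, so the matrix
    is assumed to be diagonally dominant or otherwise well behaved.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward recurrence: eliminate the sub-diagonal row by row.
    # Row i needs the result of row i-1, which is what serializes it.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Backward recurrence: back-substitute starting from the last unknown.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

Because each step of both loops depends on the previous one, a processor that owns only part of a line must wait for its neighbour, which is the idle time the abstract refers to.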
Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2

NASA Technical Reports Server (NTRS)

Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad

1995-01-01

The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers are documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can effectively be parallelized on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided that yields estimated computational costs matching the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the computational cost is dominated by the FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We also provide various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
Highly Parallel Alternating Directions Algorithm for Time Dependent Problems

NASA Astrophysics Data System (ADS)

Ganzha, M.; Georgiev, K.; Lirkov, I.; Margenov, S.; Paprzycki, M.

2011-11-01

In our work, we consider the time-dependent Stokes equation on a finite time interval and on a uniform rectangular mesh, written in terms of velocity and pressure. For this problem, a parallel algorithm based on a novel direction splitting approach is developed. Here, the pressure equation is derived from a perturbed form of the continuity equation, in which the incompressibility constraint is penalized in a negative norm induced by the direction splitting. The scheme used in the algorithm is composed of two parts: (i) velocity prediction, and (ii) pressure correction. This is a Crank-Nicolson-type two-stage time integration scheme for two- and three-dimensional parabolic problems in which the second-order derivative, with respect to each space variable, is treated implicitly while the other variable is made explicit at each time sub-step. In order to achieve good parallel performance, the solution of the Poisson problem for the pressure correction is replaced by solving a sequence of one-dimensional second-order elliptic boundary value problems in each spatial direction. The parallel code is implemented using the standard MPI functions and tested on two modern parallel computer systems. The performed numerical tests demonstrate a good level of parallel efficiency and scalability of the studied direction-splitting-based algorithm.
Petascale turbulence simulation using a highly parallel fast multipole method on GPUs

NASA Astrophysics Data System (ADS)

Yokota, Rio; Barba, L. A.; Narumi, Tetsu; Yasuoka, Kenji

2013-03-01

This paper reports large-scale direct numerical simulations of homogeneous-isotropic fluid turbulence, achieving sustained performance of 1.08 petaflop/s on GPU hardware using single precision. The simulations use a vortex particle method to solve the Navier-Stokes equations, with a highly parallel fast multipole method (FMM) as numerical engine, and match the current record in mesh size for this application, a cube of 4096^3 computational points solved with a spectral method. The standard numerical approach used in this field is the pseudo-spectral method, relying on the FFT algorithm as the numerical engine. The particle-based simulations presented in this paper quantitatively match the kinetic energy spectrum obtained with a pseudo-spectral method, using a trusted code. In terms of parallel performance, weak scaling results show the FMM-based vortex method achieving 74% parallel efficiency on 4096 processes (one GPU per MPI process, 3 GPUs per node of the TSUBAME-2.0 system). The FFT-based spectral method is able to achieve just 14% parallel efficiency on the same number of MPI processes (using only CPU cores), due to the all-to-all communication pattern of the FFT algorithm. The calculation time for one time step was 108 s for the vortex method and 154 s for the spectral method, under these conditions. Computing with 69 billion particles, this work exceeds by an order of magnitude the largest vortex-method calculations to date.
Numerical Solution of the Navier-Stokes Equations for Steady Magnetohydrodynamic Flow Between Two Parallel Porous Plates with an Angular Velocity

NASA Astrophysics Data System (ADS)

Delhi Babu, R.; Ganesh, S.

2018-04-01

The steady laminar flow of an electrically conducting, viscous, incompressible fluid between two parallel porous plates of a channel, in the presence of a transverse magnetic field and with an angular velocity, is examined for the case in which the fluid is withdrawn through both walls of the channel at the same rate. A numerical solution is obtained for various values of R (suction Reynolds number) using R-K Gill's method, and graphs of the dimensionless functions f ' and f have been drawn.
Wave Number Selection for Incompressible Parallel Jet Flows Periodic in Space

NASA Technical Reports Server (NTRS)

Miles, Jeffrey Hilton

1997-01-01

The temporal instability of a spatially periodic parallel flow of an incompressible inviscid fluid for various jet velocity profiles is studied numerically using Floquet Analysis. The transition matrix at the end of a period is evaluated by direct numerical integration. For verification, a method based on approximating a continuous function by a series of step functions was used. Unstable solutions were found only over a limited range of wave numbers and have a band type structure. The results obtained are analogous to the behavior observed in systems exhibiting complexity at the edge of order and chaos.
Efficiency Analysis of the Parallel Implementation of the SIMPLE Algorithm on Multiprocessor Computers

NASA Astrophysics Data System (ADS)

Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.

2017-12-01

This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
Scalability study of parallel spatial direct numerical simulation code on IBM SP1 parallel supercomputer

NASA Technical Reports Server (NTRS)

Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad

1994-01-01

The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
A parallel direct-forcing fictitious domain method for simulating microswimmers

NASA Astrophysics Data System (ADS)

Gao, Tong; Lin, Zhaowu

2017-11-01

We present a 3D parallel direct-forcing fictitious domain method for simulating swimming micro-organisms at small Reynolds numbers. We treat the motile micro-swimmers as spherical rigid particles using the "Squirmer" model. The particle dynamics are solved on moving Lagrangian meshes that overlay a fixed Eulerian mesh used for solving the fluid motion, and the momentum exchange between the two phases is resolved by distributing pseudo body-forces over the particle interior regions, which constrain the background fictitious fluid to follow the particle movement. While the solid and fluid subproblems are solved separately, no inner iterations are required to enforce numerical convergence. We demonstrate the accuracy and robustness of the method by comparing our results with existing analytical and numerical studies for various cases of single-particle dynamics and particle-particle interactions. We also perform a series of numerical explorations to obtain statistical and rheological measurements to characterize the dynamics and structures of Squirmer suspensions. NSF DMS 1619960.
Modulated heat pulse propagation and partial transport barriers in chaotic magnetic fields

DOE PAGES

del-Castillo-Negrete, Diego; Blazevski, Daniel

2016-04-01

Direct numerical simulations of the time dependent parallel heat transport equation modeling heat pulses driven by power modulation in 3-dimensional chaotic magnetic fields are presented. The numerical method is based on the Fourier formulation of a Lagrangian-Green's function method that provides an accurate and efficient technique for the solution of the parallel heat transport equation in the presence of harmonic power modulation. The numerical results presented provide conclusive evidence that even in the absence of magnetic flux surfaces, chaotic magnetic field configurations with intermediate levels of stochasticity exhibit transport barriers to modulated heat pulse propagation. In particular, high-order islands and remnants of destroyed flux surfaces (Cantori) act as partial barriers that slow down or even stop the propagation of heat waves at places where the magnetic field connection length exhibits a strong gradient. The key parameter is $\gamma=\sqrt{\omega/2\chi_\parallel}$ that determines the length scale, $1/\gamma$, of the heat wave penetration along the magnetic field line. For large perturbation frequencies, $\omega \gg 1$, or small parallel thermal conductivities, $\chi_\parallel \ll 1$, parallel heat transport is strongly damped and the magnetic field partial barriers act as robust barriers where the heat wave amplitude vanishes and its phase speed slows down to a halt. On the other hand, in the limit of small $\gamma$, parallel heat transport is largely unimpeded, global transport is observed and the radial amplitude and phase speed of the heat wave remain finite. Results on modulated heat pulse propagation in fully stochastic fields and across magnetic islands are also presented. In qualitative agreement with recent experiments in LHD and DIII-D, it is shown that the elliptic (O) and hyperbolic (X) points of magnetic islands have a direct impact on the spatio-temporal dependence of the amplitude and the time delay of modulated heat pulses.
Numerical simulation of h-adaptive immersed boundary method for freely falling disks

NASA Astrophysics Data System (ADS)

Zhang, Pan; Xia, Zhenhua; Cai, Qingdong

2018-05-01

In this work, a freely falling disk with aspect ratio 1/10 is directly simulated by using an adaptive numerical model implemented on a parallel computation framework JASMIN. The adaptive numerical model is a combination of the h-adaptive mesh refinement technique and the implicit immersed boundary method (IBM). Our numerical results agree well with the experimental results in all of the six degrees of freedom of the disk. Furthermore, very similar vortex structures observed in the experiment were also obtained.
A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows

NASA Technical Reports Server (NTRS)

Bui, Trong T.

1999-01-01

A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
Coordinate Systems, Numerical Objects and Algorithmic Operations of Computational Experiment in Fluid Mechanics

NASA Astrophysics Data System (ADS)

Degtyarev, Alexander; Khramushin, Vasily

2016-02-01

The paper deals with the computer implementation of direct computational experiments in fluid mechanics, constructed on the basis of the approach developed by the authors. The proposed approach allows the use of an explicit numerical scheme, which is an important condition for increasing the efficiency of the developed algorithms through numerical procedures with natural parallelism. The paper examines the main objects and operations that make it possible to manage computational experiments and monitor the status of the computation process. Special attention is given to a) realization of tensor representations of numerical schemes for direct simulation; b) realization of the representation of the motion of large particles of a continuous medium in two coordinate systems (global and mobile); c) computing operations in the projections of coordinate systems, and direct and inverse transformations in these systems. Particular attention is paid to the use of hardware and software of modern computer systems.

GRAVIDY, a GPU modular, parallel direct-summation N-body integrator: dynamics with softening

NASA Astrophysics Data System (ADS)

Maureira-Fredes, Cristián; Amaro-Seoane, Pau

2018-01-01

A wide variety of outstanding problems in astrophysics involve the motion of a large number of particles under the force of gravity. These include the global evolution of globular clusters, tidal disruptions of stars by a massive black hole, the formation of protoplanets and sources of gravitational radiation. The direct-summation of N gravitational forces is a complex problem with no analytical solution and can only be tackled with approximations and numerical methods. To this end, the Hermite scheme is a widely used integration method. With different numerical techniques and special-purpose hardware, it can be used to speed up the calculations. But these methods tend to be computationally slow and cumbersome to work with. We present a new graphics processing unit (GPU), direct-summation N-body integrator written from scratch and based on this scheme, which includes relativistic corrections for sources of gravitational radiation. GRAVIDY has high modularity, allowing users to readily introduce new physics, it exploits available computational resources and will be maintained by regular updates. GRAVIDY can be used in parallel on multiple CPUs and GPUs, with a considerable speed-up benefit. The single-GPU version is between one and two orders of magnitude faster than the single-CPU version. A test run using four GPUs in parallel shows a speed-up factor of about 3 as compared to the single-GPU version. The conception and design of this first release is aimed at users with access to traditional parallel CPU clusters or computational nodes with one or a few GPU cards.
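The O(N^2) cost of direct summation comes from the double loop over particle pairs, which the sketch below makes explicit. It is a generic illustration and not code from GRAVIDY: it computes only softened accelerations (a Hermite integrator would also need their time derivatives), it assumes Plummer-type softening, and the names `accelerations`, `softening`, and `G` are hypothetical.

```python
def accelerations(positions, masses, softening=1e-3, G=1.0):
    """Direct O(N^2) summation of softened gravitational accelerations.

    positions is a list of [x, y, z] lists and masses a list of floats.
    Plummer-type softening is assumed: r^2 is replaced by r^2 + eps^2,
    which keeps the force finite during close encounters.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    eps2 = softening * softening
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Separation vector pointing from particle i to particle j.
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + eps2
            inv_r3 = 1.0 / (r2 * r2 ** 0.5)
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc
```

Each value of the outer index `i` is independent of the others, which is the property that lets this kind of force loop be distributed across GPU threads or CPU cores.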
Using OpenMP vs. Threading Building Blocks for Medical Imaging on Multi-cores

NASA Astrophysics Data System (ADS)

Kegel, Philipp; Schellmann, Maraike; Gorlatch, Sergei

We compare two parallel programming approaches for multi-core systems: the well-known OpenMP and the recently introduced Threading Building Blocks (TBB) library by Intel®. The comparison is made using the parallelization of a real-world numerical algorithm for medical imaging. We develop several parallel implementations, and compare them w.r.t. programming effort, programming style and abstraction, and runtime performance. We show that TBB requires a considerable program re-design, whereas with OpenMP simple compiler directives are sufficient. While TBB appears to be less appropriate for parallelizing existing implementations, it fosters a good programming style and higher abstraction level for newly developed parallel programs. Our experimental measurements on a dual quad-core system demonstrate that OpenMP slightly outperforms TBB in our implementation.
Hybrid parallelization of the XTOR-2F code for the simulation of two-fluid MHD instabilities in tokamaks

NASA Astrophysics Data System (ADS)

Marx, Alain; Lütjens, Hinrich

2017-03-01

A hybrid MPI/OpenMP parallel version of the XTOR-2F code [Lütjens and Luciani, J. Comput. Phys. 229 (2010) 8130] solving the two-fluid MHD equations in full tokamak geometry by means of an iterative Newton-Krylov matrix-free method has been developed. The present work shows that the code has been parallelized significantly despite the numerical profile of the problem solved by XTOR-2F, i.e. a discretization with pseudo-spectral representations in all angular directions, the stiffness of the two-fluid stability problem in tokamaks, and the use of a direct LU decomposition to invert the physical pre-conditioner at every Krylov iteration of the solver. The execution time of the parallelized version is an order of magnitude smaller than that of the sequential one for low-resolution cases, with an increasing speedup when the discretization mesh is refined. Moreover, it allows simulations to be performed at higher resolutions that were previously out of reach because of memory limitations.
Parallelized implicit propagators for the finite-difference Schrödinger equation

NASA Astrophysics Data System (ADS)

Parker, Jonathan; Taylor, K. T.

1995-08-01

We describe the application of block Gauss-Seidel and block Jacobi iterative methods to the design of implicit propagators for finite-difference models of the time-dependent Schrödinger equation. The block-wise iterative methods discussed here are mixed direct-iterative methods for solving simultaneous equations, in the sense that direct methods (e.g. LU decomposition) are used to invert certain block sub-matrices, and iterative methods are used to complete the solution. We describe parallel variants of the basic algorithm that are well suited to the medium- to coarse-grained parallelism of work-station clusters, and MIMD supercomputers, and we show that under a wide range of conditions, fine-grained parallelism of the computation can be achieved. Numerical tests are conducted on a typical one-electron atom Hamiltonian. The methods converge robustly to machine precision (15 significant figures), in some cases in as few as 6 or 7 iterations. The rate of convergence is nearly independent of the finite-difference grid-point separations.
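The "mixed direct-iterative" idea in this abstract can be sketched with a small block Jacobi solver: each diagonal block is solved directly, and the coupling between blocks is handled by iteration. This is a generic illustration on a real dense matrix, not the authors' propagator; Gaussian elimination with partial pivoting stands in for the LU decomposition, and the names `solve_dense`, `block_jacobi`, `block_size`, and `iters` are hypothetical.

```python
def solve_dense(a, b):
    """Direct solve of a small dense system by Gaussian elimination
    with partial pivoting (a stand-in for an LU factorization)."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def block_jacobi(a, b, block_size, iters=50):
    """Block Jacobi iteration for a x = b.

    Diagonal blocks are inverted directly; off-block couplings are
    evaluated with the previous iterate, so all block solves within
    one sweep are independent and could run in parallel.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(iters):
        x_new = [0.0] * n
        for s in range(0, n, block_size):
            e = min(s + block_size, n)
            # Move off-block terms to the right-hand side using the old x.
            rhs = [b[i] - sum(a[i][j] * x[j]
                              for j in range(n) if not s <= j < e)
                   for i in range(s, e)]
            block = [a[i][s:e] for i in range(s, e)]
            x_new[s:e] = solve_dense(block, rhs)
        x = x_new
    return x
```

A block Gauss-Seidel variant would use the already updated entries of `x_new` when forming `rhs`, which usually converges faster but makes the block solves within a sweep sequential.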
Modulated heat pulse propagation and partial transport barriers in chaotic magnetic fields

DOE Office of Scientific and Technical Information (OSTI.GOV)

Castillo-Negrete, Diego del; Blazevski, Daniel

2016-04-15

Direct numerical simulations of the time dependent parallel heat transport equation modeling heat pulses driven by power modulation in three-dimensional chaotic magnetic fields are presented. The numerical method is based on the Fourier formulation of a Lagrangian-Green's function method that provides an accurate and efficient technique for the solution of the parallel heat transport equation in the presence of harmonic power modulation. The numerical results presented provide conclusive evidence that even in the absence of magnetic flux surfaces, chaotic magnetic field configurations with intermediate levels of stochasticity exhibit transport barriers to modulated heat pulse propagation. In particular, high-order islands and remnants of destroyed flux surfaces (Cantori) act as partial barriers that slow down or even stop the propagation of heat waves at places where the magnetic field connection length exhibits a strong gradient. Results on modulated heat pulse propagation in fully stochastic fields and across magnetic islands are also presented. In qualitative agreement with recent experiments in the Large Helical Device and DIII-D, it is shown that the elliptic (O) and hyperbolic (X) points of magnetic islands have a direct impact on the spatio-temporal dependence of the amplitude of modulated heat pulses.
Weibel instability for a streaming electron, counterstreaming e-e, and e-p plasmas with intrinsic temperature anisotropy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ghorbanalilu, M.; Physics Department, Azarbaijan Shahid Madani University, Tabriz; Sadegzadeh, S.

2014-05-15

The existence of the Weibel instability for a streaming electron, counterstreaming electron-electron (e-e), and electron-positron (e-p) plasmas with intrinsic temperature anisotropy is investigated. The temperature anisotropy is included in the directions perpendicular and parallel to the streaming direction. It is shown that the beam mean speed changes the instability mode, for a streaming electron beam, from the classic Weibel to the Weibel-like mode. The analytical and numerical solutions confirm that Weibel-like modes are excited for both counterstreaming e-e and e-p plasmas. The growth rates of the instabilities in e-e and e-p plasmas are compared. The growth rate is larger for e-p plasmas if the thermal anisotropy is small, and the opposite is true for large thermal anisotropies. The analytical and numerical solutions are in good agreement only in the small parallel temperature and wave number limits, when the instability growth rate increases linearly with the normalized wave number kc/ω_p.
An Investigation of the Flow Physics of Acoustic Liners by Direct Numerical Simulation

NASA Technical Reports Server (NTRS)

Watson, Willie R. (Technical Monitor); Tam, Christopher

2004-01-01

This report describes the effort and status of work on the three-dimensional (3-D) simulation of a multi-hole resonator in an impedance tube. This work is coordinated with a parallel experimental effort to be carried out at the NASA Langley Research Center. The outline of this report is as follows: 1. Preliminary consideration. 2. Computation model. 3. Mesh design and parallel computing. 4. Visualization. 5. Status of computer code development.
A parallel direct numerical simulation of dust particles in a turbulent flow

NASA Astrophysics Data System (ADS)

Nguyen, H. V.; Yokota, R.; Stenchikov, G.; Kocurek, G.

2012-04-01

Due to their effects on radiation transport, aerosols play an important role in the global climate. Mineral dust aerosol is a predominant natural aerosol in the desert and semi-desert regions of the Middle East and North Africa (MENA). The Arabian Peninsula is one of the three predominant source regions on the planet "exporting" dust to almost the entire world. Mineral dust aerosols make up about 50% of the tropospheric aerosol mass and therefore produce a significant impact on the Earth's climate and the atmospheric environment, especially in the MENA region, which is characterized by frequent dust storms and large aerosol generation. Understanding the mechanisms of dust emission, transport and deposition is therefore essential for correctly representing dust in numerical climate prediction. In this study we present results of numerical simulations of dust particles in a turbulent flow to study the interaction between dust and the atmosphere. Homogeneous and passive dust particles in the boundary layers are entrained and advected under the influence of a turbulent flow. Currently no interactions between particles are included. Turbulence is resolved through direct numerical simulation using a parallel incompressible Navier-Stokes flow solver. Model output provides information on particle trajectories, turbulent transport of dust and effects of gravity on dust motion, which will be compared with wind tunnel experiments at the University of Texas at Austin. Results of parallel efficiency and scalability tests are provided. Future versions of the model will include air-particle momentum exchanges, varying particle sizes and saltation effects. The results will be used for interpreting wind tunnel and field experiments and for improving dust generation parameterizations in meteorological models.
LDRD final report on massively-parallel linear programming : the parPCx system.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

2005-02-01

This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project "Massively-Parallel Linear Programming". We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer). We conclude with directions for long-term future algorithmic research and for near-term development that could improve the performance of parPCx.
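For readers unfamiliar with the terms in the abstract above, one common standard form of an LP and the sparse system that interior-point methods typically solve at each iteration can be written as follows. The symbols (constraint matrix A, primal variables x, dual slacks s, right-hand side r) are generic textbook notation and an assumption here; the abstract does not state which formulation parPCx assembles.

```latex
% Standard-form linear program
\min_{x \in \mathbb{R}^n} \; c^{T} x
\quad \text{subject to} \quad A x = b, \quad x \ge 0 .

% Normal-equations form of the interior-point step
\left( A D A^{T} \right) \Delta y = r,
\qquad D = \operatorname{diag}\!\left( x_i / s_i \right) .
```

The sparsity pattern of A D A^T is inherited from the constraint matrix A, which is why the distribution of A across processors matters for both direct and iterative solvers.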
Direct numerical simulation of instabilities in parallel flow with spherical roughness elements

NASA Technical Reports Server (NTRS)

Deanna, R. G.

1992-01-01

Results from a direct numerical simulation of laminar flow over a flat surface with spherical roughness elements using a spectral-element method are given. The numerical simulation approximates roughness as a cellular pattern of identical spheres protruding from a smooth wall. Periodic boundary conditions on the domain's horizontal faces simulate an infinite array of roughness elements extending in the streamwise and spanwise directions, which implies the parallel-flow assumption, and results in a closed domain. A body force, designed to yield the horizontal Blasius velocity in the absence of roughness, sustains the flow. Instabilities above a critical Reynolds number reveal negligible oscillations in the recirculation regions behind each sphere and in the free stream, high-amplitude oscillations in the layer directly above the spheres, and a mean profile with an inflection point near the sphere's crest. The inflection point yields an unstable layer above the roughness (where U''(y) is less than 0) and a stable region within the roughness (where U''(y) is greater than 0). Evidently, the instability begins when the low-momentum or wake region behind an element, being the region most affected by disturbances (purely numerical in this case), goes unstable and moves. In incompressible flow with periodic boundaries, this motion sends disturbances to all regions of the domain. In the unstable layer just above the inflection point, the disturbances grow while being carried downstream with a propagation speed equal to the local mean velocity; they do not grow amid the low energy region near the roughness patch. The most amplified disturbance eventually arrives at the next roughness element downstream, perturbing its wake and inducing a global response at a frequency governed by the streamwise spacing between spheres and the mean velocity of the most amplified layer.
A parallel variable metric optimization algorithm

NASA Technical Reports Server (NTRS)

Straeter, T. A.

1973-01-01

An algorithm designed to exploit the parallel computing or vector streaming (pipeline) capabilities of computers is presented. When p is the degree of parallelism, one cycle of the parallel variable metric algorithm is defined as follows: first, the function and its gradient are computed in parallel at p different values of the independent variable; then the metric is modified by p rank-one corrections; and finally, a single univariate minimization is carried out in the Newton-like direction. Several properties of this algorithm are established. The convergence of the iterates to the solution is proved for a quadratic functional on a real separable Hilbert space. For a finite-dimensional space the convergence is in one cycle when p equals the dimension of the space. Results of numerical experiments indicate that the new algorithm will exploit parallel or pipeline computing capabilities to effect faster convergence than serial techniques.
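The cycle described above (p parallel gradient evaluations, p rank-one corrections, one line search) can be sketched as follows. The symmetric rank-one (SR1) update formula, the choice of trial steps, the thread pool, and the toy quadratic are all assumptions made for illustration; the abstract does not specify the exact correction formula or how the p evaluation points are chosen.

```python
from concurrent.futures import ThreadPoolExecutor

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(M, v):
    return [dot(row, v) for row in M]

def grad(x):
    """Gradient of a hypothetical quadratic f(x) = 0.5 x^T A x - b^T x."""
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    return [gi - bi for gi, bi in zip(matvec(A, x), b)]

def parallel_vm_cycle(gradient, x0, H, steps):
    """One cycle: p gradient evaluations, p rank-one updates, one line search.

    H approximates the inverse Hessian; `steps` holds the p trial offsets.
    """
    g0 = gradient(x0)
    points = [[xi + si for xi, si in zip(x0, s)] for s in steps]
    # Threads stand in for the p parallel processors of the abstract.
    with ThreadPoolExecutor() as pool:
        grads = list(pool.map(gradient, points))
    n = len(x0)
    for s, g in zip(steps, grads):
        y = [gi - g0i for gi, g0i in zip(g, g0)]
        r = [si - hi for si, hi in zip(s, matvec(H, y))]
        denom = dot(r, y)
        if abs(denom) > 1e-12:  # skip a correction that is ill-defined
            H = [[H[i][j] + r[i] * r[j] / denom for j in range(n)]
                 for i in range(n)]
    d = [-v for v in matvec(H, g0)]  # Newton-like direction
    # Exact line search for a quadratic: the curvature d^T A d is
    # obtained from a gradient difference along d.
    g_d = gradient([xi + di for xi, di in zip(x0, d)])
    curvature = dot(d, [a - b for a, b in zip(g_d, g0)])
    if curvature == 0.0:
        return x0, H
    alpha = -dot(g0, d) / curvature
    return [xi + alpha * di for xi, di in zip(x0, d)], H
```

With p equal to the dimension (two unit steps for the two-variable quadratic), a single cycle from the origin lands on the minimizer, which matches the one-cycle convergence property stated in the abstract for finite-dimensional spaces.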
Numerical simulation of phenomenon on zonal disintegration in deep underground mining in case of unsupported roadway

NASA Astrophysics Data System (ADS)

Han, Fengshan; Wu, Xinli; Li, Xia; Zhu, Dekang

2018-02-01

Zonal disintegration has been observed in the rock surrounding deep mining roadways. It seriously affects the safety of mining and underground engineering and may lead to natural disasters. In the rock mass surrounding deep mining roadways, the horizontal tectonic stress is much greater than the vertical stress, and the main cause of zonal disintegration is a maximum principal stress directed parallel to the axis of the roadway. ABAQUS software was used to simulate systematically the rupture formation process in a three-dimensional model of the roadway. The study shows that when the maximum principal stress in deep underground mining acts along the roadway axis, the numerical simulation successfully reproduces both the zonal disintegration phenomenon and the formation process of damage in the surrounding rock, which is of practical engineering significance.
Multi-dimensional high order essentially non-oscillatory finite difference methods in generalized coordinates

NASA Technical Reports Server (NTRS)

Shu, Chi-Wang

1992-01-01

The nonlinear stability of compact schemes for shock calculations is investigated. In recent years compact schemes were used in various numerical simulations including direct numerical simulation of turbulence. However to apply them to problems containing shocks, one has to resolve the problem of spurious numerical oscillation and nonlinear instability. A framework to apply nonlinear limiting to a local mean is introduced. The resulting scheme can be proven total variation (1D) or maximum norm (multi D) stable and produces nice numerical results in the test cases. The result is summarized in the preprint entitled 'Nonlinearly Stable Compact Schemes for Shock Calculations', which was submitted to SIAM Journal on Numerical Analysis. Research was continued on issues related to two and three dimensional essentially non-oscillatory (ENO) schemes. The main research topics include: parallel implementation of ENO schemes on Connection Machines; boundary conditions; shock interaction with hydrogen bubbles, a preparation for the full combustion simulation; and direct numerical simulation of compressible sheared turbulence.
Numerical study of the stress-strain state of reinforced plate on an elastic foundation by the Bubnov-Galerkin method

NASA Astrophysics Data System (ADS)

Beskopylny, Alexey; Kadomtseva, Elena; Strelnikov, Grigory

2017-10-01

The stress-strain state of a rectangular slab resting on an elastic foundation is considered. The slab material is isotropic. The slab has stiffening ribs directed parallel to both sides of the plate. Governing equations are obtained for determining the deflection for various mechanical and geometric characteristics of the stiffening ribs, which are parallel to different sides of the plate and have different bending and torsional rigidities. The calculation scheme assumes an orthotropic slab with different cylindrical stiffnesses in two mutually perpendicular directions parallel to the reinforcing ribs. The elastic foundation is described by the Winkler model. The Bubnov-Galerkin method is used to determine the deflection, which is taken in the form of a series expansion with unknown coefficients in special polynomials that are combinations of Legendre polynomials.
Development and parallelization of a direct numerical simulation to study the formation and transport of nanoparticle clusters in a viscous fluid

NASA Astrophysics Data System (ADS)

Sloan, Gregory James

The direct numerical simulation (DNS) offers the most accurate approach to modeling the behavior of a physical system, but carries an enormous computational cost. There exists a need for an accurate DNS to model the coupled solid-fluid system seen in targeted drug delivery (TDD), nanofluid thermal energy storage (TES), and other fields where experiments are necessary but experiment design may be costly. A parallel DNS can greatly reduce the large computation times required, while providing the same results and functionality as its serial counterpart. A D2Q9 lattice Boltzmann method approach was implemented to solve the fluid phase. The use of domain decomposition with message passing interface (MPI) parallelism resulted in an algorithm that exhibits super-linear scaling in testing, which may be attributed to the caching effect. Decreased performance on a per-node basis for a fixed number of processes confirms this observation. A multiscale approach was implemented to model the behavior of nanoparticles submerged in a viscous fluid, and used to examine the mechanisms that promote or inhibit clustering. Parallelization of this model using a master-worker algorithm with MPI gives less-than-linear speedup for a fixed number of particles and a varying number of processes. This is due to the inherent inefficiency of the master-worker approach. Lastly, these separate simulations are combined, and two-way coupling is implemented between the solid and fluid.
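As a brief illustration of the D2Q9 lattice named in the abstract above, the sketch below lists the nine discrete velocities with their standard weights and evaluates the usual second-order equilibrium distribution. The equilibrium form and the lattice units (speed of sound squared equal to 1/3) are the textbook choices and are assumed here; the thesis's collision model and parallel decomposition are not reproduced.

```python
# D2Q9: one rest velocity, four axis-aligned and four diagonal velocities.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations for density rho and velocity (ux, uy)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(VELOCITIES, WEIGHTS):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq

def moments(f):
    """Recover density and velocity from the nine populations at one node."""
    rho = sum(f)
    ux = sum(fi * cx for fi, (cx, _) in zip(f, VELOCITIES)) / rho
    uy = sum(fi * cy for fi, (_, cy) in zip(f, VELOCITIES)) / rho
    return rho, ux, uy

# Example call: the moments of the equilibrium return the inputs.
rho, ux, uy = moments(equilibrium(1.2, 0.05, -0.02))
```

Because collision is local to each node and streaming only touches nearest neighbours, a domain decomposition needs to exchange just one layer of boundary populations per step, which is what makes the method a natural fit for MPI.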
Stress orientation and fracturing during three-dimensional buckling: Numerical simulation and application to chocolate-tablet structures in folded turbidites, SW Portugal

NASA Astrophysics Data System (ADS)

Reber, J. E.; Schmalholz, S. M.; Burg, J.-P.

2010-10-01

Two orthogonal sets of veins, both orthogonal to bedding, form chocolate tablet structures on the limbs of folded quartzwackes of Carboniferous turbidites in SW Portugal. Structural observations suggest that (1) mode 1 fractures transverse to the fold axes formed while fold amplitudes were small and limbs were under layer-subparallel compression and (2) mode 1 fractures parallel to the fold axes formed while fold amplitudes were large and limbs were brought to be under layer-subparallel tension. We performed two- and three-dimensional numerical simulations investigating the evolution of stress orientations during viscous folding to test whether and how these two successive sets of fractures were related to folding. We employed ellipses and ellipsoids for the visualization and quantification of the local stress field. The numerical simulations show a change in the orientation of the local σ1 direction by almost 90° with respect to the bedding plane in the fold limbs. The coeval σ3 direction rotates from parallel to the fold axis at low fold amplitudes to orthogonal to the fold axis at high fold amplitudes. The stress orientation changes faster in multilayers than in single-layers. The numerical simulations are consistent with observation and provide a mechanical interpretation for the formation of the chocolate tablet structures through consecutive sets of fractures on rotating limbs of folded competent layers.
Experimental and Numerical Study on the Strength of Aluminum Extrusion Welding.

PubMed

Bingöl, Sedat; Bozacı, Atilla

2015-07-17

The quality of extrusion welding in extruded hollow shapes is influenced significantly by the pressure and effective stress under which the material is joined inside the welding chamber. However, extrusion welding was not accounted for in the past by the developers of finite element software packages. In this study, the strength of a hollow extrusion profile with a seam weld produced at different ram speeds was investigated experimentally and numerically. The experiments were performed on an extruded hollow aluminum profile from which tensile test specimens could be taken from the seam weld region both parallel and perpendicular to the extrusion direction. A new numerical modeling approach, recently proposed in the literature, was used for the numerical analyses. The simulation results obtained at different ram speeds were compared with the experimental results, and good agreement was obtained.
Parallel gene analysis with allele-specific padlock probes and tag microarrays

PubMed Central

Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats

2003-01-01

Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
High order parallel numerical schemes for solving incompressible flows

NASA Technical Reports Server (NTRS)

Lin, Avi; Milner, Edward J.; Liou, May-Fun; Belch, Richard A.

1992-01-01

The use of parallel computers for numerically solving flow fields has gained much importance in recent years. This paper introduces a new high order numerical scheme for computational fluid dynamics (CFD) specifically designed for parallel computational environments. A distributed MIMD system gives the flexibility of treating different elements of the governing equations with totally different numerical schemes in different regions of the flow field. The parallel decomposition of the governing operator to be solved is the primary parallel split. The primary parallel split was studied using a hypercube like architecture having clusters of shared memory processors at each node. The approach is demonstrated using examples of simple steady state incompressible flows. Future studies should investigate the secondary split because, depending on the numerical scheme that each of the processors applies and the nature of the flow in the specific subdomain, it may be possible for a processor to seek better, or higher order, schemes for its particular subcase.
Parallel Domain Decomposition Formulation and Software for Large-Scale Sparse Symmetrical/Unsymmetrical Aeroacoustic Applications

NASA Technical Reports Server (NTRS)

Nguyen, D. T.; Watson, Willie R. (Technical Monitor)

2005-01-01

The overall objectives of this research work are to formulate and validate efficient parallel algorithms, and to efficiently design/implement computer software for solving large-scale acoustic problems arising from the unified frameworks of the finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should take full advantage of the multiple processing capabilities offered by most modern high performance computing platforms for efficient parallel computation. To achieve this objective, the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper preconditioning strategies, unrolling strategies, and effective inter-processor communication schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving a series of structural and acoustic (symmetrical and unsymmetrical) problems on different computing platforms. Comparisons with existing "commercialized" and/or "public domain" software are also included, whenever possible.

Asymptotic-preserving Lagrangian approach for modeling anisotropic transport in magnetized plasmas

NASA Astrophysics Data System (ADS)

Chacon, Luis; Del-Castillo-Negrete, Diego

2012-03-01

Modeling electron transport in magnetized plasmas is extremely challenging due to the extreme anisotropy between parallel (to the magnetic field) and perpendicular directions (the transport-coefficient ratio χ∥/χ⊥ ~ 10^10 in fusion plasmas). Recently, a novel Lagrangian Green's function method has been proposed [D. del-Castillo-Negrete, L. Chacón, PRL, 106, 195004 (2011); D. del-Castillo-Negrete, L. Chacón, Phys. Plasmas, submitted (2011)] to solve the local and non-local purely parallel transport equation in general 3D magnetic fields. The approach avoids numerical pollution, is inherently positivity-preserving, and is scalable algorithmically (i.e., work per degree-of-freedom is grid-independent). In this poster, we discuss the extension of the Lagrangian Green's function approach to include perpendicular transport terms and sources. We present an asymptotic-preserving numerical formulation, which ensures a consistent numerical discretization temporally and spatially for arbitrary χ∥/χ⊥ ratios. We will demonstrate the potential of the approach with various challenging configurations, including the case of transport across a magnetic island in cylindrical geometry.
Magnus-induced ratchet effects for skyrmions interacting with asymmetric substrates

NASA Astrophysics Data System (ADS)

Reichhardt, C.; Ray, D.; Olson Reichhardt, C. J.

2015-07-01

We show using numerical simulations that pronounced ratchet effects can occur for ac driven skyrmions moving over asymmetric quasi-one-dimensional substrates. We find a new type of ratchet effect called a Magnus-induced transverse ratchet that arises when the ac driving force is applied perpendicular rather than parallel to the asymmetry direction of the substrate. This transverse ratchet effect only occurs when the Magnus term is finite, and the threshold ac amplitude needed to induce it decreases as the Magnus term becomes more prominent. Ratcheting skyrmions follow ordered orbits in which the net displacement parallel to the substrate asymmetry direction is quantized. Skyrmion ratchets represent a new ac current-based method for controlling skyrmion positions and motion for spintronic applications.
Poiseuille, thermal transpiration and Couette flows of a rarefied gas between plane parallel walls with nonuniform surface properties in the transverse direction and their reciprocity relations

NASA Astrophysics Data System (ADS)

Doi, Toshiyuki

2018-04-01

Slow flows of a rarefied gas between two plane parallel walls with nonuniform surface properties are studied based on kinetic theory. It is assumed that one wall is a diffuse reflection boundary and the other wall is a Maxwell-type boundary whose accommodation coefficient varies periodically in the direction perpendicular to the flow. The time-independent Poiseuille, thermal transpiration and Couette flows are considered. The flow behavior is numerically studied based on the linearized Bhatnagar-Gross-Krook-Welander model of the Boltzmann equation. The flow field, the mass and heat flow rates in the gas, and the tangential force acting on the wall surface are studied over a wide range of the gas rarefaction degree and the parameters characterizing the distribution of the accommodation coefficient. The locally convex velocity distribution is observed in Couette flow of a highly rarefied gas, similarly to Poiseuille flow and thermal transpiration. The reciprocity relations are numerically confirmed over a wide range of the flow parameters.
Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

NASA Technical Reports Server (NTRS)

Morgan, Philip E.

2004-01-01

This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) so that it can effectively model antenna problems; apply lessons learned in the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Direct Numerical Simulation of Turbulent Flow Over Complex Bathymetry

NASA Astrophysics Data System (ADS)

Yue, L.; Hsu, T. J.

2017-12-01

Direct numerical simulation (DNS) is regarded as a powerful tool for investigating turbulent flows, which feature a wide range of temporal and spatial scales. With the application of a coordinate transformation in a pseudo-spectral scheme, a parallelized numerical modeling system was created to simulate flow over complex bathymetry with high numerical accuracy and efficiency. The transformed governing equations were integrated in time using a third-order low-storage Runge-Kutta method. For spatial discretization, the discrete Fourier expansion was adopted in the streamwise and spanwise directions, enforcing the periodic boundary condition in both directions. The Chebyshev expansion on Chebyshev-Gauss-Lobatto points was used in the wall-normal direction, assuming no-slip conditions on the top and bottom walls. The diffusion terms were discretized with a Crank-Nicolson scheme, while the advection terms, dealiased with the 2/3 rule, were discretized with an Adams-Bashforth scheme. In the prediction step, the velocity was calculated in the physical domain by solving the resulting linear equation directly. However, the extra terms introduced by the coordinate transformation impose a strict limitation on the time step, and an iteration method was applied to overcome this restriction in the correction step for pressure by solving the Helmholtz equation. The numerical solver is written in object-oriented C++ and uses the Armadillo linear algebra library for matrix computation. Several benchmark cases in laminar and turbulent flow were carried out to verify and validate the numerical model, and very good agreement was achieved. Ongoing work focuses on implementing sediment transport capability for multiple sediment classes and parameterizations for flocculation processes.
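Two ingredients of the discretization described above are easy to illustrate in isolation: the Chebyshev-Gauss-Lobatto collocation points and the 2/3-rule mask applied to Fourier coefficients. The sketch below is in Python rather than the solver's C++, and it assumes FFT-style ordering of wavenumbers and the convention that modes with 3|k| >= N are zeroed; the actual code may use a different layout or cutoff convention.

```python
import math

def chebyshev_gauss_lobatto(n):
    """Return the n + 1 points x_j = cos(pi * j / n), clustered near the walls."""
    return [math.cos(math.pi * j / n) for j in range(n + 1)]

def wavenumber(index, n):
    """Signed wavenumber for FFT-style ordering: 0, 1, ..., then negatives."""
    return index if index < n // 2 else index - n

def dealias_two_thirds(coeffs):
    """Zero the upper third of Fourier modes (those with 3|k| >= N).

    Truncating before forming quadratic products keeps aliased
    contributions from folding back onto the retained modes.
    """
    n = len(coeffs)
    return [c if 3 * abs(wavenumber(i, n)) < n else 0.0
            for i, c in enumerate(coeffs)]

# Example call: with N = 12 only the modes |k| <= 3 survive.
filtered = dealias_two_thirds([1.0] * 12)
points = chebyshev_gauss_lobatto(4)
```

The clustering of the Chebyshev-Gauss-Lobatto points toward the endpoints is what gives fine resolution near the top and bottom walls without refining the interior.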
Tensor methodology and computational geometry in direct computational experiments in fluid mechanics

NASA Astrophysics Data System (ADS)

Degtyarev, Alexander; Khramushin, Vasily; Shichkina, Julia

2017-07-01

The paper considers a generalized functional and algorithmic construction of direct computational experiments in fluid dynamics. The notation of tensor mathematics is naturally embedded in the finite-element operations used to construct numerical schemes. A large fluid particle, which has a finite size, its own weight, and internal displacement and deformation, is considered as an elementary computing object. The tensor representation of computational objects gives a straightforward, linear and unique approximation of elementary volumes and of the fluid particles inside them. The proposed approach allows the use of explicit numerical schemes, which is an important condition for increasing the efficiency of the developed algorithms through numerical procedures with natural parallelism. It is shown that the advantages of the proposed approach are achieved, among other things, by representing the motion of large particles of a continuous medium in dual coordinate systems and by computing in the projections of these two coordinate systems with direct and inverse transformations. Thus a new method for the mathematical representation and synthesis of computational experiments, based on the large-particle method, is proposed.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bennett, Janine Camille; Thompson, David; Pebay, Philippe Pierre

Statistical analysis is typically used to reduce the dimensionality of and infer meaning from data. A key challenge of any statistical analysis package aimed at large-scale, distributed data is to address the orthogonal issues of parallel scalability and numerical stability. Many statistical techniques, e.g., descriptive statistics or principal component analysis, are based on moments and co-moments and, using robust online update formulas, can be computed in an embarrassingly parallel manner, amenable to a map-reduce style implementation. In this paper we focus on contingency tables, through which numerous derived statistics such as joint and marginal probability, point-wise mutual information, information entropy, and χ² independence statistics can be directly obtained. However, contingency tables can become large as data size increases, requiring a correspondingly large amount of communication between processors. This potential increase in communication prevents optimal parallel speedup and is the main difference with moment-based statistics (which we discussed in [1]) where the amount of inter-processor communication is independent of data size. Here we present the design trade-offs which we made to implement the computation of contingency tables in parallel. We also study the parallel speedup and scalability properties of our open source implementation. In particular, we observe optimal speed-up and scalability when the contingency statistics are used in their appropriate context, namely, when the data input is not quasi-diffuse.
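The map-reduce pattern for contingency tables described above can be sketched in a few lines: each process counts pairs in its own chunk, the partial tables are merged by adding counts, and derived statistics are read off the merged table. The function names and the in-memory `Counter` representation are illustrative assumptions and are not taken from the authors' open source implementation.

```python
import math
from collections import Counter
from functools import reduce

def local_table(pairs):
    """Map step: count (x, y) occurrences in one process's data chunk."""
    return Counter(pairs)

def merge_tables(a, b):
    """Reduce step: partial tables combine by adding their counts."""
    return a + b

def pointwise_mutual_information(table):
    """PMI(x, y) = log( p(x, y) / (p(x) p(y)) ) for every observed pair."""
    n = sum(table.values())
    count_x, count_y = Counter(), Counter()
    for (x, y), c in table.items():
        count_x[x] += c
        count_y[y] += c
    return {(x, y): math.log(c * n / (count_x[x] * count_y[y]))
            for (x, y), c in table.items()}

# Example call with two hypothetical chunks held by two processes.
chunks = [[("a", 0), ("a", 0), ("b", 1)], [("b", 1), ("a", 1)]]
table = reduce(merge_tables, map(local_table, chunks))
pmi = pointwise_mutual_information(table)
```

The merged table has one entry per distinct pair, so the volume exchanged in the reduce step grows with the number of distinct pairs in the data. This is the communication cost the abstract contrasts with moment-based statistics, whose partial results have a fixed size.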
Gust Acoustics Computation with a Space-Time CE/SE Parallel 3D Solver

NASA Technical Reports Server (NTRS)

Wang, X. Y.; Himansu, A.; Chang, S. C.; Jorgenson, P. C. E.; Reddy, D. R. (Technical Monitor)

2002-01-01

The benchmark Problem 2 in Category 3 of the Third Computational Aero-Acoustics (CAA) Workshop is solved using the space-time conservation element and solution element (CE/SE) method. This problem concerns the unsteady response of an isolated finite-span swept flat-plate airfoil bounded by two parallel walls to an incident gust. The acoustic field generated by the interaction of the gust with the flat-plate airfoil is computed by solving the 3D (three-dimensional) Euler equations in the time domain using a parallel version of a 3D CE/SE solver. The effect of the gust orientation on the far-field directivity is studied. Numerical solutions are presented and compared with analytical solutions, showing a reasonable agreement.
Measures of three-dimensional anisotropy and intermittency in strong Alfvénic turbulence

NASA Astrophysics Data System (ADS)

Mallet, A.; Schekochihin, A. A.; Chandran, B. D. G.; Chen, C. H. K.; Horbury, T. S.; Wicks, R. T.; Greenan, C. C.

2016-06-01

We measure the local anisotropy of numerically simulated strong Alfvénic turbulence with respect to two local, physically relevant directions: along the local mean magnetic field and along the local direction of one of the fluctuating Elsasser fields. We find significant scaling anisotropy with respect to both these directions: the fluctuations are 'ribbon-like', that is, statistically they are elongated along both the mean magnetic field and the fluctuating field. The latter form of anisotropy is due to scale-dependent alignment of the fluctuating fields. The intermittent scalings of the nth-order conditional structure functions in the direction perpendicular to both the local mean field and the fluctuations agree well with the theory of Chandran, Schekochihin & Mallet, while the parallel scalings are consistent with those implied by the critical-balance conjecture. We quantify the relationship between the perpendicular scalings and those in the fluctuation and parallel directions, and find that the scaling exponent of the perpendicular anisotropy (i.e. of the aspect ratio of the Alfvénic structures in the plane perpendicular to the mean magnetic field) depends on the amplitude of the fluctuations. This is shown to be equivalent to the anticorrelation of fluctuation amplitude and alignment at each scale. The dependence of the anisotropy on amplitude is shown to be more significant for the anisotropy between the perpendicular and fluctuation-direction scales than it is between the perpendicular and parallel scales.
Magnus-induced ratchet effects for skyrmions interacting with asymmetric substrates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reichhardt, C.; Ray, D.; Reichhardt, C. J. Olson

2015-07-31

We show using numerical simulations that pronounced ratchet effects can occur for ac driven skyrmions moving over asymmetric quasi-one-dimensional substrates. We find a new type of ratchet effect called a Magnus-induced transverse ratchet that arises when the ac driving force is applied perpendicular rather than parallel to the asymmetry direction of the substrate. This transverse ratchet effect only occurs when the Magnus term is finite, and the threshold ac amplitude needed to induce it decreases as the Magnus term becomes more prominent. Ratcheting skyrmions follow ordered orbits in which the net displacement parallel to the substrate asymmetry direction is quantized. As a result, skyrmion ratchets represent a new ac current-based method for controlling skyrmion positions and motion for spintronic applications.
Application of computational physics within Northrop

NASA Technical Reports Server (NTRS)

George, M. W.; Ling, R. T.; Mangus, J. F.; Thompkins, W. T.

1987-01-01

An overview of Northrop programs in computational physics is presented. These programs depend on access to today's supercomputers, such as the Numerical Aerodynamical Simulator (NAS), and future growth depends on the continuing evolution of computational engines. Descriptions here are concentrated on the following areas: computational fluid dynamics (CFD), computational electromagnetics (CEM), computer architectures, and expert systems. Current efforts and future directions in these areas are presented. The impact of advances in the CFD area is described, and parallels are drawn to analogous developments in CEM. The relationship between advances in these areas and the development of advanced (parallel) architectures and expert systems is also presented.
Direct and Inverse Kinematics of a Novel Tip-Tilt-Piston Parallel Manipulator

NASA Technical Reports Server (NTRS)

Tahmasebi, Farhad

2004-01-01

Closed-form direct and inverse kinematics of a new three degree-of-freedom (DOF) parallel manipulator with inextensible limbs and base-mounted actuators are presented. The manipulator has higher resolution and precision than the existing three DOF mechanisms with extensible limbs. Since all of the manipulator actuators are base-mounted, higher payload capacity, smaller actuator sizes, and lower power dissipation can be obtained. The manipulator is suitable for alignment applications where only tip, tilt, and piston motions are significant. The direct kinematics of the manipulator is reduced to solving an eighth-degree polynomial in the square of the tangent of the half-angle between one of the limbs and the base plane. Hence, there are at most 16 assembly configurations for the manipulator. In addition, it is shown that the 16 solutions are eight pairs of reflected configurations with respect to the base plane. Numerical examples for the direct and inverse kinematics of the manipulator are also presented.
Fast Numerical Solution of the Plasma Response Matrix for Real-time Ideal MHD Control

DOE Office of Scientific and Technical Information (OSTI.GOV)

Glasser, Alexander; Kolemen, Egemen; Glasser, Alan H.

To help effectuate near real-time feedback control of ideal MHD instabilities in tokamak geometries, a parallelized version of A.H. Glasser’s DCON (Direct Criterion of Newcomb) code is developed. To motivate the numerical implementation, we first solve DCON’s δW formulation with a Hamilton-Jacobi theory, elucidating analytical and numerical features of the ideal MHD stability problem. The plasma response matrix is demonstrated to be the solution of an ideal MHD Riccati equation. We then describe our adaptation of DCON with numerical methods natural to solutions of the Riccati equation, parallelizing it to enable its operation in near real-time. We replace DCON’s serial integration of perturbed modes, which satisfy a singular Euler-Lagrange equation, with a domain-decomposed integration of state transition matrices. Output is shown to match results from DCON with high accuracy, and with computation time < 1 s. Such computational speed may enable active feedback ideal MHD stability control, especially in plasmas whose ideal MHD equilibria evolve with inductive timescale τ ≳ 1 s, as in ITER. Further potential applications of this theory are discussed.
Fast Numerical Solution of the Plasma Response Matrix for Real-time Ideal MHD Control

DOE PAGES

Glasser, Alexander; Kolemen, Egemen; Glasser, Alan H.

2018-03-26

To help effectuate near real-time feedback control of ideal MHD instabilities in tokamak geometries, a parallelized version of A.H. Glasser’s DCON (Direct Criterion of Newcomb) code is developed. To motivate the numerical implementation, we first solve DCON’s δW formulation with a Hamilton-Jacobi theory, elucidating analytical and numerical features of the ideal MHD stability problem. The plasma response matrix is demonstrated to be the solution of an ideal MHD Riccati equation. We then describe our adaptation of DCON with numerical methods natural to solutions of the Riccati equation, parallelizing it to enable its operation in near real-time. We replace DCON’s serial integration of perturbed modes, which satisfy a singular Euler-Lagrange equation, with a domain-decomposed integration of state transition matrices. Output is shown to match results from DCON with high accuracy, and with computation time < 1 s. Such computational speed may enable active feedback ideal MHD stability control, especially in plasmas whose ideal MHD equilibria evolve with inductive timescale τ ≳ 1 s, as in ITER. Further potential applications of this theory are discussed.
Parallel/Vector Integration Methods for Dynamical Astronomy

NASA Astrophysics Data System (ADS)

Fukushima, T.

Progress in parallel/vector computers has driven us to develop numerical integrators that use their computational power to the full extent while remaining independent of the size of the system to be integrated. Unfortunately, parallel versions of Runge-Kutta type integrators are known to be not very efficient. Recently we developed a parallel version of the extrapolation method (Ito and Fukushima 1997), which allows variable timesteps and still gives an acceleration factor of 3-4 for general problems. Meanwhile, the vector-mode usage of the Picard-Chebyshev method (Fukushima 1997a, 1997b) leads to an acceleration factor of order 1000 for smooth problems such as planetary/satellite orbit integration. The success of the multiple-correction PECE mode of the time-symmetric implicit Hermitian integrator (Kokubo 1998) seems to shed light on Milankar's so-called "pipelined predictor corrector method", which is expected to lead to an acceleration factor of 3-4. We will review these directions and discuss future prospects.
Domain decomposition methods in aerodynamics

NASA Technical Reports Server (NTRS)

Venkatakrishnan, V.; Saltz, Joel

1990-01-01

Compressible Euler equations are solved for two-dimensional problems by a preconditioned conjugate gradient-like technique. An approximate Riemann solver is used to compute the numerical fluxes to second order accuracy in space. Two ways to achieve parallelism are tested, one which makes use of parallelism inherent in triangular solves and the other which employs domain decomposition techniques. The vectorization/parallelism in triangular solves is realized by the use of a reordering technique called wavefront ordering. This process involves the interpretation of the triangular matrix as a directed graph and the analysis of the data dependencies. It is noted that the factorization can also be done in parallel with the wavefront ordering. The performances of two ways of partitioning the domain, strips and slabs, are compared. Results on Cray YMP are reported for an inviscid transonic test case. The performances of linear algebra kernels are also reported.
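The wavefront idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names and the 3 x 3 five-point-stencil example are assumptions. The lower-triangular matrix is read as a directed graph in which row i depends on every row j < i with a nonzero entry, and each row is assigned to the earliest wavefront that follows all of its dependencies. Rows in the same wavefront have no mutual dependencies, so they can be solved simultaneously.

```python
def wavefront_levels(lower_rows):
    """Assign each row of a lower-triangular matrix to a wavefront.

    lower_rows[i] lists the column indices j < i where L[i][j] != 0,
    i.e. the rows that row i must wait for in a forward solve.
    """
    level = [0] * len(lower_rows)
    for i, deps in enumerate(lower_rows):
        # One level after the latest dependency; level 0 if there are none.
        level[i] = 1 + max((level[j] for j in deps), default=-1)
    return level


def group_by_level(level):
    """Collect the row indices of each wavefront, in solve order."""
    if not level:
        return []
    wavefronts = [[] for _ in range(max(level) + 1)]
    for i, lv in enumerate(level):
        wavefronts[lv].append(i)
    return wavefronts


# Hypothetical example: lower-triangular part of a five-point stencil
# on a 3 x 3 grid with natural (row-by-row) ordering.
nx = ny = 3
lower_rows = []
for k in range(nx * ny):
    ix, iy = k % nx, k // nx
    deps = []
    if iy > 0:
        deps.append(k - nx)   # neighbour in the previous grid row
    if ix > 0:
        deps.append(k - 1)    # neighbour to the left
    lower_rows.append(deps)

wavefronts = group_by_level(wavefront_levels(lower_rows))
# wavefronts == [[0], [1, 3], [2, 4, 6], [5, 7], [8]]
```

For this structured grid the wavefronts are the anti-diagonals of the grid, so nine sequential row solves collapse into five parallel stages.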
Direct numerical simulation of turbulent flow in a rotating square duct

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dai, Yi-Jun; Huang, Wei-Xi, E-mail: hwx@tsinghua.edu.cn; Xu, Chun-Xiao

A fully developed turbulent flow in a rotating straight square duct is simulated by direct numerical simulations at Re_τ = 300 and 0 ≤ Ro_τ ≤ 40. The rotating axis is parallel to two opposite walls of the duct and normal to the main flow. Variations of the turbulence statistics with the rotation rate are presented, and a comparison with the rotating turbulent channel flow is discussed. Rich secondary flow patterns in the cross section are observed by varying the rotation rate. The appearance of a pair of additional vortices above the pressure wall is carefully examined, and the underlying mechanism is explained according to the budget analysis of the mean momentum equations.
Particle in cell/Monte Carlo collision analysis of the problem of identification of impurities in the gas by the plasma electron spectroscopy method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kusoglu Sarikaya, C.; Rafatov, I., E-mail: rafatov@metu.edu.tr; Kudryavtsev, A. A.

2016-06-15

The work deals with the Particle in Cell/Monte Carlo Collision (PIC/MCC) analysis of the problem of detection and identification of impurities in the nonlocal plasma of gas discharge using the Plasma Electron Spectroscopy (PLES) method. For this purpose, 1d3v PIC/MCC code for numerical simulation of glow discharge with nonlocal electron energy distribution function is developed. The elastic, excitation, and ionization collisions between electron-neutral pairs and isotropic scattering and charge exchange collisions between ion-neutral pairs and Penning ionizations are taken into account. Applicability of the numerical code is verified under the Radio-Frequency capacitively coupled discharge conditions. The efficiency of the code is increased by its parallelization using Open Message Passing Interface. As a demonstration of the PLES method, parallel PIC/MCC code is applied to the direct current glow discharge in helium doped with a small amount of argon. Numerical results are consistent with the theoretical analysis of formation of nonlocal EEDF and existing experimental data.
Parallel language constructs for tensor product computations on loosely coupled architectures

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Van Rosendale, John

1989-01-01

A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. The authors focus on tensor product array computations, a simple but important class of numerical algorithms. They consider first the problem of programming one-dimensional kernel routines, such as parallel tridiagonal solvers, and then look at how such parallel kernels can be combined to form parallel tensor product algorithms.
RISC Processors and High Performance Computing

NASA Technical Reports Server (NTRS)

Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

1995-01-01

This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.

Dynamics of magnetic single domain particles embedded in a viscous liquid

NASA Astrophysics Data System (ADS)

Usadel, K. D.; Usadel, C.

2015-12-01

Kinetic equations for magnetic nano particles dispersed in a viscous liquid are developed and analyzed numerically. Depending on the amplitude of an applied oscillatory magnetic field, the particles orient their time averaged anisotropy axis perpendicular to the applied field for low magnetic field amplitudes and nearly parallel to the direction of the field for high amplitudes. The transition between these regions takes place in a narrow field interval. In the low field region, the magnetic moment is locked to some crystal axis and the energy absorption in an oscillatory driving field is dominated by viscous losses associated with particle rotation in the liquid. In the opposite limit, the magnetic moment rotates within the particle while its easy axis being nearly parallel to the external field direction oscillates. The kinetic equations are generalized to include thermal fluctuations. This leads to a significant increase of the power absorption in the low and intermediate field regions with a pronounced absorption peak as function of particle size. In the high field region, on the other hand, the inclusion of thermal fluctuations reduces the power absorption. The illustrative numerical calculations presented are performed for magnetic parameters typical for iron oxide.
An adaptable parallel algorithm for the direct numerical simulation of incompressible turbulent flows using a Fourier spectral/hp element method and MPI virtual topologies

NASA Astrophysics Data System (ADS)

Bolis, A.; Cantwell, C. D.; Moxey, D.; Serson, D.; Sherwin, S. J.

2016-09-01

A hybrid parallelisation technique for distributed memory systems is investigated for a coupled Fourier-spectral/hp element discretisation of domains characterised by geometric homogeneity in one or more directions. The performance of the approach is mathematically modelled in terms of operation count and communication costs for identifying the most efficient parameter choices. The model is calibrated to target a specific hardware platform after which it is shown to accurately predict the performance in the hybrid regime. The method is applied to modelling turbulent flow using the incompressible Navier-Stokes equations in an axisymmetric pipe and square channel. The hybrid method extends the practical limitations of the discretisation, allowing greater parallelism and reduced wall times. Performance is shown to continue to scale when both parallelisation strategies are used.
A Parallel Numerical Algorithm To Solve Linear Systems Of Equations Emerging From 3D Radiative Transfer

NASA Astrophysics Data System (ADS)

Wichert, Viktoria; Arkenberg, Mario; Hauschildt, Peter H.

2016-10-01

Highly resolved state-of-the-art 3D atmosphere simulations will remain computationally extremely expensive for years to come. In addition to the need for more computing power, rethinking coding practices is necessary. We take a dual approach by introducing especially adapted, parallel numerical methods and correspondingly parallelizing critical code passages. In the following, we present our respective work on PHOENIX/3D. With new parallel numerical algorithms, there is a big opportunity for improvement when iteratively solving the system of equations emerging from the operator splitting of the radiative transfer equation J = ΛS. The narrow-banded approximate Λ-operator Λ*, which is used in PHOENIX/3D, occurs in each iteration step. By implementing a numerical algorithm which takes advantage of its characteristic traits, the parallel code's efficiency is further increased and a speed-up in computational time can be achieved.
Solution of partial differential equations on vector and parallel computers

NASA Technical Reports Server (NTRS)

Ortega, J. M.; Voigt, R. G.

1985-01-01

The present status of numerical methods for partial differential equations on vector and parallel computers was reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed.
GPAW - massively parallel electronic structure calculations with Python-based software.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Enkovaara, J.; Romero, N.; Shende, S.

2011-01-01

Electronic structure calculations are a widely used tool in materials science and a large consumer of supercomputing resources. Traditionally, the software packages for these kinds of simulations have been implemented in compiled languages, where Fortran in its different versions has been the most popular choice. While dynamic, interpreted languages, such as Python, can increase the efficiency of the programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most of the productivity enhancing features together with a good numerical performance. We have used this approach in implementing an electronic structure simulation software GPAW using the combination of Python and C programming languages. While the chosen approach works well in standard workstations and Unix environments, massively parallel supercomputing systems can present some challenges in porting, debugging and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges it is possible to obtain good numerical performance and good parallel scalability with Python based software.
Graphics processing unit (GPU)-based computation of heat conduction in thermally anisotropic solids

NASA Astrophysics Data System (ADS)

Nahas, C. A.; Balasubramaniam, Krishnan; Rajagopal, Prabhu

2013-01-01

Numerical modeling of anisotropic media is a computationally intensive task since it brings additional complexity to the field problem in such a way that the physical properties are different in different directions. Largely used in the aerospace industry because of their lightweight nature, composite materials are a very good example of thermally anisotropic media. With advancements in video gaming technology, parallel processors are much cheaper today and accessibility to higher-end graphical processing devices has increased dramatically over the past couple of years. Since these massively parallel GPUs are very good at handling floating point arithmetic, they provide a new platform for engineers and scientists to accelerate their numerical models using commodity hardware. In this paper we implement a parallel finite difference model of thermal diffusion through anisotropic media using the NVIDIA CUDA (Compute Unified Device Architecture). We use the NVIDIA GeForce GTX 560 Ti as our primary computing device, which consists of 384 CUDA cores clocked at 1645 MHz, with a standard desktop PC as the host platform. We compare the results with a standard CPU implementation for accuracy and speed and draw implications for simulation using the GPU paradigm.
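To show why this kind of model suits a GPU, here is a minimal CPU-side sketch in plain Python of one explicit finite-difference step. It is an assumption-laden illustration, not the paper's scheme: the principal conductivity axes are taken to be aligned with the grid, so the equation reduces to ∂T/∂t = kx ∂²T/∂x² + ky ∂²T/∂y², the boundary values are held fixed, and the names `step`, `stable_dt`, `kx` and `ky` are hypothetical. Each interior point is updated from the old field only, which is the independence a CUDA kernel would exploit by assigning one thread per grid point.

```python
def stable_dt(kx, ky, dx, dy):
    """Largest stable time step for the explicit (FTCS) scheme below."""
    return 0.5 / (kx / dx ** 2 + ky / dy ** 2)


def step(T, kx, ky, dx, dy, dt):
    """Advance the temperature field T (list of rows) by one time step.

    Boundary values are left unchanged (fixed-temperature boundaries).
    Every interior update reads only the old field, so all updates are
    independent of one another.
    """
    ny, nx = len(T), len(T[0])
    new = [row[:] for row in T]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            d2x = (T[j][i + 1] - 2.0 * T[j][i] + T[j][i - 1]) / dx ** 2
            d2y = (T[j + 1][i] - 2.0 * T[j][i] + T[j - 1][i]) / dy ** 2
            new[j][i] = T[j][i] + dt * (kx * d2x + ky * d2y)
    return new


# Hypothetical test case: a unit hot spot in the middle of a 7 x 7 plate
# that conducts four times better along x than along y.
kx, ky, dx, dy = 4.0, 1.0, 1.0, 1.0
dt = 0.5 * stable_dt(kx, ky, dx, dy)
T = [[0.0] * 7 for _ in range(7)]
T[3][3] = 1.0
for _ in range(2):
    T = step(T, kx, ky, dx, dy, dt)
```

After two steps the heat has travelled further along x than along y (compare `T[3][5]` with `T[5][3]`), which is the anisotropy the abstract describes.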
Parallel Implementation of a High Order Implicit Collocation Method for the Heat Equation

NASA Technical Reports Server (NTRS)

Kouatchou, Jules; Halem, Milton (Technical Monitor)

2000-01-01

We combine a high order compact finite difference approximation and collocation techniques to numerically solve the two dimensional heat equation. The resulting method is implicit and can be parallelized with a strategy that allows parallelization across both time and space. We compare the parallel implementation of the new method with a classical implicit method, namely the Crank-Nicolson method, where the parallelization is done across space only. Numerical experiments are carried out on the SGI Origin 2000.
A 3D staggered-grid finite difference scheme for poroelastic wave equation

NASA Astrophysics Data System (ADS)

Zhang, Yijie; Gao, Jinghuai

2014-10-01

Three dimensional numerical modeling has been a viable tool for understanding wave propagation in real media. The poroelastic media can better describe the phenomena of hydrocarbon reservoirs than acoustic and elastic media. However, the numerical modeling in 3D poroelastic media demands significantly more computational capacity, including both computational time and memory. In this paper, we present a 3D poroelastic staggered-grid finite difference (SFD) scheme. During the procedure, parallel computing is implemented to reduce the computational time. Parallelization is based on domain decomposition, and communication between processors is performed using message passing interface (MPI). Parallel analysis shows that the parallelized SFD scheme significantly improves the simulation efficiency and 3D decomposition in domain is the most efficient. We also analyze the numerical dispersion and stability condition of the 3D poroelastic SFD method. Numerical results show that the 3D numerical simulation can provide a real description of wave propagation.
Kinematics of a New High Precision Three Degree-of-Freedom Parallel Manipulator

NASA Technical Reports Server (NTRS)

Tahmasebi, Farhad

2005-01-01

Closed-form direct and inverse kinematics of a new three degree-of-freedom (DOF) parallel manipulator with inextensible limbs and base-mounted actuators are presented. The manipulator has higher resolution and precision than the existing three DOF mechanisms with extensible limbs. Since all of the manipulator actuators are base-mounted, higher payload capacity, smaller actuator sizes, and lower power dissipation can be obtained. The manipulator is suitable for alignment applications where only tip, tilt, and piston motions are significant. The direct kinematics of the manipulator is reduced to solving an eighth-degree polynomial in the square of the tangent of the half-angle between one of the limbs and the base plane. Hence, there are at most sixteen assembly configurations for the manipulator. In addition, it is shown that the sixteen solutions are eight pairs of reflected configurations with respect to the base plane. Numerical examples for the direct and inverse kinematics of the manipulator are also presented.
Mineral lineation produced by 3-D rotation of rigid inclusions in confined viscous simple shear

NASA Astrophysics Data System (ADS)

Marques, Fernando O.

2016-08-01

The solid-state flow of rocks commonly produces a parallel arrangement of elongate minerals with their longest axes coincident with the direction of flow (a mineral lineation). However, this does not conform to Jeffery's theory of the rotation of rigid ellipsoidal inclusions (REIs) in viscous simple shear, because rigid inclusions rotate continuously with applied shear. In 2-dimensional (2-D) flow, the REI's greatest axis (e1) is already in the shear direction; therefore, the problem is to find mechanisms that can prevent the rotation of the REI about one axis, the vorticity axis. In 3-D flow, the problem is to find a mechanism that can make e1 rotate towards the shear direction, and so generate a mineral lineation by rigid rotation about two axes. 3-D analogue and numerical modelling was used to test the effects of confinement on REI rotation and, for narrow channels (shear zone thickness over inclusion's least axis, Wr < 2), the results show that: (1) the rotational behaviour deviates greatly from Jeffery's model; (2) inclusions with aspect ratio Ar (greatest over least principal axis, e1/e3) > 1 can rotate backwards from an initial orientation with e1 parallel to the shear plane, in great contrast to Jeffery's model; (3) back rotation is limited because inclusions reach a stable equilibrium orientation; (4) most importantly and, in contrast to Jeffery's model and to the 2-D simulations, in 3-D, the confined REI gradually rotated about an axis orthogonal to the shear plane towards an orientation with e1 parallel to the shear direction, thus producing a lineation parallel to the shear direction. The modelling results lead to the conclusion that confined simple shear can be responsible for the mineral alignment (lineation) observed in ductile shear zones.
Scalable direct Vlasov solver with discontinuous Galerkin method on unstructured mesh.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xu, J.; Ostroumov, P. N.; Mustapha, B.

2010-12-01

This paper presents the development of parallel direct Vlasov solvers with discontinuous Galerkin (DG) method for beam and plasma simulations in four dimensions. Both physical and velocity spaces are in two dimensions (2P2V) with unstructured mesh. Contrary to the standard particle-in-cell (PIC) approach for kinetic space plasma simulations, i.e., solving Vlasov-Maxwell equations, direct method has been used in this paper. There are several benefits to solving a Vlasov equation directly, such as avoiding noise associated with a finite number of particles and the capability to capture fine structure in the plasma. The most challenging part of a direct Vlasov solver comes from higher dimensions, as the computational cost increases as N^{2d}, where d is the dimension of the physical space. Recently, due to the fast development of supercomputers, the possibility has become more realistic. Many efforts have been made to solve Vlasov equations in low dimensions before; now more interest has focused on higher dimensions. Different numerical methods have been tried so far, such as the finite difference method, Fourier Spectral method, finite volume method, and spectral element method. This paper is based on our previous efforts to use the DG method. The DG method has been proven to be very successful in solving Maxwell equations, and this paper is our first effort in applying the DG method to Vlasov equations. DG has shown several advantages, such as local mass matrix, strong stability, and easy parallelization. These are particularly suitable for Vlasov equations. Domain decomposition in high dimensions has been used for parallelization; these include a highly scalable parallel two-dimensional Poisson solver. Benchmark results have been shown and simulation results will be reported.
Summer Proceedings 2016: The Center for Computing Research at Sandia National Laboratories

DOE Office of Scientific and Technical Information (OSTI.GOV)

Carleton, James Brian; Parks, Michael L.

Solving sparse linear systems from the discretization of elliptic partial differential equations (PDEs) is an important building block in many engineering applications. Sparse direct solvers can solve general linear systems, but are usually slower and use much more memory than effective iterative solvers. To overcome these two disadvantages, a hierarchical solver (LoRaSp) based on H2-matrices was introduced in [22]. Here, we have developed a parallel version of the algorithm in LoRaSp to solve large sparse matrices on distributed memory machines. On a single processor, the factorization time of our parallel solver scales almost linearly with the problem size for three-dimensional problems, as opposed to the quadratic scalability of many existing sparse direct solvers. Moreover, our solver leads to almost constant numbers of iterations, when used as a preconditioner for Poisson problems. On more than one processor, our algorithm has significant speedups compared to sequential runs. With this parallel algorithm, we are able to solve large problems much faster than many existing packages as demonstrated by the numerical experiments.
Parallelization of elliptic solver for solving 1D Boussinesq model

NASA Astrophysics Data System (ADS)

Tarwidi, D.; Adytia, D.

2018-03-01

In this paper, a parallel implementation of an elliptic solver for the 1D Boussinesq model is presented. The numerical solution of the Boussinesq model is obtained by applying a staggered grid scheme to the continuity, momentum, and elliptic equations of the model. The tridiagonal system emerging from the numerical scheme of the elliptic equation is solved by the cyclic reduction algorithm. The parallel implementation of cyclic reduction is executed on multicore processors with shared memory architectures using OpenMP. To measure the performance of the parallel program, the number of grid points is varied from 2^8 to 2^14. Two test cases of numerical experiment, i.e. propagation of solitary and standing waves, are proposed to evaluate the parallel program. The numerical results are verified against the analytical solutions for solitary and standing waves. The best speedup for the solitary and standing wave test cases is about 2.07 with 2^14 grid points and 1.86 with 2^13 grid points, respectively, both obtained using 8 threads. Moreover, the best efficiency of the parallel program is 76.2% and 73.5% for the solitary and standing wave test cases, respectively.
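A compact Python sketch of cyclic reduction for a tridiagonal system is given below. It is a sequential, recursive illustration under stated assumptions rather than the paper's OpenMP implementation: the function name `cyclic_reduction` and the test system are hypothetical, the inputs are assumed to satisfy a[0] = 0 and c[-1] = 0, and no pivoting is performed, so the diagonal entries must stay nonzero. The loop that builds the reduced system and the back-substitution loop each touch independent equations, which is where a shared-memory implementation would distribute iterations across threads.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for x.

    a, b, c are the sub-, main and super-diagonals (a[0] = c[-1] = 0).
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate the even-indexed unknowns from each
    # odd-indexed equation. Iterations are independent of each other.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            a_next, c_next, d_next = a[i + 1], c[i + 1], d[i + 1]
        else:
            gamma = a_next = c_next = d_next = 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - gamma * a_next)
        rc.append(-gamma * c_next)
        rd.append(d[i] - alpha * d[i - 1] - gamma * d_next)

    # The odd-indexed unknowns satisfy a tridiagonal system of half the size.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitution: each even-indexed unknown follows from its own
    # equation and its already known neighbours. Also independent.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# Hypothetical test system: the (-1, 2, -1) matrix with a known solution.
n = 7
a = [0.0] + [-1.0] * (n - 1)
b = [2.0] * n
c = [-1.0] * (n - 1) + [0.0]
exact = [float(k + 1) for k in range(n)]
d = [(a[i] * exact[i - 1] if i > 0 else 0.0)
     + b[i] * exact[i]
     + (c[i] * exact[i + 1] if i + 1 < n else 0.0)
     for i in range(n)]
x = cyclic_reduction(a, b, c, d)
```

Each level of the recursion halves the number of unknowns, so a system of size n needs about log2(n) reduction stages, each of which is parallel internally.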
3-D parallel program for numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sofronov, I.D.; Voronin, B.L.; Butnev, O.I.

1997-12-31

The aim of this work is to develop a 3D parallel program for the numerical calculation of gas dynamics problems with heat conductivity on distributed memory computational systems (CS), satisfying the condition that the numerical result be independent of the number of processors involved. Two fundamentally different approaches to the structure of massively parallel computations have been developed. The first approach uses a 3D data matrix decomposition that is reconstructed during the temporal cycle and is a development of parallelization algorithms for multiprocessor CS with shareable memory. The second approach is based on a 3D data matrix decomposition that is not reconstructed during a temporal cycle. The program was developed on the 8-processor CS MP-3 made at VNIIEF and was adapted to the massively parallel CS Meiko-2 at LLNL by the joint efforts of VNIIEF and LLNL staff. A large number of numerical experiments have been carried out with different numbers of processors, up to 256, and the efficiency of parallelization has been evaluated as a function of the number of processors and their parameters.
Delineation of recharge areas for selected wells in the St. Peter-Prairie du Chien-Jordan Aquifer, Rochester, Minnesota

USGS Publications Warehouse

Delin, G.N.; Almendinger, James Edward

1991-01-01

Hydrogeologic mapping and numerical modeling were used to delineate zones of contribution to wells, defined as all parts of a ground-water-flow system that could supply water to a well. The zones of contribution delineated by use of numerical modeling have similar orientation (parallel to regional flow directions) but significantly different areas than the zones of contribution delineated by use of hydrogeologic mapping. Differences in computed areas of recharge are attributed to the capability of the numerical model to more accurately represent (1) the three-dimensional flow system, (2) hydrologic boundaries like streams, (3) variable recharge, and (4) the influence of nearby pumped wells, compared to the analytical models.
Delineation of recharge areas for selected wells in the St. Peter-Prairie du Chien-Jordan aquifer, Rochester, Minnesota

USGS Publications Warehouse

Delin, G.N.; Almendinger, James Edward

1993-01-01

Hydrogeologic mapping and numerical modeling were used to delineate zones of contribution to wells, defined as all parts of a ground-water-flow system that could supply water to a well. The zones of contribution delineated by use of numerical modeling have similar orientation (parallel to regional flow directions) but significantly different areas than the zones of contribution delineated by use of hydrogeologic mapping. Differences in computed areas of recharge are attributed to the capability of the numerical model to more accurately represent (1) the three-dimensional flow system, (2) hydrologic boundaries such as streams, (3) variable recharge, and (4) the influence of nearby pumped wells, compared to the analytical models.
Field characterization of elastic properties across a fault zone reactivated by fluid injection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jeanne, Pierre; Guglielmi, Yves; Rutqvist, Jonny

In this paper, we studied the elastic properties of a fault zone intersecting the Opalinus Clay formation at 300 m depth in the Mont Terri Underground Research Laboratory (Switzerland). Four controlled water injection experiments were performed in borehole straddle intervals set at successive locations across the fault zone. A three-component displacement sensor, which allowed capturing the borehole wall movements during injection, was used to estimate the elastic properties of representative locations across the fault zone, from the host rock to the damage zone to the fault core. Young's moduli were estimated by both an analytical approach and numerical finite difference modeling. Results show a decrease in Young's modulus from the host rock to the damage zone by a factor of 5 and from the damage zone to the fault core by a factor of 2. In the host rock, our results are in reasonable agreement with laboratory data showing a strong elastic anisotropy characterized by the direction of the plane of isotropy parallel to the laminar structure of the shale formation. In the fault zone, strong rotations of the direction of anisotropy can be observed. Finally, the plane of isotropy can be oriented either parallel to bedding (when few discontinuities are present), parallel to the direction of the main fracture family intersecting the zone, and possibly oriented parallel or perpendicular to the fractures critically oriented for shear reactivation (when repeated past rupture along this plane has created a zone).
Field characterization of elastic properties across a fault zone reactivated by fluid injection

DOE PAGES

Jeanne, Pierre; Guglielmi, Yves; Rutqvist, Jonny; ...

2017-08-12

In this paper, we studied the elastic properties of a fault zone intersecting the Opalinus Clay formation at 300 m depth in the Mont Terri Underground Research Laboratory (Switzerland). Four controlled water injection experiments were performed in borehole straddle intervals set at successive locations across the fault zone. A three-component displacement sensor, which allowed capturing the borehole wall movements during injection, was used to estimate the elastic properties of representative locations across the fault zone, from the host rock to the damage zone to the fault core. Young's moduli were estimated by both an analytical approach and numerical finite difference modeling. Results show a decrease in Young's modulus from the host rock to the damage zone by a factor of 5 and from the damage zone to the fault core by a factor of 2. In the host rock, our results are in reasonable agreement with laboratory data showing a strong elastic anisotropy characterized by the direction of the plane of isotropy parallel to the laminar structure of the shale formation. In the fault zone, strong rotations of the direction of anisotropy can be observed. Finally, the plane of isotropy can be oriented either parallel to bedding (when few discontinuities are present), parallel to the direction of the main fracture family intersecting the zone, and possibly oriented parallel or perpendicular to the fractures critically oriented for shear reactivation (when repeated past rupture along this plane has created a zone).
Status of parallel Python-based implementation of UEDGE

NASA Astrophysics Data System (ADS)

Umansky, M. V.; Pankin, A. Y.; Rognlien, T. D.; Dimits, A. M.; Friedman, A.; Joseph, I.

2017-10-01

The tokamak edge transport code UEDGE has long used the code-development and run-time framework Basis. However, with the support for Basis expected to terminate in the coming years, and with the advent of the modern numerical language Python, it has become desirable to move UEDGE to Python, to ensure its long-term viability. Our new Python-based UEDGE implementation takes advantage of the portable build system developed for FACETS. The new implementation gives access to Python's graphical libraries and numerical packages for pre- and post-processing, and support of HDF5 simplifies exchanging data. The older serial version of UEDGE has used for time-stepping the Newton-Krylov solver NKSOL. The renovated implementation uses backward Euler discretization with nonlinear solvers from PETSc, which has the promise to significantly improve the UEDGE parallel performance. We will report on assessment of some of the extended UEDGE capabilities emerging in the new implementation, and will discuss the future directions. Work performed for U.S. DOE by LLNL under contract DE-AC52-07NA27344.
Electromagnetically induced disintegration and polarization plane rotation of laser pulses

NASA Astrophysics Data System (ADS)

Parshkov, Oleg M.; Budyak, Victoria V.; Kochetkova, Anastasia E.

2017-04-01

Numerical simulation results are presented for the disintegration of linearly polarized short probe pulses of electromagnetically induced transparency in a counterintuitively superposed, linearly polarized control field. It is shown that this disintegration occurs if the linear polarizations of the interacting pulses are neither parallel nor mutually perpendicular. For a weak input probe field, the polarization of one probe pulse in the medium is parallel to the polarization direction of the input control radiation, whereas the polarization of the other probe pulse is perpendicular to it. The effect is analogous to the one that must take place when a short laser pulse propagates along the main axes of a biaxial crystal, because of the difference in the group velocities of the normal modes. The essential difference between probe pulse disintegration and the linear process in a biaxial crystal is that the probe pulse preserves linear polarization at all stages of propagation. The numerical simulation is performed for a scheme of degenerate quantum transitions between the ^3P_0, ^3P_1^0 and ^3P_2 energy levels of the ^{208}Pb isotope.

Feasibility of using the Massively Parallel Processor for large eddy simulations and other Computational Fluid Dynamics applications

NASA Technical Reports Server (NTRS)

Bruno, John

1984-01-01

The results of an investigation into the feasibility of using the MPP for direct and large eddy simulations of the Navier-Stokes equations are presented. A major part of this study was devoted to the implementation of two of the standard numerical algorithms for CFD. These implementations were not run on the Massively Parallel Processor (MPP) since the machine delivered to NASA Goddard does not have sufficient capacity. Instead, a detailed implementation plan was designed, and from it estimates were derived of the time and space requirements of the algorithms on a suitably configured MPP. In addition, other issues related to the practical implementation of these algorithms on an MPP-like architecture were considered; namely, adaptive grid generation, zonal boundary conditions, the table lookup problem, and the software interface. Performance estimates show that the architectural components of the MPP, the Staging Memory and the Array Unit, appear to be well suited to the numerical algorithms of CFD. This, combined with the prospect of building a faster and larger MPP-like machine, holds the promise of achieving the sustained gigaflop rates that are required for numerical simulations in CFD.
High Performance Input/Output for Parallel Computer Systems

NASA Technical Reports Server (NTRS)

Ligon, W. B.

1996-01-01

The goal of our project is to study the I/O characteristics of parallel applications used in Earth Science data processing systems such as Regional Data Centers (RDCs) or EOSDIS. Our approach is to study the runtime behavior of typical programs and the effect of key parameters of the I/O subsystem both under simulation and with direct experimentation on parallel systems. Our three-year activity has focused on two items: developing a test bed that facilitates experimentation with parallel I/O, and studying representative programs from the Earth science data processing application domain. The Parallel Virtual File System (PVFS) has been developed for use on a number of platforms including the Tiger Parallel Architecture Workbench (TPAW) simulator, the Intel Paragon, a cluster of DEC Alpha workstations, and the Beowulf system (at CESDIS). PVFS provides considerable flexibility in configuring I/O in a UNIX-like environment. Access to key performance parameters facilitates experimentation. We have studied several key applications from levels 1, 2, and 3 of the typical RDC processing scenario including instrument calibration and navigation, image classification, and numerical modeling codes. We have also considered large-scale scientific database codes used to organize image data.
Numerical Modeling of High Irradiance Electromagnetic Beam Effects on Composite and Polymer Materials

DTIC Science & Technology

2013-05-10

propelled away from the surface as a jet due to their rapid production [19]. Higher energy UV radiation produced by excimer lasers has the required... Pan determined that the shape (usually elliptical) of an HAZ in a UD... radii of an elliptical HAZ in the parallel and perpendicular to fiber directions. In UD composites the HAZ is elliptical because of the higher
Stroop-Like Effects for Monkeys and Humans: Processing Speed or Strength of Association?

NASA Technical Reports Server (NTRS)

Washburn, David A.

1994-01-01

Stroop-like effects have been found using a variety of paradigms and subject groups. In the present investigation, 6 rhesus monkeys (Macaca mulatta) and 28 humans exhibited Stroop-like interference and facilitation in a relative-numerousness task. Monkeys, like humans, processed the meanings of the numerical symbols automatically, despite the fact that these meanings were irrelevant to task performance. These data also afforded direct comparison of interpretations of the Stroop effect in terms of processing speed versus association strength. These findings were consistent with parallel-processing models of Stroop-like interference proposed elsewhere, but not with processing-speed accounts posited frequently to explain the effect.
The firehose instability during multiple reconnection in the Earth's magnetotail

NASA Astrophysics Data System (ADS)

Alexandrova, Alexandra; Divin, Andrey; Retino, Alessandro; Deca, Jan; Catapano, Filomena; Cozzani, Giulia

2017-04-01

We found unique events in the Cluster spacecraft observations of the Earth's magnetotail which correspond to the case of multiple reconnection sites. The ion temperature anisotropy of more energized ions in the direction parallel to the magnetic field, rather than in the perpendicular direction, is observed in the region of dynamical interaction between two active X-lines. The magnetic field and plasma parameters associated with the anisotropy correspond to the firehose instability conditions. We discuss possible scenarios of development of the firehose instability in multiple reconnection by comparing the observations with numerical simulations. Conventional Particle-in-Cell simulations of 2D magnetic reconnection starting from Harris equilibria are performed using implicit PIC code iPIC3D [Markidis, 2010]. At earlier stages the evolution creates fronts which push the weakly magnetized current sheet plasma away from the X-line. Fronts accelerate and reflect particles, producing parallel ion beams and increasing parallel ion temperature ahead of the front. If multiple X-lines are present, then the counterstreaming ion beams appear inside the original current sheet between colliding reconnection jet fronts. For large enough parallel ion pressure anisotropy, the firehose-like mode is excited inside the original current sheet with a flapping-like appearance along the X GSM direction but not Y GSM (current) direction. One should note that our simulations do not include the Bz magnetic field component (normal to the current sheet), hence ion beams cannot escape into the lobes and the whole region between two colliding fronts is unstable to firehose-like instability. In the Earth's magnetotail such configuration likely occurs when two active X-lines are close enough to each other, similar to a few cases we found in the Cluster observations.
Programming a hillslope water movement model on the MPP

NASA Technical Reports Server (NTRS)

Devaney, J. E.; Irving, A. R.; Camillo, P. J.; Gurney, R. J.

1987-01-01

A physically based numerical model was developed of heat and moisture flow within a hillslope on a parallel architecture computer, as a precursor to a model of a complete catchment. Moisture flow within a catchment includes evaporation, overland flow, flow in unsaturated soil, and flow in saturated soil. Because of the empirical evidence that moisture flow in unsaturated soil is mainly in the vertical direction, flow in the unsaturated zone can be modeled as a series of one dimensional columns. This initial version of the hillslope model includes evaporation and a single column of one dimensional unsaturated zone flow. This case has already been solved on an IBM 3081 computer and is now being applied to the massively parallel processor architecture so as to make the extension to the one dimensional case easier and to check the problems and benefits of using a parallel architecture machine.
Coeval emplacement and orogen-parallel transport of gold in oblique convergent orogens

NASA Astrophysics Data System (ADS)

Upton, Phaedra; Craw, Dave

2016-12-01

Varying amounts of gold mineralisation are occurring in all young and active collisional mountain belts. Concurrently, these syn-orogenic hydrothermal deposits are being eroded and transported to form placer deposits. Local extension occurs in convergent orogens, especially oblique orogens, and facilitates emplacement of syn-orogenic gold-bearing deposits with or without associated magmatism. Numerical modelling has shown that extension results from directional variations in movement rates along the rock transport trajectory during convergence, and is most pronounced for highly oblique convergence with strong crustal rheology. On-going uplift during orogenesis exposes gold deposits to erosion, transport, and localised placer concentration. Drainage patterns in variably oblique convergent orogenic belts typically have an orogen-parallel or sub-parallel component, the details of which vary with convergence obliquity and the vagaries of underlying geological controls. This leads to lateral transport of eroded syn-orogenic gold on a range of scales, up to > 100 km. The presence of inherited crustal blocks with contrasting rheology in oblique orogenic collision zones can cause perturbations in drainage patterns, but numerical modelling suggests that orogen-parallel drainage is still a persistent and robust feature. The presence of an inherited block of weak crust enhances the orogen-parallel drainage by imposition of localised subsidence zones elongated along a plate boundary. Evolution and reorientation of orogen-parallel drainage can sever links between gold placer deposits and their syn-orogenic sources. Many of these modelled features of syn-orogenic gold emplacement and varying amounts of orogen-parallel detrital gold transport can be recognised in the Miocene to Recent New Zealand oblique convergent orogen. These processes contribute little gold to major placer goldfields, which require more long-term recycling and placer gold concentration. Most eroded syn-orogenic gold becomes diluted by abundant lithic debris in rivers and sedimentary basins except where localised concentration occurs, especially on beaches.
Redundantly piezo-actuated XYθz compliant mechanism for nano-positioning featuring simple kinematics, bi-directional motion and enlarged workspace

NASA Astrophysics Data System (ADS)

Zhu, Wu-Le; Zhu, Zhiwei; To, Suet; Liu, Qiang; Ju, Bing-Feng; Zhou, Xiaoqin

2016-12-01

This paper presents a novel redundantly piezo-actuated three-degree-of-freedom XYθz compliant mechanism for nano-positioning, driven by four mirror-symmetrically configured piezoelectric actuators (PEAs). By means of the differential motion principle, linearized kinematics and physically bi-directional motions in all three directions are achieved. Meanwhile, decoupled delivery of three independent directional motions at the output end is possible, and the essentially parallel and mirror-symmetric configuration guarantees large output stiffness, high natural frequencies, high accuracy, and high structural compactness of the mechanism. Accurate kinematic analysis that accounts for input coupling indicates that the proposed redundantly actuated compliant mechanism can generate a three-dimensional (3D) symmetric polyhedral workspace envelope with an enlarged reachable workspace, as compared with the most common parallel XYθz mechanism driven by three PEAs. In close agreement with both analytical and numerical models, the experimental results show that working ranges of ±6.21 μm and ±12.41 μm in the X- and Y-directions, and of ±873.2 μrad in the θz-direction, can be realized with nano-positioning capability. The superior performance and easily achievable structure facilitate practical applications of the proposed XYθz compliant mechanism in nano-positioning systems.
Flood predictions using the parallel version of distributed numerical physical rainfall-runoff model TOPKAPI

NASA Astrophysics Data System (ADS)

Boyko, Oleksiy; Zheleznyak, Mark

2015-04-01

The original numerical code TOPKAPI-IMMS of the distributed rainfall-runoff model TOPKAPI (Todini et al., 1996-2014) is developed and implemented in Ukraine. A parallel version of the code has recently been developed for use on multiprocessor systems: multicore/multiprocessor PCs and clusters. The algorithm is based on a binary-tree decomposition of the watershed to balance the amount of computation across all processors/cores. The Message Passing Interface (MPI) protocol is used as the parallel computing framework. The numerical efficiency of the parallelization algorithms is demonstrated in case studies of flood prediction for mountain watersheds of the Ukrainian Carpathian region. The modeling results are compared with predictions based on lumped-parameter models.
SIAM Conference on Parallel Processing for Scientific Computing, 4th, Chicago, IL, Dec. 11-13, 1989, Proceedings

NASA Technical Reports Server (NTRS)

Dongarra, Jack (Editor); Messina, Paul (Editor); Sorensen, Danny C. (Editor); Voigt, Robert G. (Editor)

1990-01-01

Attention is given to such topics as an evaluation of block algorithm variants in LAPACK, a large-grain parallel sparse system solver, a multiprocessor method for the solution of the generalized eigenvalue problem on an interval, and a parallel QR algorithm for iterative subspace methods on the CM2. A discussion of numerical methods includes the topics of asynchronous numerical solutions of PDEs on parallel computers, parallel homotopy curve tracking on a hypercube, and solving Navier-Stokes equations on the Cedar Multi-Cluster system. A section on differential equations includes a discussion of a six-color procedure for the parallel solution of elliptic systems using the finite quadtree structure, data parallel algorithms for the finite element method, and domain decomposition methods in aerodynamics. Topics dealing with massively parallel computing include hypercube vs. 2-dimensional meshes and massively parallel computation of conservation laws. Performance and tools are also discussed.
How Deep Is Your SNARC? Interactions Between Numerical Magnitude, Response Hands, and Reachability in Peripersonal Space.

PubMed

Lohmann, Johannes; Schroeder, Philipp A; Nuerk, Hans-Christoph; Plewnia, Christian; Butz, Martin V

2018-01-01

Spatial, physical, and semantic magnitude dimensions can influence action decisions in human cognitive processing and interact with each other. For example, in the spatial-numerical associations of response code (SNARC) effect, semantic numerical magnitude facilitates left-hand or right-hand responding dependent on the small or large magnitude of number symbols. SNARC-like interactions of numerical magnitudes with the radial spatial dimension (depth) were postulated from early on. Usually, the SNARC effect in any direction is investigated using fronto-parallel computer monitors for presentation of stimuli. In such 2D setups, however, the metaphorical and literal interpretation of the radial depth axis with seemingly close/far stimuli or responses are not distinct. Hence, it is difficult to draw clear conclusions with respect to the contribution of different spatial mappings to the SNARC effect. In order to disentangle the different mappings in a natural way, we studied parametrical interactions between semantic numerical magnitude, horizontal directional responses, and perceptual distance by means of stereoscopic depth in an immersive virtual reality (VR). Two VR experiments show horizontal SNARC effects across all spatial displacements in traditional latency measures and kinematic response parameters. No indications of a SNARC effect along the depth axis, as it would be predicted by a direct mapping account, were observed, but the results show a non-linear relationship between horizontal SNARC slopes and physical distance. Steepest SNARC slopes were observed for digits presented close to the hands. We conclude that spatial-numerical processing is susceptible to effector-based processes but relatively resilient to task-irrelevant variations of radial-spatial magnitudes.
Singular boundary method for global gravity field modelling

NASA Astrophysics Data System (ADS)

Cunderlik, Robert

2014-05-01

The singular boundary method (SBM) and method of fundamental solutions (MFS) are meshless boundary collocation techniques that use the fundamental solution of a governing partial differential equation (e.g. the Laplace equation) as their basis functions. They have been developed to avoid singular numerical integration as well as mesh generation in the traditional boundary element method (BEM). SBM has been proposed to overcome a main drawback of MFS - its controversial fictitious boundary outside the domain. The key idea of SBM is to introduce the concept of origin intensity factors, which isolate singularities of the fundamental solution and its derivatives using appropriate regularization techniques. Consequently, the source points can be placed directly on the real boundary and coincide with the collocation nodes. In this study we deal with SBM applied to high-resolution global gravity field modelling. The first numerical experiment presents a numerical solution to the fixed gravimetric boundary value problem. The achieved results are compared with the numerical solutions obtained by MFS or the direct BEM, indicating the efficiency of all methods. In the second numerical experiment, SBM is used to derive the geopotential and its first derivatives from the Tzz components of the gravity disturbing tensor observed by the GOCE satellite mission. Determining the origin intensity factors makes it possible to evaluate the disturbing potential and gravity disturbances directly on the Earth's surface, where the source points are located. To achieve high-resolution numerical solutions, large-scale parallel computations are performed on a cluster with 1 TB of distributed memory, and an iterative elimination of far zones' contributions is applied.
How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing

NASA Astrophysics Data System (ADS)

Decyk, V. K.; Dauger, D. E.

We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
Formation of Electrostatic Potential Drops in the Auroral Zone

NASA Technical Reports Server (NTRS)

Schriver, D.; Ashour-Abdalla, M.; Richard, R. L.

2001-01-01

In order to examine the self-consistent formation of large-scale quasi-static parallel electric fields in the auroral zone on a micro/meso scale, a particle-in-cell simulation has been developed. The code resolves electron Debye length scales so that electron micro-processes are included, and a variable grid scheme is used such that the overall length scale of the simulation is of the order of an Earth radius along the magnetic field. The simulation is electrostatic and includes the magnetic mirror force, as well as two types of plasmas, a cold dense ionospheric plasma and a warm tenuous magnetospheric plasma. In order to study the formation of parallel electric fields in the auroral zone, different magnetospheric ion and electron inflow boundary conditions are used to drive the system. It has been found that for conditions in the primary (upward) current region an upward directed quasi-static electric field can form across the system due to magnetic mirroring of the magnetospheric ions and electrons at different altitudes. For conditions in the return (downward) current region it is shown that a quasi-static parallel electric field in the opposite sense of that in the primary current region is formed, i.e., the parallel electric field is directed earthward. The conditions under which these different electric fields can form are discussed using satellite observations and numerical simulations.
Signal-domain optimization metrics for MPRAGE RF pulse design in parallel transmission at 7 tesla.

PubMed

Gras, V; Vignaud, A; Mauconduit, F; Luong, M; Amadon, A; Le Bihan, D; Boulant, N

2016-11-01

Standard radiofrequency pulse design strategies focus on minimizing the deviation of the flip angle from a target value, which is sufficient but not necessary for signal homogeneity. An alternative approach, based directly on the signal, is proposed here for the MPRAGE sequence, and is developed in the parallel transmission framework with the use of the kT-points parametrization. The flip angle-homogenizing and the proposed methods were investigated numerically under explicit power and specific absorption rate constraints and tested experimentally in vivo on a 7 T parallel transmission system enabling real time local specific absorption rate monitoring. Radiofrequency pulse performance was assessed by a careful analysis of the signal and contrast between white and gray matter. Despite a slight reduction of the flip angle uniformity, an improved signal and contrast homogeneity with a significant reduction of the specific absorption rate was achieved with the proposed metric in comparison with standard pulse designs. The proposed joint optimization of the inversion and excitation pulses enables significant reduction of the specific absorption rate in the MPRAGE sequence while preserving image quality. The work reported thus unveils a possible direction to increase the potential of ultra-high field MRI and parallel transmission. Magn Reson Med 76:1431-1442, 2016. © 2015 International Society for Magnetic Resonance in Medicine.
Automating FEA programming

NASA Technical Reports Server (NTRS)

Sharma, Naveen

1992-01-01

In this paper we briefly describe a combined symbolic and numeric approach for solving mathematical models on parallel computers. An experimental software system, PIER, is being developed in Common Lisp to synthesize computationally intensive and domain formulation dependent phases of finite element analysis (FEA) solution methods. Quantities for domain formulation like shape functions, element stiffness matrices, etc., are automatically derived using symbolic mathematical computations. The problem specific information and derived formulae are then used to generate (parallel) numerical code for FEA solution steps. A constructive approach to specify a numerical program design is taken. The code generator compiles application oriented input specifications into (parallel) FORTRAN77 routines with the help of built-in knowledge of the particular problem, numerical solution methods and the target computer.
Logarithmic Superdiffusion in Two Dimensional Driven Lattice Gases

NASA Astrophysics Data System (ADS)

Krug, J.; Neiss, R. A.; Schadschneider, A.; Schmidt, J.

2018-03-01

The spreading of density fluctuations in two-dimensional driven diffusive systems is marginally anomalous. Mode coupling theory predicts that the diffusivity in the direction of the drive diverges with time as (ln t)^{2/3} with a prefactor depending on the macroscopic current-density relation and the diffusion tensor of the fluctuating hydrodynamic field equation. Here we present the first numerical verification of this behavior for a particular version of the two-dimensional asymmetric exclusion process. Particles jump strictly asymmetrically along one of the lattice directions and symmetrically along the other, and an anisotropy parameter p governs the ratio between the two rates. Using a novel massively parallel coupling algorithm that strongly reduces the fluctuations in the numerical estimate of the two-point correlation function, we are able to accurately determine the exponent of the logarithmic correction. In addition, the variation of the prefactor with p provides a stringent test of mode coupling theory.
Pure quasi-P wave equation and numerical solution in 3D TTI media

NASA Astrophysics Data System (ADS)

Zhang, Jian-Min; He, Bing-Shou; Tang, Huai-Gu

2017-03-01

Based on the pure quasi-P wave equation in transverse isotropic media with a vertical symmetry axis (VTI media), a quasi-P wave equation is obtained in transverse isotropic media with a tilted symmetry axis (TTI media). This is achieved using projection transformation, which rotates the direction vector in the coordinate system of observation toward the direction vector for the coordinate system in which the z-component is parallel to the symmetry axis of the TTI media. The equation has a simple form, is easily calculated, is not influenced by the pseudo-shear wave, and can be calculated reliably when δ is greater than ɛ. The finite difference method is used to solve the equation. In addition, a perfectly matched layer (PML) absorbing boundary condition is obtained for the equation. Theoretical analysis and numerical simulation results with forward modeling prove that the equation can accurately simulate a quasi-P wave in TTI medium.
Minimizing Concentration Effects in Water-Based, Laminar-Flow Condensation Particle Counters

PubMed Central

Lewis, Gregory S.; Hering, Susanne V.

2013-01-01

Concentration effects in water condensation systems, such as used in the water-based condensation particle counter, are explored through numeric modeling and direct measurements. Modeling shows that the condensation heat release and vapor depletion associated with particle activation and growth lowers the peak supersaturation. At higher number concentrations, the diameter of the droplets formed is smaller, and the threshold particle size for activation is higher. This occurs in both cylindrical and parallel plate geometries. For water-based systems we find that condensational heat release is more important than is vapor depletion. We also find that concentration effects can be minimized through use of smaller tube diameters, or more closely spaced parallel plates. Experimental measurements of droplet diameter confirm modeling results. PMID:24436507
DOE Office of Scientific and Technical Information (OSTI.GOV)

Spotz, William F.

PyTrilinos is a set of Python interfaces to compiled Trilinos packages. This collection supports serial and parallel dense linear algebra, serial and parallel sparse linear algebra, direct and iterative linear solution techniques, algebraic and multilevel preconditioners, nonlinear solvers and continuation algorithms, eigensolvers and partitioning algorithms. Also included are a variety of related utility functions and classes, including distributed I/O, coloring algorithms and matrix generation. PyTrilinos vector objects are compatible with the popular NumPy Python package. As a Python front end to compiled libraries, PyTrilinos takes advantage of the flexibility and ease of use of Python, and the efficiency of the underlying C++, C and Fortran numerical kernels. This paper covers recent, previously unpublished advances in the PyTrilinos package.

High performance Python for direct numerical simulations of turbulent flows

NASA Astrophysics Data System (ADS)

Mortensen, Mikael; Langtangen, Hans Petter

2016-06-01

Direct Numerical Simulation (DNS) of the Navier-Stokes equations is an invaluable research tool in fluid dynamics. Still, there are few publicly available research codes and, due to the heavy number crunching implied, available codes are usually written in low-level languages such as C/C++ or Fortran. In this paper we describe a pure scientific Python pseudo-spectral DNS code that nearly matches the performance of C++ for thousands of processors and billions of unknowns. We also describe a version optimized through Cython that is found to match the speed of C++. The solvers are written from scratch in Python, including the mesh, the MPI domain decomposition, and the temporal integrators. The solvers have been verified and benchmarked on the Shaheen supercomputer at the KAUST supercomputing laboratory, and we are able to show very good scaling up to several thousand cores. A very important part of the implementation is the mesh decomposition (we implement both slab and pencil decompositions) and 3D parallel Fast Fourier Transforms (FFT). The mesh decomposition and FFT routines have been implemented in Python using serial FFT routines (either NumPy, pyFFTW or any other serial FFT module), NumPy array manipulations and with MPI communications handled by MPI for Python (mpi4py). We show how we are able to execute a 3D parallel FFT in Python for a slab mesh decomposition using 4 lines of compact Python code, for which the parallel performance on Shaheen is found to be slightly better than similar routines provided through the FFTW library. For a pencil mesh decomposition, 7 lines of code are required to execute a transform.
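As a rough illustration of the slab decomposition mentioned in the abstract above, the following sketch emulates the data movement of a slab-decomposed 3D FFT with plain NumPy. It is not the authors' code: the function name `slab_fft3d` and the parameter `nprocs` are hypothetical, and the MPI all-to-all exchange is replaced by in-memory array splitting so that the example runs in a single process.

```python
import numpy as np

def slab_fft3d(u, nprocs):
    """Emulate a slab-decomposed 3D FFT on `nprocs` hypothetical ranks.

    Assumption: the first two array dimensions are divisible by `nprocs`.
    Lists of arrays stand in for the per-rank local data.
    """
    n0, n1, _ = u.shape
    assert n0 % nprocs == 0 and n1 % nprocs == 0

    # Each "rank" owns a slab of contiguous planes along axis 0.
    slabs = np.split(u, nprocs, axis=0)

    # Step 1: local 2D FFTs over the two axes that are complete on each rank.
    slabs = [np.fft.fft2(s, axes=(1, 2)) for s in slabs]

    # Step 2: global transpose (stand-in for an MPI all-to-all).
    # blocks[src][dst] is the piece that rank `src` would send to rank `dst`.
    blocks = [np.split(s, nprocs, axis=1) for s in slabs]
    columns = [
        np.concatenate([blocks[src][dst] for src in range(nprocs)], axis=0)
        for dst in range(nprocs)
    ]

    # Step 3: axis 0 is now complete on each rank, so transform it locally.
    columns = [np.fft.fft(c, axis=0) for c in columns]

    # Gather the distributed result into one array (only for checking).
    return np.concatenate(columns, axis=1)
```

In an actual mpi4py program each list entry would live on a different rank and step 2 would be a collective communication call. Here the gathered result can simply be compared with `np.fft.fftn(u)`.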
Direct numerical simulation of vacillation in convection induced by centrifugal buoyancy

NASA Astrophysics Data System (ADS)

Pitz, Diogo B.; Marxen, Olaf; Chew, John W.

2017-11-01

Flows induced by centrifugal buoyancy occur in industrial systems, such as in the compressor cavities of gas turbines, as well as in flows of geophysical interest. In this numerical study we use direct numerical simulation (DNS) to investigate the transition from the steady waves regime, which is characterized by great regularity, to the vacillation regime, which is critical to understanding transition to the fully turbulent regime. From previous work it is known that the onset of convection occurs in the form of pairs of nearly-circular rolls which span the entire axial length of the cavity, with small deviations near the parallel, no-slip end walls. When non-linearity sets in, triadic interactions occur and, depending on the value of the centrifugal Rayleigh number, the flow is dominated by either a single mode and its harmonics or by broadband effects if turbulence develops. In this study we increase the centrifugal Rayleigh number progressively and investigate mode interactions during the vacillation regime which eventually lead to chaotic motion. Diogo B. Pitz acknowledges the financial support from the Capes foundation through the Science without Borders program.
Perspectives on the Future of CFD

NASA Technical Reports Server (NTRS)

Kwak, Dochan

2000-01-01

This viewgraph presentation gives an overview of the future of computational fluid dynamics (CFD), which in the past has pioneered the field of flow simulation. Over time CFD has progressed along with computing power. Numerical methods have advanced as CPU and memory capacity have increased. Complex configurations are routinely computed now and direct numerical simulations (DNS) and large eddy simulations (LES) are used to study turbulence. As the computing resources changed to parallel and distributed platforms, computer science aspects such as scalability (algorithmic and implementation) and portability and transparent codings have advanced. Examples of potential future (or current) challenges include risk assessment, limitations of the heuristic model, and the development of CFD and information technology (IT) tools.
CFD Mixing Analysis of Jets Injected from Straight and Slanted Slots into Confined Crossflow in Rectangular Ducts

NASA Technical Reports Server (NTRS)

Bain, D. B.; Smith, C. E.; Holdeman, J. D.

1992-01-01

A CFD study was performed to analyze the mixing potential of opposed rows of staggered jets injected into confined crossflow in a rectangular duct. Three jet configurations were numerically tested: (1) straight (0 deg) slots; (2) perpendicular slanted (45 deg) slots angled in opposite directions on top and bottom walls; and (3) parallel slanted (45 deg) slots angled in the same direction on top and bottom walls. All three configurations were tested at slot spacing-to-duct height ratios (S/H) of 0.5, 0.75, and 1.0; a jet-to-mainstream momentum flux ratio (J) of 100; and a jet-to-mainstream mass flow ratio of 0.383. Each configuration had its best mixing performance at S/H of 0.75. Asymmetric flow patterns were expected and predicted for all slanted slot configurations. The parallel slanted slot configuration was the best overall configuration at x/H of 1.0 for S/H of 0.75.
Shear-induced chaos

NASA Astrophysics Data System (ADS)

Lin, Kevin K.; Young, Lai-Sang

2008-05-01

Guided by a geometric understanding developed in earlier works of Wang and Young, we carry out numerical studies of shear-induced chaos in several parallel but different situations. The settings considered include periodic kicking of limit cycles, random kicks at Poisson times and continuous-time driving by white noise. The forcing of a quasi-periodic model describing two coupled oscillators is also investigated. In all cases, positive Lyapunov exponents are found in suitable parameter ranges when the forcing is suitably directed.
Palladium-Catalyzed Nitromethylation of Aryl Halides: An Orthogonal Formylation Equivalent

PubMed Central

Walvoord, Ryan R.; Berritt, Simon; Kozlowski, Marisa C.

2012-01-01

An efficient cross-coupling reaction of aryl halides and nitromethane was developed with the use of parallel microscale experimentation. The arylnitromethane products are precursors for numerous useful synthetic products. An efficient method for their direct conversion to the corresponding oximes and aldehydes in a one-pot operation has been discovered. The process exploits inexpensive nitromethane as a carbonyl equivalent, providing a mild and convenient formylation method that is compatible with many functional groups. PMID:22839593
Influence of fast advective flows on pattern formation of Dictyostelium discoideum

PubMed Central

Bae, Albert; Zykov, Vladimir; Bodenschatz, Eberhard

2018-01-01

We report experimental and numerical results on pattern formation of self-organizing Dictyostelium discoideum cells in a microfluidic setup under a constant buffer flow. The external flow advects the signaling molecule cyclic adenosine monophosphate (cAMP) downstream, while the chemotactic cells attached to the solid substrate are not transported with the flow. At high flow velocities, elongated cAMP waves are formed that cover the whole length of the channel and propagate both parallel and perpendicular to the flow direction. While the wave period and transverse propagation velocity are constant, parallel wave velocity and the wave width increase linearly with the imposed flow. We also observe that the acquired wave shape is highly dependent on the wave generation site and the strength of the imposed flow. We compared the wave shape and velocity with numerical simulations performed using a reaction-diffusion model and found excellent agreement. These results are expected to play an important role in understanding the process of pattern formation and aggregation of D. discoideum that may experience fluid flows in its natural habitat. PMID:29590179
Meso-modeling of Carbon Fiber Composite for Crash Safety Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lin, Shih-Po; Chen, Yijung; Zeng, Danielle

2017-04-06

In the conventional approach, the material properties for crash safety simulations are typically obtained from standard coupon tests, where the test results only provide single layer material properties used in crash simulations. However, the lay-up effects for the failure behaviors of the real structure were not considered in numerical simulations. Hence, there was discrepancy between the crash simulations and experimental tests. Consequently, an intermediate stage is required for accurate predictions. Some component tests are required to correlate the material models in the intermediate stage. In this paper, a Mazda Tube under high-impact velocity is chosen as an example for the crash safety analysis. The tube consists of 24 layers of uni-directional (UD) carbon fiber composite materials, in which 4 layers are perpendicular to, while the other layers are parallel to the impact direction. An LS-DYNA meso-model was constructed with orthotropic material models accounting for the single-layer material behaviors. Between layers, a node-based tie-break contact was used for modeling the delamination of the composite material. Since fiber directions are not single-oriented, the lay-up effects could be an important effect. From the first numerical trial, premature material failure occurred due to the use of material parameters obtained directly from the coupon tests. Some parametric studies were conducted to identify the cause of the numerical instability. The finding is that the material failure strength used in the numerical model needs to be enlarged to stabilize the numerical model. Some hypothesis was made to provide the foundation for enlarging the failure strength and the corresponding experiments will be conducted to validate the hypothesis.
Parallel language constructs for tensor product computations on loosely coupled architectures

NASA Technical Reports Server (NTRS)

Mehrotra, Piyush; Vanrosendale, John

1989-01-01

Distributed memory architectures offer high levels of performance and flexibility, but have proven awkward to program. Current languages for nonshared memory architectures provide a relatively low level programming environment, and are poorly suited to modular programming, and to the construction of libraries. A set of language primitives designed to allow the specification of parallel numerical algorithms at a higher level is described. Tensor product array computations are focused on along with a simple but important class of numerical algorithms. The problem of programming 1-D kernel routines is focused on first, such as parallel tridiagonal solvers, and then how such parallel kernels can be combined to form parallel tensor product algorithms is examined.
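To make the idea of combining 1-D kernels into a tensor product algorithm more concrete, here is a minimal serial sketch. It is not taken from the report: the function names `thomas` and `tensor_product_solve` and the band storage convention are assumptions chosen for illustration. A tridiagonal solver (the Thomas algorithm) acts as the 1-D kernel and is applied along each array dimension in turn, which solves a system whose matrix is the Kronecker product of two tridiagonal matrices.

```python
import numpy as np

def thomas(lower, diag, upper, rhs):
    """1-D kernel: solve a tridiagonal system by the Thomas algorithm.

    Assumed storage: lower[i] = A[i, i-1] (lower[0] unused),
    upper[i] = A[i, i+1] (upper[-1] unused). `rhs` may be 2-D, in which
    case every column is solved with the same matrix. No pivoting is done,
    so the matrix is assumed to be diagonally dominant.
    """
    n = diag.shape[0]
    c = np.zeros(n)
    d = np.array(rhs, dtype=float)  # working copy of the right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = d[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (d[i] - lower[i] * d[i - 1]) / denom
    for i in range(n - 2, -1, -1):  # back substitution
        d[i] = d[i] - c[i] * d[i + 1]
    return d

def tensor_product_solve(tri_a, tri_b, f):
    """Solve A X B^T = F by applying the 1-D kernel along each axis."""
    y = thomas(*tri_a, f)           # solves along axis 0: Y = A^{-1} F
    return thomas(*tri_b, y.T).T    # solves along axis 1: X = Y B^{-T}
```

On a distributed-memory machine the independent column solves inside each call are the natural unit of parallel work; this sketch only shows the algebraic structure, not the data distribution.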
Parallel processing in finite element structural analysis

NASA Technical Reports Server (NTRS)

Noor, Ahmed K.

1987-01-01

A brief review is made of the fundamental concepts and basic issues of parallel processing. Discussion focuses on parallel numerical algorithms, performance evaluation of machines and algorithms, and parallelism in finite element computations. A computational strategy is proposed for maximizing the degree of parallelism at different levels of the finite element analysis process including: 1) formulation level (through the use of mixed finite element models); 2) analysis level (through additive decomposition of the different arrays in the governing equations into the contributions to a symmetrized response plus correction terms); 3) numerical algorithm level (through the use of operator splitting techniques and application of iterative processes); and 4) implementation level (through the effective combination of vectorization, multitasking and microtasking, whenever available).
Numerical study of the existence criterion for the reversed shear Alfven eigenmode in the presence of a parallel equilibrium current

NASA Astrophysics Data System (ADS)

Shahzad, M.; Rizvi, H.; Panwar, A.; Ryu, C. M.

2017-06-01

We have re-visited the existence criterion of the reverse shear Alfven eigenmodes (RSAEs) in the presence of the parallel equilibrium current by numerically solving the eigenvalue equation using a fast eigenvalue solver code KAES. The parallel equilibrium current can bring in the kink effect and is known to be strongly unfavorable for the RSAE. We have numerically estimated the critical value of the toroidicity factor Qtor in a circular tokamak plasma, above which RSAEs can exist, and compared it to the analytical one. The difference between the numerical and analytical critical values is small for low frequency RSAEs, but it increases as the frequency of the mode increases, becoming greater for higher poloidal harmonic modes.
Three-dimensional finite amplitude electroconvection in dielectric liquids

NASA Astrophysics Data System (ADS)

Luo, Kang; Wu, Jian; Yi, Hong-Liang; Tan, He-Ping

2018-02-01

Charge injection induced electroconvection in a dielectric liquid lying between two parallel plates is numerically simulated in three dimensions (3D) using a unified lattice Boltzmann method (LBM). Cellular flow patterns and their subcritical bifurcation phenomena of 3D electroconvection are numerically investigated for the first time. A unit conversion is also derived to connect the LBM system to the real physical system. The 3D LBM codes are validated by three carefully chosen cases and all results are found to be highly consistent with the analytical solutions or other numerical studies. For strong injection, the steady state roll, polygon, and square flow patterns are observed under different initial disturbances. Numerical results show that the hexagonal cell with the central region being empty of charge and centrally downward flow is preferred in symmetric systems under random initial disturbance. For weak injection, the numerical results show that the flow directly passes from the motionless state to turbulence once the system loses its linear stability. In addition, the numerically predicted linear and finite amplitude stability criteria of different flow patterns are discussed.
On the wall-normal velocity of the compressible boundary-layer equations

NASA Technical Reports Server (NTRS)

Pruett, C. David

1991-01-01

Numerical methods for the compressible boundary-layer equations are facilitated by transformation from the physical (x,y) plane to a computational (xi,eta) plane in which the evolution of the flow is 'slow' in the time-like xi direction. The commonly used Levy-Lees transformation results in a computationally well-behaved problem for a wide class of non-similar boundary-layer flows, but it complicates interpretation of the solution in physical space. Specifically, the transformation is inherently nonlinear, and the physical wall-normal velocity is transformed out of the problem and is not readily recovered. In light of recent research which shows mean-flow non-parallelism to significantly influence the stability of high-speed compressible flows, the contribution of the wall-normal velocity in the analysis of stability should not be routinely neglected. Conventional methods extract the wall-normal velocity in physical space from the continuity equation, using finite-difference techniques and interpolation procedures. The present spectrally-accurate method extracts the wall-normal velocity directly from the transformation itself, without interpolation, leaving the continuity equation free as a check on the quality of the solution. The present method for recovering wall-normal velocity, when used in conjunction with a highly-accurate spectral collocation method for solving the compressible boundary-layer equations, results in a discrete solution which is extraordinarily smooth and accurate, and which satisfies the continuity equation nearly to machine precision. These qualities make the method well suited to the computation of the non-parallel mean flows needed by spatial direct numerical simulations (DNS) and parabolized stability equation (PSE) approaches to the analysis of stability.
An asymptotic-preserving Lagrangian algorithm for the time-dependent anisotropic heat transport equation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chacon, Luis; del-Castillo-Negrete, Diego; Hauck, Cory D.

2014-09-01

We propose a Lagrangian numerical algorithm for a time-dependent, anisotropic temperature transport equation in magnetized plasmas in the large guide field regime. The approach is based on an analytical integral formal solution of the parallel (i.e., along the magnetic field) transport equation with sources, and it is able to accommodate both local and non-local parallel heat flux closures. The numerical implementation is based on an operator-split formulation, with two straightforward steps: a perpendicular transport step (including sources), and a Lagrangian (field-line integral) parallel transport step. Algorithmically, the first step is amenable to the use of modern iterative methods, while the second step has a fixed cost per degree of freedom (and is therefore scalable). Accuracy-wise, the approach is free from the numerical pollution introduced by the discrete parallel transport term when the perpendicular to parallel transport coefficient ratio χ⊥/χ∥ becomes arbitrarily small, and is shown to capture the correct limiting solution when ε = χ⊥L∥²/(χ∥L⊥²) → 0 (with L∥, L⊥ the parallel and perpendicular diffusion length scales, respectively). Therefore, the approach is asymptotic-preserving. We demonstrate the capabilities of the scheme with several numerical experiments with varying magnetic field complexity in two dimensions, including the case of transport across a magnetic island.
Broadband ground motion simulation using a paralleled hybrid approach of Frequency Wavenumber and Finite Difference method

NASA Astrophysics Data System (ADS)

Chen, M.; Wei, S.

2016-12-01

The serious damage to Mexico City caused by the 1985 Michoacan earthquake 400 km away indicates that urban areas may be affected by remote earthquakes. To assess the earthquake risk that distant earthquakes pose to urban areas, we developed a hybrid Frequency Wavenumber (FK) and Finite Difference (FD) code implemented with MPI, since the computation of seismic wave propagation from a distant earthquake using a single numerical method (e.g. Finite Difference, Finite Element or Spectral Element) is very expensive. In our approach, we compute the incident wave field (ud) at the boundaries of the excitation box, which surrounds the local structure, using a parallelized FK method (Zhu and Rivera, 2002), and compute the total wave field (u) within the excitation box using a parallelized 2D FD method. We apply a perfectly matched layer (PML) absorbing boundary condition to the diffracted wave field (u-ud). Compared to the previous Generalized Ray Theory and Finite Difference (Wen and Helmberger, 1998), Frequency Wavenumber and Spectral Element (Tong et al., 2014), and Direct Solution Method and Spectral Element (Monteiller et al., 2013) hybrid methods, our absorbing boundary condition dramatically suppresses the numerical noise. The MPI implementation of our method can greatly speed up the calculation. In addition, our hybrid method has a potential use in high resolution array imaging similar to Tong et al. (2014).
The novel implicit LU-SGS parallel iterative method based on the diffusion equation of a nuclear reactor on a GPU cluster

NASA Astrophysics Data System (ADS)

Zhang, Jilin; Sha, Chaoqun; Wu, Yusen; Wan, Jian; Zhou, Li; Ren, Yongjian; Si, Huayou; Yin, Yuyu; Jing, Ya

2017-02-01

GPU not only is used in the field of graphic technology but also has been widely used in areas needing a large number of numerical calculations. In the energy industry, because of low carbon, high energy density, high duration and other characteristics, the development of nuclear energy cannot easily be replaced by other energy sources. Management of core fuel is one of the major areas of concern in a nuclear power plant, and it is directly related to the economic benefits and cost of nuclear power. The large-scale reactor core expansion equation is large and complicated, so the calculation of the diffusion equation is crucial in the core fuel management process. In this paper, we use CUDA programming technology on a GPU cluster to run the LU-SGS parallel iterative calculation against the background of the diffusion equation of the reactor. We divide one-dimensional and two-dimensional mesh into a plurality of domains, with each domain evenly distributed on the GPU blocks. A parallel collision scheme is put forward that defines the virtual boundary of the grid exchange information and data transmission by non-stop collision. Compared with the serial program, the experiment shows that GPU greatly improves the efficiency of program execution and verifies that GPU is playing a much more important role in the field of numerical calculations.
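The abstract above does not give the details of the LU-SGS scheme, so the following is only a serial sketch of its basic building block, a symmetric Gauss-Seidel iteration (one forward sweep followed by one backward sweep), applied to an assumed model problem: the steady 1-D diffusion equation -D u'' + sigma_a u = S with zero boundary values. The function name, the parameters and the model problem are all hypothetical; the GPU domain decomposition and the boundary exchange scheme of the paper are not reproduced.

```python
import numpy as np

def sgs_diffusion_1d(source, diff_coef, sigma_a, h, tol=1e-12, max_iter=5000):
    """Symmetric Gauss-Seidel for -D u'' + sigma_a u = S, u = 0 at both ends.

    Uses the standard three-point finite-difference stencil on a uniform
    mesh with spacing h; `source` holds S at the interior nodes.
    """
    n = source.size
    off = -diff_coef / h**2                    # off-diagonal matrix entry
    diag = 2.0 * diff_coef / h**2 + sigma_a    # diagonal matrix entry
    u = np.zeros(n)
    for _ in range(max_iter):
        u_old = u.copy()
        # Forward sweep (lower-triangular part), then backward sweep (upper).
        for order in (range(n), range(n - 1, -1, -1)):
            for i in order:
                left = u[i - 1] if i > 0 else 0.0
                right = u[i + 1] if i < n - 1 else 0.0
                u[i] = (source[i] - off * (left + right)) / diag
        if np.max(np.abs(u - u_old)) < tol:
            break
    return u
```

The sweeps are inherently sequential within one domain, which is why a parallel version has to split the mesh into subdomains and exchange values at the subdomain boundaries.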
A Ratiometric Wavelength Measurement Based on a Silicon-on-Insulator Directional Coupler Integrated Device

PubMed Central

Wang, Pengfei; Hatta, Agus Muhamad; Zhao, Haoyu; Zheng, Jie; Farrell, Gerald; Brambilla, Gilberto

2015-01-01

A ratiometric wavelength measurement based on a Silicon-on-Insulator (SOI) integrated device is proposed and designed, which consists of directional couplers acting as two edge filters with opposite spectral responses. The optimal separation distance between two parallel silicon waveguides and the interaction length of the directional coupler are designed to meet the desired spectral response by using local supermodes. The wavelength discrimination ability of the designed ratiometric structure is demonstrated by a beam propagation method numerically and then is verified experimentally. The experimental results have shown a general agreement with the theoretical models. The ratiometric wavelength system demonstrates a resolution of better than 50 pm at a wavelength around 1550 nm with ease of assembly and calibration. PMID:26343668
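The principle of a ratiometric measurement with two edge filters of opposite slope can be sketched numerically as follows. All numbers here are invented for illustration (a 1500 to 1600 nm range and linear-in-dB filter responses) and are not taken from the paper; the function names are hypothetical as well. The point is that the ratio of the two detected powers depends on wavelength but not on the input power, so a calibration curve can be inverted to recover the wavelength.

```python
import numpy as np

# Hypothetical operating range and filter responses (not from the paper).
LAMBDA_MIN, LAMBDA_MAX = 1500.0, 1600.0  # nm

def filter_a_db(lam):
    """Edge filter with falling response: -5 dB down to -25 dB."""
    return -5.0 - 20.0 * (lam - LAMBDA_MIN) / (LAMBDA_MAX - LAMBDA_MIN)

def filter_b_db(lam):
    """Edge filter with the opposite (rising) response."""
    return -25.0 + 20.0 * (lam - LAMBDA_MIN) / (LAMBDA_MAX - LAMBDA_MIN)

def measured_ratio_db(lam, input_power_dbm):
    """Ratio of the two detector powers; the input power cancels out."""
    power_a = input_power_dbm + filter_a_db(lam)
    power_b = input_power_dbm + filter_b_db(lam)
    return power_a - power_b

def wavelength_from_ratio(ratio_db, n_cal=1001):
    """Invert a calibration table of ratio versus wavelength."""
    lam_grid = np.linspace(LAMBDA_MIN, LAMBDA_MAX, n_cal)
    ratio_grid = filter_a_db(lam_grid) - filter_b_db(lam_grid)
    # np.interp needs increasing x values; the ratio decreases with
    # wavelength here, so both arrays are reversed.
    return np.interp(ratio_db, ratio_grid[::-1], lam_grid[::-1])
```

For example, `wavelength_from_ratio(measured_ratio_db(1550.05, -3.0))` recovers the assumed wavelength regardless of the input power level used.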
Disappearance of Anisotropic Intermittency in Large-amplitude MHD Turbulence and Its Comparison with Small-amplitude MHD Turbulence

NASA Astrophysics Data System (ADS)

Yang, Liping; Zhang, Lei; He, Jiansen; Tu, Chuanyi; Li, Shengtai; Wang, Xin; Wang, Linghua

2018-03-01

Multi-order structure functions in the solar wind are reported to display a monofractal scaling when sampled parallel to the local magnetic field and a multifractal scaling when measured perpendicularly. Whether and to what extent will the scaling anisotropy be weakened by the enhancement of turbulence amplitude relative to the background magnetic strength? In this study, based on two runs of the magnetohydrodynamic (MHD) turbulence simulation with different relative levels of turbulence amplitude, we investigate and compare the scaling of multi-order magnetic structure functions and magnetic probability distribution functions (PDFs) as well as their dependence on the direction of the local field. The numerical results show that for the case of large-amplitude MHD turbulence, the multi-order structure functions display a multifractal scaling at all angles to the local magnetic field, with PDFs deviating significantly from the Gaussian distribution and a flatness larger than 3 at all angles. In contrast, for the case of small-amplitude MHD turbulence, the multi-order structure functions and PDFs have different features in the quasi-parallel and quasi-perpendicular directions: a monofractal scaling and Gaussian-like distribution in the former, and a conversion of a monofractal scaling and Gaussian-like distribution into a multifractal scaling and non-Gaussian tail distribution in the latter. These results hint that when intermittencies are abundant and intense, the multifractal scaling in the structure functions can appear even if it is in the quasi-parallel direction; otherwise, the monofractal scaling in the structure functions remains even if it is in the quasi-perpendicular direction.
Load Forecasting Based Distribution System Network Reconfiguration -- A Distributed Data-Driven Approach

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, Huaiguang; Zhang, Yingchen; Muljadi, Eduard

In this paper, a short-term load forecasting approach based network reconfiguration is proposed in a parallel manner. Specifically, a support vector regression (SVR) based short-term load forecasting approach is designed to provide an accurate load prediction and benefit the network reconfiguration. Because of the nonconvexity of the three-phase balanced optimal power flow, a second-order cone program (SOCP) based approach is used to relax the optimal power flow problem. Then, the alternating direction method of multipliers (ADMM) is used to compute the optimal power flow in a distributed manner. Considering the limited number of the switches and the increasing computation capability, the proposed network reconfiguration is solved in a parallel way. The numerical results demonstrate the feasibility and effectiveness of the proposed approach.
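The optimal power flow formulation itself is beyond a short example, but the way ADMM splits a problem across agents can be shown on a toy consensus problem. The sketch below is an assumption-laden stand-in, not the method of the paper: each hypothetical agent i holds a private value a_i and a local cost (x_i - a_i)^2 / 2, and all agents must agree on a common value z. The function name and the penalty parameter `rho` are illustrative.

```python
import numpy as np

def consensus_admm(a, rho=1.0, n_iter=100):
    """Toy global-consensus ADMM: minimize sum_i 0.5*(x_i - a_i)^2, x_i = z.

    The x-update is local to each agent (and could run in parallel);
    only the averaging step needs communication.
    """
    a = np.asarray(a, dtype=float)
    x = np.zeros_like(a)   # local copies
    u = np.zeros_like(a)   # scaled dual variables
    z = 0.0                # shared consensus variable
    for _ in range(n_iter):
        # Local step: closed-form minimizer of 0.5*(x-a)^2 + rho/2*(x-z+u)^2.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # Coordination step: average of the local estimates.
        z = np.mean(x + u)
        # Dual update: accumulate the remaining disagreement.
        u = u + x - z
    return z, x
```

For this quadratic toy problem the consensus value converges to the mean of the private values, which gives a simple correctness check.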
The accuracy of the compressible Reynolds equation for predicting the local pressure in gas-lubricated textured parallel slider bearings

PubMed Central

Qiu, Mingfeng; Bailey, Brian N.; Stoll, Rob

2014-01-01

The validity of the compressible Reynolds equation to predict the local pressure in a gas-lubricated, textured parallel slider bearing is investigated. The local bearing pressure is numerically simulated using the Reynolds equation and the Navier-Stokes equations for different texture geometries and operating conditions. The respective results are compared and the simplifying assumptions inherent in the application of the Reynolds equation are quantitatively evaluated. The deviation between the local bearing pressure obtained with the Reynolds equation and the Navier-Stokes equations increases with increasing texture aspect ratio, because a significant cross-film pressure gradient and a large velocity gradient in the sliding direction develop in the lubricant film. Inertia is found to be negligible throughout this study. PMID:25049440

Macro-actor execution on multilevel data-driven architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gaudiot, J.L.; Najjar, W.

1988-12-31

The data-flow model of computation brings to multiprocessors high programmability at the expense of increased overhead. Applying the model at a higher level leads to better performance but also introduces loss of parallelism. We demonstrate here syntax directed program decomposition methods for the creation of large macro-actors in numerical algorithms. In order to alleviate some of the problems introduced by the lower resolution interpretation, we describe a multi-level of resolution and analyze the requirements for its actual hardware and software integration.
Numerical Studies of Boundary-Layer Receptivity

NASA Technical Reports Server (NTRS)

Reed, Helen L.

1995-01-01

Direct numerical simulations (DNS) of the acoustic receptivity process on a semi-infinite flat plate with a modified-super-elliptic (MSE) leading edge are performed. The incompressible Navier-Stokes equations are solved in stream-function/vorticity form in a general curvilinear coordinate system. The steady basic-state solution is found by solving the governing equations using an alternating direction implicit (ADI) procedure which takes advantage of the parallelism present in line-splitting techniques. Time-harmonic oscillations of the farfield velocity are applied as unsteady boundary conditions to the unsteady disturbance equations. An efficient time-harmonic scheme is used to produce the disturbance solutions. Buffer-zone techniques have been applied to eliminate wave reflection from the outflow boundary. The spatial evolution of Tollmien-Schlichting (T-S) waves is analyzed and compared with experiment and theory. The effects of nose-radius, frequency, Reynolds number, angle of attack, and amplitude of the acoustic wave are investigated. This work is being performed in conjunction with the experiments at the Arizona State University Unsteady Wind Tunnel under the direction of Professor William Saric. The simulations are of the same configuration and parameters used in the wind-tunnel experiments.
A NUMERICAL SIMULATION OF COSMIC RAY MODULATION NEAR THE HELIOPAUSE. II. SOME PHYSICAL INSIGHTS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luo, Xi; Feng, Xueshang; Potgieter, Marius S.

Cosmic ray (CR) transport near the heliopause (HP) is studied using a hybrid transport model, with the parameters constrained by observations from the Voyager 1 spacecraft. We simulate the CR radial flux along different directions in the heliosphere. There is no well-defined thin layer between the solar wind region and the interstellar region along the tail and polar directions of the heliosphere. By analyzing the radial flux curve along the direction of Voyager 2, together with its trajectory information, the crossing time of the HP by Voyager 2 is predicted to be in 2017.14. We simulate the CR radial flux for different energy values along the direction of Voyager 1. We find that there is only a modest modulation region of about 10 au wide beyond the HP, so that Voyager 1 observing the Local Interstellar Spectra is justified in numerical modeling. We analyze the heliospheric exit information of pseudo-particles in our stochastic numerical (time-backward) method, conjecturing that they represent the behavior of CR particles, and we find that pseudo-particles that have been traced from the nose region exit in the tail region. This implies that many CR particles diffuse directly from the heliospheric tail region to the nose region near the HP. In addition, when pseudo-particles were traced from the Local Interstellar Medium (LISM), it is found that their exit location (entrance for real particles) from the simulation domain is along the prescribed Interstellar Magnetic Field direction. This indicates that parallel diffusion dominates CR particle transport in the LISM.
Numerical and experimental studies of hydrodynamics of flapping foils

NASA Astrophysics Data System (ADS)

Zhou, Kai; Liu, Jun-kao; Chen, Wei-shan

2018-04-01

The bionic flapping foil is a simplified model that imitates the motion of the fins of fish or the wings of birds. In this paper, a universal kinematic model with three degrees of freedom is adopted and the motion parallel to the flow direction is considered. The force coefficients, the torque coefficient, and the flow field characteristics are extracted and analyzed. Then the propulsive efficiency is calculated. The influence of the motion parameters on the hydrodynamic performance of the bionic foil is studied. The results show that the motion parameters play important roles in the hydrodynamic performance of the flapping foil. To validate the reliability of the numerical method used in this paper, an experiment platform is designed and verification experiments are carried out. The comparison shows that the numerical results agree well with the experimental results, which indicates that the adopted numerical method is reliable. The results of this paper provide a theoretical reference for the design of underwater vehicles based on flapping propulsion.
Experimental and Numerical Study of Nozzle Plume Impingement on Spacecraft Surfaces

NASA Astrophysics Data System (ADS)

Ketsdever, A. D.; Lilly, T. C.; Gimelshein, S. F.; Alexeenko, A. A.

2005-05-01

An experimental and numerical effort was undertaken to assess the effects of a cold gas (To=300K) nozzle plume impinging on a simulated spacecraft surface. The nozzle flow impingement is investigated experimentally using a nano-Newton resolution force balance and numerically using the Direct Simulation Monte Carlo (DSMC) numerical technique. The Reynolds number range investigated in this study is from 0.5 to approximately 900 using helium and nitrogen propellants. The thrust produced by the nozzle was first assessed on a force balance to provide a baseline case. Subsequently, an aluminum plate was attached to the same force balance at various angles from 0° (parallel to the plume flow) to 10°. For low Reynolds number helium flow, a 16.5% decrease in thrust was measured for the plate at 0° relative to the free plume expansion case. For low Reynolds number nitrogen flow, the difference was found to be 12%. The thrust degradation was found to decrease at higher Reynolds numbers and larger plate angles.
Parareal in time 3D numerical solver for the LWR Benchmark neutron diffusion transient model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baudron, Anne-Marie, E-mail: anne-marie.baudron@cea.fr; CEA-DRN/DMT/SERMA, CEN-Saclay, 91191 Gif sur Yvette Cedex; Lautard, Jean-Jacques, E-mail: jean-jacques.lautard@cea.fr

2014-12-15

In this paper we present a time-parallel algorithm for the 3D neutrons calculation of a transient model in a nuclear reactor core. The neutrons calculation consists in numerically solving the time dependent diffusion approximation equation, which is a simplified transport equation. The numerical resolution is done with finite elements method based on a tetrahedral meshing of the computational domain, representing the reactor core, and time discretization is achieved using a θ-scheme. The transient model presents moving control rods during the time of the reaction. Therefore, cross-sections (piecewise constants) are taken into account by interpolations with respect to the velocity of the control rods. The parallelism across the time is achieved by an adequate use of the parareal in time algorithm to the handled problem. This parallel method is a predictor corrector scheme that iteratively combines the use of two kinds of numerical propagators, one coarse and one fine. Our method is made efficient by means of a coarse solver defined with large time step and fixed position control rods model, while the fine propagator is assumed to be a high order numerical approximation of the full model. The parallel implementation of our method provides a good scalability of the algorithm. Numerical results show the efficiency of the parareal method on large light water reactor transient model corresponding to the Langenbuch–Maurer–Werner benchmark.
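The predictor-corrector structure of the parareal algorithm described above can be sketched on a toy problem. This is only an illustration under stated assumptions, not the authors' solver: the scalar test equation y' = lam * y stands in for the discretized diffusion model, the coarse propagator is a single implicit Euler step per time slice, and the fine propagator is a θ-scheme with θ = 0.5 and many sub-steps. All function and parameter names are hypothetical.

```python
def theta_step(y, lam, dt, theta):
    """One theta-scheme step for y' = lam * y (theta = 1 is implicit Euler)."""
    return y * (1.0 + (1.0 - theta) * lam * dt) / (1.0 - theta * lam * dt)


def propagate(y, lam, t_span, n_steps, theta):
    """Advance y over t_span using n_steps theta-scheme steps."""
    dt = t_span / n_steps
    for _ in range(n_steps):
        y = theta_step(y, lam, dt, theta)
    return y


def parareal(y0, lam, t_end, n_slices, n_iter, fine_steps=50):
    """Return the parareal approximation at the n_slices + 1 slice boundaries."""
    dT = t_end / n_slices

    def coarse(y):  # cheap: one large implicit Euler step
        return propagate(y, lam, dT, 1, 1.0)

    def fine(y):  # expensive: many small Crank-Nicolson steps
        return propagate(y, lam, dT, fine_steps, 0.5)

    # Predictor: serial sweep with the coarse propagator only.
    u = [y0]
    for n in range(n_slices):
        u.append(coarse(u[n]))

    for _ in range(n_iter):
        # The fine solves are independent per slice; a parallel code
        # would run them concurrently, one slice per processor.
        f_vals = [fine(u[n]) for n in range(n_slices)]
        g_old = [coarse(u[n]) for n in range(n_slices)]
        # Corrector: serial coarse sweep plus the fine-minus-coarse defect.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n]) + f_vals[n] - g_old[n])
        u = new
    return u


def serial_fine(y0, lam, t_end, n_slices, fine_steps=50):
    """Reference: the fine propagator applied slice after slice."""
    dT = t_end / n_slices
    u = [y0]
    for n in range(n_slices):
        u.append(propagate(u[n], lam, dT, fine_steps, 0.5))
    return u
```

After k iterations the first k slices agree with the serial fine solution up to rounding, so the method pays off only when an acceptable error is reached in far fewer iterations than there are slices.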
Turbulent statistics in flow field due to interaction of two plane parallel jets

NASA Astrophysics Data System (ADS)

Bisoi, Mukul; Das, Manab Kumar; Roy, Subhransu; Patel, Devendra Kumar

2017-12-01

Turbulent characteristics of flow fields due to the interaction of two plane parallel jets separated by the jet width distance are studied. Numerical simulation is carried out by large eddy simulation with a dynamic Smagorinsky model for the sub-grid scale stresses. The energy spectra are observed to follow the -5/3 power law for the inertial sub-range. A proper orthogonal decomposition study indicates that the energy-carrying large coherent structures are present close to the nozzle exit. It is shown that these coherent structures interact with each other and finally disintegrate into smaller vortices further downstream. The turbulent fluctuations in the longitudinal and lateral directions are shown to follow a similarity. The mean flow at the same time also maintains a close similarity. Prandtl's mixing length, the Taylor microscale, and the Kolmogorov length scales are shown along the lateral direction for different downstream locations. The autocorrelation in the longitudinal and transverse directions is seen to follow a similarity profile. By plotting the probability density function, the skewness and the flatness (kurtosis) are analyzed. The Reynolds stress anisotropy tensor is calculated, and the anisotropy invariant map known as Lumley's triangle is presented and analyzed.
Dimensional synthesis of a 3-DOF parallel manipulator with full circle rotation

NASA Astrophysics Data System (ADS)

Ni, Yanbing; Wu, Nan; Zhong, Xueyong; Zhang, Biao

2015-07-01

Parallel robots are widely used in the academic and industrial fields. In spite of the numerous achievements in the design and dimensional synthesis of the low-mobility parallel robots, few research efforts are directed towards the asymmetric 3-DOF parallel robots whose end-effector can realize 2 translational and 1 rotational (2T1R) motion. In order to develop a manipulator with the capability of full circle rotation to enlarge the workspace, a new 2T1R parallel mechanism is proposed. The modeling approach and kinematic analysis of this proposed mechanism are investigated. Using the method of vector analysis, the inverse kinematic equations are established. This is followed by a rigorous proof that this mechanism attains an annular workspace through its circular rotation and 2 dimensional translations. Taking the first order perturbation of the kinematic equations, the error Jacobian matrix which represents the mapping relationship between the error sources of geometric parameters and the end-effector position errors is derived. With consideration of the constraint conditions of pressure angles and feasible workspace, the dimensional synthesis is conducted with a goal to minimize the global comprehensive performance index. The dimension parameters that give the mechanism optimal error mapping and kinematic performance are obtained through the optimization algorithm. All these research achievements lay the foundation for the prototype building of this kind of parallel robot.
The Space-Time Conservative Schemes for Large-Scale, Time-Accurate Flow Simulations with Tetrahedral Meshes

NASA Technical Reports Server (NTRS)

Venkatachari, Balaji Shankar; Streett, Craig L.; Chang, Chau-Lyan; Friedlander, David J.; Wang, Xiao-Yen; Chang, Sin-Chung

2016-01-01

Despite decades of development of unstructured mesh methods, high-fidelity time-accurate simulations are still predominantly carried out on structured, or unstructured hexahedral meshes by using high-order finite-difference, weighted essentially non-oscillatory (WENO), or hybrid schemes formed by their combinations. In this work, the space-time conservation element solution element (CESE) method is used to simulate several flow problems including supersonic jet/shock interaction and its impact on launch vehicle acoustics, and direct numerical simulations of turbulent flows using tetrahedral meshes. This paper provides a status report for the continuing development of the space-time conservation element solution element (CESE) numerical and software framework under the Revolutionary Computational Aerosciences (RCA) project. Solution accuracy and large-scale parallel performance of the numerical framework is assessed with the goal of providing a viable paradigm for future high-fidelity flow physics simulations.
RIACS

NASA Technical Reports Server (NTRS)

Oliger, Joseph

1997-01-01

Topics considered include: high-performance computing; cognitive and perceptual prostheses (computational aids designed to leverage human abilities); autonomous systems. Also included: development of a 3D unstructured grid code based on a finite volume formulation and applied to the Navier-Stokes equations; Cartesian grid methods for complex geometry; multigrid methods for solving elliptic problems on unstructured grids; algebraic non-overlapping domain decomposition methods for compressible fluid flow problems on unstructured meshes; numerical methods for the compressible Navier-Stokes equations with application to aerodynamic flows; research in aerodynamic shape optimization; S-HARP: a parallel dynamic spectral partitioner; numerical schemes for the Hamilton-Jacobi and level set equations on triangulated domains; application of high-order shock capturing schemes to direct simulation of turbulence; multicast technology; network testbeds; supercomputer consolidation project.
Real-world hydrologic assessment of a fully-distributed hydrological model in a parallel computing environment

NASA Astrophysics Data System (ADS)

Vivoni, Enrique R.; Mascaro, Giuseppe; Mniszewski, Susan; Fasel, Patricia; Springer, Everett P.; Ivanov, Valeriy Y.; Bras, Rafael L.

2011-10-01

A major challenge in the use of fully-distributed hydrologic models has been the lack of computational capabilities for high-resolution, long-term simulations in large river basins. In this study, we present the parallel model implementation and real-world hydrologic assessment of the Triangulated Irregular Network (TIN)-based Real-time Integrated Basin Simulator (tRIBS). Our parallelization approach is based on the decomposition of a complex watershed using the channel network as a directed graph. The resulting sub-basin partitioning divides effort among processors and handles hydrologic exchanges across boundaries. Through numerical experiments in a set of nested basins, we quantify parallel performance relative to serial runs for a range of processors, simulation complexities and lengths, and sub-basin partitioning methods, while accounting for inter-run variability on a parallel computing system. In contrast to serial simulations, the parallel model speed-up depends on the variability of hydrologic processes. Load balancing significantly improves parallel speed-up with proportionally faster runs as simulation complexity (domain resolution and channel network extent) increases. The best strategy for large river basins is to combine a balanced partitioning with an extended channel network, with potential savings through a lower TIN resolution. Based on these advances, a wider range of applications for fully-distributed hydrologic models are now possible. This is illustrated through a set of ensemble forecasts that account for precipitation uncertainty derived from a statistical downscaling model.
Parallel Computation and Visualization of Three-dimensional, Time-dependent, Thermal Convective Flows

NASA Technical Reports Server (NTRS)

Wang, P.; Li, P.

1998-01-01

A high-resolution numerical study on parallel systems is reported on three-dimensional, time-dependent, thermal convective flows. A parallel implementation of the finite volume method with a multigrid scheme is discussed, and a parallel visualization system is developed on distributed systems for visualizing the flow.
An Artificial Neural Networks Method for Solving Partial Differential Equations

NASA Astrophysics Data System (ADS)

Alharbi, Abir

2010-09-01

While there already exist many analytical and numerical techniques for solving PDEs, this paper introduces an approach using artificial neural networks. The approach consists of a technique developed by combining the standard numerical method, finite-difference, with the Hopfield neural network. The method is denoted Hopfield-finite-difference (HFD). The architecture of the nets, energy function, updating equations, and algorithms are developed for the method. The HFD method has been used successfully to approximate the solution of classical PDEs, such as the Wave, Heat, Poisson and the Diffusion equations, and on a system of PDEs. The software Matlab is used to obtain the results in both tabular and graphical form. The results are similar in terms of accuracy to those obtained by standard numerical methods. In terms of speed, the parallel nature of the Hopfield nets methods makes them easier to implement on fast parallel computers while some numerical methods need extra effort for parallelization.
Magnetic helicity conservation and inverse energy cascade in electron magnetohydrodynamic wave packets.

PubMed

Cho, Jungyeon

2011-05-13

Electron magnetohydrodynamics (EMHD) provides a fluidlike description of small-scale magnetized plasmas. An EMHD wave propagates along magnetic field lines. The direction of propagation can be either parallel or antiparallel to the magnetic field lines. We numerically study propagation of three-dimensional (3D) EMHD wave packets moving in one direction. We obtain two major results. (1) Unlike its magnetohydrodynamic (MHD) counterpart, an EMHD wave packet is dispersive. Because of this, EMHD wave packets traveling in one direction create opposite-traveling wave packets via self-interaction and cascade energy to smaller scales. (2) EMHD wave packets traveling in one direction clearly exhibit inverse energy cascade. We find that the latter is due to conservation of magnetic helicity. We compare inverse energy cascade in 3D EMHD turbulence and two-dimensional (2D) hydrodynamic turbulence.
Direct numerical simulation of steady state, three dimensional, laminar flow around a wall mounted cube

NASA Astrophysics Data System (ADS)

Liakos, Anastasios; Malamataris, Nikolaos A.

2014-05-01

The topology and evolution of flow around a surface mounted cubical object in three dimensional channel flow is examined for low to moderate Reynolds numbers. Direct numerical simulations were performed via a home-made parallel finite element code. The computational domain has been designed according to actual laboratory experiment conditions. Analysis of the results is performed using the three dimensional theory of separation. Our findings indicate that a tornado-like vortex by the side of the cube is present for all Reynolds numbers for which flow was simulated. A horseshoe vortex upstream from the cube was formed at Reynolds number approximately 1266. Pressure distributions are shown along with three dimensional images of the tornado-like vortex and the horseshoe vortex at selected Reynolds numbers. Finally, and in accordance with previous work, our results indicate that the upper limit for the Reynolds number for which steady state results are physically realizable is roughly 2000.
Three dimensional adaptive mesh refinement on a spherical shell for atmospheric models with lagrangian coordinates

NASA Astrophysics Data System (ADS)

Penner, Joyce E.; Andronova, Natalia; Oehmke, Robert C.; Brown, Jonathan; Stout, Quentin F.; Jablonowski, Christiane; van Leer, Bram; Powell, Kenneth G.; Herzog, Michael

2007-07-01

One of the most important advances needed in global climate models is the development of atmospheric General Circulation Models (GCMs) that can reliably treat convection. Such GCMs require high resolution in local convectively active regions, both in the horizontal and vertical directions. During previous research we have developed an Adaptive Mesh Refinement (AMR) dynamical core that can adapt its grid resolution horizontally. Our approach utilizes a finite volume numerical representation of the partial differential equations with floating Lagrangian vertical coordinates and requires resolving dynamical processes on small spatial scales. For the latter it uses a newly developed general-purpose library, which facilitates 3D block-structured AMR on spherical grids. The library manages neighbor information as the blocks adapt, and handles the parallel communication and load balancing, freeing the user to concentrate on the scientific modeling aspects of their code. In particular, this library defines and manages adaptive blocks on the sphere, provides user interfaces for interpolation routines and supports the communication and load-balancing aspects for parallel applications. We have successfully tested the library in a 2-D (longitude-latitude) implementation. During the past year, we have extended the library to treat adaptive mesh refinement in the vertical direction. Preliminary results are discussed. This research project is characterized by an interdisciplinary approach involving atmospheric science, computer science and mathematical/numerical aspects. The work is done in close collaboration between the Atmospheric Science, Computer Science and Aerospace Engineering Departments at the University of Michigan and NOAA GFDL.
Numerical modeling of heat transfer in molten silicon during directional solidification process

DOE Office of Scientific and Technical Information (OSTI.GOV)

Srinivasan, M.; Ramasamy, P., E-mail: ramasamyp@ssn.edu.in

2015-06-24

Numerical investigation is performed for some of the thermal and fluid flow properties of silicon melt during directional solidification by numerical modeling. Dimensionless numbers are extremely useful to understand the heat and mass transfer of fluid flow on Si melt and control the flow patterns during crystal growth processes. The average grain size of whole crystal would increase when the melt flow is laminar. In the silicon growth process, the melt flow is mainly driven by the buoyancy force resulting from the horizontal temperature gradient. The thermal and flow pattern influences the quality of the crystal through the convective heat and mass transport. The computations are carried out in a 2D axisymmetric model using the finite-element technique. The buoyancy effect is observed in the melt domain for a constant Rayleigh number and for different Prandtl numbers. The convective heat flux and Reynolds numbers are studied in the five parallel horizontal cross section of melt silicon region. And also, velocity field is simulated for whole melt domain with limited thermal boundaries. The results indicate that buoyancy forces have a dramatic effect on the most of melt region except central part.
CW all optical self switching in nonlinear chalcogenide nano plasmonic directional coupler

NASA Astrophysics Data System (ADS)

Motamed-Jahromi, Leila; Hatami, Mohsen

2018-04-01

In this paper we obtain the coupling coefficient of a plasmonic directional coupler (PDC) made up of two parallel monolayer waveguides filled with highly nonlinear chalcogenide material for the TM mode in the continuous wave (CW) regime. In addition, we assume each waveguide acts as a perturbation to the other waveguide. Four nonlinear-coupled equations are derived. Transfer distances are numerically calculated and used for deriving the length of the all-optical switch. The length of the designed switch is in the range of 10-1000 μm, and the switching power is in the range of 1-100 W/m. The obtained values are suitable for designing all-optical elements in integrated optical circuits.
Evolution of stress and strain during 3D folding: application to orthogonal fracture systems in folded turbidites, SW Portugal

NASA Astrophysics Data System (ADS)

Reber, J. E.; Schmalholz, S. M.; Lechmann, S. M.

2009-04-01

We present field data and numerical modeling results which show the evolution of stress and strain patterns during 3D folding resulting in an orthogonal fracture system. The field area is located near Almograve, SW Portugal. The area is part of the Mira Formation which itself is part of the South Portuguese Zone (SPZ). The structural development of the SPZ is characterized by southwest vergent folding and thrust displacement. The metamorphism in the SPZ increases from diagenetic conditions in the southwest to greenschist-facies conditions in the northeast. The Mira Formation is composed of turbiditic layers of Carboniferous age with a low sandstone-to-shale ratio. The data was gathered at three outcrops which show structures similar to chocolate tablet structures in the folded sandstone layers. Chocolate tablet structures are generated under simultaneous extension in two directions and show two fracture systems of the same age which are perpendicular to each other. However, the Mira Formation is located in a convergent area. Also, the outcrops near Almograve show two fracture systems of different age. The fractures orthogonal to the fold axis and the bedding are crosscut by fractures parallel to the fold axis and orthogonal to the bedding. Our hypothesis for the evolution of the observed fracture systems is as follows: the older fractures which are now orthogonal to the fold axis and to the bedding plane were generated during compression while the layers were still approximately horizontal. They are parallel to σ1 (i.e. mode 1 fractures). The second and younger fracture family was generated in a phase where there is local extension in the fold limbs. These fractures are orthogonal to the far-field σ1, parallel to the fold axis and perpendicular to the bedding. The shortening direction is constant during the entire folding process. We test our hypothesis with numerical modeling.
We use 2D and 3D finite element codes with a mixed formulation for incompressible flow and a viscous rheology. The stress and strain tensor components are calculated at each numerical nodal point. The stress and strain fields are visualized through ellipses and ellipsoids which are calculated using the eigenvalues of the respective tensors. The shortest main axis represents the direction of the smallest stress σ3 and the longest main axis represents the direction of the largest stress σ1. To generate two orthogonal fracture systems in the fold limbs we expect a relatively rapid change of the stress field in the fold limbs during folding. With a relatively slow change of the stress field we would expect to see more than two fracture systems with a wide range of fracture orientation which we did not observe in the field. The preliminary 2D results show, as expected, a sudden flip of the main axes of the stress ellipse which corresponds to a change from limb-parallel compression to extension. For the 3D model we expect similar results and we will investigate the impact of different deformation boundary conditions on the evolution of the 3D stress and strain fields.
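The stress-ellipse construction described above can be illustrated for the 2D case with a minimal sketch. It is not the authors' code, and it assumes a symmetric 2D stress tensor with components sxx, syy, sxy given at one nodal point; the function name and return layout are hypothetical. The eigenvalues give the two axis lengths of the ellipse, and the angle gives the orientation of the axis that belongs to the algebraically larger eigenvalue. Which eigenvalue is labeled σ1 depends on the sign convention (compression positive or negative), which the sketch leaves to the caller.

```python
import math


def principal_stresses_2d(sxx, syy, sxy):
    """Eigenvalues and major-axis angle of a symmetric 2D stress tensor.

    Returns (s_max, s_min, angle), where angle is measured in radians
    from the x axis to the eigenvector belonging to s_max.
    """
    mean = 0.5 * (sxx + syy)
    radius = math.hypot(0.5 * (sxx - syy), sxy)  # Mohr circle radius
    s_max = mean + radius
    s_min = mean - radius
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return s_max, s_min, angle
```

A flip of the ellipse axes, like the one reported for the fold limbs, would show up as a jump of about 90 degrees in the returned angle between two successive deformation steps.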
A signature of anisotropic cosmic-ray transport in the gamma-ray sky

NASA Astrophysics Data System (ADS)

Cerri, Silvio Sergio; Gaggero, Daniele; Vittino, Andrea; Evoli, Carmelo; Grasso, Dario

2017-10-01

A crucial process in Galactic cosmic-ray (CR) transport is the spatial diffusion due to the interaction with the interstellar turbulent magnetic field. Usually, CR diffusion is assumed to be uniform and isotropic all across the Galaxy. However, this picture is clearly inaccurate: several data-driven and theoretical arguments, as well as dedicated numerical simulations, show that diffusion exhibits highly anisotropic properties with respect to the direction of a background (ordered) magnetic field (i.e., parallel or perpendicular to it). In this paper we focus on a recently discovered anomaly in the hadronic CR spectrum inferred by the Fermi-LAT gamma-ray data at different positions in the Galaxy, i.e. the progressive hardening of the proton slope at low Galactocentric radii. We propose the idea that this feature can be interpreted as a signature of anisotropic diffusion in the complex Galactic magnetic field: in particular, the harder slope in the inner Galaxy is due, in our scenario, to the parallel diffusive escape along the poloidal component of the large-scale, regular, magnetic field. We implement this idea in a numerical framework, based on the DRAGON code, and perform detailed numerical tests on the accuracy of our setup. We discuss how the effect proposed depends on the relevant free parameters involved. Based on low-energy extrapolation of the few focused numerical simulations aimed at determining the scalings of the anisotropic diffusion coefficients, we finally present a set of plausible models that reproduce the behavior of the CR proton slopes inferred by gamma-ray data.

A signature of anisotropic cosmic-ray transport in the gamma-ray sky

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cerri, Silvio Sergio; Grasso, Dario; Gaggero, Daniele

A crucial process in Galactic cosmic-ray (CR) transport is the spatial diffusion due to the interaction with the interstellar turbulent magnetic field. Usually, CR diffusion is assumed to be uniform and isotropic all across the Galaxy. However, this picture is clearly inaccurate: several data-driven and theoretical arguments, as well as dedicated numerical simulations, show that diffusion exhibits highly anisotropic properties with respect to the direction of a background (ordered) magnetic field (i.e., parallel or perpendicular to it). In this paper we focus on a recently discovered anomaly in the hadronic CR spectrum inferred by the Fermi-LAT gamma-ray data at different positions in the Galaxy, i.e. the progressive hardening of the proton slope at low Galactocentric radii. We propose the idea that this feature can be interpreted as a signature of anisotropic diffusion in the complex Galactic magnetic field: in particular, the harder slope in the inner Galaxy is due, in our scenario, to the parallel diffusive escape along the poloidal component of the large-scale, regular, magnetic field. We implement this idea in a numerical framework, based on the DRAGON code, and perform detailed numerical tests on the accuracy of our setup. We discuss how the effect proposed depends on the relevant free parameters involved. Based on low-energy extrapolation of the few focused numerical simulations aimed at determining the scalings of the anisotropic diffusion coefficients, we finally present a set of plausible models that reproduce the behavior of the CR proton slopes inferred by gamma-ray data.
Parallel 3-D numerical simulation of dielectric barrier discharge plasma actuators

NASA Astrophysics Data System (ADS)

Houba, Tomas

Dielectric barrier discharge plasma actuators have shown promise in a range of applications including flow control, sterilization and ozone generation. Developing numerical models of plasma actuators is of great importance, because a high-fidelity parallel numerical model allows new design configurations to be tested rapidly. Additionally, it provides a better understanding of the plasma actuator physics which is useful for further innovation. The physics of plasma actuators is studied numerically. A loosely coupled approach is utilized for the coupling of the plasma to the neutral fluid. The state of the art in numerical plasma modeling is advanced by the development of a parallel, three-dimensional, first-principles model with detailed air chemistry. The model incorporates 7 charged species and 18 reactions, along with a solution of the electron energy equation. To the author's knowledge, a parallel three-dimensional model of a gas discharge with a detailed air chemistry model and the solution of electron energy is unique. Three representative geometries are studied using the gas discharge model. The discharge of gas between two parallel electrodes is used to validate the air chemistry model developed for the gas discharge code. The gas discharge model is then applied to the discharge produced by placing a dc powered wire and grounded plate electrodes in a channel. Finally, a three-dimensional simulation of gas discharge produced by electrodes placed inside a riblet is carried out. The body force calculated with the gas discharge model is loosely coupled with a fluid model to predict the induced flow inside the riblet.
Numerical Simulation of Transit-Time Ultrasonic Flowmeters by a Direct Approach.

PubMed

Luca, Adrian; Marchiano, Regis; Chassaing, Jean-Camille

2016-06-01

This paper deals with the development of a computational code for the numerical simulation of wave propagation through domains with a complex geometry consisting of both solids and moving fluids. The emphasis is on the numerical simulation of ultrasonic flowmeters (UFMs) by modeling the wave propagation in solids with the equations of linear elasticity (ELE) and in fluids with the linearized Euler equations (LEEs). This approach requires high performance computing because of the high number of degrees of freedom and the long propagation distances. Therefore, the numerical method should be chosen with care. In order to minimize the numerical dissipation which may occur in this kind of configuration, the numerical method employed here is the nodal discontinuous Galerkin (DG) method. Also, this method is well suited for parallel computing. To speed up the code, almost all the computational stages have been implemented to run on graphics processing units (GPUs) by using the compute unified device architecture (CUDA) programming model from NVIDIA. This approach has been validated and then used for the two-dimensional simulation of gas UFMs. The large contrast of acoustic impedance characteristic of gas UFMs makes their simulation a real challenge.
All-silicon-based nano-antennas for wavelength and polarization demultiplexing.

PubMed

Panmai, Mingcheng; Xiang, Jin; Sun, Zhibo; Peng, Yuanyuan; Liu, Hongfeng; Liu, Haiying; Dai, Qiaofeng; Tie, Shaolong; Lan, Sheng

2018-05-14

We propose an all-silicon-based nano-antenna that functions as not only a wavelength demultiplexer but also a polarization one. The nano-antenna is composed of two silicon cuboids with the same length and height but with different widths. The asymmetric structure of the nano-antenna with respect to the electric field of the incident light induced an electric dipole component in the propagation direction of the incident light. The interference between this electric dipole and the magnetic dipole induced by the magnetic field parallel to the long side of the cuboids is exploited to manipulate the radiation direction of the nano-antenna. The radiation direction of the nano-antenna at a certain wavelength depends strongly on the phase difference between the electric and magnetic dipoles interacting coherently, offering us the opportunity to realize wavelength demultiplexing. By varying the polarization of the incident light, the interference of the magnetic dipole induced by the asymmetry of the nano-antenna and the electric dipole induced by the electric field parallel to the long side of the cuboids can also be used to realize polarization demultiplexing in a certain wavelength range. More interestingly, the interference between the dipole and quadrupole modes of the nano-antenna can be utilized to shape the radiation directivity of the nano-antenna. We demonstrate numerically that radiation with adjustable direction and high directivity can be realized in such a nano-antenna which is compatible with the current fabrication technology of silicon chips.
Elastic Characterization of Transversely Isotropic Soft Materials by Dynamic Shear and Asymmetric Indentation

PubMed Central

Namani, R.; Feng, Y.; Okamoto, R. J.; Jesuraj, N.; Sakiyama-Elbert, S. E.; Genin, G. M.; Bayly, P. V.

2012-01-01

The mechanical characterization of soft anisotropic materials is a fundamental challenge because of difficulties in applying mechanical loads to soft matter and the need to combine information from multiple tests. A method to characterize the linear elastic properties of transversely isotropic soft materials is proposed, based on the combination of dynamic shear testing (DST) and asymmetric indentation. The procedure was demonstrated by characterizing a nearly incompressible transversely isotropic soft material. A soft gel with controlled anisotropy was obtained by polymerizing a mixture of fibrinogen and thrombin solutions in a high field magnet (B = 11.7 T); fibrils in the resulting gel were predominantly aligned parallel to the magnetic field. Aligned fibrin gels were subject to dynamic (20–40 Hz) shear deformation in two orthogonal directions. The shear storage modulus was 1.08 ± 0.42 kPa (mean ± std. dev.) for shear in a plane parallel to the dominant fiber direction, and 0.58 ± 0.21 kPa for shear in the plane of isotropy. Gels were indented by a rectangular tip of a large aspect ratio, aligned either parallel or perpendicular to the normal to the plane of transverse isotropy. Aligned fibrin gels appeared stiffer when indented with the long axis of a rectangular tip perpendicular to the dominant fiber direction. Three-dimensional numerical simulations of asymmetric indentation were used to determine the relationship between direction-dependent differences in indentation stiffness and material parameters. This approach enables the estimation of a complete set of parameters for an incompressible, transversely isotropic, linear elastic material. PMID:22757501
Longitudinal train dynamics: an overview

NASA Astrophysics Data System (ADS)

Wu, Qing; Spiryagin, Maksym; Cole, Colin

2016-12-01

This paper discusses the evolution of longitudinal train dynamics (LTD) simulations, which covers numerical solvers, vehicle connection systems, air brake systems, wagon dumper systems and locomotives, resistance forces and gravitational components, vehicle in-train instabilities, and computing schemes. A number of potential research topics are suggested, such as modelling of friction, polymer, and transition characteristics for vehicle connection simulations, studies of wagon dumping operations, proper modelling of vehicle in-train instabilities, and computing schemes for LTD simulations. Evidence shows that LTD simulations have evolved with computing capabilities. Currently, advanced component models that directly describe the working principles of the operation of air brake systems, vehicle connection systems, and traction systems are available. Parallel computing is a good solution to combine and simulate all these advanced models. Parallel computing can also be used to conduct three-dimensional long train dynamics simulations.
A gyrokinetic one-dimensional scrape-off layer model of an edge-localized mode heat pulse

DOE PAGES

Shi, E. L.; Hakim, A. H.; Hammett, G. W.

2015-02-03

An electrostatic gyrokinetic-based model is applied to simulate parallel plasma transport in the scrape-off layer to a divertor plate. We focus on a test problem that has been studied previously, using parameters chosen to model a heat pulse driven by an edge-localized mode in JET. Previous work has used direct particle-in-cell equations with full dynamics, or Vlasov or fluid equations with only parallel dynamics. With the use of the gyrokinetic quasineutrality equation and logical sheath boundary conditions, spatial and temporal resolution requirements are no longer set by the electron Debye length and plasma frequency, respectively. Finally, this test problem also helps illustrate some of the physics contained in the Hamiltonian form of the gyrokinetic equations and some of the numerical challenges in developing an edge gyrokinetic code.
Radiofrequency pulse design in parallel transmission under strict temperature constraints.

PubMed

Boulant, Nicolas; Massire, Aurélien; Amadon, Alexis; Vignaud, Alexandre

2014-09-01

To gain radiofrequency (RF) pulse performance by directly addressing the temperature constraints, as opposed to the specific absorption rate (SAR) constraints, in parallel transmission at ultra-high field. The magnitude least-squares RF pulse design problem under hard SAR constraints was solved repeatedly by using the virtual observation points and an active-set algorithm. The SAR constraints were updated at each iteration based on the result of a thermal simulation. The numerical study was performed for an SAR-demanding and simplified time of flight sequence using B1 and ΔB0 maps obtained in vivo on a human brain at 7T. The proposed adjustment of the SAR constraints combined with an active-set algorithm provided higher flexibility in RF pulse design within a reasonable time. The modifications of those constraints acted directly upon the thermal response as desired. Although further confidence in the thermal models is needed, this study shows that RF pulse design under strict temperature constraints is within reach, allowing better RF pulse performance and faster acquisitions at ultra-high fields at the cost of higher sequence complexity. Copyright © 2013 Wiley Periodicals, Inc.
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Chiou, Jin-Chern

1990-01-01

Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAE's) viewpoint. Constraint violations during the time integration process are minimized and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion are treated by a two-stage staggered explicit-implicit numerical algorithm that takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, a parallel implementation of the present constraint treatment techniques and the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAE's and the constraint treatment techniques were transformed into arrowhead matrices, from which a Schur complement form was derived. By fully exploiting the sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the system equations written in Schur complement form. A software testbed was designed and implemented on both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speedup of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
Terascale direct numerical simulations of turbulent combustion using S3D

NASA Astrophysics Data System (ADS)

Chen, J. H.; Choudhary, A.; de Supinski, B.; DeVries, M.; Hawkes, E. R.; Klasky, S.; Liao, W. K.; Ma, K. L.; Mellor-Crummey, J.; Podhorszki, N.; Sankaran, R.; Shende, S.; Yoo, C. S.

2009-01-01

Computational science is paramount to the understanding of underlying processes in internal combustion engines of the future that will utilize non-petroleum-based alternative fuels, including carbon-neutral biofuels, and burn in new combustion regimes that will attain high efficiency while minimizing emissions of particulates and nitrogen oxides. Next-generation engines will likely operate at higher pressures, with greater amounts of dilution and utilize alternative fuels that exhibit a wide range of chemical and physical properties. Therefore, there is a significant role for high-fidelity simulations, direct numerical simulations (DNS), specifically designed to capture key turbulence-chemistry interactions in these relatively uncharted combustion regimes, and in particular, that can discriminate the effects of differences in fuel properties. In DNS, all of the relevant turbulence and flame scales are resolved numerically using high-order accurate numerical algorithms. As a consequence terascale DNS are computationally intensive, require massive amounts of computing power and generate tens of terabytes of data. Recent results from terascale DNS of turbulent flames are presented here, illustrating its role in elucidating flame stabilization mechanisms in a lifted turbulent hydrogen/air jet flame in a hot air coflow, and the flame structure of a fuel-lean turbulent premixed jet flame. Computing at this scale requires close collaborations between computer and combustion scientists to provide optimized scaleable algorithms and software for terascale simulations, efficient collective parallel I/O, tools for volume visualization of multiscale, multivariate data and automating the combustion workflow. The enabling computer science, applied to combustion science, is also required in many other terascale physics and engineering simulations. 
In particular, performance monitoring is used to identify the performance of key kernels in the DNS code, S3D and especially memory intensive loops in the code. Through the careful application of loop transformations, data reuse in cache is exploited thereby reducing memory bandwidth needs, and hence, improving S3D's nodal performance. To enhance collective parallel I/O in S3D, an MPI-I/O caching design is used to construct a two-stage write-behind method for improving the performance of write-only operations. The simulations generate tens of terabytes of data requiring analysis. Interactive exploration of the simulation data is enabled by multivariate time-varying volume visualization. The visualization highlights spatial and temporal correlations between multiple reactive scalar fields using an intuitive user interface based on parallel coordinates and time histogram. Finally, an automated combustion workflow is designed using Kepler to manage large-scale data movement, data morphing, and archival and to provide a graphical display of run-time diagnostics.
Parallel pivoting combined with parallel reduction

NASA Technical Reports Server (NTRS)

Alaghband, Gita

1987-01-01

Parallel algorithms for triangularization of large, sparse, and unsymmetric matrices are presented. The method combines the parallel reduction with a new parallel pivoting technique, control over generations of fill-ins and a check for numerical stability, all done in parallel with the work being distributed over the active processes. The parallel technique uses the compatibility relation between pivots to identify parallel pivot candidates and uses the Markowitz number of pivots to minimize fill-in. This technique is not a preordering of the sparse matrix and is applied dynamically as the decomposition proceeds.
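The pivot-selection idea in this abstract can be illustrated with a minimal serial sketch. It assumes the usual definition of the Markowitz count, (r_i - 1)(c_j - 1), where r_i and c_j are the nonzero counts of the pivot's row and column, and it assumes that two pivots (i, j) and (k, l) are compatible when a[i][l] and a[k][j] are both zero, so that eliminating with one does not touch the other. The function names, the threshold u, and the greedy selection order are hypothetical and are not taken from the report.

```python
def markowitz_counts(a):
    """Return {(i, j): (r_i - 1) * (c_j - 1)} for every nonzero entry of a."""
    n_rows, n_cols = len(a), len(a[0])
    row_nnz = [sum(1 for v in row if v != 0) for row in a]
    col_nnz = [sum(1 for i in range(n_rows) if a[i][j] != 0) for j in range(n_cols)]
    return {
        (i, j): (row_nnz[i] - 1) * (col_nnz[j] - 1)
        for i in range(n_rows)
        for j in range(n_cols)
        if a[i][j] != 0
    }


def compatible(a, p, q):
    """Assumed compatibility test: distinct rows and columns, zero cross entries."""
    (i, j), (k, l) = p, q
    return i != k and j != l and a[i][l] == 0 and a[k][j] == 0


def parallel_pivot_set(a, u=0.1):
    """Greedily pick mutually compatible pivots in order of increasing Markowitz count.

    A candidate must also pass a simple threshold stability check:
    |a_ij| >= u * max_k |a_ik| (u is an illustrative parameter).
    """
    counts = markowitz_counts(a)
    stable = [
        p for p in counts
        if abs(a[p[0]][p[1]]) >= u * max(abs(v) for v in a[p[0]])
    ]
    stable.sort(key=lambda p: (counts[p], p))  # low fill-in estimate first
    chosen = []
    for p in stable:
        if all(compatible(a, p, q) for q in chosen):
            chosen.append(p)
    return chosen


if __name__ == "__main__":
    a = [
        [4, 0, 0, 1],
        [0, 3, 0, 0],
        [0, 0, 2, 0],
        [1, 0, 0, 5],
    ]
    print(parallel_pivot_set(a))
```

In a parallel setting the chosen pivots could be eliminated simultaneously, after which the counts would be recomputed on the reduced matrix; this sketch shows only one selection round.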
Polymer scaling and dynamics in steady-state sedimentation at infinite Péclet number.

PubMed

Lehtola, V; Punkkinen, O; Ala-Nissila, T

2007-11-01

We consider the static and dynamical behavior of a flexible polymer chain under steady-state sedimentation using analytic arguments and computer simulations. The model system comprises a single coarse-grained polymer chain of N segments, which resides in a Newtonian fluid as described by the Navier-Stokes equations. The chain is driven into nonequilibrium steady state by gravity acting on each segment. The equations of motion for the segments and the Navier-Stokes equations are solved simultaneously using an immersed boundary method, where thermal fluctuations are neglected. To characterize the chain conformation, we consider its radius of gyration $R_G(N)$. We find that the presence of gravity explicitly breaks the spatial symmetry leading to anisotropic scaling of the components of $R_G$ with N along the direction of gravity, $R_{G,\parallel}$, and perpendicular to it, $R_{G,\perp}$, respectively. We numerically estimate the corresponding anisotropic scaling exponents $\nu_\parallel \approx 0.79$ and $\nu_\perp \approx 0.45$, which differ significantly from the equilibrium scaling exponent $\nu_e = 0.588$ in three dimensions. This indicates that on the average, the chain becomes elongated along the sedimentation direction for large enough N. We present a generalization of the Flory scaling argument, which is in good agreement with the numerical results. It also reveals an explicit dependence of the scaling exponents on the Reynolds number. To study the dynamics of the chain, we compute its effective diffusion coefficient D(N), which does not contain Brownian motion. For the range of values of N used here, we find that both the parallel and perpendicular components of D increase with the chain length N, in contrast to the case of thermal diffusion in equilibrium. This is caused by the fluid-driven fluctuations in the internal configuration of the polymer that are magnified as polymer size becomes larger.
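A scaling exponent of the kind quoted above is commonly estimated by a straight-line fit in log-log space, since a power law R = A N^nu becomes log R = log A + nu log N. The sketch below assumes that approach; the abstract does not state how the authors performed their fit, and the chain lengths and prefactor used in the example are synthetic, not data from the paper.

```python
import math


def fit_scaling_exponent(ns, rs):
    """Least-squares fit of log(r) = log(A) + nu * log(n); returns (nu, A)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(r) for r in rs]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    nu = sxy / sxx
    prefactor = math.exp(mean_y - nu * mean_x)
    return nu, prefactor


if __name__ == "__main__":
    # Synthetic, noise-free data generated from an assumed power law.
    ns = [16, 32, 64, 128, 256]
    rs = [2.0 * n ** 0.79 for n in ns]
    print(fit_scaling_exponent(ns, rs))
```

With real simulation data the fit would normally be restricted to the range of N where the power law holds, and the scatter of the residuals gives a rough sense of the uncertainty in the exponent.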
Helical vortices generated by flapping wings of bumblebees

NASA Astrophysics Data System (ADS)

Engels, Thomas; Kolomenskiy, Dmitry; Schneider, Kai; Farge, Marie; Lehmann, Fritz-Olaf; Sesterhenn, Jörn

2018-02-01

High resolution direct numerical simulations of rotating and flapping bumblebee wings are presented and their aerodynamics is studied focusing on the role of leading edge vortices and the associated helicity production. We first study the flow generated by only one rotating bumblebee wing in circular motion with 45° angle of attack. We then consider a model bumblebee flying in a numerical wind tunnel, which is tethered and has rigid wings flapping with a prescribed generic motion. The inflow condition of the wind varies from laminar to strongly turbulent regimes. Massively parallel simulations show that inflow turbulence does not significantly alter the wings’ leading edge vortex, which enhances lift production. Finally, we focus on studying the helicity of the generated vortices and analyze their contribution at different scales using orthogonal wavelets.
Computer-Aided Parallelizer and Optimizer

NASA Technical Reports Server (NTRS)

Jin, Haoqiang

2011-01-01

The Computer-Aided Parallelizer and Optimizer (CAPO) automates the insertion of compiler directives (see figure) to facilitate parallel processing on Shared Memory Parallel (SMP) machines. While CAPO currently is integrated seamlessly into CAPTools (developed at the University of Greenwich, now marketed as ParaWise), CAPO was independently developed at Ames Research Center as one of the components for the Legacy Code Modernization (LCM) project. The current version takes serial FORTRAN programs, performs interprocedural data dependence analysis, and generates OpenMP directives. Due to the widely supported OpenMP standard, the generated OpenMP codes have the potential to run on a wide range of SMP machines. CAPO relies on accurate interprocedural data dependence information currently provided by CAPTools. Compiler directives are generated through identification of parallel loops in the outermost level, construction of parallel regions around parallel loops and optimization of parallel regions, and insertion of directives with automatic identification of private, reduction, induction, and shared variables. Attempts also have been made to identify potential pipeline parallelism (implemented with point-to-point synchronization). Although directives are generated automatically, user interaction with the tool is still important for producing good parallel codes. A comprehensive graphical user interface is included for users to interact with the parallelization process.
Efficient calculation of atomic rate coefficients in dense plasmas

NASA Astrophysics Data System (ADS)

Aslanyan, Valentin; Tallents, Greg J.

2017-03-01

Modelling electron statistics in a cold, dense plasma by the Fermi-Dirac distribution leads to complications in the calculations of atomic rate coefficients. The Pauli exclusion principle slows down the rate of collisions as electrons must find unoccupied quantum states and adds a further computational cost. Methods to calculate these coefficients by direct numerical integration with a high degree of parallelism are presented. This degree of optimization allows the effects of degeneracy to be incorporated into a time-dependent collisional-radiative model. Example results from such a model are presented.
On the three-dimensional instability of laminar boundary layers on concave walls

NASA Technical Reports Server (NTRS)

Gortler, Henry

1954-01-01

A study is made of the stability of laminar boundary-layer profiles on slightly curved walls relative to small disturbances that result from vortices whose axes are parallel to the principal direction of flow. The result is an eigenvalue problem by which, for a given undisturbed flow at a prescribed wall, the amplification or decay is computed for each Reynolds number and each vortex thickness. For neutral disturbances (zero amplification) a critical Reynolds number is determined for each vortex distribution. The numerical calculation produces amplified disturbances on concave walls only.
On the impact of communication complexity in the design of parallel numerical algorithms

NASA Technical Reports Server (NTRS)

Gannon, D.; Vanrosendale, J.

1984-01-01

This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In the second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm independent upper bounds on system performance are derived for several problems that are important to scientific computation.
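For orientation, the classical two-parameter form usually attributed to Hockney can be sketched as follows. It assumes that the time for an operation on n elements is t(n) = (n + n_half) / r_inf, where r_inf is the asymptotic rate and n_half is the length at which half of that rate is achieved. The paper describes a generalization of this approach that is not reproduced here, and the function names and parameter values are illustrative only.

```python
def hockney_time(n, r_inf, n_half):
    """Time to process n elements under t(n) = (n + n_half) / r_inf."""
    return (n + n_half) / r_inf


def hockney_rate(n, r_inf, n_half):
    """Achieved rate n / t(n); approaches r_inf as n grows."""
    return n / hockney_time(n, r_inf, n_half)


if __name__ == "__main__":
    r_inf, n_half = 100.0, 50.0  # illustrative values, arbitrary units
    for n in (10, 50, 500, 5000):
        print(n, hockney_rate(n, r_inf, n_half))
```

The same form is often applied to message passing by reading n as the message length, with n_half / r_inf playing the role of a fixed startup latency.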
On the impact of communication complexity on the design of parallel numerical algorithms

NASA Technical Reports Server (NTRS)

Gannon, D. B.; Van Rosendale, J.

1984-01-01

This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In this second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm-independent upper bounds on system performance are derived for several problems that are important to scientific computation.
High-performance computing — an overview

NASA Astrophysics Data System (ADS)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
Automatic Generation of Directive-Based Parallel Programs for Shared Memory Parallel Systems

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Yan, Jerry; Frumkin, Michael

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. Due to its ease of programming and its good performance, the technique has become very popular. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate directive-based, OpenMP, parallel programs. We outline techniques used in the implementation of the tool and present test results on the NAS parallel benchmarks and ARC3D, a CFD application. This work demonstrates the great potential of using computer-aided tools to quickly port parallel programs and also achieve good performance.

One, two, three, four, nothing more: an investigation of the conceptual sources of the verbal counting principles.

PubMed

Le Corre, Mathieu; Carey, Susan

2007-11-01

Since the publication of [Gelman, R., & Gallistel, C. R. (1978). The child's understanding of number. Cambridge, MA: Harvard University Press.] seminal work on the development of verbal counting as a representation of number, the nature of the ontogenetic sources of the verbal counting principles has been intensely debated. The present experiments explore proposals according to which the verbal counting principles are acquired by mapping numerals in the count list onto systems of numerical representation for which there is evidence in infancy, namely, analog magnitudes, parallel individuation, and set-based quantification. By asking 3- and 4-year-olds to estimate the number of elements in sets without counting, we investigate whether the numerals that are assigned cardinal meaning as part of the acquisition process display the signatures of what we call "enriched parallel individuation" (which combines properties of parallel individuation and of set-based quantification) or analog magnitudes. Two experiments demonstrate that while "one" to "four" are mapped onto core representations of small sets prior to the acquisition of the counting principles, numerals beyond "four" are only mapped onto analog magnitudes about six months after the acquisition of the counting principles. Moreover, we show that children's numerical estimates of sets from 1 to 4 elements fail to show the signature of numeral use based on analog magnitudes - namely, scalar variability. We conclude that, while representations of small sets provided by parallel individuation, enriched by the resources of set-based quantification are recruited in the acquisition process to provide the first numerical meanings for "one" to "four", analog magnitudes play no role in this process.
Automatic Multilevel Parallelization Using OpenMP

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

2002-01-01

In this paper we describe the extension of the CAPO (CAPtools (Computer Aided Parallelization Toolkit) OpenMP) parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report some results for several benchmark codes and one full application that have been parallelized using our system.
Energy flow of electric dipole radiation in between parallel mirrors

NASA Astrophysics Data System (ADS)

Xu, Zhangjin; Arnoldus, Henk F.

2017-11-01

We have studied the energy flow patterns of the radiation emitted by an electric dipole located in between parallel mirrors. It appears that the field lines of the Poynting vector (the flow lines of energy) can have very intricate structures, including many singularities and vortices. The flow line patterns depend on the distance between the mirrors, the distance of the dipole to one of the mirrors and the angle of oscillation of the dipole moment with respect to the normal of the mirror surfaces. Already for the simplest case of a dipole moment oscillating perpendicular to the mirrors, singularities appear at regular intervals along the direction of propagation (parallel to the mirrors). For a parallel dipole, vortices appear in the neighbourhood of the dipole. For a dipole oscillating under a finite angle with the surface normal, the radiation tends to swirl around the dipole before travelling off parallel to the mirrors. For relatively large mirror separations, vortices appear in the pattern. When the dipole is off-centred with respect to the midway point between the mirrors, the flow line structure becomes even more complicated, with numerous vortices in the pattern, and tiny loops near the dipole. We have also investigated the locations of the vortices and singularities, and these can be found without any specific knowledge about the flow lines. This provides an independent means of studying the propagation of dipole radiation between mirrors.
Kinetic theory of turbulence for parallel propagation revisited: Low-to-intermediate frequency regime

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yoon, Peter H., E-mail: yoonp@umd.edu; School of Space Research, Kyung Hee University, Yongin, Gyeonggi 446-701

2015-09-15

A previous paper [P. H. Yoon, “Kinetic theory of turbulence for parallel propagation revisited: Formal results,” Phys. Plasmas 22, 082309 (2015)] revisited the second-order nonlinear kinetic theory for turbulence propagating in directions parallel/anti-parallel to the ambient magnetic field, in which the original work according to Yoon and Fang [Phys. Plasmas 15, 122312 (2008)] was refined, following the paper by Gaelzer et al. [Phys. Plasmas 22, 032310 (2015)]. The main finding involved the dimensional correction pertaining to discrete-particle effects in Yoon and Fang's theory. However, the final result was presented in terms of formal linear and nonlinear susceptibility response functions. In the present paper, the formal equations are explicitly written down for the case of low-to-intermediate frequency regime by making use of approximate forms for the response functions. The resulting equations are sufficiently concrete so that they can readily be solved by numerical means or analyzed by theoretical means. The derived set of equations describe nonlinear interactions of quasi-parallel modes whose frequency range covers the Alfvén wave range to ion-cyclotron mode, but is sufficiently lower than the electron cyclotron mode. The application of the present formalism may range from the nonlinear evolution of whistler anisotropy instability in the high-beta regime, and the nonlinear interaction of electrons with whistler-range turbulence.
Iterative methods for 3D implicit finite-difference migration using the complex Padé approximation

NASA Astrophysics Data System (ADS)

Costa, Carlos A. N.; Campos, Itamara S.; Costa, Jessé C.; Neto, Francisco A.; Schleicher, Jörg; Novais, Amélia

2013-08-01

Conventional implementations of 3D finite-difference (FD) migration use splitting techniques to accelerate performance and save computational cost. However, such techniques are plagued with numerical anisotropy that jeopardises the correct positioning of dipping reflectors in the directions not used for the operator splitting. We implement 3D downward continuation FD migration without splitting using a complex Padé approximation. In this way, the numerical anisotropy is eliminated at the expense of a computationally more intensive solution of a large-band linear system. We compare the performance of the iterative stabilized biconjugate gradient (BICGSTAB) and that of the multifrontal massively parallel direct solver (MUMPS). It turns out that the use of the complex Padé approximation not only stabilizes the solution, but also acts as an effective preconditioner for the BICGSTAB algorithm, reducing the number of iterations as compared to the implementation using the real Padé expansion. As a consequence, the iterative BICGSTAB method is more efficient than the direct MUMPS method when solving a single term in the Padé expansion. The results of both algorithms, here evaluated by computing the migration impulse response in the SEG/EAGE salt model, are of comparable quality.
Improving Data Transfer Throughput with Direct Search Optimization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balaprakash, Prasanna; Morozov, Vitali; Kettimuthu, Rajkumar

2016-01-01

Improving data transfer throughput over high-speed long-distance networks has become increasingly difficult. Numerous factors such as nondeterministic congestion, dynamics of the transfer protocol, and multiuser and multitask source and destination endpoints, as well as interactions among these factors, contribute to this difficulty. A promising approach to improving throughput consists in using parallel streams at the application layer. We formulate and solve the problem of choosing the number of such streams from a mathematical optimization perspective. We propose the use of direct search methods, a class of easy-to-implement and light-weight mathematical optimization algorithms, to improve the performance of data transfers by dynamically adapting the number of parallel streams in a manner that does not require domain expertise, instrumentation, analytical models, or historic data. We apply our method to transfers performed with the GridFTP protocol, and illustrate the effectiveness of the proposed algorithm when used within Globus, a state-of-the-art data transfer tool, on production WAN links and servers. We show that when compared to user default settings our direct search methods can achieve up to 10x performance improvement under certain conditions. We also show that our method can overcome performance degradation due to external compute and network load on source end points, a common scenario at high performance computing facilities.
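To make the idea of a derivative-free direct search over the stream count concrete, here is a minimal sketch of a one-dimensional compass search on integers. It is a generic textbook-style variant, not the authors' algorithm; the `measure` callable, the bounds, and the step schedule are hypothetical, and the throughput function in the example is synthetic rather than a real GridFTP or Globus measurement.

```python
def direct_search_streams(measure, start=4, lo=1, hi=64, step=4, max_evals=50):
    """Compass search for the stream count that maximizes measure(n).

    measure: callable returning observed throughput for n parallel streams
             (hypothetical; in practice this would time a real transfer).
    Returns (best_n, best_throughput).
    """
    cache = {}

    def evaluate(n):
        # Each distinct stream count is measured at most once.
        if n not in cache:
            cache[n] = measure(n)
        return cache[n]

    best_n = min(max(start, lo), hi)
    best = evaluate(best_n)
    while step >= 1 and len(cache) < max_evals:
        improved = False
        for candidate in (best_n + step, best_n - step):
            if lo <= candidate <= hi and evaluate(candidate) > best:
                best_n, best = candidate, cache[candidate]
                improved = True
                break
        if not improved:
            step //= 2  # shrink the step once neither neighbour helps
    return best_n, best


if __name__ == "__main__":
    # Synthetic throughput curve with a single peak at 11 streams.
    def synthetic_throughput(n):
        return 100 - (n - 11) ** 2

    print(direct_search_streams(synthetic_throughput))
```

Real throughput measurements are noisy and change over time, so a practical version would likely re-measure points and restart the search periodically; the cache used here is a simplification that suits only a static objective.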
Performance Analysis and Optimization on the UCLA Parallel Atmospheric General Circulation Model Code

NASA Technical Reports Server (NTRS)

Lou, John; Ferraro, Robert; Farrara, John; Mechoso, Carlos

1996-01-01

An analysis is presented of several factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on massively parallel computer systems. Several modifications to the original parallel AGCM code aimed at improving its numerical efficiency, interprocessor communication cost, load-balance and issues affecting single-node code performance are discussed.
Experimental and numerical study of water-filled vessel impacted by flat projectiles

NASA Astrophysics Data System (ADS)

Zhang, Wei; Ren, Peng; Huang, Wei; Gao, Yu Bo

2014-05-01

To understand the failure modes and impact resistance of double-layer plates separated by water, a flat-nosed projectile was accelerated by a two-stage light gas gun against a water-filled vessel which was placed in an air-filled tank. Targets consisted of a tank made of two flat 5A06 aluminum alloy plates held by a high strength steel frame. The penetration process was recorded by a digital high-speed camera. The same projectile-target system was also used to fire the targets placed directly in air for comparison. Parallel numerical tests were also carried out. The results indicated that experimental and numerical results were in good agreement. Numerical simulations were able to capture the main physical behavior. It was also found that the impact resistance of double-layer plates separated by water was larger than that of the target plates in air. Tearing was the main failure mode of the water-filled vessel targets, which differed from that of the target plates in air, where shear plugging was dominant.
An efficient finite element method for simulation of droplet spreading on a topologically rough surface

NASA Astrophysics Data System (ADS)

Luo, Li; Wang, Xiao-Ping; Cai, Xiao-Chuan

2017-11-01

We study numerically the dynamics of a three-dimensional droplet spreading on a rough solid surface using a phase-field model consisting of the coupled Cahn-Hilliard and Navier-Stokes equations with a generalized Navier boundary condition (GNBC). An efficient finite element method on unstructured meshes is introduced to cope with the complex geometry of the solid surfaces. We extend the GNBC to surfaces with complex geometry by including its weak form along different normal and tangential directions in the finite element formulation. The semi-implicit time discretization scheme results in a decoupled system for the phase function, the velocity, and the pressure. In addition, a mass compensation algorithm is introduced to preserve the mass of the droplet. To efficiently solve the decoupled systems, we present a highly parallel solution strategy based on domain decomposition techniques. We validate the newly developed solution method through extensive numerical experiments, particularly for those phenomena that cannot be achieved by two-dimensional simulations. On a surface with circular posts, we study how wettability of the rough surface depends on the geometry of the posts. The contact line motion for a droplet spreading over some periodic rough surfaces is also efficiently computed. Moreover, we study the spreading process of an impacting droplet on a microstructured surface; qualitative agreement is achieved between the numerical and experimental results. The parallel performance suggests that the proposed solution algorithm is scalable to over 4,000 processor cores with tens of millions of unknowns.
Efficient, massively parallel eigenvalue computation

NASA Technical Reports Server (NTRS)

Huo, Yan; Schreiber, Robert

1993-01-01

In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
Theory of energy and power flow of plasmonic waves on single-walled carbon nanotubes

NASA Astrophysics Data System (ADS)

Moradi, Afshin

2017-10-01

The energy theorem of electrodynamics is extended so as to apply to the plasmonic waves on single-walled carbon nanotubes which propagate parallel to the axial direction of the system and are periodic waves in the azimuthal direction. Electronic excitations on the nanotube surface are modeled by an infinitesimally thin layer of free-electron gas which is described by means of the linearized hydrodynamic theory. General expressions of energy and power flow associated with surface waves are obtained by solving Maxwell and hydrodynamic equations with appropriate boundary conditions. Numerical results for the transverse magnetic mode show that energy, power flow, and energy transport velocity of the plasmonic waves strongly depend on the nanotube radius in the long-wavelength region.
Sub-domain decomposition methods and computational controls for multibody dynamical systems. [of spacecraft structures

NASA Technical Reports Server (NTRS)

Menon, R. G.; Kurdila, A. J.

1992-01-01

This paper presents a concurrent methodology to simulate the dynamics of flexible multibody systems with a large number of degrees of freedom. A general class of open-loop structures is treated and a redundant coordinate formulation is adopted. A range space method is used in which the constraint forces are calculated using a preconditioned conjugate gradient method. By using a preconditioner motivated by the regular ordering of the directed graph of the structures, it is shown that the method is order N in the total number of coordinates of the system. The overall formulation has the advantage that it permits fine parallelization and does not rely on system topology to induce concurrency. It can be efficiently implemented on the present generation of parallel computers with a large number of processors. Validation of the method is presented via numerical simulations of space structures incorporating a large number of flexible degrees of freedom.
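For readers unfamiliar with the preconditioned conjugate gradient iteration mentioned above, a minimal sketch follows. It is not the method of the paper: the Jacobi (diagonal) preconditioner is a simple stand-in for the graph-motivated preconditioner described in the abstract, and the 3x3 symmetric positive definite matrix is a hypothetical test case.

```python
import math


def pcg_jacobi(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner.

    A is a dense symmetric positive definite matrix given as a list of rows.
    Returns the solution and the number of iterations used.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)  # residual for the zero initial guess
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    z = [inv_diag[i] * r[i] for i in range(n)]  # preconditioned residual
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for it in range(1, max_iter + 1):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(sum(ri * ri for ri in r)) < tol:
            return x, it
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical test system whose exact solution is [1, 2, 3].
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
solution, iterations = pcg_jacobi(A, b)
```

The quality of the preconditioner controls the iteration count; the order-N behaviour reported in the abstract depends on the specific preconditioner used there, which this sketch does not attempt to reproduce.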
Effect of alignment of easy axes on dynamic magnetization of immobilized magnetic nanoparticles

NASA Astrophysics Data System (ADS)

Yoshida, Takashi; Matsugi, Yuki; Tsujimura, Naotaka; Sasayama, Teruyoshi; Enpuku, Keiji; Viereck, Thilo; Schilling, Meinhard; Ludwig, Frank

2017-04-01

In some biomedical applications of magnetic nanoparticles (MNPs), the particles are physically immobilized. In this study, we explore the effect of the alignment of the magnetic easy axes on the dynamic magnetization of immobilized MNPs under an AC excitation field. We prepared three immobilized MNP samples: (1) a sample in which easy axes are randomly oriented, (2) a parallel-aligned sample in which easy axes are parallel to the AC field, and (3) an orthogonally aligned sample in which easy axes are perpendicular to the AC field. First, we show that the parallel-aligned sample has the largest hysteresis in the magnetization curve and the largest harmonic magnetization spectra, followed by the randomly oriented and orthogonally aligned samples. For example, a 1.6-fold increase was observed in the area of the hysteresis loop of the parallel-aligned sample compared to that of the randomly oriented sample. To quantitatively discuss the experimental results, we perform a numerical simulation based on a Fokker-Planck equation, in which probability distributions for the directions of the easy axes are taken into account in simulating the prepared MNP samples. We obtained quantitative agreement between experiment and simulation. These results indicate that the dynamic magnetization of immobilized MNPs is significantly affected by the alignment of the easy axes.
Multiphase three-dimensional direct numerical simulation of a rotating impeller with code Blue

NASA Astrophysics Data System (ADS)

Kahouadji, Lyes; Shin, Seungwon; Chergui, Jalel; Juric, Damir; Craster, Richard V.; Matar, Omar K.

2017-11-01

The flow driven by a rotating impeller inside an open fixed cylindrical cavity is simulated using code Blue, a solver for massively-parallel simulations of fully three-dimensional multiphase flows. The impeller is composed of four blades at a 45° inclination all attached to a central hub and tube stem. In Blue, solid forms are constructed through the definition of immersed objects via a distance function that accounts for the object's interaction with the flow for both single and two-phase flows. We use a moving frame technique for imposing translation and/or rotation. The variation of the Reynolds number, the clearance, and the tank aspect ratio are considered, and we highlight the importance of the confinement ratio (blade radius versus the tank radius) in the mixing process. Blue uses a domain decomposition strategy for parallelization with MPI. The fluid interface solver is based on a parallel implementation of a hybrid front-tracking/level-set method designed for complex interfacial topological changes. Parallel GMRES and multigrid iterative solvers are applied to the linear systems arising from the implicit solution for the fluid velocities and pressure in the presence of strong density and viscosity discontinuities across fluid phases. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).
Investigating nonlinear distortion in the photopolymer materials

NASA Astrophysics Data System (ADS)

Malallah, Ra'ed; Cassidy, Derek; Muniraj, Inbarasan; Zhao, Liang; Ryle, James P.; Sheridan, John T.

2017-05-01

Propagation and diffraction of a light beam through nonlinear materials are effectively compensated by the effect of self-trapping. The laser beam propagating through photo-sensitive polymer PVA/AA can generate a waveguide of higher refractive index in the direction of the light propagation. In order to investigate this phenomenon occurring in light-sensitive photopolymer media, the behaviour of a single light beam focused on the front surface of photopolymer bulk is investigated. As part of this work, the self-bending of parallel beams separated in space during self-writing of waveguides is studied. It is shown that there is a strong correlation between the intensity of the input beams and their separation distance and the resulting deformation of the waveguide trajectory during channel formation. This self-channeling can be modelled numerically using a three-dimensional model to describe what takes place inside the volume of a photopolymer medium. Corresponding numerical simulations show good agreement with experimental observations, which confirms the validity of the numerical model that was used to simulate these experiments.
A fully coupled method for massively parallel simulation of hydraulically driven fractures in 3-dimensions

DOE PAGES

Settgast, Randolph R.; Fu, Pengcheng; Walsh, Stuart D. C.; ...

2016-09-18

This study describes a fully coupled finite element/finite volume approach for simulating field-scale hydraulically driven fractures in three dimensions, using massively parallel computing platforms. The proposed method is capable of capturing realistic representations of local heterogeneities, layering and natural fracture networks in a reservoir. A detailed description of the numerical implementation is provided, along with numerical studies comparing the model with both analytical solutions and experimental results. The results demonstrate the effectiveness of the proposed method for modeling large-scale problems involving hydraulically driven fractures in three dimensions.
Influence of uneven rail irregularities on the dynamic response of the railway track using a three-dimensional model of the vehicle-track system

NASA Astrophysics Data System (ADS)

Naeimi, Meysam; Zakeri, Jabbar Ali; Esmaeili, Morteza; Shadfar, Morad

2015-01-01

A mathematical model of the vehicle-track interaction is developed to investigate the coupled behaviour of the vehicle-track system in the presence of uneven irregularities at the left/right rails. The railway vehicle is simplified as a 3D multi-rigid-body model, and the track is treated as two parallel beams on a layered discrete support system. Besides the car-body, the bogies and the wheel sets, the sleepers are assumed to have a roll degree of freedom, in order to simulate the in-plane rotation of the components. The wheel-rail interface is treated using a nonlinear Hertzian contact model, coupling the mathematical equations of the vehicle-track systems. The dynamic interaction of the entire system is numerically studied in the time domain, employing Newmark's integration method. The track irregularity spectra of both the left/right rails are taken into account as the inputs of dynamic excitations. The dynamic responses of the track system induced by such irregularities are obtained, particularly in terms of the vertical (bounce) and roll displacements. The numerical model of the present research is validated using several benchmark models reported in the literature, for both the smooth and unsmooth track conditions. Four sample profiles of the measured rail irregularities are considered as the case studies of excitation sources, examining their influences on the dynamic behaviour of the coupled system. The results of numerical simulations demonstrate that the motion of the track system is significantly influenced by the presence of uneven irregularities in the left/right rails. The dynamic response of the sleepers in the roll direction becomes more sensitive to the rail irregularities as the unevenness severity of the parallel profiles (quantitative difference between left and right rail spectra) is increased. The severe geometric deformation of the track in the bounce-pitch-roll directions is mainly related to such profile unevenness (cross-level) in the left/right rails.
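Newmark's integration method, named above as the time-stepping scheme, can be sketched for a single-degree-of-freedom oscillator. This is an illustrative example only and assumes a linear mass-damper-spring system with hypothetical parameters; the vehicle-track model of the abstract is a coupled, nonlinear multi-body system that is not reproduced here.

```python
import math


def newmark_sdof(m, c, k, force, x0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Integrate m*x'' + c*x' + k*x = force(t) with the Newmark scheme.

    beta=1/4, gamma=1/2 gives the average-acceleration variant, which is
    unconditionally stable for linear problems.
    Returns the list of displacements at t = 0, dt, ..., n_steps*dt.
    """
    x, v = x0, v0
    a = (force(0.0) - c * v - k * x) / m  # consistent initial acceleration
    history = [x]
    for i in range(1, n_steps + 1):
        t = i * dt
        # Predictors built from known quantities at the previous step.
        x_pred = x + dt * v + (0.5 - beta) * dt ** 2 * a
        v_pred = v + (1.0 - gamma) * dt * a
        # Enforce equilibrium at the new time to obtain the new acceleration.
        a_new = (force(t) - c * v_pred - k * x_pred) / (
            m + gamma * dt * c + beta * dt ** 2 * k
        )
        # Correctors.
        x = x_pred + beta * dt ** 2 * a_new
        v = v_pred + gamma * dt * a_new
        a = a_new
        history.append(x)
    return history


# Hypothetical check: undamped free vibration with a period of 1 s.
omega = 2.0 * math.pi
displacements = newmark_sdof(
    m=1.0, c=0.0, k=omega ** 2, force=lambda t: 0.0,
    x0=1.0, v0=0.0, dt=0.001, n_steps=1000,
)
```

For a multi-degree-of-freedom model the scalar division becomes a linear solve with the effective matrix, and a nonlinear contact law such as Hertzian contact would typically require iteration within each step.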
Automatic Multilevel Parallelization Using OpenMP

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Jost, Gabriele; Yan, Jerry; Ayguade, Eduard; Gonzalez, Marc; Martorell, Xavier; Biegel, Bryan (Technical Monitor)

2002-01-01

In this paper we describe the extension of the CAPO parallelization support tool to support multilevel parallelism based on OpenMP directives. CAPO generates OpenMP directives with extensions supported by the NanosCompiler to allow for directive nesting and definition of thread groups. We report first results for several benchmark codes and one full application that have been parallelized using our system.
Parallel processing for nonlinear dynamics simulations of structures including rotating bladed-disk assemblies

NASA Technical Reports Server (NTRS)

Hsieh, Shang-Hsien

1993-01-01

The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.

Zephyr: Open-source Parallel Seismic Waveform Inversion in an Integrated Python-based Framework

NASA Astrophysics Data System (ADS)

Smithyman, B. R.; Pratt, R. G.; Hadden, S. M.

2015-12-01

Seismic Full-Waveform Inversion (FWI) is an advanced method to reconstruct wave properties of materials in the Earth from a series of seismic measurements. These methods have been developed by researchers since the late 1980s, and now see significant interest from the seismic exploration industry. As researchers move towards implementing advanced numerical modelling (e.g., 3D, multi-component, anisotropic and visco-elastic physics), it is desirable to make use of a modular approach, minimizing the effort developing a new set of tools for each new numerical problem. SimPEG (http://simpeg.xyz) is an open source project aimed at constructing a general framework to enable geophysical inversion in various domains. In this abstract we describe Zephyr (https://github.com/bsmithyman/zephyr), which is a coupled research project focused on parallel FWI in the seismic context. The software is built on top of Python, Numpy and IPython, which enables very flexible testing and implementation of new features. Zephyr is an open source project, and is released freely to enable reproducible research. We currently implement a parallel, distributed seismic forward modelling approach that solves the 2.5D (two-and-one-half dimensional) viscoacoustic Helmholtz equation at a range of modelling frequencies, generating forward solutions for a given source behaviour, and gradient solutions for a given set of observed data. Solutions are computed in a distributed manner on a set of heterogeneous workers. The researcher's frontend computer may be separated from the worker cluster by a network link to enable full support for computation on remote clusters from individual workstations or laptops. The present codebase introduces a numerical discretization equivalent to that used by FULLWV, a well-known seismic FWI research codebase. This makes it straightforward to compare results from Zephyr directly with FULLWV.
The flexibility introduced by the use of a Python programming environment makes extension of the codebase with new methods much more straightforward. This enables comparison and integration of new efforts with existing results.
Artificial acoustic stiffness reduction in fully compressible, direct numerical simulation of combustion

NASA Astrophysics Data System (ADS)

Wang, Yi; Trouvé, Arnaud

2004-09-01

A pseudo-compressibility method is proposed to modify the acoustic time step restriction found in fully compressible, explicit flow solvers. The method manipulates terms in the governing equations of order Ma^2, where Ma is a characteristic flow Mach number. A decrease in the speed of acoustic waves is obtained by adding an extra term in the balance equation for total energy. This term is proportional to flow dilatation and uses a decomposition of the dilatational field into an acoustic component and a component due to heat transfer. The present method is a variation of the pressure gradient scaling (PGS) method proposed in Ramshaw et al (1985 Pressure gradient scaling method for fluid flow with nearly uniform pressure J. Comput. Phys. 58 361-76). It achieves gains in computational efficiencies similar to PGS: at the cost of a slightly more involved right-hand-side computation, the numerical time step increases by a full order of magnitude. It also features the added benefit of preserving the hydrodynamic pressure field. The original and modified PGS methods are implemented into a parallel direct numerical simulation solver developed for applications to turbulent reacting flows with detailed chemical kinetics. The performance of the pseudo-compressibility methods is illustrated in a series of test problems ranging from isothermal sound propagation to laminar premixed flame problems.
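The link between a reduced acoustic speed and a larger explicit time step can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a standard acoustic CFL restriction, dt = CFL * dx / (|u| + c), and a hypothetical reduction factor `alpha` applied to the sound speed; neither the specific CFL form nor the value of `alpha` is taken from the abstract.

```python
def explicit_time_step(dx, flow_speed, sound_speed, cfl=0.5):
    """Acoustic CFL-limited time step for an explicit compressible solver."""
    return cfl * dx / (abs(flow_speed) + sound_speed)


def time_step_gain(mach, alpha):
    """Ratio of time steps with and without the reduced acoustic speed.

    mach  : characteristic flow Mach number (u / c)
    alpha : assumed factor by which the acoustic speed is reduced
    """
    c = 1.0          # reference sound speed (nondimensional)
    u = mach * c     # characteristic flow speed
    dx = 1.0         # grid spacing cancels in the ratio
    return explicit_time_step(dx, u, c / alpha) / explicit_time_step(dx, u, c)


# At low Mach number the gain approaches alpha itself.
gain = time_step_gain(mach=0.001, alpha=10.0)
print(gain)
```

The estimate shows why the benefit is largest at low Mach number: once the reduced acoustic speed becomes comparable to the flow speed, the convective term limits the time step and further reduction brings little.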
Numerical Study of Rotating Turbulence with External Forcing

NASA Technical Reports Server (NTRS)

Yeung, P. K.; Zhou, Ye

1998-01-01

Direct numerical simulations at 256^3 resolution have been carried out to study the response of isotropic turbulence to the concurrent effects of solid-body rotation and numerical forcing at the large scales. Because energy transfer to the smaller scales is weakened by rotation, energy input from forcing gradually builds up at the large scales, causing the overall kinetic energy to increase. At intermediate wavenumbers the energy spectrum undergoes a transition from a limited k^(-5/3) inertial range to k^(-2) scaling recently predicted in the literature. Although the Reynolds stress tensor remains approximately isotropic and three-component, evidence for anisotropy and quasi-two-dimensionality in length scales and spectra in different velocity components and directions is strong. The small scales are found to deviate from local isotropy, primarily as a result of anisotropic transfer to the high wavenumbers. To understand the spectral dynamics of this flow we study the detailed behavior of nonlinear triadic interactions in wavenumber space. Spectral transfer in the velocity component parallel to the axis of rotation is qualitatively similar to that in non-rotating turbulence; however the perpendicular component is characterized by a greatly suppressed energy cascade at high wavenumber and a local reverse transfer at the largest scales. The broader implications of this work are briefly addressed.
A comparison of the primal and semi-dual variational formats of gradient-extended crystal inelasticity

NASA Astrophysics Data System (ADS)

Carlsson, Kristoffer; Runesson, Kenneth; Larsson, Fredrik; Ekh, Magnus

2017-10-01

In this paper we discuss issues related to the theoretical as well as the computational format of gradient-extended crystal viscoplasticity. The so-called primal format uses the displacements, the slip of each slip system and the dissipative stresses as the primary unknown fields. An alternative format is coined the semi-dual format, which in addition includes energetic microstresses among the primary unknown fields. We compare the primal and semi-dual variational formats in terms of advantages and disadvantages from modeling as well as numerical viewpoints. Finally, we perform a series of representative numerical tests to investigate the rate of convergence with finite element mesh refinement. In particular, it is shown that the commonly adopted microhard boundary condition poses a challenge in the special case that the slip direction is parallel to a grain boundary.
A parallel orbital-updating based plane-wave basis method for electronic structure calculations

NASA Astrophysics Data System (ADS)

Pan, Yan; Dai, Xiaoying; de Gironcoli, Stefano; Gong, Xin-Gao; Rignanese, Gian-Marco; Zhou, Aihui

2017-11-01

Motivated by the recently proposed parallel orbital-updating approach in real space method [1], we propose a parallel orbital-updating based plane-wave basis method for electronic structure calculations, for solving the corresponding eigenvalue problems. In addition, we propose two new modified parallel orbital-updating methods. Compared to the traditional plane-wave methods, our methods allow for two-level parallelization, which is particularly interesting for large scale parallelization. Numerical experiments show that these new methods are more reliable and efficient for large scale calculations on modern supercomputers.
Numerical solution of the exterior oblique derivative BVP using the direct BEM formulation

NASA Astrophysics Data System (ADS)

Čunderlík, Róbert; Špir, Róbert; Mikula, Karol

2016-04-01

The fixed gravimetric boundary value problem (FGBVP) represents an exterior oblique derivative problem for the Laplace equation. A direct formulation of the boundary element method (BEM) for the Laplace equation leads to a boundary integral equation (BIE) where a harmonic function is represented as a superposition of the single-layer and double-layer potential. Such a potential representation is applied to obtain a numerical solution of FGBVP. The oblique derivative problem is treated by a decomposition of the gradient of the unknown disturbing potential into its normal and tangential components. Our numerical scheme uses the collocation with linear basis functions. It involves a triangulated discretization of the Earth's surface as our computational domain considering its complicated topography. To achieve high-resolution numerical solutions, parallel implementations using the MPI subroutines as well as an iterative elimination of far zones' contributions are performed. Numerical experiments present a reconstruction of a harmonic function above the Earth's topography given by the spherical harmonic approach, namely by the EGM2008 geopotential model up to degree 2160. The SRTM30 global topography model is used to approximate the Earth's surface by the triangulated discretization. The obtained BEM solution with the resolution 0.05 deg (12,960,002 nodes) is compared with EGM2008. The standard deviation of residuals 5.6 cm indicates a good agreement. The largest residuals are obviously in high mountainous regions. They are negative reaching up to -0.7 m in Himalayas and about -0.3 m in Andes and Rocky Mountains. A local refinement in the area of Slovakia confirms an improvement of the numerical solution in this mountainous region despite the fact that the Earth's topography is here considered in more detail.
Numerical study of slip system activity and crystal lattice rotation under wedge nanoindents in tungsten single crystals

NASA Astrophysics Data System (ADS)

Volz, T.; Schwaiger, R.; Wang, J.; Weygand, S. M.

2018-05-01

Tungsten is a promising material for plasma facing components in future nuclear fusion reactors. In the present work, we numerically investigate the deformation behavior of unirradiated tungsten (a body-centered cubic (bcc) single crystal) underneath nanoindents. A finite element (FE) model is presented to simulate wedge indentation. Crystal plasticity finite element (CPFE) simulations were performed for face-centered cubic (fcc) and bcc single crystals accounting for the slip system family {110} <111> in the bcc crystal system and the {111} <110> slip family in the fcc system. The 90° wedge indenter was aligned parallel to the [1̄01]-direction and indented the crystal in the [01̄0]-direction up to a maximum indentation depth of 2 µm. In both the fcc and bcc single crystals, the activity of slip systems was investigated and compared. Good agreement with the results from former investigations on fcc single crystals was observed. Furthermore, the in-plane lattice rotation in the material underneath an indent was determined and compared for the fcc and bcc single crystals.
Propulsion of helical flagella near boundaries

NASA Astrophysics Data System (ADS)

Rodenborn, Bruce; Giesbrecht, Grant; Ni, Katha; Vock, Isaac

The presence of nearby boundaries is known to have dramatic effects on the swimming behavior of microorganisms because of the no-slip condition at the boundary. Microorganisms that use a helical flagellum experience forces both along the axis of the helix and in the direction perpendicular to the axis. These low Reynolds number boundary effects have primarily been studied using live bacteria and using numerical simulations. However, small scale measurements give limited information about the forces and torques on the microorganisms. Furthermore, numerical studies are approximate because they have generally used Stokeslet-based simulations with image Stokeslets to represent the effects of the boundaries. Instead, we directly measure the propulsion of macroscopic helical flagella with diameter 12 mm using a fluid with viscosity 10^5 times that of water to ensure the Reynolds number in the experiments is much less than unity, just as for bacteria. We measure the parallel and perpendicular forces as a function of boundary distance to determine the nonzero elements of the propulsive matrix for axial rotation near a boundary. We then compare our results to the theory and simulations of Lauga et al. and to biological measurements.
Direct numerical simulation of steady state, three dimensional, laminar flow around a wall mounted cube

NASA Astrophysics Data System (ADS)

Liakos, Anastasios; Malamataris, Nikolaos

2014-11-01

The topology and evolution of flow around a surface mounted cubical object in three dimensional channel flow is examined for low to moderate Reynolds numbers. Direct numerical simulations were performed via an in-house parallel finite element code. The computational domain has been designed according to actual laboratory experimental conditions. Analysis of the results is performed using the three dimensional theory of separation. Our findings indicate that a tornado-like vortex by the side of the cube is present for all Reynolds numbers for which flow was simulated. A horseshoe vortex upstream from the cube was formed at Reynolds number approximately 1266. Pressure distributions are shown along with three dimensional images of the tornado-like vortex and the horseshoe vortex at selected Reynolds numbers. Finally, and in accordance with previous work, our results indicate that the upper limit for the Reynolds number for which steady state results are physically realizable is roughly 2000. Financial support of author NM from the Office of Naval Research Global (ONRG-VSP, N62909-13-1-V016) is acknowledged.
Exact solutions of the Navier-Stokes equations generalized for flow in porous media

NASA Astrophysics Data System (ADS)

Daly, Edoardo; Basser, Hossein; Rudman, Murray

2018-05-01

Flow of Newtonian fluids in porous media is often modelled using a generalized version of the full non-linear Navier-Stokes equations that include additional terms describing the resistance to flow due to the porous matrix. Because this formulation is becoming increasingly popular in numerical models, exact solutions are required as a benchmark of numerical codes. The contribution of this study is to provide a number of non-trivial exact solutions of the generalized form of the Navier-Stokes equations for parallel flow in porous media. Steady-state solutions are derived in the case of flows in a medium with constant permeability along the main direction of flow and a constant cross-stream velocity in the case of both linear and non-linear drag. Solutions are also presented for cases in which the permeability changes in the direction normal to the main flow. An unsteady solution for a flow with velocity driven by a time-periodic pressure gradient is also derived. These solutions form a basis for validating computational models across a wide range of Reynolds and Darcy numbers.
Coupling single giant nanocrystal quantum dots to the fundamental mode of patch nanoantennas through fringe field

DOE PAGES

Wang, Feng; Karan, Niladri S.; Minh Nguyen, Hue; ...

2015-09-23

Through single dot spectroscopy and numerical simulation studies, we demonstrate that the fundamental mode of gold patch nanoantennas has a fringe-field resonance capable of enhancing the nano-emitters coupled around the edge of the patch antenna. This fringe-field coupling is used to enhance the radiative rates of core/thick-shell nanocrystal quantum dots (g-NQDs) that cannot be embedded into the ultra-thin dielectric gap of patch nanoantennas due to their large sizes. We attain 14 and 3 times enhancements in single exciton radiative decay rate and bi-exciton emission efficiencies of g-NQDs respectively, with no detectable metal quenching. Our numerical studies confirmed our experimental results and further reveal that patch nanoantennas can provide strong emission enhancement for dipoles lying not only in the radial direction of the circular patches but also in the direction normal to the antenna surface. Finally, this provides a distinct advantage over the parallel gap-bar antennas that can provide enhancement only for the dipoles oriented across the gap.
A model of plasma current through a hole of Rogowski probe including sheath effects

DOE Office of Scientific and Technical Information (OSTI.GOV)

Furui, H., E-mail: furui@fusion.k.u-tokyo.ac.jp; Ejiri, A.; Takase, Y.

2016-04-15

In TST-2 Ohmic discharges, local current is measured using a Rogowski probe by changing the angle between the local magnetic field and the direction of the hole of the Rogowski probe. The angular dependence shows a peak when the direction of the hole is almost parallel to the local magnetic field. The obtained width of the peak was broader than that of the theoretical curve expected from the probe geometry. In order to explain this disagreement, we consider the effect of the sheath in the vicinity of the Rogowski probe. A sheath model was constructed and electron orbits were numerically calculated. From the calculation, it was found that the electron orbit is affected by E × B drift due to the sheath electric field. Such orbits cause the broadening of the peak in the angular dependence, and the resulting dependence agrees with the experimental results. The dependence of the broadening on various plasma parameters was studied numerically and explained qualitatively by a simplified analytical model.
Morphological Simulation of Phase Separation Coupled Oscillation Shear and Varying Temperature Fields

NASA Astrophysics Data System (ADS)

Wang, Heping; Li, Xiaoguang; Lin, Kejun; Geng, Xingguo

2018-05-01

This paper explores the effect of the shear frequency and Prandtl number (Pr) on the process and pattern formation of phase separation in symmetric and asymmetric systems. For the symmetric system, the periodic shear significantly prolongs the spinodal decomposition stage and enlarges the separated domains in the domain growth stage. By adjusting Pr and the shear frequency, the number and orientation of the separated steady layer structures can be controlled during the domain stretch stage. The numerical results indicate that increasing Pr and decreasing the shear frequency significantly increase the number of layers in the lamellar structure, which corresponds to a decrease in domain size. Furthermore, the lamellar orientation changes from parallel to perpendicular to the shear direction when the shear frequency is increased further, and similar results are obtained for larger systems. For the asymmetric system, the quantitative analysis shows that decreasing the shear frequency enlarges the separated minority-phase domains. These numerical results provide guidance for setting the optimum conditions for phase separation under periodic shear and slow cooling.
Generation of temperature anisotropy for alpha particle velocity distributions in solar wind at 0.3 AU: Vlasov simulations and Helios observations

NASA Astrophysics Data System (ADS)

Perrone, D.; Bourouaine, S.; Valentini, F.; Marsch, E.; Veltri, P.

2014-04-01

Solar wind "in situ" measurements from the Helios spacecraft in regions of the Heliosphere close to the Sun (~0.3 AU), at which typical values of the proton plasma beta are observed to be lower than unity, show that the alpha particle distribution functions depart from the equilibrium Maxwellian configuration, displaying significant elongations in the direction perpendicular to the background magnetic field. In the present work, we made use of multi-ion hybrid Vlasov-Maxwell simulations to provide theoretical support and interpretation for the empirical evidence above. Our numerical results show that, at variance with the case of βp≃1 discussed in Perrone et al. (2011), for βp=0.1 the turbulent cascade in the direction parallel to the ambient magnetic field is not efficient in transferring energy toward scales shorter than the proton inertial length. Moreover, our numerical analysis provides new insights for the theoretical interpretation of the empirical evidence obtained from the Helios spacecraft concerning the generation of temperature anisotropy in the particle velocity distributions.
Drag reduction by herringbone riblet texture in direct numerical simulations of turbulent channel flow

NASA Astrophysics Data System (ADS)

Benschop, H. O. G.; Breugem, W.-P.

2017-08-01

A bird-feather-inspired herringbone riblet texture was investigated for turbulent drag reduction. The texture consists of blade riblets in a converging/diverging or herringbone pattern with spanwise wavelength Λf. The aim is to quantify the drag change for this texture as compared to a smooth wall and to study the underlying mechanisms. To that purpose, direct numerical simulations of turbulent flow in a channel with height Lz were performed. The Fukagata-Iwamoto-Kasagi identity for drag decomposition was extended to textured walls and was used to study the drag change mechanisms. For Λf/Lz ≳ O(10), the herringbone texture behaves similarly to a conventional parallel-riblet texture in yaw: the suppression of turbulent advective transport results in a slight drag reduction of 2%. For Λf/Lz ≲ O(1), the drag increases strongly with a maximum of 73%. This is attributed to enhanced mean and turbulent advection, which results from the strong secondary flow that forms over regions of riblet convergence/divergence. Hence, the employment of convergent/divergent riblets in the texture seems to be detrimental to turbulent drag reduction.
Multithreaded Model for Dynamic Load Balancing Parallel Adaptive PDE Computations

NASA Technical Reports Server (NTRS)

Chrisochoides, Nikos

1995-01-01

We present a multithreaded model for the dynamic load-balancing of numerical, adaptive computations required for the solution of Partial Differential Equations (PDEs) on multiprocessors. Multithreading is used as a means of exploring concurrency at the processor level in order to tolerate synchronization costs inherent to traditional (non-threaded) parallel adaptive PDE solvers. Our preliminary analysis for parallel, adaptive PDE solvers indicates that multithreading can be used as a mechanism to mask overheads required for the dynamic balancing of processor workloads with computations required for the actual numerical solution of the PDEs. Also, multithreading can simplify the implementation of dynamic load-balancing algorithms, a task that is very difficult for traditional data parallel adaptive PDE computations. Unfortunately, multithreading does not always simplify program complexity, often makes code reuse difficult, and increases software complexity.
Parallel Algorithms for Least Squares and Related Computations.

DTIC Science & Technology

1991-03-22

for dense computations in linear algebra. The work has recently been published in a general reference book on parallel algorithms by SIAM. AFO SR...written his Ph.D. dissertation with the principal investigator. (See publication 6.) • Parallel Algorithms for Dense Linear Algebra Computations. Our...and describe and to put into perspective a selection of the more important parallel algorithms for numerical linear algebra. We give a major new
Simulation of dispersion in layered coastal aquifer systems

USGS Publications Warehouse

Reilly, T.E.

1990-01-01

A density-dependent solute-transport formulation is used to examine ground-water flow in layered coastal aquifers. The numerical experiments indicate that although the transition zone may be thought of as an impermeable 'sharp' interface with freshwater flow parallel to the transition zone in homogeneous aquifers, this is not the case for layered systems. Freshwater can discharge through the transition zone in the confining units. Further, for the best simulation of layered coastal aquifer systems, either a flow-direction-dependent dispersion formulation is required, or the dispersivities must change spatially to reflect the tight thin confining unit. © 1990.
Parallelized direct execution simulation of message-passing parallel programs

NASA Technical Reports Server (NTRS)

Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

1994-01-01

As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
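The core idea of a discrete-event simulator, advancing a virtual clock from one timestamped event to the next, can be sketched in a few lines. The toy below is not LAPSE and is not parallelized; it models a hypothetical two-process ping-pong in which each process computes for a fixed time and then sends a message with a fixed network latency. The function name and all parameters are assumptions chosen for this example.

```python
import heapq


def simulate_pingpong(rounds, compute_time, latency):
    """Return the virtual time at which the last message is delivered.

    One round is a message from process 0 to process 1 and the reply.
    """
    events = []  # heap of (delivery_time, sequence_number, receiver)
    sequence = 0
    # Process 0 computes first, then its message spends `latency` in the network.
    heapq.heappush(events, (compute_time + latency, sequence, 1))
    delivered = 0
    now = 0.0
    while events:
        # Jump the virtual clock straight to the next pending event.
        now, _, receiver = heapq.heappop(events)
        delivered += 1
        if delivered == 2 * rounds:
            break
        # The receiver computes, then replies to the other process.
        sequence += 1
        heapq.heappush(events, (now + compute_time + latency, sequence, 1 - receiver))
    return now
```

In a direct-execution simulator the `compute_time` values would come from actually running the application code, while the latency model stands in for the target machine; here both are simply constants.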
Evidence for parallel consolidation of motion direction and orientation into visual short-term memory.

PubMed

Rideaux, Reuben; Apthorp, Deborah; Edwards, Mark

2015-02-12

Recent findings have indicated that the capacity to consolidate multiple items into visual short-term memory in parallel varies as a function of the type of information. That is, while color can be consolidated in parallel, evidence suggests that orientation cannot. Here we investigated the capacity to consolidate multiple motion directions in parallel and reexamined this capacity using orientation. This was achieved by determining the shortest exposure duration necessary to consolidate a single item, then examining whether two items, presented simultaneously, could be consolidated in that time. The results show that parallel consolidation of direction and orientation information is possible, and that parallel consolidation of direction appears to be limited to two. Additionally, we demonstrate the importance of adequate separation between feature intervals used to define items when attempting to consolidate in parallel, suggesting that when multiple items are consolidated in parallel, as opposed to serially, the resolution of representations suffers. Finally, we used facilitation of spatial attention to show that the deterioration of item resolution occurs during parallel consolidation, as opposed to storage. © 2015 ARVO.

Performance Modeling and Measurement of Parallelized Code for Distributed Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry

1998-01-01

This paper presents a model to evaluate the performance and overhead of parallelizing sequential code using compiler directives for multiprocessing on distributed shared memory (DSM) systems. With the increasing popularity of shared address space architectures, it is essential to understand their performance impact on programs that benefit from shared memory multiprocessing. We present a simple model to characterize the performance of programs that are parallelized using compiler directives for shared memory multiprocessing. We parallelized the sequential implementation of NAS benchmarks using native Fortran77 compiler directives for an Origin2000, which is a DSM system based on a cache-coherent Non Uniform Memory Access (ccNUMA) architecture. We report measurement-based performance of these parallelized benchmarks from four perspectives: efficacy of the parallelization process; scalability; parallelization overhead; and comparison with hand-parallelized and -optimized versions of the same benchmarks. Our results indicate that sequential programs can conveniently be parallelized for DSM systems using compiler directives, but realizing the performance gains predicted by the performance model depends primarily on minimizing architecture-specific data locality overhead.
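The abstract does not state the form of its performance model. As a generic illustration of how overhead erodes predicted gains, the sketch below uses an Amdahl-style estimate with an added overhead term that grows with the processor count. The linear overhead term, the function name, and the parameter names are all assumptions for this example and should not be read as the authors' model.

```python
def predicted_speedup(parallel_fraction, procs, overhead_fraction=0.0):
    """Amdahl-style speedup with a simple overhead penalty.

    Times are normalized so the sequential run takes 1.0. The overhead is
    assumed to cost `overhead_fraction` per additional processor.
    """
    if procs < 1:
        raise ValueError("procs must be at least 1")
    serial = 1.0 - parallel_fraction
    parallel = parallel_fraction / procs
    overhead = overhead_fraction * (procs - 1)
    return 1.0 / (serial + parallel + overhead)
```

With 90% of the work parallelized on 10 processors, the estimate drops from about 5.3 with no overhead to about 3.6 when each extra processor costs 1% of the sequential time, which shows why minimizing overhead can matter more than adding processors.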
Pressure gradients fail to predict diffusio-osmosis

NASA Astrophysics Data System (ADS)

Liu, Yawei; Ganti, Raman; Frenkel, Daan

2018-05-01

We present numerical simulations of diffusio-osmotic flow, i.e. the fluid flow generated by a concentration gradient along a solid-fluid interface. In our study, we compare a number of distinct approaches that have been proposed for computing such flows and compare them with a reference calculation based on direct, non-equilibrium molecular dynamics simulations. As alternatives, we consider schemes that compute diffusio-osmotic flow from the gradient of the chemical potentials of the constituent species and from the gradient of the component of the pressure tensor parallel to the interface. We find that the approach based on treating chemical potential gradients as external forces acting on various species agrees with the direct simulations, thereby supporting the approach of Marbach et al (2017 J. Chem. Phys. 146 194701). In contrast, an approach based on computing the gradients of the microscopic pressure tensor does not reproduce the direct non-equilibrium results.
Coherent Structures and Extreme Events in Rotating Multiphase Turbulent Flows

NASA Astrophysics Data System (ADS)

Biferale, L.; Bonaccorso, F.; Mazzitelli, I. M.; van Hinsberg, M. A. T.; Lanotte, A. S.; Musacchio, S.; Perlekar, P.; Toschi, F.

2016-10-01

By using direct numerical simulations (DNS) at unprecedented resolution, we study turbulence under rotation in the presence of simultaneous direct and inverse cascades. The accumulation of energy at large scale leads to the formation of vertical coherent regions with high vorticity oriented along the rotation axis. By seeding the flow with millions of inertial particles, we quantify—for the first time—the effects of those coherent vertical structures on the preferential concentration of light and heavy particles. Furthermore, we quantitatively show that extreme fluctuations, leading to deviations from a normal-distributed statistics, result from the entangled interaction of the vertical structures with the turbulent background. Finally, we present the first-ever measurement of the relative importance between Stokes drag, Coriolis force, and centripetal force along the trajectories of inertial particles. We discover that vortical coherent structures lead to unexpected diffusion properties for heavy and light particles in the directions parallel and perpendicular to the rotation axis.
On the suitability of the connection machine for direct particle simulation

NASA Technical Reports Server (NTRS)

Dagum, Leonard

1990-01-01

The algorithmic structure of the vectorizable Stanford particle simulation (SPS) method is examined, and the structure is reformulated in data parallel form. Some of the SPS algorithms can be directly translated to data parallel form, but several of the vectorizable algorithms have no direct data parallel equivalent. This requires the development of new, strictly data parallel algorithms. In particular, a new sorting algorithm is developed to identify collision candidates in the simulation, and a master/slave algorithm is developed to minimize communication cost in large table lookups. Validation of the method is undertaken through test calculations for thermal relaxation of a gas, shock wave profiles, and shock reflection from a stationary wall. A qualitative measure is provided of the performance of the Connection Machine for direct particle simulation. The massively parallel architecture of the Connection Machine is found to be quite suitable for this type of calculation. However, there are difficulties in taking full advantage of this architecture because of the lack of a broad-based tradition of data parallel programming. An important outcome of this work has been new data parallel algorithms specifically of use for direct particle simulation but which also expand the data parallel diction.
Failed oceanic transform models: experience of shaking the tree

NASA Astrophysics Data System (ADS)

Gerya, Taras

2017-04-01

In geodynamics, numerical modeling is often used as a trial-and-error tool, which does not necessarily require full understanding or even a correct concept for a modeled phenomenon. Paradoxically, in order to understand an enigmatic process one should simply try to model it based on some initial assumptions, which need not even be correct… The reason is that our intuition is not always well "calibrated" for understanding of geodynamic phenomena, which develop on space- and timescales that are very different from our everyday experience. We often have much better ideas about physical laws governing geodynamic processes than on how these laws should interact on geological space- and timescales. From this perspective, numerical models, in which these physical laws are self-consistently implemented, can gradually calibrate our intuition by exploring what scenarios are physically sensible and what are not. I personally went through this painful learning path many times and one noteworthy example was my 3D numerical modeling of oceanic transform faults. As I understand in retrospect, my initial literature-inspired concept of how and why transform faults form and evolve was thermomechanically inconsistent and based on two main assumptions (btw. both were incorrect!): (1) oceanic transforms are directly inherited from the continental rifting and breakup stages and (2) they represent plate fragmentation structures having peculiar extension-parallel orientation due to the stress rotation caused by thermal contraction of the oceanic lithosphere. During one year (!) of high-resolution thermomechanical numerical experiments exploring various physics (including very computationally demanding thermal contraction) I systematically observed how my initially prescribed extension-parallel weak transform faults connecting ridge segments rotated away from their original orientation and were converted into oblique ridge sections… This was really an epic failure!
However, at the very same time, some pseudo-2D "side-models" with an initially straight ridge and ad hoc strain-weakened rheology, which were run out of curiosity, suddenly showed spontaneous development of ridge curvature… A fraction of these models showed spontaneous development of orthogonal ridge-transform patterns by rotation of oblique ridge sections toward the extension-parallel direction to accommodate asymmetric plate accretion. The latter was controlled by detachment faults stabilized by strain weakening. Further exploration of these "side-models" resulted in a complete change of my concept for oceanic transforms: they are not plate fragmentation but rather plate growth structures stabilized by continuous plate accretion and rheological weakening of deforming rocks (Gerya, 2010, 2013). The conclusion is - keep shaking the tree and banana will fall… Gerya, T. (2010) Dynamical instability produces transform faults at mid-ocean ridges. Science, 329, 1047-1050. Gerya, T.V. (2013) Three-dimensional thermomechanical modeling of oceanic spreading initiation and evolution. Phys. Earth Planet. Interiors, 214, 35-52.
Acceleration of charged particles by crossed cyclotron waves, Resonant Moments Method

NASA Astrophysics Data System (ADS)

Ponomarjov, M.; Carati, D.

A mechanism for enhanced acceleration of charged particles in crossing radio frequency or microwaves propagating at different angles with respect to an external magnetic field is investigated. This mechanism consists in introducing low amplitude secondary waves in order to improve the parallel momentum transfer from the high amplitude primary wave to charged particles. The use of two parallel counter-propagating waves has recently been considered (Gell and Nakach, 1999) and numerical tests (Louies et al, 2001) have shown that the two-wave scheme may lead to higher averaged parallel velocity. On the other hand, it has been concluded that it may be more effective to accelerate electrons when the waves propagate obliquely to the external magnetic field (Karimabadi and Angelopoulos 1989, Cohen et al 1991). The idea considered here is similar although no constraint is imposed on the refraction indices of the primary and the secondary waves. The theoretical analysis of the acceleration mechanism is based on the Resonant Moments Method (RMM) in which moments of the velocity distribution are computed by using an average over the resonant layers (RL)i only instead of a complete phase-space average. The quantities obtained using this approach, referred to as Resonant Moments (RM), suggest the existence of optimal angles of propagation for the primary and secondary waves as long as the maximization of the parallel flux of charged particles is considered. The fraction of charged particles that are close to the resonance conditions, which correspond to the RL, then becomes as important as the time these particles remain resonant. The secondary wave tends to maintain a pseudo-equilibrium velocity distribution by continuously re-filling the RL. Our suggestions are confirmed by direct numerical simulations for a population of 10^5 relativistic electrons. The secondary wave yields a clear increase (up to one order of magnitude) of the average parallel velocity of the particles.
It is a quite promising result since the amplitude of the secondary wave is ten times lower than that of the primary wave. These qualitative results point to a mechanism for enhanced acceleration of charged particles (including relativistic electrons in planetary magnetospheres) by crossed cyclotron waves in an ambient magnetic field.
Directional multimode coupler for planar magnonics: Side-coupled magnetic stripes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sadovnikov, A. V., E-mail: sadovnikovav@gmail.com; Nikitov, S. A.; Kotel'nikov Institute of Radioengineering and Electronics, Russian Academy of Sciences, Moscow 125009

We experimentally demonstrate spin-wave coupling in two laterally adjacent magnetic stripes. By means of Brillouin light scattering spectroscopy, we show that the coupling efficiency depends both on the magnonic waveguides' geometry and the characteristics of spin-wave modes. In particular, the lateral confinement of coupled yttrium-iron-garnet stripes enables control over the spin-wave propagation characteristics. Numerical simulations (in time domain and frequency domain) reveal the nature of intermodal coupling between two magnonic stripes. The proposed topology of multimode magnonic coupler can be utilized as a building block for fabrication of integrated parallel functional and logic devices such as the frequency selective directional coupler or tunable splitter, enabling a number of potential applications for planar magnonics.
Three-dimensional numerical simulations of crustal-scale wrenching using a non-linear failure criterion

NASA Astrophysics Data System (ADS)

Braun, Jean

1994-08-01

We have developed a three-dimensional finite element model to study wrench deformation of the crust regarded as an elasto-plastic material obeying Murrell's extension of Griffith's failure criterion. Numerical experiments using this model predict that the imposed basal wrenching is accommodated by an array of oblique Riedel-like shears and Y-shears (parallel to the direction of wrenching). The partitioning of deformation between the two types of structure depends on the width of the zone of imposed basal wrenching and the existence of a component of deformation in the x-direction (normal to the direction of wrenching). The Riedel shears are arranged in spiral-like structures that root into the basal wrench zone. In cross-section, the Riedel shears resemble wedge-shaped flower structures similar to those often observed in seismic cross-sections. The 'polarity' of the flower structures is positive (or palm-tree-like) in transpression experiments and negative (or tulip-like) in transtension experiments. The orientation of the Riedel shears throughout the crust obeys Mohr's hypothesis for incipient faulting combined with Murrell's failure criterion. The model also predicts plastic dilatancy inversely proportional to the square root of the confining pressure; this result agrees qualitatively with field observations and the results of sand-box experiments and quantitatively with direct measurement of dilatancy during high-pressure rock-deformation experiments.
A simple finite-difference scheme for handling topography with the first-order wave equation

NASA Astrophysics Data System (ADS)

Mulder, W. A.; Huiskes, M. J.

2017-07-01

One approach to incorporate topography in seismic finite-difference codes is a local modification of the difference operators near the free surface. An earlier paper described an approach for modelling irregular boundaries in a constant-density acoustic finite-difference code, based on the second-order formulation of the wave equation that only involves the pressure. Here, a similar method is considered for the first-order formulation in terms of pressure and particle velocity, using a staggered finite-difference discretization both in space and in time. In one space dimension, the boundary conditions consist in imposing antisymmetry for the pressure and symmetry for particle velocity components. For the pressure, this means that the solution values as well as all even derivatives up to a certain order are zero on the boundary. For the particle velocity, all odd derivatives are zero. In 2D, the 1-D assumption is used along each coordinate direction, with antisymmetry for the pressure along the coordinate and symmetry for the particle velocity component parallel to that coordinate direction. Since the symmetry or antisymmetry should hold along the direction normal to the boundary rather than along the coordinate directions, this generates an additional numerical error on top of the time stepping errors and the errors due to the interior spatial discretization. Numerical experiments in 2D and 3D nevertheless produce acceptable results.
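The one-dimensional boundary treatment described above can be illustrated with mirrored ghost values. The sketch below assumes a flat boundary at each end of a 1-D domain, a fourth-order staggered stencil with the standard coefficients 9/8 and -1/24, and leapfrog time stepping; it does not attempt the paper's topography handling. Pressure ghost values are filled by antisymmetry and velocity ghost values by symmetry, and the result is compared with an exact standing wave. All names and parameter values are choices made for this example.

```python
import math


def standing_wave_error(n_cells=100, cfl=0.5, t_end=0.75):
    """Max pressure error for p = sin(kx) cos(wt) with p = 0 at both ends."""
    length, c, rho = 1.0, 1.0, 1.0
    dx = length / n_cells
    steps = round(t_end / (cfl * dx / c))
    dt = t_end / steps
    k = math.pi / length
    omega = c * k
    # Pressure lives on nodes i*dx at time 0; velocity on (i+1/2)*dx at dt/2.
    p = [math.sin(k * i * dx) for i in range(n_cells + 1)]
    v = [-math.cos(k * (i + 0.5) * dx) * math.sin(0.5 * omega * dt) / (rho * c)
         for i in range(n_cells)]

    def p_at(j):
        """Pressure with antisymmetric mirroring about both boundaries."""
        if j < 0:
            return -p[-j]
        if j > n_cells:
            return -p[2 * n_cells - j]
        return p[j]

    def v_at(j):
        """Velocity with symmetric mirroring about both boundaries."""
        if j < 0:
            return v[-j - 1]
        if j > n_cells - 1:
            return v[2 * n_cells - 1 - j]
        return v[j]

    a, b = 9.0 / 8.0, -1.0 / 24.0  # fourth-order staggered coefficients
    for _ in range(steps):
        # dp/dt = -rho c^2 dv/dx, evaluated at pressure nodes.
        p = [p[i] - dt * rho * c**2
             * (a * (v_at(i) - v_at(i - 1)) + b * (v_at(i + 1) - v_at(i - 2))) / dx
             for i in range(n_cells + 1)]
        # dv/dt = -(1/rho) dp/dx, evaluated at velocity nodes.
        v = [v[i] - dt / rho
             * (a * (p_at(i + 1) - p_at(i)) + b * (p_at(i + 2) - p_at(i - 1))) / dx
             for i in range(n_cells)]
    exact = [math.sin(k * i * dx) * math.cos(omega * t_end)
             for i in range(n_cells + 1)]
    return max(abs(p[i] - exact[i]) for i in range(n_cells + 1))
```

Because the symmetric velocity ghosts make the discrete divergence vanish at the boundary nodes, the boundary pressure stays at zero without being set explicitly. In 2-D or 3-D with a boundary that is not aligned with the grid, the same mirroring applied along coordinate lines introduces the additional error the abstract mentions.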
Different Relative Orientation of Static and Alternative Magnetic Fields and Cress Roots Direction of Growth Changes Their Gravitropic Reaction

NASA Astrophysics Data System (ADS)

Sheykina, Nadiia; Bogatina, Nina

The following variants of root orientation relative to the static and alternating components of the magnetic field were studied. In the first variant, the static magnetic field was directed parallel to the gravitation vector, the alternating magnetic field was directed perpendicular to the static one, and the roots were directed perpendicular to both field components and to the gravitation vector. In this variant, negative gravitropism of cress roots was observed. In the second variant, the static magnetic field was directed parallel to the gravitation vector, the alternating magnetic field was directed perpendicular to the static one, and the roots were directed parallel to the alternating magnetic field. In the third variant, the alternating magnetic field was directed parallel to the gravitation vector, the static magnetic field was directed perpendicular to the gravitation vector, and the roots were directed perpendicular to both field components and to the gravitation vector. In the fourth variant, the alternating magnetic field was directed parallel to the gravitation vector, the static magnetic field was directed perpendicular to the gravitation vector, and the roots were directed parallel to the static magnetic field. In all cases studied, the alternating magnetic field frequency was equal to the cyclotron frequency of Ca ions. In variants 2, 3 and 4, gravitropism was positive, but the gravitropic reaction speeds differed. In the second and fourth variants, the gravitropic reaction speed coincided, within error limits, with that under Earth’s conditions. In the third variant, the gravitropic reaction was slowed substantially.
Extending HPF for advanced data parallel applications

NASA Technical Reports Server (NTRS)

Chapman, Barbara; Mehrotra, Piyush; Zima, Hans

1994-01-01

The stated goal of High Performance Fortran (HPF) was to 'address the problems of writing data parallel programs where the distribution of data affects performance'. After examining the current version of the language we are led to the conclusion that HPF has not fully achieved this goal. While the basic distribution functions offered by the language - regular block, cyclic, and block cyclic distributions - can support regular numerical algorithms, advanced applications such as particle-in-cell codes or unstructured mesh solvers cannot be expressed adequately. We believe that this is a major weakness of HPF, significantly reducing its chances of becoming accepted in the numeric community. The paper discusses the data distribution and alignment issues in detail, points out some flaws in the basic language, and outlines possible future paths of development. Furthermore, we briefly deal with the issue of task parallelism and its integration with the data parallel paradigm of HPF.
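The three regular distributions named above can be written as index-to-processor mappings. The sketch below is an illustrative Python rendering, not HPF syntax: it assumes zero-based array indices and processor numbers (Fortran arrays are normally one-based), and the function name and `scheme` strings are invented for this example.

```python
def owner(index, n, procs, scheme="block", block=1):
    """Return the processor that owns element `index` of an n-element array."""
    if not 0 <= index < n:
        raise IndexError("index out of range")
    if scheme == "block":
        # Contiguous chunks of ceil(n / procs) elements per processor.
        size = -(-n // procs)
        return index // size
    if scheme == "cyclic":
        # Elements dealt out one at a time, round-robin.
        return index % procs
    if scheme == "block_cyclic":
        # Chunks of `block` elements dealt out round-robin.
        return (index // block) % procs
    raise ValueError("unknown scheme: " + scheme)


def layout(n, procs, scheme="block", block=1):
    """Owner of every element, in index order."""
    return [owner(i, n, procs, scheme, block) for i in range(n)]
```

Each mapping depends only on the index and fixed parameters, which suits regular stencil codes. A particle-in-cell code or an unstructured mesh solver needs an ownership map that follows the data, which is the gap the abstract describes.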
Developing Information Power Grid Based Algorithms and Software

NASA Technical Reports Server (NTRS)

Dongarra, Jack

1998-01-01

This exploratory study initiated our effort to understand performance modeling on parallel systems. The basic goal of performance modeling is to understand and predict the performance of a computer program or set of programs on a computer system. Performance modeling has numerous applications, including evaluation of algorithms, optimization of code implementations, parallel library development, comparison of system architectures, parallel system design, and procurement of new systems. Our work lays the basis for the construction of parallel libraries that allow for the reconstruction of application codes on several distinct architectures so as to assure performance portability. Following our strategy, once the requirements of applications are well understood, one can then construct a library in a layered fashion. The top level of this library will consist of architecture-independent geometric, numerical, and symbolic algorithms that are needed by the sample of applications. These routines should be written in a language that is portable across the targeted architectures.
A Parallel Stochastic Framework for Reservoir Characterization and History Matching

DOE PAGES

Thomas, Sunil G.; Klie, Hector M.; Rodriguez, Adolfo A.; ...

2011-01-01

The spatial distribution of parameters that characterize the subsurface is never known to any reasonable level of accuracy required to solve the governing PDEs of multiphase flow or species transport through porous media. This paper presents a numerically cheap, yet efficient, accurate and parallel framework to estimate reservoir parameters, for example, medium permeability, using sensor information from measurements of the solution variables such as phase pressures, phase concentrations, fluxes, and seismic and well log data. Numerical results are presented to demonstrate the method.
Advances in locally constrained k-space-based parallel MRI.

PubMed

Samsonov, Alexey A; Block, Walter F; Arunachalam, Arjun; Field, Aaron S

2006-02-01

In this article, several theoretical and methodological developments regarding k-space-based, locally constrained parallel MRI (pMRI) reconstruction are presented. A connection between Parallel MRI with Adaptive Radius in k-Space (PARS) and GRAPPA methods is demonstrated. The analysis provides a basis for unified treatment of both methods. Additionally, a weighted PARS reconstruction is proposed, which may absorb different weighting strategies for improved image reconstruction. Next, a fast and efficient method for pMRI reconstruction of data sampled on non-Cartesian trajectories is described. In the new technique, the computational burden associated with the numerous matrix inversions in the original PARS method is drastically reduced by limiting direct calculation of reconstruction coefficients to only a few reference points. The rest of the coefficients are found by interpolating between the reference sets, which is possible due to the similar configuration of points participating in reconstruction for highly symmetric trajectories, such as radial and spirals. As a result, the time requirements are drastically reduced, which makes it practical to use pMRI with non-Cartesian trajectories in many applications. The new technique was demonstrated with simulated and actual data sampled on radial trajectories. Copyright 2006 Wiley-Liss, Inc.
A new estimate for present-day Cocos-Caribbean Plate motion: Implications for slip along the Central American Volcanic Arc

NASA Astrophysics Data System (ADS)

DeMets, Charles

Velocities from 153 continuously-operating GPS sites on the Caribbean, North American, and Pacific plates are combined with 61 newly estimated Pacific-Cocos seafloor spreading rates and additional marine geophysical data to derive a new estimate of present-day Cocos-Caribbean plate motion. A comparison of the predicted Cocos-Caribbean direction to slip directions of numerous shallow-thrust subduction earthquakes from the Middle America trench between Costa Rica and Guatemala shows the slip directions to be deflected 10° clockwise from the plate convergence direction, supporting the hypothesis that frequent dextral strike-slip earthquakes along the Central American volcanic arc result from partitioning of oblique Cocos-Caribbean plate convergence. Linear velocity analysis for forearc locations in Nicaragua and Guatemala predicts 14±2 mm yr-1 of northwestward trench-parallel slip of the forearc relative to the Caribbean plate, possibly decreasing in magnitude in El Salvador and Guatemala, where extension east of the volcanic arc complicates the tectonic setting.
Finite Element Analysis of Magnetic Damping Effects on G-Jitter Induced Fluid Flow

NASA Technical Reports Server (NTRS)

Pan, Bo; Li, Ben Q.; deGroh, Henry C., III

1997-01-01

This paper reports some interim results on numerical modeling and analyses of magnetic damping of g-jitter driven fluid flow in microgravity. A finite element model is developed to represent the fluid flow, thermal and solute transport phenomena in a 2-D cavity under g-jitter conditions with and without an applied magnetic field. The numerical model is checked against analytical solutions obtained for a simple parallel plate channel flow driven by g-jitter in a transverse magnetic field. The model is then applied to study the effect of steady-state g-jitter-induced oscillation on the solute redistribution in the liquid, which bears direct relevance to the Bridgman-Stockbarger single crystal growth processes. A selection of computed results is presented; they indicate that an applied magnetic field can effectively damp the velocity caused by g-jitter and help to reduce the time variation of solute redistribution.
A method of measuring the effective thermal conductivity of thermoplastic foams

NASA Astrophysics Data System (ADS)

Asséko, André Chateau Akué; Cosson, Benoit; Chaki, Salim; Duborper, Clément; Lacrampe, Marie-France; Krawczak, Patricia

2017-10-01

An inverse method for determining the in-plane effective thermal conductivity of porous thermoplastics was implemented by coupling infrared thermography experiments and numerical solution of heat transfer in straight fins having temperature-dependent convective heat transfer coefficient. The obtained effective thermal conductivity values were compared with previous results obtained using a numerical solution based on periodic homogenization techniques (NSHT) in which the microstructure heterogeneity of extruded polymeric polyethylene (PE) foam in which pores are filled with air with different levels of open and closed porosity was taken into account and Transient Plane Source Technique (TPS) in order to verify the accuracy of the proposed method. The new method proposed in the present study is in good agreement with both NSHT and TPS. It is also applicable to structural materials such as composites, e.g. unidirectional fiber-reinforced plastics, where heat transfer is very different according to the fiber direction (parallel or transverse to the fibers).
Terascale High-Fidelity Simulations of Turbulent Combustion with Detailed Chemistry: Spray Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rutland, Christopher J.

2009-04-26

The Terascale High-Fidelity Simulations of Turbulent Combustion (TSTC) project is a multi-university collaborative effort to develop a high-fidelity turbulent reacting flow simulation capability utilizing terascale, massively parallel computer technology. The main paradigm of the approach is direct numerical simulation (DNS) featuring the highest temporal and spatial accuracy, allowing quantitative observations of the fine-scale physics found in turbulent reacting flows as well as providing a useful tool for development of sub-models needed in device-level simulations. Under this component of the TSTC program the simulation code named S3D, developed and shared with coworkers at Sandia National Laboratories, has been enhanced with new numerical algorithms and physical models to provide predictive capabilities for turbulent liquid fuel spray dynamics. Major accomplishments include improved fundamental understanding of mixing and auto-ignition in multi-phase turbulent reactant mixtures and turbulent fuel injection spray jets.
Concurrent Cuba

NASA Astrophysics Data System (ADS)

Hahn, T.

2016-10-01

The parallel version of the multidimensional numerical integration package Cuba is presented and achievable speed-ups discussed. The parallelization is based on the fork/wait POSIX functions, needs no extra software installed, imposes almost no constraints on the integrand function, and works largely automatically.
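The fork/wait pattern mentioned in this abstract can be illustrated with a minimal sketch. It is not Cuba's code or API: the integrand, the midpoint rule (Cuba itself uses Monte Carlo and related methods), and the names `integrand`, `partial_sum` and `integrate_forked` are all assumptions made for illustration. The sketch runs only on POSIX systems, where `os.fork` is available.

```python
import os
import struct


def integrand(x):
    """Hypothetical integrand on [0, 1]; its exact integral is 1/3."""
    return x * x


def partial_sum(worker, n_workers, n_points):
    """Midpoint-rule sum over the sample points assigned to one worker."""
    total = 0.0
    for k in range(worker, n_points, n_workers):
        total += integrand((k + 0.5) / n_points)
    return total


def integrate_forked(n_workers=4, n_points=100_000):
    """Fork one child per worker and collect partial sums through pipes."""
    children = []
    for worker in range(n_workers):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: compute its share, send 8 bytes, exit at once.
            os.close(read_fd)
            value = partial_sum(worker, n_workers, n_points)
            os.write(write_fd, struct.pack("d", value))
            os.close(write_fd)
            os._exit(0)
        # Parent process: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((pid, read_fd))

    total = 0.0
    for pid, read_fd in children:
        data = os.read(read_fd, 8)
        os.close(read_fd)
        os.waitpid(pid, 0)  # reap the child (the "wait" half of fork/wait)
        total += struct.unpack("d", data)[0]
    return total / n_points
```

Because each child is a copy of the parent process, the integrand needs no special preparation, which is in the spirit of the abstract's remark that the approach imposes almost no constraints on the integrand function.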
A Fast MHD Code for Gravitationally Stratified Media using Graphical Processing Units: SMAUG

NASA Astrophysics Data System (ADS)

Griffiths, M. K.; Fedun, V.; Erdélyi, R.

2015-03-01

Parallelization techniques have been exploited most successfully by the gaming/graphics industry with the adoption of graphical processing units (GPUs), possessing hundreds of processor cores. The opportunity has been recognized by the computational sciences and engineering communities, who have recently harnessed successfully the numerical performance of GPUs. For example, parallel magnetohydrodynamic (MHD) algorithms are important for numerical modelling of highly inhomogeneous solar, astrophysical and geophysical plasmas. Here, we describe the implementation of SMAUG, the Sheffield Magnetohydrodynamics Algorithm Using GPUs. SMAUG is a 1-3D MHD code capable of modelling magnetized and gravitationally stratified plasma. The objective of this paper is to present the numerical methods and techniques used for porting the code to this novel and highly parallel compute architecture. The methods employed are justified by the performance benchmarks and validation results demonstrating that the code successfully simulates the physics for a range of test scenarios including a full 3D realistic model of wave propagation in the solar atmosphere.

The Automatic Parallelisation of Scientific Application Codes Using a Computer Aided Parallelisation Toolkit

NASA Technical Reports Server (NTRS)

Ierotheou, C.; Johnson, S.; Leggett, P.; Cross, M.; Evans, E.; Jin, Hao-Qiang; Frumkin, M.; Yan, J.; Biegel, Bryan (Technical Monitor)

2001-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. Historically, the lack of a programming standard for using directives and the rather limited performance due to scalability have affected the take-up of this programming model approach. Significant progress has been made in hardware and software technologies; as a result, the performance of parallel programs with compiler directives has also improved. The introduction of an industrial standard for shared-memory programming with directives, OpenMP, has also addressed the issue of portability. In this study, we have extended the computer aided parallelization toolkit (developed at the University of Greenwich), to automatically generate OpenMP based parallel programs with nominal user assistance. We outline the way in which loop types are categorized and how efficient OpenMP directives can be defined and placed using the in-depth interprocedural analysis that is carried out by the toolkit. We also discuss the application of the toolkit on the NAS Parallel Benchmarks and a number of real-world application codes. This work not only demonstrates the great potential of using the toolkit to quickly parallelize serial programs but also the good performance achievable on up to 300 processors for hybrid message passing and directive-based parallelizations.
Numerical techniques in radiative heat transfer for general, scattering, plane-parallel media

NASA Technical Reports Server (NTRS)

Sharma, A.; Cogley, A. C.

1982-01-01

The study of radiative heat transfer with scattering usually leads to the solution of singular Fredholm integral equations. The present paper presents an accurate and efficient numerical method to solve certain integral equations that govern radiative equilibrium problems in plane-parallel geometry for both grey and nongrey, anisotropically scattering media. In particular, the nongrey problem is represented by a spectral integral of a system of nonlinear integral equations in space, which has not been solved previously. The numerical technique is constructed to handle this unique nongrey governing equation as well as the difficulties caused by singular kernels. Example problems are solved and the method's accuracy and computational speed are analyzed.
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors

NASA Technical Reports Server (NTRS)

Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)

1998-01-01

This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on the SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on the SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing the sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes

NASA Technical Reports Server (NTRS)

Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds some of the commercial tools.
Parallelization of Lower-Upper Symmetric Gauss-Seidel Method for Chemically Reacting Flow

NASA Technical Reports Server (NTRS)

Yoon, Seokkwan; Jost, Gabriele; Chang, Sherry

2005-01-01

Development of technologies for exploration of the solar system has revived an interest in computational simulation of chemically reacting flows since planetary probe vehicles exhibit non-equilibrium phenomena during the atmospheric entry of a planet or a moon as well as the reentry to the Earth. Stability in combustion is essential for new propulsion systems. Numerical solution of real-gas flows often increases computational work by an order-of-magnitude compared to perfect gas flow partly because of the increased complexity of equations to solve. Recently, as part of Project Columbia, NASA has integrated a cluster of interconnected SGI Altix systems to provide a ten-fold increase in current supercomputing capacity that includes an SGI Origin system. Both the new and existing machines are based on cache coherent non-uniform memory access architecture. The Lower-Upper Symmetric Gauss-Seidel (LU-SGS) relaxation method has been implemented into both perfect and real gas flow codes including the Real-Gas Aerodynamic Simulator (RGAS). However, the vectorized RGAS code runs inefficiently on cache-based shared-memory machines such as the SGI systems. Parallelization of a Gauss-Seidel method is nontrivial due to its sequential nature. The LU-SGS method has been vectorized on an oblique plane in the INS3D-LU code that has been one of the base codes for the NAS Parallel Benchmarks. The oblique plane has been called a hyperplane by computer scientists. It is straightforward to parallelize a Gauss-Seidel method by partitioning the hyperplanes once they are formed. Another way of parallelization is to schedule processors like a pipeline using software. Both hyperplane and pipeline methods have been implemented using OpenMP directives. The present paper reports the performance of the parallelized RGAS code on SGI Origin and Altix systems.
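The hyperplane idea in this abstract can be sketched on a much simpler problem. The example below is an illustration under stated assumptions, not the RGAS or INS3D-LU code: it uses a 2-D five-point Gauss-Seidel sweep for Laplace's equation instead of 3-D LU-SGS, and the names `make_grid`, `hyperplanes`, `sweep_lexicographic` and `sweep_hyperplane` are hypothetical. Points with the same index sum i + j form one hyperplane; they depend only on the neighbouring hyperplanes, so they can be updated in any order or concurrently.

```python
def make_grid(n, m):
    """Grid with arbitrary fixed boundary values and a zero interior."""
    return [[float(i + 2 * j) if i in (0, n - 1) or j in (0, m - 1) else 0.0
             for j in range(m)] for i in range(n)]


def update(u, i, j):
    """Five-point Gauss-Seidel update of one interior point, in place."""
    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])


def sweep_lexicographic(u):
    """Reference sweep: strictly sequential row-by-row ordering."""
    n, m = len(u), len(u[0])
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            update(u, i, j)


def hyperplanes(n, m):
    """Yield the interior points (i, j) grouped by the index sum i + j."""
    for k in range(2, n + m - 3):
        first = max(1, k - (m - 2))
        last = min(n - 2, k - 1)
        yield [(i, k - i) for i in range(first, last + 1)]


def sweep_hyperplane(u):
    """Same sweep, ordered by hyperplane.

    Points inside one hyperplane do not read each other's values, so the
    inner loop is the part that could be divided among threads.
    """
    n, m = len(u), len(u[0])
    for plane in hyperplanes(n, m):
        for i, j in plane:
            update(u, i, j)
```

Running both sweeps on identical grids gives identical values, because every point still sees updated values from (i-1, j) and (i, j-1) and old values from (i+1, j) and (i, j+1). The hyperplane ordering changes only which updates may proceed at the same time.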
Parallelizing alternating direction implicit solver on GPUs

USDA-ARS's Scientific Manuscript database

We present a parallel Alternating Direction Implicit (ADI) solver on GPUs. Our implementation significantly improves existing implementations in two aspects. First, we address the scalability issue of existing Parallel Cyclic Reduction (PCR) implementations by eliminating their hardware resource con...
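For context, each ADI half-step reduces to many independent tridiagonal solves along grid lines, and Parallel Cyclic Reduction (PCR) is one way to solve such a system with row-level parallelism. The sketch below is a plain, sequential rendering of textbook PCR, not the authors' GPU implementation; the function name `pcr_solve` and the array layout are assumptions.

```python
def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system with parallel cyclic reduction (PCR).

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] == 0 and c[-1] == 0.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        # New coefficients are written to fresh arrays so that every row
        # reads only values from the previous reduction step.
        na, nb, nc, nd = [0.0] * n, list(b), [0.0] * n, list(d)
        for i in range(n):  # rows are independent: one thread per row on a GPU
            lo, hi = i - stride, i + stride
            if lo >= 0:
                # Eliminate x[lo] using row lo.
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                # Eliminate x[hi] using row hi.
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2  # each row now couples to rows twice as far away
    # All off-diagonal couplings are gone; every row is a scalar equation.
    return [d[i] / b[i] for i in range(n)]
```

After about log2(n) reduction steps every row is decoupled, in contrast to the inherently sequential forward and backward passes of the Thomas algorithm. The sketch assumes a system, such as a diagonally dominant one, in which no pivot `b[i]` becomes zero.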
Numerical Simulation of the Vortex-Induced Vibration of A Curved Flexible Riser in Shear Flow

NASA Astrophysics Data System (ADS)

Zhu, Hong-jun; Lin, Peng-zhi

2018-06-01

A series of fully three-dimensional (3D) numerical simulations of flow past a free-to-oscillate curved flexible riser in shear flow were conducted at Reynolds number of 185-1015. The numerical results obtained by the two-way fluid-structure interaction (FSI) simulations are in good agreement with the experimental results reported in the earlier study. It is further found that the frequency transition is out of phase not only in the inline (IL) and crossflow (CF) directions but also along the span direction. The mode competition leads to the non-zero nodes of the root-mean-square (RMS) amplitude and the relatively chaotic trajectories. The fluid-structure interaction is to some extent reflected by the transverse velocity of the ambient fluid, which reaches the maximum value when the riser reaches the equilibrium position. Moreover, the local maximum transverse velocities occur at the peak CF amplitudes, and the values are relatively large when the vibration is in the resonance regions. The 3D vortex columns are shed nearly parallel to the axis of the curved flexible riser. As the local Reynolds number increases from 0 at the bottom of the riser to the maximum value at the top, the wake undergoes a transition from a two-dimensional structure to a 3D one. More irregular small-scale vortices appeared at the wake region of the riser, undergoing large amplitude responses.
Generating performance portable geoscientific simulation code with Firedrake (Invited)

NASA Astrophysics Data System (ADS)

Ham, D. A.; Bercea, G.; Cotter, C. J.; Kelly, P. H.; Loriant, N.; Luporini, F.; McRae, A. T.; Mitchell, L.; Rathgeber, F.

2013-12-01

This presentation will demonstrate how a change in simulation programming paradigm can be exploited to deliver sophisticated simulation capability which is far easier to programme than are conventional models, is capable of exploiting different emerging parallel hardware, and is tailored to the specific needs of geoscientific simulation. Geoscientific simulation represents a grand challenge computational task: many of the largest computers in the world are tasked with this field, and the requirements of resolution and complexity of scientists in this field are far from being sated. However, single thread performance has stalled, even sometimes decreased, over the last decade, and has been replaced by ever more parallel systems: both as conventional multicore CPUs and in the emerging world of accelerators. At the same time, the needs of scientists to couple ever-more complex dynamics and parametrisations into their models makes the model development task vastly more complex. The conventional approach of writing code in low level languages such as Fortran or C/C++ and then hand-coding parallelism for different platforms by adding library calls and directives forces the intermingling of the numerical code with its implementation. This results in an almost impossible set of skill requirements for developers, who must simultaneously be domain science experts, numericists, software engineers and parallelisation specialists. Even more critically, it requires code to be essentially rewritten for each emerging hardware platform. Since new platforms are emerging constantly, and since code owners do not usually control the procurement of the supercomputers on which they must run, this represents an unsustainable development load. The Firedrake system, conversely, offers the developer the opportunity to write PDE discretisations in the high-level mathematical language UFL from the FEniCS project (http://fenicsproject.org). 
Non-PDE model components, such as parametrisations, can be written as short C kernels operating locally on the underlying mesh, with no explicit parallelism. The executable code is then generated in C, CUDA or OpenCL and executed in parallel on the target architecture. The system also offers features of special relevance to the geosciences. In particular, the large scale separation between the vertical and horizontal directions in many geoscientific processes can be exploited to offer the flexibility of unstructured meshes in the horizontal direction, without the performance penalty usually associated with those methods.
Scalable Domain Decomposed Monte Carlo Particle Transport

DOE Office of Scientific and Technical Information (OSTI.GOV)

O'Brien, Matthew Joseph

2013-12-05

In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.
A parallel algorithm for the eigenvalues and eigenvectors for a general complex matrix

NASA Technical Reports Server (NTRS)

Shroff, Gautam

1989-01-01

A new parallel Jacobi-like algorithm is developed for computing the eigenvalues of a general complex matrix. Most parallel methods for this problem typically display only linear convergence. Sequential norm-reducing algorithms also exist, and they display quadratic convergence in most cases. The new algorithm is a parallel form of the norm-reducing algorithm due to Eberlein. It is proven that the asymptotic convergence rate of this algorithm is quadratic. Numerical experiments are presented which demonstrate the quadratic convergence of the algorithm, and certain situations where the convergence is slow are also identified. The algorithm promises to be very competitive on a variety of parallel architectures.
Coherent backscattering of light by complex random media of spherical scatterers: numerical solution

NASA Astrophysics Data System (ADS)

Muinonen, Karri

2004-07-01

Novel Monte Carlo techniques are described for the computation of reflection coefficient matrices for multiple scattering of light in plane-parallel random media of spherical scatterers. The present multiple scattering theory is composed of coherent backscattering and radiative transfer. In the radiative transfer part, the Stokes parameters of light escaping from the medium are updated at each scattering process in predefined angles of emergence. The scattering directions at each process are randomized using probability densities for the polar and azimuthal scattering angles: the former angle is generated using the single-scattering phase function, whereafter the latter follows from Kepler's equation. For spherical scatterers in the Rayleigh regime, randomization proceeds semi-analytically whereas, beyond that regime, cubic spline presentation of the scattering matrix is used for numerical computations. In the coherent backscattering part, the reciprocity of electromagnetic waves in the backscattering direction allows the renormalization of the reversely propagating waves, whereafter the scattering characteristics are computed in other directions. High orders of scattering (~10 000) can be treated because of the peculiar polarization characteristics of the reverse wave: after a number of scatterings, the polarization state of the reverse wave becomes independent of that of the incident wave, that is, it becomes fully dictated by the scatterings at the end of the reverse path. The coherent backscattering part depends on the single-scattering albedo in a non-monotonous way, the most pronounced signatures showing up for absorbing scatterers. The numerical results compare favourably to the literature results for nonabsorbing spherical scatterers both in and beyond the Rayleigh regime.
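The step of drawing a polar scattering angle from the single-scattering phase function can be sketched for the simplest case. The example assumes unpolarized Rayleigh scattering, whose phase function is proportional to 1 + cos²θ, and uses rejection sampling; both the sampling method and the name `sample_rayleigh_mu` are illustrative assumptions rather than the author's procedure. The polarization-dependent azimuthal angle described in the abstract is not treated here.

```python
import random


def sample_rayleigh_mu(rng):
    """Draw mu = cos(theta) with density proportional to 1 + mu**2 on [-1, 1].

    Rejection sampling: propose mu uniformly and accept it with probability
    (1 + mu**2) / 2, where 2 is the maximum of the unnormalized density.
    """
    while True:
        mu = rng.uniform(-1.0, 1.0)
        if 2.0 * rng.random() <= 1.0 + mu * mu:
            return mu


def sample_many(n, seed=0):
    """Return n independent samples of mu from a seeded generator."""
    rng = random.Random(seed)
    return [sample_rayleigh_mu(rng) for _ in range(n)]
```

For the normalized density (3/8)(1 + mu²), the mean of mu is 0 and the mean of mu² is 2/5, which gives a quick statistical check on the sampler.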
Making it Easy to Construct Accurate Hydrological Models that Exploit High Performance Computers (Invited)

NASA Astrophysics Data System (ADS)

Kees, C. E.; Farthing, M. W.; Terrel, A.; Certik, O.; Seljebotn, D.

2013-12-01

This presentation will focus on two barriers to progress in the hydrological modeling community, and research and development conducted to lessen or eliminate them. The first is a barrier to sharing hydrological models among specialized scientists that is caused by intertwining the implementation of numerical methods with the implementation of abstract numerical modeling information. In the Proteus toolkit for computational methods and simulation, we have decoupled these two important parts of a computational model through separate "physics" and "numerics" interfaces. More recently we have begun developing the Strong Form Language for easy and direct representation of the mathematical model formulation in a domain specific language embedded in Python. The second major barrier is sharing ANY scientific software tools that have complex library or module dependencies, as most parallel, multi-physics hydrological models must have. In this setting, users and developers are dependent on an entire distribution, possibly depending on multiple compilers and special instructions depending on the environment of the target machine. To solve these problems we have developed hashdist, a stateless package management tool, and a resulting portable, open source scientific software distribution.
High output lamp with high brightness

DOEpatents

Kirkpatrick, Douglas A.; Bass, Gary K.; Copsey, Jesse F.; Garber, Jr., William E.; Kwong, Vincent H.; Levin, Izrail; MacLennan, Donald A.; Roy, Robert J.; Steiner, Paul E.; Tsai, Peter; Turner, Brian P.

2002-01-01

An ultra bright, low wattage inductively coupled electrodeless aperture lamp is powered by a solid state RF source in the range of several tens to several hundreds of watts at various frequencies in the range of 400 to 900 MHz. Numerous novel lamp circuits and components are disclosed including a wedding ring shaped coil having one axial and one radial lead, a high accuracy capacitor stack, a high thermal conductivity aperture cup and various other aperture bulb configurations, a coaxial capacitor arrangement, and an integrated coil and capacitor assembly. Numerous novel RF circuits are also disclosed including a high power oscillator circuit with reduced complexity resonant pole configuration, parallel RF power FET transistors with soft gate switching, a continuously variable frequency tuning circuit, a six port directional coupler, an impedance switching RF source, and an RF source with controlled frequency-load characteristics. Numerous novel RF control methods are disclosed including controlled adjustment of the operating frequency to find a resonant frequency and reduce reflected RF power, controlled switching of an impedance switched lamp system, active power control and active gate bias control.
Cutting-edge Kinetic Physics with Parker Solar Probe and Solar Orbiter: The Arbitrary Linear Plasma Solver (ALPS)

NASA Astrophysics Data System (ADS)

Verscharen, D.; Klein, K. G.; Chandran, B. D. G.; Stevens, M. L.; Salem, C. S.; Bale, S. D.

2017-12-01

The Arbitrary Linear Plasma Solver (ALPS) is a parallelized numerical code that solves the dispersion relation in a hot (even relativistic) magnetized plasma with an arbitrary number of particle species with arbitrary gyrotropic equilibrium distribution functions for any direction of wave propagation with respect to the background field. In this way, ALPS retains generality and overcomes the shortcomings of previous (bi-)Maxwellian solvers for the plasma dispersion relations. The unprecedented high-resolution particle and field data products from Parker Solar Probe (PSP) and Solar Orbiter (SO) will require novel theoretical tools. ALPS is one such tool, and its use will make possible new investigations into the role of non-Maxwellian distributions in the near-Sun solar wind. It can be applied to numerous high-velocity-resolution systems, ranging from current space missions to numerical simulations. We will briefly discuss the ALPS algorithm and demonstrate its functionality based on previous solar-wind measurements. We will then highlight our plans for future applications of ALPS to PSP and SO observations.
Parallel Plate System for Collecting Data Used to Determine Viscosity

NASA Technical Reports Server (NTRS)

Ethridge, Edwin C. (Inventor); Kaukler, William (Inventor)

2013-01-01

A parallel-plate system collects data used to determine viscosity. A first plate is coupled to a translator so that the first plate can be moved along a first direction. A second plate has a pendulum device coupled thereto such that the second plate is suspended above and parallel to the first plate. The pendulum device constrains movement of the second plate to a second direction that is aligned with the first direction and is substantially parallel thereto. A force measuring device is coupled to the second plate for measuring force along the second direction caused by movement of the second plate.
Summary of research in applied mathematics, numerical analysis, and computer sciences

NASA Technical Reports Server (NTRS)

1986-01-01

The major categories of current ICASE research programs addressed include: numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; control and parameter identification problems, with emphasis on effective numerical methods; computational problems in engineering and physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and computer systems and software, especially vector and parallel computers.
Fourth order discretization of anisotropic heat conduction operator

NASA Astrophysics Data System (ADS)

Krasheninnikova, Natalia; Chacon, Luis

2008-11-01

In magnetized plasmas, heat conduction plays an important role in such processes as energy confinement, turbulence, and a number of instabilities. As a consequence of the presence of a magnetic field, heat transport is strongly anisotropic, with energy flowing preferentially along the magnetic field direction. This in turn results in parallel and perpendicular heat conduction coefficients being separated by orders of magnitude. The computational difficulties in treating such heat conduction anisotropies are significant, as the perpendicular dynamics is numerically polluted by the parallel one. In this work, we report on progress in implementing a fourth-order, conservative finite volume discretization scheme for the anisotropic heat conduction operator in the extended MHD code PIXIE3D [1]. We will demonstrate its spatial discretization accuracy and its effectiveness with two physical applications of interest, both of which feature a strong sensitivity to the heat conduction anisotropy: the thermal instability and the neoclassical tearing mode. [1] L. Chacon, Phys. Plasmas 15, 056103 (2008)
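For reference, the anisotropic heat flux described in this abstract is commonly written as follows. The symbols are assumptions introduced here, not taken from the abstract: T is the temperature, B the magnetic field, and κ∥ and κ⊥ the parallel and perpendicular conductivities.

```latex
\mathbf{q} = -\kappa_{\parallel}\,\mathbf{b}\,(\mathbf{b}\cdot\nabla T)
             -\kappa_{\perp}\left[\nabla T - \mathbf{b}\,(\mathbf{b}\cdot\nabla T)\right],
\qquad
\mathbf{b} = \frac{\mathbf{B}}{|\mathbf{B}|},
\qquad
\frac{\kappa_{\parallel}}{\kappa_{\perp}} \gg 1 .
```

The heat conduction operator is the divergence of this flux. Because κ∥ exceeds κ⊥ by orders of magnitude, even a small discretization error in the parallel term can swamp the physical perpendicular transport, which is the numerical pollution the abstract refers to.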
Application of a Phase-resolving, Directional Nonlinear Spectral Wave Model

NASA Astrophysics Data System (ADS)

Davis, J. R.; Sheremet, A.; Tian, M.; Hanson, J. L.

2014-12-01

We describe several applications of a phase-resolving, directional nonlinear spectral wave model. The model describes a 2D surface gravity wave field approaching a mildly sloping beach with parallel depth contours at an arbitrary angle accounting for nonlinear, quadratic triad interactions. The model is hyperbolic, with the initial wave spectrum specified in deep water. Complex amplitudes are generated based on the random phase approximation. The numerical implementation includes unidirectional propagation as a special case. In directional mode, it solves the system of equations in the frequency-alongshore wave number space. Recent enhancements of the model include the incorporation of dissipation caused by breaking and propagation over a viscous mud layer and the calculation of wave induced setup. Applications presented include: a JONSWAP spectrum with a cos^(2s) directional distribution, for shore-perpendicular and oblique propagation; a study of the evolution of a single directional triad; and several preliminary comparisons to wave spectra collected at the USACE-FRF in Duck, NC, which show encouraging results, although further validation with a wider range of beach slopes and wave conditions is needed.
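The cos-2s directional distribution named in this abstract can be sketched as below. The example assumes the common half-angle form D(θ) = N cos^(2s)((θ − θ0)/2), normalized to unit integral over a full circle; the abstract does not give the exact form or the parameter values the authors used, and the function name `cos2s_spreading` is hypothetical.

```python
import math


def cos2s_spreading(theta, theta0, s):
    """Directional spreading D(theta) = N * cos((theta - theta0) / 2) ** (2 s).

    theta and theta0 are in radians; s > 0 controls the directional width
    (larger s gives a narrower spread about the mean direction theta0).
    """
    # Normalization so that D integrates to 1 over any interval of length 2*pi.
    norm = math.gamma(s + 1.0) / (2.0 * math.sqrt(math.pi) * math.gamma(s + 0.5))
    # abs() keeps the base non-negative, so non-integer exponents stay real
    # when theta - theta0 falls outside (-pi, pi).
    return norm * abs(math.cos(0.5 * (theta - theta0))) ** (2.0 * s)
```

A directional spectrum would then be formed as the product of a frequency spectrum (JONSWAP in the abstract) and this spreading function.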
The cost of parallel consolidation into visual working memory.

PubMed

Rideaux, Reuben; Edwards, Mark

2016-01-01

A growing body of evidence indicates that information can be consolidated into visual working memory in parallel. Initially, it was suggested that color information could be consolidated in parallel while orientation was strictly limited to serial consolidation (Liu & Becker, 2013). However, we recently found evidence suggesting that both orientation and motion direction items can be consolidated in parallel, with different levels of accuracy (Rideaux, Apthorp, & Edwards, 2015). Here we examine whether there is a cost associated with parallel consolidation of orientation and direction information by comparing performance, in terms of precision and guess rate, on a target recall task where items are presented either sequentially or simultaneously. The results compellingly indicate that motion direction can be consolidated in parallel, but the evidence for orientation is less conclusive. Further, we find that there is a twofold cost associated with parallel consolidation of direction: Both the probability of failing to consolidate one (or both) item/s increases and the precision at which representations are encoded is reduced. Additionally, we find evidence indicating that the increased consolidation failure may be due to interference between items presented simultaneously, and is moderated by item similarity. These findings suggest that a biased competition model may explain differences in parallel consolidation between features.
Development of programs for computing characteristics of ultraviolet radiation

NASA Technical Reports Server (NTRS)

Dave, J. V.

1972-01-01

Efficient programs were developed for computing all four characteristics of the radiation scattered by a plane-parallel, turbid, terrestrial atmospheric model. They were written in FORTRAN IV and tested on IBM/360 computers with a 2314 direct access storage facility. The storage requirement varies between 200K and 750K bytes depending upon the task. The scattering phase matrix (or function) is expanded in a Fourier series whose number of terms depends upon the zenith angles of the incident and scattered radiations, as well as on the nature of aerosols. A Gauss-Seidel procedure is used for obtaining the numerical solution of the transfer equation.
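The abstract names a Gauss-Seidel procedure but does not give the discretized transfer equation, so the following is only a minimal sketch of the iteration on a generic small linear system. The function name `gauss_seidel`, the test matrix, and the tolerance are illustrative assumptions, not taken from the report.

```python
def gauss_seidel(A, b, tol=1e-10, max_iter=10_000):
    """Solve A x = b by Gauss-Seidel sweeps.

    Assumes A is square with a nonzero diagonal; convergence is only
    guaranteed for special cases such as diagonally dominant matrices.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(n):
            # Off-diagonal sum uses the newest values of x already updated
            # in this sweep, which is what distinguishes it from Jacobi.
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - s) / A[i][i]
            largest_change = max(largest_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if largest_change < tol:
            break
    return x


# Hypothetical diagonally dominant system whose exact solution is [1, 1, 1].
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
solution = gauss_seidel(A, b)
```

Each unknown is updated in place, so no second copy of the solution vector is needed; that low storage cost is one common reason for choosing the method on memory-limited machines.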

Programming Probabilistic Structural Analysis for Parallel Processing Computer

NASA Technical Reports Server (NTRS)

Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Chamis, Christos C.; Murthy, Pappu L. N.

1991-01-01

The ultimate goal of this research program is to make Probabilistic Structural Analysis (PSA) computationally efficient and hence practical for the design environment by achieving large scale parallelism. The paper identifies the multiple levels of parallelism in PSA, identifies methodologies for exploiting this parallelism, describes the development of a parallel stochastic finite element code, and presents results of two example applications. It is demonstrated that speeds within five percent of those theoretically possible can be achieved. A special-purpose numerical technique, the stochastic preconditioned conjugate gradient method, is also presented and demonstrated to be extremely efficient for certain classes of PSA problems.
Spatiotemporal Responses of Groundwater Flow and Aquifer-River Exchanges to Flood Events

NASA Astrophysics Data System (ADS)

Liang, Xiuyu; Zhan, Hongbin; Schilling, Keith

2018-03-01

Rapidly rising river stages induced by flood events lead to considerable river water infiltration into aquifers and carry surface-borne solutes into hyporheic zones, which are widely recognized as important sites of biogeochemical activity. Existing studies of surface-groundwater exchanges induced by flood events are usually limited to a river-aquifer cross section that is perpendicular to river channels, and neglect groundwater flow parallel to river channels. In this study, surface-groundwater exchanges in response to a flood event are investigated with specific consideration of unconfined flow in the direction parallel to river channels. The groundwater flow is described by a two-dimensional Boussinesq equation and the flood event is described by a diffusive-type flood wave. Analytical solutions are derived and tested using the numerical solution. The results indicate that river water infiltrates into aquifers quickly during flood events, and mostly returns to the river within a short period of time after the flood event. However, the remaining river water will stay in aquifers for a long period of time. The residual river water not only flows back to rivers but also flows to downstream aquifers. A one-dimensional model that neglects flow in the direction parallel to river channels will overestimate heads and discharge in upstream aquifers. The return flow induced by the flood event has a power law form with time and has a significant impact on the base flow recession at early times. The solution can match the observed hydraulic heads in riparian zone wells of Iowa during flood events.
Cold Electrons as the Drivers of Parallel, Electrostatic Waves in Asymmetric Reconnection

NASA Astrophysics Data System (ADS)

Holmes, J.; Ergun, R.; Newman, D. L.; Wilder, F. D.; Schwartz, S. J.; Goodrich, K.; Eriksson, S.; Torbert, R. B.; Russell, C. T.; Lindqvist, P. A.; Giles, B. L.; Pollock, C. J.; Le Contel, O.; Strangeway, R. J.; Burch, J. L.

2016-12-01

The Magnetospheric MultiScale mission (MMS) has observed several instances of asymmetric reconnection at Earth's magnetopause, where plasma from the magnetosheath encounters that of the magnetosphere. On Earth's dayside, the magnetosphere is often made up of a two-component distribution of cold (<< 10 eV) and hot (~1 keV) plasma, sometimes including the cold ion plume. Magnetosheath plasma is primarily warm (~100 eV) post-shock solar wind. Where they meet, magnetopause reconnection alters the magnetic topology such that these two populations are left cohabiting a field line and rapidly mix. There have been several events observed by MMS where the Fast Plasma Instrument (FPI) clearly shows cold ions near the diffusion region impinging upon the warm magnetosheath population. In many of these, we also see patches of strong electrostatic waves parallel to the magnetic field - a smoking gun for rapid mixing via nonlinear processes. Cold ions alone are too slow to create the same waves; solving for roots of a simplified dispersion relation shows the electron population damps out the ion modes. From this, we infer the presence of cold electrons; in one notable case found by Wilder et al. 2016 (in review), they have been observed directly by FPI. Vlasov simulations of plasma mixing for a number of these events closely reproduce the observed electric field signatures. We conclude from numerical analysis and direct MMS observations that cold plasma mixing, including cold electrons, is the primary driver of parallel electrostatic waves observed near the electron diffusion region in asymmetric magnetic reconnection.
Nonlinear Evolution of Short-wavelength Torsional Alfvén Waves

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shestov, S. V.; Nakariakov, V. M.; Ulyanov, A. S.

2017-05-10

We analyze nonlinear evolution of torsional Alfvén waves in a straight magnetic flux tube filled in with a low-β plasma, and surrounded with a plasma of lower density. Such magnetic tubes model, in particular, a segment of a coronal loop or a polar plume. The wavelength is taken comparable to the tube radius. We perform a numerical simulation of the wave propagation using ideal magnetohydrodynamics. We find that a torsional wave nonlinearly induces three kinds of compressive flows: the parallel flow at the Alfvén speed, which constitutes a bulk plasma motion along the magnetic field, the tube wave, and also transverse flows in the radial direction, associated with sausage fast magnetoacoustic modes. In addition, the nonlinear torsional wave steepens and its propagation speed increases. The latter effect leads to the progressive distortion of the torsional wave front, i.e., nonlinear phase mixing. Because of the intrinsic non-uniformity of the torsional wave amplitude across the tube radius, the nonlinear effects are more pronounced in regions with higher wave amplitudes. They are always absent at the axes of the flux tube. In the case of a linear radial profile of the wave amplitude, the nonlinear effects are localized in an annulus region near the tube boundary. Thus, the parallel compressive flows driven by torsional Alfvén waves in the solar and stellar coronae, are essentially non-uniform in the perpendicular direction. The presence of additional sinks for the wave energy reduces the efficiency of the nonlinear parallel cascade in torsional Alfvén waves.
Nonlinear Evolution of Short-wavelength Torsional Alfvén Waves

NASA Astrophysics Data System (ADS)

Shestov, S. V.; Nakariakov, V. M.; Ulyanov, A. S.; Reva, A. A.; Kuzin, S. V.

2017-05-01

We analyze nonlinear evolution of torsional Alfvén waves in a straight magnetic flux tube filled in with a low-β plasma, and surrounded with a plasma of lower density. Such magnetic tubes model, in particular, a segment of a coronal loop or a polar plume. The wavelength is taken comparable to the tube radius. We perform a numerical simulation of the wave propagation using ideal magnetohydrodynamics. We find that a torsional wave nonlinearly induces three kinds of compressive flows: the parallel flow at the Alfvén speed, which constitutes a bulk plasma motion along the magnetic field, the tube wave, and also transverse flows in the radial direction, associated with sausage fast magnetoacoustic modes. In addition, the nonlinear torsional wave steepens and its propagation speed increases. The latter effect leads to the progressive distortion of the torsional wave front, i.e., nonlinear phase mixing. Because of the intrinsic non-uniformity of the torsional wave amplitude across the tube radius, the nonlinear effects are more pronounced in regions with higher wave amplitudes. They are always absent at the axes of the flux tube. In the case of a linear radial profile of the wave amplitude, the nonlinear effects are localized in an annulus region near the tube boundary. Thus, the parallel compressive flows driven by torsional Alfvén waves in the solar and stellar coronae, are essentially non-uniform in the perpendicular direction. The presence of additional sinks for the wave energy reduces the efficiency of the nonlinear parallel cascade in torsional Alfvén waves.
Large eddy simulations and direct numerical simulations of high speed turbulent reacting flows

NASA Technical Reports Server (NTRS)

Givi, P.; Frankel, S. H.; Adumitroaie, V.; Sabini, G.; Madnia, C. K.

1993-01-01

The primary objective of this research is to extend current capabilities of Large Eddy Simulations (LES) and Direct Numerical Simulations (DNS) for the computational analyses of high speed reacting flows. Our efforts in the first two years of this research have been concentrated on a priori investigations of single-point Probability Density Function (PDF) methods for providing subgrid closures in reacting turbulent flows. In the efforts initiated in the third year, our primary focus has been on performing actual LES by means of PDF methods. The approach is based on assumed PDF methods and we have performed extensive analysis of turbulent reacting flows by means of LES. This includes simulations of both three-dimensional (3D) isotropic compressible flows and two-dimensional reacting planar mixing layers. In addition to these LES analyses, some work is in progress to assess the extent of validity of our assumed PDF methods. This assessment is done by making detailed comparisons with recent laboratory data in predicting the rate of reactant conversion in parallel reacting shear flows. This report provides a summary of our achievements for the first six months of the third year of this program.
Inertial instabilities in a mixing-separating microfluidic device

NASA Astrophysics Data System (ADS)

Domingues, Allysson; Poole, Robert; Dennis, David

2017-11-01

Combining and separating fluids has many industrial and biomedical applications. This numerical and experimental study explores inertial instabilities in a so-called mixing-separating cell micro-geometry which could potentially be used to enhance mixing. Our microfluidic mixing-separating cell consists of two straight square parallel channels with flow from opposite directions and a central gap that allows the streams to interact, mix or remain separate (often referred to as the 'H' geometry). A stagnation point is generated at the centre of symmetry due to the two opposed inlets and outlets. Under creeping flow conditions (Reynolds number Re → 0) the flow is steady and two-dimensional, and produces a sharp symmetric boundary between the fluid streams entering the geometry from opposite directions. For Re > 30, an inertial instability appears which leads to the generation of a central vortex and the breaking of symmetry, although the flow remains steady. As Re increases the central vortex divides into two vortices. Our experimental and numerical investigations both show the same phenomena. The results suggest that the effect observed can be exploited to enhance mixing in biomedical or other applications. Work supported by CNPq Grant 203195/2014-0.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, C.; Yu, G.; Wang, K.

The physical designs of new-concept reactors, which have complex structures, various materials and neutron energy spectra, have greatly raised the requirements on calculation methods and the corresponding computing hardware. Along with widely used parallel algorithms, heterogeneous platform architectures have been introduced into numerical computations in reactor physics. Because of its naturally parallel characteristics, the CPU-FPGA architecture is often used to accelerate numerical computation. This paper studies the application and features of this kind of heterogeneous platform in numerical calculations of reactor physics through practical examples. The designed neutron diffusion module based on the CPU-FPGA architecture achieves a speedup factor of 11.2, which shows that it is feasible to apply this kind of heterogeneous platform to reactor physics. (authors)
Numerical computation of solar neutrino flux attenuated by the MSW mechanism

NASA Astrophysics Data System (ADS)

Kim, Jai Sam; Chae, Yoon Sang; Kim, Jung Dae

1999-07-01

We compute the survival probability of an electron neutrino in its flight through the solar core experiencing the Mikheyev-Smirnov-Wolfenstein effect with all three neutrino species considered. We adopted a hybrid method that uses an accurate approximation formula in the non-resonance region and numerical integration in the non-adiabatic resonance region. The key of our algorithm is to use the importance sampling method for sampling the neutrino creation energy and position and to find the optimum radii to start and stop numerical integration. We further developed a parallel algorithm for a message passing parallel computer. By using the idea of a job token, we have developed a dynamic load balancing mechanism which is effective under any irregular load distribution.
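The abstract does not describe how the job-token mechanism is implemented, so the following is only a sketch of the general idea: each token stands for one unit of work, and a worker that becomes idle takes the next token, so uneven job costs balance out on their own. Python threads sharing a queue stand in here for processes on a message-passing machine; `run_with_job_tokens` and `irregular_work` are hypothetical names.

```python
import queue
import threading


def run_with_job_tokens(jobs, worker_count, work):
    """Process jobs dynamically: idle workers pull the next job token."""
    tokens = queue.Queue()
    for index in range(len(jobs)):
        tokens.put(index)          # one token per job
    results = [None] * len(jobs)

    def worker():
        while True:
            try:
                index = tokens.get_nowait()
            except queue.Empty:
                return             # no tokens left, this worker is done
            # Each index is handed out exactly once, so no lock is needed
            # around this write.
            results[index] = work(jobs[index])

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def irregular_work(n):
    """Stand-in workload whose cost grows with n (an assumption)."""
    return sum(k * k for k in range(n))


job_sizes = [10, 50_000, 3, 20_000, 7, 100]
totals = run_with_job_tokens(job_sizes, worker_count=3, work=irregular_work)
```

Because no job is assigned to a worker in advance, a worker stuck on an expensive job does not hold up the cheap ones, which is the property a static partition lacks under irregular loads.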
A multithreaded and GPU-optimized compact finite difference algorithm for turbulent mixing at high Schmidt number using petascale computing

NASA Astrophysics Data System (ADS)

Clay, M. P.; Yeung, P. K.; Buaria, D.; Gotoh, T.

2017-11-01

Turbulent mixing at high Schmidt number is a multiscale problem which places demanding requirements on direct numerical simulations to resolve fluctuations down to the Batchelor scale. We use a dual-grid, dual-scheme and dual-communicator approach where velocity and scalar fields are computed by separate groups of parallel processes, the latter using a combined compact finite difference (CCD) scheme on a finer grid with a static 3-D domain decomposition free of the communication overhead of memory transposes. A high degree of scalability is achieved for an 8192^3 scalar field at Schmidt number 512 in turbulence with a modest inertial range, by overlapping communication with computation whenever possible. On the Cray XE6 partition of Blue Waters, use of a dedicated thread for communication combined with OpenMP locks and nested parallelism reduces CCD timings by 34% compared to an MPI baseline. The code has been further optimized for the 27-petaflops Cray XK7 machine Titan using GPUs as accelerators with the latest OpenMP 4.5 directives, giving 2.7X speedup compared to CPU-only execution at the largest problem size. Supported by NSF Grant ACI-1036170, the NCSA Blue Waters Project with subaward via UIUC, and a DOE INCITE allocation at ORNL.
Electro-osmotic flow in coated nanocapillaries: a theoretical investigation.

PubMed

Marini Bettolo Marconi, Umberto; Monteferrante, Michele; Melchionna, Simone

2014-12-14

Motivated by recent experiments, we present a theoretical investigation of how the electro-osmotic flow occurring in a capillary is modified when its charged surfaces are coated with charged polymers. The theoretical treatment is based on a three-dimensional model consisting of a ternary fluid-mixture, representing the solvent and two species for the ions, confined between two parallel charged plates decorated with a fixed array of scatterers representing the polymer coating. The electro-osmotic flow, generated by a constant electric field applied in a direction parallel to the plates, is studied numerically by means of Lattice Boltzmann simulations. In order to gain further understanding we performed a simple theoretical analysis by extending the Stokes-Smoluchowski equation to take into account the porosity induced by the polymers in the region adjacent to the walls. We discuss the nature of the velocity profiles by focusing on the competing effects of the polymer charges and the frictional forces they exert. We show evidence of the flow reduction and of the flow inversion phenomenon when the polymer charge is opposite to the surface charge. By using the density of polymers and the surface charge as control variables, we propose a phase diagram that discriminates the direct and the reversed flow regimes and determines their dependence on the ionic concentration.
The OpenMP Implementation of NAS Parallel Benchmarks and its Performance

NASA Technical Reports Server (NTRS)

Jin, Hao-Qiang; Frumkin, Michael; Yan, Jerry

1999-01-01

As the new ccNUMA architecture became popular in recent years, parallel programming with compiler directives on these machines has evolved to accommodate new needs. In this study, we examine the effectiveness of OpenMP directives for parallelizing the NAS Parallel Benchmarks. Implementation details will be discussed and performance will be compared with the MPI implementation. We have demonstrated that OpenMP can achieve very good results for parallelization on a shared memory system, but effective use of memory and cache is very important.
Asymmetry in the Farley-Buneman dispersion relation caused by parallel electric fields

NASA Astrophysics Data System (ADS)

Forsythe, Victoriya V.; Makarevich, Roman A.

2016-11-01

An implicit assumption utilized in studies of E region plasma waves generated by the Farley-Buneman instability (FBI) is that the FBI dispersion relation and its solutions for the growth rate and phase velocity are perfectly symmetric with respect to the reversal of the wave propagation component parallel to the magnetic field. In the present study, a recently derived general dispersion relation that describes fundamental plasma instabilities in the lower ionosphere including FBI is considered and it is demonstrated that the dispersion relation is symmetric only for background electric fields that are perfectly perpendicular to the magnetic field. It is shown that parallel electric fields result in significant differences between the growth rates and phase velocities for propagation of parallel components of opposite signs. These differences are evaluated using numerical solutions of the general dispersion relation and shown to exhibit an approximately linear relationship with the parallel electric field near the E region peak altitude of 110 km. An analytic expression for the differences is also derived from an approximate version of the dispersion relation, with comparisons between numerical and analytic results agreeing near 110 km. It is further demonstrated that parallel electric fields do not change the overall symmetry when the full 3-D wave propagation vector is reversed, with no symmetry seen when either the perpendicular or parallel component is reversed. The present results indicate that moderate-to-strong parallel electric fields of 0.1-1.0 mV/m can result in experimentally measurable differences between the characteristics of plasma waves with parallel propagation components of opposite polarity.
Evaluation of a new parallel numerical parameter optimization algorithm for a dynamical system

NASA Astrophysics Data System (ADS)

Duran, Ahmet; Tuncel, Mehmet

2016-10-01

It is important to have a scalable parallel numerical parameter optimization algorithm for a dynamical system used in financial applications where time limitation is crucial. We use Message Passing Interface parallel programming and present such a new parallel algorithm for parameter estimation. For example, we apply the algorithm to the asset flow differential equations that have been developed and analyzed since 1989 (see [3-6] and references contained therein). We achieved speed-up for some time series on runs of up to 512 cores (see [10]). Unlike [10], in this work we consider more extensive financial market situations, for example, the presence of low volatility, high volatility, and a stock market price at a discount/premium of varying magnitude to its net asset value. Moreover, we evaluated the convergence of the model parameter vector, the nonlinear least squares error and the maximum improvement factor to quantify the success of the optimization process depending on the number of initial parameter vectors.
Parallel Fortran-MPI software for numerical inversion of the Laplace transform and its application to oscillatory water levels in groundwater environments

USGS Publications Warehouse

Zhan, X.

2005-01-01

A parallel Fortran-MPI (Message Passing Interface) software package for numerical inversion of the Laplace transform based on a Fourier series method is developed to meet the need of solving computationally intensive problems involving the response of oscillatory water levels to hydraulic tests in a groundwater environment. The software is a parallel version of ACM (The Association for Computing Machinery) Transactions on Mathematical Software (TOMS) Algorithm 796. Running 38 test examples indicated that implementation of MPI techniques with distributed memory architecture speeds up the processing and improves the efficiency. Applications to oscillatory water levels in a well during aquifer tests are presented to illustrate how this package can be applied to solve complicated environmental problems involving differential and integral equations. The package is free and is easy to use for people with little or no previous experience in using MPI but who wish to get off to a quick start in parallel computing. © 2004 Elsevier Ltd. All rights reserved.
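To illustrate what a Fourier series inversion looks like, the sketch below applies the trapezoidal rule to the Bromwich integral, giving f(t) ≈ (e^{at}/T) [F(a)/2 + Σ_k Re(F(a + ikπ/T) e^{ikπt/T})] for 0 < t < 2T. This is the basic, unaccelerated form and is not TOMS Algorithm 796 itself, which adds convergence acceleration and error control. The function name, the parameters `T`, `a`, and `terms`, and the test transform are illustrative assumptions.

```python
import cmath
import math


def invert_laplace_fourier(F, t, T, a, terms=20_000):
    """Approximate f(t), 0 < t < 2T, from its Laplace transform F(s).

    `a` must lie to the right of all singularities of F; a larger a*T
    reduces the discretization error but amplifies the truncation error.
    """
    omega = math.pi / T                      # frequency step of the series
    total = 0.5 * F(complex(a, 0.0)).real    # k = 0 term, weighted by 1/2
    for k in range(1, terms + 1):
        s = complex(a, k * omega)
        total += (F(s) * cmath.exp(1j * k * omega * t)).real
    return math.exp(a * t) / T * total


def F_example(s):
    """Transform of f(t) = exp(-t), chosen because the inverse is known."""
    return 1.0 / (s + 1.0)


approx = invert_laplace_fourier(F_example, t=1.0, T=2.0, a=4.5)
exact = math.exp(-1.0)
```

The terms of the sum are independent of one another, so they can be divided among processes and combined with a single reduction, which is one reason this family of methods suits a distributed-memory implementation. The plain series converges slowly (the error here falls off roughly as 1/terms), which is why production codes use acceleration.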
Parallel implementation of geometrical shock dynamics for two dimensional converging shock waves

NASA Astrophysics Data System (ADS)

Qiu, Shi; Liu, Kuang; Eliasson, Veronica

2016-10-01

Geometrical shock dynamics (GSD) theory is an appealing method to predict the shock motion in the sense that it is more computationally efficient than solving the traditional Euler equations, especially for converging shock waves. However, to solve and optimize large scale configurations, the main bottleneck is the computational cost. Among the existing numerical GSD schemes, there is only one that has been implemented on parallel computers, for the purpose of analyzing detonation waves. To extend the computational advantage of the GSD theory to more general applications such as converging shock waves, a numerical implementation using a spatial decomposition method has been coupled with a front tracking approach on parallel computers. In addition, an efficient tridiagonal system solver for massively parallel computers has been applied to resolve the most expensive function in this implementation, resulting in an efficiency of 0.93 while using 32 HPCC cores. Moreover, symmetric boundary conditions have been developed to further reduce the computational cost, achieving a speedup of 19.26 for a 12-sided polygonal converging shock.
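For orientation, the sketch below shows the serial Thomas algorithm, the usual baseline that parallel tridiagonal solvers are measured against; the abstract does not say which parallel solver was used, so this is not the authors' method. It also works out the speedup implied by the quoted efficiency, assuming the standard definition efficiency = speedup / cores. The function name and test system are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Serial Thomas algorithm; lower[0] and upper[-1] are ignored.

    Assumes no pivoting is needed (e.g. a diagonally dominant system).
    """
    n = len(diag)
    c = [0.0] * n   # modified super-diagonal
    d = [0.0] * n   # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: each row depends on the previous one, which is
    # the sequential dependency that parallel solvers have to break.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Hypothetical 3x3 system whose exact solution is [1, 1, 1].
x = solve_tridiagonal([0.0, 1.0, 1.0], [4.0, 4.0, 4.0],
                      [1.0, 1.0, 0.0], [5.0, 6.0, 5.0])

# Speedup implied by the reported efficiency on 32 cores,
# under the assumed definition efficiency = speedup / cores.
cores = 32
efficiency = 0.93
implied_speedup = efficiency * cores
```

Under that definition an efficiency of 0.93 on 32 cores corresponds to a speedup of about 29.8. The separately quoted speedup of 19.26 refers to the symmetric boundary conditions, not to the core count.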
Acoustic simulation in architecture with parallel algorithm

NASA Astrophysics Data System (ADS)

Li, Xiaohong; Zhang, Xinrong; Li, Dan

2004-03-01

To address the complexity of architectural environments and the need for real-time simulation of architectural acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in a scene is solved with this method. The impulse responses between sources and receivers in each frequency band, which are calculated by multiple processes, are then combined into the whole frequency response. The numerical experiment shows that the parallel algorithm can improve the efficiency of acoustic simulation for complex scenes.
Effects of a parallel electric field and the geomagnetic field in the topside ionosphere on auroral and photoelectron energy distributions

NASA Technical Reports Server (NTRS)

Min, Q.-L.; Lummerzheim, D.; Rees, M. H.; Stamnes, K.

1993-01-01

The consequences of electric field acceleration and an inhomogeneous magnetic field on auroral electron energy distributions in the topside ionosphere are investigated. The one-dimensional, steady state electron transport equation includes elastic and inelastic collisions, an inhomogeneous magnetic field, and a field-aligned electric field. The case of a self-consistent polarization electric field is considered first. The self-consistent field is derived by solving the continuity equation for all ions of importance, including diffusion of O(+) and H(+), and the electron and ion energy equations to derive the electron and ion temperatures. The system of coupled electron transport, continuity, and energy equations is solved numerically. Recognizing observations of parallel electric fields of larger magnitude than the baseline case of the polarization field, the effect of two model fields on the electron distribution function is investigated. In one case the field is increased from the polarization field magnitude at 300 km to a maximum at the upper boundary of 800 km, and in another case a uniform field is added to the polarization field. Substantial perturbations of the low energy portion of the electron flux are produced: an upward directed electric field accelerates the downward directed flux of low-energy secondary electrons and decelerates the upward directed component. Above about 400 km the inhomogeneous magnetic field produces anisotropies in the angular distribution of the electron flux. The effects of the perturbed energy distributions on auroral spectral emission features are noted.
Effects of a Parallel Electric Field and the Geomagnetic Field in the Topside Ionosphere on Auroral and Photoelectron Energy Distributions

NASA Technical Reports Server (NTRS)

Min, Q.-L.; Lummerzheim, D.; Rees, M. H.; Stamnes, K.

1993-01-01

The consequences of electric field acceleration and an inhomogeneous magnetic field on auroral electron energy distributions in the topside ionosphere are investigated. The one-dimensional, steady state electron transport equation includes elastic and inelastic collisions, an inhomogeneous magnetic field, and a field-aligned electric field. The case of a self-consistent polarization electric field is considered first. The self-consistent field is derived by solving the continuity equation for all ions of importance, including diffusion of O(+) and H(+), and the electron and ion energy equations to derive the electron and ion temperatures. The system of coupled electron transport, continuity, and energy equations is solved numerically. Recognizing observations of parallel electric fields of larger magnitude than the baseline case of the polarization field, the effect of two model fields on the electron distribution function is investigated. In one case the field is increased from the polarization field magnitude at 300 km to a maximum at the upper boundary of 800 km, and in another case a uniform field is added to the polarization field. Substantial perturbations of the low energy portion of the electron flux are produced: an upward directed electric field accelerates the downward directed flux of low-energy secondary electrons and decelerates the upward directed component. Above about 400 km the inhomogeneous magnetic field produces anisotropies in the angular distribution of the electron flux. The effects of the perturbed energy distributions on auroral spectral emission features are noted.
The NAS parallel benchmarks

NASA Technical Reports Server (NTRS)

Bailey, D. H.; Barszcz, E.; Barton, J. T.; Carter, R. L.; Lasinski, T. A.; Browning, D. S.; Dagum, L.; Fatoohi, R. A.; Frederickson, P. O.; Schreiber, R. S.

1991-01-01

A new set of benchmarks has been developed for the performance evaluation of highly parallel supercomputers in the framework of the NASA Ames Numerical Aerodynamic Simulation (NAS) Program. These consist of five 'parallel kernel' benchmarks and three 'simulated application' benchmarks. Together they mimic the computation and data movement characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their 'pencil and paper' specification: all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional benchmarking approaches on highly parallel systems are avoided.

Why not make a PC cluster of your own? 5. AppleSeed: A Parallel Macintosh Cluster for Scientific Computing

NASA Astrophysics Data System (ADS)

Decyk, Viktor K.; Dauger, Dean E.

We have constructed a parallel cluster consisting of Apple Macintosh G4 computers running both Classic Mac OS as well as the Unix-based Mac OS X, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. Unlike other Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
Plasma Science and Innovation Center at Washington, Wisconsin, and Utah State: Final Scientific Report for the University of Wisconsin-Madison

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sovinec, Carl R.

The University of Wisconsin-Madison component of the Plasma Science and Innovation Center (PSI Center) contributed to modeling capabilities and algorithmic efficiency of the Non-Ideal Magnetohydrodynamics with Rotation (NIMROD) Code, which is widely used to model macroscopic dynamics of magnetically confined plasma. It also contributed to the understanding of direct-current (DC) injection of electrical current for initiating and sustaining plasma in three spherical torus experiments: the Helicity Injected Torus-II (HIT-II), the Pegasus Toroidal Experiment, and the National Spherical Torus Experiment (NSTX). The effort was funded through the PSI Center's cooperative agreement with the University of Washington and Utah State University over the period of March 1, 2005 - August 31, 2016. In addition to the computational and physics accomplishments, the Wisconsin effort contributed to the professional education of four graduate students and two postdoctoral research associates. The modeling for HIT-II and Pegasus was directly supported by the cooperative agreement, and contributions to the NSTX modeling were in support of work by Dr. Bickford Hooper, who was funded through a separate grant. Our primary contribution to model development is the implementation of detailed closure relations for collisional plasma. Postdoctoral associate Adam Bayliss implemented the temperature-dependent effects of Braginskii's parallel collisional ion viscosity. As a graduate student, John O'Bryan added runtime options for Braginskii's models and Ji's K2 models of thermal conduction with magnetization effects and thermal equilibration. As a postdoctoral associate, O'Bryan added the magnetization effects for ion viscosity. Another area of model development completed through the PSI-Center is the implementation of Chodura's phenomenological resistivity model.
Finally, we investigated and tested linear electron parallel viscosity, leveraged by support from the Center for Extended Magnetohydrodynamic Modeling (CEMM). Work on algorithmic efficiency improved NIMROD's element-based computations. We reordered arrays and eliminated a level of looping for computations over the data points that are used for numerical integration over elements. Moreover, the reordering allows fewer and larger communication calls when using distributed-memory parallel computation, thereby avoiding a data starvation problem that limited parallel scaling over NIMROD's Fourier components for the periodic coordinate. Together with improved parallel preconditioning, work that was supported by CEMM, these developments allowed NIMROD's first scaling to over 10,000 processor cores. Another algorithm improvement supported by the PSI Center is nonlinear numerical diffusivities for implicit advection. We also developed the Stitch code to enhance the flexibility of NIMROD's preprocessing. Our simulations of HIT-II considered conditions with and without fluctuation-induced amplification of poloidal flux, but our validation efforts focused on conditions without amplification. A significant finding is that NIMROD reproduces the dependence of net plasma current as the imposed poloidal flux is varied. The modeling of Pegasus startup from localized DC injectors predicted that development of a tokamak-like configuration occurs through a sequence of current-filament merger events. Comparison of experimentally measured and numerically computed cross-power spectra enhances confidence in NIMROD's simulation of magnetic fluctuations; however, energy confinement remains an open area for further research. Our contributions to the NSTX study include adaptation of the helicity-injection boundary conditions from the HIT-II simulations and support for linear analysis and computation of 3D current-driven instabilities.
Percolation and permeability of fracture networks in Excavated Damaged Zones

NASA Astrophysics Data System (ADS)

Mourzenko, V.; Thovert, J.; Adler, P. M.

2012-12-01

Generally, the excavation process of a gallery generates fractures in its immediate vicinity. The corresponding zone, which is called the Excavated Damaged Zone (EDZ), has a larger permeability than the intact surrounding medium. The properties of the EDZ are attracting more and more attention because of their potential importance in repositories of nuclear wastes. The EDZ which is induced by the excavation process may create along the galleries of the repositories a high permeability zone which could directly connect the storage area with the ground surface. Therefore, the studies of its properties are of crucial importance for applications such as the storage of nuclear wastes. Field observations (such as the ones which have been systematically performed at Mont Terri by [1, 2]) suggest that the fracture density is an exponentially decreasing function of the distance to the wall with a characteristic length of about 0.5 m and that the fracture orientation is anisotropic (most fractures are subparallel to the tunnel walls) and well approximated by a Fisher law whose pole is orthogonal to the wall. Numerical samples are generated according to these prescriptions. Their percolation status and hydraulic transmissivity can be calculated by the numerical codes which are detailed in [3]. Percolation is determined by a pseudo diffusion algorithm. Flow determination necessitates the meshing of the fracture networks and the discretisation of the Darcy equation by a finite volume technique; the resulting linear system is solved by a conjugate gradient algorithm. Only the flow properties of the EDZ along the directions which are parallel to the wall are of interest when a pressure gradient parallel to the wall is applied. The transmissivity T, which relates the total flow rate per unit width Q along the wall through the whole EDZ to the pressure gradient grad p, is defined by Q = - T grad p/mu where mu is the fluid viscosity. 
The percolation status and hydraulic transmissivity are systematically determined for a wide range of decay lengths and anisotropy parameters. They can be modeled by comparison with anisotropic fracture networks with a constant density. A heuristic power-law model is proposed which accurately describes the results for the percolation threshold over the whole investigated range of heterogeneity and anisotropy. Then, the data for the EDZ transmissivity are presented. A simple parallel flow model is introduced. The flow properties of the EDZ vary with the distance z from the wall. However, the macroscopic pressure gradient does not depend on z, and the flow lines are on average parallel to the wall. Hence, the overall transmissivity is tentatively estimated by a parallel flow model, where a layer at depth z behaves as a fractured medium with uniform properties corresponding to the state at this position in the EDZ. It yields an explicit analytical expression for the transmissivity as a function of the heterogeneity and anisotropy parameters, and it successfully accounts for all the numerical data. Graphical tools are provided from which first estimates can be quickly and easily obtained. [1] Bossart P. et al., Eng. Geol., vol. 66, 19-38 (2002). [2] Thovert J.-F. et al., Eng. Geol., 117, 39-51 (2011). [3] Adler P.M. et al., Fractured porous media, Oxford U. Press, in press.
Dynamic recrystallization during deformation of polycrystalline ice: insights from numerical simulations

PubMed Central

Griera, Albert; Steinbach, Florian; Bons, Paul D.; Jansen, Daniela; Roessiger, Jens; Lebensohn, Ricardo A.

2017-01-01

The flow of glaciers and polar ice sheets is controlled by the highly anisotropic rheology of ice crystals that have hexagonal symmetry (ice Ih). To improve our knowledge of ice sheet dynamics, it is necessary to understand how dynamic recrystallization (DRX) controls ice microstructures and rheology at different boundary conditions that range from pure shear flattening at the top to simple shear near the base of the sheets. We present a series of two-dimensional numerical simulations that couple ice deformation with DRX of various intensities, paying special attention to the effect of boundary conditions. The simulations show how similar orientations of c-axis maxima with respect to the finite deformation direction develop regardless of the amount of DRX and applied boundary conditions. In pure shear this direction is parallel to the maximum compressional stress, while it rotates towards the shear direction in simple shear. This leads to strain hardening and increased activity of non-basal slip systems in pure shear and to strain softening in simple shear. Therefore, it is expected that ice is effectively weaker in the lower parts of the ice sheets than in the upper parts. Strain-rate localization occurs in all simulations, especially in simple shear cases. Recrystallization suppresses localization, which necessitates the activation of hard, non-basal slip systems. This article is part of the themed issue ‘Microdynamics of ice’. PMID:28025295
Dynamic recrystallization during deformation of polycrystalline ice: insights from numerical simulations.

PubMed

Llorens, Maria-Gema; Griera, Albert; Steinbach, Florian; Bons, Paul D; Gomez-Rivas, Enrique; Jansen, Daniela; Roessiger, Jens; Lebensohn, Ricardo A; Weikusat, Ilka

2017-02-13

The flow of glaciers and polar ice sheets is controlled by the highly anisotropic rheology of ice crystals that have hexagonal symmetry (ice Ih). To improve our knowledge of ice sheet dynamics, it is necessary to understand how dynamic recrystallization (DRX) controls ice microstructures and rheology at different boundary conditions that range from pure shear flattening at the top to simple shear near the base of the sheets. We present a series of two-dimensional numerical simulations that couple ice deformation with DRX of various intensities, paying special attention to the effect of boundary conditions. The simulations show how similar orientations of c-axis maxima with respect to the finite deformation direction develop regardless of the amount of DRX and applied boundary conditions. In pure shear this direction is parallel to the maximum compressional stress, while it rotates towards the shear direction in simple shear. This leads to strain hardening and increased activity of non-basal slip systems in pure shear and to strain softening in simple shear. Therefore, it is expected that ice is effectively weaker in the lower parts of the ice sheets than in the upper parts. Strain-rate localization occurs in all simulations, especially in simple shear cases. Recrystallization suppresses localization, which necessitates the activation of hard, non-basal slip systems. This article is part of the themed issue 'Microdynamics of ice'. © 2016 The Author(s).
DOE Office of Scientific and Technical Information (OSTI.GOV)

Bowen, Benjamin; Ruebel, Oliver; Fischer, Curt Fischer R.

BASTet is an advanced software library written in Python. BASTet serves as the analysis and storage library for the OpenMSI project. BASTet is an integrated framework for: i) storage of spectral imaging data, ii) storage of derived analysis data, iii) provenance of analyses, iv) integration and execution of analyses via complex workflows. BASTet implements the API for the HDF5 storage format used by OpenMSI. Analyses that are developed using BASTet benefit from direct integration with the storage format, automatic tracking of provenance, and direct integration with command-line and workflow execution tools. BASTet also defines interfaces to enable developers to directly integrate their analysis with OpenMSI's web-based viewing infrastructure without having to know OpenMSI. BASTet also provides numerous helper classes and tools to assist with the conversion of data files, ease parallel implementation of analysis algorithms, ease interaction with web-based functions, and describe methods for data reduction. BASTet also includes detailed developer documentation, user tutorials, iPython notebooks, and other supporting documents.
On the inversion of geodetic integrals defined over the sphere using 1-D FFT

NASA Astrophysics Data System (ADS)

García, R. V.; Alejo, C. A.

2005-08-01

An iterative method is presented which performs inversion of integrals defined over the sphere. The method is based on one-dimensional fast Fourier transform (1-D FFT) inversion and is implemented with the projected Landweber technique, which is used to solve constrained least-squares problems reducing the associated 1-D cyclic-convolution error. The results obtained are as precise as the direct matrix inversion approach, but with better computational efficiency. A case study uses the inversion of Hotine’s integral to obtain gravity disturbances from geoid undulations. Numerical convergence is also analyzed and comparisons with respect to the direct matrix inversion method using conjugate gradient (CG) iteration are presented. Like the CG method, the number of iterations needed to get the optimum (i.e., small) error decreases as the measurement noise increases. Nevertheless, for discrete data given over a whole parallel band, the method can be applied directly without implementing the projected Landweber method, since no cyclic convolution error exists.
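The projected Landweber iteration named in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the kernel, the test signal, the step size, and the non-negativity constraint are all hypothetical choices, and the cyclic convolution is written as a direct sum for readability where a production code would use a 1-D FFT.

```python
def cyclic_convolve(kernel, x):
    """Cyclic convolution y[i] = sum_j kernel[(i - j) mod n] * x[j]."""
    n = len(x)
    return [sum(kernel[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


def cyclic_correlate(kernel, r):
    """Transpose of cyclic_convolve: z[j] = sum_i kernel[(i - j) mod n] * r[i]."""
    n = len(r)
    return [sum(kernel[(i - j) % n] * r[i] for i in range(n)) for j in range(n)]


def clip_nonnegative(x):
    """Projection onto the (assumed) constraint set x >= 0."""
    return [max(0.0, xi) for xi in x]


def projected_landweber(kernel, b, step, iterations, project):
    """Iterate x <- P(x + step * A^T (b - A x)), with A a cyclic convolution."""
    x = [0.0] * len(b)
    for _ in range(iterations):
        residual = [bi - yi for bi, yi in zip(b, cyclic_convolve(kernel, x))]
        gradient = cyclic_correlate(kernel, residual)
        x = project([xi + step * gi for xi, gi in zip(x, gradient)])
    return x


# Hypothetical test problem: a symmetric smoothing kernel and a non-negative signal.
kernel = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]
x_true = [0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 1.0, 0.0]
b = cyclic_convolve(kernel, x_true)

# The step must stay below 2 / (largest eigenvalue of A^T A); here that eigenvalue is 1.
x_est = projected_landweber(kernel, b, step=1.0, iterations=500,
                            project=clip_nonnegative)
max_error = max(abs(a - c) for a, c in zip(x_est, x_true))
```

With noise-free data the iterate approaches the true signal; with noisy data the iteration count acts as a regularization parameter and would normally be stopped early.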
Measurements of the momentum and current transport from tearing instability in the Madison Symmetric Torus reversed-field pinch

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuritsyn, A.; Fiksel, G.; Almagri, A. F.

2009-05-15

In this paper measurements of momentum and current transport caused by current-driven tearing instability are reported. The measurements are done in the Madison Symmetric Torus reversed-field pinch [R. N. Dexter, D. W. Kerst, T. W. Lovell, S. C. Prager, and J. C. Sprott, Fusion Technol. 19, 131 (1991)] in a regime with repetitive bursts of tearing instability causing magnetic field reconnection. It is established that the plasma parallel momentum profile flattens during these reconnection events: The flow decreases in the core and increases at the edge. The momentum relaxation phenomenon is similar in nature to the well established relaxation of the parallel electrical current and could be a general feature of self-organized systems. The measured fluctuation-induced Maxwell and Reynolds stresses, which govern the dynamics of plasma flow, are large and almost balance each other such that their difference is approximately equal to the rate of change of plasma momentum. The Hall dynamo, which is directly related to the Maxwell stress, drives the parallel current profile relaxation at resonant surfaces at the reconnection events. These results qualitatively agree with analytical calculations and numerical simulations. It is plausible that current-driven instabilities can be responsible for momentum transport in other laboratory and astrophysical plasmas.
Nonadiabatic electron response in the Hasegawa-Wakatani equations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stoltzfus-Dueck, T.; Scott, B. D.; Krommes, J. A.

2013-08-15

Tokamak edge turbulence is strongly influenced by parallel electron physics, which relaxes density and potential fluctuations towards electron adiabatic response. Beginning with the paradigmatic Hasegawa-Wakatani equations (HWEs) for resistive tokamak edge turbulence, a unique decomposition of the electric potential (φ) into adiabatic (a) and nonadiabatic (b) portions is derived, based on the requirement that a neither drive nor respond to the parallel current j_∥. The form of the decomposition clarifies that, at perpendicular scales large relative to the sound radius, the electron adiabatic response controls the nonzonal φ, not the fluctuating density n. Simple energy balance arguments allow one to rigorously bound the ratio of rms nonzonal nonadiabatic fluctuations (b̃) relative to adiabatic ones (ã). The role of the vorticity nonlinearity in transferring energy between adiabatic and nonadiabatic fluctuations aids intuitive understanding of self-sustained turbulence in the HWEs. When the normalized parallel resistivity is weak, b̃ becomes effectively slaved, allowing the reduction to an approximate one-field model that remains valid for strong turbulence. In addition to guiding physical intuition, the one-field reduction should greatly ease further analytical manipulations. Direct numerical simulation of the 2D HWEs confirms the convergence of the asymptotic formula for b̃.
Evaluation of the accuracy of the Rotating Parallel Ray Omnidirectional Integration for instantaneous pressure reconstruction from the measured pressure gradient

NASA Astrophysics Data System (ADS)

Moreto, Jose; Liu, Xiaofeng

2017-11-01

The accuracy of the Rotating Parallel Ray omnidirectional integration for pressure reconstruction from the measured pressure gradient (Liu et al., AIAA paper 2016-1049) is evaluated against both the Circular Virtual Boundary omnidirectional integration (Liu and Katz, 2006 and 2013) and the conventional Poisson equation approach. Dirichlet condition at one boundary point and Neumann condition at all other boundary points are applied to the Poisson solver. A direct numerical simulation database of isotropic turbulence flow (JHTDB), with a homogeneously distributed random noise added to the entire field of DNS pressure gradient, is used to assess the performance of the methods. The random noise, generated by the Matlab function Rand, has a magnitude varying randomly within the range of +/-40% of the maximum DNS pressure gradient. To account for the effect of the noise distribution pattern on the reconstructed pressure accuracy, a total of 1000 different noise distributions achieved by using different random number seeds are involved in the evaluation. Final results after averaging the 1000 realizations show that the error of the reconstructed pressure normalized by the DNS pressure variation range is 0.15 +/-0.07 for the Poisson equation approach, 0.028 +/-0.003 for the Circular Virtual Boundary method and 0.027 +/-0.003 for the Rotating Parallel Ray method, indicating the robustness of the Rotating Parallel Ray method in pressure reconstruction. Sponsor: The San Diego State University UGP program.
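The noise protocol described in the abstract above (bounded random noise scaled to the maximum gradient, repeated over many seeds, then averaged) can be sketched as follows. This is only an illustration of the ensemble procedure: the 1-D sine pressure field, the trapezoidal integration from one boundary point, and the reduced count of 100 realizations are assumptions made to keep the example small, and none of the omnidirectional integration methods is implemented here.

```python
import math
import random


def noisy_gradient(grad, amplitude_fraction, seed):
    """Add uniform noise bounded by a fraction of the maximum gradient magnitude."""
    rng = random.Random(seed)  # one independent noise pattern per seed
    bound = amplitude_fraction * max(abs(g) for g in grad)
    return [g + rng.uniform(-bound, bound) for g in grad]


def integrate_from_left(grad, dx, p0):
    """Trapezoidal integration of the gradient, starting from a known boundary value."""
    p = [p0]
    for i in range(1, len(grad)):
        p.append(p[-1] + 0.5 * (grad[i - 1] + grad[i]) * dx)
    return p


# Hypothetical 1-D stand-in for the DNS field: p(x) = sin(x) on [0, 2*pi].
n = 201
dx = 2.0 * math.pi / (n - 1)
xs = [i * dx for i in range(n)]
p_true = [math.sin(x) for x in xs]
grad_true = [math.cos(x) for x in xs]
p_range = max(p_true) - min(p_true)

errors = []
for seed in range(100):  # the study reports 1000 realizations; 100 keeps this quick
    grad = noisy_gradient(grad_true, 0.40, seed)
    p_rec = integrate_from_left(grad, dx, p_true[0])
    rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(p_rec, p_true)) / n)
    errors.append(rms / p_range)  # error normalized by the pressure variation range

mean_error = sum(errors) / len(errors)
```

Fixing the seed per realization makes each noise pattern reproducible, which is what allows different reconstruction methods to be compared on identical noisy inputs.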
Development of a parallel FE simulator for modeling the whole trans-scale failure process of rock from meso- to engineering-scale

NASA Astrophysics Data System (ADS)

Li, Gen; Tang, Chun-An; Liang, Zheng-Zhao

2017-01-01

Multi-scale high-resolution modeling of rock failure process is a powerful means in modern rock mechanics studies to reveal the complex failure mechanism and to evaluate engineering risks. However, multi-scale continuous modeling of rock, from deformation, damage to failure, has raised high requirements on the design, implementation scheme and computation capacity of the numerical software system. This study is aimed at developing the parallel finite element procedure, a parallel rock failure process analysis (RFPA) simulator that is capable of modeling the whole trans-scale failure process of rock. Based on the statistical meso-damage mechanical method, the RFPA simulator is able to construct heterogeneous rock models with multiple mechanical properties, deal with and represent the trans-scale propagation of cracks, in which the stress and strain fields are solved for the damage evolution analysis of representative volume element by the parallel finite element method (FEM) solver. This paper describes the theoretical basis of the approach and provides the details of the parallel implementation on a Windows - Linux interactive platform. A numerical model is built to test the parallel performance of FEM solver. Numerical simulations are then carried out on a laboratory-scale uniaxial compression test, and field-scale net fracture spacing and engineering-scale rock slope examples, respectively. The simulation results indicate that relatively high speedup and computation efficiency can be achieved by the parallel FEM solver with a reasonable boot process. In laboratory-scale simulation, the well-known physical phenomena, such as the macroscopic fracture pattern and stress-strain responses, can be reproduced. In field-scale simulation, the formation process of net fracture spacing from initiation, propagation to saturation can be revealed completely. In engineering-scale simulation, the whole progressive failure process of the rock slope can be well modeled. 
It is shown that the parallel FE simulator developed in this study is an efficient tool for modeling the whole trans-scale failure process of rock from meso- to engineering-scale.
Neural Signatures of Number Processing in Human Infants: Evidence for Two Core Systems Underlying Numerical Cognition

ERIC Educational Resources Information Center

Hyde, Daniel C.; Spelke, Elizabeth S.

2011-01-01

Behavioral research suggests that two cognitive systems are at the foundations of numerical thinking: one for representing 1-3 objects in parallel and one for representing and comparing large, approximate numerical magnitudes. We tested for dissociable neural signatures of these systems in preverbal infants by recording event-related potentials…
Graphics applications utilizing parallel processing

NASA Technical Reports Server (NTRS)

Rice, John R.

1990-01-01

The results are presented of research conducted to develop a parallel graphic application algorithm to depict the numerical solution of the 1-D wave equation, the vibrating string. The research was conducted on a Flexible Flex/32 multiprocessor and a Sequent Balance 21000 multiprocessor. The wave equation is implemented using the finite difference method. The synchronization issues that arose from the parallel implementation and the strategies used to alleviate the effects of the synchronization overhead are discussed.
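A serial version of the finite-difference vibrating-string computation mentioned in the abstract above might look like the sketch below. It assumes a standard explicit central-difference scheme, a unit-length string with fixed ends, unit wave speed, and a sine initial shape released from rest; the abstract does not state which scheme or parameters were used, and the parallel decomposition and graphics are omitted.

```python
import math


def simulate_string(n_intervals, courant, n_steps):
    """Explicit central-difference scheme for u_tt = c^2 u_xx on 0 <= x <= 1.

    Fixed ends, initial shape sin(pi x), zero initial velocity.
    courant = c * dt / dx must not exceed 1 for stability.
    Returns the displacement after n_steps time steps.
    """
    dx = 1.0 / n_intervals
    r2 = courant ** 2
    u_prev = [math.sin(math.pi * i * dx) for i in range(n_intervals + 1)]
    u_prev[0] = u_prev[-1] = 0.0

    # First step: Taylor expansion using the zero initial velocity.
    u = u_prev[:]
    for i in range(1, n_intervals):
        u[i] = u_prev[i] + 0.5 * r2 * (u_prev[i + 1] - 2.0 * u_prev[i] + u_prev[i - 1])

    # Remaining steps: three-level update; the end points stay at zero.
    for _ in range(n_steps - 1):
        u_next = [0.0] * (n_intervals + 1)
        for i in range(1, n_intervals):
            u_next[i] = (2.0 * u[i] - u_prev[i]
                         + r2 * (u[i + 1] - 2.0 * u[i] + u[i - 1]))
        u_prev, u = u, u_next
    return u


# With c = 1, dx = 0.02 and Courant number 0.5, dt = 0.01; 200 steps cover one
# full period (t = 2), so the string should return close to its initial shape.
u_final = simulate_string(n_intervals=50, courant=0.5, n_steps=200)
initial = [math.sin(math.pi * i / 50) for i in range(51)]
max_deviation = max(abs(a - b) for a, b in zip(u_final, initial))
```

Each interior point depends only on its two neighbours from the previous step, so a parallel version would split the string into contiguous blocks and synchronize once per time step to exchange block-boundary values.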
Numerical and experimental simulation of linear shear piezoelectric phased arrays for structural health monitoring

NASA Astrophysics Data System (ADS)

Wang, Wentao; Zhang, Hui; Lynch, Jerome P.; Cesnik, Carlos E. S.; Li, Hui

2017-04-01

A novel d36-type piezoelectric wafer fabricated from lead magnesium niobate-lead titanate (PMN-PT) is explored for the generation of in-plane horizontal shear waves in plate structures. The study focuses on the development of a linear phased array (PA) of PMN-PT wafers to improve the damage detection capabilities of a structural health monitoring (SHM) system. An attractive property of in-plane horizontal shear waves is that they are nondispersive yet sensitive to damage. This study characterizes the directionality of body waves (Lamb and horizontal shear) created by a single PMN-PT wafer bonded to the surface of a metallic plate structure. Second, a linear PA is designed from PMN-PT wafers to steer and focus Lamb and horizontal shear waves in a plate structure. Numerical studies are conducted to explore the capabilities of a PMN-PT-based PA to detect damage in aluminum plates. Numerical simulations are conducted using the Local Interaction Simulation Approach (LISA) implemented on a parallelized graphical processing unit (GPU) for high-speed execution. Numerical studies are further validated using experimental tests conducted with a linear PA. The study confirms the ability of a PMN-PT phased array to accurately detect and localize damage in aluminum plates.
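Beam steering with a linear phased array, as described in the abstract above, rests on firing the elements with progressive time delays. The sketch below uses the textbook far-field (plane-wave) delay law; the element count, the 10 mm pitch, and the 3100 m/s wave speed are hypothetical values chosen for illustration and are not taken from the study.

```python
import math


def steering_delays(n_elements, pitch_m, wave_speed_m_s, steer_angle_deg):
    """Firing delays (seconds) that steer a plane wavefront from a linear array.

    The angle is measured from the array normal, positive toward increasing
    element index. Delays are shifted so that the earliest element fires at t = 0.
    """
    theta = math.radians(steer_angle_deg)
    raw = [i * pitch_m * math.sin(theta) / wave_speed_m_s for i in range(n_elements)]
    offset = min(raw)
    return [t - offset for t in raw]


# Hypothetical array: 4 elements, 10 mm pitch, assumed wave speed of 3100 m/s.
delays_30 = steering_delays(4, 0.010, 3100.0, 30.0)
delays_0 = steering_delays(4, 0.010, 3100.0, 0.0)
delays_neg = steering_delays(4, 0.010, 3100.0, -30.0)
```

Because horizontal shear waves are nondispersive, a single wave speed suffices in this delay law; for dispersive Lamb modes the speed would depend on frequency and the delays would have to be computed per frequency band.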
Two-dimensional numerical simulation of a Stirling engine heat exchanger

NASA Technical Reports Server (NTRS)

Ibrahim, Mounir; Tew, Roy C.; Dudenhoefer, James E.

1989-01-01

The first phase of an effort to develop multidimensional models of Stirling engine components is described. The ultimate goal is to model an entire engine working space. Parallel plate and tubular heat exchanger models are described, with emphasis on the central part of the channel (i.e., ignoring hydrodynamic and thermal end effects). The model assumes laminar, incompressible flow with constant thermophysical properties. In addition, a constant axial temperature gradient is imposed. The governing equations describing the model have been solved using the Crank-Nicolson finite-difference scheme. Model predictions are compared with analytical solutions for oscillating/reversing flow and heat transfer in order to check numerical accuracy. Excellent agreement is obtained for flow both in circular tubes and between parallel plates. The computational heat transfer results are in good agreement with the analytical heat transfer results for parallel plates.
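The Crank-Nicolson scheme named in the abstract above can be shown on a much simpler problem than the oscillating-flow model: pure 1-D diffusion across a gap between two plates held at zero. The sketch is an assumption-laden illustration (unit gap, unit diffusivity, sine initial profile, tridiagonal solve by the Thomas algorithm) and does not reproduce the Stirling heat exchanger equations.

```python
import math


def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(u, r):
    """Advance u_t = alpha * u_yy by one step; r = alpha * dt / dy**2.

    u includes both wall values, which are held fixed (Dirichlet conditions).
    """
    m = len(u) - 2  # number of interior unknowns
    lower = [-0.5 * r] * m
    diag = [1.0 + r] * m
    upper = [-0.5 * r] * m
    rhs = [u[i] + 0.5 * r * (u[i + 1] - 2.0 * u[i] + u[i - 1]) for i in range(1, m + 1)]
    rhs[0] += 0.5 * r * u[0]    # known wall value moved to the right-hand side
    rhs[-1] += 0.5 * r * u[-1]
    return [u[0]] + thomas_solve(lower, diag, upper, rhs) + [u[-1]]


# Hypothetical test: the sin(pi y) mode decays as exp(-alpha * pi^2 * t).
n, alpha, dt, steps = 50, 1.0, 0.001, 100
dy = 1.0 / n
r = alpha * dt / dy ** 2
u = [math.sin(math.pi * i * dy) for i in range(n + 1)]
u[0] = u[-1] = 0.0
for _ in range(steps):
    u = crank_nicolson_step(u, r)

decay_numeric = u[n // 2]
decay_exact = math.exp(-alpha * math.pi ** 2 * steps * dt)
```

Here r = 2.5, well above the explicit stability limit of 0.5, yet the result remains accurate; that unconditional stability is the usual reason for choosing Crank-Nicolson in this kind of problem.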
Novel molecular targets for kRAS downregulation: promoter G-quadruplexes

DTIC Science & Technology

2016-11-01

conditions, and described the structure as having mixed parallel/anti-parallel loops of lengths 2:8:10 in the 5’-3’ direction. Using selective small...and anti-parallel loop directionality of lengths 4:10:8 in the 5’–3’ direction, three tetrads stacked, and involving guanines in runs B, C, E, and F...a tri-stacked structure incorporating runs B, C, E and F with intervening loops of 2, 10, and 8 bases in the 5’–3’ direction. G = black circles, C
Design and realization of test system for testing parallelism and jumpiness of optical axis of photoelectric equipment

NASA Astrophysics Data System (ADS)

Shi, Sheng-bing; Chen, Zhen-xing; Qin, Shao-gang; Song, Chun-yan; Jiang, Yun-hong

2014-09-01

With the development of science and technology, photoelectric equipment now comprises visible, infrared, laser, and other systems, and its degree of integration, informatization, and complexity is higher than in the past. Parallelism and jumpiness of the optical axis are important performance characteristics of photoelectric equipment and directly affect aiming, ranging, orientation, and so on. Jumpiness of the optical axis directly affects the hit precision of precision point-strike weapons, but facilities for testing this performance are lacking. In this paper, a test system for testing parallelism and jumpiness of the optical axis is devised. Accurate aiming is not necessary and data processing is digital in the course of testing parallelism. The system can directly test the parallelism of multiple axes, of the aiming axis and the laser emission axis, and of the laser emission axis and the laser receiving axis, and it is the first to realize testing of the jumpiness of the optical axis of an optical sighting device. It is a universal test system.
Statistical properties of Charney-Hasegawa-Mima zonal flows

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anderson, Johan, E-mail: anderson.johan@gmail.com; Botha, G. J. J.

2015-05-15

A theoretical interpretation of numerically generated probability density functions (PDFs) of intermittent plasma transport events in unforced zonal flows is provided within the Charney-Hasegawa-Mima (CHM) model. The governing equation is solved numerically with various prescribed density gradients that are designed to produce different configurations of parallel and anti-parallel streams. Long-lasting vortices form whose flow is governed by the zonal streams. It is found that the numerically generated PDFs can be matched with analytical predictions of PDFs based on the instanton method by removing the autocorrelations from the time series. In many instances, the statistics generated by the CHM dynamics relax to Gaussian distributions for both the electrostatic and vorticity perturbations, whereas in areas with strong nonlinear interactions it is found that the PDFs are exponentially distributed.
Peculiarities of field penetration in the presence of cross-flux interaction

NASA Astrophysics Data System (ADS)

Berseth, V.; Buzdin, A. I.; Indenbom, M. V.; Benoit, W.

1996-02-01

The attractive core interaction between two orthogonal vortex lattices in a layered superconductor is calculated. When one of these lattices is moving, this interaction gives rise to a drag force acting on the other one. Considering a moving in-plane flux lattice, the effect of the drag force on the perpendicular flux is modelled through a modification of the bulk critical current for this field component. The new critical current depends on the direction of motion of both parallel and perpendicular vortices. The results are derived within the critical-state model for the infinite slab and for the thin strip. For this latter geometry, computations are made with the help of a new numerical method simulating flux penetration in the critical state. The newly predicted qualitative phenomena (like the formation of a vortex-free region between two zones of opposite flux in the flat geometry) can be directly verified by the magneto-optic technique.
Small-scale anisotropic intermittency in magnetohydrodynamic turbulence at low magnetic Reynolds numbers.

PubMed

Okamoto, Naoya; Yoshimatsu, Katsunori; Schneider, Kai; Farge, Marie

2014-03-01

Small-scale anisotropic intermittency is examined in three-dimensional incompressible magnetohydrodynamic turbulence subjected to a uniformly imposed magnetic field. Orthonormal wavelet analyses are applied to direct numerical simulation data at moderate Reynolds number and for different interaction parameters. The magnetic Reynolds number is sufficiently low such that the quasistatic approximation can be applied. Scale-dependent statistical measures are introduced to quantify anisotropy in terms of the flow components, either parallel or perpendicular to the imposed magnetic field, and in terms of the different directions. Moreover, the flow intermittency is shown to increase with increasing values of the interaction parameter, which is reflected in strongly growing flatness values when the scale decreases. The scale-dependent anisotropy of energy is found to be independent of scale for all considered values of the interaction parameter. The strength of the imposed magnetic field does amplify the anisotropy of the flow.

Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

NASA Technical Reports Server (NTRS)

Sun, Xian-He

1997-01-01

Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conducting preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporating newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, (2) using a compact scheme to gain high order accuracy in numerical discretization. 
The previously developed Parallel Diagonal Dominant (PDD) algorithm and Reduced Parallel Diagonal Dominant (RPDD) algorithm have been carefully studied on different parallel platforms for different applications, and a NASA simulation code developed by Man M. Rai and his colleagues has been parallelized and implemented based on data dependency analysis. These achievements are addressed in detail in the paper.
Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

NASA Astrophysics Data System (ADS)

Qin, Cheng-Zhi; Zhan, Lijun

2012-06-01

As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. 
The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.
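A sequential multiple-flow-direction accumulation, of the kind the abstract above sets out to parallelize, can be sketched briefly. This is a generic slope-weighted MFD with a fixed exponent, not the MFD-md variant or the authors' GPU strategy; the exponent value and the tiny test DEM are assumptions, and the DEM is taken to be already free of depressions and flat areas. Cells are visited from highest to lowest instead of recursively, which gives the same result for such a DEM.

```python
import math


def mfd_flow_accumulation(dem, exponent=1.1):
    """Multiple-flow-direction accumulation on a depression-free gridded DEM.

    Each cell contributes one unit of flow and passes its accumulated flow to
    all strictly lower 8-neighbours, weighted by slope ** exponent.
    """
    rows, cols = len(dem), len(dem[0])
    acc = [[1.0] * cols for _ in range(rows)]

    # Highest cells first, so every cell has received all inflow before it drains.
    order = sorted(((dem[r][c], r, c) for r in range(rows) for c in range(cols)),
                   reverse=True)
    for z, r, c in order:
        weights = []
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and dem[rr][cc] < z:
                    slope = (z - dem[rr][cc]) / math.hypot(dr, dc)
                    weights.append((rr, cc, slope ** exponent))
        total = sum(w for _, _, w in weights)
        for rr, cc, w in weights:  # empty for an outlet cell, so no division by zero
            acc[rr][cc] += acc[r][c] * w / total
    return acc


# Hypothetical 4 x 4 plane tilted toward the bottom-right corner (single outlet).
dem = [[6.0 - r - c for c in range(4)] for r in range(4)]
acc = mfd_flow_accumulation(dem)
```

On this test surface every cell except the outlet has a lower neighbour, so the outlet collects the flow of all 16 cells, which gives a simple mass-conservation check.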
A bibliography on parallel and vector numerical algorithms

NASA Technical Reports Server (NTRS)

Ortega, James M.; Voigt, Robert G.; Romine, Charles H.

1988-01-01

This is a bibliography on numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.
A bibliography on parallel and vector numerical algorithms

NASA Technical Reports Server (NTRS)

Ortega, J. M.; Voigt, R. G.

1987-01-01

This is a bibliography of numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are listed also.
A bibliography on parallel and vector numerical algorithms

NASA Technical Reports Server (NTRS)

Ortega, James M.; Voigt, Robert G.; Romine, Charles H.

1990-01-01

This is a bibliography on numerical methods. It also includes a number of other references on machine architecture, programming language, and other topics of interest to scientific computing. Certain conference proceedings and anthologies which have been published in book form are also listed.
NDL-v2.0: A new version of the numerical differentiation library for parallel architectures

NASA Astrophysics Data System (ADS)

Hadjidoukas, P. E.; Angelikopoulos, P.; Voglis, C.; Papageorgiou, D. G.; Lagaris, I. E.

2014-07-01

We present a new version of the numerical differentiation library (NDL) used for the numerical estimation of first and second order partial derivatives of a function by finite differencing. In this version we have restructured the serial implementation of the code so as to achieve optimal task-based parallelization. The pure shared-memory parallelization of the library has been based on the lightweight OpenMP tasking model allowing for the full extraction of the available parallelism and efficient scheduling of multiple concurrent library calls. On multicore clusters, parallelism is exploited by means of TORC, an MPI-based multi-threaded tasking library. The new MPI implementation of NDL provides optimal performance in terms of function calls and, furthermore, supports asynchronous execution of multiple library calls within legacy MPI programs. In addition, a Python interface has been implemented for all cases, exporting the functionality of our library to sequential Python codes. Catalog identifier: AEDG_v2_0 Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDG_v2_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 63036 No. of bytes in distributed program, including test data, etc.: 801872 Distribution format: tar.gz Programming language: ANSI Fortran-77, ANSI C, Python. Computer: Distributed systems (clusters), shared memory systems. Operating system: Linux, Unix. Has the code been vectorized or parallelized?: Yes. RAM: The library uses O(N) internal storage, N being the dimension of the problem. It can use up to O(N^2) internal storage for Hessian calculations, if a task throttling factor has not been set by the user. Classification: 4.9, 4.14, 6.5. Catalog identifier of previous version: AEDG_v1_0 Journal reference of previous version: Comput. Phys. Comm.
180(2009)1404 Does the new version supersede the previous version?: Yes Nature of problem: The numerical estimation of derivatives at several accuracy levels is a common requirement in many computational tasks, such as optimization, solution of nonlinear systems, and sensitivity analysis. For a large number of scientific and engineering applications, the underlying functions correspond to simulation codes for which analytical estimation of derivatives is difficult or almost impossible. A parallel implementation that exploits systems with multiple CPUs is very important for large scale and computationally expensive problems. Solution method: Finite differencing is used with a carefully chosen step that minimizes the sum of the truncation and round-off errors. The parallel versions employ both OpenMP and MPI libraries. Reasons for new version: The updated version was motivated by our endeavors to extend a parallel Bayesian uncertainty quantification framework [1], by incorporating higher order derivative information as in most state-of-the-art stochastic simulation methods such as Stochastic Newton MCMC [2] and Riemannian Manifold Hamiltonian MC [3]. The function evaluations are simulations with significant time-to-solution, which also varies with the input parameters such as in [1, 4]. The runtime of the N-body-type of problem changes considerably with the introduction of a longer cut-off between the bodies. In the first version of the library, the OpenMP-parallel subroutines spawn a new team of threads and distribute the function evaluations with a PARALLEL DO directive. This limits the functionality of the library as multiple concurrent calls require nested parallelism support from the OpenMP environment. Therefore, either their function evaluations will be serialized or processor oversubscription is likely to occur due to the increased number of OpenMP threads. 
In addition, the Hessian calculations include two explicit parallel regions that compute first the diagonal and then the off-diagonal elements of the array. Due to the barrier between the two regions, the parallelism of the calculations is not fully exploited. These issues have been addressed in the new version by first restructuring the serial code and then running the function evaluations in parallel using OpenMP tasks. Although the MPI-parallel implementation of the first version is capable of fully exploiting the task parallelism of the PNDL routines, it does not utilize the caching mechanism of the serial code and, therefore, performs some redundant function evaluations in the Hessian and Jacobian calculations. This can lead to: (a) higher execution times if the number of available processors is lower than the total number of tasks, and (b) significant energy consumption due to wasted processor cycles. Overcoming these drawbacks, which become critical as the time of a single function evaluation increases, was the primary goal of this new version. Due to the code restructure, the MPI-parallel implementation (and the OpenMP-parallel in accordance) avoids redundant calls, providing optimal performance in terms of the number of function evaluations. Another limitation of the library was that the library subroutines were collective and synchronous calls. In the new version, each MPI process can issue any number of subroutines for asynchronous execution. We introduce two library calls that provide global and local task synchronizations, similarly to the BARRIER and TASKWAIT directives of OpenMP. The new MPI-implementation is based on TORC, a new tasking library for multicore clusters [5-7]. TORC improves the portability of the software, as it relies exclusively on the POSIX-Threads and MPI programming interfaces. 
It allows MPI processes to utilize multiple worker threads, offering a hybrid programming and execution environment similar to MPI+OpenMP, in a completely transparent way. Finally, to further improve the usability of our software, a Python interface has been implemented on top of both the OpenMP and MPI versions of the library. This allows sequential Python codes to exploit shared and distributed memory systems. Summary of revisions: The revised code improves the performance of both parallel (OpenMP and MPI) implementations. The functionality and the user-interface of the MPI-parallel version have been extended to support the asynchronous execution of multiple PNDL calls, issued by one or multiple MPI processes. A new underlying tasking library increases portability and allows MPI processes to have multiple worker threads. For both implementations, an interface to the Python programming language has been added. Restrictions: The library uses only double precision arithmetic. The MPI implementation assumes the homogeneity of the execution environment provided by the operating system. Specifically, the processes of a single MPI application must have identical address space and a user function resides at the same virtual address. In addition, address space layout randomization should not be used for the application. Unusual features: The software takes into account bound constraints, in the sense that only feasible points are used to evaluate the derivatives, and given the level of the desired accuracy, the proper formula is automatically employed. Running time: Running time depends on the function's complexity. The test run took 23 ms for the serial distribution, 25 ms for the OpenMP with 2 threads, 53 ms and 1.01 s for the MPI parallel distribution using 2 threads and 2 processes respectively and yield-time for idle workers equal to 10 ms. References: [1] P. Angelikopoulos, C. Paradimitriou, P. 
Koumoutsakos, Bayesian uncertainty quantification and propagation in molecular dynamics simulations: a high performance computing framework, J. Chem. Phys 137 (14). [2] H.P. Flath, L.C. Wilcox, V. Akcelik, J. Hill, B. van Bloemen Waanders, O. Ghattas, Fast algorithms for Bayesian uncertainty quantification in large-scale linear inverse problems based on low-rank partial Hessian approximations, SIAM J. Sci. Comput. 33 (1) (2011) 407-432. [3] M. Girolami, B. Calderhead, Riemann manifold Langevin and Hamiltonian Monte Carlo methods, J. R. Stat. Soc. Ser. B (Stat. Methodol.) 73 (2) (2011) 123-214. [4] P. Angelikopoulos, C. Paradimitriou, P. Koumoutsakos, Data driven, predictive molecular dynamics for nanoscale flow simulations under uncertainty, J. Phys. Chem. B 117 (47) (2013) 14808-14816. [5] P.E. Hadjidoukas, E. Lappas, V.V. Dimakopoulos, A runtime library for platform-independent task parallelism, in: PDP, IEEE, 2012, pp. 229-236. [6] C. Voglis, P.E. Hadjidoukas, D.G. Papageorgiou, I. Lagaris, A parallel hybrid optimization algorithm for fitting interatomic potentials, Appl. Soft Comput. 13 (12) (2013) 4481-4492. [7] P.E. Hadjidoukas, C. Voglis, V.V. Dimakopoulos, I. Lagaris, D.G. Papageorgiou, Supporting adaptive and irregular parallelism for non-linear numerical optimization, Appl. Math. Comput. 231 (2014) 544-559.
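The solution method in the NDL summary above (a step chosen to balance truncation and round-off error) can be sketched for a single first derivative. For a central difference the truncation error grows like h^2 while the round-off error grows like eps/h, so their sum is smallest for h proportional to eps^(1/3). The function below is a hypothetical illustration of that rule and is not part of the NDL interface; NDL additionally selects the formula by requested accuracy and respects bound constraints, which this sketch does not.

```python
import math
import sys

def central_difference(f, x, rel_step=None):
    """Estimate f'(x) with a central difference (illustrative, not NDL's API)."""
    eps = sys.float_info.epsilon
    if rel_step is None:
        # Step that balances O(h^2) truncation error against O(eps/h) round-off.
        rel_step = eps ** (1.0 / 3.0)
    # Scale the step with |x| so it stays meaningful for large arguments.
    h = rel_step * max(abs(x), 1.0)
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    estimate = central_difference(math.sin, 1.0)
    print(abs(estimate - math.cos(1.0)))  # small absolute error
```

Each gradient or Hessian entry needs its own pair of function evaluations, and those evaluations are independent, which is why the library can hand them out as parallel tasks.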
Structure of scintillations in Neptune's occultation shadow

NASA Technical Reports Server (NTRS)

Hubbard, W. B.; Lellouch, Emmanuel; Sicardy, Bruno; Brahic, Andre; Vilas, Faith

1988-01-01

An exceptionally high-quality data set from a Neptune occultation is used here to derive a number of new results about the statistical properties of the fluctuations of the intensity distribution in various parts of Neptune's occultation shadow. An approximate numerical ray-tracing model which successfully accounts for many of the qualitative aspects of the observed intensity fluctuation distribution is introduced. Strong refractive scintillation is simulated by including the effects of 'turbulence' with projected atmospheric properties allowed to vary in both the direction perpendicular and parallel to the limb, and an explicit two-dimensional picture of a typical intensity distribution throughout an occulting planet's shadow is presented. The results confirm the existence of highly anisotropic turbulence.
Dynamical Instability Produces Transform Faults at Mid-Ocean Ridges

NASA Astrophysics Data System (ADS)

Gerya, Taras

2010-08-01

Transform faults at mid-ocean ridges—one of the most striking, yet enigmatic features of terrestrial plate tectonics—are considered to be the inherited product of preexisting fault structures. Ridge offsets along these faults therefore should remain constant with time. Here, numerical models suggest that transform faults are actively developing and result from dynamical instability of constructive plate boundaries, irrespective of previous structure. Boundary instability from asymmetric plate growth can spontaneously start in alternate directions along successive ridge sections; the resultant curved ridges become transform faults within a few million years. Fracture-related rheological weakening stabilizes ridge-parallel detachment faults. Offsets along the transform faults change continuously with time by asymmetric plate growth and discontinuously by ridge jumps.
Highly parallel demagnetization field calculation using the fast multipole method on tetrahedral meshes with continuous sources

NASA Astrophysics Data System (ADS)

Palmesi, P.; Exl, L.; Bruckner, F.; Abert, C.; Suess, D.

2017-11-01

The long-range magnetic field is the most time-consuming part in micromagnetic simulations. Computational improvements can relieve problems related to this bottleneck. This work presents an efficient implementation of the Fast Multipole Method (FMM) for the magnetic scalar potential as used in micromagnetics. The novelty lies in extending FMM to linearly magnetized tetrahedral sources, making it interesting also for other areas of computational physics. We treat the near field directly and use (exact) numerical integration on the multipole expansion in the far field. This approach tackles important issues like the vectorial and continuous nature of the magnetic field. By using FMM the calculations scale linearly in time and memory.
A survey of parallel programming tools

NASA Technical Reports Server (NTRS)

Cheng, Doreen Y.

1991-01-01

This survey examines 39 parallel programming tools. Focus is placed on those tool capabilities needed for parallel scientific programming rather than for general computer science. The tools are classified with current and future needs of the Numerical Aerodynamic Simulator (NAS) in mind: existing and anticipated NAS supercomputers and workstations; operating systems; programming languages; and applications. They are divided into four categories: suggested acquisitions; tools already brought in; tools worth tracking; and tools eliminated from further consideration at this time.
On the conversion of infrared radiation from fission reactor-based photon engine into parallel beam

NASA Astrophysics Data System (ADS)

Gulevich, Andrey V.; Levchenko, Vladislav E.; Loginov, Nicolay I.; Kukharchuk, Oleg F.; Evtodiev, Denis A.; Zrodnikov, Anatoly V.

2002-01-01

The efficiency of converting infrared radiation from a photon engine based on a fission reactor into a parallel photon beam is discussed. Two different approaches are considered. One uses a parabolic mirror to convert the infrared radiation into a parallel photon beam. The other is based on a special lattice consisting of numerous light conductors. The experimental facility and some results are described.
Numerical Analysis of Dusty-Gas Flows

NASA Astrophysics Data System (ADS)

Saito, T.

2002-02-01

This paper presents the development of a numerical code for simulating unsteady dusty-gas flows including shock and rarefaction waves. The numerical results obtained for a shock tube problem are used for validating the accuracy and performance of the code. The code is then extended for simulating two-dimensional problems. Since the interactions between the gas and particle phases are calculated with the operator splitting technique, we can choose numerical schemes independently for the different phases. A semi-analytical method is developed for the dust phase, while the TVD scheme of Harten and Yee is chosen for the gas phase. Throughout this study, computations are carried out on an SGI Origin2000, a parallel computer with multiple RISC-based processors. The efficient use of the parallel computer system is an important issue, and the code implementation on the Origin2000 is also described. Flow profiles of both the gas and solid particles behind the steady shock wave are calculated by integrating the steady conservation equations. The good agreement between the pseudo-stationary solutions and those from the current numerical code validates the numerical approach and the actual coding. The pseudo-stationary shock profiles can also be used as initial conditions of unsteady multidimensional simulations.
Application of Direct Parallel Methods to Reconstruction and Forecasting Problems

NASA Astrophysics Data System (ADS)

Song, Changgeun

Many important physical processes in nature are represented by partial differential equations. Numerical weather prediction, in particular, requires vast computational resources. We investigate the significance of parallel processing technology to the real world problem of atmospheric prediction. In this paper we consider the classic problem of decomposing the observed wind field into the irrotational and nondivergent components. Recognizing the fact that on a limited domain this problem has a non-unique solution, Lynch (1989) described eight different ways to accomplish the decomposition. One set of elliptic equations is associated with the decomposition--this determines the initial nondivergent state for the forecast model. It is shown that the entire decomposition problem can be solved in a fraction of a second using a multi-vector processor such as the ALLIANT FX/8. Secondly, the barotropic model is used to track hurricanes. Also, one set of elliptic equations is solved to recover the streamfunction from the forecasted vorticity. A 72 h prediction of Elena is made while it is in the Gulf of Mexico. During this time the hurricane executes a dramatic re-curvature that is captured by the model. Furthermore, an improvement in the track prediction results when a simple assimilation strategy is used. This technique makes use of the wind fields in the 24 h period immediately preceding the initial time for the prediction. In this particular application, solutions to systems of elliptic equations are the center of the computational mechanics. We demonstrate that direct, parallel methods based on accelerated block cyclic reduction (BCR) significantly reduce the computational time required to solve the elliptic equations germane to the decomposition, the forecast, and the adjoint assimilation.
An asymptotic induced numerical method for the convection-diffusion-reaction equation

NASA Technical Reports Server (NTRS)

Scroggs, Jeffrey S.; Sorensen, Danny C.

1988-01-01

A parallel algorithm for the efficient solution of a time dependent reaction convection diffusion equation with small parameter on the diffusion term is presented. The method is based on a domain decomposition that is dictated by singular perturbation analysis. The analysis is used to determine regions where certain reduced equations may be solved in place of the full equation. Parallelism is evident at two levels. Domain decomposition provides parallelism at the highest level, and within each domain there is ample opportunity to exploit parallelism. Run time results demonstrate the viability of the method.
Experimental and numerical analysis on aluminum/steel pipe using magnetic pulse welding

NASA Astrophysics Data System (ADS)

Shim, J. Y.; Kim, I. S.; Lee, K. J.; Kang, B. Y.

2011-12-01

Recently, there has been a trend in the automotive industry to focus on the improvement of lightweight materials, such as aluminum and magnesium, because the welding of dissimilar metals causes many welding defects. Magnetic pulse welding (MPW), one of the solid state welding technologies, uses electromagnetic force from current discharged through a working coil, which develops a repulsive force between the induced currents flowing parallel and in the opposite direction in the tube to be welded. The objective of this paper is to develop a numerical model for analysis of the interaction between the outer pipe and the working coil using a finite element method (FEM) in the MPW process. The four Maxwell equations are solved using a general electromagnetic mechanics computer program, the ANSYS/EMAG code. Experiments were also carried out with a W-MPW60 machine manufactured by WELMATE CO., LTD., using Al1070 for the Al pipe and SM45C for the steel bar. The calculated and measured results were compared to verify the proposed model.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Alhroob, M.; Boyd, G.; Hasib, A.

Precision ultrasonic measurements in binary gas systems provide continuous real-time monitoring of mixture composition and flow. Using custom micro-controller-based electronics, we have developed an ultrasonic instrument, with numerous potential applications, capable of making continuous high-precision sound velocity measurements. The instrument measures sound transit times along two opposite directions aligned parallel to - or obliquely crossing - the gas flow. The difference between the two measured times yields the gas flow rate while their average gives the sound velocity, which can be compared with a sound velocity vs. molar composition look-up table for the binary mixture at a given temperature and pressure. The look-up table may be generated from prior measurements in known mixtures of the two components, from theoretical calculations, or from a combination of the two. We describe the instrument and its performance within numerous applications in the ATLAS experiment at the CERN Large Hadron Collider (LHC). The instrument can be of interest in other areas where continuous in-situ binary gas analysis and flowmetry are required. (authors)
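The way two opposite transit times separate sound velocity from flow velocity can be sketched with the standard transit-time relations. The sketch assumes a sound path of length L aligned parallel to the flow, so that t_down = L / (c + v) and t_up = L / (c - v); an obliquely crossing path would need an additional angle factor. The function name and parameters are hypothetical and are not taken from the instrument's software.

```python
def sound_speed_and_flow(t_down, t_up, path_length):
    """Recover sound speed c and flow speed v from two transit times.

    t_down: transit time with the flow (s); t_up: against the flow (s);
    path_length: distance between transducers (m), assumed parallel to the flow.
    """
    # 1/t_down = (c + v)/L and 1/t_up = (c - v)/L, so the sum isolates c
    # and the difference isolates v.
    c = 0.5 * path_length * (1.0 / t_down + 1.0 / t_up)
    v = 0.5 * path_length * (1.0 / t_down - 1.0 / t_up)
    return c, v


if __name__ == "__main__":
    L = 0.5  # metres, illustrative value
    c, v = sound_speed_and_flow(L / 305.0, L / 295.0, L)
    print(c, v)
```

When v is much smaller than c, the difference of the two times is proportional to v and their average is close to L / c, which matches the description in the abstract. The sound speed would then be looked up against composition at the measured temperature and pressure.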
Discontinuous Galerkin finite element methods for radiative transfer in spherical symmetry

NASA Astrophysics Data System (ADS)

Kitzmann, D.; Bolte, J.; Patzer, A. B. C.

2016-11-01

The discontinuous Galerkin finite element method (DG-FEM) is successfully applied to treat a broad variety of transport problems numerically. In this work, we use the full capacity of the DG-FEM to solve the radiative transfer equation in spherical symmetry. We present a discontinuous Galerkin method to directly solve the spherically symmetric radiative transfer equation as a two-dimensional problem. The transport equation in spherical atmospheres is more complicated than in the plane-parallel case owing to the appearance of an additional derivative with respect to the polar angle. The DG-FEM formalism allows for the exact integration of arbitrarily complex scattering phase functions, independent of the angular mesh resolution. We show that the discontinuous Galerkin method is able to describe accurately the radiative transfer in extended atmospheres and to capture discontinuities or complex scattering behaviour which might be present in the solution of certain radiative transfer tasks and can, therefore, cause severe numerical problems for other radiative transfer solution methods.
Computational attributes of the integral form of the equation of transfer

NASA Technical Reports Server (NTRS)

Frankel, J. I.

1991-01-01

Difficulties can arise in radiative and neutron transport calculations when a highly anisotropic scattering phase function is present. In the presence of anisotropy, currently used numerical solutions are based on the integro-differential form of the linearized Boltzmann transport equation. This paper departs from classical thought and presents an alternative numerical approach based on application of the integral form of the transport equation. Use of the integral formalism facilitates the following steps: a reduction in dimensionality of the system prior to discretization, the use of symbolic manipulation to augment the computational procedure, and the direct determination of key physical quantities which are derivable through the various Legendre moments of the intensity. The approach is developed in the context of radiative heat transfer in a plane-parallel geometry, and results are presented and compared with existing benchmark solutions. Encouraging results are presented to illustrate the potential of the integral formalism for computation. The integral formalism appears to possess several computational attributes which are well-suited to radiative and neutron transport calculations.
Parallel/distributed direct method for solving linear systems

NASA Technical Reports Server (NTRS)

Lin, Avi

1990-01-01

A new family of parallel schemes for directly solving linear systems is presented and analyzed. It is shown that these schemes exhibit near-optimal performance and enjoy several important features: (1) for large enough linear systems, the design of the appropriate parallel algorithm is insensitive to the number of processors, as its performance grows monotonically with them; (2) it is especially good for large matrices, with dimensions large relative to the number of processors in the system; (3) it can be used in both distributed parallel computing environments and tightly coupled parallel computing systems; and (4) this set of algorithms can be mapped onto any parallel architecture without any major programming difficulties or algorithmic changes.
A scalable parallel black oil simulator on distributed memory parallel computers

NASA Astrophysics Data System (ADS)

Wang, Kun; Liu, Hui; Chen, Zhangxin

2015-11-01

This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.

Score Equating and Nominally Parallel Language Tests.

ERIC Educational Resources Information Center

Moy, Raymond

Score equating requires that the forms to be equated are functionally parallel. That is, the two test forms should rank order examinees in a similar fashion. In language proficiency testing situations, this assumption is often put into doubt because of the numerous tests that have been proposed as measures of language proficiency and the…
New Parallel Algorithms for Landscape Evolution Model

NASA Astrophysics Data System (ADS)

Jin, Y.; Zhang, H.; Shi, Y.

2017-12-01

Most landscape evolution models (LEMs) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize because the drainage area must be computed for each node, which requires a huge amount of communication when run in parallel. To overcome this difficulty, we developed two parallel algorithms for LEMs with a stream net. One algorithm partitions the grid with traditional methods and applies an efficient global reduction algorithm to compute drainage areas and transport rates for the stream net; the other is based on a new partition algorithm, which first partitions the nodes in catchments between processes and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take advantage of massive computing techniques, and numerical experiments show that both are adequate to handle large-scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.
Effects of fracture surface roughness and shear displacement on geometrical and hydraulic properties of three-dimensional crossed rock fracture models

NASA Astrophysics Data System (ADS)

Huang, Na; Liu, Richeng; Jiang, Yujing; Li, Bo; Yu, Liyuan

2018-03-01

While shear-flow behavior through fractured media has so far been studied at the single-fracture scale, a numerical analysis of the shear effect on the hydraulic response of a 3D crossed fracture model is presented here. The analysis was based on a series of crossed fracture models, in which the effects of fracture surface roughness and shear displacement were considered. The rough fracture surfaces were generated using the modified successive random additions (SRA) algorithm. The shear displacement was applied on one fracture, and at the same time another fracture shifted along with the upper and lower surfaces of the sheared fracture. The simulation results reveal the development and variation of preferential flow paths through the model during the shear, accompanied by the change of the flow rate ratios between two flow planes at the outlet boundary. The average contact area accounts for approximately 5-27% of the fracture planes during shear, but the actual calculated flow area is about 38-55% of the fracture planes, which is much smaller than the noncontact area. The equivalent permeability will either increase or decrease as shear displacement increases from 0 to 4 mm, depending on the aperture distribution of the intersection part between two fractures. When the shear displacement continuously increases by up to 20 mm, the equivalent permeability increases sharply first, and then keeps increasing with a lower gradient. The equivalent permeability of the rough fractured model is about 26-80% of that calculated from the parallel plate model, and the equivalent permeability in the direction perpendicular to the shear direction is approximately 1.31-3.67 times larger than that in the direction parallel to the shear direction. These results can provide a fundamental understanding of fluid flow through crossed fracture models under shear.
A Numerical Comparison of Barrier and Modified Barrier Methods for Large-Scale Bound-Constrained Optimization

NASA Technical Reports Server (NTRS)

Nash, Stephen G.; Polyak, R.; Sofer, Ariela

1994-01-01

When a classical barrier method is applied to the solution of a nonlinear programming problem with inequality constraints, the Hessian matrix of the barrier function becomes increasingly ill-conditioned as the solution is approached. As a result, it may be desirable to consider alternative numerical algorithms. We compare the performance of two methods motivated by barrier functions. The first is a stabilized form of the classical barrier method, where a numerically stable approximation to the Newton direction is used when the barrier parameter is small. The second is a modified barrier method where a barrier function is applied to a shifted form of the problem, and the resulting barrier terms are scaled by estimates of the optimal Lagrange multipliers. The condition number of the Hessian matrix of the resulting modified barrier function remains bounded as the solution to the constrained optimization problem is approached. Both of these techniques can be used in the context of a truncated-Newton method, and hence can be applied to large problems, as well as on parallel computers. In this paper, both techniques are applied to problems with bound constraints and we compare their practical behavior.
Two-dimensional numerical simulation of a Stirling engine heat exchanger

NASA Technical Reports Server (NTRS)

Ibrahim, Mounir B.; Tew, Roy C.; Dudenhoefer, James E.

1989-01-01

The first phase of an effort to develop multidimensional models of Stirling engine components is described; the ultimate goal is to model an entire engine working space. More specifically, parallel plate and tubular heat exchanger models with emphasis on the central part of the channel (i.e., ignoring hydrodynamic and thermal end effects) are described. The model assumes laminar, incompressible flow with constant thermophysical properties. In addition, a constant axial temperature gradient is imposed. The governing equations describing the model were solved using a Crank-Nicolson finite-difference scheme. Model predictions were compared with analytical solutions for oscillating/reversing flow and heat transfer in order to check numerical accuracy. Excellent agreement was obtained for the model predictions with analytical solutions available for both flow in circular tubes and between parallel plates. Also, the heat transfer computational results are in good agreement with the heat transfer analytical results for parallel plates.
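For readers unfamiliar with the Crank-Nicolson scheme named above, the following is a minimal sketch for the generic one-dimensional diffusion equation u_t = alpha * u_xx with zero Dirichlet boundaries. It is not the Stirling heat exchanger model, which solves momentum and energy equations for oscillating flow. The function names and the parameter r = alpha * dt / dx^2 are assumptions made for illustration.

```python
import math

def thomas(a, b, c, d):
    """Solve a tridiagonal system; a = sub-, b = main, c = super-diagonal."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def crank_nicolson_step(u, r):
    """Advance the interior values u by one time step, r = alpha*dt/dx**2."""
    n = len(u)
    # Implicit half: (1 + r) on the diagonal, -r/2 on the off-diagonals.
    a = [-r / 2.0] * n
    b = [1.0 + r] * n
    c = [-r / 2.0] * n
    a[0] = 0.0
    c[-1] = 0.0
    # Explicit half: the same stencil with the opposite sign on the old values.
    rhs = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        rhs.append((r / 2.0) * left + (1.0 - r) * u[i] + (r / 2.0) * right)
    return thomas(a, b, c, rhs)


if __name__ == "__main__":
    n, dx, dt = 49, 1.0 / 50.0, 0.001
    u = [math.sin(math.pi * (i + 1) * dx) for i in range(n)]
    for _ in range(100):
        u = crank_nicolson_step(u, dt / dx ** 2)
    print(u[24], math.exp(-math.pi ** 2 * 0.1))  # numerical vs analytical midpoint
```

Comparing against the analytical decay of a sine mode mirrors, in miniature, the accuracy check against analytical solutions described in the abstract.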
Chemical Transport in a Fissured Rock: Verification of a Numerical Model

NASA Astrophysics Data System (ADS)

Rasmuson, A.; Narasimhan, T. N.; Neretnieks, I.

1982-10-01

Numerical models for simulating chemical transport in fissured rocks constitute powerful tools for evaluating the acceptability of geological nuclear waste repositories. Due to the very long-term, high toxicity of some nuclear waste products, the models are required to predict, in certain cases, the spatial and temporal distribution of chemical concentration less than 0.001% of the concentration released from the repository. Whether numerical models can provide such accuracies is a major question addressed in the present work. To this end we have verified a numerical model, TRUMP, which solves the advective diffusion equation in general three dimensions, with or without decay and source terms. The method is based on an integrated finite difference approach. The model was verified against the known analytic solution of the one-dimensional advection-diffusion problem, as well as the problem of advection-diffusion in a system of parallel fractures separated by spherical particles. The studies show that as long as the magnitude of advectance is equal to or less than that of conductance for the closed surface bounding any volume element in the region (that is, numerical Peclet number <2), the numerical method can indeed match the analytic solution within errors of ±10^-3% or less. The realistic input parameters used in the sample calculations suggest that such a range of Peclet numbers is indeed likely to characterize deep groundwater systems in granitic and ancient argillaceous systems. Thus TRUMP in its present form does provide a viable tool for use in nuclear waste evaluation studies. A sensitivity analysis based on the analytic solution suggests that the errors in prediction introduced due to uncertainties in input parameters are likely to be larger than the computational inaccuracies introduced by the numerical model.
Currently, a disadvantage in the TRUMP model is that the iterative method of solving the set of simultaneous equations is rather slow when time constants vary widely over the flow region. Although the iterative solution may be very desirable for large three-dimensional problems in order to minimize computer storage, it seems desirable to use a direct solver technique in conjunction with the mixed explicit-implicit approach whenever possible. Work in this direction is in progress.
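The Peclet-number criterion quoted in this abstract can be illustrated with a minimal sketch. It assumes the common textbook one-dimensional definition Pe = v·Δx/D; how TRUMP itself defines advectance and conductance over a volume element is not given here and may differ. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def grid_peclet(velocity, cell_size, diffusivity):
    """Grid Peclet number Pe = |v| * dx / D (assumed textbook 1D definition)."""
    if diffusivity <= 0:
        raise ValueError("diffusivity must be positive")
    return abs(velocity) * cell_size / diffusivity


def cell_size_at_limit(velocity, diffusivity, pe_limit=2.0):
    """Cell size at which the grid Peclet number reaches pe_limit."""
    if velocity == 0:
        return float("inf")  # pure diffusion: no advective restriction
    return pe_limit * diffusivity / abs(velocity)


# Hypothetical values, not taken from the paper.
velocity = 1.0e-6      # m/s
diffusivity = 1.0e-6   # m^2/s
cell_size = 0.5        # m

pe = grid_peclet(velocity, cell_size, diffusivity)
within_limit = pe < 2.0  # the "Peclet number < 2" condition from the abstract
```

With these assumed inputs the mesh could be coarsened up to `cell_size_at_limit(velocity, diffusivity)` before the criterion is violated.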
Detecting opportunities for parallel observations on the Hubble Space Telescope

NASA Technical Reports Server (NTRS)

Lucks, Michael

1992-01-01

The presence of multiple scientific instruments aboard the Hubble Space Telescope provides opportunities for parallel science, i.e., the simultaneous use of different instruments for different observations. Determining whether candidate observations are suitable for parallel execution depends on numerous criteria (some involving quantitative tradeoffs) that may change frequently. A knowledge based approach is presented for constructing a scoring function to rank candidate pairs of observations for parallel science. In the Parallel Observation Matching System (POMS), spacecraft knowledge and schedulers' preferences are represented using a uniform set of mappings, or knowledge functions. Assessment of parallel science opportunities is achieved via composition of the knowledge functions in a prescribed manner. The knowledge acquisition and explanation facilities of the system are presented. The methodology is applicable to many other multiple criteria assessment problems.
Parallelized Stochastic Cutoff Method for Long-Range Interacting Systems

NASA Astrophysics Data System (ADS)

Endo, Eishin; Toga, Yuta; Sasaki, Munetaka

2015-07-01

We present a method of parallelizing the stochastic cutoff (SCO) method, which is a Monte-Carlo method for long-range interacting systems. After interactions are eliminated by the SCO method, we subdivide a lattice into noninteracting interpenetrating sublattices. This subdivision enables us to parallelize the Monte-Carlo calculation in the SCO method. Such subdivision is found by numerically solving the vertex coloring of a graph created by the SCO method. We use an algorithm proposed by Kuhn and Wattenhofer to solve the vertex coloring by parallel computation. This method was applied to a two-dimensional magnetic dipolar system on an L × L square lattice to examine its parallelization efficiency. The result showed that, in the case of L = 2304, the speed of computation increased about 102 times by parallel computation with 288 processors.
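The idea of splitting a lattice into noninteracting sublattices through vertex coloring can be sketched as follows. This is only an illustration: it uses a simple sequential greedy coloring rather than the parallel algorithm of Kuhn and Wattenhofer named in the abstract, and the interaction graph is a hypothetical periodic square lattice with nearest-neighbor bonds, not a graph produced by the SCO method.

```python
from collections import defaultdict


def square_lattice(size):
    """Nearest-neighbor interaction graph on a periodic size x size lattice."""
    adjacency = {}
    for x in range(size):
        for y in range(size):
            adjacency[(x, y)] = [
                ((x + 1) % size, y), ((x - 1) % size, y),
                (x, (y + 1) % size), (x, (y - 1) % size),
            ]
    return adjacency


def greedy_coloring(adjacency):
    """Give each site the smallest color not used by an already colored neighbor."""
    colors = {}
    for site, neighbors in adjacency.items():
        used = {colors[n] for n in neighbors if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[site] = color
    return colors


def sublattices(colors):
    """Group sites by color; sites within one group share no interaction."""
    groups = defaultdict(list)
    for site, color in colors.items():
        groups[color].append(site)
    return dict(groups)


adjacency = square_lattice(4)
colors = greedy_coloring(adjacency)
groups = sublattices(colors)
```

Because no two sites of the same color interact, all sites in one group could be updated concurrently in a Monte Carlo sweep, with the groups processed one after another.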
Numerical investigation of heat transfer in parallel channels with water at supercritical pressure.

PubMed

Shitsi, Edward; Kofi Debrah, Seth; Yao Agbodemegbe, Vincent; Ampomah-Amoako, Emmanuel

2017-11-01

Thermal phenomena such as heat transfer enhancement, heat transfer deterioration, and flow instability observed at supercritical pressures as a result of fluid property variations have the potential to affect the safety of design and operation of the Supercritical Water-cooled Reactor (SCWR), and also challenge the capabilities of both heat transfer correlations and Computational Fluid Dynamics (CFD) physical models. These phenomena observed at supercritical pressures need to be thoroughly investigated. An experimental study was carried out by Xi to investigate flow instability in parallel channels at supercritical pressures under different mass flow rates, pressures, and axial power shapes. Experimental data on flow instability at the inlet of the heated channels were obtained but no heat transfer data along the axial length were obtained. This numerical study used the 3D numerical tool STAR-CCM+ to investigate heat transfer at supercritical pressures along the axial lengths of the parallel channels with water ahead of experimental data. A homogeneous axial power shape (HAPS) was adopted and the heating powers adopted in this work were below the experimental threshold heating powers obtained for HAPS by Xi. The results show that the Fluid Centre-line Temperature (FCLT) increased linearly below and above the PCT region, but flattened at the PCT region for all the system parameters considered. The inlet temperature, heating power, pressure, gravity and mass flow rate have effects on WT (wall temperature) values in the NHT (normal heat transfer), EHT (enhanced heat transfer), DHT (deteriorated heat transfer) and recovery from DHT regions. While variation of all other system parameters in the EHT and PCT regions showed no significant difference in the WT and FCLT values respectively, the WT and FCLT values respectively increased with pressure in these regions. For most of the system parameters considered, the FCLT and WT values obtained in the two channels were nearly the same. 
The numerical study was not quantitatively compared with experimental data along the axial lengths of the parallel channels, but it was observed that the numerical tool STAR-CCM+ adopted was able to capture the trends for NHT, EHT, DHT and recovery from DHT regions. The heating powers used for the various simulations were below the experimentally observed threshold heating powers, but heat transfer deterioration (HTD) was observed, confirming the previous finding that HTD could occur before the occurrence of unstable behavior at supercritical pressures. For purposes of comparing the results of numerical simulations with experimental data, the heat transfer data on temperature oscillations obtained at the outlet of the heated channels and instability boundary results obtained at the inlet of the heated channels were compared. The numerical results obtained agree quite well with the experimental data. This work calls for provision of experimental data on heat transfer in parallel channels at supercritical pressures for validation of similar numerical studies.
Flow above and within granular media composed of spherical and non-spherical particles - using a 3D numerical model

NASA Astrophysics Data System (ADS)

Bartzke, Gerhard; Kuhlmann, Jannis; Huhn, Katrin

2016-04-01

The entrainment of single grains and, hence, their erosion characteristics are dependent on fluid forcing, grain size and density, but also shape variations. To quantitatively describe and capture the hydrodynamic conditions around individual grains, researchers commonly use empirical approaches such as laboratory flume tanks. Nonetheless, it is difficult with such physical experiments to measure the flow velocities in the direct vicinity or within the pore spaces of sediments, at a sufficient resolution and in a non-invasive way. As a result, the hydrodynamic conditions in the water column, at the fluid-porous interface and within pore spaces of a granular medium of various grain shapes are not yet fully understood. For that reason, there is a strong need for numerical models, since these are capable of quantifying fluid speeds within a granular medium. A 3D-SPH (Smooth Particle Hydrodynamics) numerical wave tank model was set up to provide quantitative evidence on the flow velocities in the direct vicinity and in the interior of granular beds composed of two shapes as a complementary method to the difficult task of in situ measurement. On the basis of previous successful numerical wave tank models with SPH, the model geometry was chosen in dimensions of X=2.68 [m], Y=0.48 [m], and Z=0.8 [m]. Three suites of experiments were designed with a range of particle shape models: (1) ellipsoids with the long axis oriented in the across-stream direction, (2) ellipsoids with the long axis oriented in the along-stream direction, and (3) spheres. Particle diameters ranged from 0.04 [m] to 0.08 [m]. A wave was introduced by a vertical paddle that accelerated to 0.8 [m/s] perpendicular to the granular bed. 
Flow measurements showed that the flow velocity values into the beds were highest when the grains were oriented across the stream direction and lowest when the grains were oriented parallel to the stream, indicating that the model was capable of simultaneously simulating the flow into and within a granular medium composed of spherical and non-spherical shapes under wave forcing. It is concluded that variations in grain shape orientation within a bed appear to control the amount of flow that can be accumulated by the pores, which was illustrated in a conceptual model.
A method for data handling numerical results in parallel OpenFOAM simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anton, Alin; Muntean, Sebastian

Parallel computational fluid dynamics simulations produce a vast amount of numerical result data. This paper introduces a method for reducing the size of the data by replaying the interprocessor traffic. The results are recovered only in certain regions of interest configured by the user. A known test case is used for several mesh partitioning scenarios using the OpenFOAM® toolkit [1]. The space savings obtained with classic algorithms remain constant for more than 60 Gb of floating point data. Our method is most efficient on large simulation meshes and is much better suited for compressing large scale simulation results than the regular algorithms.
Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights

DOE PAGES

Pebay, Philippe; Terriberry, Timothy B.; Kolla, Hemanth; ...

2016-03-29

Formulas for incremental or parallel computation of second order central moments have long been known, and recent extensions of these formulas to univariate and multivariate moments of arbitrary order have been developed. Such formulas are of key importance in scenarios where incremental results are required and in parallel and distributed systems where communication costs are high. We survey these recent results, and improve them with arbitrary-order, numerically stable one-pass formulas which we further extend with weighted and compound variants. We also develop a generalized correction factor for standard two-pass algorithms that enables the maintenance of accuracy over nearly the full representable range of the input, avoiding the need for extended-precision arithmetic. We then empirically examine algorithm correctness for pairwise update formulas up to order four as well as condition number and relative error bounds for eight different central moment formulas, each up to degree six, to address the trade-offs between numerical accuracy and speed of the various algorithms. Finally, we demonstrate the use of the most elaborate among the above mentioned formulas, with the utilization of the compound moments for a practical large-scale scientific application.
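The long-known second-order case mentioned at the start of this abstract can be sketched as a pairwise merge of partial results. The code below is a minimal illustration of that classic unweighted update for the mean and the sum of squared deviations; it is not the arbitrary-order, weighted, or compound formulas developed in the paper, and the function names are hypothetical. Partitions are assumed to be non-empty.

```python
def partial_moments(values):
    """Return (count, mean, M2) for one non-empty partition of the data.

    M2 is the sum of squared deviations from the partition mean.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values)
    return n, mean, m2


def combine(a, b):
    """Merge two partial results without revisiting the underlying data."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return n, mean, m2


data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
# Each slice stands in for the data held by one worker.
merged = combine(partial_moments(data[:2]), partial_moments(data[2:]))
reference = partial_moments(data)
variance = merged[2] / (merged[0] - 1)  # sample variance from merged M2
```

Only three numbers per partition need to be communicated, which is why such formulas suit distributed settings where communication is expensive.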
Massive parallel 3D PIC simulation of negative ion extraction

NASA Astrophysics Data System (ADS)

Revel, Adrien; Mochalskyy, Serhiy; Montellano, Ivar Mauricio; Wünderlich, Dirk; Fantz, Ursel; Minea, Tiberiu

2017-09-01

The 3D PIC-MCC code ONIX is dedicated to modeling Negative hydrogen/deuterium Ion (NI) extraction and co-extraction of electrons from radio-frequency driven, low pressure plasma sources. It provides valuable insight on the complex phenomena involved in the extraction process. In previous calculations, a mesh size larger than the Debye length was used, implying numerical electron heating. Important steps have been achieved in terms of computation performance and parallelization efficiency allowing successful massive parallel calculations (4096 cores), imperative to resolve the Debye length. In addition, the numerical algorithms have been improved in terms of grid treatment, i.e., the electric field near the complex geometry boundaries (plasma grid) is calculated more accurately. The revised model preserves the full 3D treatment, but can take advantage of a highly refined mesh. ONIX was used to investigate the role of the mesh size, the re-injection scheme for lost particles (extracted or wall absorbed), and the electron thermalization process on the calculated extracted current and plasma characteristics. It is demonstrated that all numerical schemes give the same NI current distribution for extracted ions. Concerning the electrons, the pair-injection technique is found well-adapted to simulate the sheath in front of the plasma grid.
A parallelization method for time periodic steady state in simulation of radio frequency sheath dynamics

NASA Astrophysics Data System (ADS)

Kwon, Deuk-Chul; Shin, Sung-Sik; Yu, Dong-Hun

2017-10-01

In order to reduce the computing time in simulation of radio frequency (rf) plasma sources, various numerical schemes were developed. It is well known that the upwind, exponential, and power-law schemes can efficiently overcome the limitation on the grid size for fluid transport simulations of high density plasma discharges. Also, the semi-implicit method is a well-known numerical scheme to overcome the limitation on the simulation time step. However, despite remarkable advances in numerical techniques and computing power over the last few decades, efficient multi-dimensional modeling of low temperature plasma discharges has remained a considerable challenge. In particular, parallelization in time has been difficult for time periodic steady state problems such as capacitively coupled plasma discharges and rf sheath dynamics, because values of plasma parameters from the previous time step are used to calculate new values at each time step. Therefore, we present a parallelization method for the time periodic steady state problems by using period-slices. In order to evaluate the efficiency of the developed method, one-dimensional fluid simulations are conducted for describing rf sheath dynamics. The result shows that speedup can be achieved by using a multithreading method.
Interfacial properties in a discrete model for tumor growth

NASA Astrophysics Data System (ADS)

Moglia, Belén; Guisoni, Nara; Albano, Ezequiel V.

2013-03-01

We propose and study, by means of Monte Carlo numerical simulations, a minimal discrete model for avascular tumor growth, which can also be applied for the description of cell cultures in vitro. The interface of the tumor is self-affine and its width can be characterized by the following exponents: (i) the growth exponent β=0.32(2) that governs the early time regime, (ii) the roughness exponent α=0.49(2) related to the fluctuations in the stationary regime, and (iii) the dynamic exponent z=α/β≃1.49(2), which measures the propagation of correlations in the direction parallel to the interface, e.g., ξ∝t^(1/z), where ξ is the parallel correlation length. Therefore, the interface belongs to the Kardar-Parisi-Zhang universality class, in agreement with recent experiments of cell cultures in vitro. Furthermore, density profiles of the growing cells are rationalized in terms of traveling waves that are solutions of the Fisher-Kolmogorov equation. In this way, we achieved excellent agreement between the simulation results of the discrete model and the continuous description of the growth front of the culture or tumor.
An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling

DOE PAGES

Ghysels, Pieter; Li, Xiaoye S.; Rouet, Francois -Henry; ...

2016-10-27

Here, we present a sparse linear system solver that is based on a multifrontal variant of Gaussian elimination and exploits low-rank approximation of the resulting dense frontal matrices. We use hierarchically semiseparable (HSS) matrices, which have low-rank off-diagonal blocks, to approximate the frontal matrices. For HSS matrix construction, a randomized sampling algorithm is used together with interpolative decompositions. The combination of the randomized compression with a fast ULV HSS factorization leads to a solver with lower computational complexity than the standard multifrontal method for many applications, resulting in speedups up to 7-fold for problems in our test suite. The implementation targets many-core systems by using task parallelism with dynamic runtime scheduling. Numerical experiments show performance improvements over state-of-the-art sparse direct solvers. The implementation achieves high performance and good scalability on a range of modern shared memory parallel systems, including the Intel Xeon Phi (MIC). The code is part of a software package called STRUMPACK - STRUctured Matrices PACKage, which also has a distributed memory component for dense rank-structured matrices.
Mechanism for generating the anomalous uplift of oceanic core complexes: Atlantis Bank, southwest Indian Ridge

NASA Astrophysics Data System (ADS)

Baines, A. Graham; Cheadle, Michael J.; Dick, Henry J. B.; Hosford Scheirer, Allegra; John, Barbara E.; Kusznir, Nick J.; Matsumoto, Takeshi

2003-12-01

Atlantis Bank is an anomalously uplifted oceanic core complex adjacent to the Atlantis II transform, on the southwest Indian Ridge, that rises >3 km above normal seafloor of the same age. Models of flexural uplift due to detachment faulting can account for ~1 km of this uplift. Postdetachment normal faults have been observed during submersible dives and on swath bathymetry. Two transform-parallel, large-offset (hundreds of meters) normal faults are identified on the eastern flank of Atlantis Bank, with numerous smaller faults (tens of meters) on the western flank. Flexural uplift associated with this transform-parallel normal faulting is consistent with gravity data and can account for the remaining anomalous uplift of Atlantis Bank. Extension normal to the Atlantis II transform may have occurred during a 12 m.y. period of transtension initiated by a 10° change in spreading direction ca. 19.5 Ma. This extension may have produced the 120-km-long transverse ridge of which Atlantis Bank is a part, and is consistent with stress reorientation about a weak transform fault.
Mechanism for generating the anomalous uplift of oceanic core complexes: Atlantis Bank, southwest Indian Ridge

USGS Publications Warehouse

Baines, A.G.; Cheadle, Michael J.; Dick, H.J.B.; Scheirer, A.H.; John, Barbara E.; Kusznir, N.J.; Matsumoto, T.

2003-01-01

Atlantis Bank is an anomalously uplifted oceanic core complex adjacent to the Atlantis II transform, on the southwest Indian Ridge, that rises >3 km above normal seafloor of the same age. Models of flexural uplift due to detachment faulting can account for ~1 km of this uplift. Postdetachment normal faults have been observed during submersible dives and on swath bathymetry. Two transform-parallel, large-offset (hundreds of meters) normal faults are identified on the eastern flank of Atlantis Bank, with numerous smaller faults (tens of meters) on the western flank. Flexural uplift associated with this transform-parallel normal faulting is consistent with gravity data and can account for the remaining anomalous uplift of Atlantis Bank. Extension normal to the Atlantis II transform may have occurred during a 12 m.y. period of transtension initiated by a 10° change in spreading direction ca. 19.5 Ma. This extension may have produced the 120-km-long transverse ridge of which Atlantis Bank is a part, and is consistent with stress reorientation about a weak transform fault.
Entropy generation in a parallel-plate active magnetic regenerator with insulator layers

NASA Astrophysics Data System (ADS)

Mugica Guerrero, Ibai; Poncet, Sébastien; Bouchard, Jonathan

2017-02-01

This paper proposes a feasible solution to diminish conduction losses in active magnetic regenerators. Higher performances of these machines are linked to a lower thermal conductivity of the Magneto-Caloric Material (MCM) in the streamwise direction. The concept presented here involves the insertion of insulator layers along the length of a parallel-plate magnetic regenerator in order to reduce the heat conduction within the MCM. This idea is investigated by means of a 1D numerical model. This model solves not only the energy equations for the fluid and solid domains but also the magnetic circuit that conforms the experimental setup of reference. In conclusion, the addition of insulator layers within the MCM increases the temperature span, cooling load, and coefficient of performance by a combination of lower heat conduction losses and an increment of the global Magneto-Caloric Effect. The generated entropy by solid conduction, fluid convection, and conduction and viscous losses are calculated to help understand the implications of introducing insulator layers in magnetic regenerators. Finally, the optimal number of insulator layers is studied.

Magnetic resonance imaging of water content across the Nafion membrane in an operational PEM fuel cell.

PubMed

Zhang, Ziheng; Martin, Jonathan; Wu, Jinfeng; Wang, Haijiang; Promislow, Keith; Balcom, Bruce J

2008-08-01

Water management is critical to optimize the operation of polymer electrolyte membrane fuel cells. At present, numerical models are employed to guide water management in such fuel cells. Accurate measurements of water content variation in polymer electrolyte membrane fuel cells are required to validate these models and to optimize fuel cell behavior. We report a direct water content measurement across the Nafion membrane in an operational polymer electrolyte membrane fuel cell, employing double half k-space spin echo single point imaging techniques. The MRI measurements with T2 mapping were undertaken with a parallel plate resonator to avoid the effects of RF screening. The parallel plate resonator employs the electrodes inherent to the fuel cell to create a resonant circuit at RF frequencies for MR excitation and detection, while still operating as a conventional fuel cell at DC. Three stages of fuel cell operation were investigated: activation, operation and dehydration. Each profile was acquired in 6 min, with 6 μm nominal resolution and an SNR of better than 15.
A conservative scheme of drift kinetic electrons for gyrokinetic simulation of kinetic-MHD processes in toroidal plasmas

NASA Astrophysics Data System (ADS)

Bao, J.; Liu, D.; Lin, Z.

2017-10-01

A conservative scheme of drift kinetic electrons for gyrokinetic simulations of kinetic-magnetohydrodynamic processes in toroidal plasmas has been formulated and verified. Both vector potential and electron perturbed distribution function are decomposed into adiabatic part with analytic solution and non-adiabatic part solved numerically. The adiabatic parallel electric field is solved directly from the electron adiabatic response, resulting in a high degree of accuracy. The consistency between electrostatic potential and parallel vector potential is enforced by using the electron continuity equation. Since particles are only used to calculate the non-adiabatic response, which is used to calculate the non-adiabatic vector potential through Ohm's law, the conservative scheme minimizes the electron particle noise and mitigates the cancellation problem. Linear dispersion relations of the kinetic Alfvén wave and the collisionless tearing mode in cylindrical geometry have been verified in gyrokinetic toroidal code simulations, which show that the perpendicular grid size can be larger than the electron collisionless skin depth when the mode wavelength is longer than the electron skin depth.
Asymptotic-preserving Lagrangian approach for modeling anisotropic transport in magnetized plasmas for arbitrary magnetic fields

NASA Astrophysics Data System (ADS)

Chacon, Luis; Del-Castillo-Negrete, Diego; Hauck, Cory

2012-10-01

Modeling electron transport in magnetized plasmas is extremely challenging due to the extreme anisotropy between parallel (to the magnetic field) and perpendicular directions (χ∥/χ⊥ ~ 10^10 in fusion plasmas). Recently, a Lagrangian Green's function approach, developed for the purely parallel transport case [D. del-Castillo-Negrete, L. Chacón, PRL 106, 195004 (2011); D. del-Castillo-Negrete, L. Chacón, Phys. Plasmas 19, 056112 (2012)], has been extended to the anisotropic transport case in the tokamak-ordering limit with constant density [L. Chacón, D. del-Castillo-Negrete, C. Hauck, JCP, submitted (2012)]. An operator-split algorithm is proposed that allows one to treat Eulerian and Lagrangian components separately. The approach is shown to feature bounded numerical errors for arbitrary χ∥/χ⊥ ratios, which renders it asymptotic-preserving. In this poster, we will present the generalization of the Lagrangian approach to arbitrary magnetic fields. We will demonstrate the potential of the approach with various challenging configurations, including the case of transport across a magnetic island in cylindrical geometry.
Sexual dimorphisms in avian and reptilian courtship: two systems that do not play by mammalian rules.

PubMed

Wade, J

1999-01-01

Sexual dimorphisms in the central nervous system exist in numerous vertebrate species, and in many cases these structural differences between males and females parallel differences in the display of reproductive behaviors. Often both the behavioral and anatomical differences are controlled by exposure to gonadal steroid hormones, either during ontogeny or in adulthood. This article reviews some of the evidence supporting the hypothesis that in mammals, testosterone or its metabolites regulate the structure and function of neural and muscle systems involved in the control of masculine sexual behaviors. It then describes data suggesting that the mechanisms regulating sexually dimorphic courtship systems in zebra finches and green anole lizards are not completely parallel to the mammalian systems. Finally, some directions for future study are suggested, with the hope that they will stimulate thought about the nature of comparisons made across vertebrate models when investigators are attempting to determine both which morphological sex differences are important to the control of the reproductive behaviors, and which mechanisms regulating both structure and function are widely employed or are unique.
Engineered artificial antigen presenting cells facilitate direct and efficient expansion of tumor infiltrating lymphocytes

PubMed Central

2011-01-01

Background Development of a standardized platform for the rapid expansion of tumor-infiltrating lymphocytes (TILs) with anti-tumor function from patients with limited TIL numbers or tumor tissues challenges their clinical application. Methods To facilitate adoptive immunotherapy, we applied genetically-engineered K562 cell-based artificial antigen presenting cells (aAPCs) for the direct and rapid expansion of TILs isolated from primary cancer specimens. Results TILs outgrown in IL-2 undergo rapid, CD28-independent expansion in response to aAPC stimulation that requires provision of exogenous IL-2 cytokine support. aAPCs induce numerical expansion of TILs that is statistically similar to an established rapid expansion method at a 100-fold lower feeder cell to TIL ratio, and greater than those achievable using anti-CD3/CD28 activation beads or extended IL-2 culture. aAPC-expanded TILs undergo numerical expansion of tumor antigen-specific cells, remain amenable to secondary aAPC-based expansion, and have low CD4/CD8 ratios and FOXP3+ CD4+ cell frequencies. TILs can also be expanded directly from fresh enzyme-digested tumor specimens when pulsed with aAPCs. These "young" TILs are tumor-reactive, positively skewed in CD8+ lymphocyte composition, CD28 and CD27 expression, and contain fewer FOXP3+ T cells compared to parallel IL-2 cultures. Conclusion Genetically-enhanced aAPCs represent a standardized, "off-the-shelf" platform for the direct ex vivo expansion of TILs of suitable number, phenotype and function for use in adoptive immunotherapy. PMID:21827675
Parallel-vector out-of-core equation solver for computational mechanics

NASA Technical Reports Server (NTRS)

Qin, J.; Agarwal, T. K.; Storaasli, O. O.; Nguyen, D. T.; Baddourah, M. A.

1993-01-01

A parallel/vector out-of-core equation solver is developed for shared-memory computers, such as the Cray Y-MP machine. The input/output (I/O) time is reduced by using the asynchronous BUFFER IN and BUFFER OUT, which can be executed simultaneously with the CPU instructions. The parallel and vector capability provided by the supercomputers is also exploited to enhance the performance. Numerical applications in large-scale structural analysis are given to demonstrate the efficiency of the present out-of-core solver.
Nonlinear whistler waves

NASA Astrophysics Data System (ADS)

Vasko, I.; Agapitov, O. V.; Mozer, F.; Bonnell, J. W.; Krasnoselskikh, V.; Artemyev, A.; Drake, J. F.

2017-12-01

Chorus waves observed in the Earth's inner magnetosphere sometimes exhibit a significantly distorted (nonharmonic) parallel electric field waveform. In spectrograms these waveform features show up as overtones of the chorus wave. In this work we show that the chorus wave parallel electric field is distorted due to the finite temperature of electrons. The distortion of the parallel electric field is described analytically and reproduced in numerical fluid simulations. Due to this effect the chorus energy is transferred to higher frequencies, making possible efficient scattering of low-energy (a few keV) electrons.
Parallel heat transport in integrable and chaotic magnetic fields

DOE Office of Scientific and Technical Information (OSTI.GOV)

Del-Castillo-Negrete, Diego B; Chacon, Luis

2012-01-01

The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion, space plasmas, and astrophysics research. Three issues make this problem particularly challenging: (i) The extreme anisotropy between the parallel (i.e., along the magnetic field), χ∥, and the perpendicular, χ⊥, conductivities (χ∥/χ⊥ may exceed 10^10 in fusion plasmas); (ii) Magnetic field lines chaos which in general complicates (and may preclude) the construction of magnetic field line coordinates; and (iii) Nonlocal parallel transport in the limit of small collisionality. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields in arbitrary geometry. The method avoids by construction the numerical pollution issues of grid-based algorithms. The potential of the approach is demonstrated with nontrivial applications to integrable (magnetic island chain), weakly chaotic (devil's staircase), and fully chaotic magnetic field configurations. For the latter, numerical solutions of the parallel heat transport equation show that the effective radial transport, with local and non-local closures, is non-diffusive, thus casting doubts on the appropriateness of the applicability of quasilinear diffusion descriptions. General conditions for the existence of non-diffusive, multivalued flux-gradient relations in the temperature evolution are derived.
Computational time analysis of the numerical solution of 3D electrostatic Poisson's equation

NASA Astrophysics Data System (ADS)

Kamboh, Shakeel Ahmed; Labadin, Jane; Rigit, Andrew Ragai Henri; Ling, Tech Chaw; Amur, Khuda Bux; Chaudhary, Muhammad Tayyab

2015-05-01

The 3D Poisson equation is solved numerically to simulate the electric potential in a prototype design of an electrohydrodynamic (EHD) ion-drag micropump. The finite difference method (FDM) is employed to discretize the governing equation. The system of linear equations resulting from FDM is solved iteratively using the sequential Jacobi (SJ) and sequential Gauss-Seidel (SGS) methods, and the simulation results of the two are compared. The main objective was to analyze the computational time required by both methods for different grid sizes and to parallelize the Jacobi method to reduce the computational time. In general, the SGS method is faster than the SJ method, but the data parallelism of the Jacobi method may produce a good speedup over the SGS method. In this study, the feasibility of using a parallel Jacobi (PJ) method is assessed relative to the SGS method. The MATLAB Parallel/Distributed computing environment is used and a parallel code for the SJ method is implemented. It was found that for small grid sizes the SGS method remains faster than both the SJ and PJ methods, while for large grid sizes both sequential methods may take excessive processing time to converge. The PJ method, however, reduces computational time to some extent for large grid sizes.
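The difference between the two sequential iterations in the preceding abstract can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the paper, and the unit-cube domain, unit source term, zero boundary values, and the name `solve_poisson` are assumptions made only for illustration; it is not the authors' code. Jacobi reads only the previous iterate, so all points can be updated independently (the property a parallel version exploits), while Gauss-Seidel reuses freshly updated values and typically needs fewer sweeps.

```python
def solve_poisson(n, method, tol=1e-8, max_iter=10_000):
    """Solve -laplace(u) = 1 on the unit cube with u = 0 on the boundary.

    n is the number of interior points per direction; method is "jacobi"
    or "gauss_seidel". Returns (u, sweeps_used).
    """
    h2 = (1.0 / (n + 1)) ** 2
    size = n + 2  # boundary layers are included and stay at zero
    u = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    for sweep in range(1, max_iter + 1):
        # Jacobi reads a frozen copy of the previous iterate;
        # Gauss-Seidel reads the array it is updating in place.
        if method == "jacobi":
            src = [[row[:] for row in plane] for plane in u]
        else:
            src = u
        change = 0.0
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                for k in range(1, n + 1):
                    neighbours = (src[i - 1][j][k] + src[i + 1][j][k]
                                  + src[i][j - 1][k] + src[i][j + 1][k]
                                  + src[i][j][k - 1] + src[i][j][k + 1])
                    new = (neighbours + h2) / 6.0  # 7-point stencil, f = 1
                    change = max(change, abs(new - src[i][j][k]))
                    u[i][j][k] = new
        if change < tol:
            return u, sweep
    return u, max_iter


u_j, iters_jacobi = solve_poisson(6, "jacobi")
u_gs, iters_gs = solve_poisson(6, "gauss_seidel")
print("Jacobi sweeps:", iters_jacobi, "Gauss-Seidel sweeps:", iters_gs)
```

On this small grid the sweep counts show only the sequential trade-off; whether a parallel Jacobi implementation overtakes Gauss-Seidel depends on grid size and hardware, as the abstract notes.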
Advancing predictive models for particulate formation in turbulent flames via massively parallel direct numerical simulations

PubMed Central

Bisetti, Fabrizio; Attili, Antonio; Pitsch, Heinz

2014-01-01

Combustion of fossil fuels is likely to continue for the near future due to the growing trends in energy consumption worldwide. The increase in efficiency and the reduction of pollutant emissions from combustion devices are pivotal to achieving meaningful levels of carbon abatement as part of the ongoing climate change efforts. Computational fluid dynamics featuring adequate combustion models will play an increasingly important role in the design of more efficient and cleaner industrial burners, internal combustion engines, and combustors for stationary power generation and aircraft propulsion. Today, turbulent combustion modelling is hindered severely by the lack of data that are accurate and sufficiently complete to assess and remedy model deficiencies effectively. In particular, the formation of pollutants is a complex, nonlinear and multi-scale process characterized by the interaction of molecular and turbulent mixing with a multitude of chemical reactions with disparate time scales. The use of direct numerical simulation (DNS) featuring a state of the art description of the underlying chemistry and physical processes has contributed greatly to combustion model development in recent years. In this paper, the analysis of the intricate evolution of soot formation in turbulent flames demonstrates how DNS databases are used to illuminate relevant physico-chemical mechanisms and to identify modelling needs. PMID:25024412
Robust Multigrid Smoothers for Three Dimensional Elliptic Equations with Strong Anisotropies

NASA Technical Reports Server (NTRS)

Llorente, Ignacio M.; Melson, N. Duane

1998-01-01

We discuss the behavior of several plane relaxation methods as multigrid smoothers for the solution of a discrete anisotropic elliptic model problem on cell-centered grids. The methods compared are plane Jacobi with damping, plane Jacobi with partial damping, plane Gauss-Seidel, plane zebra Gauss-Seidel, and line Gauss-Seidel. Based on numerical experiments and local mode analysis, we compare the smoothing factor of the different methods in the presence of strong anisotropies. A four-color Gauss-Seidel method is found to have the best numerical and architectural properties of the methods considered in the present work. Although alternating direction plane relaxation schemes are simpler and more robust than other approaches, they are not currently used in industrial and production codes because they require the solution of a two-dimensional problem for each plane in each direction. We verify the theoretical predictions of Thole and Trottenberg that an exact solution of each plane is not necessary and that a single two-dimensional multigrid cycle gives the same result as an exact solution, in much less execution time. Parallelization of the two-dimensional multigrid cycles, the kernel of the three-dimensional implicit solver, is also discussed. Alternating-plane smoothers are found to be highly efficient multigrid smoothers for anisotropic elliptic problems.
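Why strong anisotropy calls for line or plane smoothers can be seen from a much simpler case than the one treated in the preceding abstract. The sketch below applies local mode analysis to damped point Jacobi for the two-dimensional operator -eps*u_xx - u_yy with a 5-point stencil. The operator, the damping value 0.8, and the function name are assumptions chosen for illustration; it does not reproduce the three-dimensional plane smoothers studied in the paper.

```python
import math


def jacobi_smoothing_factor(eps, omega, n=64):
    """Largest amplification of high-frequency Fourier modes under
    damped point Jacobi for -eps*u_xx - u_yy (5-point stencil)."""
    worst = 0.0
    for p in range(n + 1):
        t1 = -math.pi + 2 * math.pi * p / n
        for q in range(n + 1):
            t2 = -math.pi + 2 * math.pi * q / n
            # Keep only high-frequency modes, i.e. those that a grid
            # with doubled spacing cannot represent.
            if max(abs(t1), abs(t2)) < math.pi / 2 - 1e-12:
                continue
            symbol = 2 * eps * (1 - math.cos(t1)) + 2 * (1 - math.cos(t2))
            amplification = abs(1 - omega * symbol / (2 * eps + 2))
            worst = max(worst, amplification)
    return worst


for eps in (1.0, 0.1, 0.001):
    print(eps, round(jacobi_smoothing_factor(eps, omega=0.8), 4))
```

As eps decreases, the factor approaches 1, meaning that modes oscillating only in the weakly coupled direction are barely damped by a point smoother. This is the standard motivation for relaxing whole lines or planes at once.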
Comparison study of exhaust plume impingement effects of small mono- and bipropellant thrusters using parallelized DSMC method

PubMed Central

2017-01-01

A space propulsion system is important for the normal mission operations of a spacecraft because it adjusts the spacecraft's attitude and performs maneuvers. Generally, mono- and bipropellant thrusters have mainly been used as low-thrust liquid rocket engines. But as the plume gas expelled from these small thrusters diffuses freely in vacuum in all directions, unwanted effects due to the plume collision with the spacecraft surfaces can dramatically degrade the function and performance of a spacecraft. Thus, the aim of the present study is to investigate and quantitatively compare the major differences in plume gas impingement effects between small mono- and bipropellant thrusters using computational fluid dynamics (CFD). For efficiency of the numerical calculations, the whole calculation domain is divided into two flow regimes depending on the flow characteristics, and the Navier-Stokes equations and the parallelized Direct Simulation Monte Carlo (DSMC) method are adopted for the respective regimes. From the present analysis, the thermal and mass influences of the plume gas impingement on the spacecraft were analyzed for the mono- and bipropellant thrusters. As a result, it is concluded that a careful understanding of the plume impingement effects, which depend on the chemical characteristics of the different propellants, is necessary for the efficient design of the spacecraft. PMID:28636625
A Comparison of Numerical and Analytical Radiative-Transfer Solutions for Plane Albedo of Natural Waters

EPA Science Inventory

Three numerical algorithms were compared to provide a solution of a radiative transfer equation (RTE) for plane albedo (hemispherical reflectance) in semi-infinite one-dimensional plane-parallel layer. Algorithms were based on the invariant imbedding method and two different var...
Study Behaviors and USMLE Step 1 Performance: Implications of a Student Self-Directed Parallel Curriculum.

PubMed

Burk-Rafel, Jesse; Santen, Sally A; Purkiss, Joel

2017-11-01

To determine medical students' study behaviors when preparing for the United States Medical Licensing Examination (USMLE) Step 1, and how these behaviors are associated with Step 1 scores when controlling for likely covariates. The authors distributed a study-behaviors survey in 2014 and 2015 at their institution to two cohorts of medical students who had recently taken Step 1. Demographic and academic data were linked to responses. Descriptive statistics, bivariate correlations, and multiple linear regression analyses were performed. Of 332 medical students, 274 (82.5%) participated. Most students (n = 211; 77.0%) began studying for Step 1 during their preclinical curriculum, increasing their intensity during a protected study period during which they averaged 11.0 hours studying per day (standard deviation [SD] 2.1) over a period of 35.3 days (SD 6.2). Students used numerous third-party resources, including reading an exam-specific 700-page review book on average 2.1 times (SD 0.8) and completing an average of 3,597 practice multiple-choice questions (SD 1,611). Initiating study prior to the designated study period, increased review book usage, and attempting more practice questions were all associated with higher Step 1 scores, even when controlling for Medical College Admission Test scores, preclinical exam performance, and self-identified score goal (adjusted R² = 0.56, P < .001). Medical students at one public institution engaged in a self-directed, "parallel" Step 1 curriculum using third-party study resources. Several study behaviors were associated with improved USMLE Step 1 performance, informing both institutional- and student-directed preparation for this high-stakes exam.
Parallel rendering

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.

1995-01-01

This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
Linearly exact parallel closures for slab geometry

NASA Astrophysics Data System (ADS)

Ji, Jeong-Young; Held, Eric D.; Jhang, Hogun

2013-08-01

Parallel closures are obtained by solving a linearized kinetic equation with a model collision operator using the Fourier transform method. The closures expressed in wave number space are exact for time-dependent linear problems to within the limits of the model collision operator. In the adiabatic, collisionless limit, an inverse Fourier transform is performed to obtain integral (nonlocal) parallel closures in real space; parallel heat flow and viscosity closures for density, temperature, and flow velocity equations replace Braginskii's parallel closure relations, and parallel flow velocity and heat flow closures for density and temperature equations replace Spitzer's parallel transport relations. It is verified that the closures reproduce the exact linear response function of Hammett and Perkins [Phys. Rev. Lett. 64, 3019 (1990)] for Landau damping given a temperature gradient. In contrast to their approximate closures where the vanishing viscosity coefficient numerically gives an exact response, our closures relate the heat flow and nonvanishing viscosity to temperature and flow velocity (gradients).
Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB

NASA Technical Reports Server (NTRS)

Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.

2017-01-01

Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit potential for improvement of serial code even for the so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated while only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
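The reason a Jacobian computation counts as embarrassingly parallel is that each finite-difference column depends only on one perturbed evaluation of the residual. The following minimal sketch shows that structure in Python instead of MATLAB's `spmd`; the two-equation `residual` function and all other names are hypothetical and are not taken from the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def residual(x):
    """Hypothetical two-equation nonlinear system F(x) = 0."""
    return [x[0] ** 2 + x[1] - 3.0, x[0] - x[1] ** 3 + 1.0]


def jacobian_column(f, x, j, h=1e-6):
    """Forward-difference approximation of column j of dF/dx."""
    perturbed = list(x)
    perturbed[j] += h
    base, shifted = f(x), f(perturbed)
    return [(s - b) / h for b, s in zip(base, shifted)]


def jacobian(f, x, workers=2):
    """Assemble the Jacobian from independently computed columns."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        columns = list(pool.map(lambda j: jacobian_column(f, x, j),
                                range(len(x))))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


J = jacobian(residual, [1.0, 2.0])
print(J)
```

A thread pool is used here only to keep the sketch short and portable. In CPython, threads do not speed up pure-Python arithmetic, which echoes the abstract's point that a parallel formulation does not by itself guarantee measured speedup.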
Multilevel decomposition of complete vehicle configuration in a parallel computing environment

NASA Technical Reports Server (NTRS)

Bhatt, Vinay; Ragsdell, K. M.

1989-01-01

This research summarizes various approaches to multilevel decomposition to solve large structural problems. A linear decomposition scheme based on the Sobieski algorithm is selected as a vehicle for automated synthesis of a complete vehicle configuration in a parallel processing environment. The research is in a developmental state. Preliminary numerical results are presented for several example problems.
User's guide to the Parallel Processing Extension of the Prognosis Model

Treesearch

Nicholas L. Crookston; Albert R. Stage

1991-01-01

The Parallel Processing Extension (PPE) of the Prognosis Model was designed to analyze responses of numerous stands to coordinated management and pest impacts that operate at the landscape level of forests. Vegetation-related resource supply analysis can be readily performed for a thousand or more sample stands for projections 400 years into the future. Capabilities...
Innovative Language-Based & Object-Oriented Structured AMR Using Fortran 90 and OpenMP

NASA Technical Reports Server (NTRS)

Norton, C.; Balsara, D.

1999-01-01

Parallel adaptive mesh refinement (AMR) is an important numerical technique that leads to the efficient solution of many physical and engineering problems. In this paper, we describe how AMR programming can be performed in an object-oriented way using the modern aspects of Fortran 90 combined with the parallelization features of OpenMP.

Analysis of Serial and Parallel Algorithms for Use in Molecular Dynamics. Review and Proposals

NASA Astrophysics Data System (ADS)

Mazzone, A. M.

This work analyzes the stability and accuracy of multistep methods, either for serial or parallel calculations, applied to molecular dynamics simulations. Numerical testing is made by evaluating the equilibrium configurations of mono-elemental crystalline lattices of metallic and semiconducting type (Ag and Si, respectively) and of a cubic CuY compound.
Parallelization of Rocket Engine Simulator Software (PRESS)

NASA Technical Reports Server (NTRS)

Cezzar, Ruknet

1997-01-01

The Parallelization of Rocket Engine System Software (PRESS) project is part of a collaborative effort with Southern University at Baton Rouge (SUBR), University of West Florida (UWF), and Jackson State University (JSU). The second-year funding, which supports two graduate students enrolled in our new Master's program in Computer Science at Hampton University and the principal investigator, has been obtained for the period from October 19, 1996 through October 18, 1997. The key part of the interim report was new directions for the second-year funding. This came about from discussions during the Rocket Engine Numeric Simulator (RENS) project meeting in Pensacola on January 17-18, 1997. At that time, a software agreement between Hampton University and NASA Lewis Research Center had already been concluded. That agreement concerns off-NASA-site experimentation with the PUMPDES/TURBDES software. Before this agreement, during the first year of the project, another large-scale FORTRAN-based software package, Two-Dimensional Kinetics (TDK), was being used for translation to an object-oriented language and for parallelization experiments. However, that package proved to be too complex and too poorly documented for an effective translation to object-oriented C++ source code. The focus, this time with the better documented and more manageable PUMPDES/TURBDES package, was still on translation to C++ with design improvements. At the RENS meeting, however, the impetus for the RENS projects in general, and PRESS in particular, shifted in two important ways. One was closer alignment with the work on the Numerical Propulsion System Simulator (NPSS) through cooperation and collaboration with the LERC ACLU organization. The other was to see whether and how NASA's various rocket design software can be run over local networks and intranets without any radical effort for redesign and translation into object-oriented source code. There were also suggestions that the Fortran-based code be encapsulated in C++ code, thereby facilitating reuse without undue development effort. The details are covered in the aforementioned section of the interim report filed on April 28, 1997.
Mechanical testing of bones: the positive synergy of finite-element models and in vitro experiments.

PubMed

Cristofolini, Luca; Schileo, Enrico; Juszczyk, Mateusz; Taddei, Fulvia; Martelli, Saulo; Viceconti, Marco

2010-06-13

Bone biomechanics have been extensively investigated in the past both with in vitro experiments and numerical models. In most cases either approach is chosen, without exploiting synergies. Both experiments and numerical models suffer from limitations relative to their accuracy and their respective fields of application. In vitro experiments can improve numerical models by: (i) preliminarily identifying the most relevant failure scenarios; (ii) improving the model identification with experimentally measured material properties; (iii) improving the model identification with accurately measured actual boundary conditions; and (iv) providing quantitative validation based on mechanical properties (strain, displacements) directly measured from physical specimens being tested in parallel with the modelling activity. Likewise, numerical models can improve in vitro experiments by: (i) identifying the most relevant loading configurations among a number of motor tasks that cannot be replicated in vitro; (ii) identifying acceptable simplifications for the in vitro simulation; (iii) optimizing the use of transducers to minimize errors and provide measurements at the most relevant locations; and (iv) exploring a variety of different conditions (material properties, interface, etc.) that would require enormous experimental effort. By reporting an example of successful investigation of the femur, we show how a combination of numerical modelling and controlled experiments within the same research team can be designed to create a virtuous circle where models are used to improve experiments, experiments are used to improve models and their combination synergistically provides more detailed and more reliable results than can be achieved with either approach singularly.
Parallel Algorithm Solves Coupled Differential Equations

NASA Technical Reports Server (NTRS)

Hayashi, A.

1987-01-01

Numerical methods adapted to concurrent processing. Algorithm solves set of coupled partial differential equations by numerical integration. Adapted to run on hypercube computer, algorithm separates problem into smaller problems solved concurrently. Increase in computing speed with concurrent processing over that achievable with conventional sequential processing appreciable, especially for large problems.
A Comparison of Numerical and Analytical Radiative-Transfer Solutions for Plane Albedo in Natural Waters

EPA Science Inventory

Several numerical and analytical solutions of the radiative transfer equation (RTE) for plane albedo were compared for solar light reflection by sea water. The study incorporated the simplest case, that being a semi-infinite one-dimensional plane-parallel absorbing and scattering...
A parallel Jacobson-Oksman optimization algorithm. [parallel processing (computers)]

NASA Technical Reports Server (NTRS)

Straeter, T. A.; Markos, A. T.

1975-01-01

A gradient-dependent optimization technique which exploits the vector-streaming or parallel-computing capabilities of some modern computers is presented. The algorithm, derived by assuming that the function to be minimized is homogeneous, is a modification of the Jacobson-Oksman serial minimization method. In addition to describing the algorithm, conditions insuring the convergence of the iterates of the algorithm and the results of numerical experiments on a group of sample test functions are presented. The results of these experiments indicate that this algorithm will solve optimization problems in less computing time than conventional serial methods on machines having vector-streaming or parallel-computing capabilities.
Beyond the Renderer: Software Architecture for Parallel Graphics and Visualization

NASA Technical Reports Server (NTRS)

Crockett, Thomas W.

1996-01-01

As numerous implementations have demonstrated, software-based parallel rendering is an effective way to obtain the needed computational power for a variety of challenging applications in computer graphics and scientific visualization. To fully realize their potential, however, parallel renderers need to be integrated into a complete environment for generating, manipulating, and delivering visual data. We examine the structure and components of such an environment, including the programming and user interfaces, rendering engines, and image delivery systems. We consider some of the constraints imposed by real-world applications and discuss the problems and issues involved in bringing parallel rendering out of the lab and into production.
Collisions between quasi-parallel shocks

NASA Technical Reports Server (NTRS)

Cargill, Peter J.

1991-01-01

The collision between pairs of quasi-parallel shocks is examined using hybrid numerical simulations. In the interaction, the two shocks are transmitted through each other leaving behind a hot plasma with a population of particles with energies in excess of 40 E0, where E0 is the kinetic energy of particles in the shock frame prior to the collision. The energization is more efficient for quasi-parallel shocks than parallel shocks. Collisions between shocks of equal strengths are more efficient than those that are unequal. The results are of importance for phenomena during the impulsive phase of solar flares, in the distant solar wind and at planetary bow shocks.
Research in applied mathematics, numerical analysis, and computer science

NASA Technical Reports Server (NTRS)

1984-01-01

Research conducted at the Institute for Computer Applications in Science and Engineering (ICASE) in applied mathematics, numerical analysis, and computer science is summarized and abstracts of published reports are presented. The major categories of the ICASE research program are: (1) numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; (2) control and parameter identification; (3) computational problems in engineering and the physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and (4) computer systems and software, especially vector and parallel computers.
Vortex-induced vibration of two parallel risers: Experimental test and numerical simulation

NASA Astrophysics Data System (ADS)

Huang, Weiping; Zhou, Yang; Chen, Haiming

2016-04-01

The vortex-induced vibration of two identical rigidly mounted risers in a parallel arrangement was studied using Ansys-CFX and model tests. The vortex shedding and force were recorded to determine the effect of spacing on the two-degree-of-freedom oscillation of the risers. CFX was used to study a single riser and two parallel risers at spacings of 2-8 D, considering the coupling effect. Because of the limited width of the water channel, only three riser spacings, 2 D, 3 D, and 4 D, were tested to validate the characteristics of the two parallel risers by comparison with the numerical simulation. The results indicate that the lift force changes significantly as the spacing increases, and that the lift force of the two parallel risers reaches its maximum at 3 D spacing. The vortex shedding of the risers at 3 D spacing shows that a variable velocity field with the same frequency as the vortex shedding is generated in the overlapped area, thus equalizing the period of the drag force to that of the lift force. It can be concluded that the interaction between the two parallel risers is significant when the distance between them is small, because the trajectory of the riser changes from an oval to a figure-eight curve as the spacing is increased. The phase difference of the lift force between the two risers also changes with the spacing.
A new parallel plate shear cell for in situ real-space measurements of complex fluids under shear flow.

PubMed

Wu, Yu Ling; Brand, Joost H J; van Gemert, Josephus L A; Verkerk, Jaap; Wisman, Hans; van Blaaderen, Alfons; Imhof, Arnout

2007-10-01

We developed and tested a parallel plate shear cell that can be mounted on top of an inverted microscope to perform confocal real-space measurements on complex fluids under shear. To follow structural changes in time, a plane of zero velocity is created by letting the plates move in opposite directions. The location of this plane is varied by changing the relative velocities of the plates. The gap width is variable between 20 and 200 µm with parallelism better than 1 µm. Such a small gap width enables us to examine the total sample thickness using high numerical aperture objective lenses. The achieved shear rates cover the range of 0.02-10^3 s^-1. This shear cell can apply an oscillatory shear with adjustable amplitude and frequency. The maximum travel of each plate equals 1 cm, so that strains up to 500 can be applied. For most complex fluids, an oscillatory shear with such a large amplitude can be regarded as a continuous shear. We measured the flow profile of a suspension of silica colloids in this shear cell. It was linear except for a small deviation caused by sedimentation. To demonstrate the excellent performance and capabilities of this new setup we examined shear induced crystallization and melting of concentrated suspensions of 1 µm diameter silica colloids.
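The geometric relations behind the figures in the preceding abstract can be written out as a short worked example. It assumes an ideal linear velocity profile between plates moving in opposite directions and takes strain to be plate travel divided by gap width; both assumptions and all function names are illustrative and are not taken from the paper.

```python
def shear_rate(v_top, v_bottom, gap):
    """Shear rate (1/s) for plates moving in opposite directions
    with speeds v_top and v_bottom (m/s) across a gap (m)."""
    return (v_top + v_bottom) / gap


def zero_velocity_height(v_top, v_bottom, gap):
    """Height above the bottom plate (m) at which a linear velocity
    profile between the counter-moving plates crosses zero."""
    return gap * v_bottom / (v_top + v_bottom)


def strain(travel, gap):
    """Strain taken as plate travel divided by gap width (assumed)."""
    return travel / gap


gap = 20e-6  # smallest gap quoted in the abstract, 20 micrometres
print(shear_rate(1e-3, 1e-3, gap))            # equal plate speeds of 1 mm/s
print(zero_velocity_height(1e-3, 3e-3, gap))  # faster bottom plate raises the plane
print(strain(1e-2, gap))                      # 1 cm travel at the smallest gap
```

Under these assumptions, 1 cm of travel across the 20 µm gap gives a strain of about 500, which agrees with the maximum quoted in the abstract. Changing the ratio of the plate speeds moves the zero-velocity plane without changing the shear rate, as long as the sum of the speeds is held fixed.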
CPU/GPU Computing for an Implicit Multi-Block Compressible Navier-Stokes Solver on Heterogeneous Platform

NASA Astrophysics Data System (ADS)

Deng, Liang; Bai, Hanli; Wang, Fang; Xu, Qingxin

2016-06-01

CPU/GPU computing allows scientists to tremendously accelerate their numerical codes. In this paper, we port and optimize a double precision alternating direction implicit (ADI) solver for three-dimensional compressible Navier-Stokes equations from our in-house Computational Fluid Dynamics (CFD) software on heterogeneous platform. First, we implement a full GPU version of the ADI solver to remove a lot of redundant data transfers between CPU and GPU, and then design two fine-grain schemes, namely “one-thread-one-point” and “one-thread-one-line”, to maximize the performance. Second, we present a dual-level parallelization scheme using the CPU/GPU collaborative model to exploit the computational resources of both multi-core CPUs and many-core GPUs within the heterogeneous platform. Finally, considering the fact that memory on a single node becomes inadequate when the simulation size grows, we present a tri-level hybrid programming pattern MPI-OpenMP-CUDA that merges fine-grain parallelism using OpenMP and CUDA threads with coarse-grain parallelism using MPI for inter-node communication. We also propose a strategy to overlap the computation with communication using the advanced features of CUDA and MPI programming. We obtain speedups of 6.0 for the ADI solver on one Tesla M2050 GPU in contrast to two Xeon X5670 CPUs. Scalability tests show that our implementation can offer significant performance improvement on heterogeneous platform.
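The "one-thread-one-line" idea in the preceding abstract rests on the fact that each grid line in an ADI sweep is an independent tridiagonal system. The sketch below shows this for a scalar model problem in plain Python. The scalar diffusion-like system, the coefficient `r`, and the function names are assumptions for illustration; the solver described in the abstract is for the compressible Navier-Stokes equations and runs in CUDA.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system with sub-diagonal a, main diagonal b,
    super-diagonal c and right-hand side d (a[0] and c[-1] are unused)."""
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):  # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def adi_sweep_x(rhs, r):
    """Implicit sweep along x for the model system
    (1 + 2r) u[i] - r u[i-1] - r u[i+1] = rhs[i] on every grid line.
    Each line is independent, so each could be handed to its own thread."""
    n = len(rhs[0])
    a, b, c = [-r] * n, [1 + 2 * r] * n, [-r] * n
    return [thomas(a, b, c, line) for line in rhs]


lines = adi_sweep_x([[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 0.0]], r=0.5)
print(lines)
```

In a GPU mapping of this sketch, the list comprehension over lines would become the thread index, while the recurrences inside `thomas` stay sequential within each thread.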
Long-term morphological developments of river channels separated by a longitudinal training wall

NASA Astrophysics Data System (ADS)

Le, T. B.; Crosato, A.; Uijttewaal, W. S. J.

2018-03-01

Rivers have been trained for centuries by channel narrowing and straightening. This caused important damages to their ecosystems, particularly around the bank areas. We analyze here the possibility to train rivers in a new way by subdividing their channel in main and ecological channel with a longitudinal training wall. The effectiveness of longitudinal training walls in achieving this goal and their long-term effects on the river morphology have not been thoroughly investigated yet. In particular, studies that assess the stability of the two parallel channels separated by the training wall are still lacking. This work studies the long-term morphological developments of river channels subdivided by a longitudinal training wall in the presence of steady alternate bars. This type of bars, common in alluvial rivers, alters the flow field and the sediment transport direction and might affect the stability of the bifurcating system. The work comprises both laboratory experiments and numerical simulations (Delft3D). The results show that a system of parallel channels divided by a longitudinal training wall has the tendency to become unstable. An important factor is found to be the location of the upstream termination of the longitudinal wall with respect to a neighboring steady bar. The relative widths of the two parallel channels separated by the wall and variable discharge do not substantially change the final evolution of the system.
OpenACC acceleration of an unstructured CFD solver based on a reconstructed discontinuous Galerkin method for compressible flows

DOE PAGES

Xia, Yidong; Lou, Jialin; Luo, Hong; ...

2015-02-09

Here, an OpenACC directive-based graphics processing unit (GPU) parallel scheme is presented for solving the compressible Navier–Stokes equations on 3D hybrid unstructured grids with a third-order reconstructed discontinuous Galerkin method. The developed scheme requires the minimum code intrusion and algorithm alteration for upgrading a legacy solver with the GPU computing capability at very little extra effort in programming, which leads to a unified and portable code development strategy. A face coloring algorithm is adopted to eliminate the memory contention because of the threading of internal and boundary face integrals. A number of flow problems are presented to verify the implementation of the developed scheme. Timing measurements were obtained by running the resulting GPU code on one Nvidia Tesla K20c GPU card (Nvidia Corporation, Santa Clara, CA, USA) and compared with those obtained by running the equivalent Message Passing Interface (MPI) parallel CPU code on a compute node (consisting of two AMD Opteron 6128 eight-core CPUs (Advanced Micro Devices, Inc., Sunnyvale, CA, USA)). Speedup factors of up to 24× and 1.6× for the GPU code were achieved with respect to one and 16 CPU cores, respectively. The numerical results indicate that this OpenACC-based parallel scheme is an effective and extensible approach to port unstructured high-order CFD solvers to GPU computing.
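The two speedup factors quoted in the preceding abstract also say something about the CPU baseline. The short calculation below assumes that both factors refer to the same test case, which the phrase "up to" does not guarantee, so the derived numbers are indicative only; the variable names are illustrative.

```python
gpu_vs_1_core = 24.0    # GPU speedup relative to one CPU core
gpu_vs_16_cores = 1.6   # GPU speedup relative to 16 CPU cores
cores = 16

# If T1, T16 and Tgpu are the run times, then T1/Tgpu = 24 and
# T16/Tgpu = 1.6, so the MPI code's own speedup is T1/T16 = 24/1.6.
mpi_speedup = gpu_vs_1_core / gpu_vs_16_cores
mpi_efficiency = mpi_speedup / cores

print(round(mpi_speedup, 2), round(mpi_efficiency, 4))
```

Under that assumption the MPI code runs about 15 times faster on 16 cores than on one, a parallel efficiency of roughly 94%. This suggests that the 1.6× figure reflects a well-scaled CPU baseline and not a weak one.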
PARALLEL PERTURBATION MODEL FOR CYCLE TO CYCLE VARIABILITY PPM4CCV

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ameen, Muhsin Mohammed; Som, Sibendu

This code consists of a Fortran 90 implementation of the parallel perturbation model to compute cyclic variability in spark ignition (SI) engines. Cycle-to-cycle variability (CCV) is known to be detrimental to SI engine operation, resulting in partial burn and knock and in an overall reduction in the reliability of the engine. Numerical prediction of CCV in SI engines is extremely challenging for two key reasons: (i) high-fidelity methods such as large eddy simulation (LES) are required to accurately capture the in-cylinder turbulent flow field, and (ii) CCV is experienced over long timescales and hence the simulations need to be performed for hundreds of consecutive cycles. In the new technique, the strategy is to perform multiple parallel simulations, each of which encompasses 2-3 cycles, by effectively perturbing the simulation parameters such as the initial and boundary conditions. The PPM4CCV code is a pre-processing code and can be coupled with any engine CFD code. PPM4CCV was coupled with the Converge CFD code and a 10-fold speedup was demonstrated over the conventional multi-cycle LES in predicting the CCV for a motored engine. The model is also being applied to fired engines, including port fuel injected (PFI) and direct injection spark ignition engines, and the preliminary results are very encouraging.
Smart Optical Material Characterization System and Method

NASA Technical Reports Server (NTRS)

Choi, Sang Hyouk (Inventor); Park, Yeonjoon (Inventor)

2015-01-01

Disclosed is a system and method for characterizing optical materials, using steps and equipment for generating a coherent laser light, filtering the light to remove high order spatial components, collecting the filtered light and forming a parallel light beam, splitting the parallel beam into a first direction and a second direction wherein the parallel beam travelling in the second direction travels toward the material sample so that the parallel beam passes through the sample, applying various physical quantities to the sample, reflecting the beam travelling in the first direction to produce a first reflected beam, reflecting the beam that passes through the sample to produce a second reflected beam that travels back through the sample, combining the second reflected beam after it travels back through the sample with the first reflected beam, sensing the light beam produced by combining the first and second reflected beams, and processing the sensed beam to determine sample characteristics and properties.
Parameter estimation for stiff deterministic dynamical systems via ensemble Kalman filter

NASA Astrophysics Data System (ADS)

Arnold, Andrea; Calvetti, Daniela; Somersalo, Erkki

2014-10-01

A commonly encountered problem in numerous areas of applications is to estimate the unknown coefficients of a dynamical system from direct or indirect observations at discrete times of some of the components of the state vector. A related problem is to estimate unobserved components of the state. An egregious example of such a problem is provided by metabolic models, in which the numerous model parameters and the concentrations of the metabolites in tissue are to be estimated from concentration data in the blood. A popular method for addressing similar questions in stochastic and turbulent dynamics is the ensemble Kalman filter (EnKF), a particle-based filtering method that generalizes classical Kalman filtering. In this work, we adapt the EnKF algorithm for deterministic systems in which the numerical approximation error is interpreted as a stochastic drift with variance based on classical error estimates of numerical integrators. This approach, which is particularly suitable for stiff systems where the stiffness may depend on the parameters, allows us to effectively exploit the parallel nature of particle methods. Moreover, we demonstrate how spatial prior information about the state vector, which helps the stability of the computed solution, can be incorporated into the filter. The viability of the approach is shown by computed examples, including a metabolic system modeling an ischemic episode in skeletal muscle, with a high number of unknown parameters.
Numerical tension adjustment of x-ray membrane to represent goat skin kompang

NASA Astrophysics Data System (ADS)

Siswanto, Waluyo Adi; Abdullah, Muhammad Syiddiq Bin

2017-04-01

This paper presents a numerical membrane model of traditional musical instrument kompang that will be used to find the parameter of membrane tension of x-ray membrane representing the classical goat-skin membrane of kompang. In this study, the experiment towards the kompang is first conducted in an acoustical anechoic enclosure and in parallel a mathematical model of the kompang membrane is developed to simulate the vibration of the kompang membrane in polar coordinate by implementing Fourier-Bessel wave function. The wave equation in polar direction in mode 0,1 is applied to provide the corresponding natural frequencies of the circular membrane. The value of initial and boundary conditions in the function is determined from experiment to allow the correct development of numerical equation. The numerical mathematical model is coded in SMath for the accurate numerical analysis as well as the plotting tool. Two kompang membrane cases with different membrane materials, i.e. goat skin and x-ray film membranes with fixed radius of 0.1 m are used in the experiment. An alternative of kompang's membrane made of x-ray film with the appropriate tension setting can be used to represent the sound of traditional goat-skin kompang. The tension setting of the membrane to resemble the goat-skin is 24N. An effective numerical tool has been developed to help kompang maker to set the tension of x-ray membrane. In the future application, any traditional kompang with different size can be replaced by another membrane material if the tension is set to the correct tension value. The developed numerical tool is useful and handy to calculate the tension of the alternative membrane material.
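As a rough illustration of the mode (0,1) relation referred to in the abstract above, the textbook result for an ideal clamped circular membrane is f = (alpha_01 / (2 pi a)) sqrt(T / sigma), where alpha_01 is the first zero of the Bessel function J0, a is the radius, T is the tension per unit length, and sigma is the mass per unit area. The sketch below is not the authors' SMath model. It assumes the quoted tension is a tension per unit length in N/m, and the surface density value is a purely hypothetical placeholder.

```python
import math

ALPHA_01 = 2.404825557695773  # first zero of the Bessel function J0


def fundamental_frequency(tension, radius, surface_density):
    """Frequency in Hz of the (0,1) mode of an ideal clamped circular membrane.

    tension: N/m, radius: m, surface_density: kg/m^2.
    """
    wave_speed = math.sqrt(tension / surface_density)
    return ALPHA_01 * wave_speed / (2.0 * math.pi * radius)


def tension_for_frequency(freq, radius, surface_density):
    """Invert the relation: tension needed to reach a target (0,1) frequency."""
    wave_speed = 2.0 * math.pi * radius * freq / ALPHA_01
    return surface_density * wave_speed ** 2


# Radius 0.1 m is from the abstract; 0.2 kg/m^2 is an assumed, illustrative density.
f01 = fundamental_frequency(24.0, 0.1, 0.2)
t_back = tension_for_frequency(f01, 0.1, 0.2)
```

Because the frequency scales with the square root of T / sigma, matching a heavier or lighter replacement membrane to the same pitch only requires scaling the tension in proportion to the surface density.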
Numerical Tension Adjustment of X-Ray Membrane to Represent Goat Skin Kompang

NASA Astrophysics Data System (ADS)

Syiddiq, M.; Siswanto, W. A.

2017-01-01

This paper presents a numerical membrane model of traditional musical instrument kompang that will be used to find the parameter of membrane tension of x-ray membrane representing the classical goat-skin membrane of kompang. In this study, the experiment towards the kompang is first conducted in an acoustical anechoic enclosure and in parallel a mathematical model of the kompang membrane is developed to simulate the vibration of the kompang membrane in polar coordinate by implementing Fourier-Bessel wave function. The wave equation in polar direction in mode 0,1 is applied to provide the corresponding natural frequencies of the circular membrane. The value of initial and boundary conditions in the function is determined from experiment to allow the correct development of numerical equation. The numerical mathematical model is coded in SMath for the accurate numerical analysis as well as the plotting tool. Two kompang membrane cases with different membrane materials, i.e. goat skin and x-ray film membranes with fixed radius of 0.1 m are used in the experiment. An alternative of kompang’s membrane made of x-ray film with the appropriate tension setting can be used to represent the sound of traditional goat-skin kompang. The tension setting of the membrane to resemble the goat-skin is 24N. An effective numerical tool has been used to help kompang maker to set the tension of x-ray membrane. In the future application, any traditional kompang with different size can be replaced by another membrane material if the tension is set to the correct tension value. The numerical tool used is useful and handy to calculate the tension of the alternative membrane material.
Automatic Management of Parallel and Distributed System Resources

NASA Technical Reports Server (NTRS)

Yan, Jerry; Ngai, Tin Fook; Lundstrom, Stephen F.

1990-01-01

Viewgraphs on automatic management of parallel and distributed system resources are presented. Topics covered include: parallel applications; intelligent management of multiprocessing systems; performance evaluation of parallel architecture; dynamic concurrent programs; compiler-directed system approach; lattice gaseous cellular automata; and sparse matrix Cholesky factorization.

Bilingual parallel programming

DOE Office of Scientific and Technical Information (OSTI.GOV)

Foster, I.; Overbeek, R.

1990-01-01

Numerous experiments have demonstrated that computationally intensive algorithms support adequate parallelism to exploit the potential of large parallel machines. Yet successful parallel implementations of serious applications are rare. The limiting factor is clearly programming technology. None of the approaches to parallel programming that have been proposed to date -- whether parallelizing compilers, language extensions, or new concurrent languages -- seem to adequately address the central problems of portability, expressiveness, efficiency, and compatibility with existing software. In this paper, we advocate an alternative approach to parallel programming based on what we call bilingual programming. We present evidence that this approach provides an effective solution to parallel programming problems. The key idea in bilingual programming is to construct the upper levels of applications in a high-level language while coding selected low-level components in low-level languages. This approach permits the advantages of a high-level notation (expressiveness, elegance, conciseness) to be obtained without the cost in performance normally associated with high-level approaches. In addition, it provides a natural framework for reusing existing code.
Numerical investigation of two interacting parallel thruster-plumes and comparison to experiment

NASA Astrophysics Data System (ADS)

Grabe, Martin; Holz, André; Ziegenhagen, Stefan; Hannemann, Klaus

2014-12-01

Clusters of orbital thrusters are an attractive option to achieve graduated thrust levels and increased redundancy with available hardware, but the heavily under-expanded plumes of chemical attitude control thrusters placed in close proximity will interact, leading to a local amplification of downstream fluxes and of back-flow onto the spacecraft. The interaction of two similar, parallel, axi-symmetric cold-gas model thrusters has recently been studied in the DLR High-Vacuum Plume Test Facility STG under space-like vacuum conditions, employing a Patterson-type impact pressure probe with slot orifice. We reproduce a selection of these experiments numerically, and emphasise that a comparison of numerical results to the measured data is not straight-forward. The signal of the probe used in the experiments must be interpreted according to the degree of rarefaction and local flow Mach number, and both vary dramatically throughout the flow-field. We present a procedure to reconstruct the probe signal by post-processing the numerically obtained flow-field data and show that agreement to the experimental results is then improved. Features of the investigated cold-gas thruster plume interaction are discussed on the basis of the numerical results.
FoSSI: the family of simplified solver interfaces for the rapid development of parallel numerical atmosphere and ocean models

NASA Astrophysics Data System (ADS)

Frickenhaus, Stephan; Hiller, Wolfgang; Best, Meike

The portable software FoSSI is introduced that—in combination with additional free solver software packages—allows for an efficient and scalable parallel solution of large sparse linear equations systems arising in finite element model codes. FoSSI is intended to support rapid model code development, completely hiding the complexity of the underlying solver packages. In particular, the model developer need not be an expert in parallelization and is yet free to switch between different solver packages by simple modifications of the interface call. FoSSI offers an efficient and easy, yet flexible interface to several parallel solvers, most of them available on the web, such as PETSC, AZTEC, MUMPS, PILUT and HYPRE. FoSSI makes use of the concept of handles for vectors, matrices, preconditioners and solvers, that is frequently used in solver libraries. Hence, FoSSI allows for a flexible treatment of several linear equations systems and associated preconditioners at the same time, even in parallel on separate MPI-communicators. The second special feature in FoSSI is the task specifier, being a combination of keywords, each configuring a certain phase in the solver setup. This enables the user to control a solver over one unique subroutine. Furthermore, FoSSI has rather similar features for all solvers, making a fast solver intercomparison or exchange an easy task. FoSSI is a community software, proven in an adaptive 2D-atmosphere model and a 3D-primitive equation ocean model, both formulated in finite elements. The present paper discusses perspectives of an OpenMP-implementation of parallel iterative solvers based on domain decomposition methods. This approach to OpenMP solvers is rather attractive, as the code for domain-local operations of factorization, preconditioning and matrix-vector product can be readily taken from a sequential implementation that is also suitable to be used in an MPI-variant. 
Code development in this direction is in an advanced state under the name ScOPES: the Scalable Open Parallel sparse linear Equations Solver.
A New Numerical Scheme for Cosmic-Ray Transport

NASA Astrophysics Data System (ADS)

Jiang, Yan-Fei; Oh, S. Peng

2018-02-01

Numerical solutions of the cosmic-ray (CR) magnetohydrodynamic equations are dogged by a powerful numerical instability, which arises from the constraint that CRs can only stream down their gradient. The standard cure is to regularize by adding artificial diffusion. Besides introducing ad hoc smoothing, this has a significant negative impact on either computational cost or complexity and parallel scalings. We describe a new numerical algorithm for CR transport, with close parallels to two-moment methods for radiative transfer under the reduced speed of light approximation. It stably and robustly handles CR streaming without any artificial diffusion. It allows for both isotropic and field-aligned CR streaming and diffusion, with arbitrary streaming and diffusion coefficients. CR transport is handled explicitly, while source terms are handled implicitly. The overall time step scales linearly with resolution (even when computing CR diffusion) and has a perfect parallel scaling. It is given by the standard Courant condition with respect to a constant maximum velocity over the entire simulation domain. The computational cost is comparable to that of solving the ideal MHD equation. We demonstrate the accuracy and stability of this new scheme with a wide variety of tests, including anisotropic streaming and diffusion tests, CR-modified shocks, CR-driven blast waves, and CR transport in multiphase media. The new algorithm opens doors to much more ambitious and hitherto intractable calculations of CR physics in galaxies and galaxy clusters. It can also be applied to other physical processes with similar mathematical structure, such as saturated, anisotropic heat conduction.
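The claim in the abstract above that the time step scales linearly with resolution can be made concrete with a minimal sketch. It contrasts a Courant-type limit, dt proportional to dx / v_max, with the familiar dt proportional to dx^2 / kappa limit of an explicit diffusion update. The function names, the CFL number, and the sample values are illustrative assumptions and are not taken from the paper.

```python
def courant_time_step(dx, v_max, cfl=0.5):
    """Courant-limited step: shrinks linearly with the cell size dx."""
    return cfl * dx / v_max


def explicit_diffusion_time_step(dx, kappa, safety=0.5):
    """Explicit 1D diffusion limit: shrinks quadratically with dx."""
    return safety * dx * dx / kappa


# Halving the cell size (doubling the resolution), with assumed v_max and kappa.
dt_coarse = courant_time_step(0.5, v_max=10.0)
dt_fine = courant_time_step(0.25, v_max=10.0)
dd_coarse = explicit_diffusion_time_step(0.5, kappa=1.0)
dd_fine = explicit_diffusion_time_step(0.25, kappa=1.0)
```

Under these assumptions, doubling the resolution halves the Courant-limited step but quarters the explicit diffusive one, which is why a purely Courant-limited scheme is attractive at high resolution.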
Acoustic metacages for sound shielding with steady air flow

NASA Astrophysics Data System (ADS)

Shen, Chen; Xie, Yangbo; Li, Junfei; Cummer, Steven A.; Jing, Yun

2018-03-01

Conventional sound shielding structures typically prevent fluid transport between the exterior and interior. A design of a two-dimensional acoustic metacage with subwavelength thickness which can shield acoustic waves from all directions while allowing steady fluid flow is presented in this paper. The structure is designed based on acoustic gradient-index metasurfaces composed of open channels and shunted Helmholtz resonators. In-plane sound at an arbitrary angle of incidence is reflected due to the strong parallel momentum on the metacage surface, which leads to low sound transmission through the metacage. The performance of the proposed metacage is verified by numerical simulations and measurements on a three-dimensional printed prototype. The acoustic metacage has potential applications in sound insulation where steady fluid flow is necessary or advantageous.
Effect of electron thermal anisotropy on the kinetic cross-field streaming instability

NASA Technical Reports Server (NTRS)

Tsai, S. T.; Tanaka, M.; Gaffey, J. D., Jr.; Wu, C. S.; Da Jornada, E. H.; Ziebell, L. F.

1984-01-01

The investigation of the kinetic cross-field streaming instability, motivated by the research of collisionless shock waves and previously studied by Wu et al. (1983), is discussed more fully. Since in the ramp region of a quasi-perpendicular shock electrons can be preferentially heated in the direction transverse to the ambient magnetic field, it is both desirable and necessary to include the effect of the thermal anisotropy on the instability associated with a shock. It is found that Te-perpendicular greater than Te-parallel can significantly enhance the peak growth rate of the cross-field streaming instability when the electron beta is sufficiently high. Furthermore, the present analysis also improves the analytical and numerical solutions previously obtained.
Discontinuous Galerkin Methods and High-Speed Turbulent Flows

NASA Astrophysics Data System (ADS)

Atak, Muhammed; Larsson, Johan; Munz, Claus-Dieter

2014-11-01

Discontinuous Galerkin methods gain increasing importance within the CFD community as they combine arbitrary high order of accuracy in complex geometries with parallel efficiency. Particularly the discontinuous Galerkin spectral element method (DGSEM) is a promising candidate for both the direct numerical simulation (DNS) and large eddy simulation (LES) of turbulent flows due to its excellent scaling attributes. In this talk, we present a DNS of a compressible turbulent boundary layer along a flat plate at a free-stream Mach number of M = 2.67 and assess the computational efficiency of the DGSEM at performing high-fidelity simulations of both transitional and turbulent boundary layers. We compare the accuracy of the results as well as the computational performance to results using a high order finite difference method.
Portable LQCD Monte Carlo code using OpenACC

NASA Astrophysics Data System (ADS)

Bonati, Claudio; Calore, Enrico; Coscetti, Simone; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Fabio Schifano, Sebastiano; Silvi, Giorgio; Tripiccione, Raffaele

2018-03-01

Varying from multi-core CPU processors to many-core GPUs, the present scenario of HPC architectures is extremely heterogeneous. In this context, code portability is increasingly important for easy maintainability of applications; this is relevant in scientific computing where code changes are numerous and frequent. In this talk we present the design and optimization of a state-of-the-art production level LQCD Monte Carlo application, using the OpenACC directives model. OpenACC aims to abstract parallel programming to a descriptive level, where programmers do not need to specify the mapping of the code on the target machine. We describe the OpenACC implementation and show that the same code is able to target different architectures, including state-of-the-art CPUs and GPUs.
Analysis of rapid increase in the plasma density during the ramp-up phase in a radio frequency negative ion source by large-scale particle simulation

NASA Astrophysics Data System (ADS)

Yasumoto, M.; Ohta, M.; Kawamura, Y.; Hatayama, A.

2014-02-01

Numerical simulations are becoming useful for developing RF-ICP (Radio Frequency Inductively Coupled Plasma) negative ion sources. We are developing and parallelizing a two-dimensional, three-velocity electromagnetic Particle-In-Cell code. The result shows a rapid increase in the electron density during the density ramp-up phase. A radial electric field due to the space charge is produced as the electron density increases, and the electron transport in the radial direction is suppressed. As a result, electrons stay for a long period in the region where the inductive electric field is strong, and this leads to efficient electron acceleration and a rapid increase in the electron density.
Parallel solution of high-order numerical schemes for solving incompressible flows

NASA Technical Reports Server (NTRS)

Milner, Edward J.; Lin, Avi; Liou, May-Fun; Blech, Richard A.

1993-01-01

A new parallel numerical scheme for solving incompressible steady-state flows is presented. The algorithm uses a finite-difference approach to solving the Navier-Stokes equations. The algorithms are scalable and expandable. They may be used with only two processors or with as many processors as are available. The code is general and expandable. Any size grid may be used. Four processors of the NASA LeRC Hypercluster were used to solve for steady-state flow in a driven square cavity. The Hypercluster was configured in a distributed-memory, hypercube-like architecture. By using a 50-by-50 finite-difference solution grid, an efficiency of 74 percent (a speedup of 2.96) was obtained.
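The efficiency figure quoted in the abstract above follows from the usual definition, efficiency = speedup / number of processors. The short sketch below reproduces that arithmetic. The function names are hypothetical, and the timing values in the second call are made up purely to show the speedup definition.

```python
def speedup(serial_time, parallel_time):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return serial_time / parallel_time


def parallel_efficiency(speedup_value, n_processors):
    """Efficiency: fraction of ideal linear speedup actually achieved."""
    return speedup_value / n_processors


# Figures from the abstract: speedup 2.96 on 4 processors.
eff = parallel_efficiency(2.96, 4)

# Hypothetical timings (not from the source) giving the same speedup.
s = speedup(serial_time=296.0, parallel_time=100.0)
```

With the abstract's numbers this gives 2.96 / 4 = 0.74, the 74 percent reported.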
Full-field 3D deformation measurement: comparison between speckle phase and displacement evaluation.

PubMed

Khodadad, Davood; Singh, Alok Kumar; Pedrini, Giancarlo; Sjödahl, Mikael

2016-09-20

The objective of this paper is to describe a full-field deformation measurement method based on 3D speckle displacements. The deformation is evaluated from the slope of the speckle displacement function that connects the different reconstruction planes. For our experiment, a symmetrical arrangement with four illuminations parallel to the planes (x,z) and (y,z) was used. Four sets of speckle patterns were sequentially recorded by illuminating an object from the four directions, respectively. A single camera is used to record the holograms before and after deformations. Digital speckle photography is then used to calculate relative speckle displacements in each direction between two numerically propagated planes. The 3D speckle displacements vector is calculated as a combination of the speckle displacements from the holograms recorded in each illumination direction. Using the speckle displacements, problems associated with rigid body movements and phase wrapping are avoided. In our experiment, the procedure is shown to give the theoretical accuracy of 0.17 pixels yielding the accuracy of 2×10^-3 in the measurement of deformation gradients.
Drawing Parallels in Search of Educational Equity: A Multicultural Education Delegation to China Looks Outside to See Within

ERIC Educational Resources Information Center

Carjuzaa, Jioanna; Fenimore-Smith, J. Kay; Fuller, Ethlyn Davis; Howe, William A.; Kugler, Eileen; London, Arcenia P.; Ruiz, Ivette; Shin, Barbara

2008-01-01

In 2004, a professional delegation of multicultural educators visited the People's Republic of China to explore how diversity issues are addressed and how students are prepared for entry into the international workforce. The delegation, sponsored by the People to People Ambassador Programs, observed numerous parallels to the American system of…
Parallel database search and prime factorization with magnonic holographic memory devices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Khitun, Alexander

In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements, which allows phase patterns of an arbitrary form to be produced. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over the conventional digital logic circuits in special task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.
Parallel database search and prime factorization with magnonic holographic memory devices

NASA Astrophysics Data System (ADS)

Khitun, Alexander

2015-12-01

In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements, which allows phase patterns of an arbitrary form to be produced. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over the conventional digital logic circuits in special task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.
A Green's function method for local and non-local parallel transport in general magnetic fields

NASA Astrophysics Data System (ADS)

Del-Castillo-Negrete, Diego; Chacón, Luis

2009-11-01

The study of transport in magnetized plasmas is a problem of fundamental interest in controlled fusion and astrophysics research. Three issues make this problem particularly challenging: (i) The extreme anisotropy between the parallel (i.e., along the magnetic field), χ_∥, and the perpendicular, χ_⊥, conductivities (χ_∥/χ_⊥ may exceed 10^10 in fusion plasmas); (ii) Magnetic field line chaos, which in general complicates (and may preclude) the construction of magnetic field line coordinates; and (iii) Nonlocal parallel transport in the limit of small collisionality. Motivated by these issues, we present a Lagrangian Green's function method to solve the local and non-local parallel transport equation applicable to integrable and chaotic magnetic fields. The numerical implementation employs a volume-preserving field-line integrator [Finn and Chacón, Phys. Plasmas, 12 (2005)] for an accurate representation of the magnetic field lines regardless of the level of stochasticity. The general formalism and its algorithmic properties are discussed along with illustrative analytical and numerical examples. Problems of particular interest include: the departures from the Rochester-Rosenbluth diffusive scaling in the weak magnetic chaos regime, the interplay between non-locality and chaos, and the robustness of transport barriers in reverse shear configurations.
CASCADE AND DAMPING OF ALFVEN-CYCLOTRON FLUCTUATIONS: APPLICATION TO SOLAR WIND TURBULENCE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang Yanwei; Petrosian, Vahe; Liu Siming

2009-06-10

It is well recognized that the presence of magnetic fields will lead to anisotropic energy cascade and dissipation of astrophysical turbulence. With the diffusion approximation and linear dissipation rates, we study the cascade and damping of Alfven-cyclotron fluctuations in solar plasmas numerically for two diagonal diffusion tensors, one (isotropic) with identical components for the parallel and perpendicular directions (with respect to the magnetic field) and one with different components (nonisotropic). It is found that for the isotropic case the steady-state turbulence spectra are nearly isotropic in the inertial range and can be fitted by a single power-law function with a spectral index of -3/2, similar to the Iroshnikov-Kraichnan phenomenology, while for the nonisotropic case the spectra vary greatly with the direction of propagation. The energy fluxes in both cases are much higher in the perpendicular direction than in the parallel direction due to the angular dependence (or inhomogeneity) of the components. In addition, beyond the MHD regime the kinetic effects make the spectrum softer at higher wavenumbers. In the dissipation range the turbulence spectrum cuts off at the wavenumber, where the damping rate becomes comparable to the cascade rate, and the cutoff wavenumber changes with the wave propagation direction. The angle-averaged turbulence spectrum of the isotropic model resembles a broken power law, which cuts off at the maximum of the cutoff wavenumbers or the ^4He cyclotron frequency. Taking into account the Doppler effects, the model naturally reproduces the broken power-law turbulence spectra observed in the solar wind and predicts that a higher break frequency always comes along with a softer dissipation range spectrum that may be caused by the increase of the turbulence intensity, the reciprocal of the plasma β_p, and/or the angle between the solar wind velocity and the mean magnetic field. These predictions can be tested by detailed comparisons with more accurate observations.
Determination of backbone chain direction of PDA using FFM

NASA Astrophysics Data System (ADS)

Jo, Sadaharu; Okamoto, Kentaro; Takenaga, Mitsuru

2010-01-01

The effect of backbone chains on friction force was investigated on both Langmuir-Blodgett (LB) films of 10,12-heptacosadiynoic acid and the (0 1 0) surfaces of single crystals of 2,4-hexadiene-1,6-diol using friction force microscopy (FFM). It was observed that friction force decreased when the scanning direction was parallel to the [0 0 1] direction in both samples. Moreover, friction force decreased when the scanning direction was parallel to the crystallographic [1 0 2], [1 0 1], [1 0 0] and [1 0 1¯] directions in only the single crystals. For the LB films, the [0 0 1] direction corresponds to the backbone chain direction of 10,12-heptacosadiynoic acid. For the single crystals, both the [0 0 1] and [1 0 1] directions correspond to the backbone chain direction, and the [1 0 2], [1 0 0] and [1 0 1¯] directions correspond to the low-index crystallographic direction. In both the LB films and single crystals, the friction force was minimized when the directions of scanning and the backbone chain were parallel.
Spatial and temporal accuracy of asynchrony-tolerant finite difference schemes for partial differential equations at extreme scales

NASA Astrophysics Data System (ADS)

Kumari, Komal; Donzis, Diego

2017-11-01

Highly resolved computational simulations on massively parallel machines are critical in understanding the physics of a vast number of complex phenomena in nature governed by partial differential equations. Simulations at extreme levels of parallelism present many challenges, with communication between processing elements (PEs) being a major bottleneck. In order to fully exploit the computational power of exascale machines one needs to devise numerical schemes that relax global synchronizations across PEs. These asynchronous computations, however, have a degrading effect on the accuracy of standard numerical schemes. We have developed asynchrony-tolerant (AT) schemes that maintain order of accuracy despite relaxed communications. We show, analytically and numerically, that these schemes retain their numerical properties with multi-step higher order temporal Runge-Kutta schemes. We also show that for a range of optimized parameters, the computation time and error for AT schemes are less than those of their synchronous counterparts. The stability of the AT schemes, which depends upon the history and random nature of the delays, is also discussed. Support from NSF is gratefully acknowledged.
Experimental Studies of the Interaction Between a Parallel Shear Flow and a Directionally-Solidifying Front

NASA Technical Reports Server (NTRS)

Zhang, Meng; Maxworthy, Tony

1999-01-01

It has long been recognized that flow in the melt can have a profound influence on the dynamics of a solidifying interface and hence the quality of the solid material. In particular, flow affects the heat and mass transfer, and causes spatial and temporal variations in the flow and melt composition. This results in a crystal with nonuniform physical properties. Flow can be generated by buoyancy, expansion or contraction upon phase change, and thermo-soluto capillary effects. In general, these flows cannot be avoided and can have an adverse effect on the stability of the crystal structures. This motivates crystal growth experiments in a microgravity environment, where buoyancy-driven convection is significantly suppressed. However, transient accelerations (g-jitter) caused by the acceleration of the spacecraft can affect the melt, while convection generated from effects other than buoyancy remains important. Rather than bemoan the presence of convection as a source of interfacial instability, Hurle in the 1960s suggested that flow in the melt, either forced or natural convection, might be used to stabilize the interface. Delves considered the imposition of both a parabolic velocity profile and a Blasius boundary layer flow over the interface. He concluded that fast stirring could stabilize the interface to perturbations whose wave vector is in the direction of the fluid velocity. Forth and Wheeler considered the effect of the asymptotic suction boundary layer profile. They showed that the effect of the shear flow was to generate travelling waves parallel to the flow with a speed proportional to the Reynolds number. There have been few quantitative, experimental works reporting on the coupling effect of fluid flow and morphological instabilities. Huang studied plane Couette flow over cells and dendrites. It was found that this flow could greatly enhance the planar stability and even induce the cell-planar transition. A rotating impeller was buried inside the sample cell, driven by an outside rotating magnet, in order to generate the flow. However, it appears that this was not a well-controlled flow and may also have been unsteady. In the present experimental study, we examine how a forced parallel shear flow in a Hele-Shaw cell interacts with the directionally solidifying crystal interface. A comparison of the experimental data shows that the parallel shear flow in a Hele-Shaw cell has a strong stabilizing effect on the planar interface by damping the existing initial perturbations. The flow also shows a stabilizing effect on the cellular interface by slightly reducing the exponential growth rate of cells. The left-right symmetry of cells is broken by the flow, with cells tilting toward the incoming flow direction. The tilting angle increases with the velocity ratio. The experimental results are explained through the parallel flow effect on lateral solute transport. The phenomenon of cells tilting against the flow is consistent with the numerical result of Dantzig and Chao.
Thermal conductivity anisotropy of rocks

NASA Astrophysics Data System (ADS)

Lee, Youngmin; Keehm, Youngseuk; Shin, Sang Ho

2013-04-01

The interior heat of the lithosphere of the Earth is mainly transferred by conduction, which depends on the thermal conductivity of rocks. Many sedimentary and metamorphic rocks have thermal conductivity anisotropy, i.e. heat is preferentially transferred in the direction parallel to the bedding and foliation of these rocks. Deming (JGR, 1994) proposed an empirical relationship between K(perp) and anisotropy (K(par)/K(perp)) using 89 measurements on rock samples from the literature. In Deming's model, thermal conductivity is almost isotropic for K(perp) > 4 W/mK, but anisotropy increases exponentially with decreasing K(perp), with a final anisotropy of ~2.5 at K(perp) < 1.0 W/mK. However, Davis et al. (JGR, 2007) argued that there is little evidence for Deming's suggestion that thermal conductivity anisotropy of all rocks increases systematically to about 2.5 for rocks with low thermal conductivity. Davis et al. argued that Deming's increase in anisotropy for 1 < K(perp) < 4 W/mK with decreasing K(perp) could be due to fractures filled with air or water, which cause thermal conductivity anisotropy. To test Deming's suggestion and Davis et al.'s argument on thermal conductivity anisotropy, we measured thermal conductivity parallel (K(par)) and perpendicular (K(perp)) to bedding or foliation and performed analytical and numerical modeling. Our measurements on 53 rock samples show anisotropy ranging from 0.79 to 1.36 for 1.84 < K(perp) < 4.06 W/mK. Analytical models show that anisotropy can increase or stay the same in the range of 1 < K(perp) < 4 W/mK. Numerical modeling for gneiss shows that anisotropy ranges from 1.21 to 1.36 for 2.5 < K(perp) < 4.8 W/mK. Further numerical modeling with interbedded coal layers in high thermal conductivity rocks (3.5 W/mK) shows an anisotropy of 1.87 when K(perp) is 1.7 W/mK. Finally, numerical modeling with fractures indicates that fractures do not seem to affect thermal conductivity anisotropy significantly. In conclusion, our preliminary results imply that thermal conductivity anisotropy can increase or stay at a low value in the range of 1.0 < K(perp) < 4.0 W/mK. Both cases are shown to be possible through lab measurements and analytical and numerical modeling.
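The abstract does not say which analytical models were used. As an illustration only, the sketch below uses the standard idealization of perfectly flat, parallel layers: heat flowing along the bedding sees the layers in parallel (thickness-weighted arithmetic mean), while heat flowing across the bedding sees them in series (thickness-weighted harmonic mean). The function name, the coal conductivity of 0.3 W/mK, and the 10% coal fraction are assumptions chosen for the example, not values taken from the study.

```python
def layered_conductivity(fractions, conductivities):
    """Return (K_par, K_perp, anisotropy) for a stack of flat, parallel layers.

    fractions      -- relative layer thicknesses (normalized internally)
    conductivities -- thermal conductivity of each layer in W/mK
    """
    total = sum(fractions)
    weights = [f / total for f in fractions]
    # Parallel to bedding: layers conduct side by side (arithmetic mean).
    k_par = sum(w * k for w, k in zip(weights, conductivities))
    # Perpendicular to bedding: layers conduct in series (harmonic mean).
    k_perp = 1.0 / sum(w / k for w, k in zip(weights, conductivities))
    return k_par, k_perp, k_par / k_perp


# Hypothetical stack: 90% host rock at 3.5 W/mK, 10% coal at an assumed 0.3 W/mK.
k_par, k_perp, anisotropy = layered_conductivity([0.9, 0.1], [3.5, 0.3])
print(f"K_par = {k_par:.2f} W/mK, K_perp = {k_perp:.2f} W/mK, "
      f"anisotropy = {anisotropy:.2f}")
```

With these assumed inputs the layered model gives K(perp) of about 1.69 W/mK and an anisotropy of about 1.88. A thin low-conductivity layer lowers K(perp) far more than K(par), which is how interbedding can produce high anisotropy at low K(perp). This agreement in magnitude with the figures quoted above should not be read as a reproduction of the authors' numerical model.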

A high-speed linear algebra library with automatic parallelism

NASA Technical Reports Server (NTRS)

Boucher, Michael L.

1994-01-01

Parallel or distributed processing is key to getting the highest performance from workstations. However, designing and implementing efficient parallel algorithms is difficult and error-prone. It is even more difficult to write code that is both portable to and efficient on many different computers. Finally, it is harder still to satisfy the above requirements and include the reliability and ease of use required of commercial software intended for use in a production environment. As a result, the application of parallel processing technology to commercial software has been extremely limited, even though there are numerous computationally demanding programs that would significantly benefit from parallel processing. This paper describes DSSLIB, which is a library of subroutines that perform many of the time-consuming computations in engineering and scientific software. DSSLIB combines the high efficiency and speed of parallel computation with a serial programming model that eliminates many undesirable side-effects of typical parallel code. The result is a simple way to incorporate the power of parallel processing into commercial software without compromising maintainability, reliability, or ease of use. This gives significant advantages over less powerful non-parallel entries in the market.
A parallel domain decomposition-based implicit method for the Cahn–Hilliard–Cook phase-field equation in 3D

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zheng, Xiang; Yang, Chao; State Key Laboratory of Computer Science, Chinese Academy of Sciences, Beijing 100190

2015-03-15

We present a numerical algorithm for simulating the spinodal decomposition described by the three dimensional Cahn–Hilliard–Cook (CHC) equation, which is a fourth-order stochastic partial differential equation with a noise term. The equation is discretized in space and time based on a fully implicit, cell-centered finite difference scheme, with an adaptive time-stepping strategy designed to accelerate the progress to equilibrium. At each time step, a parallel Newton–Krylov–Schwarz algorithm is used to solve the nonlinear system. We discuss various numerical and computational challenges associated with the method. The numerical scheme is validated by a comparison with an explicit scheme of high accuracy (and unreasonably high cost). We present steady state solutions of the CHC equation in two and three dimensions. The effect of the thermal fluctuation on the spinodal decomposition process is studied. We show that the existence of the thermal fluctuation accelerates the spinodal decomposition process and that the final steady morphology is sensitive to the stochastic noise. We also show the evolution of the energies and statistical moments. In terms of the parallel performance, it is found that the implicit domain decomposition approach scales well on supercomputers with a large number of processors.
Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP

NASA Technical Reports Server (NTRS)

Long, Lyle N.; Brentner, Kenneth S.

2000-01-01

This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly, but often they must be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but due to the number of instances that must be run, they do become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations. This is representative of an aerodynamic panel code where someone would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can be readily used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
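The abstract does not describe how the scheduling is implemented, and the template code from the paper's appendix is not reproduced here. The following is a minimal sketch of the general self-scheduling idea, assuming a shared work queue: each worker takes the next unfinished job as soon as it becomes free, so fast and slow jobs balance automatically. It uses Python threads from the standard library as a stand-in for whatever parallel mechanism the authors used, and all function names are hypothetical. The jobs mirror the first example in spirit: many small, independent linear systems.

```python
import queue
import threading


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def run_self_scheduled(jobs, n_workers=4):
    """Run independent (matrix, rhs) jobs; idle workers pull the next job."""
    todo = queue.Queue()
    for idx in range(len(jobs)):
        todo.put(idx)
    results = [None] * len(jobs)

    def worker():
        while True:
            try:
                idx = todo.get_nowait()  # self-scheduling: grab next free job
            except queue.Empty:
                return                   # nothing left to do
            a, b = jobs[idx]
            results[idx] = solve_linear_system(a, b)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical workload: one small system per case (e.g. per angle of attack).
jobs = [([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0]),
        ([[1.0, 0.0], [0.0, 1.0]], [7.0, -2.0])]
print(run_self_scheduled(jobs, n_workers=2))
```

Threads keep the sketch short and portable. For CPU-bound pure-Python work they do not give real speedup, so in practice each job would more likely be a separate process or an external serial executable launched by the worker.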
Semiannual report, 1 April - 30 September 1991

NASA Technical Reports Server (NTRS)

1991-01-01

The major categories of the current Institute for Computer Applications in Science and Engineering (ICASE) research program are: (1) numerical methods, with particular emphasis on the development and analysis of basic numerical algorithms; (2) control and parameter identification problems, with emphasis on effective numerical methods; (3) computational problems in engineering and the physical sciences, particularly fluid dynamics, acoustics, and structural analysis; and (4) computer systems and software for parallel computers. Research in these areas is discussed.
Parallel computing of a digital hologram and particle searching for microdigital-holographic particle-tracking velocimetry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Satake, Shin-ichi; Kanamori, Hiroyuki; Kunugi, Tomoaki

2007-02-01

We have developed a parallel algorithm for microdigital-holographic particle-tracking velocimetry. The algorithm is used in (1) numerical reconstruction of a particle image by computer using a digital hologram, and (2) searching for particles. The numerical reconstruction from the digital hologram makes use of the Fresnel diffraction equation and the FFT (fast Fourier transform), whereas the particle search algorithm looks for local maximum graduation in a reconstruction field represented by a 3D matrix. To achieve high performance computing for both calculations (reconstruction and particle search), two memory partitions are allocated to the 3D matrix. In this matrix, the reconstruction part consists of horizontally placed 2D memory partitions on the x-y plane for the FFT, whereas the particle search part consists of vertically placed 2D memory partitions set along the z axes. Consequently, scalability is obtained in proportion to the number of processor elements, where the benchmarks are carried out for parallel computation on an SGI Altix machine.
Automatic Parallelization of Numerical Python Applications using the Global Arrays Toolkit

DOE Office of Scientific and Technical Information (OSTI.GOV)

Daily, Jeffrey A.; Lewis, Robert R.

2011-11-30

Global Arrays is a software system from Pacific Northwest National Laboratory that enables an efficient, portable, and parallel shared-memory programming interface to manipulate distributed dense arrays. The NumPy module is the de facto standard for numerical calculation in the Python programming language, a language whose use is growing rapidly in the scientific and engineering communities. NumPy provides a powerful N-dimensional array class as well as other scientific computing capabilities. However, like the majority of the core Python modules, NumPy is inherently serial. Using a combination of Global Arrays and NumPy, we have reimplemented NumPy as a distributed drop-in replacement called Global Arrays in NumPy (GAiN). Serial NumPy applications can become parallel, scalable GAiN applications with only minor source code changes. Scalability studies of several different GAiN applications will be presented showing the utility of developing serial NumPy codes which can later run on more capable clusters or supercomputers.
A Model for Displacements Between Parallel Plates That Shows Change of Type from Hyperbolic to Elliptic

NASA Astrophysics Data System (ADS)

Shariati, Maryam; Yortsos, Yannis; Talon, Laurent; Martin, Jerome; Rakotomalala, Nicole; Salin, Dominique

2003-11-01

We consider miscible displacement between parallel plates, where the viscosity is a function of the concentration. By selecting a piece-wise representation, the problem can be considered as "three-phase" flow. Assuming a lubrication-type approximation, the mathematical description is in terms of two quasi-linear hyperbolic equations. When the mobility of the middle phase is smaller than its neighbors, the system is genuinely hyperbolic and can be solved analytically. However, when it is larger, an elliptic region develops. This change-of-type behavior is for the first time proved here based on sound physical principles. Numerical solutions with a small diffusion are presented. Good agreement is obtained outside the elliptic region, but not inside, where the numerical results show unstable behavior. We conjecture that for the solution of the real problem in the mixed-type case, the full higher-dimensionality problem must be considered inside the elliptic region, in which the lubrication (parallel-flow) approximation is no longer appropriate. This is discussed in a companion presentation.
A new parallel-vector finite element analysis software on distributed-memory computers

NASA Technical Reports Server (NTRS)

Qin, Jiangning; Nguyen, Duc T.

1993-01-01

A new parallel-vector finite element analysis software package, MPFEA (Massively Parallel-vector Finite Element Analysis), is developed for large-scale structural analysis on massively parallel computers with distributed memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. A block-skyline storage scheme along with vector-unrolling techniques is used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.
Parallel computing works

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

An account of the Caltech Concurrent Computation Program (C³P), a five year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C³P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C³P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.
Aerostructural analysis and design optimization of composite aircraft

NASA Astrophysics Data System (ADS)

Kennedy, Graeme James

High-performance composite materials exhibit both anisotropic strength and stiffness properties. These anisotropic properties can be used to produce highly-tailored aircraft structures that meet stringent performance requirements, but these properties also present unique challenges for analysis and design. New tools and techniques are developed to address some of these important challenges. A homogenization-based theory for beams is developed to accurately predict the through-thickness stress and strain distribution in thick composite beams. Numerical comparisons demonstrate that the proposed beam theory can be used to obtain highly accurate results in up to three orders of magnitude less computational time than three-dimensional calculations. Due to the large finite-element model requirements for thin composite structures used in aerospace applications, parallel solution methods are explored. A parallel direct Schur factorization method is developed. The parallel scalability of the direct Schur approach is demonstrated for a large finite-element problem with over 5 million unknowns. In order to address manufacturing design requirements, a novel laminate parametrization technique is presented that takes into account the discrete nature of the ply-angle variables, and ply-contiguity constraints. This parametrization technique is demonstrated on a series of structural optimization problems including compliance minimization of a plate, buckling design of a stiffened panel and layup design of a full aircraft wing. The design and analysis of composite structures for aircraft is not a stand-alone problem and cannot be performed without multidisciplinary considerations. A gradient-based aerostructural design optimization framework is presented that partitions the disciplines into distinct process groups. An approximate Newton-Krylov method is shown to be an efficient aerostructural solution algorithm and excellent parallel scalability of the algorithm is demonstrated. 
An induced drag optimization study is performed to compare the trade-off between wing weight and induced drag for wing tip extensions, raked wing tips and winglets. The results demonstrate that it is possible to achieve a 43% induced drag reduction with no weight penalty, a 28% induced drag reduction with a 10% wing weight reduction, or a 20% wing weight reduction with a 5% induced drag penalty from a baseline wing obtained from a structural mass-minimization problem with fixed aerodynamic loads.
Comparing the OpenMP, MPI, and Hybrid Programming Paradigm on an SMP Cluster

NASA Technical Reports Server (NTRS)

Jost, Gabriele; Jin, Hao-Qiang; an Mey, Dieter; Hatay, Ferhat F.

2003-01-01

Clusters of SMP (Symmetric Multi-Processors) nodes provide support for a wide range of parallel programming paradigms. The shared address space within each node is suitable for OpenMP parallelization. Message passing can be employed within and across the nodes of a cluster. Multiple levels of parallelism can be achieved by combining message passing and OpenMP parallelization. Which programming paradigm is the best will depend on the nature of the given problem, the hardware components of the cluster, the network, and the available software. In this study we compare the performance of different implementations of the same CFD benchmark application, using the same numerical algorithm but employing different programming paradigms.
Solvent-assisted multistage nonequilibrium electron transfer in rigid supramolecular systems: Diabatic free energy surfaces and algorithms for numerical simulations

NASA Astrophysics Data System (ADS)

Feskov, Serguei V.; Ivanov, Anatoly I.

2018-03-01

An approach to the construction of diabatic free energy surfaces (FESs) for ultrafast electron transfer (ET) in a supramolecule with an arbitrary number of electron localization centers (redox sites) is developed, supposing that the reorganization energies for the charge transfers and shifts between all these centers are known. Dimensionality of the coordinate space required for the description of multistage ET in this supramolecular system is shown to be equal to N - 1, where N is the number of the molecular centers involved in the reaction. The proposed algorithm of FES construction employs metric properties of the coordinate space, namely, relation between the solvent reorganization energy and the distance between the two FES minima. In this space, the ET reaction coordinate $z_{nn'}$ associated with electron transfer between the nth and n'th centers is calculated through the projection to the direction connecting the FES minima. The energy-gap reaction coordinates $z_{nn'}$ corresponding to different ET processes are not in general orthogonal, so that ET between two molecular centers can create a nonequilibrium distribution not only along its own reaction coordinate but along other reaction coordinates too. This results in the influence of the preceding ET steps on the kinetics of the ensuing ET. It is important for the ensuing reaction to be ultrafast to proceed in parallel with relaxation along the ET reaction coordinates. Efficient algorithms for numerical simulation of multistage ET within the stochastic point-transition model are developed. The algorithms are based on the Brownian simulation technique with the recrossing-event detection procedure. The main advantages of the numerical method are (i) its computational complexity is linear with respect to the number of electronic states involved and (ii) calculations can be naturally parallelized up to the level of individual trajectories. The efficiency of the proposed approach is demonstrated for a model supramolecular system involving four redox centers.
SpF: Enabling Petascale Performance for Pseudospectral Dynamo Models

NASA Astrophysics Data System (ADS)

Jiang, W.; Clune, T.; Vriesema, J.; Gutmann, G.

2013-12-01

Pseudospectral (PS) methods possess a number of characteristics (e.g., efficiency, accuracy, natural boundary conditions) that are extremely desirable for dynamo models. Unfortunately, dynamo models based upon PS methods face a number of daunting challenges, which include exposing additional parallelism, leveraging hardware accelerators, exploiting hybrid parallelism, and improving the scalability of global memory transposes. Although these issues are a concern for most models, solutions for PS methods tend to require far more pervasive changes to underlying data and control structures. Further, improvements in performance in one model are difficult to transfer to other models, resulting in significant duplication of effort across the research community. We have developed an extensible software framework for pseudospectral methods called SpF that is intended to enable extreme scalability and optimal performance. High-level abstractions provided by SpF unburden applications of the responsibility of managing domain decomposition and load balance while reducing the changes in code required to adapt to new computing architectures. The key design concept in SpF is that each phase of the numerical calculation is partitioned into disjoint numerical 'kernels' that can be performed entirely in-processor. The granularity of domain-decomposition provided by SpF is only constrained by the data-locality requirements of these kernels. SpF builds on top of optimized vendor libraries for common numerical operations such as transforms, matrix solvers, etc., but can also be configured to use open source alternatives for portability. SpF includes several alternative schemes for global data redistribution and is expected to serve as an ideal testbed for further research into optimal approaches for different network architectures. 
In this presentation, we will describe the basic architecture of SpF as well as preliminary performance data and experience with adapting legacy dynamo codes. We will conclude with a discussion of planned extensions to SpF that will provide pseudospectral applications with additional flexibility with regard to time integration, linear solvers, and discretization in the radial direction.
Parallelized CCHE2D flow model with CUDA Fortran on Graphics Process Units

USDA-ARS's Scientific Manuscript database

This paper presents the CCHE2D implicit flow model parallelized using CUDA Fortran programming technique on Graphics Processing Units (GPUs). A parallelized implicit Alternating Direction Implicit (ADI) solver using Parallel Cyclic Reduction (PCR) algorithm on GPU is developed and tested. This solve...
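The abstract is truncated and gives no detail of the solver, so the following is only a textbook illustration of parallel cyclic reduction (PCR), not the CCHE2D implementation. Each ADI sweep reduces to independent tridiagonal systems; PCR solves one such system in about log2(n) steps, and within a step every row is updated independently, which is what makes the method suit GPU threads. The sketch is serial plain Python rather than CUDA Fortran, the function name is hypothetical, and it assumes a diagonally dominant matrix because no pivoting is done.

```python
def pcr_solve(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] by PCR.

    a is the sub-diagonal (a[0] must be 0), b the diagonal,
    c the super-diagonal (c[-1] must be 0), d the right-hand side.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        new_a, new_b = [0.0] * n, list(b)
        new_c, new_d = [0.0] * n, list(d)
        # Rows are independent within a step; on a GPU each i is one thread.
        for i in range(n):
            lo, hi = i - stride, i + stride
            if lo >= 0:
                # Eliminate the coupling to x[lo] using row lo.
                alpha = -a[i] / b[lo]
                new_a[i] = alpha * a[lo]
                new_b[i] += alpha * c[lo]
                new_d[i] += alpha * d[lo]
            if hi < n:
                # Eliminate the coupling to x[hi] using row hi.
                gamma = -c[i] / b[hi]
                new_c[i] = gamma * c[hi]
                new_b[i] += gamma * a[hi]
                new_d[i] += gamma * d[hi]
        a, b, c, d = new_a, new_b, new_c, new_d
        stride *= 2  # remaining couplings now reach twice as far
    # All off-diagonal couplings point outside the system: rows are decoupled.
    return [d[i] / b[i] for i in range(n)]


# Small diagonally dominant test system whose exact solution is 1, 2, 3, 4, 5.
x = pcr_solve([0.0, -1.0, -1.0, -1.0, -1.0],
              [4.0, 4.0, 4.0, 4.0, 4.0],
              [-1.0, -1.0, -1.0, -1.0, 0.0],
              [2.0, 4.0, 6.0, 8.0, 16.0])
print(x)
```

Compared with the sequential Thomas algorithm, PCR does more arithmetic in total but exposes n-way parallelism at every step, a trade-off that tends to favor it on hardware with many lightweight threads.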
Dynamic states of swimming bacteria in a nematic liquid crystal cell with homeotropic alignment

DOE PAGES

Zhou, Shuang; Tovkach, Oleh; Golovaty, Dmitry; ...

2017-05-17

Flagellated bacteria such as Escherichia coli and Bacillus subtilis exhibit effective mechanisms for swimming in fluids and exploring the surrounding environment. In isotropic fluids such as water, the bacteria change swimming direction through the run-and-tumble process. Lyotropic chromonic liquid crystals (LCLCs) have been introduced recently as an anisotropic environment in which the direction of preferred orientation, the director, guides the bacterial trajectories. In this work, we describe the behavior of bacteria B. subtilis in a homeotropic LCLC geometry, in which the director is perpendicular to the bounding plates of a shallow cell. We demonstrate that the bacteria are capable of overcoming the stabilizing elastic forces of the LCLC and swim perpendicularly to the imposed director (and parallel to the bounding plates). The effect is explained by a finite surface anchoring of the director at the bacterial body; the role of surface anchoring is analyzed by numerical simulations of a rod realigning in an otherwise uniform director field. Shear flows produced by a swimming bacterium cause director distortions around its body, as evidenced both by experiments and numerical simulations. These distortions contribute to a repulsive force that keeps the swimming bacterium at a distance of a few micrometers away from the bounding plates. The homeotropic alignment of the director imposes two different scenarios of bacterial tumbling: one with a 180° reversal of the horizontal velocity and the other with the realignment of the bacterium by two consecutive 90° turns. Finally, in the second case, the angle between the bacterial body and the imposed director changes from 90° to 0° and then back to 90°; the new direction of swimming does not correlate with the previous swimming direction.
Dynamic states of swimming bacteria in a nematic liquid crystal cell with homeotropic alignment

NASA Astrophysics Data System (ADS)

Zhou, Shuang; Tovkach, Oleh; Golovaty, Dmitry; Sokolov, Andrey; Aranson, Igor S.; Lavrentovich, Oleg D.

2017-05-01

Flagellated bacteria such as Escherichia coli and Bacillus subtilis exhibit effective mechanisms for swimming in fluids and exploring the surrounding environment. In isotropic fluids such as water, the bacteria change swimming direction through the run-and-tumble process. Lyotropic chromonic liquid crystals (LCLCs) have been introduced recently as an anisotropic environment in which the direction of preferred orientation, the director, guides the bacterial trajectories. In this work, we describe the behavior of bacteria B. subtilis in a homeotropic LCLC geometry, in which the director is perpendicular to the bounding plates of a shallow cell. We demonstrate that the bacteria are capable of overcoming the stabilizing elastic forces of the LCLC and swim perpendicularly to the imposed director (and parallel to the bounding plates). The effect is explained by a finite surface anchoring of the director at the bacterial body; the role of surface anchoring is analyzed by numerical simulations of a rod realigning in an otherwise uniform director field. Shear flows produced by a swimming bacterium cause director distortions around its body, as evidenced both by experiments and numerical simulations. These distortions contribute to a repulsive force that keeps the swimming bacterium at a distance of a few micrometers away from the bounding plates. The homeotropic alignment of the director imposes two different scenarios of bacterial tumbling: one with a 180° reversal of the horizontal velocity and the other with the realignment of the bacterium by two consecutive 90° turns. In the second case, the angle between the bacterial body and the imposed director changes from 90° to 0° and then back to 90°; the new direction of swimming does not correlate with the previous swimming direction.
Hall-Effect Thruster Simulations with 2-D Electron Transport and Hydrodynamic Ions

NASA Technical Reports Server (NTRS)

Mikellides, Ioannis G.; Katz, Ira; Hofer, Richard H.; Goebel, Dan M.

2009-01-01

A computational approach that has been used extensively in the last two decades for Hall thruster simulations is to solve a diffusion equation and energy conservation law for the electrons in a direction that is perpendicular to the magnetic field, and use discrete-particle methods for the heavy species. This "hybrid" approach has allowed for the capture of bulk plasma phenomena inside these thrusters within reasonable computational times. Regions of the thruster with complex magnetic field arrangements (such as those near eroded walls and magnets) and/or reduced Hall parameter (such as those near the anode and the cathode plume) challenge the validity of the quasi-one-dimensional assumption for the electrons. This paper reports on the development of a computer code that solves numerically the 2-D axisymmetric vector form of Ohm's law, with no assumptions regarding the rate of electron transport in the parallel and perpendicular directions. The numerical challenges related to the large disparity of the transport coefficients in the two directions are met by solving the equations in a computational mesh that is aligned with the magnetic field. The fully-2D approach allows for a large physical domain that extends more than five times the thruster channel length in the axial direction, and encompasses the cathode boundary. Ions are treated as an isothermal, cold (relative to the electrons) fluid, accounting for charge-exchange and multiple-ionization collisions in the momentum equations. A first series of simulations of two Hall thrusters, namely the BPT-4000 and a 6-kW laboratory thruster, quantifies the significance of ion diffusion in the anode region and the importance of the extended physical domain on studies related to the impact of the transport coefficients on the electron flow field.
Controlling the numerical Cerenkov instability in PIC simulations using a customized finite difference Maxwell solver and a local FFT based current correction

DOE PAGES

Li, Fei; Yu, Peicheng; Xu, Xinlu; ...

2017-01-12

In this study we present a customized finite-difference-time-domain (FDTD) Maxwell solver for the particle-in-cell (PIC) algorithm. The solver is customized to effectively eliminate the numerical Cerenkov instability (NCI) which arises when a plasma (neutral or non-neutral) relativistically drifts on a grid when using the PIC algorithm. We control the EM dispersion curve in the direction of the plasma drift of a FDTD Maxwell solver by using a customized higher order finite difference operator for the spatial derivative along the direction of the drift (the $\hat{1}$ direction). We show that this eliminates the main NCI modes with moderate $|k_1|$, while keeping additional main NCI modes well outside the range of physical interest with higher $|k_1|$. These main NCI modes can be easily filtered out along with first spatial aliasing NCI modes which are also at the edge of the fundamental Brillouin zone. The customized solver has the possible advantage of improved parallel scalability because it can be easily partitioned along $\hat{1}$, which typically has many more cells than other directions for the problems of interest. We show that FFTs can be performed locally on the current on each partition to filter out the main and first spatial aliasing NCI modes, and to correct the current so that it satisfies the continuity equation for the customized spatial derivative. This ensures that Gauss’ Law is satisfied. Lastly, we present simulation examples of one relativistically drifting plasma, of two colliding relativistically drifting plasmas, and of nonlinear laser wakefield acceleration (LWFA) in a Lorentz boosted frame that show that no evidence of the NCI can be observed when using this customized Maxwell solver together with its NCI elimination scheme.
Controlling the numerical Cerenkov instability in PIC simulations using a customized finite difference Maxwell solver and a local FFT based current correction

NASA Astrophysics Data System (ADS)

Li, Fei; Yu, Peicheng; Xu, Xinlu; Fiuza, Frederico; Decyk, Viktor K.; Dalichaouch, Thamine; Davidson, Asher; Tableman, Adam; An, Weiming; Tsung, Frank S.; Fonseca, Ricardo A.; Lu, Wei; Mori, Warren B.

2017-05-01

In this paper we present a customized finite-difference-time-domain (FDTD) Maxwell solver for the particle-in-cell (PIC) algorithm. The solver is customized to effectively eliminate the numerical Cerenkov instability (NCI), which arises when a plasma (neutral or non-neutral) drifts relativistically on a grid in the PIC algorithm. We control the EM dispersion curve of the FDTD Maxwell solver in the direction of the plasma drift by using a customized higher-order finite-difference operator for the spatial derivative along the direction of the drift (the 1̂ direction). We show that this eliminates the main NCI modes with moderate |k₁|, while keeping additional main NCI modes with higher |k₁| well outside the range of physical interest. These main NCI modes can be easily filtered out along with the first spatial aliasing NCI modes, which are also at the edge of the fundamental Brillouin zone. The customized solver has the possible advantage of improved parallel scalability because it can be easily partitioned along 1̂, which typically has many more cells than other directions for the problems of interest. We show that FFTs can be performed locally on the current in each partition to filter out the main and first spatial aliasing NCI modes, and to correct the current so that it satisfies the continuity equation for the customized spatial derivative. This ensures that Gauss's law is satisfied. We present simulation examples of one relativistically drifting plasma, of two colliding relativistically drifting plasmas, and of nonlinear laser wakefield acceleration (LWFA) in a Lorentz-boosted frame; these show no evidence of the NCI when this customized Maxwell solver is used together with its NCI elimination scheme.

Controlling the numerical Cerenkov instability in PIC simulations using a customized finite difference Maxwell solver and a local FFT based current correction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, Fei; Yu, Peicheng; Xu, Xinlu

In this study we present a customized finite-difference-time-domain (FDTD) Maxwell solver for the particle-in-cell (PIC) algorithm. The solver is customized to effectively eliminate the numerical Cerenkov instability (NCI), which arises when a plasma (neutral or non-neutral) drifts relativistically on a grid in the PIC algorithm. We control the EM dispersion curve of the FDTD Maxwell solver in the direction of the plasma drift by using a customized higher-order finite-difference operator for the spatial derivative along the direction of the drift (the 1̂ direction). We show that this eliminates the main NCI modes with moderate |k₁|, while keeping additional main NCI modes with higher |k₁| well outside the range of physical interest. These main NCI modes can be easily filtered out along with the first spatial aliasing NCI modes, which are also at the edge of the fundamental Brillouin zone. The customized solver has the possible advantage of improved parallel scalability because it can be easily partitioned along 1̂, which typically has many more cells than other directions for the problems of interest. We show that FFTs can be performed locally on the current in each partition to filter out the main and first spatial aliasing NCI modes, and to correct the current so that it satisfies the continuity equation for the customized spatial derivative. This ensures that Gauss's law is satisfied. Lastly, we present simulation examples of one relativistically drifting plasma, of two colliding relativistically drifting plasmas, and of nonlinear laser wakefield acceleration (LWFA) in a Lorentz-boosted frame; these show no evidence of the NCI when this customized Maxwell solver is used together with its NCI elimination scheme.
A possible explanation for foreland thrust propagation

NASA Astrophysics Data System (ADS)

Panian, John; Pilant, Walter

1990-06-01

A common feature of thin-skinned fold and thrust belts is the sequential nature of foreland-directed thrust systems. As a rule, younger thrusts develop in the footwalls of older thrusts, the whole sequence propagating towards the foreland in the transport direction. As each new younger thrust develops, the entire sequence is thickened, particularly in the frontal region. The compressive toe region can be likened to an advancing wave: as the mountainous thrust belt advances, the down-surface slope stresses drive thrusts ahead of it, much like a surfboard rider. In an attempt to investigate the stresses in the frontal regions of thrust sheets, a numerical method has been devised from the algorithm given by McTigue and Mei [1981]. The algorithm yields a quickly computed approximate solution of the gravity- and tectonic-induced stresses of a two-dimensional homogeneous elastic half-space with an arbitrarily shaped free surface of small slope. A comparison of the numerical method with analytical examples shows excellent agreement. The numerical method was devised because it greatly facilitates the stress calculations and frees one from using the restrictive, simple topographic profiles necessary to obtain an analytical solution. The numerical version of the McTigue and Mei algorithm shows that there is a region of increased maximum resolved shear stress, τ, directly beneath the toe of the overthrust sheet. Utilizing the Mohr-Coulomb failure criterion, predicted fault lines are computed. It is shown that they flatten and become horizontal in some portions of this zone of increased τ. Thrust sheets are known to advance upon weak decollement zones. If there is a coincidence of increased τ, a weak rock layer, and a potential fault line parallel to this weak layer, we have in place all the elements necessary to initiate a new thrusting event. That is, this combination acts as a nucleating center to initiate a new thrusting event.
Therefore, thrusts develop in sequence towards the foreland as a consequence of the stress concentrating abilities of the toe of the thrust sheet. The gravity- and tectonic-induced stresses due to the surface topography (usually ignored in previous analyses) of an advancing thrust sheet play a key role in the nature of shallow foreland thrust propagation.
Kranc: a Mathematica package to generate numerical codes for tensorial evolution equations

NASA Astrophysics Data System (ADS)

Husa, Sascha; Hinder, Ian; Lechner, Christiane

2006-06-01

We present a suite of Mathematica-based computer-algebra packages, termed "Kranc", which comprise a toolbox to convert certain (tensorial) systems of partial differential evolution equations to parallelized C or Fortran code for solving initial boundary value problems. Kranc can be used as a "rapid prototyping" system for physicists or mathematicians handling very complicated systems of partial differential equations, but through integration into the Cactus computational toolkit we can also produce efficient parallelized production codes. Our work is motivated by the field of numerical relativity, where Kranc is used as a research tool by the authors. In this paper we describe the design and implementation of both the Mathematica packages and the resulting code, we discuss some example applications, and provide results on the performance of an example numerical code for the Einstein equations.

Program summary
Title of program: Kranc
Catalogue identifier: ADXS_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADXS_v1_0
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
Distribution format: tar.gz
Computer for which the program is designed and others on which it has been tested: General computers which run Mathematica (for code generation) and Cactus (for numerical simulations), tested under Linux
Programming language used: Mathematica, C, Fortran 90
Memory required to execute with typical data: This depends on the number of variables and gridsize, the included ADM example requires 4308 KB
Has the code been vectorized or parallelized: The code is parallelized based on the Cactus framework.
Number of bytes in distributed program, including test data, etc.: 1 578 142
Number of lines in distributed program, including test data, etc.: 11 711
Nature of physical problem: Solution of partial differential equations in three space dimensions, which are formulated as an initial value problem. In particular, the program is geared towards handling very complex tensorial equations as they appear, e.g., in numerical relativity. The worked out examples comprise the Klein-Gordon equations, the Maxwell equations, and the ADM formulation of the Einstein equations.
Method of solution: The method of numerical solution is finite differencing and method of lines time integration, the numerical code is generated through a high level Mathematica interface.
Restrictions on the complexity of the program: Typical numerical relativity applications will contain up to several dozen evolution variables and thousands of source terms, Cactus applications have shown scaling up to several thousand processors and grid sizes exceeding 500³.
Typical running time: This depends on the number of variables and the grid size: the included ADM example takes approximately 100 seconds on a 1600 MHz Intel Pentium M processor.
Unusual features of the program: based on Mathematica and Cactus
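The method of solution named in this record (finite differencing combined with method-of-lines time integration) can be illustrated with a small hand-written sketch. This is not Kranc output and does not use Kranc or Cactus. It assumes a 1D massless Klein-Gordon (wave) equation written as phi_t = mom, mom_t = phi_xx on a periodic unit interval, second-order centered differences in space, and a classical RK4 integrator. All names are hypothetical.

```python
import math

def rhs(phi, mom, dx):
    """Semi-discrete right-hand side: phi_t = mom, mom_t = phi_xx (periodic grid)."""
    n = len(phi)
    dphi = list(mom)
    # Second-order centered second difference; phi[i - 1] wraps around for i = 0.
    dmom = [(phi[(i + 1) % n] - 2.0 * phi[i] + phi[i - 1]) / dx ** 2
            for i in range(n)]
    return dphi, dmom

def add_scaled(y, a, x):
    """Return y + a * x elementwise."""
    return [yi + a * xi for yi, xi in zip(y, x)]

def rk4_step(phi, mom, dx, dt):
    """Advance the semi-discrete system by one classical RK4 step."""
    k1 = rhs(phi, mom, dx)
    k2 = rhs(add_scaled(phi, 0.5 * dt, k1[0]), add_scaled(mom, 0.5 * dt, k1[1]), dx)
    k3 = rhs(add_scaled(phi, 0.5 * dt, k2[0]), add_scaled(mom, 0.5 * dt, k2[1]), dx)
    k4 = rhs(add_scaled(phi, dt, k3[0]), add_scaled(mom, dt, k3[1]), dx)

    def combine(y, c):
        return [yi + dt / 6.0 * (a + 2.0 * b + 2.0 * cc + d)
                for yi, a, b, cc, d in zip(y, k1[c], k2[c], k3[c], k4[c])]

    return combine(phi, 0), combine(mom, 1)

def evolve_one_period(n=64):
    """Evolve phi = sin(2 pi x), mom = 0 for one wave period (t = 1)."""
    dx = 1.0 / n
    dt = 0.5 * dx
    phi = [math.sin(2.0 * math.pi * i * dx) for i in range(n)]
    mom = [0.0] * n
    for _ in range(2 * n):  # 2n steps of dt = 1/(2n) reach t = 1
        phi, mom = rk4_step(phi, mom, dx, dt)
    return phi
```

The point of the method of lines is visible in the structure: `rhs` contains only the spatial discretization, and `rk4_step` is a generic ODE integrator that never looks at the grid. That separation is what makes it natural to generate the right-hand side automatically from symbolic equations.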
The formation of quasi-parallel shocks. [in space, solar and astrophysical plasmas]

NASA Technical Reports Server (NTRS)

Cargill, Peter J.

1991-01-01

In a collisionless plasma, the coupling between a piston and the plasma must take place through either laminar or turbulent electromagnetic fields. Of the three types of coupling (laminar, Larmor and turbulent), shock formation in the parallel regime is dominated by the latter and in the quasi-parallel regime by a combination of all three, depending on the piston. In the quasi-perpendicular regime, there is usually a good separation between piston and shock. This is not true in the quasi-parallel and parallel regime. Hybrid numerical simulations for hot plasma pistons indicate that when the electrons are hot, a shock forms, but does not cleanly decouple from the piston. For hot ion pistons, no shock forms in the parallel limit: in the quasi-parallel case, a shock forms, but there is severe contamination from hot piston ions. These results suggest that the properties of solar and astrophysical shocks, such as particle acceleration, cannot be readily separated from their driving mechanism.
Anisotropic Behaviour of Magnetic Power Spectra in Solar Wind Turbulence.

NASA Astrophysics Data System (ADS)

Banerjee, S.; Saur, J.; Gerick, F.; von Papen, M.

2017-12-01

Introduction: High altitude fast solar wind turbulence (SWT) shows different spectral properties as a function of the angle between the flow direction and the scale-dependent mean magnetic field (Horbury et al., PRL, 2008). The average magnetic power contained in the near-perpendicular direction (80°-90°) was found to be approximately 5 times larger than the average power in the parallel direction (0°-10°). In addition, the parallel power spectrum was found to give a steeper (-2) power law than the perpendicular power spectral density (PSD), which followed a near-Kolmogorov slope (-5/3). Similar anisotropic behaviour has also been observed (Chen et al., MNRAS, 2011) for slow solar wind (SSW), but using a different method exploiting multi-spacecraft data of Cluster. Purpose: In the current study, using Ulysses data, we investigate (i) the anisotropic behaviour of near-ecliptic slow solar wind using the same methodology (described below) as that of Horbury et al. (2008) and (ii) the dependence of the anisotropic behaviour of SWT on the heliospheric latitude. Method: We apply the wavelet method to calculate the turbulent power spectra of the magnetic field fluctuations parallel and perpendicular to the local mean magnetic field (LMF). According to Horbury et al., LMF for a given scale (or size) is obtained using an envelope of the envelope of that size. Results: (i) SSW intervals always show near -5/3 perpendicular spectra. Unlike the fast solar wind (FSW) intervals, for SSW we often find intervals where power parallel to the mean field is not observed. For a few intervals with sufficient power in the parallel direction, slow wind turbulence also exhibits -2 parallel spectra similar to FSW. (ii) The behaviours of parallel and perpendicular power spectra are found to be independent of the heliospheric latitude.
Conclusion: In the current study we do not find significant influence of the heliospheric latitude on the spectral slopes of parallel and perpendicular magnetic spectra. This indicates that the spectral anisotropy in parallel and perpendicular direction is governed by intrinsic properties of SWT.
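The spectral slopes quoted in this abstract (-5/3 and -2) are power-law indices, which are commonly estimated by a straight-line fit in log-log space. The sketch below shows that generic fitting step only; it is not the authors' wavelet pipeline, and the function name and the synthetic input are assumptions made for illustration.

```python
import math

def fit_spectral_index(freqs, psd):
    """Least-squares slope of log10(psd) against log10(freq).

    For a spectrum psd ~ freq**alpha this returns an estimate of alpha.
    """
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in psd]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

if __name__ == "__main__":
    # Synthetic, noise-free Kolmogorov-like spectrum over two decades.
    freqs = [10.0 ** (-3.0 + 0.1 * i) for i in range(21)]
    psd = [f ** (-5.0 / 3.0) for f in freqs]
    print(fit_spectral_index(freqs, psd))
```

With real data the fit would be restricted to the inertial range and applied separately to the parallel and perpendicular spectra, so that the two indices can be compared.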
Parallel Element Agglomeration Algebraic Multigrid and Upscaling Library

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barker, Andrew T.; Benson, Thomas R.; Lee, Chak Shing

ParELAG is a parallel C++ library for numerical upscaling of finite element discretizations and element-based algebraic multigrid solvers. It provides optimal complexity algorithms to build multilevel hierarchies and solvers that can be used for solving a wide class of partial differential equations (elliptic, hyperbolic, saddle point problems) on general unstructured meshes. Additionally, a novel multilevel solver for saddle point problems with divergence constraint is implemented.
The restoring force on a dielectric in a parallel plate capacitor

NASA Astrophysics Data System (ADS)

Staunton, L. P.

2014-09-01

We investigate the restoring force on a dielectric slab being pulled from within the volume of a parallel plate capacitor connected to a battery. Using a conformal mapping to treat the fringing electric field exactly, we numerically obtain an expected Hooke's Law restoring force for small displacements, and a diminishing force for a displacement up to half the length of the dielectric.
Chronobiology of Melatonin beyond the Feedback to the Suprachiasmatic Nucleus-Consequences to Melatonin Dysfunction.

PubMed

Hardeland, Rüdiger

2013-03-12

The mammalian circadian system is composed of numerous oscillators, which gradually differ with regard to their dependence on the pacemaker, the suprachiasmatic nucleus (SCN). Actions of melatonin on extra-SCN oscillators represent an emerging field. Melatonin receptors are widely expressed in numerous peripheral and central nervous tissues. Therefore, the circadian rhythm of circulating, pineal-derived melatonin can have profound consequences for the temporal organization of almost all organs, without necessarily involving the melatonin feedback to the suprachiasmatic nucleus. Experiments with melatonin-deficient mouse strains, pinealectomized animals and melatonin receptor knockouts, as well as phase-shifting experiments with explants, reveal a chronobiological role of melatonin in various tissues. In addition to directly steering melatonin-regulated gene expression, the pineal hormone is required for the rhythmic expression of circadian oscillator genes in peripheral organs and to enhance the coupling of parallel oscillators within the same tissue. It exerts additional effects by modulating the secretion of other hormones. The importance of melatonin for numerous organs is underlined by the association of various diseases with gene polymorphisms concerning melatonin receptors and the melatonin biosynthetic pathway. The possibilities and limits of melatonergic treatment are discussed with regard to reductions of melatonin during aging and in various diseases.
The directional dependence of cometary magnetic energy density in the quasi-parallel and quasi-perpendicular regimes

NASA Technical Reports Server (NTRS)

Miller, R. H.; Gombosi, T. I.; Gary, S. P.; Winske, D.

1991-01-01

The direction of propagation of low frequency magnetic fluctuations generated by cometary ion pick-up is examined by means of 1D electromagnetic hybrid simulations. The newborn ions are injected at a constant rate, and the helicity and direction of propagation of magnetic fluctuations are explored for cometary ion injection angles of 0 and 90 deg relative to the solar wind magnetic field. The parameter eta represents the relative contribution of wave energy propagating in the direction away from the comet, parallel to the beam. For small (quasi-parallel) injection angles eta was found to be of order unity, while for larger (quasi-perpendicular) angles eta was found to be of order 0.5.
Effect of Substrate Wetting on the Morphology and Dynamics of Phase Separating Multi-Component Mixture

NASA Astrophysics Data System (ADS)

Goyal, Abheeti; Toschi, Federico; van der Schoot, Paul

2017-11-01

We study the morphological evolution and dynamics of phase separation of a multi-component mixture in a thin film constrained by a substrate. Specifically, we have explored the surface-directed spinodal decomposition of a multicomponent mixture numerically by Free Energy Lattice Boltzmann (LB) simulations. The distinguishing feature of this model over the Shan-Chen (SC) model is that we have explicit and independent control over the free energy functional and EoS of the system. This vastly expands the ambit of physical systems that can be realistically simulated by LB simulations. We investigate the effect of composition, film thickness and substrate wetting on the phase morphology and the mechanism of growth in the vicinity of the substrate. The phase morphology and averaged size in the vicinity of the substrate fluctuate greatly due to the wetting of the substrate in both the parallel and perpendicular directions. Additionally, we describe how the model presented here can be extended to include an arbitrary number of fluid components.
Concealed wire tracing apparatus

DOEpatents

Kronberg, J.W.

1994-05-31

An apparatus and method that combines a signal generator and a passive signal receiver to detect and record the path of partially or completely concealed electrical wiring without disturbing the concealing surface is disclosed. The signal generator applies a series of electrical pulses to the selected wiring of interest. The applied pulses create a magnetic field about the wiring that can be detected by a coil contained within the signal receiver. An audible output connected to the receiver and driven by the coil reflects the receiver's position with respect to the wiring. The receiver's audible signal is strongest when the receiver is directly above the wiring and the long axis of the receiver's coil is parallel to the wiring. A marking means is mounted on the receiver to mark the location of the wiring as the receiver is directed over the wiring's concealing surface. Numerous marks made on various locations of the concealing surface will trace the path of the wiring of interest. 4 figs.
A fast new algorithm for a robot neurocontroller using inverse QR decomposition

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morris, A.S.; Khemaissia, S.

2000-01-01

A new adaptive neural network controller for robots is presented. The controller is based on direct adaptive techniques. Unlike many neural network controllers in the literature, inverse dynamical model evaluation is not required. A numerically robust, computationally efficient processing scheme for neural network weight estimation is described, namely, the inverse QR decomposition (INVQR). The inverse QR decomposition and a weighted recursive least-squares (WRLS) method for neural network weight estimation are derived using Cholesky factorization of the data matrix. The algorithm that performs the efficient INVQR of the underlying space-time data matrix may be implemented in parallel on a triangular array. Furthermore, its systolic architecture is well suited for VLSI implementation. Another important benefit of the INVQR decomposition is that it solves directly for the time-recursive least-squares filter vector, while avoiding the sequential back-substitution step required by the QR decomposition approaches.
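For readers unfamiliar with the recursive least-squares problem that the INVQR scheme addresses, here is a minimal sketch of the conventional covariance-form RLS update. It is deliberately not the inverse QR formulation of the paper, which propagates a triangular factor instead of the matrix P for better numerical robustness. The function names, the forgetting factor `lam`, and the synthetic data are assumptions.

```python
import math

def rls_update(w, P, x, d, lam=1.0):
    """One conventional RLS step (covariance form, not the INVQR variant).

    w   : current weight estimate (list of n floats)
    P   : current inverse correlation matrix (n x n nested lists, symmetric)
    x   : input vector, d : desired scalar output
    lam : forgetting factor, 0 < lam <= 1
    """
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(x[i] * Px[i] for i in range(n))
    gain = [v / denom for v in Px]
    err = d - sum(w[i] * x[i] for i in range(n))  # a priori error
    w_new = [w[i] + gain[i] * err for i in range(n)]
    # P <- (P - gain * x^T P) / lam, using x^T P = (P x)^T for symmetric P
    P_new = [[(P[i][j] - gain[i] * Px[j]) / lam for j in range(n)]
             for i in range(n)]
    return w_new, P_new

def fit_demo(true_w=(2.0, -1.0, 0.5), steps=50):
    """Recover the weights of a noise-free linear model from streaming samples."""
    n = len(true_w)
    w = [0.0] * n
    # Large initial P means a weak prior on the weights.
    P = [[1.0e6 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for t in range(1, steps + 1):
        x = [math.sin(t), math.cos(2.0 * t), 1.0]
        d = sum(wi * xi for wi, xi in zip(true_w, x))
        w, P = rls_update(w, P, x, d)
    return w
```

Calling `fit_demo()` returns weights close to `(2.0, -1.0, 0.5)`. The explicit subtraction in the P update is the step that can lose symmetry and positive definiteness in finite precision, which is the usual motivation for QR-based variants such as the one described in the abstract.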
Large-scale large eddy simulation of nuclear reactor flows: Issues and perspectives

DOE Office of Scientific and Technical Information (OSTI.GOV)

Merzari, Elia; Obabko, Aleks; Fischer, Paul

Numerical simulation has been an intrinsic part of nuclear engineering research since its inception. In recent years a transition is occurring toward predictive, first-principle-based tools such as computational fluid dynamics. Even with the advent of petascale computing, however, such tools still have significant limitations. In the present work some of these issues, and in particular the presence of massive multiscale separation, are discussed, as well as some of the research conducted to mitigate them. Petascale simulations at high fidelity (large eddy simulation/direct numerical simulation) were conducted with the massively parallel spectral element code Nek5000 on a series of representative problems. These simulations shed light on the requirements of several types of simulation: (1) axial flow around fuel rods, with particular attention to wall effects; (2) natural convection in the primary vessel; and (3) flow in a rod bundle in the presence of spacing devices. Finally, the focus of the work presented here is on the lessons learned and the requirements to perform these simulations at exascale. Additional physical insight gained from these simulations is also emphasized.
Overview of the new capabilities of TORIC-v6 and comparison with TORIC-v5

NASA Astrophysics Data System (ADS)

Bilato, R.; Brambilla, M.; Bertelli, N.

2016-10-01

Since its release, version 5 (v5) of the full-wave TORIC code, characterized by an optimized parallelized solver for its routine use in the TRANSP package, has been improved in many technical respects, e.g. the plasma-vacuum transition and the full-spectrum antenna modeling. For the WPCD-benchmark cases a good agreement between the new version, v6, and v5 is found. The major improvement, however, has been made in interfacing TORIC-v6 with the Fokker-Planck SSFPQL solver to account for the back-reaction of ICRF and NBI heating on the wave propagation and absorption. Special algorithms have been developed for SSFPQL for numerical precision at high pitch-angle resolution and to evaluate the generalized dispersion function directly from the numerical solution. Care has been taken in automating the non-linear loop between TORIC-v6 and SSFPQL. In v6 the description of wave absorption at high harmonics has been revised and applied to DEMO. For high-harmonic regimes there is an ongoing activity on the comparison with AORSA.
Large-scale large eddy simulation of nuclear reactor flows: Issues and perspectives

DOE PAGES

Merzari, Elia; Obabko, Aleks; Fischer, Paul; ...

2016-11-03

Numerical simulation has been an intrinsic part of nuclear engineering research since its inception. In recent years a transition is occurring toward predictive, first-principle-based tools such as computational fluid dynamics. Even with the advent of petascale computing, however, such tools still have significant limitations. In the present work some of these issues, and in particular the presence of massive multiscale separation, are discussed, as well as some of the research conducted to mitigate them. Petascale simulations at high fidelity (large eddy simulation/direct numerical simulation) were conducted with the massively parallel spectral element code Nek5000 on a series of representative problems. These simulations shed light on the requirements of several types of simulation: (1) axial flow around fuel rods, with particular attention to wall effects; (2) natural convection in the primary vessel; and (3) flow in a rod bundle in the presence of spacing devices. Finally, the focus of the work presented here is on the lessons learned and the requirements to perform these simulations at exascale. Additional physical insight gained from these simulations is also emphasized.
Detailed numerical investigation of the Bohm limit in cosmic ray diffusion theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hussein, M.; Shalchi, A., E-mail: m_hussein@physics.umanitoba.ca, E-mail: andreasm4@yahoo.com

2014-04-10

A standard model in cosmic ray diffusion theory is the so-called Bohm limit in which the particle mean free path is assumed to be equal to the Larmor radius. This type of diffusion is often employed to model the propagation and acceleration of energetic particles. However, recent analytical and numerical work has shown that standard Bohm diffusion is not realistic. In the present paper, we perform test-particle simulations to explore particle diffusion in the strong turbulence limit in which the wave field is much stronger than the mean magnetic field. We show that there is indeed a lower limit of the particle mean free path along the mean field. In this limit, the mean free path is directly proportional to the unperturbed Larmor radius like in the traditional Bohm limit, but it is reduced by the factor δB/B₀ where B₀ is the mean field and δB the turbulent field. Although we focus on parallel diffusion, we also explore diffusion across the mean field in the strong turbulence limit.
Advances in computational design and analysis of airbreathing propulsion systems

NASA Technical Reports Server (NTRS)

Klineberg, John M.

1989-01-01

The development of commercial and military aircraft depends, to a large extent, on engine manufacturers being able to achieve significant increases in propulsion capability through improved component aerodynamics, materials, and structures. The recent history of propulsion has been marked by efforts to develop computational techniques that can speed up the propulsion design process and produce superior designs. The availability of powerful supercomputers, such as the NASA Numerical Aerodynamic Simulator, and the potential for even higher performance offered by parallel computer architectures, have opened the door to the use of multi-dimensional simulations to study complex physical phenomena in propulsion systems that have previously defied analysis or experimental observation. An overview of several NASA Lewis research efforts is provided that are contributing toward the long-range goal of a numerical test-cell for the integrated, multidisciplinary design, analysis, and optimization of propulsion systems. Specific examples in Internal Computational Fluid Mechanics, Computational Structural Mechanics, Computational Materials Science, and High Performance Computing are cited and described in terms of current capabilities, technical challenges, and future research directions.
The line integral approach to radarclinometry

USGS Publications Warehouse

Wildey, R.L.

1987-01-01

Radarclinometry, the invention of which has been previously reported, is a technique for deriving a topographic map from a single radar image by using the dependence upon terrain-surface orientation of the integrated signal of an individual image pixel. The radiometric calibration required for precise operation and testing does not yet exist, but the imminence of important applications justifies parallel, rather than serial, development of radarclinometry and radiometrically calibrated radar. The present investigation reports three developmental advances: (1) The solid angle of integration of back-scattered specific intensity constituting a pixel signal is more accurately accounted for in its dependence on surface orientation than in previous work. (2) The local curvature hypothesis, which removes the requirement of a ground-truth profile as a boundary condition and enables the formulation of the theory in terms of a line integral, has been expanded to include the three possibilities of Local Cylindricity, Local Biaxial Ellipsoidal Hyperbolicity, and Least-Squares Local Sphericity. (3) The theory is integrated in the cross-ground-range direction, which is ill-conditioned compared to the ground-range direction, whereas the original formulation was based on enforced isotropy in the two-dimensional power spectrum of the topography. It was found necessary to prohibit the hypothesis of Local Biaxial Ellipsoidal Hyperbolicity in the cross-range stepping, for reasons not completely clear. Variation in the proportioning between curvature assumptions had produced topographic maps that are in good mutual agreement but not realistic in appearance. They are severely banded parallel to the ground-range direction, most especially at small radar zenith angles. 
Numerical experimentation with the falsification of topography through incorrect decalibration as performed on a Gaussian hill suggests that the banding and its exaggeration at high radar incidence angles could easily be due to our lack of radiometric calibration. © 1987 D. Reidel Publishing Company.
Parallel processing using an optical delay-based reservoir computer

NASA Astrophysics Data System (ADS)

Van der Sande, Guy; Nguimdo, Romain Modeste; Verschaffelt, Guy

2016-04-01

Delay systems subject to delayed optical feedback have recently shown great potential in solving computationally hard tasks. By implementing a neuro-inspired computational scheme relying on the transient response to optical data injection, high processing speeds have been demonstrated. However, reservoir computing systems based on delay dynamics discussed in the literature are designed by coupling many different stand-alone components, which leads to bulky, non-monolithic systems that lack long-term stability. Here we numerically investigate the possibility of implementing reservoir computing schemes based on semiconductor ring lasers (SRLs). SRLs are semiconductor lasers in which the laser cavity consists of a ring-shaped waveguide. They are highly integrable and scalable, making them ideal candidates for key components in photonic integrated circuits. SRLs can generate light in two counterpropagating directions between which bistability has been demonstrated. We demonstrate that two independent machine learning tasks, even tasks of a different nature with different input data signals, can be computed simultaneously using a single photonic nonlinear node, relying on the parallelism offered by photonics. We illustrate the performance on simultaneous chaotic time series prediction and nonlinear channel equalization, a classification task. We take advantage of different directional modes to process individual tasks. Each directional mode processes one individual task to mitigate possible crosstalk between the tasks. Our results indicate that prediction/classification with errors comparable to the state-of-the-art performance can be obtained even with noise, despite the two tasks being computed simultaneously. We also find that good performance is obtained for both tasks over a broad range of the parameters. The results are discussed in detail in [Nguimdo et al., IEEE Trans. Neural Netw. Learn. Syst. 26, pp. 3301-3307, 2015].
Multilevel summation method for electrostatic force evaluation.

PubMed

Hardy, David J; Wu, Zhe; Phillips, James C; Stone, John E; Skeel, Robert D; Schulten, Klaus

2015-02-10

The multilevel summation method (MSM) offers an efficient algorithm utilizing convolution for evaluating long-range forces arising in molecular dynamics simulations. Shifting the balance of computation and communication, MSM provides key advantages over the ubiquitous particle–mesh Ewald (PME) method, offering better scaling on parallel computers and permitting more modeling flexibility, with support for periodic systems as does PME but also for semiperiodic and nonperiodic systems. The version of MSM available in the simulation program NAMD is described, and its performance and accuracy are compared with the PME method. The accuracy feasible for MSM in practical applications reproduces PME results for water property calculations of density, diffusion constant, dielectric constant, surface tension, radial distribution function, and distance-dependent Kirkwood factor, even though the numerical accuracy of PME is higher than that of MSM. Excellent agreement between MSM and PME is found also for interface potentials of air–water and membrane–water interfaces, where long-range Coulombic interactions are crucial. Applications demonstrate also the suitability of MSM for systems with semiperiodic and nonperiodic boundaries. For this purpose, simulations have been performed with periodic boundaries along directions parallel to a membrane surface but not along the surface normal, yielding membrane pore formation induced by an imbalance of charge across the membrane. Using a similar semiperiodic boundary condition, ion conduction through a graphene nanopore driven by an ion gradient has been simulated. Furthermore, proteins have been simulated inside a single spherical water droplet. Finally, parallel scalability results show the ability of MSM to outperform PME when scaling a system of modest size (less than 100 K atoms) to over a thousand processors, demonstrating the suitability of MSM for large-scale parallel simulation.

Visualization of Octree Adaptive Mesh Refinement (AMR) in Astrophysical Simulations

NASA Astrophysics Data System (ADS)

Labadens, M.; Chapon, D.; Pomaréde, D.; Teyssier, R.

2012-09-01

Computer simulations are important in current cosmological research. These simulations run in parallel on thousands of processors and produce huge amounts of data. Adaptive mesh refinement is used to reduce the computing cost while keeping good numerical accuracy in regions of interest. RAMSES is a cosmological code developed by the Commissariat à l'énergie atomique et aux énergies alternatives (English: Atomic Energy and Alternative Energies Commission) which uses octree adaptive mesh refinement. Compared to grid-based AMR, octree AMR has the advantage of fitting the adaptive resolution of the grid very precisely to the local problem complexity. However, this specific octree data type needs dedicated software to be visualized, as generic visualization tools work on Cartesian grid data. This is why the PYMSES software has also been developed by our team. It relies on the Python scripting language to ensure modular and easy access to explore these specific data. In order to take advantage of the high-performance computer which runs the RAMSES simulation, it also uses MPI and multiprocessing to run parallel code. We present our PYMSES software in more detail, with some performance benchmarks. PYMSES currently has two visualization techniques which work directly on the AMR. The first one is a splatting technique, and the second one is a custom ray-tracing technique. Both have their own advantages and drawbacks. We have also compared two parallel programming techniques: the Python multiprocessing library versus MPI. The load balancing strategy has to be smartly defined in order to achieve a good speedup in our computation. Results obtained with this software are illustrated in the context of a massive, 9000-processor parallel simulation of a Milky Way-like galaxy.
Three is much more than two in coarsening dynamics of cyclic competitions

NASA Astrophysics Data System (ADS)

Mitarai, Namiko; Gunnarson, Ivar; Pedersen, Buster Niels; Rosiek, Christian Anker; Sneppen, Kim

2016-04-01

The classical game of rock-paper-scissors has inspired experiments and spatial model systems that address the robustness of biological diversity. In particular, the game nicely illustrates that cyclic interactions allow multiple strategies to coexist for long time intervals. When formulated in terms of a one-dimensional cellular automaton, the spatial distribution of strategies exhibits coarsening with algebraically growing domain size over time, while the two-dimensional version allows domains to break and thereby opens the possibility for long-time coexistence. We consider a quasi-one-dimensional implementation of the cyclic competition, and study the long-term dynamics as a function of rare invasions between parallel linear ecosystems. We find that increasing the complexity from two to three parallel subsystems allows a transition from complete coarsening to an active steady state where the domain size stays finite. We further find that this transition happens irrespective of whether the update is done in parallel for all sites simultaneously or done randomly in sequential order. In both cases, the active state is characterized by localized bursts of dislocations, followed by longer periods of coarsening. In the case of the parallel dynamics, we find that there is another phase transition between the active steady state and the coarsening state within the three-line system when the invasion rate between the subsystems is varied. We identify the critical parameter for this transition and show that the density of active boundaries has critical exponents that are consistent with the directed percolation universality class. On the other hand, numerical simulations with the random sequential dynamics suggest that the system may exhibit an active steady state as long as the invasion rate is finite.
A parallelizable real-time motion tracking algorithm with applications to ultrasonic strain imaging.

PubMed

Jiang, J; Hall, T J

2007-07-07

Ultrasound-based mechanical strain imaging systems utilize signals from conventional diagnostic ultrasound systems to image tissue elasticity contrast that provides new diagnostically valuable information. Previous works (Hall et al 2003 Ultrasound Med. Biol. 29 427, Zhu and Hall 2002 Ultrason. Imaging 24 161) demonstrated that uniaxial deformation with minimal elevation motion is preferred for breast strain imaging and real-time strain image feedback to operators is important to accomplish this goal. The work reported here enhances the real-time speckle tracking algorithm with two significant modifications. One fundamental change is that the proposed algorithm is a column-based algorithm (a column is defined by a line of data parallel to the ultrasound beam direction, i.e. an A-line), as opposed to a row-based algorithm (a row is defined by a line of data perpendicular to the ultrasound beam direction). Then, displacement estimates from its adjacent columns provide good guidance for motion tracking in a significantly reduced search region to reduce computational cost. Consequently, the process of displacement estimation can be naturally split into at least two separated tasks, computed in parallel, propagating outward from the center of the region of interest (ROI). The proposed algorithm has been implemented and optimized in a Windows system as a stand-alone ANSI C++ program. Results of preliminary tests, using numerical and tissue-mimicking phantoms, and in vivo tissue data, suggest that high contrast strain images can be consistently obtained with frame rates (10 frames/s) that exceed our previous methods.
Alignments of the galaxies in and around the Virgo cluster with the local velocity shear

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lee, Jounghun; Rey, Soo Chang; Kim, Suk, E-mail: jounghun@astro.snu.ac.kr

2014-08-10

Observational evidence is presented for the alignment between the cosmic sheet and the principal axis of the velocity shear field at the position of the Virgo cluster. The galaxies in and around the Virgo cluster from the Extended Virgo Cluster Catalog that was recently constructed by Kim et al. are used to determine the direction of the local sheet. The peculiar velocity field reconstructed from the Sloan Digital Sky Survey Data Release 7 is analyzed to estimate the local velocity shear tensor at the Virgo center. Showing first that the minor principal axis of the local velocity shear tensor is almost parallel to the direction of the line of sight, we detect a clear signal of alignment between the positions of the Virgo satellites and the intermediate principal axis of the local velocity shear projected onto the plane of the sky. Furthermore, the dwarf satellites are found to appear more strongly aligned than their normal counterparts, which is interpreted as an indication of the following. (1) The normal satellites and the dwarf satellites fall in the Virgo cluster preferentially along the local filament and the local sheet, respectively. (2) The local filament is aligned with the minor principal axis of the local velocity shear while the local sheet is parallel to the plane spanned by the minor and intermediate principal axes. Our result is consistent with the recent numerical claim that the velocity shear is a good tracer of the cosmic web.
A Strassen-Newton algorithm for high-speed parallelizable matrix inversion

NASA Technical Reports Server (NTRS)

Bailey, David H.; Ferguson, Helaman R. P.

1988-01-01

Techniques are described for computing matrix inverses by algorithms that are highly suited to massively parallel computation. The techniques are based on an algorithm suggested by Strassen (1969). Variations of this scheme use matrix Newton iterations and other methods to improve the numerical stability while at the same time preserving a very high level of parallelism. One-processor Cray-2 implementations of these schemes range from one that is up to 55 percent faster than a conventional library routine to one that is slower than a library routine but achieves excellent numerical stability. The problem of computing the solution to a single set of linear equations is discussed, and it is shown that this problem can also be solved efficiently using these techniques.
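The abstract does not spell out the iteration it uses. As a minimal sketch, assuming the standard matrix Newton (Newton-Schulz) update X_{k+1} = X_k (2I - A X_k) with the common starting guess X_0 = A^T / (||A||_1 ||A||_inf), the idea can be illustrated as follows. The names `matmul` and `newton_inverse` are hypothetical, and the plain triple-loop product stands in for the Strassen-style multiply the abstract refers to.

```python
def matmul(a, b):
    """Plain matrix product; a Strassen-style multiply could be substituted here."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def newton_inverse(a, iters=30):
    """Approximate the inverse of a square matrix by Newton iteration (assumed form)."""
    n = len(a)
    norm_1 = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))  # max column sum
    norm_inf = max(sum(abs(v) for v in row) for row in a)                # max row sum
    scale = norm_1 * norm_inf
    # Starting guess X0 = A^T / (||A||_1 * ||A||_inf)
    x = [[a[j][i] / scale for j in range(n)] for i in range(n)]
    for _ in range(iters):
        ax = matmul(a, x)
        # correction = 2I - A X
        correction = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
                      for i in range(n)]
        x = matmul(x, correction)
    return x


A = [[4.0, 1.0], [2.0, 3.0]]
X = newton_inverse(A)
product = matmul(A, X)  # should be close to the 2x2 identity
```

Each step costs only matrix products, which is why this kind of scheme maps well onto parallel hardware; the fixed iteration count here is an illustrative choice, not a tuned stopping rule.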
Direct Observation of Parallel Folding Pathways Revealed Using a Symmetric Repeat Protein System

PubMed Central

Aksel, Tural; Barrick, Doug

2014-01-01

Although progress has been made to determine the native fold of a polypeptide from its primary structure, the diversity of pathways that connect the unfolded and folded states has not been adequately explored. Theoretical and computational studies predict that proteins fold through parallel pathways on funneled energy landscapes, although experimental detection of pathway diversity has been challenging. Here, we exploit the high translational symmetry and the direct length variation afforded by linear repeat proteins to directly detect folding through parallel pathways. By comparing folding rates of consensus ankyrin repeat proteins (CARPs), we find a clear increase in folding rates with increasing size and repeat number, although the size of the transition states (estimated from denaturant sensitivity) remains unchanged. The increase in folding rate with chain length, as opposed to a decrease expected from typical models for globular proteins, is a clear demonstration of parallel pathways. This conclusion is not dependent on extensive curve-fitting or structural perturbation of protein structure. By globally fitting a simple parallel-Ising pathway model, we have directly measured nucleation and propagation rates in protein folding, and have quantified the fluxes along each path, providing a detailed energy landscape for folding. This finding of parallel pathways differs from results from kinetic studies of repeat-proteins composed of sequence-variable repeats, where modest repeat-to-repeat energy variation coalesces folding into a single, dominant channel. Thus, for globular proteins, which have much higher variation in local structure and topology, parallel pathways are expected to be the exception rather than the rule. PMID:24988356
Scan Directed Load Balancing for Highly-Parallel Mesh-Connected Computers

DTIC Science & Technology

1991-07-01

AD-A242 045. Scan Directed Load Balancing for Highly-Parallel Mesh-Connected Computers. Edoardo S. Biagioni, Jan F. Prins...Department of Computer Science, University of North Carolina, Chapel Hill, N.C. 27599-3175 USA; biagioni@cs.unc.edu, prins@cs.unc.edu. Abstract: Scan Directed...MasPar Computer Corporation. Bibliography: [1] Edoardo S. Biagioni. Scan Directed Load Balancing. PhD thesis, University of North Carolina, Chapel Hill
Computational Challenges of 3D Radiative Transfer in Atmospheric Models

NASA Astrophysics Data System (ADS)

Jakub, Fabian; Bernhard, Mayer

2017-04-01

The computation of radiative heating and cooling rates is one of the most expensive components in today's atmospheric models. The high computational cost stems not only from the laborious integration over a wide range of the electromagnetic spectrum but also from the fact that solving the integro-differential radiative transfer equation for monochromatic light is already rather involved. This led to the advent of numerous approximations and parameterizations to reduce the cost of the solver. One of the most prominent is the so-called independent pixel approximation (IPA), where horizontal energy transfer is neglected altogether and radiation may only propagate in the vertical direction (1D). Recent studies indicate that the IPA introduces significant errors in high-resolution simulations and affects the evolution and development of convective systems. However, using fully 3D solvers such as Monte Carlo methods is not feasible even on state-of-the-art supercomputers. The parallelization of atmospheric models is often realized by a horizontal domain decomposition, and hence horizontal transfer of energy necessitates communication. For example, at a low zenith angle a cloud will cast a long shadow that potentially needs to be communicated through a multitude of processors. Light in the solar spectral range, especially, may travel long distances through the atmosphere. For highly parallel simulations, it is vital that 3D radiative transfer solvers put a special emphasis on parallel scalability. We present an introduction to the intricacies of computing 3D radiative heating and cooling rates and report on the parallel performance of the TenStream solver. TenStream is a 3D radiative transfer solver using the PETSc framework to iteratively solve a set of partial differential equations. We investigate two matrix preconditioners: (a) geometric algebraic multigrid preconditioning (MG+GAMG) and (b) block Jacobi incomplete LU (ILU) factorization.
The TenStream solver is tested for up to 4096 cores and shows a parallel scaling efficiency of 80-90% on various supercomputers.
A massively parallel computational approach to coupled thermoelastic/porous gas flow problems

NASA Technical Reports Server (NTRS)

Shia, David; Mcmanus, Hugh L.

1995-01-01

A new computational scheme for coupled thermoelastic/porous gas flow problems is presented. Heat transfer, gas flow, and dynamic thermoelastic governing equations are expressed in fully explicit form, and solved on a massively parallel computer. The transpiration cooling problem is used as an example problem. The numerical solutions have been verified by comparison to available analytical solutions. Transient temperature, pressure, and stress distributions have been obtained. Small spatial oscillations in pressure and stress have been observed, which would be impractical to predict with previously available schemes. Comparisons between serial and massively parallel versions of the scheme have also been made. The results indicate that for small scale problems the serial and parallel versions use practically the same amount of CPU time. However, as the problem size increases the parallel version becomes more efficient than the serial version.
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

NASA Technical Reports Server (NTRS)

Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

1992-01-01

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
Spatial attention determines the nature of nonverbal number representation.

PubMed

Hyde, Daniel C; Wood, Justin N

2011-09-01

Coordinated studies of adults, infants, and nonhuman animals provide evidence for two systems of nonverbal number representation: a "parallel individuation" system that represents individual items and a "numerical magnitude" system that represents the approximate cardinal value of a group. However, there is considerable debate about the nature and functions of these systems, due largely to the fact that some studies show a dissociation between small (1-3) and large (>3) number representation, whereas others do not. Using event-related potentials, we show that it is possible to determine which system will represent the numerical value of a small number set (1-3 items) by manipulating spatial attention. Specifically, when attention can select individual objects, an early brain response (N1) scales with the cardinal value of the display, the signature of parallel individuation. In contrast, when attention cannot select individual objects or is occupied by another task, a later brain response (P2p) scales with ratio, the signature of the approximate numerical magnitude system. These results provide neural evidence that small numbers can be represented as approximate numerical magnitudes. Further, they empirically demonstrate the importance of early attentional processes to number representation by showing that the way in which attention disperses across a scene determines which numerical system will deploy in a given context.
Chemical transport in a fissured rock: Verification of a numerical model

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rasmuson, A.; Narasimhan, T. N.; Neretnieks, I.

1982-10-01

Numerical models for simulating chemical transport in fissured rocks constitute powerful tools for evaluating the acceptability of geological nuclear waste repositories. Due to the very long-term, high toxicity of some nuclear waste products, the models are required to predict, in certain cases, the spatial and temporal distribution of chemical concentrations less than 0.001% of the concentration released from the repository. Whether numerical models can provide such accuracies is a major question addressed in the present work. To this end, we have verified a numerical model, TRUMP, which solves the advective diffusion equation in general three dimensions with or without decay and source terms. The method is based on an integrated finite-difference approach. The model was verified against known analytic solutions of the one-dimensional advection-diffusion problem as well as the problem of advection-diffusion in a system of parallel fractures separated by spherical particles. The studies show that as long as the magnitude of advectance is equal to or less than that of conductance for the closed surface bounding any volume element in the region (that is, numerical Peclet number <2), the numerical method can indeed match the analytic solution within errors of $\pm 10^{-3}$% or less. The realistic input parameters used in the sample calculations suggest that such a range of Peclet numbers is indeed likely to characterize deep groundwater systems in granitic and ancient argillaceous systems. Thus TRUMP in its present form does provide a viable tool for use in nuclear waste evaluation studies. A sensitivity analysis based on the analytic solution suggests that the errors in prediction introduced by uncertainties in input parameters are likely to be larger than the computational inaccuracies introduced by the numerical model.
Currently, a disadvantage of the TRUMP model is that the iterative method of solving the set of simultaneous equations is rather slow when time constants vary widely over the flow region. Although the iterative solution may be very desirable for large three-dimensional problems in order to minimize computer storage, it seems desirable to use a direct solver technique in conjunction with the mixed explicit-implicit approach whenever possible. Work in this direction is in progress.
Massively parallel GPU-accelerated minimization of classical density functional theory

NASA Astrophysics Data System (ADS)

Stopper, Daniel; Roth, Roland

2017-08-01

In this paper, we discuss the ability to numerically minimize the grand potential of hard disks in two-dimensional and of hard spheres in three-dimensional space within the framework of classical density functional and fundamental measure theory on modern graphics cards. Our main finding is that a massively parallel minimization leads to an enormous performance gain in comparison to standard sequential minimization schemes. Furthermore, the results indicate that in complex multi-dimensional situations, a heavy parallel minimization of the grand potential seems to be mandatory in order to reach a reasonable balance between accuracy and computational cost.
Passive scalars: Mixing, diffusion, and intermittency in helical and nonhelical rotating turbulence

NASA Astrophysics Data System (ADS)

Imazio, P. Rodriguez; Mininni, P. D.

2017-03-01

We use direct numerical simulations to compute structure functions, scaling exponents, probability density functions, and effective transport coefficients of passive scalars in turbulent rotating helical and nonhelical flows. We show that helicity affects the inertial range scaling of the velocity and of the passive scalar when rotation is present, with a spectral law consistent with $\sim k_\perp^{-1.4}$ for the passive scalar variance spectrum. This scaling law is consistent with a phenomenological argument [P. Rodriguez Imazio and P. D. Mininni, Phys. Rev. E 83, 066309 (2011), 10.1103/PhysRevE.83.066309] for rotating nonhelical flows, which follows directly from Kolmogorov-Obukhov scaling and states that if energy follows a law $E(k) \sim k^{-n}$, then the passive scalar variance follows a law $V(k) \sim k^{-n_\theta}$ with $n_\theta = (5-n)/2$. With the second-order scaling exponent obtained from this law, and using the Kraichnan model, we obtain anomalous scaling exponents for the passive scalar that are in good agreement with the numerical results. Multifractal intermittency models are also considered. Intermittency of the passive scalar is stronger than in the nonhelical rotating case, a result that is also confirmed by stronger non-Gaussian tails in the probability density functions of field increments. Finally, Fick's law is used to compute the effective diffusion coefficients in the directions parallel and perpendicular to rotation. Calculations indicate that horizontal diffusion decreases in the presence of helicity in rotating flows, while vertical diffusion increases. A simple mean field argument explains this behavior in terms of the amplitude of velocity fluctuations.
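As a short worked check of the relation between the energy exponent n and the scalar exponent n_θ, the lines below evaluate it for the classical Kolmogorov value and invert it for the reported scalar slope. The inversion is only arithmetic on the stated relation; the abstract itself does not quote the resulting energy exponent.

```latex
% Relation quoted in the abstract
n_\theta = \frac{5 - n}{2}

% Kolmogorov energy spectrum, n = 5/3, recovers the Obukhov-Corrsin scalar spectrum
n_\theta = \frac{5 - 5/3}{2} = \frac{10/3}{2} = \frac{5}{3}

% Inverting for the energy exponent implied by a scalar slope n_\theta \approx 1.4
n = 5 - 2 n_\theta \approx 5 - 2(1.4) = 2.2
```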
Physical effects of magnetic fields on the Kelvin-Helmholtz instability in a free shear layer

NASA Astrophysics Data System (ADS)

Liu, Y.; Chen, Z. H.; Zhang, H. H.; Lin, Z. Y.

2018-04-01

The Kelvin-Helmholtz instability of a parallel shear flow with a hyperbolic-tangent velocity profile has been simulated numerically at a high Reynolds number. The fluid is perfectly conducting with low viscosity, and the strength of the applied magnetic field varies from weak to strong. We found that the magnetic field parallel to the mainstream direction has a stabilizing effect on the shear flow. The magnetic field mainly stabilizes short-wave perturbations. Small viscosity and/or slight compressibility could introduce some instability under certain circumstances, even in the presence of a strong magnetic field. The suppressing effect of the magnetic field on the instability arises from two contributions: the separating effect of the transverse magnetic pressure and the anti-bending effect of magnetic tension pointing to the center of curvature. The former has a markedly stronger effect on the fluid interface than the latter, which differs from the conventional opinion that magnetic tension dominates. Essentially, it is mainly the Maxwell stress that weakens and balances the momentum transport conducted by the Reynolds stress, reducing the degree of mixing between the upper fluid and the lower fluid.
BPF-type region-of-interest reconstruction for parallel translational computed tomography.

PubMed

Wu, Weiwen; Yu, Hengyong; Wang, Shaoyu; Liu, Fenglin

2017-01-01

The objective of this study is to present and test a new ultra-low-cost linear-scan-based tomography architecture. Similar to linear tomosynthesis, the source and detector are translated in opposite directions and the data acquisition system targets a region-of-interest (ROI) to acquire data for image reconstruction. This kind of tomographic architecture was named parallel translational computed tomography (PTCT). In previous studies, filtered backprojection (FBP)-type algorithms were developed to reconstruct images from PTCT. However, the ROI images reconstructed from truncated projections have severe truncation artifacts. To overcome this limitation, we propose two backprojection filtering (BPF)-type algorithms, named MP-BPF and MZ-BPF, to reconstruct ROI images from truncated PTCT data. A weight function is constructed to deal with data redundancy for multi-linear translation modes. Extensive numerical simulations are performed to evaluate the proposed MP-BPF and MZ-BPF algorithms for PTCT in fan-beam geometry. Qualitative and quantitative results demonstrate that the proposed BPF-type algorithms can not only reconstruct ROI images from truncated projections more accurately but also generate high-quality images for the entire image support in some circumstances.
On some Aitken-like acceleration of the Schwarz method

NASA Astrophysics Data System (ADS)

Garbey, M.; Tromeur-Dervout, D.

2002-12-01

In this paper we present a family of domain decomposition methods based on Aitken-like acceleration of the Schwarz method, seen as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators. The solver can be a direct solver when applied to the Helmholtz problem with a five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant, which is an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or multigrid methods for more general linear and nonlinear elliptic problems. However, the salient feature of our method is that our algorithm has high tolerance to slow networks in the context of distributed parallel computing and is attractive, generally speaking, for computer architectures whose performance is limited by memory bandwidth rather than by the flop performance of the CPU. This is nowadays the case for most parallel computers using the RISC processor architecture. We illustrate this highly desirable property of our algorithm with large-scale computing experiments.
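The acceleration idea can be illustrated on a scalar sequence. The sketch below assumes the classical Aitken delta-squared formula and uses a hypothetical fixed-point map `step` as a stand-in for one Schwarz sweep acting on an interface value; it is not the authors' solver. When the error contracts by a constant factor at every step, three iterates are enough to recover the limit exactly.

```python
def aitken(x0, x1, x2):
    """Aitken delta-squared extrapolation from three consecutive iterates."""
    denom = x2 - 2.0 * x1 + x0
    if denom == 0.0:
        # Sequence has already stagnated; nothing to extrapolate.
        return x2
    return x0 - (x1 - x0) ** 2 / denom


def step(x):
    """Hypothetical stand-in for one Schwarz sweep: linear convergence to 2.0."""
    return 0.5 * x + 1.0


x0 = 0.0
x1 = step(x0)
x2 = step(x1)
limit = aitken(x0, x1, x2)  # extrapolated limit of the sequence
```

Because the extrapolation needs only a few iterates, far fewer interface exchanges are required than with the plain iteration, which is consistent with the tolerance to slow networks that the abstract emphasizes.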
Anisotropic surface-state-mediated RKKY interaction between adatoms on a hexagonal lattice

NASA Astrophysics Data System (ADS)

Patrone, Paul N.; Einstein, T. L.

2012-01-01

Motivated by recent numerical studies of Ag on Pt(111), we derive an expression for the RKKY interaction mediated by surface states, considering the effect of anisotropy in the Fermi edge. Our analysis is based on a stationary phase approximation. The main contribution to the interaction comes from electrons whose Fermi velocity $v_F$ is parallel to the vector R connecting the interacting adatoms; we show that, in general, the corresponding Fermi wave vector $k_F$ is not parallel to R. The interaction is oscillatory; the amplitude and wavelength of oscillations have angular dependence arising from the anisotropy of the surface-state band structure. The wavelength, in particular, is determined by the projection of this $k_F$ (corresponding to $v_F$) onto the direction of R. Our analysis is easily generalized to other systems. For Ag on Pt(111), our results indicate that the RKKY interaction between pairs of adatoms should be nearly isotropic and so cannot account for the anisotropy found in the studies motivating our work. However, for metals with surface-state dispersions similar to Be(10$\bar{1}$0), we show that the RKKY interaction should have considerable anisotropy.
Radio-over-fiber system with octuple frequency optical millimeter-wave signal generation using dual-parallel Mach-Zehnder modulator based on four-wave mixing in semiconductor optical amplifier

NASA Astrophysics Data System (ADS)

Zhou, Hui; Zeng, Yuting; Chen, Ming; Shen, Yunlong

2018-03-01

We propose a radio-over-fiber (RoF) system employing a dual-parallel Mach-Zehnder modulator (DP-MZM) based on four-wave mixing (FWM) in a semiconductor optical amplifier (SOA). In this scheme, the pump and the signal are generated by properly adjusting the direct current bias, the modulation index of the DP-MZM, and the phase difference between the sub-MZMs. Because the pump and the signal derive from the same optical wave, the two lightwaves are copolarized. The single-pump FWM is polarization insensitive. After FWM and optical filtering, the optical millimeter-wave with octuple frequency is generated. An RoF system at about 40 GHz with a 2.5-Gbit/s signal is implemented by numerical simulation; the result shows that it performs well after the signal is transmitted over 40 km of single-mode fiber. Then, the effects of the SOA's injection current and the carrier-to-sideband ratio on the system performance are discussed by simulation, and the optimum value for the system is obtained.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Popovich, P.; Carter, T. A.; Friedman, B.

Numerical simulation of plasma turbulence in the Large Plasma Device (LAPD) [W. Gekelman, H. Pfister, Z. Lucky et al., Rev. Sci. Instrum. 62, 2875 (1991)] is presented. The model, implemented in the BOUndary Turbulence code [M. Umansky, X. Xu, B. Dudson et al., Contrib. Plasma Phys. 180, 887 (2009)], includes three-dimensional (3D) collisional fluid equations for plasma density, electron parallel momentum, and current continuity, and also includes the effects of ion-neutral collisions. In nonlinear simulations using measured LAPD density profiles but assuming constant temperature profile for simplicity, self-consistent evolution of instabilities and nonlinearly generated zonal flows results in a saturated turbulent state. Comparisons of these simulations with measurements in LAPD plasmas reveal good qualitative and reasonable quantitative agreement, in particular in frequency spectrum, spatial correlation, and amplitude probability distribution function of density fluctuations. For comparison with LAPD measurements, the plasma density profile in simulations is maintained either by direct azimuthal averaging on each time step, or by adding particle source/sink function. The inferred source/sink values are consistent with the estimated ionization source and parallel losses in LAPD. These simulations lay the groundwork for a more comprehensive effort to test fluid turbulence simulation against LAPD data.

Robust recognition of handwritten numerals based on dual cooperative network

NASA Technical Reports Server (NTRS)

Lee, Sukhan; Choi, Yeongwoo

1992-01-01

An approach to robust recognition of handwritten numerals using two networks operating in parallel is presented. The first network uses inputs in Cartesian coordinates, and the second network uses the same inputs transformed into polar coordinates. How the proposed approach achieves robustness to local and global variations of input numerals by handling inputs both in Cartesian coordinates and in their transformed polar coordinates is described. The required network structures and their learning scheme are discussed. Experimental results show that by tracking only a small number of distinctive features for each teaching numeral in each coordinate system, the proposed system can provide robust recognition of handwritten numerals.
Percolation and permeability of heterogeneous fracture networks

NASA Astrophysics Data System (ADS)

Adler, Pierre; Mourzenko, Valeri; Thovert, Jean-François

2013-04-01

Natural fracture fields are almost necessarily heterogeneous, with a fracture density that varies in space. Two classes of variations are quite frequent. In the first one, the fracture density decreases from a given surface; the fracture density is usually (but not always; see [1]) an exponential function of depth, as has been shown by many measurements. Another important example of such an exponential decrease is the Excavated Damaged Zone (EDZ), which is created by the excavation process of a gallery [2,3]. In the second one, the fracture density undergoes some local random variations around an average value. This presentation is mostly focused on the first class, and numerical samples are generated with an exponentially decreasing density from a given plane surface. Their percolation status and hydraulic transmissivity can be calculated by the numerical codes which are detailed in [4]. Percolation is determined by a pseudo-diffusion algorithm. Flow determination necessitates the meshing of the fracture networks and the discretisation of the Darcy equation by a finite volume technique; the resulting linear system is solved by a conjugate gradient algorithm. Only the flow properties of the EDZ along the directions which are parallel to the wall are of interest when a pressure gradient parallel to the wall is applied. The transmissivity T, which relates the total flow rate per unit width Q along the wall through the whole fractured medium to the pressure gradient ∇p, is defined by Q = -(T/μ) ∇p, where μ is the fluid viscosity. The percolation status and hydraulic transmissivity are systematically determined for a wide range of decay lengths and anisotropy parameters. They can be modeled by comparison with anisotropic fracture networks with a constant density. A heuristic power-law model is proposed which accurately describes the results for the percolation threshold over the whole investigated range of heterogeneity and anisotropy.
Then, the data for transmissivity are presented. A simple parallel flow model is introduced. The flow properties of the medium vary with the distance z from the wall. However, the macroscopic pressure gradient does not depend on z, and the flow lines are on average parallel to the wall. Hence, the overall transmissivity is tentatively estimated by a parallel flow model, where a layer at depth z behaves as a fractured medium with uniform properties corresponding to the state at this position in the medium. It yields an explicit analytical expression for the transmissivity as a function of the heterogeneity and anisotropy parameters, and it successfully accounts for all the numerical data. Graphical tools are provided from which first estimates can be quickly and easily obtained. A short overview of the second class of heterogeneous media will be given. [1] Barton C.A., Zoback M.D., J. Geophys. Res., 97B, 5181-5200 (1992). [2] Bossart P. et al, Eng. Geol., vol. 66, 19-38 (2002). [3] Thovert J.-F. et al, Eng. Geol., 117, 39-51 (2011). [4] Adler P.M. et al, Fractured porous media, Oxford U. Press, 2012.
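The parallel flow idea in the abstract above can be illustrated with a short Python sketch: each layer at depth z contributes its local permeability, and the contributions are summed over depth. The abstract does not give the local permeability law, so the linear-above-threshold function `local_permeability`, the threshold of 1, and all numerical values here are hypothetical choices made only so that the integral has a closed form to check against.

```python
import math

def local_permeability(density, threshold=1.0, k0=1.0):
    """Hypothetical law: zero below the percolation threshold, linear above it."""
    return k0 * max(density - threshold, 0.0)

def parallel_flow_transmissivity(rho0, decay_length, z_max, n=20000):
    """Midpoint-rule integral of the local permeability over depth z in [0, z_max]."""
    dz = z_max / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * dz
        # Fracture density decays exponentially with distance from the wall.
        density = rho0 * math.exp(-z / decay_length)
        total += local_permeability(density) * dz
    return total

rho0, decay_length = 4.0, 2.0
numeric = parallel_flow_transmissivity(rho0, decay_length, z_max=10.0)
# Closed form for the assumed linear law with threshold 1 and k0 = 1.
exact = decay_length * (rho0 - 1.0 - math.log(rho0))
```

Under these assumptions only the layers shallower than `decay_length * log(rho0)` carry flow, which is why the closed form depends on the logarithm of the surface density.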
Etude de la Generation des Ultrasons Par Laser dans un Materiau Composite

NASA Astrophysics Data System (ADS)

Dubois, Marc

Laser generation of ultrasound is not a new subject. Many authors have proposed mathematical models of the thermoelastic generation of acoustic waves. However, none of those models, up to now, could simultaneously take into account the effects of thermal conduction, optical penetration, the anisotropy of the material, and arbitrary temporal and surface profiles of the laser excitation. The model presented in this work takes all these parameters into consideration in the case of an infinite orthotropic plate. The mathematical approach used yields an analytical solution for the mechanical displacement field in the Laplace and two-dimensional (2-D) Fourier spaces. Numerical inverse Laplace and 2-D Fourier transformations bring the mechanical displacement field back into the normal spaces. The use of direct numerical transformations makes it possible to consider almost any temporal and spatial distribution of the generation laser beam. The acoustic displacements calculated by this model have been compared to experimental displacements measured with a wide-band optical detection system. The features of this system allow the quantitative measurement of the displacements parallel and normal to the surface of the sample. Hence, the calculated normal and parallel displacements have been compared to those experimentally measured at various locations on aluminum, glass and polymer samples. In all cases, the agreement between the calculated and experimentally measured displacements was good. With its validity established, the semi-analytical model has been used, in addition to a completely analytical one-dimensional model, to study the effects of the optical penetration and the laser pulse duration on the longitudinal acoustic wave generated.
This study has established that a sufficiently short laser pulse and an irradiated area that is large with regard to the sample thickness make it possible to determine quantitatively, from the full width at half maximum of the acoustic pulse, the optical penetration depth at the wavelength of the generation laser inside the material. This semi-analytical model has also made it possible to analyze the effects of the optical penetration on the directivity patterns of the longitudinal and shear waves generated by a thermoelastic source. This study has clearly shown that the optical penetration significantly modifies the longitudinal wave directivity pattern, but has only weak effects on that of the shear wave. (Abstract shortened by UMI.).
Numerical analysis of propeller induced ground vortices by actuator disk model.

PubMed

Yang, Y; Veldhuis, L L M; Eitelberg, G

2018-01-01

During the ground operation of aircraft, the interaction between the propulsor-induced flow field and the ground may lead to the generation of ground vortices. Utilizing numerical approaches, the source of vorticity entering ground vortices is investigated. The results show that the production of wall-parallel components of vorticity has a strong contribution from the wall-parallel components of the pressure gradient on the wall, which is generated by the action of the propulsor. This mechanism supplements the vorticity transported from the far-field boundary layer, which has been assumed to be the main vorticity source in a number of previous publications. Furthermore, the quantitative prediction of the occurrence of ground vortices is performed from the numerical results. As the distance of the propeller from the ground decreases, and as the thrust of the propeller increases, ground vortices are generated from the ground and enter the propeller. In addition, the vortices which exist near the ground but do not enter the propeller plane are observed and visualized by three-dimensional data.
Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

NASA Technical Reports Server (NTRS)

Logan, Terry G.

1994-01-01

The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Some performance results are obtained on the CM-5 with 32, 62, and 128 nodes along with those on the Cray-YMP with a single processor. The comparison of the performance indicates that the parallel CM-FORTRAN code nearly matches or outperforms the equivalent serial FORTRAN code in some cases.
Symposium on Parallel Computational Methods for Large-scale Structural Analysis and Design, 2nd, Norfolk, VA, US

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O. (Editor); Housner, Jerrold M. (Editor)

1993-01-01

Computing speed is leaping forward by several orders of magnitude each decade. Engineers and scientists gathered at a NASA Langley symposium to discuss these exciting trends as they apply to parallel computational methods for large-scale structural analysis and design. Among the topics discussed were: large-scale static analysis; dynamic, transient, and thermal analysis; domain decomposition (substructuring); and nonlinear and numerical methods.
Electromagnetic pulse coupling through an aperture into a two-parallel-plate region

NASA Technical Reports Server (NTRS)

Rahmat-Samii, Y.

1978-01-01

Analysis of electromagnetic-pulse (EMP) penetration via apertures into cavities is an important study in designing hardened systems. In this paper, an integral equation procedure is developed for determining the frequency and consequently the time behavior of the field inside a two-parallel-plate region excited through an aperture by an EMP. Some discussion of the numerical results is also included in the paper for completeness.
Rapid Prediction of Unsteady Three-Dimensional Viscous Flows in Turbopump Geometries

NASA Technical Reports Server (NTRS)

Dorney, Daniel J.

1998-01-01

A program is underway to improve the efficiency of a three-dimensional Navier-Stokes code and generalize it for nozzle and turbopump geometries. Code modifications have included the implementation of parallel processing software, incorporation of new physical models and generalization of the multiblock capability. The final report contains details of code modifications, numerical results for several nozzle and turbopump geometries, and the implementation of the parallelization software.
LASER APPLICATIONS AND OTHER TOPICS IN QUANTUM ELECTRONICS: Application of the stochastic parallel gradient descent algorithm for numerical simulation and analysis of the coherent summation of radiation from fibre amplifiers

NASA Astrophysics Data System (ADS)

Zhou, Pu; Wang, Xiaolin; Li, Xiao; Chen, Zilum; Xu, Xiaojun; Liu, Zejin

2009-10-01

Coherent summation of fibre laser beams, which can be scaled to a relatively large number of elements, is simulated by using the stochastic parallel gradient descent (SPGD) algorithm. The applicability of this algorithm for coherent summation is analysed and its optimisation parameters and bandwidth limitations are studied.
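As a rough illustration of how SPGD can phase-lock several beams, the Python sketch below applies a two-sided SPGD update to the phases of N unit-amplitude beams. The metric (normalised on-axis intensity), the gain, the perturbation amplitude, the beam count and the iteration count are all illustrative assumptions, not parameters taken from the paper.

```python
import cmath
import random

def combined_intensity(phases):
    """Normalised on-axis intensity of N unit-amplitude beams (1.0 = fully in phase)."""
    field = sum(cmath.exp(1j * p) for p in phases)
    return abs(field) ** 2 / len(phases) ** 2

def spgd(phases, gain=10.0, delta=0.1, iterations=2000, rng=None):
    """Two-sided SPGD: perturb every channel at once, then step along the metric change."""
    rng = rng or random.Random(0)
    u = list(phases)
    for _ in range(iterations):
        # Random +/- delta perturbation applied to all channels in parallel.
        d = [delta * rng.choice((-1.0, 1.0)) for _ in u]
        j_plus = combined_intensity([ui + di for ui, di in zip(u, d)])
        j_minus = combined_intensity([ui - di for ui, di in zip(u, d)])
        dj = j_plus - j_minus
        # Ascent step, since the metric is to be maximised.
        u = [ui + gain * dj * di for ui, di in zip(u, d)]
    return u

start_rng = random.Random(1)
start = [start_rng.uniform(-cmath.pi, cmath.pi) for _ in range(8)]
final = spgd(start)
```

Only the scalar metric is measured on each iteration, which is what lets the method scale to many channels: no per-channel phase measurement is needed.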
Advanced Numerical Techniques of Performance Evaluation. Volume 1

DTIC Science & Technology

1990-06-01

system scheduling thread. The scheduling thread then runs any other ready thread that can be found. A thread can only sleep or switch out on itself...Polychronopoulos and D.J. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Transactions on Computers C...Kuck 1987] C.D. Polychronopoulos and D.J. Kuck. Guided Self-Scheduling: A Practical Scheduling Scheme for Parallel Supercomputers. IEEE Trans. on Comp
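The excerpt above cites guided self-scheduling without describing it. As a minimal sketch of the basic rule from the cited Polychronopoulos and Kuck paper, each idle processor takes ceil(remaining / P) loop iterations, so chunks start large and shrink towards the end of the loop. The function name and the example sizes are illustrative, and this is not the report's own implementation.

```python
import math

def guided_self_scheduling(total_iterations, processors):
    """Return the chunk sizes handed out: each request takes ceil(remaining / P)."""
    chunks = []
    remaining = total_iterations
    while remaining > 0:
        chunk = math.ceil(remaining / processors)
        chunks.append(chunk)
        remaining -= chunk
    return chunks

chunks = guided_self_scheduling(100, 4)
```

For 100 iterations on 4 processors the first requests receive 25, 19 and 14 iterations, and the final requests receive a single iteration each, which helps balance the finishing times.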
Control of parallel manipulators using force feedback

NASA Technical Reports Server (NTRS)

Nanua, Prabjot

1994-01-01

Two control schemes are compared for parallel robotic mechanisms actuated by hydraulic cylinders. One scheme, the 'rate based scheme', uses the position and rate information only for feedback. The second scheme, the 'force based scheme' feeds back the force information also. The force control scheme is shown to improve the response over the rate control one. It is a simple constant gain control scheme better suited to parallel mechanisms. The force control scheme can be easily modified for the dynamic forces on the end effector. This paper presents the results of a computer simulation of both the rate and force control schemes. The gains in the force based scheme can be individually adjusted in all three directions, whereas the adjustment in just one direction of the rate based scheme directly affects the other two directions.
Reactor Dosimetry Applications Using RAPTOR-M3G: a New Parallel 3-D Radiation Transport Code

NASA Astrophysics Data System (ADS)

Longoni, Gianluca; Anderson, Stanwood L.

2009-08-01

The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper will discuss the development of RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based on domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained. Section 3 addresses the parallel performance of the code, and Section 4 concludes this paper with final remarks and future work.
A parallel computing engine for a class of time critical processes.

PubMed

Nabhan, T M; Zomaya, A Y

1997-01-01

This paper focuses on the efficient parallel implementation of numerically intensive systems over loosely coupled multiprocessor architectures. These analytical models are of significant importance to many real-time systems that have to meet severe time constraints. A parallel computing engine (PCE) has been developed in this work for the efficient simplification and the near-optimal scheduling of numerical models over the different cooperating processors of the parallel computer. First, the analytical system is efficiently coded in its general form. The model is then simplified by using any available information (e.g., constant parameters). A task graph representing the interconnections among the different components (or equations) is generated. The graph can then be compressed to control the computation/communication requirements. The task scheduler employs a graph-based iterative scheme, based on the simulated annealing algorithm, to map the vertices of the task graph onto a Multiple-Instruction-stream Multiple-Data-stream (MIMD) type of architecture. The algorithm uses a nonanalytical cost function that properly considers the computation capability of the processors, the network topology, the communication time, and congestion possibilities. Moreover, the proposed technique is simple, flexible, and computationally viable. The efficiency of the algorithm is demonstrated by two case studies with good results.
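To make the mapping step concrete, here is a minimal Python sketch of simulated annealing that assigns task-graph vertices to processors. The cost function (load of the busiest processor plus a fixed penalty per edge that crosses processors) is a deliberately simplified stand-in for the paper's nonanalytical cost function; it ignores network topology and congestion. The task weights, edges, and annealing schedule are hypothetical.

```python
import math
import random

def schedule_cost(assignment, work, edges, comm_cost=1.0):
    """Illustrative cost: busiest processor's load plus a penalty per cross-processor edge."""
    loads = {}
    for task, proc in enumerate(assignment):
        loads[proc] = loads.get(proc, 0.0) + work[task]
    cut = sum(1 for a, b in edges if assignment[a] != assignment[b])
    return max(loads.values()) + comm_cost * cut

def anneal(work, edges, processors, steps=5000, t0=5.0, cooling=0.999, seed=0):
    """Move one task at a time; accept worse mappings with a temperature-dependent probability."""
    rng = random.Random(seed)
    current = [rng.randrange(processors) for _ in work]
    cost = schedule_cost(current, work, edges)
    best, best_cost = list(current), cost
    temp = t0
    for _ in range(steps):
        task = rng.randrange(len(work))
        old = current[task]
        current[task] = rng.randrange(processors)
        new_cost = schedule_cost(current, work, edges)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
        else:
            current[task] = old  # reject the move
        temp *= cooling
    return best, best_cost

work = [4, 3, 3, 2, 2, 1]
edges = [(0, 1), (0, 2), (1, 3), (2, 4), (3, 5), (4, 5)]
mapping, cost = anneal(work, edges, processors=2)
```

Placing every task on one processor costs 15 in this toy problem, so any mapping the annealer returns below that value reflects a trade-off between load balance and communication.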
Bounds on the attractor dimension for magnetohydrodynamic channel flow with parallel magnetic field at low magnetic Reynolds number.

PubMed

Low, R; Pothérat, A

2015-05-01

We investigate aspects of low-magnetic-Reynolds-number flow between two parallel, perfectly insulating walls in the presence of an imposed magnetic field parallel to the bounding walls. We find a functional basis to describe the flow, well adapted to the problem of finding the attractor dimension and which is also used in subsequent direct numerical simulation of these flows. For given Reynolds and Hartmann numbers, we obtain an upper bound for the dimension of the attractor by means of known bounds on the nonlinear inertial term and this functional basis for the flow. Three distinct flow regimes emerge: a quasi-isotropic three-dimensional (3D) flow, a nonisotropic 3D flow, and a 2D flow. We find the transition curves between these regimes in the space parametrized by Hartmann number Ha and attractor dimension d(att). We find how the attractor dimension scales as a function of Reynolds and Hartmann numbers (Re and Ha) in each regime. We also investigate the thickness of the boundary layer along the bounding wall and find that in all regimes this scales as 1/Re, independently of the value of Ha, unlike Hartmann boundary layers found when the field is normal to the channel. The structure of the set of least dissipative modes is indeed quite different between these two cases but the properties of turbulence far from the walls (smallest scales and number of degrees of freedom) are found to be very similar.
Scale dependence of the alignment between strain rate and rotation in turbulent shear flow

NASA Astrophysics Data System (ADS)

Fiscaletti, D.; Elsinga, G. E.; Attili, A.; Bisetti, F.; Buxton, O. R. H.

2016-10-01

The scale dependence of the statistical alignment tendencies of the eigenvectors of the strain-rate tensor e_i with the vorticity vector ω is examined in the self-preserving region of a planar turbulent mixing layer. Data from a direct numerical simulation are filtered at various length scales and the probability density functions of the magnitude of the alignment cosines between the two unit vectors |e_i · ω̂| are examined. It is observed that the alignment tendencies are insensitive to the concurrent large-scale velocity fluctuations, but are quantitatively affected by the nature of the concurrent large-scale velocity-gradient fluctuations. It is confirmed that the small-scale (local) vorticity vector is preferentially aligned in parallel with the large-scale (background) extensive strain-rate eigenvector e_1, in contrast to the global tendency for ω to be aligned in parallel with the intermediate strain-rate eigenvector [Hamlington et al., Phys. Fluids 20, 111703 (2008), 10.1063/1.3021055]. When only data from regions of the flow that exhibit strong swirling are included, the so-called high-enstrophy worms, the alignment tendencies are exaggerated with respect to the global picture. These findings support the notion that the production of enstrophy, responsible for a net cascade of turbulent kinetic energy from large scales to small scales, is driven by vorticity stretching due to the preferential parallel alignment between ω and nonlocal e_1 and that the strongly swirling worms are kinematically significant to this process.
Network selection, Information filtering and Scalable computation

NASA Astrophysics Data System (ADS)

Ye, Changqing

This dissertation explores two application scenarios of the sparsity pursuit method on large-scale data sets. The first scenario is classification and regression in analyzing high-dimensional structured data, where predictors correspond to nodes of a given directed graph. This arises in, for instance, identification of disease genes for Parkinson's disease from a network of candidate genes. In such a situation, the directed graph describes dependencies among the genes, where the directions of edges represent certain causal effects. Key to high-dimensional structured classification and regression is how to utilize dependencies among predictors as specified by the directions of the graph. In this dissertation, we develop a novel method that fully takes into account such dependencies, formulated through certain nonlinear constraints. We apply the proposed method to two applications, feature selection in large-margin binary classification and in linear regression. We implement the proposed method through difference convex programming for the cost function and constraints. Finally, theoretical and numerical analyses suggest that the proposed method achieves the desired objectives. An application to disease gene identification is presented. The second application scenario is personalized information filtering, which extracts the information specifically relevant to a user, predicting his/her preference over a large number of items, based on the opinions of users who think alike or on item content. This problem is cast into the framework of regression and classification, where we introduce novel partial latent models to integrate additional user-specific and content-specific predictors, for higher predictive accuracy. In particular, we factorize a user-over-item preference matrix into a product of two matrices, each representing a user's preference and an item preference by users.
Then we propose a likelihood method to seek a sparsest latent factorization, from a class of over-complete factorizations, possibly with a high percentage of missing values. This promotes additional sparsity beyond rank reduction. Computationally, we design methods based on a "decomposition and combination" strategy, to break large-scale optimization into many small subproblems to solve in a recursive and parallel manner. On this basis, we implement the proposed methods through multi-platform shared-memory parallel programming, and through Mahout, a library for scalable machine learning and data mining, for MapReduce computation. For example, our methods are scalable to a dataset consisting of three billion observations on a single machine with sufficient memory, with good timings. Both theoretical and numerical investigations show that the proposed methods exhibit significant improvement in accuracy over state-of-the-art scalable methods.
Numerical modelling of strain in lava tubes

NASA Astrophysics Data System (ADS)

Merle, Olivier

The strain within lava tubes is described in terms of pipe flow. Strain is partitioned into three components: (a) two simple shear components acting from top to bottom and from side to side of a rectangular tube in transverse section; and (b) a pure shear component corresponding to vertical shortening in a deflating flow and horizontal compression in an inflating flow. The sense of shear of the two simple shear components is reversed on either side of a central zone of no shear. Results of numerical simulations of strain within lava tubes reveal a concentric pattern of flattening planes in section normal to the flow direction. The central node is a zone of low strain, which increases toward the lateral borders. Sections parallel to the flow show obliquity of the flattening plane to the flow axis, constituting an imbrication. The strain ellipsoid is generally of plane strain type, but can be of constriction or flattening type if thinning (i.e. deflating flow) or thickening (i.e. inflating flow) is superimposed on the simple shear regime. The strain pattern obtained from numerical simulation is then compared with several patterns recently described in natural lava flows. It is shown that the strain pattern revealed by AMS studies or crystal preferred orientations is remarkably similar to the numerical simulation. However, some departure from the model is found in AMS measurements. This may indicate inherited strain recorded during early stages of the flow or some limitation of the AMS technique.
Modeling electrokinetics in ionic liquids: General

DOE PAGES

Wang, Chao; Bao, Jie; Pan, Wenxiao; ...

2017-04-01

Using direct numerical simulations, we provide a thorough study of the electrokinetics of ionic liquids. In particular, modified Poisson–Nernst–Planck equations are solved to capture the crowding and overscreening effects characteristic of an ionic liquid. For modeling electrokinetic flows in an ionic liquid, the modified Poisson–Nernst–Planck equations are coupled with the Navier–Stokes equations to study the coupling of ion transport, hydrodynamics, and electrostatic forces. Specifically, we consider the ion transport between two parallel charged surfaces, charging dynamics in a nanopore, capacitance of electric double-layer capacitors, electroosmotic flow in a nanochannel, electroconvective instability on a plane ion-selective surface, and electroconvective flow on a curved ion-selective surface. Lastly, we also discuss how crowding and overscreening and their interplay affect the electrokinetic behaviors of ionic liquids in these application problems.
Dual-Mode Combustion

NASA Technical Reports Server (NTRS)

Goyne, Christopher P.; McDaniel, James C.

2002-01-01

The Department of Mechanical and Aerospace Engineering at the University of Virginia has conducted an investigation of the mixing and combustion processes in a hydrogen-fueled dual-mode scramjet combustor. The experiment essentially consisted of the "direct connect" continuous operation of a Mach 2 rectangular combustor with a single unswept ramp fuel injector. The stagnation enthalpy of the test flow simulated a flight Mach number of 5. Measurements were obtained using conventional wall instrumentation and laser-based diagnostics. These diagnostics included pressure and wall temperature measurements, Fuel Plume Imaging (FPI) and Particle Image Velocimetry (PIV). A schematic of the combustor configuration and a summary of the measurements obtained are presented. The experimental work at UVa was paralleled by Computational Fluid Dynamics (CFD) work at NASA Langley. The numerical and experimental results are compared in this document.
Generation of capillary instabilities by external disturbances in a liquid jet. Ph.D. Thesis - State Univ. of N.Y.

NASA Technical Reports Server (NTRS)

Leib, S. J.

1985-01-01

The receptivity problem in a circular liquid jet is considered. A time-harmonic axial pressure gradient is imposed on the steady, parallel flow of a jet of liquid emerging from a circular duct. Using a technique developed in plasma physics, a causal solution to the forced problem is obtained over certain ranges of Weber number for a number of mean velocity profiles. This solution contains a term which grows exponentially in the downstream direction and can be identified with a capillary instability wave. Hence, it is found that the externally imposed disturbances can indeed trigger instability waves in a liquid jet. The amplitude of the instability wave generated relative to the amplitude of the forcing is computed numerically for a number of cases.

String-averaging incremental subgradients for constrained convex optimization with applications to reconstruction of tomographic images

NASA Astrophysics Data System (ADS)

Massambone de Oliveira, Rafael; Salomão Helou, Elias; Fontoura Costa, Eduardo

2016-11-01

We present a method for non-smooth convex minimization which is based on subgradient directions and string-averaging techniques. In this approach, the set of available data is split into sequences (strings) and a given iterate is processed independently along each string, possibly in parallel, by an incremental subgradient method (ISM). The end-points of all strings are averaged to form the next iterate. The method is useful to solve sparse and large-scale non-smooth convex optimization problems, such as those arising in tomographic imaging. A convergence analysis is provided under realistic, standard conditions. Numerical tests are performed in a tomographic image reconstruction application, showing good performance for the convergence speed when measured as the decrease ratio of the objective function, in comparison to classical ISM.
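The string-averaging step described above can be sketched in a few lines of Python. The toy problem, minimising the sum of |x - b_i| over an interval (whose solution is the median of the b_i), the 1/(k+1) step size, and the split into two strings are all illustrative assumptions chosen to keep the example scalar and checkable; the paper's tomographic setting is far larger.

```python
def sign(v):
    """Return -1, 0 or 1; used as a subgradient of |v|."""
    return (v > 0) - (v < 0)

def string_averaging_ism(data, strings, x0, iterations=2000, lower=-10.0, upper=10.0):
    """Minimise sum_i |x - data[i]| over [lower, upper] by string-averaged incremental subgradients."""
    x = x0
    for k in range(iterations):
        step = 1.0 / (k + 1)  # diminishing step size
        ends = []
        for string in strings:
            # Each string starts from the same iterate and could run on its own worker.
            y = x
            for i in string:
                y -= step * sign(y - data[i])
            ends.append(y)
        # Average the string end-points, then project onto the feasible interval.
        x = sum(ends) / len(ends)
        x = min(max(x, lower), upper)
    return x

data = [1.0, 2.0, 3.0, 7.0, 9.0]
strings = [[0, 1, 2], [3, 4]]
x = string_averaging_ism(data, strings, x0=0.0)
```

With a single string containing every index this reduces to the classical incremental subgradient method; with one index per string it becomes a fully parallel averaged subgradient step.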
Utilization management in radiology, part 2: perspectives and future directions.

PubMed

Duszak, Richard; Berlin, Jonathan W

2012-10-01

Increased utilization of medical imaging in the early part of the last decade has resulted in numerous efforts to reduce associated spending. Recent initiatives have focused on managing utilization with radiology benefits managers and real-time order entry decision support systems. Although these approaches might seem mutually exclusive and their application to radiology appears unique, the historical convergence and broad acceptance of both programs within the pharmacy sector may offer parallels for their potential future in medical imaging. In this second installment of a two-part series, anticipated trends in radiology utilization management are reviewed. Perspectives on current and future potential roles of radiologists in such initiatives are discussed, particularly in light of emerging physician payment models. Copyright © 2012 American College of Radiology. Published by Elsevier Inc. All rights reserved.
Numerical modeling of the 3D dynamics of ultrasound contrast agent microbubbles using the boundary integral method

NASA Astrophysics Data System (ADS)

Wang, Qianxi; Manmi, Kawa; Calvisi, Michael L.

2015-02-01

Ultrasound contrast agents (UCAs) are microbubbles stabilized with a shell typically of lipid, polymer, or protein and are emerging as a unique tool for noninvasive therapies ranging from gene delivery to tumor ablation. While various models have been developed to describe the spherical oscillations of contrast agents, the treatment of nonspherical behavior has received less attention. However, the nonspherical dynamics of contrast agents are thought to play an important role in therapeutic applications, for example, enhancing the uptake of therapeutic agents across cell membranes and tissue interfaces, and causing tissue ablation. In this paper, a model for nonspherical contrast agent dynamics based on the boundary integral method is described. The effects of the encapsulating shell are approximated by adapting Hoff's model for thin-shell, spherical contrast agents. A high-quality mesh of the bubble surface is maintained by implementing a hybrid approach of the Lagrangian method and elastic mesh technique. The numerical model agrees well with a modified Rayleigh-Plesset equation for encapsulated spherical bubbles. Numerical analyses of the dynamics of UCAs in an infinite liquid and near a rigid wall are performed in parameter regimes of clinical relevance. The oscillation amplitude and period decrease significantly due to the coating. A bubble jet forms when the amplitude of ultrasound is sufficiently large, as occurs for bubbles without a coating; however, the threshold amplitude required to incite jetting increases due to the coating. When a UCA is near a rigid boundary subject to acoustic forcing, the jet is directed towards the wall if the acoustic wave propagates perpendicular to the boundary. When the acoustic wave propagates parallel to the rigid boundary, the jet direction has components both along the wave direction and towards the boundary that depend mainly on the dimensionless standoff distance of the bubble from the boundary. 
In all cases, the jet directions for the coated and uncoated bubble are similar but the jet width and jet velocity are smaller for a coated bubble. The effects of shell thickness and shell viscosity are analyzed and determined to affect the bubble dynamics, including jet development.
Advancing MODFLOW Applying the Derived Vector Space Method

NASA Astrophysics Data System (ADS)

Herrera, G. S.; Herrera, I.; Lemus-García, M.; Hernandez-Garcia, G. D.

2015-12-01

The most effective domain decomposition methods (DDM) are non-overlapping DDMs. Recently a new approach, the DVS-framework, based on an innovative discretization method that uses a non-overlapping system of nodes (the derived-nodes), was introduced and developed by I. Herrera et al. [1, 2]. Using the DVS-approach, a group of four algorithms, referred to as the 'DVS-algorithms', which fulfill the DDM-paradigm (i.e. the solution of global problems is obtained by resolution of local problems exclusively) has been derived. Such procedures are applicable to any boundary-value problem, or system of such equations, for which a standard discretization method is available, and software with a high degree of parallelization can then be constructed. In a parallel talk at this AGU Fall Meeting, Ismael Herrera will introduce the general DVS methodology. The application of the DVS-algorithms has been demonstrated in the solution of several boundary-value problems of interest in geophysics. Numerical examples for a single equation, for the cases of symmetric, non-symmetric and indefinite problems, were demonstrated before [1,2]. For these problems the DVS-algorithms exhibited significantly improved numerical performance with respect to standard versions of DDM algorithms. In view of these results, our research group is applying the DVS method to a widely used simulator for the first time; here we present progress in applying this method to the parallelization of MODFLOW. Efficiency results for a group of tests will be presented. References [1] I. Herrera, L.M. de la Cruz and A. Rosas-Medina. Non overlapping discretization methods for partial differential equations, Numer Meth Part D E, (2013). [2] Herrera, I., & Contreras Iván "An Innovative Tool for Effectively Applying Highly Parallelized Software To Problems of Elasticity". Geofísica Internacional, 2015 (In press)
Compliant Robot Wrist

NASA Technical Reports Server (NTRS)

Voellmer, George

1992-01-01

Compliant element for robot wrist accepts small displacements in one direction only (to first approximation). Three such elements combined to obtain translational compliance along three orthogonal directions, without rotational compliance along any of them. Element is double-blade flexure joint in which two sheets of spring steel attached between opposing blocks, forming rectangle. Blocks moved parallel to each other in one direction only. Sheets act as double cantilever beams deforming in S-shape, keeping blocks parallel.
MADNESS: A Multiresolution, Adaptive Numerical Environment for Scientific Simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harrison, Robert J.; Beylkin, Gregory; Bischoff, Florian A.

2016-01-01

MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current applications in chemistry and several areas of physics.
Introduction to Numerical Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schoonover, Joseph A.

2016-06-14

These are slides for a lecture for the Parallel Computing Summer Research Internship at the National Security Education Center. This gives an introduction to numerical methods. Repetitive algorithms are used to obtain approximate solutions to mathematical problems, using sorting, searching, root finding, optimization, interpolation, extrapolation, least squares regression, eigenvalue problems, ordinary differential equations, and partial differential equations. Many equations are shown. Discretizations allow us to approximate solutions to mathematical models of physical systems using a repetitive algorithm and introduce errors that can lead to numerical instabilities if we are not careful.
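As an illustration of how discretization error can turn into a numerical instability, the following minimal sketch applies the forward Euler method to the decay equation y' = -rate * y. The example is not taken from the slides; the function name and the parameter values are assumptions chosen for illustration. The exact solution always decays, but the discrete solution grows without bound once rate * dt exceeds 2.

```python
def forward_euler_decay(rate, dt, n_steps, y0=1.0):
    """Integrate y' = -rate * y with the forward Euler method.

    Each step multiplies y by (1 - rate * dt), so the discrete solution
    decays only when |1 - rate * dt| < 1, i.e. when rate * dt < 2.
    """
    y = y0
    for _ in range(n_steps):
        y = y + dt * (-rate * y)
    return y


# Stable step: rate * dt = 0.5, amplification factor 0.5 per step.
stable = forward_euler_decay(rate=10.0, dt=0.05, n_steps=50)

# Unstable step: rate * dt = 2.5, amplification factor -1.5 per step.
unstable = forward_euler_decay(rate=10.0, dt=0.25, n_steps=50)
```

The two runs differ only in the time step, which is the kind of care in choosing a discretization that the abstract alludes to.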
Numerical simulation and analysis of electromagnetic-wave absorption of a plasma slab created by a direct-current discharge with gridded anode

NASA Astrophysics Data System (ADS)

Yuan, Chengxun; Tian, Ruihuan; Eliseev, S. I.; Bekasov, V. S.; Bogdanov, E. A.; Kudryavtsev, A. A.; Zhou, Zhongxiang

2018-03-01

In this paper, we present an investigation of a direct-current discharge with a gridded anode from the point of view of using it as a means of creating a plasma coating that could efficiently absorb incident electromagnetic (EM) waves. A single discharge cell consists of two parallel plates, one of which (anode) is gridded. Electrons emitted from the cathode surface are accelerated in the short interelectrode gap and are injected into the post-anode space, where they lose acquired energy on ionization and create plasma. Numerical simulations were used to investigate the discharge structure and obtain spatial distributions of plasma density in the post-anode space. The numerical model of the discharge was based on a simple hybrid approach which takes into account non-local ionization by fast electrons streaming from the cathode sheath. Specially formulated transparency boundary conditions allowed performing simulations in 1D. Simulations were carried out in air at pressures of 10 Torr and higher. Analysis of the discharge structure and discharge formation is presented. It is shown that using cathode materials with lower secondary emission coefficients can allow increasing the thickness of plasma slabs for the same discharge current, which can potentially enhance EM wave absorption. Spatial distributions of electron density obtained during simulations were used to calculate attenuation of an incident EM wave propagating perpendicularly to the plasma slab boundary. It is shown that plasma created by means of a DC discharge with a gridded anode can efficiently absorb EM waves in the low frequency range (6-40 GHz). Increasing gas pressure results in a broader range of wave frequencies (up to 500 GHz) where a considerable attenuation is observed.
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

NASA Technical Reports Server (NTRS)

Povitsky, A.

1998-01-01

In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty by about a factor of two relative to the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
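For reference, the forward and backward steps mentioned above can be sketched with the standard serial Thomas algorithm for a single tridiagonal line. This is only the textbook building block, not the pipelined or reformulated parallel version described in the abstract; the function name and argument layout are assumptions made for this sketch.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the serial Thomas algorithm.

    lower[i], diag[i], upper[i] are the sub-, main and super-diagonal
    entries of row i; lower[0] and upper[-1] are not part of the matrix.
    No pivoting is done, so the matrix is assumed diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward step: eliminate the sub-diagonal, row by row.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Backward step: substitute from the last unknown to the first.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Small check: a 4x4 system with 2 on the diagonal and -1 off it,
# with the right-hand side built from the solution [1, 2, 3, 4].
solution = thomas_solve(
    lower=[0.0, -1.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, -1.0, 0.0],
    rhs=[0.0, 0.0, 0.0, 5.0],
)
```

Both loops carry a dependence from one row to the next. When a line is split across processors, that dependence leaves processors waiting on their neighbours, which is the idle time the abstract proposes to fill with other work.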
A plane wave model for direct simulation of reflection and transmission by discretely inhomogeneous plane parallel media

NASA Astrophysics Data System (ADS)

Mackowski, Daniel; Ramezanpour, Bahareh

2018-07-01

A formulation is developed for numerically solving the frequency domain Maxwell's equations in plane parallel layers of inhomogeneous media. As was done in a recent work [1], the plane parallel layer is modeled as an infinite square lattice of W × W × H unit cells, with W being a sample width of the layer and H the layer thickness. As opposed to the 3D volume integral/discrete dipole formulation, the derivation begins with a Fourier expansion of the electric field amplitude in the lateral plane, and leads to a coupled system of 1D ordinary differential equations in the depth direction of the layer. A 1D dyadic Green's function is derived for this system and used to construct a set of coupled 1D integral equations for the field expansion coefficients. The resulting mathematical formulation is considerably simpler and more compact than that derived, for the same system, using the discrete dipole approximation applied to the periodic plane lattice. Furthermore, the fundamental property variable appearing in the formulation is the Fourier transformed complex permittivity distribution in the unit cell, and the method obviates any need to define or calculate a dipole polarizability. Although designed primarily for random media calculations, the method is also capable of predicting the single scattering properties of individual particles; comparisons are presented to demonstrate that the method can accurately reproduce, at scattering angles not too close to 90°, the polarimetric scattering properties of single and multiple spheres. The derivation of the dyadic Green's function allows for an analytical preconditioning of the equations, and it is shown that this can result in significantly accelerated solution times when applied to densely-packed systems of particles. Calculation results demonstrate that the method, when applied to inhomogeneous media, can predict coherent backscattering and polarization opposition effects.
RT DDA: A hybrid method for predicting the scattering properties by densely packed media

NASA Astrophysics Data System (ADS)

Ramezan Pour, B.; Mackowski, D.

2017-12-01

The most accurate approaches to predicting the scattering properties of particulate media are based on exact solutions of Maxwell's equations (MEs), such as the T-matrix and discrete dipole methods. Applying these techniques to optically thick targets is challenging because of the large-scale computations involved, so they are usually replaced by phenomenological radiative transfer (RT) methods. On the other hand, the RT technique is of questionable validity in media with large particle packing densities. In recent works, we used numerically exact ME solvers to examine the effects of particle concentration on the polarized reflection properties of plane parallel random media. The simulations were performed for plane parallel layers of wavelength-sized spherical particles, and results were compared with RT predictions. We have shown that RT results monotonically converge to the exact solution as the particle volume fraction becomes smaller, and one can observe a nearly perfect fit for packing densities of 2%-5%. This study describes a hybrid technique composed of exact and numerical scalar RT methods. The exact methodology in this work is the plane parallel discrete dipole approximation, whereas the numerical method is based on the adding and doubling method. This approach not only decreases the computational time owing to the RT method but also includes the interference and multiple scattering effects, so it may be applicable to large particle density conditions.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yee, Seonghwan, E-mail: Seonghwan.Yee@Beaumont.edu; Gao, Jia-Hong

Purpose: To investigate whether the direction of spin-lock field, either parallel or antiparallel to the rotating magnetization, has any effect on the spin-lock MRI signal and further on the quantitative measurement of T1ρ, in a clinical 3 T MRI system. Methods: The effects of inverted spin-lock field direction were investigated by acquiring a series of spin-lock MRI signals for an American College of Radiology MRI phantom, while the spin-lock field direction was switched between the parallel and antiparallel directions. The acquisition was performed for different spin-locking methods (i.e., for the single- and dual-field spin-locking methods) and for different levels of clinically feasible spin-lock field strength, ranging from 100 to 500 Hz, while the spin-lock duration was varied in the range from 0 to 100 ms. Results: When the spin-lock field was inverted into the antiparallel direction, the rate of MRI signal decay was altered and the T1ρ value, when compared to the value for the parallel field, was clearly different. Different degrees of such direction-dependency were observed for different spin-lock field strengths. In addition, the dependency was much smaller when the parallel and the antiparallel fields are mixed together in the dual-field method. Conclusions: The spin-lock field direction could impact the MRI signal and further the T1ρ measurement in a clinical MRI system.
Evaluation of Proteus as a Tool for the Rapid Development of Models of Hydrologic Systems

NASA Astrophysics Data System (ADS)

Weigand, T. M.; Farthing, M. W.; Kees, C. E.; Miller, C. T.

2013-12-01

Models of modern hydrologic systems can be complex and involve a variety of operators with varying character. The goal is to implement approximations of such models that are both efficient for the developer and computationally efficient, which is a set of naturally competing objectives. Proteus is a Python-based toolbox that supports prototyping of model formulations as well as a wide variety of modern numerical methods and parallel computing. We used Proteus to develop numerical approximations for three models: Richards' equation, a brine flow model derived using the Thermodynamically Constrained Averaging Theory (TCAT), and a multiphase TCAT-based tumor growth model. For Richards' equation, we investigated discontinuous Galerkin solutions with higher order time integration based on the backward difference formulas. The TCAT brine flow model was implemented using Proteus and a variety of numerical methods were compared to hand coded solutions. Finally, an existing tumor growth model was implemented in Proteus to introduce more advanced numerics and allow the code to be run in parallel. From these three example models, Proteus was found to be an attractive open-source option for rapidly developing high quality code for solving existing and evolving computational science models.
Separating stages of arithmetic verification: An ERP study with a novel paradigm.

PubMed

Avancini, Chiara; Soltész, Fruzsina; Szűcs, Dénes

2015-08-01

In studies of arithmetic verification, participants typically encounter two operands and they carry out an operation on these (e.g. adding them). Operands are followed by a proposed answer and participants decide whether this answer is correct or incorrect. However, interpretation of results is difficult because multiple parallel, temporally overlapping numerical and non-numerical processes of the human brain may contribute to task execution. In order to overcome this problem here we used a novel paradigm specifically designed to tease apart the overlapping cognitive processes active during arithmetic verification. Specifically, we aimed to separate effects related to detection of arithmetic correctness, detection of the violation of strategic expectations, detection of physical stimulus properties mismatch and numerical magnitude comparison (numerical distance effects). Arithmetic correctness, physical stimulus properties and magnitude information were not task-relevant properties of the stimuli. We distinguished between a series of temporally highly overlapping cognitive processes which in turn elicited overlapping ERP effects with distinct scalp topographies. We suggest that arithmetic verification relies on two major temporal phases which include parallel running processes. Our paradigm offers a new method for investigating specific arithmetic verification processes in detail. Copyright © 2015 Elsevier Ltd. All rights reserved.
The application of the large particles method of numerical modeling of the process of carbonic nanostructures synthesis in plasma

NASA Astrophysics Data System (ADS)

Abramov, G. V.; Gavrilov, A. N.

2018-03-01

The article deals with the numerical solution of a mathematical model of particle motion and interaction in a multicomponent plasma, using the electric arc synthesis of carbon nanostructures as an example. The very large number of particles and of their interactions requires significant machine resources and computation time. Applying the large particles method makes it possible to reduce the amount of computation and the requirements for hardware resources without affecting the accuracy of the numerical calculations. GPGPU parallel computing with Nvidia CUDA technology allows all general-purpose computation to be organized on the graphics processor of the graphics card. A comparative analysis of different approaches to parallelizing the computations is carried out to speed up the calculations, leading to the choice of an algorithm that uses shared memory while preserving the accuracy of the solution. A numerical study has been carried out of the influence of the particle density in the macro particle on the motion parameters and on the total number of particle collisions in the plasma for different modes of synthesis. The rational range of the coherence coefficient of particles in the macro particle is computed.
Paleomagnetic and structural evidence for oblique slip in a fault-related fold, Grayback monocline, Colorado

USGS Publications Warehouse

Tetreault, J.; Jones, C.H.; Erslev, E.; Larson, S.; Hudson, M.; Holdaway, S.

2008-01-01

Significant fold-axis-parallel slip is accommodated in the folded strata of the Grayback monocline, northeastern Front Range, Colorado, without visible large strike-slip displacement on the fold surface. In many cases, oblique-slip deformation is partitioned; fold-axis-normal slip is accommodated within folds, and fold-axis-parallel slip is resolved onto adjacent strike-slip faults. Unlike partitioning strike-parallel slip onto adjacent strike-slip faults, fold-axis-parallel slip has deformed the forelimb of the Grayback monocline. Mean compressive paleostress orientations in the forelimb are deflected 15°-37° clockwise from the regional paleostress orientation of the northeastern Front Range. Paleomagnetic directions from the Permian Ingleside Formation in the forelimb are rotated 16°-42° clockwise about a bedding-normal axis relative to the North American Permian reference direction. The paleostress and paleomagnetic rotations increase with the bedding dip angle and decrease along strike toward the fold tip. These measurements allow for 50-120 m of fold-axis-parallel slip within the forelimb, depending on the kinematics of strike-slip shear. This resolved horizontal slip is nearly equal in magnitude to the ~180 m vertical throw across the fold. For 200 m of oblique-slip displacement (120 m of strike slip and 180 m of reverse slip), the true shortening direction across the fold is N90°E, indistinguishable from the regionally inferred direction of N90°E and quite different from the S53°E fold-normal direction. Recognition of this deformational style means that significant amounts of strike slip can be accommodated within folds without axis-parallel surficial faulting. © 2008 Geological Society of America.
Generation and evolution of anisotropic turbulence and related energy transfer in a multi-species solar wind

NASA Astrophysics Data System (ADS)

Maneva, Yana; Poedts, Stefaan

2017-04-01

The electromagnetic fluctuations in the solar wind represent a zoo of plasma waves with different properties, whose wavelengths range from the largest fluid scales to the smallest dissipation scales. By nature the power spectrum of the magnetic fluctuations is anisotropic with different spectral slopes in parallel and perpendicular directions with respect to the background magnetic field. Furthermore, the magnetic field power spectra steepen as one moves from the inertial to the dissipation range and we observe multiple spectral breaks with different slopes in parallel and perpendicular directions at the ion scales and beyond. The turbulent dissipation of magnetic field fluctuations at the sub-ion scales is believed to go into local ion heating and acceleration, so that the spectral breaks are typically associated with particle energization. The gained energy can be in the form of anisotropic heating, formation of non-thermal features in the particle velocity distribution functions, and redistribution of the differential acceleration between the different ion populations. To study the relation between the evolution of the anisotropic turbulent spectra and the particle heating at the ion and sub-ion scales we perform a series of 2.5D hybrid simulations in a collisionless drifting proton-alpha plasma. We neglect the fast electron dynamics and treat the electrons as an isothermal fluid, whereas the protons and a minor population of alpha particles are evolved in a fully kinetic manner. We start with a given wave spectrum and study the evolution of the magnetic field spectral slopes as a function of the parallel and perpendicular wavenumbers. Simultaneously, we track the particle response and the energy exchange between the parallel and perpendicular scales. We observe anisotropic behavior of the turbulent power spectra with steeper slopes along the dominant energy-containing direction.
This means that for parallel and quasi-parallel waves we have steeper spectral slope in parallel direction, whereas for highly oblique waves the dissipation occurs predominantly in perpendicular direction and the spectral slopes are steeper across the background magnetic field. The value of the spectral slopes depends on the angle of propagation, the spectral range, as well as the plasma properties. In general the dissipation is stronger at small scales and the corresponding spectral slopes there are steeper. For parallel and quasi-parallel propagation the prevailing energy cascade remains along the magnetic field, whereas for initially isotropic oblique turbulence the cascade develops mainly in perpendicular direction.
Opus: A Coordination Language for Multidisciplinary Applications

NASA Technical Reports Server (NTRS)

Chapman, Barbara; Haines, Matthew; Mehrotra, Piyush; Zima, Hans; vanRosendale, John

1997-01-01

Data parallel languages, such as High Performance Fortran, can be successfully applied to a wide range of numerical applications. However, many advanced scientific and engineering applications are multidisciplinary and heterogeneous in nature, and thus do not fit well into the data parallel paradigm. In this paper we present Opus, a language designed to fill this gap. The central concept of Opus is a mechanism called ShareD Abstractions (SDA). An SDA can be used as a computation server, i.e., a locus of computational activity, or as a data repository for sharing data between asynchronous tasks. SDAs can be internally data parallel, providing support for the integration of data and task parallelism as well as nested task parallelism. They can thus be used to express multidisciplinary applications in a natural and efficient way. In this paper we describe the features of the language through a series of examples and give an overview of the runtime support required to implement these concepts in parallel and distributed environments.
Efficient parallel resolution of the simplified transport equations in mixed-dual formulation

NASA Astrophysics Data System (ADS)

Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.

2011-03-01

A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very fine models are difficult to treat with our sequential solver, based on the simplified transport equations, in terms of memory consumption and computational time. A first implementation of a Lagrangian-based domain decomposition method leads to poor parallel efficiency because of an increase in the number of power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
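The outer iteration referred to above can be sketched on a toy problem. The sketch assumes the generalized problem is written as F x = k M x and that the largest k is wanted, so each power iteration applies the inverse of M to F x. The 2x2 matrices, the function names and the tolerance are all hypothetical and chosen only so the example is self-contained; a real solver would replace `solve2` with the (possibly domain-decomposed) transport solve.

```python
def solve2(m, b):
    """Solve the 2x2 linear system m x = b by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [
        (b[0] * m[1][1] - m[0][1] * b[1]) / det,
        (m[0][0] * b[1] - m[1][0] * b[0]) / det,
    ]


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def largest_k(m, f, max_iters=200, tol=1e-13):
    """Estimate the largest k with f x = k m x by iterating x <- m^{-1} f x.

    The iterate is kept at max-norm 1, so the max-norm of the next
    iterate is the current eigenvalue estimate (dominant k assumed positive).
    """
    x = [1.0, 1.0]
    k = 0.0
    for _ in range(max_iters):
        y = solve2(m, matvec(f, x))      # one linear solve per power iteration
        k_new = max(abs(v) for v in y)
        x = [v / k_new for v in y]
        if abs(k_new - k) < tol:
            return k_new, x
        k = k_new
    return k, x


# Hypothetical toy operators: m plays the role of the loss operator,
# f the role of the production operator.
M = [[2.0, -1.0], [-1.0, 2.0]]
F = [[1.0, 0.0], [0.0, 2.0]]
k_eff, mode = largest_k(M, F)
```

Each pass through the loop costs one linear solve, so any change that raises the number of power iterations, as reported for the first domain decomposition implementation, is paid for in full solves.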
A software architecture for multidisciplinary applications: Integrating task and data parallelism

NASA Technical Reports Server (NTRS)

Chapman, Barbara; Mehrotra, Piyush; Vanrosendale, John; Zima, Hans

1994-01-01

Data parallel languages such as Vienna Fortran and HPF can be successfully applied to a wide range of numerical applications. However, many advanced scientific and engineering applications are of a multidisciplinary and heterogeneous nature and thus do not fit well into the data parallel paradigm. In this paper we present new Fortran 90 language extensions to fill this gap. Tasks can be spawned as asynchronous activities in a homogeneous or heterogeneous computing environment; they interact by sharing access to Shared Data Abstractions (SDA's). SDA's are an extension of Fortran 90 modules, representing a pool of common data, together with a set of Methods for controlled access to these data and a mechanism for providing persistent storage. Our language supports the integration of data and task parallelism as well as nested task parallelism and thus can be used to express multidisciplinary applications in a natural and efficient way.

Message-passing-interface-based parallel FDTD investigation on the EM scattering from a 1-D rough sea surface using uniaxial perfectly matched layer absorbing boundary.

PubMed

Li, J; Guo, L-X; Zeng, H; Han, X-B

2009-06-01

A message-passing-interface (MPI)-based parallel finite-difference time-domain (FDTD) algorithm for the electromagnetic scattering from a 1-D randomly rough sea surface is presented. The uniaxial perfectly matched layer (UPML) medium is adopted for truncation of FDTD lattices, in which the finite-difference equations can be used for the total computation domain by properly choosing the uniaxial parameters. This makes the parallel FDTD algorithm easier to implement. The parallel performance with different processors is illustrated for one sea surface realization, and the computation time of the parallel FDTD algorithm is dramatically reduced compared to a single-process implementation. Finally, some numerical results are shown, including the backscattering characteristics of sea surface for different polarization and the bistatic scattering from a sea surface with large incident angle and large wind speed.
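To make the structure of the FDTD update concrete, here is a minimal serial sketch of a 1-D leapfrog scheme in normalized units. It is not the authors' code: the grid size, the Gaussian soft source and the simple fixed (perfectly conducting) ends are assumptions for illustration, and the UPML absorbing boundary and the rough-surface geometry are omitted.

```python
import math


def fdtd_1d(n_cells, n_steps, source_cell, courant=0.5):
    """Advance a 1-D FDTD grid and return the final electric field.

    ez lives on the cell nodes and hy on the half-cells between them.
    Fields are normalized so that the update coefficient is the Courant
    number; the scheme is stable in 1-D for courant <= 1.
    """
    ez = [0.0] * n_cells
    hy = [0.0] * (n_cells - 1)
    for step in range(n_steps):
        # Magnetic field update from the spatial difference of ez.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Electric field update; the end nodes are never updated,
        # which keeps ez = 0 there (perfect electric conductor).
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Additive Gaussian source in time (assumed pulse shape).
        ez[source_cell] += math.exp(-(((step - 30) / 10.0) ** 2))
    return ez


field = fdtd_1d(n_cells=200, n_steps=100, source_cell=100)
```

Every update uses only nearest-neighbour values. In a message-passing version each process could therefore own a contiguous slab of cells and exchange a single boundary value of `ez` and `hy` with each neighbour per time step.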
A parallel graded-mesh FDTD algorithm for human-antenna interaction problems.

PubMed

Catarinucci, Luca; Tarricone, Luciano

2009-01-01

The finite difference time domain method (FDTD) is frequently used for the numerical solution of a wide variety of electromagnetic (EM) problems and, among them, those concerning human exposure to EM fields. In many practical cases related to the assessment of occupational EM exposure, large simulation domains are modeled and high space resolution adopted, so that strong memory and central processing unit power requirements have to be satisfied. To better afford the computational effort, the use of parallel computing is a winning approach; alternatively, subgridding techniques are often implemented. However, the simultaneous use of subgridding schemes and parallel algorithms is very new. In this paper, an easy-to-implement and highly-efficient parallel graded-mesh (GM) FDTD scheme is proposed and applied to human-antenna interaction problems, demonstrating its appropriateness in dealing with complex occupational tasks and showing its capability to guarantee the advantages of a traditional subgridding technique without affecting the parallel FDTD performance.
Use Computer-Aided Tools to Parallelize Large CFD Applications

NASA Technical Reports Server (NTRS)

Jin, H.; Frumkin, M.; Yan, J.

2000-01-01

Porting applications to high performance parallel computers is always a challenging task. It is time consuming and costly. With rapid progress in hardware architectures and the increasing complexity of real applications in recent years, the problem has become even more severe. Today, scalability and high performance mostly involve handwritten parallel programs using message-passing libraries (e.g. MPI). However, this process is very difficult and often error-prone. The recent reemergence of shared memory parallel (SMP) architectures, such as the cache coherent Non-Uniform Memory Access (ccNUMA) architecture used in the SGI Origin 2000, shows good prospects for scaling beyond hundreds of processors. Programming on an SMP is simplified by working in a globally accessible address space. The user can supply compiler directives, such as OpenMP, to parallelize the code. As an industry standard for portable implementation of parallel programs for SMPs, OpenMP is a set of compiler directives and callable runtime library routines that extend Fortran, C and C++ to express shared memory parallelism. It promises an incremental path for parallel conversion of existing software, as well as scalability and performance for a complete rewrite or an entirely new development. Perhaps the main disadvantage of programming with directives is that inserted directives may not necessarily enhance performance. In the worst cases, they can create erroneous results. While vendors have provided tools to perform error-checking and profiling, automation in directive insertion is very limited and often fails on large programs, primarily due to the lack of a thorough enough data dependence analysis. To overcome the deficiency, we have developed a toolkit, CAPO, to automatically insert OpenMP directives in Fortran programs and apply certain degrees of optimization.
CAPO is aimed at taking advantage of detailed inter-procedural dependence analysis provided by CAPTools, developed by the University of Greenwich, to reduce potential errors made by users. Earlier tests on NAS Benchmarks and ARC3D have demonstrated good success of this tool. In this study, we have applied CAPO to parallelize three large applications in the area of computational fluid dynamics (CFD): OVERFLOW, TLNS3D and INS3D. These codes are widely used for solving Navier-Stokes equations with complicated boundary conditions and turbulence models in multiple zones. Each one comprises from 50K to 100K lines of FORTRAN77. As an example, CAPO took 77 hours to complete the data dependence analysis of OVERFLOW on a workstation (SGI, 175MHz, R10K processor). A fair amount of effort was spent on correcting false dependencies due to lack of necessary knowledge during the analysis. Even so, CAPO provides an easy way for the user to interact with the parallelization process. The OpenMP version was generated within a day after the analysis was completed. Because of the sequential algorithms involved, code sections in TLNS3D and INS3D need to be restructured by hand to produce more efficient parallel codes. An included figure shows preliminary test results of the generated OVERFLOW with several test cases in single zone. The MPI data points for the small test case were taken from a hand-coded MPI version. As we can see, CAPO's version has achieved an 18-fold speedup on 32 nodes of the SGI O2K. For the small test case, it outperformed the MPI version. These results are very encouraging, but further work is needed. For example, although CAPO attempts to place directives on the outermost parallel loops in an interprocedural framework, it does not insert directives based on the best manual strategy. In particular, it lacks the support of parallelization at the multi-zone level.
Future work will emphasize the development of a methodology to work at the multi-zone level and with a hybrid approach. Development of tools to perform more complicated code transformations is also needed.
Numerical aspects and implementation of a two-layer zonal wall model for LES of compressible turbulent flows on unstructured meshes

NASA Astrophysics Data System (ADS)

Park, George Ilhwan; Moin, Parviz

2016-01-01

This paper focuses on numerical and practical aspects associated with a parallel implementation of a two-layer zonal wall model for large-eddy simulation (LES) of compressible wall-bounded turbulent flows on unstructured meshes. A zonal wall model based on the solution of unsteady three-dimensional Reynolds-averaged Navier-Stokes (RANS) equations on a separate near-wall grid is implemented in an unstructured, cell-centered finite-volume LES solver. The main challenge in its implementation is to couple two parallel, unstructured flow solvers for efficient boundary data communication and simultaneous time integrations. A coupling strategy with good load balancing and low processor underutilization is identified. Face mapping and interpolation procedures at the coupling interface are explained in detail. The method of manufactured solutions is used for verifying the correct implementation of solver coupling, and parallel performance of the combined wall-modeled LES (WMLES) solver is investigated. The method has successfully been applied to several attached and separated flows, including a transitional flow over a flat plate and a separated flow over an airfoil at an angle of attack.
Parallel numerical modeling of hybrid-dimensional compositional non-isothermal Darcy flows in fractured porous media

NASA Astrophysics Data System (ADS)

Xing, F.; Masson, R.; Lopez, S.

2017-09-01

This paper introduces a new discrete fracture model accounting for non-isothermal compositional multiphase Darcy flows and complex networks of fractures with intersecting, immersed and non-immersed fractures. The so-called hybrid-dimensional model, which uses a 2D model in the fractures coupled with a 3D model in the matrix, is first derived rigorously starting from the equi-dimensional matrix-fracture model. It is then discretized using a fully implicit time integration combined with the Vertex Approximate Gradient (VAG) finite volume scheme, which is adapted to polyhedral meshes and anisotropic heterogeneous media. The fully coupled systems are assembled and solved in parallel using the Single Program Multiple Data (SPMD) paradigm with one layer of ghost cells. This strategy allows for a local assembly of the discrete systems. An efficient preconditioner is implemented to solve the linear systems at each time step and each Newton-type iteration of the simulation. The numerical efficiency of the approach is assessed on different meshes, fracture networks, and physical settings in terms of parallel scalability, nonlinear convergence and linear convergence.
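The role of "one layer of ghost cells" in an SPMD decomposition can be sketched on a 1D array. The example below is a serial emulation under stated assumptions: the ranks are plain Python lists, the exchange function stands in for message passing, and a three-point stencil stands in for the local assembly. None of this is the VAG scheme or the authors' code, and all names are hypothetical.

```python
def split_with_ghosts(values, nranks):
    """Partition a 1D field into contiguous blocks, each padded with one ghost cell per side."""
    base, extra = divmod(len(values), nranks)
    blocks, start = [], 0
    for rank in range(nranks):
        size = base + (1 if rank < extra else 0)
        owned = values[start:start + size]
        blocks.append([0.0] + list(owned) + [0.0])  # [ghost, owned..., ghost]
        start += size
    return blocks


def exchange_ghosts(blocks, left_bc, right_bc):
    """Fill ghost cells from neighbouring owned cells (stand-in for send/receive)."""
    last = len(blocks) - 1
    for rank, blk in enumerate(blocks):
        blk[0] = blocks[rank - 1][-2] if rank > 0 else left_bc
        blk[-1] = blocks[rank + 1][1] if rank < last else right_bc


def local_stencil(blk):
    """Three-point stencil on owned cells only; needs no data beyond the ghost layer."""
    return [blk[i - 1] - 2.0 * blk[i] + blk[i + 1] for i in range(1, len(blk) - 1)]


def distributed_stencil(values, nranks, left_bc=0.0, right_bc=0.0):
    blocks = split_with_ghosts(values, nranks)
    exchange_ghosts(blocks, left_bc, right_bc)
    result = []
    for blk in blocks:  # each iteration is what one rank would do locally
        result.extend(local_stencil(blk))
    return result


def serial_stencil(values, left_bc=0.0, right_bc=0.0):
    padded = [left_bc] + list(values) + [right_bc]
    return [padded[i - 1] - 2.0 * padded[i] + padded[i + 1]
            for i in range(1, len(padded) - 1)]


if __name__ == "__main__":
    field = [float(i * i) for i in range(10)]
    print(distributed_stencil(field, 3) == serial_stencil(field))
```

Because the stencil reaches only one cell beyond the owned range, a single ghost layer is enough for each rank to compute its rows without further communication, which is the sense in which assembly is local.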
Implementation of a partitioned algorithm for simulation of large CSI problems

NASA Technical Reports Server (NTRS)

Alvin, Kenneth F.; Park, K. C.

1991-01-01

The implementation of a partitioned numerical algorithm for determining the dynamic response of coupled structure/controller/estimator finite-dimensional systems is reviewed. The partitioned approach leads to a set of coupled first and second-order linear differential equations which are numerically integrated with extrapolation and implicit step methods. The present software implementation, ACSIS, utilizes parallel processing techniques at various levels to optimize performance on a shared-memory concurrent/vector processing system. A general procedure for the design of controller and filter gains is also implemented, which utilizes the vibration characteristics of the structure to be solved. Also presented are: example problems; a user's guide to the software; the procedures and algorithm scripts; a stability analysis for the algorithm; and the source code for the parallel implementation.
Implementation and Assessment of a Virtual Laboratory of Parallel Robots Developed for Engineering Students

ERIC Educational Resources Information Center

Gil, Arturo; Peidró, Adrián; Reinoso, Óscar; Marín, José María

2017-01-01

This paper presents a tool, LABEL, oriented to the teaching of parallel robotics. The application, organized as a set of tools developed using Easy Java Simulations, enables the study of the kinematics of parallel robotics. A set of classical parallel structures was implemented such that LABEL can solve the inverse and direct kinematic problem of…
Advances in Parallelization for Large Scale Oct-Tree Mesh Generation

NASA Technical Reports Server (NTRS)

O'Connell, Matthew; Karman, Steve L.

2015-01-01

Despite great advancements in the parallelization of numerical simulation codes over the last 20 years, it is still common to perform grid generation in serial. Generating large scale grids in serial often requires using special "grid generation" compute machines that can have more than ten times the memory of average machines. While some parallel mesh generation techniques have been proposed, generating very large meshes for LES or aeroacoustic simulations is still a challenging problem. An automated method for the parallel generation of very large scale off-body hierarchical meshes is presented here. This work enables large scale parallel generation of off-body meshes by using a novel combination of parallel grid generation techniques and a hybrid "top down" and "bottom up" oct-tree method. Meshes are generated using hardware commonly found in parallel compute clusters. The capability to generate very large meshes is demonstrated by the generation of off-body meshes surrounding complex aerospace geometries. Results are shown including a one billion cell mesh generated around a Predator Unmanned Aerial Vehicle geometry, which was generated on 64 processors in under 45 minutes.
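As a rough illustration of the "top down" half of an oct-tree method, the sketch below recursively splits a cubic root cell into eight children wherever a refinement test is satisfied. It is a serial toy under stated assumptions: the geometry is a sphere centred at the origin, the refinement test and all names are hypothetical, and neither the "bottom up" stage nor the parallel distribution described in the abstract is modelled.

```python
import math


def refine(center, half, depth, max_depth, needs_refinement):
    """Return the leaf cells (center, half_width) of an oct-tree built top down."""
    if depth == max_depth or not needs_refinement(center, half):
        return [(center, half)]
    leaves = []
    q = half / 2.0  # half-width of each child
    for dx in (-q, q):
        for dy in (-q, q):
            for dz in (-q, q):
                child = (center[0] + dx, center[1] + dy, center[2] + dz)
                leaves.extend(refine(child, q, depth + 1, max_depth, needs_refinement))
    return leaves


def near_sphere_surface(radius):
    """Refinement test: does the cube intersect the surface of a sphere at the origin?"""
    def predicate(center, half):
        nearest = math.sqrt(sum(max(abs(c) - half, 0.0) ** 2 for c in center))
        farthest = math.sqrt(sum((abs(c) + half) ** 2 for c in center))
        return nearest <= radius <= farthest
    return predicate


if __name__ == "__main__":
    leaves = refine((0.0, 0.0, 0.0), 1.0, 0, 4, near_sphere_surface(0.5))
    volume = sum((2.0 * h) ** 3 for _, h in leaves)
    print(len(leaves), "leaf cells, total volume", volume)
```

The leaves tile the root cube exactly, so their volumes sum to the root volume regardless of how deep the refinement goes near the surface.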
Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

NASA Technical Reports Server (NTRS)

1991-01-01

Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)
Parallel aeroelastic computations for wing and wing-body configurations

NASA Technical Reports Server (NTRS)

Byun, Chansup

1994-01-01

The objective of this research is to develop computationally efficient methods for solving fluid-structural interaction problems by directly coupling finite difference Euler/Navier-Stokes equations for fluids and finite element dynamics equations for structures on parallel computers. This capability will significantly impact many aerospace projects of national importance such as Advanced Subsonic Civil Transport (ASCT), where the structural stability margin becomes very critical at the transonic region. This research effort will have direct impact on the High Performance Computing and Communication (HPCC) Program of NASA in the area of parallel computing.
Ion acceleration and heating by kinetic Alfvén waves associated with magnetic reconnection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liang, Ji; Lin, Yu; Johnson, Jay R.

A previous study on the generation and signatures of kinetic Alfvén waves (KAWs) associated with magnetic reconnection in a current sheet revealed that KAWs are a common feature during reconnection [Liang et al. J. Geophys. Res.: Space Phys. 121, 6526 (2016)]. In this paper, ion acceleration and heating by the KAWs generated during magnetic reconnection are investigated with a three-dimensional (3-D) hybrid model. It is found that in the outflow region, a fraction of inflow ions are accelerated by the KAWs generated in the leading bulge region of reconnection, and their parallel velocities gradually increase up to slightly super-Alfvénic. As a result of wave-particle interactions, an accelerated ion beam forms in the direction of the anti-parallel magnetic field, in addition to the core ion population, leading to the development of non-Maxwellian velocity distributions, which include a trapped population with parallel velocities consistent with the wave speed. The ions are then heated in both parallel and perpendicular directions. In the parallel direction, the heating results from nonlinear Landau resonance of trapped ions. In the perpendicular direction, however, evidence of stochastic heating by the KAWs is found during the acceleration stage, with an increase of magnetic moment μ. The coherence between the perpendicular ion temperature $T_\perp$ and the perpendicular electric and magnetic fields of KAWs also provides evidence for perpendicular heating by KAWs. The parallel and perpendicular heating of the accelerated beam occur simultaneously, leading to the development of temperature anisotropy with $T_\perp > T_\parallel$. The heating rate agrees with the damping rate of the KAWs, and the heating is dominated by the accelerated ion beam. In the later stage, with the increase of the fraction of the accelerated ions, interaction between the accelerated beam and the core population also contributes to the ion heating, ultimately leading to overlap of the beams and an overall anisotropy with $T_\perp > T_\parallel$.
Ion acceleration and heating by kinetic Alfvén waves associated with magnetic reconnection

DOE PAGES

Liang, Ji; Lin, Yu; Johnson, Jay R.; ...

2017-09-19

A previous study on the generation and signatures of kinetic Alfvén waves (KAWs) associated with magnetic reconnection in a current sheet revealed that KAWs are a common feature during reconnection [Liang et al. J. Geophys. Res.: Space Phys. 121, 6526 (2016)]. In this paper, ion acceleration and heating by the KAWs generated during magnetic reconnection are investigated with a three-dimensional (3-D) hybrid model. It is found that in the outflow region, a fraction of inflow ions are accelerated by the KAWs generated in the leading bulge region of reconnection, and their parallel velocities gradually increase up to slightly super-Alfvénic. As a result of wave-particle interactions, an accelerated ion beam forms in the direction of the anti-parallel magnetic field, in addition to the core ion population, leading to the development of non-Maxwellian velocity distributions, which include a trapped population with parallel velocities consistent with the wave speed. The ions are then heated in both parallel and perpendicular directions. In the parallel direction, the heating results from nonlinear Landau resonance of trapped ions. In the perpendicular direction, however, evidence of stochastic heating by the KAWs is found during the acceleration stage, with an increase of magnetic moment μ. The coherence between the perpendicular ion temperature $T_\perp$ and the perpendicular electric and magnetic fields of KAWs also provides evidence for perpendicular heating by KAWs. The parallel and perpendicular heating of the accelerated beam occur simultaneously, leading to the development of temperature anisotropy with $T_\perp > T_\parallel$. The heating rate agrees with the damping rate of the KAWs, and the heating is dominated by the accelerated ion beam. In the later stage, with the increase of the fraction of the accelerated ions, interaction between the accelerated beam and the core population also contributes to the ion heating, ultimately leading to overlap of the beams and an overall anisotropy with $T_\perp > T_\parallel$.
Large-scale anisotropy in stably stratified rotating flows

DOE PAGES

Marino, R.; Mininni, P. D.; Rosenberg, D. L.; ...

2014-08-28

We present results from direct numerical simulations of the Boussinesq equations in the presence of rotation and/or stratification, both in the vertical direction. The runs are forced isotropically and randomly at small scales and have spatial resolutions of up to $1024^3$ grid points and Reynolds numbers of $\approx 1000$. We first show that solutions with negative energy flux and inverse cascades develop in rotating turbulence, whether or not stratification is present. However, the purely stratified case is characterized instead by an early-time, highly anisotropic transfer to large scales with almost zero net isotropic energy flux. This is consistent with previous studies that observed the development of vertically sheared horizontal winds, although only at substantially later times. However, and unlike previous works, when sufficient scale separation is allowed between the forcing scale and the domain size, the total energy displays a perpendicular (horizontal) spectrum with power law behavior compatible with $\sim k_\perp^{-5/3}$, including in the absence of rotation. In this latter purely stratified case, such a spectrum is the result of a direct cascade of the energy contained in the large-scale horizontal wind, as is evidenced by a strong positive flux of energy in the parallel direction at all scales including the largest resolved scales.
Numerical study on response time of a parallel plate capacitive polyimide humidity sensor based on microhole upper electrode

NASA Astrophysics Data System (ADS)

Zhou, Wenhe; He, Xuan; Wu, Jianyun; Wang, Liangbi; Wang, Liangcheng

2017-07-01

The parallel plate capacitive humidity sensor based on a grid upper electrode is considered promising in fields that require a humidity sensor with better dynamic characteristics. To strengthen the structure and balance the electric charge of the grid upper electrode, a strip is needed. However, it is the strip that keeps the dynamic characteristics of the sensor from being further improved. The numerical method saves time and cost, but numerical studies of the response time of the sensor remain fragmentary, and the numerical models presented in these studies did not consider the effect of the porosity of the polymer film on the dynamic characteristics. To overcome the defect of the grid upper electrode, this paper first proposes a new structure for the upper electrode, and then presents and validates a model that considers the porosity effects of the polymer film on the dynamic characteristics. Finally, with the help of the software FLUENT, the effects of the parameters on the response time of the humidity sensor based on the microhole upper electrode are studied numerically. The numerical results show that the response time of the microhole upper electrode sensor is 86% better than that of the grid upper electrode sensor, and that the response time of the humidity sensor can be improved by reducing the hole spacing, increasing the aperture, reducing the film thickness, and reasonably enlarging the porosity of the film.
Knudsen pump inspired by Crookes radiometer with a specular wall

NASA Astrophysics Data System (ADS)

Baier, Tobias; Hardt, Steffen; Shahabi, Vahid; Roohi, Ehsan

2017-03-01

A rarefied gas is considered in a channel consisting of two infinite parallel plates between which an evenly spaced array of smaller plates is arranged normal to the channel direction. Each of these smaller plates is assumed to possess one ideally specularly reflective and one ideally diffusively reflective side. When the temperature of the small plates differs from the temperature of the sidewalls of the channel, these boundary conditions result in a temperature profile around the edges of each small plate that breaks the reflection symmetry along the channel direction. This in turn results in a force on each plate and a net gas flow along the channel. The situation is analyzed numerically using the direct simulation Monte Carlo method and compared with analytical results where available. The influence of the ideally specularly reflective wall is assessed by comparing with simulations using a finite accommodation coefficient at the corresponding wall. The configuration bears some similarity to a Crookes radiometer, where a nonsymmetric temperature profile at the radiometer vanes is generated by different temperatures on each side of the vane, resulting in a motion of the rotor. The described principle may find applications in pumping gas on small scales driven by temperature gradients.
Femtosecond laser-induced cross-periodic structures on a crystalline silicon surface under low pulse number irradiation

NASA Astrophysics Data System (ADS)

Ji, Xu; Jiang, Lan; Li, Xiaowei; Han, Weina; Liu, Yang; Wang, Andong; Lu, Yongfeng

2015-01-01

A cross-patterned periodic surface structure was revealed in femtosecond laser processing of crystalline silicon under a relatively low number of shots (4 < N < 10), with the pulse energy slightly higher than the ablation threshold. The experimental results indicated that the cross-pattern was composed of mutually orthogonal periodic structures (ripples). Ripples with a direction perpendicular to the laser polarization (R⊥) spread over the whole laser-modified region, with a periodicity of around 780 nm, which was close to the central wavelength of the laser. Other ripples with a direction parallel to the laser polarization (R‖) were found to be distributed between two adjacent R⊥ ripples, with a sub-wavelength periodicity of about 390 nm. The geometrical morphology of the two mutually orthogonal ripples under static femtosecond laser irradiation could be continuously rotated as the polarization direction changed, but the periodicity remained almost unchanged. The underlying physical mechanism was revealed by numerical simulations based on the finite element method. It was found that the incubation effect with multiple shots, together with the redistributed electric field after initial ablation, plays a crucial role in the generation of the cross-patterned periodic surface structures.
The Dynamo package for tomography and subtomogram averaging: components for MATLAB, GPU computing and EC2 Amazon Web Services

PubMed Central

Castaño-Díez, Daniel

2017-01-01

Dynamo is a package for the processing of tomographic data. As a tool for subtomogram averaging, it includes different alignment and classification strategies. Furthermore, its data-management module allows experiments to be organized in groups of tomograms, while offering specialized three-dimensional tomographic browsers that facilitate visualization, location of regions of interest, modelling and particle extraction in complex geometries. Here, a technical description of the package is presented, focusing on its diverse strategies for optimizing computing performance. Dynamo is built upon mbtools (middle layer toolbox), a general-purpose MATLAB library for object-oriented scientific programming specifically developed to underpin Dynamo but usable as an independent tool. Its structure intertwines a flexible MATLAB codebase with precompiled C++ functions that carry the burden of numerically intensive operations. The package can be delivered as a precompiled standalone ready for execution without a MATLAB license. Multicore parallelization on a single node is directly inherited from the high-level parallelization engine provided for MATLAB, automatically imparting a balanced workload among the threads in computationally intense tasks such as alignment and classification, but also in logistic-oriented tasks such as tomogram binning and particle extraction. Dynamo supports the use of graphical processing units (GPUs), yielding considerable speedup factors both for native Dynamo procedures (such as the numerically intensive subtomogram alignment) and procedures defined by the user through its MATLAB-based GPU library for three-dimensional operations. Cloud-based virtual computing environments supplied with a pre-installed version of Dynamo can be publicly accessed through the Amazon Elastic Compute Cloud (EC2), enabling users to rent GPU computing time on a pay-as-you-go basis, thus avoiding upfront investments in hardware and longterm software maintenance. 
PMID:28580909
The Dynamo package for tomography and subtomogram averaging: components for MATLAB, GPU computing and EC2 Amazon Web Services.

PubMed

Castaño-Díez, Daniel

2017-06-01

Dynamo is a package for the processing of tomographic data. As a tool for subtomogram averaging, it includes different alignment and classification strategies. Furthermore, its data-management module allows experiments to be organized in groups of tomograms, while offering specialized three-dimensional tomographic browsers that facilitate visualization, location of regions of interest, modelling and particle extraction in complex geometries. Here, a technical description of the package is presented, focusing on its diverse strategies for optimizing computing performance. Dynamo is built upon mbtools (middle layer toolbox), a general-purpose MATLAB library for object-oriented scientific programming specifically developed to underpin Dynamo but usable as an independent tool. Its structure intertwines a flexible MATLAB codebase with precompiled C++ functions that carry the burden of numerically intensive operations. The package can be delivered as a precompiled standalone ready for execution without a MATLAB license. Multicore parallelization on a single node is directly inherited from the high-level parallelization engine provided for MATLAB, automatically imparting a balanced workload among the threads in computationally intense tasks such as alignment and classification, but also in logistic-oriented tasks such as tomogram binning and particle extraction. Dynamo supports the use of graphical processing units (GPUs), yielding considerable speedup factors both for native Dynamo procedures (such as the numerically intensive subtomogram alignment) and procedures defined by the user through its MATLAB-based GPU library for three-dimensional operations. Cloud-based virtual computing environments supplied with a pre-installed version of Dynamo can be publicly accessed through the Amazon Elastic Compute Cloud (EC2), enabling users to rent GPU computing time on a pay-as-you-go basis, thus avoiding upfront investments in hardware and longterm software maintenance.
Origins and nature of non-Fickian transport through fractures

NASA Astrophysics Data System (ADS)

Wang, L.; Cardenas, M. B.

2014-12-01

Non-Fickian transport occurs across all scales within fractured and porous geological media. Fundamental understanding and appropriate characterization of non-Fickian transport through fractures is critical for understanding and predicting the fate of solutes and other scalars. We use both analytical and numerical modeling, including direct numerical simulation and particle tracking random walk, to investigate the origin of non-Fickian transport through both homogeneous and heterogeneous fractures. For the simple homogeneous fracture case, i.e., parallel plates, we theoretically derived a formula for dynamic longitudinal dispersion (D) within Poiseuille flow. Using the closed-form expression for the theoretical D, we quantified the time (T) and length (L) scales separating preasymptotic and asymptotic dispersive transport, with T and L proportional to the aperture (b) of the parallel plates to the second and fourth orders, respectively. As for heterogeneous fractures, the fracture roughness and correlation length are closely associated with T and L, and thus indicate the origin of non-Fickian transport. Modeling solute transport through 2D rough-walled fractures with a continuous time random walk with a truncated power law shows that the degree of deviation from Fickian transport is proportional to fracture roughness. The estimated L for 2D rough-walled fractures is significantly longer than that derived from the formula within Poiseuille flow with equivalent b. Moreover, we artificially generated normally distributed 3D fractures with fixed correlation length but different fracture dimensions. Solute transport through 3D fractures was modeled with a particle tracking random walk algorithm. We found that transport transitions from non-Fickian to Fickian with increasing fracture dimensions, where the estimated L for the studied 3D fractures is related to the correlation length.
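The second- and fourth-order scalings of T and L with aperture can be reproduced with textbook relations. The sketch below does not use the authors' dynamic dispersion formula, which the abstract does not give. It assumes instead the classical asymptotic Taylor-Aris result for plane Poiseuille flow, D = D_m + U^2 b^2 / (210 D_m), a transverse diffusion time T ~ b^2 / D_m with the prefactor omitted, and a fixed pressure gradient so that the mean velocity follows the cubic law U = b^2 G / (12 mu). Parameter values and function names are hypothetical.

```python
def mean_velocity(b, pressure_gradient, viscosity):
    """Mean Poiseuille velocity between parallel plates: U = b^2 G / (12 mu)."""
    return b ** 2 * pressure_gradient / (12.0 * viscosity)


def taylor_aris_dispersion(b, velocity, d_m):
    """Asymptotic longitudinal dispersion for plates separated by aperture b."""
    return d_m + (velocity * b) ** 2 / (210.0 * d_m)


def transverse_mixing_time(b, d_m):
    """Order-of-magnitude time to diffuse across the aperture (prefactor omitted)."""
    return b ** 2 / d_m


def asymptotic_length(b, pressure_gradient, viscosity, d_m):
    """Distance advected during the mixing time: L ~ U T, hence L ~ b^4 at fixed G."""
    return mean_velocity(b, pressure_gradient, viscosity) * transverse_mixing_time(b, d_m)


if __name__ == "__main__":
    G, mu, d_m = 1.0, 1.0e-3, 1.0e-9  # Pa/m, Pa s, m^2/s (illustrative values)
    for b in (1.0e-3, 2.0e-3):        # aperture in metres
        u = mean_velocity(b, G, mu)
        print(f"b={b:.0e} m  D={taylor_aris_dispersion(b, u, d_m):.3e} m^2/s  "
              f"T={transverse_mixing_time(b, d_m):.3e} s  "
              f"L={asymptotic_length(b, G, mu, d_m):.3e} m")
```

Doubling the aperture multiplies T by 4 and L by 16 in this model; the fourth-order dependence of L relies on the fixed-pressure-gradient assumption, since at fixed mean velocity L would scale only as b^2.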
Numerical calculations of non-inductive current driven by microwaves in JET

NASA Astrophysics Data System (ADS)

Kirov, K. K.; Baranov, Yu; Mailloux, J.; Nave, M. F. F.; Contributors, JET

2016-12-01

Recent studies at JET focus on analysis of the lower hybrid (LH) wave power absorption and current drive (CD) calculations by means of a new ray tracing (RT)/Fokker-Planck (FP) package. The RT code works in real 2D geometry, accounting for the plasma boundary and the launcher shape. LH waves with different parallel refractive index, $N_\parallel$, spectra in the poloidal direction can be launched, thus simulating an authentic antenna spectrum with rows fed by different combinations of klystrons. Various FP solvers were tested, the most advanced of which is a relativistic bounce-averaged FP code. LH wave power deposition profiles from the new RT/FP code were compared to the experimental results from electron cyclotron emission (ECE) analysis of pulses at 3.4 T at low and high density. This kind of direct comparison between power deposition profiles from experimental ECE data and a numerical model was carried out for the first time for waves in the LH range of frequencies. The results were in reasonable agreement with experimental data at lower density, with line-averaged values of $n_\text{e} \approx 2.4 \times 10^{19}\,\text{m}^{-3}$. At higher density, $n_\text{e} \approx 3 \times 10^{19}\,\text{m}^{-3}$, the code predicted larger on-axis LH power deposition, which is inconsistent with the experimental observations. Both calculations were unable to produce LH wave absorption at the plasma periphery, which contradicts the analysis of the ECE data; possible sources of these discrepancies are briefly discussed in the paper. The code was also used to calculate the LH power deposition and CD profiles for the low-density preheat phase of JET's advanced tokamak (AT) scenario. It was found that as the density evolves from hollow to flat and then to a more peaked profile, the LH power and driven current move inward, i.e. towards the plasma axis. A total driven current of about 70 kA for 1 MW of launched LH power was predicted in these conditions.
