Gigaflop performance on a CRAY-2: Multitasking a computational fluid dynamics application
NASA Technical Reports Server (NTRS)
Tennille, Geoffrey M.; Overman, Andrea L.; Lambiotte, Jules J.; Streett, Craig L.
1991-01-01
The methodology is described for converting a large, long-running applications code that executed on a single processor of a CRAY-2 supercomputer to a version that executed efficiently on multiple processors. Although the conversion of every application is different, a discussion of the types of modification used to achieve gigaflop performance is included to assist others in the parallelization of applications for CRAY computers, especially those that were developed for other computers. An existing application, from the discipline of computational fluid dynamics, that had utilized over 2000 hours of CPU time on the CRAY-2 during the previous year was chosen as a test case to study the effectiveness of multitasking on a CRAY-2. The nature of the dominant calculations within the application indicated that a sustained computational rate of 1 billion floating-point operations per second, or 1 gigaflop, might be achieved. The code was first analyzed and modified for optimal performance on a single processor in a batch environment. After optimal performance on a single CPU was achieved, the code was modified to use multiple processors in a dedicated environment. The results of these two efforts were merged into a single code that had a sustained computational rate of over 1 gigaflop on a CRAY-2. Timings and analysis of performance are given for both single- and multiple-processor runs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simunovic, Srdjan
2015-02-16
CASL's modeling and simulation technology, the Virtual Environment for Reactor Applications (VERA), incorporates coupled physics and science-based models, state-of-the-art numerical methods, modern computational science, integrated uncertainty quantification (UQ), and validation against data from operating pressurized water reactors (PWRs), single-effect experiments, and integral tests. The computational simulation component of VERA is the VERA Core Simulator (VERA-CS). The core simulator is the specific collection of multi-physics computer codes used to model and deplete an LWR core over multiple cycles. The core simulator has a single common input file that drives all of the different physics codes. The parser code, VERAIn, converts VERA input into an XML file that is used as input to the different VERA codes.
NASA Technical Reports Server (NTRS)
Capo, M. A.; Disney, R. K.
1971-01-01
The work performed in the following areas is summarized: (1) A realistic nuclear-propelled vehicle was analyzed using the Marshall Space Flight Center computer code package, which includes one- and two-dimensional discrete ordinate transport, point kernel, and single scatter techniques, as well as cross section preparation and data processing codes. (2) Techniques were developed to improve the automated data transfer in the coupled computation method of the computer code package and to improve the utilization of this code package on the Univac-1108 computer system. (3) The MSFC master data libraries were updated.
A Computer Program for Flow-Log Analysis of Single Holes (FLASH)
Day-Lewis, F. D.; Johnson, C.D.; Paillet, Frederick L.; Halford, K.J.
2011-01-01
A new computer program, FLASH (Flow-Log Analysis of Single Holes), is presented for the analysis of borehole vertical flow logs. The code is based on an analytical solution for steady-state multilayer radial flow to a borehole. The code includes options for (1) discrete fractures and (2) multilayer aquifers. Given vertical flow profiles collected under both ambient and stressed (pumping or injection) conditions, the user can estimate fracture (or layer) transmissivities and far-field hydraulic heads. FLASH is coded in Microsoft Excel with Visual Basic for Applications routines. The code supports manual and automated model calibration. © 2011, The Author(s). Ground Water © 2011, National Ground Water Association.
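The analytical core of such a flow-log analysis can be written compactly. As a sketch only (the exact formulation coded in FLASH may differ), a steady-state Thiem-type relation gives the inflow from each layer to the borehole:

    % Sketch of a steady-state multilayer radial-flow model (assumed form,
    % not necessarily FLASH's exact equations). Q_i: inflow from layer i,
    % T_i: layer transmissivity, h_i: far-field head, h_w: head in the
    % borehole, r_0: radius of influence, r_w: borehole radius.
    \[
      Q_i = \frac{2\pi T_i \,\left(h_i - h_w\right)}{\ln\left(r_0/r_w\right)},
      \qquad
      \sum_i Q_i = Q_{\mathrm{pump}}
    \]

Matching the measured ambient and stressed flow profiles against relations of this kind is what allows the layer transmissivities T_i and far-field heads h_i to be estimated.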
An evaluation of four single element airfoil analytic methods
NASA Technical Reports Server (NTRS)
Freuler, R. J.; Gregorek, G. M.
1979-01-01
A comparison of four computer codes for the analysis of two-dimensional single element airfoil sections is presented for three classes of section geometries. Two of the computer codes utilize vortex singularity methods to obtain the potential flow solution. The other two codes solve the full inviscid potential flow equation using finite differencing techniques, allowing results to be obtained for transonic flow about an airfoil including weak shocks. Each program incorporates boundary layer routines for computing the boundary layer displacement thickness and boundary layer effects on aerodynamic coefficients. Computational results are given for a symmetrical section represented by an NACA 0012 profile, a conventional section illustrated by an NACA 65A413 profile, and a supercritical type section for general aviation applications typified by a NASA LS(1)-0413 section. The four codes are compared and contrasted in the areas of method of approach, range of applicability, agreement among each other and with experiment, individual advantages and disadvantages, computer run times and memory requirements, and operational idiosyncrasies.
Multi-processing on supercomputers for computational aerodynamics
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; Mehta, Unmeel B.
1990-01-01
The MIMD concept is applied, through multitasking, with relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. An existing single processor algorithm is mapped without the need for developing a new algorithm. The procedure of designing a code utilizing this approach is automated with the Unix stream editor. A Multiple Processor Multiple Grid (MPMG) code is developed as a demonstration of this approach. This code solves the three-dimensional, Reynolds-averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. This solver is applied to a generic, oblique-wing aircraft problem on a four-processor computer using one process for data management and nonparallel computations and three processes for pseudotime advance on three different grid systems.
New double-byte error-correcting codes for memory systems
NASA Technical Reports Server (NTRS)
Feng, Gui-Liang; Wu, Xinen; Rao, T. R. N.
1996-01-01
Error-correcting or error-detecting codes have been used in the computer industry to increase reliability, reduce service costs, and maintain data integrity. The single-byte error-correcting and double-byte error-detecting (SbEC-DbED) codes have been successfully used in computer memory subsystems. There are many methods to construct double-byte error-correcting (DBEC) codes. In the present paper we construct a class of double-byte error-correcting codes, which are more efficient than those known to be optimum, and a decoding procedure for our codes is also considered.
Nonuniform code concatenation for universal fault-tolerant quantum computing
NASA Astrophysics Data System (ADS)
Nikahd, Eesa; Sedighi, Mehdi; Saheb Zamani, Morteza
2017-09-01
Using transversal gates is a straightforward and efficient technique for fault-tolerant quantum computing. Since transversal gates alone cannot be computationally universal, they must be combined with other approaches such as magic state distillation, code switching, or code concatenation to achieve universality. In this paper we propose an alternative approach for universal fault-tolerant quantum computing, mainly based on the code concatenation approach proposed in [T. Jochym-O'Connor and R. Laflamme, Phys. Rev. Lett. 112, 010505 (2014), 10.1103/PhysRevLett.112.010505], but in a nonuniform fashion. The proposed approach is described based on nonuniform concatenation of the 7-qubit Steane code with the 15-qubit Reed-Muller code, as well as the 5-qubit code with the 15-qubit Reed-Muller code, which lead to 49-qubit and 47-qubit codes, respectively. These codes can correct any arbitrary single physical error with the ability to perform a universal set of fault-tolerant gates, without using magic state distillation.
NASA Technical Reports Server (NTRS)
Logan, Terry G.
1994-01-01
The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Performance results are obtained on the CM-5 with 32, 64, and 128 nodes, along with those on the Cray-YMP with a single processor. The comparison of the performance indicates that the parallel CM-FORTRAN code nearly matches or outperforms the equivalent serial FORTRAN code for some cases.
Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB
NASA Technical Reports Server (NTRS)
Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.
2017-01-01
Demonstrating speedup for parallel code on a multicore shared-memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit the potential for improvement of serial code even for so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed, such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated, but only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated, but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
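Although the study above is MATLAB-specific, the underlying computation is easy to sketch: each parallel worker independently perturbs one coordinate and forms one Jacobian column by forward differences. The following CUDA sketch is illustrative only; the residual f, all names, and the one-thread-per-column mapping are assumptions, not the paper's spmd implementation.

    // Forward-difference Jacobian, one thread per column:
    // J(:,j) = (f(x + h*e_j) - f(x)) / h. Hypothetical sketch.
    #include <cstdio>
    #include <cuda_runtime.h>

    #define N 8  // problem size, kept small for the sketch

    __host__ __device__ void f(const double* x, double* y) {
        for (int i = 0; i < N; ++i)              // example residual:
            y[i] = x[i] * x[i] - x[(i + 1) % N]; // y_i = x_i^2 - x_{(i+1) mod N}
    }

    __global__ void jacobianFD(const double* x, const double* f0, double* J, double h) {
        int j = blockIdx.x * blockDim.x + threadIdx.x; // column owned by this thread
        if (j >= N) return;
        double xp[N], yp[N];
        for (int i = 0; i < N; ++i) xp[i] = x[i];
        xp[j] += h;                                    // perturb coordinate j only
        f(xp, yp);
        for (int i = 0; i < N; ++i)
            J[i + j * N] = (yp[i] - f0[i]) / h;        // column-major J(i,j)
    }

    int main() {
        double x[N], f0[N], J[N * N], h = 1e-7;
        for (int i = 0; i < N; ++i) x[i] = 1.0 + 0.1 * i;
        f(x, f0);                                      // unperturbed residual on host
        double *dx, *df0, *dJ;
        cudaMalloc(&dx, sizeof x);  cudaMalloc(&df0, sizeof f0);  cudaMalloc(&dJ, sizeof J);
        cudaMemcpy(dx, x, sizeof x, cudaMemcpyHostToDevice);
        cudaMemcpy(df0, f0, sizeof f0, cudaMemcpyHostToDevice);
        jacobianFD<<<1, N>>>(dx, df0, dJ, h);
        cudaMemcpy(J, dJ, sizeof J, cudaMemcpyDeviceToHost);
        printf("J(0,0) = %.6f (expect ~2*x_0 = 2.0)\n", J[0]);
        cudaFree(dx); cudaFree(df0); cudaFree(dJ);
        return 0;
    }

Because every column is independent, the kernel has no inter-thread communication at all, which is exactly what makes the problem embarrassingly parallel.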
ALEGRA -- A massively parallel h-adaptive code for solid dynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Summers, R.M.; Wong, M.K.; Boucheron, E.A.
1997-12-31
ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on the teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high precision simulation capabilities for ALEGRA, without the computational cost of using a global highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.
Multi-phase SPH modelling of violent hydrodynamics on GPUs
NASA Astrophysics Data System (ADS)
Mokos, Athanasios; Rogers, Benedict D.; Stansby, Peter K.; Domínguez, José M.
2015-11-01
This paper presents the acceleration of multi-phase smoothed particle hydrodynamics (SPH) using a graphics processing unit (GPU) enabling large numbers of particles (10-20 million) to be simulated on just a single GPU card. With novel hardware architectures such as a GPU, the optimum approach to implement a multi-phase scheme presents some new challenges. Many more particles must be included in the calculation and there are very different speeds of sound in each phase with the largest speed of sound determining the time step. This requires efficient computation. To take full advantage of the hardware acceleration provided by a single GPU for a multi-phase simulation, four different algorithms are investigated: conditional statements, binary operators, separate particle lists and an intermediate global function. Runtime results show that the optimum approach needs to employ separate cell and neighbour lists for each phase. The profiler shows that this approach leads to a reduction in both memory transactions and arithmetic operations giving significant runtime gains. The four different algorithms are compared to the efficiency of the optimised single-phase GPU code, DualSPHysics, for 2-D and 3-D simulations which indicate that the multi-phase functionality has a significant computational overhead. A comparison with an optimised CPU code shows a speed up of an order of magnitude over an OpenMP simulation with 8 threads and two orders of magnitude over a single thread simulation. A demonstration of the multi-phase SPH GPU code is provided by a 3-D dam break case impacting an obstacle. This shows better agreement with experimental results than an equivalent single-phase code. The multi-phase GPU code enables a convergence study to be undertaken on a single GPU with a large number of particles that otherwise would have required large high performance computing resources.
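The winning strategy, separate particle lists per phase, can be sketched as follows: one compacted index list and one kernel launch per phase, so warps never branch on phase. This is a hypothetical illustration of the pattern with placeholder physics, not the DualSPHysics implementation; all names are invented.

    // One kernel launch per phase over that phase's compacted particle list,
    // so there is no per-particle phase branching inside a warp.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void updatePhase(const int* list, int n, float3* vel,
                                float soundSpeed, float dt) {
        int k = blockIdx.x * blockDim.x + threadIdx.x;
        if (k >= n) return;
        int i = list[k];          // particle index belonging to this phase only
        // A real SPH kernel would do the neighbour sweep here; soundSpeed is a
        // per-phase constant, so the time-step-critical loop never branches.
        vel[i].y -= 9.81f * dt;   // placeholder: gravity only
    }

    int main() {
        const int n = 4;
        int water[] = {0, 2}, air[] = {1, 3};          // compacted per-phase lists
        float3 v[n] = {};
        float3* dv; int *dw, *da;
        cudaMalloc(&dv, n * sizeof(float3)); cudaMemcpy(dv, v, n * sizeof(float3), cudaMemcpyHostToDevice);
        cudaMalloc(&dw, 2 * sizeof(int));    cudaMemcpy(dw, water, 2 * sizeof(int), cudaMemcpyHostToDevice);
        cudaMalloc(&da, 2 * sizeof(int));    cudaMemcpy(da, air, 2 * sizeof(int), cudaMemcpyHostToDevice);
        updatePhase<<<1, 32>>>(dw, 2, dv, 1480.f, 1e-4f);  // water pass
        updatePhase<<<1, 32>>>(da, 2, dv,  340.f, 1e-4f);  // air pass
        cudaMemcpy(v, dv, n * sizeof(float3), cudaMemcpyDeviceToHost);
        printf("vy[0] = %g\n", v[0].y);
        cudaFree(dv); cudaFree(dw); cudaFree(da);
        return 0;
    }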
Computational techniques in gamma-ray skyshine analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
George, D.L.
1988-12-01
Two computer codes were developed to analyze gamma-ray skyshine, the scattering of gamma photons by air molecules. A review of previous gamma-ray skyshine studies discusses several Monte Carlo codes, programs using a single-scatter model, and the MicroSkyshine program for microcomputers. A benchmark gamma-ray skyshine experiment performed at Kansas State University is also described. A single-scatter numerical model was presented which traces photons from the source to their first scatter, then applies a buildup factor along a direct path from the scattering point to a detector. The FORTRAN code SKY, developed with this model before the present study, was modified to use Gauss quadrature, recent photon attenuation data and a more accurate buildup approximation. The resulting code, SILOGP, computes response from a point photon source on the axis of a silo, with and without concrete shielding over the opening. Another program, WALLGP, was developed using the same model to compute response from a point gamma source behind a perfectly absorbing wall, with and without shielding overhead. 29 refs., 48 figs., 13 tabs.
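Schematically, the single-scatter-with-buildup model described above evaluates the detector response as an integral over first-scatter points. The following is a sketch of the assumed form, not necessarily the exact expression coded in SILOGP or WALLGP:

    % Single-scatter estimate with buildup on the scatter-to-detector leg
    % (assumed form). S: source strength, mu: attenuation coefficient at the
    % source energy, mu': at the Compton-shifted energy, r_1: source-to-scatter
    % distance, r_2: scatter-to-detector distance, K(theta): differential
    % scattering kernel per unit volume, B: buildup factor.
    \[
      D \;\approx\; \int_V
        \frac{S\, e^{-\mu r_1}}{4\pi r_1^{2}}\;
        K(\theta)\;
        B\!\left(\mu' r_2\right)\, \frac{e^{-\mu' r_2}}{4\pi r_2^{2}}\; dV
    \]

The buildup factor B is what restores, approximately, the contribution of higher-order scatters that the single-scatter trace itself ignores.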
Finite difference time domain electromagnetic scattering from frequency-dependent lossy materials
NASA Technical Reports Server (NTRS)
Luebbers, Raymond J.; Beggs, John H.
1991-01-01
Four different FDTD computer codes and companion Radar Cross Section (RCS) conversion codes on magnetic media are submitted. A single three dimensional dispersive FDTD code for both dispersive dielectric and magnetic materials was developed, along with a user's manual. The extension of FDTD to more complicated materials was made. The code is efficient and is capable of modeling interesting radar targets using a modest computer workstation platform. RCS results for two different plate geometries are reported. The FDTD method was also extended to computing far zone time domain results in two dimensions. Also the capability to model nonlinear materials was incorporated into FDTD and validated.
Production Level CFD Code Acceleration for Hybrid Many-Core Architectures
NASA Technical Reports Server (NTRS)
Duffy, Austen C.; Hammond, Dana P.; Nielsen, Eric J.
2012-01-01
In this work, a novel graphics processing unit (GPU) distributed sharing model for hybrid many-core architectures is introduced and employed in the acceleration of a production-level computational fluid dynamics (CFD) code. The latest generation graphics hardware allows multiple processor cores to simultaneously share a single GPU through concurrent kernel execution. This feature has allowed the NASA FUN3D code to be accelerated in parallel with up to four processor cores sharing a single GPU. For codes to scale and fully use resources on these and the next generation machines, codes will need to employ some type of GPU sharing model, as presented in this work. Findings include the effects of GPU sharing on overall performance. A discussion of the inherent challenges that parallel unstructured CFD codes face in accelerator-based computing environments is included, with considerations for future generation architectures. This work was completed by the author in August 2010, and reflects the analysis and results of that time.
Porting plasma physics simulation codes to modern computing architectures using the
NASA Astrophysics Data System (ADS)
Germaschewski, Kai; Abbott, Stephen
2015-11-01
Available computing power has continued to grow exponentially even after single-core performance saturated in the last decade. The increase has since been driven by more parallelism, both using more cores and having more parallelism in each core, e.g. in GPUs and Intel Xeon Phi. Adapting existing plasma physics codes is challenging, in particular as there is no single programming model that covers current and future architectures. We will introduce the open-source
NASA Technical Reports Server (NTRS)
Beers, B. L.; Pine, V. W.; Hwang, H. C.; Bloomberg, H. W.; Lin, D. L.; Schmidt, M. J.; Strickland, D. J.
1979-01-01
The model consists of four phases: single electron dynamics, single electron avalanche, negative streamer development, and tree formation. Numerical algorithms and computer code implementations are presented for the first three phases. An approach to developing a code description of the fourth phase is discussed. Numerical results are presented for a crude material model of Teflon.
CFD Modeling of Free-Piston Stirling Engines
NASA Technical Reports Server (NTRS)
Ibrahim, Mounir B.; Zhang, Zhi-Guo; Tew, Roy C., Jr.; Gedeon, David; Simon, Terrence W.
2001-01-01
NASA Glenn Research Center (GRC) is funding Cleveland State University (CSU) to develop a reliable Computational Fluid Dynamics (CFD) code that can predict engine performance with the goal of significant improvements in accuracy when compared to one-dimensional (1-D) design code predictions. The funding also includes conducting code validation experiments at both the University of Minnesota (UMN) and CSU. In this paper a brief description of the work-in-progress is provided in the two areas (CFD and Experiments). Also, previous test results are compared with computational data obtained using (1) a 2-D CFD code obtained from Dr. Georg Scheuerer and further developed at CSU and (2) a multidimensional commercial code CFD-ACE+. The test data and computational results are for (1) a gas spring and (2) a single piston/cylinder with attached annular heat exchanger. The comparisons among the codes are discussed. The paper also discusses plans for conducting code validation experiments at CSU and UMN.
NASA Technical Reports Server (NTRS)
Teske, M. E.
1984-01-01
This is a user manual for the computer code AGDISP (AGricultural DISPersal), which has been developed to predict the deposition of material released from fixed and rotary wing aircraft in a single-pass, computationally efficient manner. The formulation of the code is novel in that the mean particle trajectory and the variance about the mean resulting from turbulent fluid fluctuations are simultaneously predicted. The code presently includes the capability of assessing the influence of neutral atmospheric conditions, inviscid wake vortices, particle evaporation, plant canopy and terrain on the deposition pattern.
Computational techniques for solar wind flows past terrestrial planets: Theory and computer programs
NASA Technical Reports Server (NTRS)
Stahara, S. S.; Chaussee, D. S.; Trudinger, B. C.; Spreiter, J. R.
1977-01-01
The interaction of the solar wind with terrestrial planets can be predicted using a computer program based on a single fluid, steady, dissipationless, magnetohydrodynamic model to calculate the axisymmetric, supersonic, super-Alfvenic solar wind flow past both magnetic and nonmagnetic planets. The actual calculations are implemented by an assemblage of computer codes organized into one program. These include finite difference codes which determine the gas-dynamic solution, together with a variety of special purpose output codes for determining and automatically plotting both flow field and magnetic field results. Comparisons are made with previous results, and results are presented for a number of solar wind flows. The computational programs developed are documented and are presented in a general user's manual which is included.
Addressing the challenges of standalone multi-core simulations in molecular dynamics
NASA Astrophysics Data System (ADS)
Ocaya, R. O.; Terblans, J. J.
2017-07-01
Computational modelling in material science involves mathematical abstractions of force fields between particles with the aim to postulate, develop and understand materials by simulation. The aggregated pairwise interactions of the material's particles lead to a deduction of its macroscopic behaviours. For practically meaningful macroscopic scales, a large amount of data is generated, leading to vast execution times. Simulation times of hours, days or weeks for moderately sized problems are not uncommon. The reduction of simulation times, improved result accuracy and the associated software and hardware engineering challenges are the main motivations for much of the ongoing research in the computational sciences. This contribution is concerned mainly with simulations that can be done on a "standalone" computer based on Message Passing Interface (MPI) parallel code running on hardware platforms with wide specifications, such as single/multi-processor, multi-core machines with minimal reconfiguration for upward scaling of computational power. The widely available, documented and standardized MPI library provides this functionality through the MPI_Comm_size(), MPI_Comm_rank() and MPI_Reduce() functions. A survey of the literature shows that relatively little is written with respect to the efficient extraction of the inherent computational power in a cluster. In this work, we discuss the main avenues available to tap into this extra power without compromising computational accuracy. We also present methods to overcome the high inertia encountered in single-node-based computational molecular dynamics. We begin by surveying the current state of the art and discuss what it takes to achieve parallelism, efficiency and enhanced computational accuracy through program threads and message passing interfaces. Several code illustrations are given. The pros and cons of writing raw code as opposed to using heuristic, third-party code are also discussed. The growing trend towards graphical processor units and virtual computing clouds for high-performance computing is also discussed. Finally, we present the comparative results of vacancy formation energy calculations using our own parallelized standalone code called Verlet-Stormer velocity (VSV) operating on 30,000 copper atoms. The code is based on the Sutton-Chen implementation of the Finnis-Sinclair pairwise embedded atom potential. A link to the code is also given.
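A minimal sketch of the MPI pattern named above, restricted to the three calls the abstract cites. The loop decomposition and the pair term are placeholders, not the authors' VSV code; it compiles as host-side C++ (e.g. with mpicxx, or as CUDA host code under nvcc with MPI).

    // Each rank sums pair energies for its strided share of atoms, then
    // MPI_Reduce combines the partial sums on rank 0. Placeholder physics.
    #include <mpi.h>
    #include <cstdio>
    #include <cmath>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  // this process's id
        MPI_Comm_size(MPI_COMM_WORLD, &size);  // total number of processes
        const int nAtoms = 3000;               // small stand-in for the 30,000-atom runs
        double local = 0.0;
        // Static strided decomposition of the outer pair loop over ranks.
        for (int i = rank; i < nAtoms; i += size)
            for (int j = i + 1; j < nAtoms; ++j)
                local += 1.0 / std::sqrt((double)(j - i)); // placeholder pair term
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("total pair energy (placeholder): %g\n", total);
        MPI_Finalize();
        return 0;
    }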
NASA Technical Reports Server (NTRS)
Goldstein, David B.; Varghese, Philip L.
1997-01-01
We proposed to create a single computational code incorporating methods that can model both rarefied and continuum flow to enable the efficient simulation of flow about spacecraft and high-altitude hypersonic aerospace vehicles. The code was to use a single grid structure that permits a smooth transition between the continuum and rarefied portions of the flow. Developing an appropriate computational boundary between the two regions represented a major challenge. The primary approach chosen involves coupling a four-speed Lattice Boltzmann model for the continuum flow with the DSMC method in the rarefied regime. We also explored the possibility of using a standard finite difference Navier Stokes solver for the continuum flow. With the resulting code we will ultimately investigate three-dimensional plume impingement effects, a subject of critical importance to NASA and related to the work of Drs. Forrest Lumpkin, Steve Fitzgerald and Jay Le Beau at Johnson Space Center. Below is a brief background on the project and a summary of the results as of the end of the grant.
NASA Technical Reports Server (NTRS)
Bjork, C.
1981-01-01
The REEDS (rocket exhaust effluent diffusion single layer) computer code is used for the estimation of certain rocket exhaust effluent concentrations and dosages and their distributions near the Earth's surface following a rocket launch event. Output from REEDS is used in producing near real time air quality and environmental assessments of the effects of certain potentially harmful effluents, namely HCl, Al2O3, CO, and NO.
Microgravity computing codes. User's guide
NASA Astrophysics Data System (ADS)
1982-01-01
Codes used in microgravity experiments to compute fluid parameters and to obtain data graphically are introduced. The computer programs are stored on two diskettes, compatible with the floppy disk drives of the Apple 2. Two versions of both disks are available (DOS-2 and DOS-3). The codes are written in BASIC and are structured as interactive programs. Interaction takes place through the keyboard of any Apple 2-48K standard system with single floppy disk drive. The programs are protected against wrong commands given by the operator. The programs are described step by step in the same order as the instructions displayed on the monitor. Most of these instructions are shown, with samples of computation and of graphics.
Guide to AERO2S and WINGDES Computer Codes for Prediction and Minimization of Drag Due to Lift
NASA Technical Reports Server (NTRS)
Carlson, Harry W.; Chu, Julio; Ozoroski, Lori P.; McCullers, L. Arnold
1997-01-01
The computer codes, AERO2S and WINGDES, are now widely used for the analysis and design of airplane lifting surfaces under conditions that tend to induce flow separation. These codes have undergone continued development to provide additional capabilities since the introduction of the original versions over a decade ago. This code development has been reported in a variety of publications (NASA technical papers, NASA contractor reports, and society journals). Some modifications have not been publicized at all. Users of these codes have suggested the desirability of combining in a single document the descriptions of the code development, an outline of the features of each code, and suggestions for effective code usage. This report is intended to supply that need.
Multiprocessing on supercomputers for computational aerodynamics
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; Mehta, Unmeel B.
1990-01-01
Very little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPs or more) in computational aerodynamics to significantly improve turnaround time. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, the improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) through multi-tasking is applied via a strategy which requires relatively minor modifications to an existing code for a single processor. Essentially, this approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. The existing single processor code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor. As a demonstration of this approach, a Multiple Processor Multiple Grid (MPMG) code is developed. It is capable of using nine processors, and can be easily extended to a larger number of processors. This code solves the three-dimensional, Reynolds averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. The solver is applied to a generic oblique-wing aircraft problem on a four-processor Cray-2 computer. A tricubic interpolation scheme is developed to increase the accuracy of coupling of overlapped grids. For the oblique-wing aircraft problem, a speedup of two in elapsed (turnaround) time is observed in a saturated time-sharing environment.
Analytical investigation of the dynamics of tethered constellations in Earth orbit, phase 2
NASA Technical Reports Server (NTRS)
Lorenzini, E.
1985-01-01
This Quarterly Report deals with the deployment maneuver of a single-axis, vertical constellation with three masses. A new, easy to handle, computer code that simulates the two-dimensional dynamics of the constellation has been implemented. This computer code is used for designing control laws for the deployment maneuver that minimizes the acceleration level of the low-g platform during the maneuver.
NASA Technical Reports Server (NTRS)
Glassman, Arthur J.; Jones, Scott M.
1991-01-01
This analysis and this computer code apply to full, split, and dual expander cycles. Heat regeneration from the turbine exhaust to the pump exhaust is allowed. The combustion process is modeled as one of chemical equilibrium in an infinite-area or a finite-area combustor. Gas composition in the nozzle may be either equilibrium or frozen during expansion. This report, which serves as a users guide for the computer code, describes the system, the analysis methodology, and the program input and output. Sample calculations are included to show effects of key variables such as nozzle area ratio and oxidizer-to-fuel mass ratio.
Particle Hydrodynamics with Material Strength for Multi-Layer Orbital Debris Shield Design
NASA Technical Reports Server (NTRS)
Fahrenthold, Eric P.
1999-01-01
Three dimensional simulation of oblique hypervelocity impact on orbital debris shielding places extreme demands on computer resources. Research to date has shown that particle models provide the most accurate and efficient means for computer simulation of shield design problems. In order to employ a particle based modeling approach to the wall plate impact portion of the shield design problem, it is essential that particle codes be augmented to represent strength effects. This report describes augmentation of a Lagrangian particle hydrodynamics code developed by the principal investigator, to include strength effects, allowing for the entire shield impact problem to be represented using a single computer code.
Evolution of plastic anisotropy for high-strain-rate computations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schiferl, S.K.; Maudlin, P.J.
1994-12-01
A model for anisotropic material strength, and for changes in the anisotropy due to plastic strain, is described. This model has been developed for use in high-rate, explicit, Lagrangian multidimensional continuum-mechanics codes. The model handles anisotropies in single-phase materials, in particular the anisotropies due to crystallographic texture--preferred orientations of the single-crystal grains. Textural anisotropies, and the changes in these anisotropies, depend overwhelmingly on the crystal structure of the material and on the deformation history. The changes, particularly for complex deformations, are not amenable to simple analytical forms. To handle this problem, the material model described here includes a texture code, or micromechanical calculation, coupled to a continuum code. The texture code updates grain orientations as a function of tensor plastic strain, and calculates the yield strength in different directions. A yield function is fitted to these yield points. For each computational cell in the continuum simulation, the texture code tracks a particular set of grain orientations. The orientations will change due to the tensor strain history, and the yield function will change accordingly. Hence, the continuum code supplies a tensor strain to the texture code, and the texture code supplies an updated yield function to the continuum code. Since significant texture changes require relatively large strains--typically, a few percent or more--the texture code is not called very often, and the increase in computer time is not excessive. The model was implemented using a finite-element continuum code and a texture code specialized for hexagonal-close-packed crystal structures. The results for several uniaxial stress problems and an explosive-forming problem are shown.
Network Coding for Function Computation
ERIC Educational Resources Information Center
Appuswamy, Rathinakumar
2011-01-01
In this dissertation, the following "network computing problem" is considered. Source nodes in a directed acyclic network generate independent messages and a single receiver node computes a target function f of the messages. The objective is to maximize the average number of times f can be computed per network usage, i.e., the "computing…
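In the network-computing literature this objective is usually formalized as a computing rate and capacity. As a sketch of the standard definitions (assumed here, since the abstract is truncated): a (k,n) solution lets the receiver compute f for k independent batches of source messages in n uses of the network, and the capacity is the best achievable ratio,

    \[
      \mathcal{C}(\mathcal{N}, f)
        \;=\; \sup\left\{ \tfrac{k}{n} \;:\;
          \text{a } (k,n) \text{ network code computing } f \text{ exists} \right\}
    \]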
With or without you: predictive coding and Bayesian inference in the brain
Aitchison, Laurence; Lengyel, Máté
2018-01-01
Two theoretical ideas have emerged recently with the ambition to provide a unifying functional explanation of neural population coding and dynamics: predictive coding and Bayesian inference. Here, we describe the two theories and their combination into a single framework: Bayesian predictive coding. We clarify how the two theories can be distinguished, despite sharing core computational concepts and addressing an overlapping set of empirical phenomena. We argue that predictive coding is an algorithmic / representational motif that can serve several different computational goals of which Bayesian inference is but one. Conversely, while Bayesian inference can utilize predictive coding, it can also be realized by a variety of other representations. We critically evaluate the experimental evidence supporting Bayesian predictive coding and discuss how to test it more directly. PMID:28942084
NASA Technical Reports Server (NTRS)
Baumeister, Joseph F.
1994-01-01
A non-flowing, electrically heated test rig was developed to verify computer codes that calculate radiant energy propagation from nozzle geometries that represent aircraft propulsion nozzle systems. Since there are a variety of analysis tools used to evaluate thermal radiation propagation from partially enclosed nozzle surfaces, an experimental benchmark test case was developed for code comparison. This paper briefly describes the nozzle test rig and the developed analytical nozzle geometry used to compare the experimental and predicted thermal radiation results. A major objective of this effort was to make available the experimental results and the analytical model in a format to facilitate conversion to existing computer code formats. For code validation purposes this nozzle geometry represents one validation case for one set of analysis conditions. Since each computer code has advantages and disadvantages based on scope, requirements, and desired accuracy, the usefulness of this single nozzle baseline validation case can be limited for some code comparisons.
MHD code using multi graphical processing units: SMAUG+
NASA Astrophysics Data System (ADS)
Gyenge, N.; Griffiths, M. K.; Erdélyi, R.
2018-01-01
This paper introduces the Sheffield Magnetohydrodynamics Algorithm Using GPUs (SMAUG+), an advanced numerical code for solving magnetohydrodynamic (MHD) problems, using multi-GPU systems. Multi-GPU systems facilitate the development of accelerated codes and enable us to investigate larger model sizes and/or more detailed computational domain resolutions. This is a significant advancement over the parent single-GPU MHD code, SMAUG (Griffiths et al., 2015). Here, we demonstrate the validity of the SMAUG+ code, describe the parallelisation techniques and investigate performance benchmarks. The initial configuration of the Orszag-Tang vortex simulations is distributed among 4, 16, 64 and 100 GPUs. Furthermore, different simulation box resolutions are applied: 1000 × 1000, 2044 × 2044, 4000 × 4000 and 8000 × 8000. We also tested the code with the Brio-Wu shock tube simulations with a model size of 800, employing up to 10 GPUs. Based on the test results, we observed speed-ups and slowdowns, depending on the granularity and the communication overhead of certain parallel tasks. The main aim of the code development is to provide a massively parallel code without the memory limitation of a single GPU. By using our code, the applied model size could be significantly increased. We demonstrate that we are able to successfully compute numerically valid and large 2D MHD problems.
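Distributing one MHD domain over many GPUs in this way normally hinges on ghost-cell (halo) exchange between neighbouring sub-domains each time step, and that exchange is the communication overhead the benchmarks above measure. The following is a hypothetical sketch of the step for a 1-D slab decomposition with one MPI rank per GPU; the names, the staged host copy, and the memory layout are all assumptions, not the SMAUG+ source.

    // Hypothetical halo exchange for a 1-D slab decomposition of a 2-D grid.
    // dField points at the first interior row; one ghost row precedes it and
    // one follows the nyLocal interior rows (assumed layout). up/down are
    // neighbour ranks or MPI_PROC_NULL at the domain edges.
    #include <mpi.h>
    #include <cuda_runtime.h>

    void exchangeHalos(double* dField, int nx, int nyLocal, int up, int down) {
        double *sUp = new double[nx], *sDn = new double[nx];
        double *rUp = new double[nx], *rDn = new double[nx];
        // Boundary interior rows: GPU -> host staging buffers.
        cudaMemcpy(sUp, dField + (size_t)(nyLocal - 1) * nx, nx * sizeof(double),
                   cudaMemcpyDeviceToHost);
        cudaMemcpy(sDn, dField, nx * sizeof(double), cudaMemcpyDeviceToHost);
        // Pairwise exchange with both neighbours.
        MPI_Sendrecv(sUp, nx, MPI_DOUBLE, up,   0,
                     rDn, nx, MPI_DOUBLE, down, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(sDn, nx, MPI_DOUBLE, down, 1,
                     rUp, nx, MPI_DOUBLE, up,   1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        // Received rows: host -> GPU ghost rows just outside the interior slab.
        cudaMemcpy(dField + (size_t)nyLocal * nx, rUp, nx * sizeof(double),
                   cudaMemcpyHostToDevice);
        cudaMemcpy(dField - nx, rDn, nx * sizeof(double), cudaMemcpyHostToDevice);
        delete[] sUp; delete[] sDn; delete[] rUp; delete[] rDn;
    }

On hardware that supports it, CUDA-aware MPI or peer-to-peer copies would remove the host staging step; the structure of the exchange is the same.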
A coded tracking telemetry system
Howey, P.W.; Seegar, W.S.; Fuller, M.R.; Titus, K.; Amlaner, Charles J.
1989-01-01
We describe the general characteristics of an automated radio telemetry system designed to operate for prolonged periods on a single frequency. Each transmitter sends a unique coded signal to a receiving system that encodes and records only the appropriate, pre-programmed codes. A record of the time of each reception is stored on diskettes in a micro-computer. This system enables continuous monitoring of infrequent signals (e.g. one per minute or one per hour), thus extending operation life or allowing size reduction of the transmitter, compared to conventional wildlife telemetry. Furthermore, when using unique codes transmitted on a single frequency, biologists can monitor many individuals without exceeding the radio frequency allocations for wildlife.
Proceedings of the 14th International Conference on the Numerical Simulation of Plasmas
NASA Astrophysics Data System (ADS)
Partial Contents are as follows: Numerical Simulations of the Vlasov-Maxwell Equations by Coupled Particle-Finite Element Methods on Unstructured Meshes; Electromagnetic PIC Simulations Using Finite Elements on Unstructured Grids; Modelling Travelling Wave Output Structures with the Particle-in-Cell Code CONDOR; SST--A Single-Slice Particle Simulation Code; Graphical Display and Animation of Data Produced by Electromagnetic, Particle-in-Cell Codes; A Post-Processor for the PEST Code; Gray Scale Rendering of Beam Profile Data; A 2D Electromagnetic PIC Code for Distributed Memory Parallel Computers; 3-D Electromagnetic PIC Simulation on the NRL Connection Machine; Plasma PIC Simulations on MIMD Computers; Vlasov-Maxwell Algorithm for Electromagnetic Plasma Simulation on Distributed Architectures; MHD Boundary Layer Calculation Using the Vortex Method; and Eulerian Codes for Plasma Simulations.
NASA Technical Reports Server (NTRS)
Lou, John; Ferraro, Robert; Farrara, John; Mechoso, Carlos
1996-01-01
An analysis is presented of several factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on massively parallel computer systems. Several modifications to the original parallel AGCM code aimed at improving its numerical efficiency, interprocessor communication cost, load balance, and issues affecting single-node code performance are discussed.
Final report for the Tera Computer TTI CRADA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davidson, G.S.; Pavlakos, C.; Silva, C.
1997-01-01
Tera Computer and Sandia National Laboratories have completed a CRADA, which examined the Tera Multi-Threaded Architecture (MTA) for use with large codes of importance to industry and DOE. The MTA is an innovative architecture that uses parallelism to mask latency between memories and processors. The physical implementation is a parallel computer with high cross-section bandwidth and GaAs processors designed by Tera, which support many small computation threads and fast, lightweight context switches between them. When any thread blocks while waiting for memory accesses to complete, another thread immediately begins execution so that high CPU utilization is maintained. The Tera MTA parallel computer has a single, global address space, which is appealing when porting existing applications to a parallel computer. This ease of porting is further enabled by compiler technology that helps break computations into parallel threads. DOE and Sandia National Laboratories were interested in working with Tera to further develop this computing concept. While Tera Computer would continue the hardware development and compiler research, Sandia National Laboratories would work with Tera to ensure that their compilers worked well with important Sandia codes, most particularly CTH, a shock physics code used for weapon safety computations. In addition to that important code, Sandia National Laboratories would complete research on a robotic path planning code, SANDROS, which is important in manufacturing applications, and would evaluate the MTA performance on this code. Finally, Sandia would work directly with Tera to develop 3D visualization codes, which would be appropriate for use with the MTA. Each of these tasks has been completed to the extent possible, given that Tera has just completed the MTA hardware. All of the CRADA work had to be done on simulators.
E-O Sensor Signal Recognition Simulation: Computer Code SPOT I.
1978-10-01
[Only fragments of this report's input-variable tables and figure captions survive extraction: WLAM(N), the wavelength (microns) at which the aerosol single-scattering phase function set is defined; PDCO(N,I), the average probability for the phase matrix definition; NPROB, the problem number; and Fig. 12, a flowchart for the SPOT computer code.]
Validation of hydrogen gas stratification and mixing models
Wu, Hsingtzu; Zhao, Haihua
2015-05-26
Two validation benchmarks confirm that the BMIX++ code is capable of simulating unintended hydrogen release scenarios efficiently. The BMIX++ (UC Berkeley mechanistic MIXing code in C++) code has been developed to accurately and efficiently predict the fluid mixture distribution and heat transfer in large stratified enclosures for accident analyses and design optimizations. The BMIX++ code uses a scaling based one-dimensional method to achieve large reduction in computational effort compared to a 3-D computational fluid dynamics (CFD) simulation. Two BMIX++ benchmark models have been developed. One is for a single buoyant jet in an open space and another is for a large sealed enclosure with both a jet source and a vent near the floor. Both of them have been validated by comparisons with experimental data. Excellent agreements are observed. The entrainment coefficients of 0.09 and 0.08 are found to fit the experimental data for hydrogen leaks with the Froude number of 99 and 268 best, respectively. In addition, the BMIX++ simulation results of the average helium concentration for an enclosure with a vent and a single jet agree with the experimental data within a margin of about 10% for jet flow rates ranging from 1.21 × 10⁻⁴ to 3.29 × 10⁻⁴ m³/s. In conclusion, computing time for each BMIX++ model with a normal desktop computer is less than 5 min.
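The entrainment coefficients quoted above close the one-dimensional jet equations in the usual Morton-Taylor-Turner fashion. As a sketch of the assumed closure (the exact formulation in BMIX++ may differ):

    % Entrainment hypothesis for a round buoyant jet (assumed closure).
    % Q: volume flux, b: jet radius, w: characteristic vertical velocity,
    % alpha: entrainment coefficient (0.08-0.09 in the fits above).
    \[
      \frac{dQ}{dz} = 2\pi\,\alpha\, b\, w,
      \qquad Q = \pi b^{2} w
    \]

A single coefficient alpha thus controls how fast ambient fluid is drawn into the jet, which is why fitting it against the measured concentration profiles is the key calibration step.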
Janssen, Terry
2000-01-01
A system and method for facilitating decision-making comprising a computer program causing linkage of data representing a plurality of argument structure units into a hierarchical argument structure. Each argument structure unit comprises data corresponding to a hypothesis and its corresponding counter-hypothesis, data corresponding to grounds that provide a basis for inference of the hypothesis or its corresponding counter-hypothesis, data corresponding to a warrant linking the grounds to the hypothesis or its corresponding counter-hypothesis, and data corresponding to backing that certifies the warrant. The hierarchical argument structure comprises a top level argument structure unit and a plurality of subordinate level argument structure units. Each of the plurality of subordinate argument structure units comprises at least a portion of the grounds of the argument structure unit to which it is subordinate. Program code located on each of a plurality of remote computers accepts input from one of a plurality of contributors. Each input comprises data corresponding to an argument structure unit in the hierarchical argument structure and supports the hypothesis or its corresponding counter-hypothesis. A second programming code is adapted to combine the inputs into a single hierarchical argument structure. A third computer program code is responsive to the second computer program code and is adapted to represent a degree of support for the hypothesis and its corresponding counter-hypothesis in the single hierarchical argument structure.
Comparison of FDNS liquid rocket engine plume computations with SPF/2
NASA Technical Reports Server (NTRS)
Kumar, G. N.; Griffith, D. O., II; Warsi, S. A.; Seaford, C. M.
1993-01-01
Prediction of a plume's shape and structure is essential to the evaluation of base region environments. The JANNAF standard plume flowfield analysis code SPF/2 predicts plumes well, but cannot analyze base regions. Full Navier-Stokes CFD codes can calculate both zones; however, before they can be used, they must be validated. The CFD code FDNS3D (Finite Difference Navier-Stokes Solver) was used to analyze the single plume of a Space Transportation Main Engine (STME) and comparisons were made with SPF/2 computations. Both frozen and finite rate chemistry models were employed as well as two turbulence models in SPF/2. The results indicate that FDNS3D plume computations agree well with SPF/2 predictions for liquid rocket engine plumes.
New Parallel computing framework for radiation transport codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kostin, M.A.; Mokhov, N.V.
A new parallel computing framework has been developed to use with general-purpose radiation transport codes. The framework was implemented as a C++ module that uses MPI for message passing. The module is significantly independent of radiation transport codes it can be used with, and is connected to the codes by means of a number of interface functions. The framework was integrated with the MARS15 code, and an effort is under way to deploy it in PHITS. Besides the parallel computing functionality, the framework offers a checkpoint facility that allows restarting calculations with a saved checkpoint file. The checkpoint facility can be used in single process calculations as well as in the parallel regime. Several checkpoint files can be merged into one thus combining results of several calculations. The framework also corrects some of the known problems with the scheduling and load balancing found in the original implementations of the parallel computing functionality in MARS15 and PHITS. The framework can be used efficiently on homogeneous systems and networks of workstations, where the interference from the other users is possible.
Nonlinear dynamic simulation of single- and multi-spool core engines
NASA Technical Reports Server (NTRS)
Schobeiri, T.; Lippke, C.; Abouelkheir, M.
1993-01-01
In this paper a new computational method for accurate simulation of the nonlinear dynamic behavior of single- and multi-spool core engines, turbofan engines, and power generation gas turbine engines is presented. In order to perform the simulation, a modularly structured computer code has been developed which includes individual mathematical modules representing various engine components. The generic structure of the code enables the dynamic simulation of arbitrary engine configurations ranging from single-spool thrust generation to multi-spool thrust/power generation engines under adverse dynamic operating conditions. For precise simulation of turbine and compressor components, row-by-row calculation procedures were implemented that account for the specific turbine and compressor cascade and blade geometry and characteristics. The dynamic behavior of the subject engine is calculated by solving a number of systems of partial differential equations, which describe the unsteady behavior of the individual components. In order to ensure the capability, accuracy, robustness, and reliability of the code, comprehensive critical performance assessment and validation tests were performed. As representatives, three different transient cases with single- and multi-spool thrust and power generation engines were simulated. The transient cases range from operating with a prescribed fuel schedule, to extreme load changes, to generator and turbine shut down.
NASA Technical Reports Server (NTRS)
Newman, P. A.; Hou, G. J.-W.; Jones, H. E.; Taylor, A. C., III; Korivi, V. M.
1992-01-01
How a combination of various computational methodologies could reduce the enormous computational costs envisioned in using advanced CFD codes in gradient-based optimized multidisciplinary design (MdD) procedures is briefly outlined. Implications of these MdD requirements upon advanced CFD codes are somewhat different from those imposed by a single discipline design. A means for satisfying these MdD requirements for gradient information is presented which appears to permit: (1) some leeway in the CFD solution algorithms which can be used; (2) an extension to 3-D problems; and (3) straightforward use of other computational methodologies. Many of these observations have previously been discussed as possibilities for doing parts of the problem more efficiently; the contribution here is observing how they fit together in a mutually beneficial way.
Multiprocessing on supercomputers for computational aerodynamics
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; Mehta, Unmeel B.
1991-01-01
Little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPS or more) to improve turnaround time in computational aerodynamics. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, such improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) is applied through multitasking via a strategy that requires relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-Fortran-Unix interface. The existing code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor.
NASA Technical Reports Server (NTRS)
Stahara, S. S.; Klenke, D.; Trudinger, B. C.; Spreiter, J. R.
1980-01-01
Computational procedures are developed and applied to the prediction of solar wind interaction with nonmagnetic terrestrial planet atmospheres, with particular emphasis to Venus. The theoretical method is based on a single fluid, steady, dissipationless, magnetohydrodynamic continuum model, and is appropriate for the calculation of axisymmetric, supersonic, super-Alfvenic solar wind flow past terrestrial planets. The procedures, which consist of finite difference codes to determine the gasdynamic properties and a variety of special purpose codes to determine the frozen magnetic field, streamlines, contours, plots, etc. of the flow, are organized into one computational program. Theoretical results based upon these procedures are reported for a wide variety of solar wind conditions and ionopause obstacle shapes. Plasma and magnetic field comparisons in the ionosheath are also provided with actual spacecraft data obtained by the Pioneer Venus Orbiter.
Application of Fast Multipole Methods to the NASA Fast Scattering Code
NASA Technical Reports Server (NTRS)
Dunn, Mark H.; Tinetti, Ana F.
2008-01-01
The NASA Fast Scattering Code (FSC) is a versatile noise prediction program designed to conduct aeroacoustic noise reduction studies. The equivalent source method is used to solve an exterior Helmholtz boundary value problem with an impedance type boundary condition. The solution process in FSC v2.0 requires direct manipulation of a large, dense system of linear equations, limiting the applicability of the code to small scales and/or moderate excitation frequencies. Recent advances in the use of Fast Multipole Methods (FMM) for solving scattering problems, coupled with sparse linear algebra techniques, suggest that a substantial reduction in computer resource utilization over conventional solution approaches can be obtained. Implementation of the single level FMM (SLFMM) and a variant of the Conjugate Gradient Method (CGM) into the FSC is discussed in this paper. The culmination of this effort, FSC v3.0, was used to generate solutions for three configurations of interest. Benchmarking against previously obtained simulations indicate that a twenty-fold reduction in computational memory and up to a four-fold reduction in computer time have been achieved on a single processor.
Shielding from space radiations
NASA Technical Reports Server (NTRS)
Chang, C. Ken; Badavi, Forooz F.; Tripathi, Ram K.
1993-01-01
This Progress Report covering the period of December 1, 1992 to June 1, 1993 presents the development of an analytical solution to the heavy ion transport equation in terms of Green's function formalism. The mathematical development results are recast into a highly efficient computer code for space applications. The efficiency of this algorithm is accomplished by a nonperturbative technique of extending the Green's function over the solution domain. The code may also be applied to accelerator boundary conditions to allow code validation in laboratory experiments. Results from the isotopic version of the code, with 59 isotopes present, are presented for a single-layer target material for the case of an iron beam projectile at 600 MeV/nucleon in water. A listing of the single-layer isotopic version of the code is included.
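Schematically, the transport model behind this family of NASA codes is the straight-ahead Boltzmann equation with fragmentation coupling between ion species, whose solution is developed as a Green's function. A sketch of the standard form (the report's details may differ):

    % Straight-ahead transport for ion species j (standard form, sketched).
    % phi_j: fluence, S_j: stopping power, sigma_j: total absorption cross
    % section, sigma_{jk}: fragmentation of species k into species j.
    \[
      \left[ \frac{\partial}{\partial x}
             - \frac{\partial}{\partial E}\, S_j(E)
             + \sigma_j(E) \right] \phi_j(x,E)
      \;=\; \sum_{k>j} \sigma_{jk}\, \phi_k(x,E),
    \]
    \[
      \phi_j(x,E) \;=\; \sum_k \int G_{jk}(x,E,E')\, \phi_k(0,E')\, dE'
    \]

Because the Green's function G_{jk} propagates any boundary spectrum phi_k(0,E'), the same kernel serves both the space-radiation case and the single-energy accelerator-beam validation case mentioned above.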
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kostin, Mikhail; Mokhov, Nikolai; Niita, Koji
A parallel computing framework has been developed to use with general-purpose radiation transport codes. The framework was implemented as a C++ module that uses MPI for message passing. It is intended to be used with older radiation transport codes implemented in Fortran 77, Fortran 90 or C. The module is significantly independent of radiation transport codes it can be used with, and is connected to the codes by means of a number of interface functions. The framework was developed and tested in conjunction with the MARS15 code. It is possible to use it with other codes such as PHITS, FLUKA and MCNP after certain adjustments. Besides the parallel computing functionality, the framework offers a checkpoint facility that allows restarting calculations with a saved checkpoint file. The checkpoint facility can be used in single process calculations as well as in the parallel regime. The framework corrects some of the known problems with the scheduling and load balancing found in the original implementations of the parallel computing functionality in MARS15 and PHITS. The framework can be used efficiently on homogeneous systems and networks of workstations, where the interference from the other users is possible.
Crespo, Alejandro C.; Dominguez, Jose M.; Barreiro, Anxo; Gómez-Gesteira, Moncho; Rogers, Benedict D.
2011-01-01
Smoothed Particle Hydrodynamics (SPH) is a numerical method commonly used in Computational Fluid Dynamics (CFD) to simulate complex free-surface flows. Simulations with this mesh-free particle method far exceed the capacity of a single processor. In this paper, as part of a dual-functioning code for either central processing units (CPUs) or Graphics Processor Units (GPUs), a parallelisation using GPUs is presented. The GPU parallelisation technique uses the Compute Unified Device Architecture (CUDA) of nVidia devices. Simulations with more than one million particles on a single GPU card exhibit speedups of up to two orders of magnitude over using a single-core CPU. It is demonstrated that the code achieves different speedups with different CUDA-enabled GPUs. The numerical behaviour of the SPH code is validated with a standard benchmark test case of dam break flow impacting on an obstacle where good agreement with the experimental results is observed. Both the achieved speed-ups and the quantitative agreement with experiments suggest that CUDA-based GPU programming can be used in SPH methods with efficiency and reliability. PMID:21695185
Porting a Hall MHD Code to a Graphic Processing Unit
NASA Technical Reports Server (NTRS)
Dorelli, John C.
2011-01-01
We present our experience porting a Hall MHD code to a Graphics Processing Unit (GPU). The code is a second-order accurate MUSCL-Hancock scheme which makes use of an HLL Riemann solver to compute numerical fluxes and second-order finite differences to compute the Hall contribution to the electric field. The divergence of the magnetic field is controlled with Dedner's hyperbolic divergence cleaning method. Preliminary benchmark tests indicate a speedup (relative to a single Nehalem core) of 58x for a double precision calculation. We discuss scaling issues which arise when distributing work across multiple GPUs in a CPU-GPU cluster.
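For readers unfamiliar with the HLL flux named above, here is a minimal sketch written for the 1-D Euler equations rather than Hall MHD, with simple Davis-type wave-speed bounds; it illustrates the general technique, not the ported code:

```python
# Sketch of an HLL approximate Riemann solver for the 1-D Euler equations.
import numpy as np

GAMMA = 1.4

def euler_flux(u):
    rho, mom, E = u
    v = mom / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * v**2)
    return np.array([mom, mom * v + p, (E + p) * v])

def wave_speeds(uL, uR):
    # simple Davis-type bounds from velocities and sound speeds
    def v_c(u):
        rho, mom, E = u
        v = mom / rho
        p = (GAMMA - 1.0) * (E - 0.5 * rho * v**2)
        return v, np.sqrt(GAMMA * p / rho)
    vL, cL = v_c(uL)
    vR, cR = v_c(uR)
    return min(vL - cL, vR - cR), max(vL + cL, vR + cR)

def hll_flux(uL, uR):
    """HLL flux: (S_R F_L - S_L F_R + S_L S_R (U_R - U_L)) / (S_R - S_L)."""
    sL, sR = wave_speeds(uL, uR)
    if sL >= 0.0:
        return euler_flux(uL)
    if sR <= 0.0:
        return euler_flux(uR)
    fL, fR = euler_flux(uL), euler_flux(uR)
    return (sR * fL - sL * fR + sL * sR * (uR - uL)) / (sR - sL)

# Sod-type interface states (rho, rho*v, E)
uL = np.array([1.0, 0.0, 2.5])
uR = np.array([0.125, 0.0, 0.25])
print(hll_flux(uL, uR))
```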
GPU-accelerated phase-field simulation of dendritic solidification in a binary alloy
NASA Astrophysics Data System (ADS)
Yamanaka, Akinori; Aoki, Takayuki; Ogawa, Satoi; Takaki, Tomohiro
2011-03-01
The phase-field simulation for dendritic solidification of a binary alloy has been accelerated by using a graphics processing unit (GPU). To perform the phase-field simulation of the alloy solidification on a GPU, a program code was developed with the compute unified device architecture (CUDA). In this paper, the implementation technique of the phase-field model on the GPU is presented. We also evaluated the acceleration performance of the three-dimensional solidification simulation by using a single NVIDIA TESLA C1060 GPU and the developed program code. The results showed that the GPU calculation for a 576³ computational grid achieved a performance of 170 GFLOPS by utilizing the shared memory as a software-managed cache. Furthermore, the computation with the GPU is demonstrated to be 100 times faster than that with a single CPU core. From the obtained results, we confirmed the feasibility of realizing a real-time full three-dimensional phase-field simulation of microstructure evolution on a personal desktop computer.
Nonlinear wave vacillation in the atmosphere
NASA Technical Reports Server (NTRS)
Antar, Basil N.
1987-01-01
The problem of vacillation in a baroclinically unstable flow field is studied through the time evolution of a single nonlinearly unstable wave. To this end, a computer code is being developed to solve numerically for the time evolution of the amplitude of such a wave. The final working code will be the end product of a hierarchy of codes of increasing complexity. The first code in this series was completed and is undergoing several diagnostic analyses to verify its validity. The development of this code is detailed.
SU (2) lattice gauge theory simulations on Fermi GPUs
NASA Astrophysics Data System (ADS)
Cardoso, Nuno; Bicudo, Pedro
2011-05-01
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, the NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi) are also presented. In order to obtain high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov loop at finite T, and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we achieved an excellent performance of 200× the speed of one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.
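The mean-plaquette measurement mentioned above reduces to averaging 0.5 Re Tr over elementary loops of link matrices. A small CPU sketch on a 2-D SU(2) lattice follows (the paper works in four dimensions on GPUs); the lattice size and hot start are illustrative:

```python
# Illustrative mean-plaquette measurement on a small 2-D SU(2) lattice.
import numpy as np

rng = np.random.default_rng(1)
L = 8   # lattice extent

def random_su2():
    a = rng.normal(size=4)
    a /= np.linalg.norm(a)      # uniform point on S^3 -> SU(2) element
    return np.array([[a[0] + 1j * a[3],  a[2] + 1j * a[1]],
                     [-a[2] + 1j * a[1], a[0] - 1j * a[3]]])

# link variables U[mu, x, y] for directions mu = 0 (x) and 1 (y)
U = np.array([[[random_su2() for _ in range(L)] for _ in range(L)]
              for _ in range(2)])

def plaquette(x, y):
    """0.5 Re Tr of U_x(n) U_y(n+x) U_x(n+y)^dag U_y(n)^dag."""
    P = (U[0, x, y] @ U[1, (x + 1) % L, y]
         @ U[0, x, (y + 1) % L].conj().T @ U[1, x, y].conj().T)
    return 0.5 * np.trace(P).real

mean_plaq = np.mean([plaquette(x, y) for x in range(L) for y in range(L)])
print("mean plaquette (hot start):", mean_plaq)   # near 0 for random links
```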
ERIC Educational Resources Information Center
Wyne, Mudasser F.
2010-01-01
It is hard to define a single set of ethics that will cover an entire computer users community. In this paper, the issue is addressed in reference to code of ethics implemented by various professionals, institutes and organizations. The paper presents a higher level model using hierarchical approach. The code developed using this approach could be…
Discrete sensitivity derivatives of the Navier-Stokes equations with a parallel Krylov solver
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Taylor, Arthur C., III
1994-01-01
This paper solves an 'incremental' form of the sensitivity equations derived by differentiating the discretized thin-layer Navier Stokes equations with respect to certain design variables of interest. The equations are solved with a parallel, preconditioned Generalized Minimal RESidual (GMRES) solver on a distributed-memory architecture. The 'serial' sensitivity analysis code is parallelized by using the Single Program Multiple Data (SPMD) programming model, domain decomposition techniques, and message-passing tools. Sensitivity derivatives are computed for low and high Reynolds number flows over a NACA 1406 airfoil on a 32-processor Intel Hypercube, and found to be identical to those computed on a single-processor Cray Y-MP. It is estimated that the parallel sensitivity analysis code has to be run on 40-50 processors of the Intel Hypercube in order to match the single-processor processing time of a Cray Y-MP.
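As a pointer to the solver family the abstract names, the sketch below runs restarted, preconditioned GMRES on a single process via SciPy; the matrix, Jacobi preconditioner, and restart length are illustrative stand-ins for the paper's distributed-memory SPMD setup:

```python
# Single-process sketch of preconditioned, restarted GMRES with SciPy.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator, gmres

n = 1000
A = diags([-1.0, 2.1, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# Jacobi (diagonal) preconditioner expressed as a LinearOperator
d = A.diagonal()
M = LinearOperator((n, n), matvec=lambda x: x / d)

x, info = gmres(A, b, M=M, restart=30)
print("converged:", info == 0, " residual:", np.linalg.norm(b - A @ x))
```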
Karpievitch, Yuliya V; Almeida, Jonas S
2006-01-01
Background: Matlab, a powerful and productive language that allows for rapid prototyping, modeling and simulation, is widely used in computational biology. Modeling and simulation of large biological systems often require more computational resources than are available on a single computer. Existing distributed computing environments like the Distributed Computing Toolbox, MatlabMPI, Matlab*G and others allow for the remote (and possibly parallel) execution of Matlab commands with varying support for features like an easy-to-use application programming interface, load-balanced utilization of resources, extensibility over the wide area network, and minimal system administration skill requirements. However, all of these environments require some level of access to participating machines to manually distribute the user-defined libraries that the remote call may invoke. Results: mGrid augments the usual process distribution seen in other similar distributed systems by adding facilities for user code distribution. mGrid's client-side interface is an easy-to-use native Matlab toolbox that transparently executes user-defined code on remote machines (i.e. the user is unaware that the code is executing somewhere else). Run-time variables are automatically packed and distributed with the user-defined code and automated load-balancing of remote resources enables smooth concurrent execution. mGrid is an open source environment. Apart from the programming language itself, all other components are also open source, freely available tools: light-weight PHP scripts and the Apache web server. Conclusion: Transparent, load-balanced distribution of user-defined Matlab toolboxes and rapid prototyping of many simple parallel applications can now be done with a single easy-to-use Matlab command. Because mGrid utilizes only Matlab, light-weight PHP scripts and the Apache web server, installation and configuration are very simple. Moreover, the web-based infrastructure of mGrid allows for it to be easily extensible over the Internet. PMID:16539707
Patterns and Practices for Future Architectures
2014-08-01
Subject terms: computing architecture, graph algorithms, high-performance computing, big data, GPU. (Report front matter and list of figures omitted; the figures cover data structures and kernels of the Graph500 BFS reference implementation, single-CPU list and sequential CSR variants.)
NASA Astrophysics Data System (ADS)
Alipchenkov, V. M.; Anfimov, A. M.; Afremov, D. A.; Gorbunov, V. S.; Zeigarnik, Yu. A.; Kudryavtsev, A. V.; Osipov, S. L.; Mosunova, N. A.; Strizhov, V. F.; Usov, E. V.
2016-02-01
The conceptual fundamentals of the development of the new-generation system thermal-hydraulic computational HYDRA-IBRAE/LM code are presented. The code is intended to simulate the thermal-hydraulic processes that take place in the loops and the heat-exchange equipment of liquid-metal cooled fast reactor systems under normal operation and anticipated operational occurrences and during accidents. The paper provides a brief overview of Russian and foreign system thermal-hydraulic codes for modeling liquid-metal coolants and gives grounds for the necessity of developing the new-generation HYDRA-IBRAE/LM code. Considering the specific engineering features of the nuclear power plants (NPPs) equipped with the BN-1200 and the BREST-OD-300 reactors, the processes and phenomena are singled out that require a detailed analysis and the development of models in order to be correctly described by the system thermal-hydraulic code in question. Information on the functionality of the computational code is provided, viz., the thermal-hydraulic two-phase model, the properties of the sodium and the lead coolants, the closing equations for simulation of the heat-mass exchange processes, the models to describe the processes that take place during steam-generator tube rupture, etc. The article gives a brief overview of the usability of the computational code, including a description of the support documentation and the supply package, as well as the possibilities of taking advantage of modern computer technologies, such as parallel computations. The paper shows the current state of verification and validation of the computational code; it also presents information on the principles of constructing and populating the verification matrices for the BREST-OD-300 and the BN-1200 reactor systems. The prospects are outlined for further development of the HYDRA-IBRAE/LM code, introduction of new models into it, and enhancement of its usability. It is shown that the program of development and practical application of the code will allow carrying out, in the near future, the computations needed to analyze the safety of potential NPP projects at a qualitatively higher level.
NASA Technical Reports Server (NTRS)
White, P. R.; Little, R. R.
1985-01-01
A research effort was undertaken to develop personal computer based software for vibrational analysis. The software was developed to analytically determine the natural frequencies and mode shapes for the uncoupled lateral vibrations of the blade and counterweight assemblies used in a single bladed wind turbine. The uncoupled vibration analysis was performed in both the flapwise and chordwise directions for static rotor conditions. The effects of rotation on the uncoupled flapwise vibration of the blade and counterweight assemblies were evaluated for various rotor speeds up to 90 rpm. The theory, used in the vibration analysis codes, is based on a lumped mass formulation for the blade and counterweight assemblies. The codes are general so that other designs can be readily analyzed. The input for the codes is generally interactive to facilitate usage. The output of the codes is both tabular and graphical. Listings of the codes are provided. Predicted natural frequencies of the first several modes show reasonable agreement with experimental results. The analysis codes were originally developed on a DEC PDP 11/34 minicomputer and then downloaded and modified to run on an ITT XTRA personal computer. Studies conducted to evaluate the efficiency of running the programs on a personal computer as compared with the minicomputer indicated that, with the proper combination of hardware and software options, the efficiency of using a personal computer exceeds that of a minicomputer.
Development and application of computational aerothermodynamics flowfield computer codes
NASA Technical Reports Server (NTRS)
Venkatapathy, Ethiraj
1994-01-01
Research was performed in the area of computational modeling and application of hypersonic, high-enthalpy, thermo-chemical nonequilibrium flow (aerothermodynamics) problems. A number of computational fluid dynamic (CFD) codes were developed and applied to simulate high-altitude rocket plumes, the Aeroassist Flight Experiment (AFE), hypersonic base flow for planetary probes, the single expansion ramp nozzle (SERN) model connected with the National Aerospace Plane, hypersonic drag devices, hypersonic ramp flows, ballistic range models, shock tunnel facility nozzles, transient and steady flows in the shock tunnel facility, arc-jet flows, thermochemical nonequilibrium flows around simple and complex bodies, axisymmetric ionized flows of interest to re-entry, unsteady shock induced combustion phenomena, high enthalpy pulsed facility simulations, and unsteady shock boundary layer interactions in shock tunnels. Computational modeling involved developing appropriate numerical schemes for the flows of interest and developing, applying, and validating appropriate thermochemical processes. As part of improving the accuracy of the numerical predictions, adaptive grid algorithms were explored, and a user-friendly, self-adaptive code (SAGE) was developed. Aerothermodynamic flows of interest included energy transfer due to strong radiation, and a significant level of effort was spent in developing computational codes for calculating radiation and radiation modeling. In addition, computational tools were developed and applied to predict the radiative heat flux and spectra that reach the model surface.
Computer Graphics and Metaphorical Elaboration for Learning Science Concepts.
ERIC Educational Resources Information Center
ChanLin, Lih-Juan; Chan, Kung-Chi
This study explores the instructional impact of using computer multimedia to integrate metaphorical verbal information into graphical representations of biotechnology concepts. The combination of text and graphics into a single metaphor makes concepts dual-coded, and therefore more comprehensible and memorable for the student. Visual stimuli help…
Ducted-Fan Engine Acoustic Predictions using a Navier-Stokes Code
NASA Technical Reports Server (NTRS)
Rumsey, C. L.; Biedron, R. T.; Farassat, F.; Spence, P. L.
1998-01-01
A Navier-Stokes computer code is used to predict one of the ducted-fan engine acoustic modes that results from rotor-wake/stator-blade interaction. A patched sliding-zone interface is employed to pass information between the moving rotor row and the stationary stator row. The code produces averaged aerodynamic results downstream of the rotor that agree well with a widely used average-passage code. The acoustic mode of interest is generated successfully by the code and is propagated well upstream of the rotor; temporal and spatial numerical resolution are fine enough such that attenuation of the signal is small. Two acoustic codes are used to find the far-field noise. Near-field propagation is computed by using Eversman's wave envelope code, which is based on a finite-element model. Propagation to the far field is accomplished by using the Kirchhoff formula for moving surfaces with the results of the wave envelope code as input data. Comparison of measured and computed far-field noise levels shows fair agreement in the range of directivity angles where the peak radiation lobes from the inlet are observed. Although only a single acoustic mode is targeted in this study, the main conclusion is a proof-of-concept: Navier-Stokes codes can be used both to generate and propagate rotor/stator acoustic modes forward through an engine, where the results can be coupled to other far-field noise prediction codes.
1985-04-01
Table-of-contents fragment: Codes and Standards — A. General; B. ASME Boiler and Pressure Vessel Code; C. Foreign… Text fragment: "…several different sources. B. American Society of Mechanical Engineers (ASME) Boiler and Pressure Vessel Code. A shell and tube heat exchanger is indeed a…"
Real-time computer treatment of THz passive device images with the high image quality
NASA Astrophysics Data System (ADS)
Trofimov, Vyacheslav A.; Trofimov, Vladislav V.
2012-06-01
We demonstrate a real-time computer code that significantly improves the quality of images captured by a passive THz imaging system. The code is not designed solely for one passive THz device: it can be applied to any device of this kind, and to active THz imaging systems as well. We applied our code to computer processing of images captured by four passive THz imaging devices manufactured by different companies. It should be stressed that images produced by different devices usually require different spatial filters. The performance of the current version of the computer code is greater than one image per second for a THz image having more than 5000 pixels and 24-bit number representation. Processing of a single THz image produces about 20 images simultaneously, corresponding to various spatial filters. The computer code allows the number of pixels in processed images to be increased without noticeable reduction of image quality, and its performance can be increased many times by using parallel image-processing algorithms. We develop original spatial filters which allow one to see objects with sizes less than 2 cm. The imagery is produced by passive THz imaging devices which captured images of objects hidden under opaque clothes. For images with high noise, we develop an approach that suppresses the noise during computer processing and yields a good-quality image. To illustrate the efficiency of the developed approach, we demonstrate the detection of liquid explosive, ordinary explosive, a knife, a pistol, a metal plate, a CD, ceramics, chocolate, and other objects hidden under opaque clothes. The results demonstrate the high efficiency of our approach for the detection of hidden objects and are a very promising solution for the security problem.
High-performance computational fluid dynamics: a custom-code approach
NASA Astrophysics Data System (ADS)
Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.
2016-07-01
We introduce a modified and simplified version of the pre-existing, fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFD) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFD, while also providing insight for those interested in more general aspects of high-performance computing.
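The single-phase benchmark described above has a closed-form solution, which makes it a convenient validation case. A minimal sketch, assuming steady laminar flow so the momentum equation reduces to mu u'' = dp/dx with no-slip walls:

```python
# Steady pressure-driven channel (Poiseuille) flow by finite differences,
# checked against the analytic parabolic profile. Parameters illustrative.
import numpy as np

n, height, mu, dpdx = 101, 1.0, 1.0e-3, -1.0
y = np.linspace(0.0, height, n)
dy = y[1] - y[0]

# Tridiagonal Laplacian on interior points with u = 0 at both walls
A = (np.diag(np.full(n - 3, 1.0), -1) + np.diag(np.full(n - 2, -2.0))
     + np.diag(np.full(n - 3, 1.0), 1)) / dy**2
u_inner = np.linalg.solve(A, np.full(n - 2, dpdx / mu))
u = np.concatenate(([0.0], u_inner, [0.0]))

u_exact = dpdx / (2.0 * mu) * (y**2 - height * y)   # analytic profile
print("max error:", np.abs(u - u_exact).max())      # ~machine precision
```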
Computer code for predicting coolant flow and heat transfer in turbomachinery
NASA Technical Reports Server (NTRS)
Meitner, Peter L.
1990-01-01
A computer code was developed to analyze any turbomachinery coolant flow path geometry that consists of a single flow passage with a unique inlet and exit. Flow can be bled off for tip-cap impingement cooling, and a flow bypass can be specified in which coolant flow is taken off at one point in the flow channel and reintroduced at a point farther downstream in the same channel. The user may either choose the coolant flow rate or let the program determine the flow rate from specified inlet and exit conditions. The computer code integrates the one-dimensional momentum and energy equations along a defined flow path and calculates the coolant's flow rate, temperature, pressure, and velocity, as well as the heat transfer coefficients along the passage. The equations account for area change, mass addition or subtraction, pumping, friction, and heat transfer.
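A hedged sketch of the kind of one-dimensional marching integration described above: coolant temperature and pressure evolved along a passage with heat pickup and friction. The geometry, friction factor, and heat transfer coefficient are illustrative values, not the code's models:

```python
# Illustrative 1-D coolant passage integration (energy + friction only).
import numpy as np
from scipy.integrate import solve_ivp

mdot, cp, R = 0.05, 1005.0, 287.0     # kg/s, J/(kg K), J/(kg K)
D, f, h = 0.004, 0.03, 2000.0         # passage diameter (m), friction, W/m^2K
T_wall = 900.0                        # wall temperature (K), illustrative
area = np.pi * D**2 / 4.0
perim = np.pi * D

def rhs(x, s):
    T, p = s
    rho = p / (R * T)                                # ideal gas
    v = mdot / (rho * area)
    dTdx = h * perim * (T_wall - T) / (mdot * cp)    # energy equation
    dpdx = -f * rho * v**2 / (2.0 * D)               # friction pressure drop
    return [dTdx, dpdx]

sol = solve_ivp(rhs, [0.0, 0.5], [600.0, 2.0e6])     # 0.5 m passage
print("exit T [K], p [Pa]:", sol.y[0, -1], sol.y[1, -1])
```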
NASA Technical Reports Server (NTRS)
Pratt, D. T.
1984-01-01
An interactive computer code for simulation of a high-intensity turbulent combustor as a single-point inhomogeneous stirred reactor was developed from an existing batch-processing computer code, CDPSR. The interactive CDPSR code was used as a guide for interpretation and direction of DOE-sponsored companion experiments utilizing a Xenon tracer with optical laser diagnostic techniques to experimentally determine the appropriate mixing frequency, and for validation of CDPSR as a mixing-chemistry model for a laboratory jet-stirred reactor. The coalescence-dispersion model for finite-rate mixing was incorporated into an existing interactive code, AVCO-MARK I, to enable simulation of a combustor as a modular array of stirred-flow and plug-flow elements, each having a prescribed finite mixing frequency, or axial distribution of mixing frequency, as appropriate. The speed and reliability of the batch kinetics integrator code CREKID were further increased by rewriting it in vectorized form for execution on a vector or parallel processor, and by incorporating numerical techniques that enhance execution speed by permitting specification of a very low accuracy tolerance.
A Multiphysics and Multiscale Software Environment for Modeling Astrophysical Systems
NASA Astrophysics Data System (ADS)
Portegies Zwart, Simon; McMillan, Steve; O'Nualláin, Breanndán; Heggie, Douglas; Lombardi, James; Hut, Piet; Banerjee, Sambaran; Belkus, Houria; Fragos, Tassos; Fregeau, John; Fuji, Michiko; Gaburov, Evghenii; Glebbeek, Evert; Groen, Derek; Harfst, Stefan; Izzard, Rob; Jurić, Mario; Justham, Stephen; Teuben, Peter; van Bever, Joris; Yaron, Ofer; Zemp, Marcel
We present MUSE, a software framework for tying together existing computational tools for different astrophysical domains into a single multiphysics, multiscale workload. MUSE facilitates the coupling of existing codes written in different languages by providing inter-language tools and by specifying an interface between each module and the framework that represents a balance between generality and computational efficiency. This approach allows scientists to use combinations of codes to solve highly-coupled problems without the need to write new codes for other domains or significantly alter their existing codes. MUSE currently incorporates the domains of stellar dynamics, stellar evolution and stellar hydrodynamics for a generalized stellar systems workload. MUSE has now reached a "Noah's Ark" milestone, with two available numerical solvers for each domain. MUSE can treat small stellar associations, galaxies and everything in between, including planetary systems, dense stellar clusters and galactic nuclei. Here we demonstrate an example calculated with MUSE: the merger of two galaxies. In addition, we demonstrate the operation of MUSE on a distributed computer. The current MUSE code base is publicly available as open source at http://muse.li.
NASA Astrophysics Data System (ADS)
Chang, S. L.; Lottes, S. A.; Berry, G. F.
Argonne National Laboratory is investigating the non-reacting jet-gas mixing patterns in a magnetohydrodynamics (MHD) second stage combustor by using a three-dimensional single-phase hydrodynamics computer program. The computer simulation is intended to enhance the understanding of flow and mixing patterns in the combustor, which in turn may improve downstream MHD channel performance. The code is used to examine the three-dimensional effects of the side walls and the distributed jet flows on the non-reacting jet-gas mixing patterns. The code solves the conservation equations of mass, momentum, and energy, and a transport equation of a turbulence parameter and allows permeable surfaces to be specified for any computational cell.
An Idealized, Single Radial Swirler, Lean-Direct-Injection (LDI) Concept Meshing Script
NASA Technical Reports Server (NTRS)
Iannetti, Anthony C.; Thompson, Daniel
2008-01-01
To easily study combustor design parameters using computational fluid dynamics codes (CFD), a Gridgen Glyph-based macro (based on the Tcl scripting language) dubbed BladeMaker has been developed for the meshing of an idealized, single radial swirler, lean-direct-injection (LDI) combustor. BladeMaker is capable of taking in a number of parameters, such as blade width, blade tilt with respect to the perpendicular, swirler cup radius, and grid densities, and producing a three-dimensional meshed radial swirler with a can-annular (canned) combustor. This complex script produces a data format suitable for but not specific to the National Combustion Code (NCC), a state-of-the-art CFD code developed for reacting flow processes.
1986-09-30
Tables fragment: I. SA3240 Single Event Upset Test, 1140-MeV Krypton, 9/18/84; II. CRUP Simulation… Text fragment: the cosmic ray interaction analyses described in the remainder of this report were calculated using the CRUP computer code modified for funneling. The CRUP code requires, as inputs, the size of a depletion region specified as a rectangular parallelepiped with dimensions a × b × c, and the effective funnel…
NASA Technical Reports Server (NTRS)
Hall, Edward J.; Heidegger, Nathan J.; Delaney, Robert A.
1999-01-01
The overall objective of this study was to evaluate the effects of turbulence models in a 3-D numerical analysis on the wake prediction capability. The current version of the computer code resulting from this study is referred to as ADPAC v7 (Advanced Ducted Propfan Analysis Codes -Version 7). This report is intended to serve as a computer program user's manual for the ADPAC code used and modified under Task 15 of NASA Contract NAS3-27394. The ADPAC program is based on a flexible multiple-block and discretization scheme permitting coupled 2-D/3-D mesh block solutions with application to a wide variety of geometries. Aerodynamic calculations are based on a four-stage Runge-Kutta time-marching finite volume solution technique with added numerical dissipation. Steady flow predictions are accelerated by a multigrid procedure. Turbulence models now available in the ADPAC code are: a simple mixing-length model, the algebraic Baldwin-Lomax model with user defined coefficients, the one-equation Spalart-Allmaras model, and a two-equation k-R model. The consolidated ADPAC code is capable of executing in either a serial or parallel computing mode from a single source code.
NASA Technical Reports Server (NTRS)
Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad
1994-01-01
The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
Simulation of 2D Kinetic Effects in Plasmas using the Grid Based Continuum Code LOKI
NASA Astrophysics Data System (ADS)
Banks, Jeffrey; Berger, Richard; Chapman, Tom; Brunner, Stephan
2016-10-01
Kinetic simulation of multi-dimensional plasma waves through direct discretization of the Vlasov equation is a useful tool to study many physical interactions and is particularly attractive for situations where minimal fluctuation levels are desired, for instance, when measuring growth rates of plasma wave instabilities. However, direct discretization of phase space can be computationally expensive, and as a result there are few examples of published results using Vlasov codes in more than a single configuration space dimension. In an effort to fill this gap we have developed the Eulerian-based kinetic code LOKI that evolves the Vlasov-Poisson system in 2+2-dimensional phase space. The code is designed to reduce the cost of phase-space computation by using fully 4th order accurate conservative finite differencing, while retaining excellent parallel scalability that efficiently uses large scale computing resources. In this poster I will discuss the algorithms used in the code as well as some aspects of their parallel implementation using MPI. I will also overview simulation results of basic plasma wave instabilities relevant to laser plasma interaction, which have been obtained using the code.
A Model of Human Cognitive Behavior in Writing Code for Computer Programs. Volume 1
1975-05-01
nearly all programming languages, each line of code actually involves a great many decisions - basic statement types, variable and expression choices...labels, etc. - and any heuristic which evaluates code on the basis of a single decision is not likely to have sufficient power. Only the use of plans...recalculated in the following line because it was needed again. The second reason is that there are some decisions about the structure of a program
Three-Dimensional Terahertz Coded-Aperture Imaging Based on Single Input Multiple Output Technology.
Chen, Shuo; Luo, Chenggao; Deng, Bin; Wang, Hongqiang; Cheng, Yongqiang; Zhuang, Zhaowen
2018-01-19
As a promising radar imaging technique, terahertz coded-aperture imaging (TCAI) can achieve high-resolution, forward-looking, and staring imaging by producing spatiotemporal independent signals with coded apertures. In this paper, we propose a three-dimensional (3D) TCAI architecture based on single input multiple output (SIMO) technology, which can reduce the coding and sampling times sharply. The coded aperture applied in the proposed TCAI architecture loads either purposive or random phase modulation factor. In the transmitting process, the purposive phase modulation factor drives the terahertz beam to scan the divided 3D imaging cells. In the receiving process, the random phase modulation factor is adopted to modulate the terahertz wave to be spatiotemporally independent for high resolution. Considering human-scale targets, images of each 3D imaging cell are reconstructed one by one to decompose the global computational complexity, and then are synthesized together to obtain the complete high-resolution image. As for each imaging cell, the multi-resolution imaging method helps to reduce the computational burden on a large-scale reference-signal matrix. The experimental results demonstrate that the proposed architecture can achieve high-resolution imaging with much less time for 3D targets and has great potential in applications such as security screening, nondestructive detection, medical diagnosis, etc.
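The single-sensor measurement model underlying this architecture can be illustrated compactly: each coded mask yields one inner product with the scene, and the stacked shots form a linear system to invert. The sketch below uses random binary masks and plain least squares; dimensions are illustrative and far smaller than the paper's reference-signal matrix:

```python
# Toy single-sensor coded-aperture measurement and reconstruction.
import numpy as np

rng = np.random.default_rng(0)
n_pix, n_meas = 256, 400                 # scene pixels, coded-mask shots

x_true = np.zeros(n_pix)
x_true[rng.choice(n_pix, 5, replace=False)] = 1.0   # sparse scene

A = rng.choice([0.0, 1.0], size=(n_meas, n_pix))    # random binary masks
y = A @ x_true + 0.01 * rng.normal(size=n_meas)     # single-sensor readings

x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)       # least-squares recovery
print("reconstruction error:", np.linalg.norm(x_hat - x_true))
```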
HEC Applications on Columbia Project
NASA Technical Reports Server (NTRS)
Taft, Jim
2004-01-01
NASA's Columbia system consists of a cluster of twenty 512-processor SGI Altix systems. Each of these systems is 3 TFLOP/s in peak performance - approximately the same as the entire compute capability at NAS just one year ago. Each 512p system is a single-system-image machine with one Linux OS, one high-performance file system, and one globally shared memory. The NAS Terascale Applications Group (TAG) is chartered to assist in scaling NASA's mission-critical codes to at least 512p in order to significantly improve emergency response during flight operations, as well as provide significant improvements in the codes and in the rate of scientific discovery across the scientific disciplines within NASA's missions. Recent accomplishments are 4x improvements to codes in the ocean modeling community, 10x performance improvements in a number of computational fluid dynamics codes used in aero-vehicle design, and 5x improvements in a number of space science codes dealing in extreme physics. The TAG group will continue its scaling work to 2048p and beyond (10240 CPUs) as the Columbia system becomes fully operational and the upgrades to the SGI NUMAlink memory fabric are in place. The NUMAlink upgrades dramatically improve system scalability for a single application. These upgrades will allow a number of codes to execute faster and at higher fidelity than ever before on any other system, thus increasing the rate of scientific discovery even further.
Single stock dynamics on high-frequency data: from a compressed coding perspective.
Fushing, Hsieh; Chen, Shu-Chun; Hwang, Chii-Ruey
2014-01-01
High-frequency return, trading volume and transaction number are digitally coded via a nonparametric computing algorithm, called hierarchical factor segmentation (HFS), and then are coupled together to reveal a single stock dynamics without global state-space structural assumptions. The base-8 digital coding sequence, which is capable of revealing contrasting aggregation against sparsity of extreme events, is further compressed into a shortened sequence of state transitions. This compressed digital code sequence vividly demonstrates that the aggregation of large absolute returns is the primary driving force for stimulating both the aggregations of large trading volumes and transaction numbers. The state of system-wise synchrony is manifested with very frequent recurrence in the stock dynamics. And this data-driven dynamic mechanism is seen to correspondingly vary as the global market transits in and out of contraction-expansion cycles. These results not only elaborate the stock dynamics of interest to a fuller extent, but also contradict some classical theories in finance. Overall this version of stock dynamics is potentially more coherent and realistic, especially when the current financial market is increasingly powered by high-frequency trading via computer algorithms, rather than by individual investors.
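One plausible reading of the base-8 coupling step (an assumption on our part; the published HFS algorithm is more involved) is that each time bar contributes one extreme-event bit per series, and the three bits combine into a single base-8 symbol:

```python
# Hypothetical sketch: threshold three coupled series at an extreme quantile
# and pack the three indicator bits into one base-8 symbol per time bar.
import numpy as np

rng = np.random.default_rng(7)
n = 500
abs_ret = np.abs(rng.standard_t(3, n))   # toy high-frequency |returns|
volume = rng.lognormal(0.0, 1.0, n)      # toy trading volume
ntrades = rng.poisson(50, n)             # toy transaction counts

def extreme(series, q=0.9):
    return (series > np.quantile(series, q)).astype(int)

code = 4 * extreme(abs_ret) + 2 * extreme(volume) + extreme(ntrades)  # 0..7
print(np.bincount(code, minlength=8))    # empirical base-8 symbol frequencies
```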
LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor
NASA Astrophysics Data System (ADS)
Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram
2007-09-01
Implementing the sum-product algorithm in an FPGA with an embedded processor invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor design performs computations with significantly less precision than the standard (e.g., integer, floating-point) operations of general-purpose processors. Using synthesis, targeting a 3,168-LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help estimate the full capacity and performance of an FPGA-based coprocessor.
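The product computation the coprocessor targets is the check-to-variable update of sum-product decoding, m_i = 2 atanh(prod_{j != i} tanh(L_j / 2)). The sketch below evaluates it in float16 versus float64 to mimic the precision/coding-gain tradeoff; it is a single check-node update, not a full decoder:

```python
# Check-to-variable message update of sum-product (belief-propagation)
# decoding, computed at reduced precision to mimic limited-precision logic.
import numpy as np

def check_to_var(L_in, precision=np.float16):
    """Messages from one check node to each of its variable neighbors."""
    t = np.tanh(L_in.astype(precision) / 2).astype(precision)
    msgs = np.empty_like(t)
    for i in range(len(t)):                        # leave-one-out products
        msgs[i] = np.prod(np.delete(t, i))
    return 2 * np.arctanh(msgs.astype(np.float64))

L = np.array([1.2, -0.4, 2.5, 0.8])                # incoming LLRs
print("float16:", check_to_var(L))
print("float64:", check_to_var(L, np.float64))     # compare precision loss
```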
Far-Field Turbulent Vortex-Wake/Exhaust Plume Interaction for Subsonic and HSCT Airplanes
NASA Technical Reports Server (NTRS)
Kandil, Osama A.; Adam, Ihab; Wong, Tin-Chee
1996-01-01
Computational study of the far-field turbulent vortex-wake/exhaust plume interaction for subsonic and high speed civil transport (HSCT) airplanes is carried out. The Reynolds-averaged Navier-Stokes (NS) equations are solved using the implicit, upwind, Roe-flux-differencing, finite-volume scheme. The two-equation shear stress transport model of Menter is implemented with the NS solver for turbulent-flow calculation. For the far-field study, the computations of vortex-wake interaction with the exhaust plume of a single engine of a Boeing 727 wing in a holding condition and two engines of an HSCT in a cruise condition are carried out for several miles downstream using an overlapping zonal method. These results are obtained using the computer code FTNS3D. The subsonic-flow results of this code are compared with those of a parabolized NS solver known as the UNIWAKE code.
Single-exposure quantitative phase imaging in color-coded LED microscopy.
Lee, Wonchan; Jung, Daeseong; Ryu, Suho; Joo, Chulmin
2017-04-03
We demonstrate single-shot quantitative phase imaging (QPI) in a platform of color-coded LED microscopy (cLEDscope). The light source in a conventional microscope is replaced by a circular LED pattern that is trisected into subregions with equal area, assigned to red, green, and blue colors. Image acquisition with a color image sensor and subsequent computation based on weak object transfer functions allow for the QPI of a transparent specimen. We also provide a correction method for color-leakage, which may be encountered in implementing our method with consumer-grade LEDs and image sensors. Most commercially available LEDs and image sensors do not provide spectrally isolated emissions and pixel responses, generating significant error in phase estimation in our method. We describe the correction scheme for this color-leakage issue, and demonstrate improved phase measurement accuracy. The computational model and single-exposure QPI capability of our method are presented by showing images of calibrated phase samples and cellular specimens.
Numerical studies of the deposition of material released from fixed and rotary wing aircraft
NASA Technical Reports Server (NTRS)
Bilanin, A. J.; Teske, M. E.
1984-01-01
The computer code AGDISP (AGricultural DISPersal) has been developed to predict the deposition of material released from fixed and rotary wing aircraft in a single-pass, computationally efficient manner. The formulation of the code is novel in that the mean particle trajectory and the variance about the mean resulting from turbulent fluid fluctuations are simultaneously predicted. The code presently includes the capability of assessing the influence of neutral atmospheric conditions, inviscid wake vortices, particle evaporation, plant canopy and terrain on the deposition pattern. In this report, the equations governing the motion of aerially released particles are developed, including a description of the evaporation model used. A series of case studies, using AGDISP, are included.
Quantum steganography and quantum error-correction
NASA Astrophysics Data System (ADS)
Shaw, Bilal A.
Quantum error-correcting codes have been the cornerstone of research in quantum information science (QIS) for more than a decade. Without their conception, quantum computers would be a footnote in the history of science. When researchers embraced the idea that we live in a world where the effects of a noisy environment cannot completely be stripped away from the operations of a quantum computer, the natural way forward was to think about importing classical coding theory into the quantum arena to give birth to quantum error-correcting codes which could help in mitigating the debilitating effects of decoherence on quantum data. We first talk about the six-qubit quantum error-correcting code and show its connections to entanglement-assisted error-correcting coding theory and then to subsystem codes. This code bridges the gap between the five-qubit (perfect) and Steane codes. We discuss two methods to encode one qubit into six physical qubits. Each of the two examples corrects an arbitrary single-qubit error. The first example is a degenerate six-qubit quantum error-correcting code. We explicitly provide the stabilizer generators, encoding circuits, codewords, logical Pauli operators, and logical CNOT operator for this code. We also show how to convert this code into a non-trivial subsystem code that saturates the subsystem Singleton bound. We then prove that a six-qubit code without entanglement assistance cannot simultaneously possess a Calderbank-Shor-Steane (CSS) stabilizer and correct an arbitrary single-qubit error. A corollary of this result is that the Steane seven-qubit code is the smallest single-error correcting CSS code. Our second example is the construction of a non-degenerate six-qubit CSS entanglement-assisted code. This code uses one bit of entanglement (an ebit) shared between the sender (Alice) and the receiver (Bob) and corrects an arbitrary single-qubit error. The code we obtain is globally equivalent to the Steane seven-qubit code and thus corrects an arbitrary error on the receiver's half of the ebit as well. We prove that this code is the smallest code with a CSS structure that uses only one ebit and corrects an arbitrary single-qubit error on the sender's side. We discuss the advantages and disadvantages for each of the two codes. In the second half of this thesis we explore the yet uncharted and relatively undiscovered area of quantum steganography. Steganography is the process of hiding secret information by embedding it in an "innocent" message. We present protocols for hiding quantum information in a codeword of a quantum error-correcting code passing through a channel. Using either a shared classical secret key or shared entanglement Alice disguises her information as errors in the channel. Bob can retrieve the hidden information, but an eavesdropper (Eve) with the power to monitor the channel, but without the secret key, cannot distinguish the message from channel noise. We analyze how difficult it is for Eve to detect the presence of secret messages, and estimate rates of steganographic communication and secret key consumption for certain protocols. We also provide an example of how Alice hides quantum information in the perfect code when the underlying channel between Bob and her is the depolarizing channel. Using this scheme Alice can hide up to four stego-qubits.
NASA Astrophysics Data System (ADS)
Ethier, Stephane; Lin, Zhihong
2001-10-01
Earlier this year, the National Energy Research Scientific Computing Center (NERSC) took delivery of the second most powerful computer in the world. With its 2,528 processors running at a peak performance of 1.5 GFlops each, this IBM SP machine has a theoretical performance of almost 3.8 TFlops. To efficiently harness such computing power in one single code is not an easy task and requires a good knowledge of the computer's architecture. Here we present the steps that we followed to improve our gyrokinetic micro-turbulence code GTC in order to take advantage of the new 16-way shared-memory nodes of the NERSC IBM SP. Performance results are shown, as well as details about the improved mixed-mode MPI-OpenMP model that we use. The enhancements to the code allowed us to tackle much bigger problem sizes, getting closer to our goal of simulating an ITER-size tokamak with both kinetic ions and electrons. (This work is supported by DOE Contract No. DE-AC02-76CH03073 (PPPL), and in part by the DOE Fusion SciDAC Project.)
Computational models for the analysis of three-dimensional internal and exhaust plume flowfields
NASA Technical Reports Server (NTRS)
Dash, S. M.; Delguidice, P. D.
1977-01-01
This paper describes computational procedures developed for the analysis of three-dimensional supersonic ducted flows and multinozzle exhaust plume flowfields. The models/codes embodying these procedures cater to a broad spectrum of geometric situations via the use of multiple reference plane grid networks in several coordinate systems. Shock capturing techniques are employed to trace the propagation and interaction of multiple shock surfaces while the plume interface, separating the exhaust and external flows, and the plume external shock are discretely analyzed. The computational grid within the reference planes follows the trace of streamlines to facilitate the incorporation of finite-rate chemistry and viscous computational capabilities. Exhaust gas properties consist of combustion products in chemical equilibrium. The computational accuracy of the models/codes is assessed via comparisons with exact solutions, results of other codes and experimental data. Results are presented for the flows in two-dimensional convergent and divergent ducts, expansive and compressive corner flows, flow in a rectangular nozzle and the plume flowfields for exhausts issuing out of single and multiple rectangular nozzles.
Nontangent, Developed Contour Bulkheads for a Single-Stage Launch Vehicle
NASA Technical Reports Server (NTRS)
Wu, K. Chauncey; Lepsch, Roger A., Jr.
2000-01-01
Dry weights for single-stage launch vehicles that incorporate nontangent, developed contour bulkheads are estimated and compared to a baseline vehicle with 1.414-aspect-ratio ellipsoidal bulkheads. Weights, volumes, and heights of optimized bulkhead designs are computed using a preliminary design bulkhead analysis code. The dry weights of vehicles that incorporate the optimized bulkheads are predicted using a vehicle weights and sizing code. Two optimization approaches are employed. A structural-level method, where the vehicle's three major bulkhead regions are optimized separately and then incorporated into a model for computation of the vehicle dry weight, predicts a reduction of 4365 lb (2.2%) from the 200,679-lb baseline vehicle dry weight. In the second, vehicle-level, approach, the vehicle dry weight is the objective function for the optimization. For the vehicle-level analysis, modified bulkhead designs are analyzed and incorporated into the weights model for computation of a dry weight. The optimizer simultaneously manipulates design variables for all three bulkheads to reduce the dry weight. The vehicle-level analysis predicts a dry weight reduction of 5129 lb, a 2.6% reduction from the baseline weight. Based on these results, nontangent, developed contour bulkheads may provide substantial weight savings for single-stage vehicles.
Wei, Yu-Jia; He, Yu-Ming; Chen, Ming-Cheng; Hu, Yi-Nan; He, Yu; Wu, Dian; Schneider, Christian; Kamp, Martin; Höfling, Sven; Lu, Chao-Yang; Pan, Jian-Wei
2014-11-12
Single photons are attractive candidates of quantum bits (qubits) for quantum computation and are the best messengers in quantum networks. Future scalable, fault-tolerant photonic quantum technologies demand both stringently high levels of photon indistinguishability and generation efficiency. Here, we demonstrate deterministic and robust generation of pulsed resonance fluorescence single photons from a single semiconductor quantum dot using adiabatic rapid passage, a method robust against fluctuation of driving pulse area and dipole moments of solid-state emitters. The emitted photons are background-free, have a vanishing two-photon emission probability of 0.3% and a raw (corrected) two-photon Hong-Ou-Mandel interference visibility of 97.9% (99.5%), reaching a precision that places single photons at the threshold for fault-tolerant surface-code quantum computing. This single-photon source can be readily scaled up to multiphoton entanglement and used for quantum metrology, boson sampling, and linear optical quantum computing.
Braiding by Majorana tracking and long-range CNOT gates with color codes
NASA Astrophysics Data System (ADS)
Litinski, Daniel; von Oppen, Felix
2017-11-01
Color-code quantum computation seamlessly combines Majorana-based hardware with topological error correction. Specifically, as Clifford gates are transversal in two-dimensional color codes, they enable the use of the Majoranas' non-Abelian statistics for gate operations at the code level. Here, we discuss the implementation of color codes in arrays of Majorana nanowires that avoid branched networks such as T junctions, thereby simplifying their realization. We show that, in such implementations, non-Abelian statistics can be exploited without ever performing physical braiding operations. Physical braiding operations are replaced by Majorana tracking, an entirely software-based protocol which appropriately updates the Majoranas involved in the color-code stabilizer measurements. This approach minimizes the required hardware operations for single-qubit Clifford gates. For Clifford completeness, we combine color codes with surface codes, and use color-to-surface-code lattice surgery for long-range multitarget CNOT gates which have a time overhead that grows only logarithmically with the physical distance separating control and target qubits. With the addition of magic state distillation, our architecture describes a fault-tolerant universal quantum computer in systems such as networks of tetrons, hexons, or Majorana box qubits, but can also be applied to nontopological qubit platforms.
A method to compute SEU fault probabilities in memory arrays with error correction
NASA Technical Reports Server (NTRS)
Gercek, Gokhan
1994-01-01
With the increasing packing densities in VLSI technology, Single Event Upsets (SEU) due to cosmic radiation are becoming more of a critical issue in the design of space avionics systems. In this paper, a method is introduced to compute the fault (mishap) probability for a computer memory of size M words. It is assumed that a Hamming code is used for each word to provide single-error correction. It is also assumed that every time a memory location is read, single errors are corrected. Memory is read randomly with a distribution that is assumed to be known. In such a scenario, a mishap is defined as two SEUs corrupting the same memory location prior to a read. The paper introduces a method to compute the overall mishap probability for the entire memory for a mission duration of T hours.
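The quantity being computed can be sketched under simple stand-in assumptions: Poisson upsets at rate lambda per word-hour and exponentially distributed gaps between reads of a given word (the paper allows an arbitrary known read distribution). A word is at risk whenever two upsets land in one gap:

```python
# Hedged sketch of a double-SEU mishap estimate under toy assumptions.
lam = 1e-6          # SEU rate per word per hour (illustrative)
read_rate = 0.1     # mean reads of a given word per hour (illustrative)
M = 2**20           # memory words
T = 1000.0          # mission duration, hours

# For exponential inter-read gaps G ~ Exp(read_rate) and N | G ~ Poisson(lam*G):
#   P(N = 0) = r / (lam + r),  P(N = 1) = lam * r / (lam + r)^2,
# so the chance of >= 2 upsets in one gap is the complement below.
r = read_rate
p_two_in_gap = 1.0 - r / (lam + r) - lam * r / (lam + r)**2

# Mean mishap count: words * expected gaps per word over the mission.
mishaps = M * r * T * p_two_in_gap
print("expected double-SEU words over mission:", mishaps)
```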
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dias, Mafalda; Seery, David; Frazer, Jonathan, E-mail: m.dias@sussex.ac.uk, E-mail: j.frazer@sussex.ac.uk, E-mail: a.liddle@sussex.ac.uk
We describe how to apply the transport method to compute inflationary observables in a broad range of multiple-field models. The method is efficient and encompasses scenarios with curved field-space metrics, violations of slow-roll conditions and turns of the trajectory in field space. It can be used for an arbitrary mass spectrum, including massive modes and models with quasi-single-field dynamics. In this note we focus on practical issues. It is accompanied by a Mathematica code which can be used to explore suitable models, or as a basis for further development.
Li, Yun Bo; Li, Lian Lin; Xu, Bai Bing; Wu, Wei; Wu, Rui Yuan; Wan, Xiang; Cheng, Qiang; Cui, Tie Jun
2016-03-30
The programmable and digital metamaterials or metasurfaces presented recently have great potential for designing real-time-controlled electromagnetic devices. Here, we propose the first transmission-type 2-bit programmable coding metasurface for single-sensor and single-frequency imaging at microwave frequencies. Compared with existing single-sensor imagers composed of active spatial modulators whose units are controlled independently, we introduce a randomly programmable metasurface to generate the modulator masks, in which rows and columns are controlled simultaneously so that the complexity and cost of the imaging system are reduced drastically. Unlike single-sensor approaches that rely on frequency agility, the proposed imaging system uses variable modulators at a single frequency, which avoids object dispersion. To realize the transmission-type 2-bit programmable metasurface, we propose a two-layer binary coding unit, which makes it convenient to change the voltages by rows and columns so as to switch the diodes in the top and bottom layers, respectively. In our imaging measurements, we generate random codes by computer to achieve different transmission patterns, which supply enough independent modes to solve the inverse-scattering problem in single-sensor imaging. Experimental results at microwave frequencies validate the new single-sensor, single-frequency imaging system.
Hierarchical parallelisation of functional renormalisation group calculations - hp-fRG
NASA Astrophysics Data System (ADS)
Rohe, Daniel
2016-10-01
The functional renormalisation group (fRG) has evolved into a versatile tool in condensed matter theory for studying important aspects of correlated electron systems. Practical applications of the method often involve a high numerical effort, motivating the question of how far High Performance Computing (HPC) can leverage the approach. In this work we report on a multi-level parallelisation of the underlying computational machinery and show that this can speed up the code by several orders of magnitude, which in turn extends the applicability of the method to otherwise inaccessible cases. We exploit three levels of parallelisation: distributed computing by means of message passing (MPI), shared-memory computing using OpenMP, and vectorisation by means of SIMD (single-instruction-multiple-data) units. Results are provided for two distinct HPC platforms, namely the IBM-based BlueGene/Q system JUQUEEN and an Intel Sandy Bridge-based development cluster. We discuss how certain issues and obstacles were overcome in the course of adapting the code. Most importantly, we conclude that this vast improvement can be accomplished by introducing only moderate changes to the code, so that this strategy may serve as a guideline for other researchers seeking to improve the efficiency of their codes.
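The three-level structure is easy to sketch. The toy below is not the hp-fRG code: it uses Python's mpi4py for the distributed level and lets NumPy's compiled array arithmetic stand in for the OpenMP-thread and SIMD levels, summing an illustrative integrand across ranks.

```python
# Minimal sketch of the three-level pattern (assumes mpi4py and NumPy are
# installed; an illustration only, not the hp-fRG implementation).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N = 1 << 20                  # total integrand points (assumes size divides N)
chunk = N // size            # level 1: distribute work across MPI ranks
lo, hi = rank * chunk, (rank + 1) * chunk

x = np.linspace(0.0, 1.0, N)[lo:hi]

# Levels 2/3: within a rank, NumPy evaluates the integrand over the whole
# slice at once; the compiled loop stands in for OpenMP threads and SIMD
# units in a Fortran/C++ original.
local = np.sum(np.exp(-x) * np.cos(10 * x))

total = comm.reduce(local, op=MPI.SUM, root=0)
if rank == 0:
    print("integral estimate:", total / N)
```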
NASA Technical Reports Server (NTRS)
Koeksal, Adnan; Trew, Robert J.; Kauffman, J. Frank
1992-01-01
A Moment Method model for the radiation pattern characterization of single Linearly Tapered Slot Antennas (LTSAs) in air or on a dielectric substrate is developed. This characterization consists of: (1) finding the radiated far fields of the antenna; (2) determining the E-plane and H-plane beamwidths and sidelobe levels; and (3) determining the D-plane beamwidth and cross-polarization levels, as the antenna parameters (length, height, taper angle, substrate thickness, and relative substrate permittivity) vary. The LTSA geometry does not lend itself to analytical solution over the given parameter ranges; therefore, a computer modeling scheme and a code are necessary to analyze the problem. This necessity imposes further requirements on the modeling method and the computer code: (1) a good approximation to the real antenna geometry, and (2) feasible computer storage and time requirements. Accordingly, the work concentrates on the development of efficient modeling schemes for this type of problem and on reducing the central processing unit (CPU) time required by the computer code. A Method of Moments (MoM) code is developed for the analysis of LTSAs within the given parameter ranges.
SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws
NASA Technical Reports Server (NTRS)
Cooke, Daniel; Rushton, Nelson
2013-01-01
With the introduction of new parallel architectures like the Cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language, that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single-process) C or C++, and an order of magnitude less costly than development of comparable parallel code. Moreover, SequenceL not only automatically parallelizes the code, but, because it is based on CSP-NT, it is provably race free, thus eliminating the largest quality challenge the parallel software developer faces.
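As a rough analogy (not actual SequenceL), the Normalize-Transpose law resembles NumPy broadcasting: a function written for scalars applies element-wise to arbitrarily nested sequences with no explicit loops, which is exactly the structure a compiler can parallelize automatically.

```python
import numpy as np

# In SequenceL, a scalar function such as f(x) := x * x + 1 extends
# automatically over sequences (Normalize-Transpose), and the compiler
# parallelises the implied loops. NumPy broadcasting mimics the semantics,
# though not the automatic multithreading:
def f(x):
    return x * x + 1

print(f(3))                          # scalar        -> 10
print(f(np.array([1, 2, 3])))        # sequence      -> [ 2  5 10]
print(f(np.arange(6).reshape(2, 3))) # nested sequence, still no explicit loop
```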
Development and application of computational aerothermodynamics flowfield computer codes
NASA Technical Reports Server (NTRS)
Venkatapathy, Ethiraj
1992-01-01
Presented is a collection of papers on research activities carried out during the funding period of October 1991 to March 1992. Topics covered include: blunt body flows in thermochemical equilibrium; thermochemical relaxation in high enthalpy nozzle flow; single expansion ramp nozzle simulations; lunar return aerobraking; line boundary problem for three dimensional grids; and unsteady shock induced combustion.
NASA Astrophysics Data System (ADS)
Ford, Eric B.
2009-05-01
We present the results of a highly parallel Kepler equation solver using the Graphics Processing Unit (GPU) on a commercial nVidia GeForce GTX 280 and the "Compute Unified Device Architecture" (CUDA) programming environment. We apply this to evaluate a goodness-of-fit statistic (e.g., χ2) for Doppler observations of stars potentially harboring multiple planetary companions (assuming negligible planet-planet interactions). Given the high dimensionality of the model parameter space (at least five dimensions per planet), a global search is extremely computationally demanding. We expect that the underlying Kepler solver and model evaluator will be combined with a wide variety of more sophisticated algorithms to provide efficient global search, parameter estimation, model comparison, and adaptive experimental design for radial velocity and/or astrometric planet searches. We tested multiple implementations using single precision, double precision, pairs of single precision, and mixed precision arithmetic. We find that the vast majority of computations can be performed using single precision arithmetic, with selective use of compensated summation for increased precision. However, standard single precision is not adequate for calculating the mean anomaly from the time of observation and orbital period when evaluating the goodness-of-fit for real planetary systems and observational data sets. Using all double precision, our GPU code outperforms a similar code using a modern CPU by a factor of over 60. Using mixed precision, our GPU code provides a speed-up factor of over 600 when evaluating nsys > 1024 planetary-system models, each containing npl = 4 planets and assuming nobs = 256 observations per system. We conclude that modern GPUs offer a powerful tool for repeatedly evaluating Kepler's equation and a goodness-of-fit statistic for orbital models when presented with a large parameter space.
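The numerical core is easy to reproduce on a CPU. The sketch below is a NumPy stand-in for the GPU kernel: a vectorised Newton iteration for Kepler's equation, plus a mean-anomaly reduction carried out in double precision, illustrating the precision pitfall the abstract describes. The orbital period used is hypothetical.

```python
import numpy as np

def kepler_E(M, e, tol=1e-12, max_iter=50):
    """Solve Kepler's equation E - e*sin(E) = M by vectorised Newton iteration."""
    E = M + e * np.sin(M)                   # reasonable starting guess
    for _ in range(max_iter):
        dE = (E - e * np.sin(E) - M) / (1.0 - e * np.cos(E))
        E -= dE
        if np.max(np.abs(dE)) < tol:
            break
    return E

def mean_anomaly(t, t0, P):
    """M = 2*pi*(t - t0)/P reduced to [0, 2*pi).

    As the abstract notes, forming (t - t0)/P in single precision loses the
    fractional phase once t spans many orbits; doing the reduction in double
    precision (or with compensated summation) keeps the phase accurate.
    """
    frac = np.mod((np.asarray(t, dtype=np.float64) - t0) / P, 1.0)
    return 2.0 * np.pi * frac

t = np.linspace(0.0, 2500.0, 256)           # days of observations
M = mean_anomaly(t, t0=0.0, P=3.52474859)   # hypothetical period in days
E = kepler_E(M, e=0.3)
print(np.allclose(E - 0.3 * np.sin(E), M))  # True: equation satisfied
```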
Ekinci, Yunus Levent
2016-01-01
This paper presents an easy-to-use open source computer algorithm (code) for estimating the depths of isolated single thin dike-like source bodies by using numerical second-, third-, and fourth-order horizontal derivatives computed from observed magnetic anomalies. The approach does not require a priori information and uses some filters of successive graticule spacings. The computed higher-order horizontal derivative datasets are used to solve nonlinear equations for depth determination. The solutions are independent from the magnetization and ambient field directions. The practical usability of the developed code, designed in MATLAB R2012b (MathWorks Inc.), was successfully examined using some synthetic simulations with and without noise. The algorithm was then used to estimate the depths of some ore bodies buried in different regions (USA, Sweden, and Canada). Real data tests clearly indicated that the obtained depths are in good agreement with those of previous studies and drilling information. Additionally, a state-of-the-art inversion scheme based on particle swarm optimization produced comparable results to those of the higher-order horizontal derivative analyses in both synthetic and real anomaly cases. Accordingly, the proposed code is verified to be useful in interpreting isolated single thin dike-like magnetized bodies and may be an alternative processing technique. The open source code can be easily modified and adapted to suit the benefits of other researchers.
NASA Technical Reports Server (NTRS)
Hall, E. J.; Topp, D. A.; Delaney, R. A.
1996-01-01
The overall objective of this study was to develop a 3-D numerical analysis for compressor casing treatment flowfields. The current version of the computer code resulting from this study is referred to as ADPAC (Advanced Ducted Propfan Analysis Codes-Version 7). This report is intended to serve as a computer program user's manual for the ADPAC code developed under Tasks 6 and 7 of the NASA Contract. The ADPAC program is based on a flexible multiple-block grid discretization scheme permitting coupled 2-D/3-D mesh block solutions with application to a wide variety of geometries. Aerodynamic calculations are based on a four-stage Runge-Kutta time-marching finite volume solution technique with added numerical dissipation. Steady flow predictions are accelerated by a multigrid procedure. An iterative implicit algorithm is available for rapid time-dependent flow calculations, and an advanced two-equation turbulence model is incorporated to predict complex turbulent flows. The consolidated code generated during this study is capable of executing in either a serial or parallel computing mode from a single source code. Numerous examples are given in the form of test cases to demonstrate the utility of this approach for predicting the aerodynamics of modern turbomachinery configurations.
A Comparison of Three Navier-Stokes Solvers for Exhaust Nozzle Flowfields
NASA Technical Reports Server (NTRS)
Georgiadis, Nicholas J.; Yoder, Dennis A.; Debonis, James R.
1999-01-01
A comparison of the NPARC, PAB, and WIND (previously known as NASTD) Navier-Stokes solvers is made for two flow cases with turbulent mixing as the dominant flow characteristic: a two-dimensional ejector nozzle and a Mach 1.5 elliptic jet. The objective of the work is to determine whether comparable predictions of nozzle flows can be obtained from different Navier-Stokes codes employed in a multiple-site research program. A single computational grid was constructed for each of the two flows and used for all of the Navier-Stokes solvers. In addition, similar k-ε-based turbulence models were employed in each code, and boundary conditions were specified as similarly as possible across the codes. Comparisons of mass flow rates, velocity profiles, and turbulence model quantities are made between the computations and experimental data. The computational cost of obtaining converged solutions with each of the codes is also documented. Results indicate that all of the codes provided similar predictions for the two nozzle flows. Agreement of the Navier-Stokes calculations with experimental data was good for the ejector nozzle. However, for the Mach 1.5 elliptic jet, the calculations were unable to accurately capture the development of the three-dimensional elliptic mixing layer.
Hybrid Parallel Contour Trees, Version 1.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sewell, Christopher; Fasel, Patricia; Carr, Hamish
A common operation in scientific visualization is to compute and render a contour of a data set. Given a function of the form f : R^d -> R, a level set is defined as an inverse image f^-1(h) for an isovalue h, and a contour is a single connected component of a level set. The Reeb graph can then be defined to be the result of contracting each contour to a single point, and is well defined for Euclidean spaces or for general manifolds. For simple domains, the graph is guaranteed to be a tree, and is called the contour tree. Analysis can then be performed on the contour tree in order to identify isovalues of particular interest, based on various metrics, and render the corresponding contours, without having to know such isovalues a priori. This code is intended to be the first data-parallel algorithm for computing contour trees. Our implementation will use the portable data-parallel primitives provided by Nvidia's Thrust library, allowing us to compile our same code for both GPUs and multi-core CPUs. Native OpenMP and purely serial versions of the code will likely also be included. It will also be extended to provide a hybrid data-parallel / distributed algorithm, allowing scaling beyond a single GPU or CPU.
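The definitions are easy to make concrete. The sketch below (not the Thrust implementation) samples a two-peak function on a grid and counts connected components of a superlevel set with scipy.ndimage: the count changes with the isovalue h, which is exactly the information a contour tree organizes.

```python
import numpy as np
from scipy import ndimage

# Two Gaussian bumps: f has two maxima, so high isovalues give two contours
# while low isovalues merge them into one.
y, x = np.mgrid[-2:2:256j, -2:2:256j]
f = np.exp(-((x - 1.2)**2 + y**2)) + np.exp(-((x + 1.2)**2 + y**2))

for h in (0.1, 0.7):
    mask = f >= h                       # superlevel set f^-1([h, inf))
    labels, n = ndimage.label(mask)     # connected components = contours
    print(f"isovalue {h}: {n} contour(s)")
# isovalue 0.1: 1 contour(s)
# isovalue 0.7: 2 contour(s)
```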
Error rates and resource overheads of encoded three-qubit gates
NASA Astrophysics Data System (ADS)
Takagi, Ryuji; Yoder, Theodore J.; Chuang, Isaac L.
2017-10-01
A non-Clifford gate is required for universal quantum computation, and, typically, this is the most error-prone and resource-intensive logical operation on an error-correcting code. Small, single-qubit rotations are popular choices for this non-Clifford gate, but certain three-qubit gates, such as Toffoli or controlled-controlled-Z (ccz), are equivalent options that are also more suited for implementing some quantum algorithms, for instance, those with coherent classical subroutines. Here, we calculate error rates and resource overheads for implementing logical ccz with pieceable fault tolerance, a nontransversal method for implementing logical gates. We provide a comparison with a nonlocal magic-state scheme on a concatenated code and a local magic-state scheme on the surface code. We find the pieceable fault-tolerance scheme particularly advantaged over magic states on concatenated codes and in certain regimes over magic states on the surface code. Our results suggest that pieceable fault tolerance is a promising candidate for fault tolerance in a near-future quantum computer.
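For reference, ccz is diagonal in the computational basis, and conjugating its target qubit by Hadamards yields Toffoli; the two-line check below is standard linear algebra, independent of the paper's constructions.

```python
import numpy as np

# ccz acts on three qubits, flipping the phase of |111> only:
ccz = np.diag([1, 1, 1, 1, 1, 1, 1, -1]).astype(complex)

# Toffoli (ccx) is ccz conjugated by a Hadamard on the target qubit:
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
I4 = np.eye(4)
toffoli = np.kron(I4, H) @ ccz @ np.kron(I4, H)
print(np.round(toffoli.real))  # permutation matrix swapping |110> <-> |111>
```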
Shared Memory Parallelization of an Implicit ADI-type CFD Code
NASA Technical Reports Server (NTRS)
Hauser, Th.; Huang, P. G.
1999-01-01
A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a friction Reynolds number Re_tau = 180 has shown good agreement with existing data.
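The structure that makes ADI sweeps parallelize well is that each grid line yields an independent tridiagonal solve. A minimal batched Thomas-algorithm sketch (illustrative, not the LESTool code) shows the batch axis over which OpenMP threads, or here NumPy, can work concurrently:

```python
import numpy as np

def thomas_batched(a, b, c, d):
    """Solve n independent tridiagonal systems, one per row of d.

    a, b, c : (m,) sub-, main- and super-diagonals (same matrix per line here)
    d       : (n, m) right-hand sides, one per grid line
    Each line is independent, which is what lets ADI sweeps be spread across
    OpenMP threads (here: across the NumPy batch axis).
    """
    n, m = d.shape
    cp = np.empty(m); dp = np.empty((n, m))
    cp[0] = c[0] / b[0]; dp[:, 0] = d[:, 0] / b[0]
    for i in range(1, m):                      # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[:, i] = (d[:, i] - a[i] * dp[:, i - 1]) / denom
    x = np.empty_like(dp)
    x[:, -1] = dp[:, -1]
    for i in range(m - 2, -1, -1):             # back substitution
        x[:, i] = dp[:, i] - cp[i] * x[:, i + 1]
    return x

# Implicit 1D diffusion step on 64 independent grid lines of 128 points:
m, n, r = 128, 64, 0.5
a = np.full(m, -r); b = np.full(m, 1 + 2 * r); c = np.full(m, -r)
u = thomas_batched(a, b, c, np.random.rand(n, m))
print(u.shape)  # (64, 128)
```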
Effects of cosmic rays on single event upsets
NASA Technical Reports Server (NTRS)
Venable, D. D.; Zajic, V.; Lowe, C. W.; Olidapupo, A.; Fogarty, T. N.
1989-01-01
Assistance was provided to the Brookhaven Single Event Upset (SEU) Test Facility. Computer codes were developed for fragmentation and secondary radiation affecting Very Large Scale Integration (VLSI) circuits in space. A computer-controlled capacitance-voltage (CV) test using an HP4192 was developed for Terman analysis. Also developed were high-speed parametric tests that are independent of operator judgment, and a charge-pumping technique for measurement of the interface trap density D_it(E). X-ray secondary effects and parametric degradation as a function of dose rate were simulated. The SPICE simulation of static RAMs with various resistor filters was tested.
Simulation of Hypervelocity Impact on Aluminum-Nextel-Kevlar Orbital Debris Shields
NASA Technical Reports Server (NTRS)
Fahrenthold, Eric P.
2000-01-01
An improved hybrid particle-finite element method has been developed for hypervelocity impact simulation. The method combines the general contact-impact capabilities of particle codes with the true Lagrangian kinematics of large strain finite element formulations. Unlike some alternative schemes that couple Lagrangian finite element models with smooth particle hydrodynamics, the present formulation makes no use of slidelines or penalty forces. The method has been implemented in a parallel, three-dimensional computer code. Simulations of three-dimensional orbital debris impact problems using this parallel hybrid particle-finite element code show good agreement with experiment and good speedup in parallel computation. The simulations included single- and multi-plate shields as well as aluminum and composite shielding materials, at an impact velocity of eleven kilometers per second.
Engineering calculations for communications satellite systems planning
NASA Technical Reports Server (NTRS)
Levis, C. A.; Martin, C. H.; Reilly, C. H.; Gonsalvez, D. J.; Yamaura, Y.
1985-01-01
An extended gradient search code for broadcasting satellite service (BSS) spectrum/orbit assignment synthesis is discussed. Progress is also reported on both single-entry and full synthesis computational aids for fixed satellite service (FSS) spectrum/orbit assignment purposes.
NASA Astrophysics Data System (ADS)
Susilo, J.; Suparlina, L.; Deswandri; Sunaryo, G. R.
2018-02-01
The use of computer programs for PWR core neutronic design parameter analysis has been demonstrated in several previous studies, which included validation of the codes against neutronic parameter values obtained from measurements and benchmark calculations. In this study, the AP1000 first-cycle core radial power peaking factor was validated and analyzed using the CITATION module of the SRAC2006 computer code. The code had also been validated, with good results, against the criticality values of the VERA benchmark core. The AP1000 core power distribution calculation was performed in two-dimensional X-Y geometry through quarter-core modeling. The purpose of this research is to determine the accuracy of the SRAC2006 code, as well as the safety performance of the AP1000 core during its first operating cycle. The core calculations were carried out for several conditions: without any Rod Cluster Control Assembly (RCCA), with insertion of a single RCCA (AO, M1, M2, MA, MB, MC, or MD), and with insertion of multiple RCCAs (MA + MB, MA + MB + MC, MA + MB + MC + MD, and MA + MB + MC + MD + M1). The maximum power factor of the fuel rods within a fuel assembly was assumed to be approximately 1.406. The analysis showed that the two-dimensional CITATION module of the SRAC2006 code is accurate for AP1000 power distribution calculations without RCCA and with MA + MB RCCA insertion. The power peaking factors in the first operating cycle of the AP1000 core without RCCA, as well as with single and multiple RCCA insertions, remain below the safety limit (less than about 1.798). In terms of the thermal power generated by the fuel assemblies, the AP1000 core can therefore be considered safe in its first operating cycle.
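For readers unfamiliar with the quantity being checked: the radial power peaking factor is the maximum relative power divided by the core average, and the safety criterion above is simply that it stays below about 1.798. A toy computation with hypothetical (non-AP1000) assembly powers:

```python
import numpy as np

# Hypothetical quarter-core relative power map (illustrative numbers only).
power = np.array([
    [1.10, 1.25, 1.18, 0.95],
    [1.25, 1.32, 1.21, 0.88],
    [1.18, 1.21, 1.05, 0.70],
    [0.95, 0.88, 0.70, 0.45],
])
# Peaking factor: hottest assembly power over core-average power.
peaking_factor = power.max() / power.mean()
print(f"radial peaking factor = {peaking_factor:.3f}")
print("within limit:", peaking_factor < 1.798)
```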
The Influence of Viscous Effects on Ice Accretion Prediction and Airfoil Performance Predictions
NASA Technical Reports Server (NTRS)
Kreeger, Richard E.; Wright, William B.
2005-01-01
A computational study was conducted to evaluate the effectiveness of using a viscous flow solution in an ice accretion code and the resulting accuracy of aerodynamic performance prediction. Ice shapes were obtained for one single-element and one multi-element airfoil using both potential flow and Navier-Stokes flowfields in the LEWICE ice accretion code. Aerodynamics were then calculated using a Navier-Stokes flow solver.
Automated Diversity in Computer Systems
2005-09-01
traces that started with trace heads, namely backwards-taken branches. These branches are indicative of loops within the program, and Dynamo assumes that ... would be the ones the program would normally take. Therefore, when a trace head became hot (was visited enough times), only a single code trace would ... all encountered trace heads. When an interesting instruction is being emulated, the tracing code checks to see if it has been encountered before
NASA Technical Reports Server (NTRS)
Reddy, T. S. R.; Srivastava, R.
1996-01-01
This guide describes the input data required for using the MSAP2D (Multi Stage Aeroelastic analysis Program - Two Dimensional) computer code. MSAP2D can be used for steady and unsteady aerodynamic and aeroelastic (flutter and forced response) analysis of bladed disks arranged in multiple blade rows, such as those found in compressors, turbines, counter-rotating propellers, or propfans. The code can also be run for a single blade row. The MSAP2D code is an extension of the original NPHASE code for multi-blade-row aerodynamic and aeroelastic analysis. Euler equations are used to obtain aerodynamic forces. The structural dynamic equations are written for a rigid typical section undergoing pitching (torsion) and plunging (bending) motion. The aeroelastic equations are solved in the time domain. For single-blade-row analysis, a frequency-domain analysis is also provided to obtain the unsteady aerodynamic coefficients required in an eigenvalue analysis for flutter. In this manual, sample input and output are provided for a single-blade-row example and a two-blade-row example with equal and unequal numbers of blades in the blade rows.
A multiphysics and multiscale software environment for modeling astrophysical systems
NASA Astrophysics Data System (ADS)
Portegies Zwart, Simon; McMillan, Steve; Harfst, Stefan; Groen, Derek; Fujii, Michiko; Nualláin, Breanndán Ó.; Glebbeek, Evert; Heggie, Douglas; Lombardi, James; Hut, Piet; Angelou, Vangelis; Banerjee, Sambaran; Belkus, Houria; Fragos, Tassos; Fregeau, John; Gaburov, Evghenii; Izzard, Rob; Jurić, Mario; Justham, Stephen; Sottoriva, Andrea; Teuben, Peter; van Bever, Joris; Yaron, Ofer; Zemp, Marcel
2009-05-01
We present MUSE, a software framework for combining existing computational tools for different astrophysical domains into a single multiphysics, multiscale application. MUSE facilitates the coupling of existing codes written in different languages by providing inter-language tools and by specifying an interface between each module and the framework that represents a balance between generality and computational efficiency. This approach allows scientists to use combinations of codes to solve highly coupled problems without the need to write new codes for other domains or significantly alter their existing codes. MUSE currently incorporates the domains of stellar dynamics, stellar evolution and stellar hydrodynamics for studying generalized stellar systems. We have now reached a "Noah's Ark" milestone, with (at least) two available numerical solvers for each domain. MUSE can treat multiscale and multiphysics systems in which the time- and size-scales are well separated, like simulating the evolution of planetary systems, small stellar associations, dense stellar clusters, galaxies and galactic nuclei. In this paper we describe three examples calculated using MUSE: the merger of two galaxies, the merger of two evolving stars, and a hybrid N-body simulation. In addition, we demonstrate an implementation of MUSE on a distributed computer which may also include special-purpose hardware, such as GRAPEs or GPUs, to accelerate computations. The current MUSE code base is publicly available as open source at http://muse.li.
NASA Astrophysics Data System (ADS)
Harfst, S.; Portegies Zwart, S.; McMillan, S.
2008-12-01
We present MUSE, a software framework for combining existing computational tools from different astrophysical domains into a single multi-physics, multi-scale application. MUSE facilitates the coupling of existing codes written in different languages by providing inter-language tools and by specifying an interface between each module and the framework that represents a balance between generality and computational efficiency. This approach allows scientists to use combinations of codes to solve highly-coupled problems without the need to write new codes for other domains or significantly alter their existing codes. MUSE currently incorporates the domains of stellar dynamics, stellar evolution and stellar hydrodynamics for studying generalized stellar systems. We have now reached a "Noah's Ark" milestone, with (at least) two available numerical solvers for each domain. MUSE can treat multi-scale and multi-physics systems in which the time- and size-scales are well separated, like simulating the evolution of planetary systems, small stellar associations, dense stellar clusters, galaxies and galactic nuclei. In this paper we describe two examples calculated using MUSE: the merger of two galaxies and an N-body simulation with live stellar evolution. In addition, we demonstrate an implementation of MUSE on a distributed computer which may also include special-purpose hardware, such as GRAPEs or GPUs, to accelerate computations. The current MUSE code base is publicly available as open source at http://muse.li.
Exact diagonalization library for quantum electron models
NASA Astrophysics Data System (ADS)
Iskakov, Sergei; Danilov, Michael
2018-04-01
We present an exact diagonalization C++ template library (EDLib) for solving quantum electron models, including the single-band finite Hubbard cluster and the multi-orbital impurity Anderson model. The observables that can be computed using EDLib are single particle Green's functions and spin-spin correlation functions. This code provides three different types of Hamiltonian matrix storage that can be chosen based on the model.
Validation and Performance Comparison of Numerical Codes for Tsunami Inundation
NASA Astrophysics Data System (ADS)
Velioglu, D.; Kian, R.; Yalciner, A. C.; Zaytsev, A.
2015-12-01
In inundation zones, tsunami motion turns from wave motion into overland flow of water. Modelling this phenomenon is a complex problem, since many parameters affect the tsunami flow. In this respect, the performance of numerical codes that analyze tsunami inundation patterns becomes important. Computing the water surface elevation alone is not sufficient for proper analysis of tsunami behaviour in shallow water zones and on land, and hence for the development of mitigation strategies; velocity and velocity patterns are also crucial parameters and have to be computed with the highest accuracy. Numerous numerical codes can be used for simulating tsunami inundation. In this study, the FLOW-3D and NAMI DANCE codes are selected for validation and performance comparison. FLOW-3D simulates linear and nonlinear propagating surface waves as well as long waves by solving the three-dimensional Navier-Stokes (3D-NS) equations, and is used specifically for flood problems. NAMI DANCE uses a finite-difference method to solve the linear and nonlinear forms of the shallow water equations (NSWE) for long wave problems, specifically tsunamis. In this study, these codes are validated and their performances are compared using two benchmark problems discussed at the 2015 National Tsunami Hazard Mitigation Program (NTHMP) annual meeting in Portland, USA. One problem is an experiment of a single long-period wave propagating up a piecewise linear slope and onto a small-scale model of the town of Seaside, Oregon. The other is an experiment of a single solitary wave propagating up a triangular shelf with an island feature located at the offshore point of the shelf. The computed water surface elevation and velocity data are compared with the measured data. The comparisons showed that both codes are in fairly good agreement with each other and with the benchmark data. All results are presented with discussions and comparisons. The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement No 603839 (Project ASTARTE - Assessment, Strategy and Risk Reduction for Tsunamis in Europe).
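To indicate what the NSWE-based approach computes at its simplest, here is a toy 1D linearised shallow-water solver on a staggered finite-difference grid; it is a didactic analogue, not NAMI DANCE, and all parameters are illustrative.

```python
import numpy as np

g, depth = 9.81, 50.0                # gravity, still-water depth (m)
nx, dx = 400, 100.0                  # grid points, spacing (m)
dt = 0.5 * dx / np.sqrt(g * depth)   # CFL-limited time step

eta = np.exp(-((np.arange(nx) * dx - 20e3) / 2e3) ** 2)  # initial hump (m)
u = np.zeros(nx + 1)                                     # face velocities

for _ in range(500):
    # momentum: du/dt = -g d(eta)/dx ; continuity: d(eta)/dt = -depth du/dx
    u[1:-1] -= dt * g * (eta[1:] - eta[:-1]) / dx
    eta -= dt * depth * (u[1:] - u[:-1]) / dx

print("max free-surface elevation after run: %.3f m" % eta.max())
```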
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walker, Andrew; Lawrence, Earl
The Response Surface Modeling (RSM) Tool Suite is a collection of three codes used to generate an empirical interpolation function for a collection of drag coefficient calculations computed with Test Particle Monte Carlo (TPMC) simulations. The first code, "Automated RSM", automates the generation of a drag coefficient RSM for a particular object to a single command. "Automated RSM" first creates a Latin Hypercube Sample (LHS) of 1,000 ensemble members to explore the global parameter space. For each ensemble member, a TPMC simulation is performed and the object drag coefficient is computed. In the next step of the "Automated RSM" code, a Gaussian process is used to fit the TPMC simulations. In the final step, Markov Chain Monte Carlo (MCMC) is used to evaluate the non-analytic probability distribution function from the Gaussian process. The second code, "RSM Area", creates a look-up table for the projected area of the object based on input limits on the minimum and maximum allowed pitch and yaw angles and pitch and yaw angle intervals. The projected area from the look-up table is used to compute the ballistic coefficient of the object based on its pitch and yaw angle. An accurate ballistic coefficient is crucial in accurately computing the drag on an object. The third code, "RSM Cd", uses the RSM generated by the "Automated RSM" code and the projected area look-up table generated by the "RSM Area" code to accurately compute the drag coefficient and ballistic coefficient of the object. The user can modify the object velocity, object surface temperature, the translational temperature of the gas, the species concentrations of the gas, and the pitch and yaw angles of the object. Together, these codes allow for the accurate derivation of an object's drag coefficient and ballistic coefficient under any conditions with only knowledge of the object's geometry and mass.
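The three-step pipeline (LHS ensemble, Gaussian-process fit, cheap surrogate evaluation) can be sketched with standard libraries. The drag function below is a hypothetical stand-in for a TPMC simulation, and the parameter ranges are invented for illustration:

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical stand-in for the expensive TPMC drag-coefficient computation:
def tpmc_drag(x):                        # x = (speed ratio, temperature ratio)
    s, tr = x[..., 0], x[..., 1]
    return 2.2 + 0.8 * np.exp(-s) + 0.1 * tr

# Step 1: Latin Hypercube Sample of the parameter space (1,000 members in
# the real suite; fewer here to keep the example quick).
sampler = qmc.LatinHypercube(d=2, seed=1)
X = qmc.scale(sampler.random(n=200), [1.0, 0.1], [10.0, 2.0])
y = tpmc_drag(X)

# Step 2: fit a Gaussian-process response surface to the ensemble.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=[2.0, 0.5])).fit(X, y)

# Step 3: cheap surrogate evaluations wherever the drag is needed.
cd, sigma = gp.predict(np.array([[5.0, 1.0]]), return_std=True)
print("Cd = %.3f +/- %.3f" % (cd[0], sigma[0]))
```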
Nonlinear 3D visco-resistive MHD modeling of fusion plasmas: a comparison between numerical codes
NASA Astrophysics Data System (ADS)
Bonfiglio, D.; Chacon, L.; Cappello, S.
2008-11-01
Fluid plasma models (and, in particular, the MHD model) are extensively used in the theoretical description of laboratory and astrophysical plasmas. We present here a successful benchmark between two nonlinear, three-dimensional, compressible visco-resistive MHD codes. One is the fully implicit, finite volume code PIXIE3D [1,2], which is characterized by many attractive features, notably the generalized curvilinear formulation (which makes the code applicable to different geometries) and the possibility to include in the computation the energy transport equation and the extended MHD version of Ohm's law. In addition, the parallel version of the code features excellent scalability properties. Results from this code, obtained in cylindrical geometry, are compared with those produced by the semi-implicit cylindrical code SpeCyl, which uses finite differences radially, and spectral formulation in the other coordinates [3]. Both single and multi-mode simulations are benchmarked, regarding both reversed field pinch (RFP) and ohmic tokamak magnetic configurations. [1] L. Chacon, Computer Physics Communications 163, 143 (2004). [2] L. Chacon, Phys. Plasmas 15, 056103 (2008). [3] S. Cappello, Plasma Phys. Control. Fusion 46, B313 (2004) & references therein.
Kwag, Jeehyun; Jang, Hyun Jae; Kim, Mincheol; Lee, Sujeong
2014-01-01
Rate and phase codes are believed to be important in neural information processing. Hippocampal place cells provide a good example where both coding schemes coexist during spatial information processing: spike rate increases in the place field, whereas spike phase precesses relative to the ongoing theta oscillation. However, the intrinsic mechanism that allows a single neuron to generate spike output patterns containing both neural codes is unknown. Using dynamic clamp, we impose in vivo-like subthreshold dynamics of place cells on in vitro CA1 pyramidal neurons to establish an in vitro model of spike phase precession. Using this in vitro model, we show that membrane potential oscillation (MPO) dynamics is important in the emergence of spike phase codes: blocking the slowly activating, non-inactivating K+ current (IM), which is known to control subthreshold MPO, disrupts MPO and abolishes spike phase precession. We verify the importance of adaptive IM in the generation of phase codes using both an adaptive integrate-and-fire and a Hodgkin-Huxley (HH) neuron model. In particular, using the HH model, we show that it is the perisomatically located IM with slow activation kinetics that is crucial for the generation of phase codes. These results suggest an important functional role of IM in single-neuron computation, where IM serves as an intrinsic mechanism allowing dual rate and phase coding in single neurons.
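A minimal version of the adaptive integrate-and-fire ingredient is easy to write down. In the sketch below, a slow adaptation variable w plays the role of the IM-like current; parameters are illustrative rather than fitted to CA1 data.

```python
import numpy as np

dt, T = 0.1, 1000.0                      # time step and duration (ms)
t = np.arange(0.0, T, dt)
tau_m, tau_w = 20.0, 150.0               # membrane / adaptation time constants
v_rest, v_thresh, v_reset = -70.0, -50.0, -65.0
a, b = 0.2, 2.0                          # subthreshold and spike-triggered adaptation

I = 22.0 + 3.0 * np.sin(2 * np.pi * 8e-3 * t)   # theta-modulated drive (8 Hz)
v, w = v_rest, 0.0
spikes = []
for i, ti in enumerate(t):
    dv = (-(v - v_rest) - w + I[i]) / tau_m      # leaky integration minus adaptation
    dw = (a * (v - v_rest) - w) / tau_w          # slow, voltage-dependent adaptation
    v += dt * dv
    w += dt * dw
    if v >= v_thresh:                            # spike: reset and boost adaptation
        spikes.append(ti)
        v = v_reset
        w += b
print("spikes:", len(spikes), "first few times (ms):", np.round(spikes[:5], 1))
```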
Silicon CMOS architecture for a spin-based quantum computer.
Veldhorst, M; Eenink, H G J; Yang, C H; Dzurak, A S
2017-12-15
Recent advances in quantum error correction codes for fault-tolerant quantum computing and physical realizations of high-fidelity qubits in multiple platforms give promise for the construction of a quantum computer based on millions of interacting qubits. However, the classical-quantum interface remains a nascent field of exploration. Here, we propose an architecture for a silicon-based quantum computer processor based on complementary metal-oxide-semiconductor (CMOS) technology. We show how a transistor-based control circuit together with charge-storage electrodes can be used to operate a dense and scalable two-dimensional qubit system. The qubits are defined by the spin state of a single electron confined in quantum dots, coupled via exchange interactions, controlled using a microwave cavity, and measured via gate-based dispersive readout. We implement a spin qubit surface code, showing the prospects for universal quantum computation. We discuss the challenges and focus areas that need to be addressed, providing a path for large-scale quantum computing.
Multi-dimensional computer simulation of MHD combustor hydrodynamics
NASA Astrophysics Data System (ADS)
Berry, G. F.; Chang, S. L.; Lottes, S. A.; Rimkus, W. A.
1991-04-01
Argonne National Laboratory is investigating the nonreacting jet gas mixing patterns in an MHD second stage combustor by using a 2-D multiphase hydrodynamics computer program and a 3-D single phase hydrodynamics computer program. The computer simulations are intended to enhance the understanding of flow and mixing patterns in the combustor, which in turn may lead to improvement of the downstream MHD channel performance. A 2-D steady state computer model, based on mass and momentum conservation laws for multiple gas species, is used to simulate the hydrodynamics of the combustor in which a jet of oxidizer is injected into an unconfined cross stream gas flow. A 3-D code is used to examine the effects of the side walls and the distributed jet flows on the non-reacting jet gas mixing patterns. The code solves the conservation equations of mass, momentum, and energy, and a transport equation of a turbulence parameter and allows permeable surfaces to be specified for any computational cell.
Hybrid reduced order modeling for assembly calculations
Bang, Youngsuk; Abdel-Khalik, Hany S.; Jessee, Matthew A.; ...
2015-08-14
While the accuracy of assembly calculations has greatly improved due to the increase in computer power enabling more refined description of the phase space and use of more sophisticated numerical algorithms, the computational cost continues to increase, which limits the full utilization of their effectiveness for routine engineering analysis. Reduced order modeling is a mathematical vehicle that scales down the dimensionality of large-scale numerical problems to enable their repeated executions on small computing environments, often available to end users. This is done by capturing the most dominant underlying relationships between the model's inputs and outputs. Previous works demonstrated the use of reduced order modeling for a single-physics code, such as a radiation transport calculation. This paper extends those works to coupled code systems as currently employed in assembly calculations. Finally, numerical tests are conducted using realistic SCALE assembly models with resonance self-shielding, neutron transport, and nuclide transmutation/depletion models representing the components of the coupled code system.
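The paper's coupled-code machinery is elaborate, but the core idea of reduced order modeling, capturing dominant input-output relationships in a low-dimensional basis, can be sketched with a standard POD/SVD recipe (a generic illustration, not the SCALE implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def full_model(p):                       # expensive stand-in: 1000 outputs
    x = np.linspace(0, 1, 1000)
    return p[0] * np.sin(np.pi * x) + p[1] * x**2 + 0.01 * p[0] * p[1] * x

# Run the full model a few times and collect output snapshots as columns.
snapshots = np.stack([full_model(rng.uniform(0, 1, 2)) for _ in range(50)], axis=1)

# Keep the top-r singular vectors as a reduced basis.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
r = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.9999) + 1
basis = U[:, :r]                         # r << 1000

# A new full-model run is well approximated in the low-dimensional subspace:
y = full_model(np.array([0.3, 0.7]))
err = np.linalg.norm(y - basis @ (basis.T @ y)) / np.linalg.norm(y)
print("rank:", r, "relative projection error: %.2e" % err)
```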
NASA Astrophysics Data System (ADS)
Cassan, Arnaud
2017-07-01
The exoplanet detection rate from gravitational microlensing has grown significantly in recent years thanks to a great enhancement of resources and improved observational strategy. Current observatories include ground-based wide-field and/or robotic world-wide networks of telescopes, as well as space-based observatories such as the satellites Spitzer and Kepler/K2. This results in a large quantity of data to be processed and analysed, which is a challenge for modelling codes because of the complexity of the parameter space to be explored and the intensive computations required to evaluate the models. In this work, I present a method to compute the quadrupole and hexadecapole approximations of the finite-source magnification more efficiently than previously available codes, with routines about six times and four times faster, respectively. The quadrupole takes just about twice the time of a point-source evaluation, which advocates for generalizing its use to large portions of the light curves. The corresponding routines are available as open-source Python codes.
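For orientation, the quantity being approximated is the finite-source magnification: the average of the point-source point-lens magnification A(u) over the source disk. The brute-force average below defines that target exactly; the paper's quadrupole and hexadecapole routines approximate it with only a handful of evaluations of A.

```python
import numpy as np

def A_point(u):
    """Point-source point-lens magnification A(u) = (u^2+2)/(u*sqrt(u^2+4))."""
    return (u**2 + 2) / (u * np.sqrt(u**2 + 4))

def A_finite_mc(u, rho, n=200_000, seed=0):
    """Uniform-brightness finite-source magnification: the average of A over
    a source disk of radius rho (in Einstein radii) centred at separation u.
    Brute force on purpose -- this is the integral that the quadrupole and
    hexadecapole expansions approximate cheaply."""
    rng = np.random.default_rng(seed)
    r = rho * np.sqrt(rng.random(n))       # uniform sampling over the disk
    th = 2 * np.pi * rng.random(n)
    u_pts = np.hypot(u + r * np.cos(th), r * np.sin(th))
    return A_point(u_pts).mean()

print(A_point(0.1))            # ~10.04
print(A_finite_mc(0.1, 0.05))  # small finite-source correction
```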
Improved Boundary Layer Module (BLM) for the Solid Performance Program (SPP)
NASA Astrophysics Data System (ADS)
Coats, D. E.; Cebeci, T.
1982-03-01
The requirements for a replacement for the Bartz boundary layer code, the standard method used by the Solid Performance Program to compute the performance loss due to viscous effects, were discussed by the propulsion community along with four nationally recognized boundary layer experts. A consensus was reached regarding the preferred features of the replacement code. The major points agreed upon are: (1) finite-difference methods are preferred over integral methods; (2) a single-equation eddy viscosity model is adequate for the purpose of computing performance loss; (3) a variable grid capability in both coordinate directions is required; (4) a proven finite-difference algorithm that is not stability-restricted, that is, an implicit numerical scheme, should be used; and (5) the replacement code should be able to compute both turbulent and laminar flows. The program should treat mass addition at the wall and be able to calculate a stagnation point starting line.
Mathematical Description of Complex Chemical Kinetics and Application to CFD Modeling Codes
NASA Technical Reports Server (NTRS)
Bittker, D. A.
1993-01-01
A major effort in combustion research at the present time is devoted to the theoretical modeling of practical combustion systems. These include turbojet and ramjet air-breathing engines as well as ground-based gas-turbine power generating systems. The ability to use computational modeling extensively in designing these products not only saves time and money, but also helps designers meet the quite rigorous environmental standards that have been imposed on all combustion devices. The goal is to combine the very complex solution of the Navier-Stokes flow equations with realistic turbulence and heat-release models into a single computer code. Such a computational fluid-dynamic (CFD) code simulates the coupling of fluid mechanics with the chemistry of combustion to describe the practical devices. This paper will focus on the task of developing a simplified chemical model which can predict realistic heat-release rates as well as species composition profiles, and is also computationally rapid. We first discuss the mathematical techniques used to describe a complex, multistep fuel oxidation chemical reaction and develop a detailed mechanism for the process. We then show how this mechanism may be reduced and simplified to give an approximate model which adequately predicts heat release rates and a limited number of species composition profiles, but is computationally much faster than the original one. Only such a model can be incorporated into a CFD code without adding significantly to long computation times. Finally, we present some of the recent advances in the development of these simplified chemical mechanisms.
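As a minimal illustration of what a reduced mechanism looks like in practice, the sketch below integrates a one-step global reaction with an Arrhenius rate; the constants are illustrative, not a validated model from the article.

```python
import numpy as np
from scipy.integrate import solve_ivp

# One-step global reaction "Fuel + Oxidizer -> Products" with an Arrhenius
# rate: the simplest possible reduced mechanism (all constants illustrative).
A, Ea, R = 1.0e9, 1.2e5, 8.314       # pre-exponential, activation energy (J/mol)
q, cp = 2.5e6, 1200.0                # heat release (J/kg fuel), heat capacity

def rhs(t, y):
    Yf, T = y                        # fuel mass fraction, temperature (K)
    rate = A * Yf * np.exp(-Ea / (R * T))
    return [-rate, q * rate / cp]    # fuel consumption and heat release

sol = solve_ivp(rhs, [0.0, 2.0], [0.05, 1200.0], method="LSODA", rtol=1e-8)
print("final T = %.0f K, fuel left = %.2e" % (sol.y[1, -1], sol.y[0, -1]))
```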
Gate sequence for continuous variable one-way quantum computation
Su, Xiaolong; Hao, Shuhong; Deng, Xiaowei; Ma, Lingyu; Wang, Meihong; Jia, Xiaojun; Xie, Changde; Peng, Kunchi
2013-01-01
Measurement-based one-way quantum computation using cluster states as resources provides an efficient model for performing computation and quantum information processing. Arbitrary Gaussian quantum computation can be implemented by sufficiently long single-mode and two-mode gate sequences. However, continuous variable gate sequences have not been realized so far, due to the absence of cluster states larger than four submodes. Here we present the first continuous variable gate sequence, consisting of a single-mode squeezing gate and a two-mode controlled-phase gate, based on a six-mode cluster state. The quantum property of this gate sequence is confirmed by the fidelities and the quantum entanglement of the two output modes, which depend on both the squeezing and controlled-phase gates. The experiment demonstrates the feasibility of implementing Gaussian quantum computation by means of accessible gate sequences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bolstad, J.W.; Haarman, R.A.
The results of two transients involving the loss of a steam generator in a single-pass steam generator pressurized water reactor have been analyzed using a state-of-the-art thermal-hydraulic computer code. Computed results include the formation of a steam bubble in the core while the pressurizer is solid. Calculations show that continued injection of high-pressure water would have stopped the scenario. These events are similar to those that occurred at Three Mile Island.
Todo, A S; Hiromoto, G; Turner, J E; Hamm, R N; Wright, H A
1982-12-01
Previous calculations of the initial energies of electrons produced in water irradiated by photons are extended to 1 GeV by including pair and triplet production. Calculations were performed with the Monte Carlo computer code PHOEL-3, which replaces the earlier code PHOEL-2. Tables of initial electron energies are presented for single interactions of monoenergetic photons at a number of energies from 10 keV to 1 GeV. These tables can be used to compute kerma in water irradiated by photons with arbitrary energy spectra up to 1 GeV. In addition, separate tables of Compton- and pair-electron spectra are given over this energy range. The code PHOEL-3 is available from the Radiation Shielding Information Center, Oak Ridge National Laboratory, Oak Ridge, TN 37830.
Majorana fermion surface code for universal quantum computation
Vijay, Sagar; Hsieh, Timothy H.; Fu, Liang
2015-12-10
In this study, we introduce an exactly solvable model of interacting Majorana fermions realizing Z2 topological order with a Z2 fermion parity grading and lattice symmetries permuting the three fundamental anyon types. We propose a concrete physical realization by utilizing quantum phase slips in an array of Josephson-coupled mesoscopic topological superconductors, which can be implemented in a wide range of solid-state systems, including topological insulators, nanowires, or two-dimensional electron gases, proximitized by s-wave superconductors. Our model finds a natural application as a Majorana fermion surface code for universal quantum computation, with a single-step stabilizer measurement requiring no physical ancilla qubits, increased error tolerance, and simpler logical gates than a surface code with bosonic physical qubits. We thoroughly discuss protocols for stabilizer measurements, encoding and manipulating logical qubits, and gate implementations.
Space shuttle rendezvous, radiation and reentry analysis code
NASA Technical Reports Server (NTRS)
Mcglathery, D. M.
1973-01-01
A preliminary space shuttle mission design and analysis tool is reported, emphasizing versatility, flexibility, and user interaction through the use of a relatively small computer (IBM-7044). The Space Shuttle Rendezvous, Radiation and Reentry Analysis Code is used to perform mission and space radiation environment analyses for four typical space shuttle missions, including a version of the proposed Apollo/Soyuz rendezvous and docking test mission. Tangential-steering, circle-to-circle low-thrust tug orbit raising and the effects of the trapped radiation environment on trajectory shaping, due to solar electric power losses, are also features of this mission analysis code. The computational results include a parametric study of single-impulse versus double-impulse deorbiting for relatively low space shuttle orbits, as well as definitive data on the magnetically trapped protons and electrons encountered on a particular mission.
CREME: The 2011 Revision of the Cosmic Ray Effects on Micro-Electronics Code
NASA Technical Reports Server (NTRS)
Adams, James H., Jr.; Barghouty, Abdulnasser F.; Reed, Robert A.; Sierawski, Brian D.; Watts, John W., Jr.
2012-01-01
We describe a tool suite, CREME, which combines existing capabilities of CREME96 and CREME86 with new radiation environment models and new Monte Carlo computational capabilities for single event effects and total ionizing dose.
Ditlevsen, Susanne; Lansky, Petr
2016-06-01
This Special Issue of Mathematical Biosciences and Engineering contains 11 selected papers presented at the Neural Coding 2014 workshop. The workshop was held in the royal city of Versailles in France, October 6-10, 2014. This was the 11th of a series of international workshops on this subject, the first held in Prague (1995), then Versailles (1997), Osaka (1999), Plymouth (2001), Aulla (2003), Marburg (2005), Montevideo (2007), Tainan (2009), Limassol (2010), and again in Prague (2012). Selected papers from Prague were also published as a special issue of Mathematical Biosciences and Engineering, and in this way a tradition was started. Similarly to the previous workshops, this was a single-track multidisciplinary event bringing together experimental and computational neuroscientists. The Neural Coding Workshops are traditionally biennial symposia. They are relatively small in size, interdisciplinary, with major emphasis on the search for common principles in neural coding. The workshop was conceived to bring together scientists from different disciplines for an in-depth discussion of mathematical model-building and computational strategies. Further information on the meeting can be found at the NC2014 website at https://colloque6.inra.fr/neural_coding_2014. The meeting was supported by the French National Institute for Agricultural Research, the world's leading institution in this field. Understanding how the brain processes information is one of the most challenging subjects in neuroscience. The papers presented in this special issue show a small corner of the huge diversity of this field, and illustrate how scientists with different backgrounds approach this vast subject. The diversity of disciplines engaged in these investigations is remarkable: biologists, mathematicians, physicists, psychologists, computer scientists, and statisticians all have original tools and ideas by which to try to elucidate the underlying mechanisms. In this issue, emphasis is put on mathematical modeling of single neurons.
A variety of problems in computational neuroscience, accompanied by a rich diversity of mathematical tools and approaches, are presented. We hope it will inspire and challenge the readers in their own research. We would like to thank the authors for their valuable contributions and the referees for their priceless effort of reviewing the manuscripts. Finally, we would like to thank Yang Kuang for supporting us and making this publication possible.
Efficient full wave code for the coupling of large multirow multijunction LH grills
NASA Astrophysics Data System (ADS)
Preinhaelter, Josef; Hillairet, Julien; Milanesio, Daniele; Maggiora, Riccardo; Urban, Jakub; Vahala, Linda; Vahala, George
2017-11-01
The full wave code OLGA, for determining the coupling of a single-row lower hybrid launcher (waveguide grill) to the plasma, is extended to handle multirow multijunction active-passive structures (like the C3 and C4 launchers on TORE SUPRA) by implementing the scattering matrix formalism. The extended code remains computationally fast because of (i) 2D splines of the plasma surface admittance in the accessibility region of k-space, (ii) high-order Gaussian quadrature rules for the integration of the coupling elements, and (iii) the symmetries of the coupling elements in the multiperiodic structures. The extended OLGA code is benchmarked against the ALOHA-1D, ALOHA-2D, and TOPLHA codes for the coupling of the C3 and C4 TORE SUPRA launchers for several plasma configurations derived from reflectometry and interferometry. Unlike nearly all such codes (except ALOHA-1D), OLGA does not require large computational resources and can be used for everyday planning of experimental runs. In particular, it is shown that OLGA correctly handles the coupling of the C3 and C4 launchers over a very wide range of plasma densities in front of the grill.
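Of the three ingredients, the high-order Gaussian quadrature is the easiest to illustrate; NumPy supplies Gauss-Legendre nodes and weights directly (a generic sketch, not OLGA's actual coupling integrals):

```python
import numpy as np

def gauss_integrate(f, a, b, order=16):
    """Integrate f over [a, b] with an order-point Gauss-Legendre rule."""
    x, w = np.polynomial.legendre.leggauss(order)  # nodes/weights on [-1, 1]
    xm, xr = 0.5 * (a + b), 0.5 * (b - a)          # affine map to [a, b]
    return xr * np.sum(w * f(xm + xr * x))

# A smooth oscillatory integrand of the kind met in coupling integrals:
v16 = gauss_integrate(lambda t: np.cos(10 * t) * np.exp(-t), 0.0, 1.0)
v64 = gauss_integrate(lambda t: np.cos(10 * t) * np.exp(-t), 0.0, 1.0, order=64)
print(v16, abs(v16 - v64))  # 16 nodes already agree with 64 to ~machine precision
```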
User's manual for PANDA II: A computer code for calculating equations of state
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerley, G.I.
1991-07-18
PANDA is an interactive computer code that is used to compute equations of state (EOS) for many classes of materials over a wide range of densities and temperatures. The first step in the development of a general EOS model is to determine the EOS for a one-component system, consisting of a single solid or fluid phase and a single chemical species. The results of several such calculations can then be combined to construct EOS for multiphase and multicomponent systems. For one-component solids and fluids, PANDA offers a variety of options for modeling various contributions to the EOS: the zero-Kelvin isotherm, lattice vibrations, fluid degrees of freedom, thermal electronic excitation and ionization, and molecular vibrational and rotational degrees of freedom. Two options are available for computing EOS for multicomponent systems from separate EOS for the individual species and phases. The phase transition model is used for a system of immiscible phases, each having the same chemical composition. In the mixture model, the components can be either miscible or immiscible and can have different chemical compositions; mixtures can be either inert or reactive. PANDA provides over 50 commands that are used to define the EOS models, to make calculations and compare the models to experimental data, and to generate and maintain tabular EOS libraries for use in hydrocodes and other applications. Versions of the code are available for the Cray (UNICOS and CTSS), SUN (UNIX), and VAX (VMS) machines, and a small version is available for personal computers (DOS). This report describes the EOS models, use of the commands, and several sample problems. 92 refs., 7 figs., 10 tabs.
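The decomposition of an EOS into a cold curve plus thermal contributions can be sketched in a few lines; the Mie-Gruneisen form below is a generic textbook model, not PANDA's actual implementation, and all material constants are placeholders.

    import numpy as np

    # Mie-Gruneisen-style EOS: P(rho, e) = P_cold(rho) + Gamma * rho * (e - e_cold(rho)).
    # Placeholder constants (illustrative, not fit to any real material).
    RHO0 = 2700.0   # reference density, kg/m^3
    K0 = 76.0e9     # bulk modulus, Pa
    GAMMA = 2.0     # Gruneisen parameter (taken constant here for simplicity)

    def p_cold(rho):
        # Simple zero-Kelvin isotherm: pressure linear in compression.
        return K0 * (rho / RHO0 - 1.0)

    def e_cold(rho):
        # Cold energy from integrating P_cold along the isotherm.
        x = rho / RHO0
        return (K0 / RHO0) * (np.log(x) + 1.0 / x - 1.0)

    def pressure(rho, e):
        return p_cold(rho) + GAMMA * rho * (e - e_cold(rho))

    print(pressure(2900.0, 5.0e4))  # Pa, for a mildly compressed, warm state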
Adaptive EAGLE dynamic solution adaptation and grid quality enhancement
NASA Technical Reports Server (NTRS)
Luong, Phu Vinh; Thompson, J. F.; Gatlin, B.; Mastin, C. W.; Kim, H. J.
1992-01-01
In the effort described here, the elliptic grid generation procedure in the EAGLE grid code was separated from the main code into a subroutine, and a new subroutine which evaluates several grid quality measures at each grid point was added. The elliptic grid routine can now be called, either by a computational fluid dynamics (CFD) code to generate a new adaptive grid based on flow variables and quality measures through multiple adaptation, or by the EAGLE main code to generate a grid based on quality measure variables through static adaptation. Arrays of flow variables can be read into the EAGLE grid code for use in static adaptation as well. These major changes in the EAGLE adaptive grid system make it easier to convert any CFD code that operates on a block-structured grid (or single-block grid) into a multiple adaptive code.
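Weight-driven grid adaptation of this kind rests on the equidistribution idea: cluster points where a quality or flow-variable weight is large. A minimal 1D sketch (a standard adaptation step, not the EAGLE elliptic solver; the weight function is a mock flow-gradient sensor):

    import numpy as np

    def equidistribute(x, w):
        """Redistribute grid points x so the weight w is equidistributed,
        i.e., each cell carries the same integral of w."""
        # Cumulative "adaptation mass" along the current grid (trapezoid rule).
        mass = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * np.diff(x))))
        targets = np.linspace(0.0, mass[-1], len(x))
        return np.interp(targets, mass, x)

    x = np.linspace(0.0, 1.0, 41)
    w = 1.0 + 50.0 * np.exp(-200.0 * (x - 0.6) ** 2)  # mock flow-gradient sensor
    x_new = equidistribute(x, w)
    print(np.diff(x_new).min(), np.diff(x_new).max())  # fine spacing near x = 0.6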
Deploying electromagnetic particle-in-cell (EM-PIC) codes on Xeon Phi accelerators boards
NASA Astrophysics Data System (ADS)
Fonseca, Ricardo
2014-10-01
The complexity of the phenomena involved in several relevant plasma physics scenarios, where highly nonlinear and kinetic processes dominate, makes purely theoretical descriptions impossible. Further understanding of these scenarios requires detailed numerical modeling, but fully relativistic particle-in-cell codes such as OSIRIS are computationally intensive. The quest for Exaflop computer systems has led to the development of HPC systems based on add-on accelerator cards, such as GPGPUs and, more recently, the Xeon Phi accelerators that power the current number 1 system in the world. These cards, also referred to as the Intel Many Integrated Core (MIC) architecture, offer peak theoretical performances of >1 TFlop/s for general-purpose calculations on a single board, and are receiving significant attention as an attractive alternative to CPUs for plasma modeling. In this work we report on our efforts towards the deployment of an EM-PIC code on a Xeon Phi architecture system. We focus on the parallelization and vectorization strategies followed, and present a detailed evaluation of code performance in comparison with the CPU code.
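Vectorizing the particle push is the kind of strategy at issue; a minimal electrostatic sketch in whole-array form (not OSIRIS code; the 1D grid, units, and linear field gather are simplifying assumptions):

    import numpy as np

    def push_particles(x, v, e_grid, x_grid, dt, qm=1.0):
        """One leapfrog step for all particles at once (vectorized).
        Fields are gathered with linear interpolation, as in first-order PIC."""
        e_at_particles = np.interp(x, x_grid, e_grid)  # gather
        v_new = v + qm * e_at_particles * dt           # accelerate
        x_new = x + v_new * dt                         # drift
        return x_new % x_grid[-1], v_new               # periodic domain

    rng = np.random.default_rng(0)
    x_grid = np.linspace(0.0, 1.0, 65)
    e_grid = 0.1 * np.sin(2.0 * np.pi * x_grid)        # mock electric field
    x = rng.uniform(0.0, 1.0, 100_000)
    v = rng.normal(0.0, 0.05, 100_000)
    for _ in range(100):
        x, v = push_particles(x, v, e_grid, x_grid, dt=0.01)
    print(x[:3], v[:3])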
Hierarchical surface code for network quantum computing with modules of arbitrary size
NASA Astrophysics Data System (ADS)
Li, Ying; Benjamin, Simon C.
2016-10-01
The network paradigm for quantum computing involves interconnecting many modules to form a scalable machine. Typically it is assumed that the links between modules are prone to noise while operations within modules have a significantly higher fidelity. To optimize fault tolerance in such architectures we introduce a hierarchical generalization of the surface code: a small "patch" of the code exists within each module and constitutes a single effective qubit of the logic-level surface code. Errors primarily occur in a two-dimensional subspace, i.e., patch perimeters extruded over time, and the resulting noise threshold for intermodule links can exceed ~10% even in the absence of purification. Increasing the number of qubits within each module decreases the number of qubits necessary for encoding a logical qubit. But this advantage is relatively modest and, broadly speaking, a "fine-grained" network of small modules containing only about eight qubits is competitive in total qubit count with a "coarse" network whose modules contain many hundreds of qubits.
Spin-based quantum computation in multielectron quantum dots
NASA Astrophysics Data System (ADS)
Hu, Xuedong; Das Sarma, S.
2001-10-01
In a quantum computer the hardware and software are intrinsically connected because the quantum Hamiltonian (or more precisely its time development) is the code that runs the computer. We demonstrate this subtle and crucial relationship by considering the example of an electron-spin-based solid-state quantum computer in semiconductor quantum dots. We show that multielectron quantum dots with one valence electron in the outermost shell do not behave simply as an effective single-spin system unless special conditions are satisfied. Our work compellingly demonstrates that a delicate synergy between theory and experiment (between software and hardware) is essential for constructing a quantum computer.
GAPD: a GPU-accelerated atom-based polychromatic diffraction simulation code.
E, J C; Wang, L; Chen, S; Zhang, Y Y; Luo, S N
2018-03-01
GAPD, a graphics-processing-unit (GPU)-accelerated atom-based polychromatic diffraction simulation code for direct, kinematics-based simulations of X-ray/electron diffraction of large-scale atomic systems with mono-/polychromatic beams and arbitrary plane detector geometries, is presented. This code implements GPU parallel computation via both real- and reciprocal-space decompositions. With GAPD, direct simulations are performed of the reciprocal lattice node of ultralarge systems (∼5 billion atoms) and of diffraction patterns of single-crystal and polycrystalline configurations with mono- and polychromatic X-ray beams (including synchrotron undulator sources); validation, benchmark, and application cases are presented.
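Kinematic (single-scattering) diffraction reduces to a coherent sum of atomic scattering amplitudes; a tiny sketch of the idea, with unit form factors and a toy cubic crystal (none of this reflects GAPD's actual decomposition scheme):

    import numpy as np

    # Toy simple-cubic crystal, lattice constant a = 1.
    n = 6
    i, j, k = np.mgrid[0:n, 0:n, 0:n]
    positions = np.stack([i, j, k], axis=-1).reshape(-1, 3).astype(float)

    def intensity(q):
        """Kinematic diffraction intensity I(q) = |sum_j f_j exp(i q . r_j)|^2,
        with all atomic form factors f_j = 1 for simplicity."""
        phases = positions @ q
        amplitude = np.exp(1j * phases).sum()
        return np.abs(amplitude) ** 2

    # Scan along q = (h, 0, 0): a peak appears near 2*pi (Bragg condition).
    for h in np.linspace(5.8, 6.8, 6):
        q = np.array([h, 0.0, 0.0])
        print(f"q_x = {h:.2f}  I = {intensity(q):10.1f}")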
A User's Guide to the Zwikker-Kosten Transmission Line Code (ZKTL)
NASA Technical Reports Server (NTRS)
Kelly, J. J.; Abu-Khajeel, H.
1997-01-01
This user's guide documents updates to the Zwikker-Kosten Transmission Line Code (ZKTL). This code was developed for analyzing new liner concepts developed to provide increased sound absorption. Contiguous arrays of multi-degree-of-freedom (MDOF) liner elements serve as the model for these liner configurations, and Zwikker and Kosten's theory of sound propagation in channels is used to predict the surface impedance. Transmission matrices for the various liner elements incorporate both analytical and semi-empirical methods. This allows standard matrix techniques to be employed in the code to systematically calculate the composite impedance due to the individual liner elements. The ZKTL code consists of four independent subroutines:
1. Single channel impedance calculation - linear version (SCIC)
2. Single channel impedance calculation - nonlinear version (SCICNL)
3. Multi-channel, multi-segment, multi-layer impedance calculation - linear version (MCMSML)
4. Multi-channel, multi-segment, multi-layer impedance calculation - nonlinear version (MCMSMLNL)
Detailed examples, comments, and explanations for each liner impedance computation module are included. Also contained in the guide are depictions of the interactive execution, input files, and output files.
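The matrix technique amounts to propagating a terminating impedance back through each channel segment; a minimal sketch of that standard transmission-line recursion (generic acoustics, not the ZKTL source; the segment parameters are placeholders):

    import numpy as np

    def input_impedance(z_load, z_char, gamma, length):
        """Impedance seen at the entrance of one transmission-line segment
        terminated by z_load (standard lossy-line transfer relation)."""
        t = np.tanh(gamma * length)
        return z_char * (z_load + z_char * t) / (z_char + z_load * t)

    # Cascade of channel segments, listed from the rigid backing outward.
    # (z_char, gamma, length) per segment -- illustrative values only.
    segments = [
        (1.2 + 0.1j, 0.05 + 2.0j, 0.02),
        (0.9 + 0.2j, 0.10 + 1.5j, 0.01),
    ]

    z = 1e9 + 0j  # near-rigid backing (very large impedance)
    for z_char, gamma, length in segments:
        z = input_impedance(z, z_char, gamma, length)
    print("normalized surface impedance:", z)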
Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang
2008-01-01
Background: Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large-scale GWAS. Results: The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large-scale GWAS; it achieved excellent scalability for large-scale analysis and portability across various parallel computing platforms. EPISNP is the serial computing program, based on the EPISNPmpi code, for epistasis testing in small-scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion: The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large-scale GWAS, and the EPISNP serial computing programs are convenient tools for epistasis analysis in small-scale GWAS using commonly available computer hardware. PMID:18644146
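A two-locus interaction test of the kind listed can be sketched as a nested-model F-test; the regression formulation below is generic and illustrative, not the extended Kempthorne model used by EPISNP.

    import numpy as np
    from scipy import stats

    def interaction_f_test(g1, g2, y):
        """F-test comparing a main-effects-only model with a model that adds
        the genotype-combination interaction (g1, g2 coded 0/1/2)."""
        def one_hot(g):
            return np.eye(3)[g][:, 1:]  # drop one level to avoid redundancy

        n = len(y)
        intercept = np.ones((n, 1))
        x_main = np.hstack([intercept, one_hot(g1), one_hot(g2)])
        inter = one_hot(g1)[:, :, None] * one_hot(g2)[:, None, :]
        x_full = np.hstack([x_main, inter.reshape(n, -1)])

        def rss(x):
            beta, *_ = np.linalg.lstsq(x, y, rcond=None)
            r = y - x @ beta
            return r @ r

        rss_main, rss_full = rss(x_main), rss(x_full)
        df1, df2 = 4, n - x_full.shape[1]           # 4 interaction parameters
        f = ((rss_main - rss_full) / df1) / (rss_full / df2)
        return f, stats.f.sf(f, df1, df2)

    rng = np.random.default_rng(1)
    g1, g2 = rng.integers(0, 3, 2000), rng.integers(0, 3, 2000)
    y = 0.3 * g1 + 0.5 * (g1 == 2) * (g2 == 2) + rng.normal(size=2000)
    print(interaction_f_test(g1, g2, y))  # small p-value: epistatic signal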
Refactoring the Genetic Code for Increased Evolvability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pines, Gur; Winkler, James D.; Pines, Assaf
2017-11-14
ABSTRACT The standard genetic code is robust to mutations during transcription and translation. Point mutations are likely to be synonymous or to preserve the chemical properties of the original amino acid. Saturation mutagenesis experiments suggest that in some cases the best-performing mutant requires replacement of more than a single nucleotide within a codon. These replacements are essentially inaccessible to common error-based laboratory engineering techniques that alter a single nucleotide per mutation event, due to the extreme rarity of adjacent mutations. In this theoretical study, we suggest a radical reordering of the genetic code that maximizes the mutagenic potential of single-nucleotide replacements. We explore several possible genetic codes that allow a greater degree of accessibility to the mutational landscape and may result in a hyperevolvable organism that could serve as an ideal platform for directed evolution experiments. We then conclude by evaluating the challenges of constructing such recoded organisms and their potential applications within the field of synthetic biology. IMPORTANCE The conservative nature of the genetic code prevents bioengineers from efficiently accessing the full mutational landscape of a gene via common error-prone methods. Here, we present two computational approaches to generate alternative genetic codes with increased accessibility. These new codes allow mutational transitions to a larger pool of amino acids and with a greater extent of chemical differences, based on a single nucleotide replacement within the codon, thus increasing evolvability both at the single-gene and at the genome levels. Given the widespread use of these techniques for strain and protein improvement, along with more fundamental evolutionary biology questions, the use of recoded organisms that maximize evolvability should significantly improve the efficiency of directed evolution, library generation, and fitness maximization.
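The notion of mutational accessibility can be made concrete: for a given codon, enumerate the amino acids reachable by exactly one nucleotide substitution. The sketch below does this for the standard genetic code (the chemical-difference scoring used in the paper is omitted).

    # Standard genetic code, codon order TTT, TTC, TTA, TTG, TCT, ... (T, C, A, G).
    BASES = "TCAG"
    AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
    CODON_TABLE = {
        a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)
    }

    def reachable_amino_acids(codon):
        """Amino acids (and stops, '*') reachable from `codon` by exactly one
        single-nucleotide replacement."""
        neighbors = {
            codon[:pos] + base + codon[pos + 1:]
            for pos in range(3)
            for base in BASES
            if base != codon[pos]
        }
        return sorted({CODON_TABLE[c] for c in neighbors})

    # Under the standard code, many single-nucleotide neighbors are synonymous.
    for codon in ("CTG", "GGG", "ATG"):
        print(codon, CODON_TABLE[codon], "->", reachable_amino_acids(codon))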
Step-by-step magic state encoding for efficient fault-tolerant quantum computation
Goto, Hayato
2014-01-01
Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation. PMID:25511387
Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers.
Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne
2018-01-01
State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems.
NASA Astrophysics Data System (ADS)
Wootton, James R.; Loss, Daniel
2018-05-01
The repetition code is an important primitive for the techniques of quantum error correction. Here we implement repetition codes of at most 15 qubits on the 16-qubit ibmqx3 device. Each experiment is run for a single round of syndrome measurements, achieved using the standard technique of ancilla qubits and controlled operations. The size of the final syndrome is small enough to allow for lookup table decoding using experimentally obtained data. The results show strong evidence that the logical error rate decays exponentially with code distance, as is expected and required for the development of fault-tolerant quantum computers. The results also give insight into the nature of noise in the device.
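The expected exponential decay of logical error rate with distance is easy to reproduce in simulation; a minimal classical sketch with independent bit-flip noise and majority-vote decoding (the experiment's lookup-table decoder, built from measured syndrome statistics, is more involved):

    import numpy as np

    def logical_error_rate(distance, p_flip, trials=200_000, seed=0):
        """Encode 0 as `distance` repeated bits, flip each independently with
        probability p_flip, decode by majority vote, count logical failures."""
        rng = np.random.default_rng(seed)
        flips = rng.random((trials, distance)) < p_flip
        decoded_one = flips.sum(axis=1) > distance // 2
        return decoded_one.mean()

    for d in (3, 5, 7, 9):
        print(f"d = {d}: logical error rate ~ {logical_error_rate(d, 0.05):.2e}")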
New Challenges in Computational Thermal Hydraulics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yadigaroglu, George; Lakehal, Djamel
New needs and opportunities drive the development of novel computational methods for the design and safety analysis of light water reactors (LWRs). Some new methods are likely to be three dimensional. Coupling is expected between system codes, computational fluid dynamics (CFD) modules, and cascades of computations at scales ranging from the macro- or system scale to the micro- or turbulence scales, with the various levels continuously exchanging information back and forth. The ISP-42/PANDA and the international SETH project provide opportunities for testing applications of single-phase CFD methods to LWR safety problems. Although industrial single-phase CFD applications are commonplace, computational multifluid dynamics is still under development. However, first applications are appearing; the state of the art and its potential uses are discussed. The case study of condensation of steam/air mixtures injected from a downward-facing vent into a pool of water is a perfect illustration of a simulation cascade: At the top of the hierarchy of scales, system behavior can be modeled with a system code; at the central level, the volume-of-fluid method can be applied to predict large-scale bubbling behavior; at the bottom of the cascade, direct-contact condensation can be treated with direct numerical simulation, in which turbulent flow (in both the gas and the liquid), interfacial dynamics, and heat/mass transfer are directly simulated without resorting to models.
Webber, C J
2001-05-01
This article shows analytically that single-cell learning rules that give rise to oriented and localized receptive fields, when their synaptic weights are randomly and independently initialized according to a plausible assumption of zero prior information, will generate visual codes that are invariant under two-dimensional translations, rotations, and scale magnifications, provided that the statistics of their training images are sufficiently invariant under these transformations. Such codes span different image locations, orientations, and size scales with equal economy. Thus, single-cell rules could account for the spatial scaling property of the cortical simple-cell code. This prediction is tested computationally by training with natural scenes; it is demonstrated that a single-cell learning rule can give rise to simple-cell receptive fields spanning the full range of orientations, image locations, and spatial frequencies (except at the extreme high and low frequencies at which the scale invariance of the statistics of digitally sampled images must ultimately break down, because of the image boundary and the finite pixel resolution). Thus, no constraint on completeness, or any other coupling between cells, is necessary to induce the visual code to span wide ranges of locations, orientations, and size scales. This prediction is made using the theory of spontaneous symmetry breaking, which we have previously shown can also explain the data-driven self-organization of a wide variety of transformation invariances in neurons' responses, such as the translation invariance of complex cell response.
Symplectic multi-particle tracking on GPUs
NASA Astrophysics Data System (ADS)
Liu, Zhicong; Qiang, Ji
2018-05-01
A symplectic multi-particle tracking model is implemented on the Graphic Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA) language. The symplectic tracking model can preserve phase space structure and reduce non-physical effects in long term simulation, which is important for beam property evaluation in particle accelerators. Though this model is computationally expensive, it is very suitable for parallelization and can be accelerated significantly by using GPUs. In this paper, we optimized the implementation of the symplectic tracking model on both single GPU and multiple GPUs. Using a single GPU processor, the code achieves a factor of 2-10 speedup for a range of problem sizes compared with the time on a single state-of-the-art Central Processing Unit (CPU) node with similar power consumption and semiconductor technology. It also shows good scalability on a multi-GPU cluster at Oak Ridge Leadership Computing Facility. In an application to beam dynamics simulation, the GPU implementation helps save more than a factor of two total computing time in comparison to the CPU implementation.
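Why symplectic integration matters for long tracking runs can be seen in one dimension; a minimal kick-drift (symplectic Euler) sketch for a harmonic "lattice", contrasted with the non-symplectic explicit Euler whose energy grows (a generic illustration, not the paper's accelerator model):

    import numpy as np

    def track(symplectic, steps=100_000, dt=0.01):
        """Harmonic oscillator x'' = -x tracked with two first-order schemes."""
        x, p = 1.0, 0.0
        for _ in range(steps):
            if symplectic:          # kick then drift: preserves phase-space area
                p -= x * dt
                x += p * dt
            else:                   # explicit Euler: not symplectic
                x, p = x + p * dt, p - x * dt
        return 0.5 * (x * x + p * p)  # energy (0.5 initially)

    print("symplectic Euler energy:", track(True))   # stays near 0.5
    print("explicit Euler energy: ", track(False))   # drifts upward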
Multi-Zone Liquid Thrust Chamber Performance Code with Domain Decomposition for Parallel Processing
NASA Technical Reports Server (NTRS)
Navaz, Homayun K.
2002-01-01
Computational Fluid Dynamics (CFD) has evolved considerably in the last decade. There are many computer programs that can perform computations on viscous internal or external flows with chemical reactions. CFD has become a commonly used tool in the design and analysis of gas turbines, ramjet combustors, turbo-machinery, inlet ducts, rocket engines, jet interaction, missile, and ramjet nozzles. One of the problems of interest to NASA has always been performance prediction for rocket and air-breathing engines. Due to the complexity of the flow in these engines it is necessary to resolve the flowfield on a fine mesh to capture quantities like turbulence and heat transfer. However, calculation on a high-resolution grid carries a prohibitively large computational cost that can diminish the value of CFD for practical engineering calculations. The Liquid Thrust Chamber Performance (LTCP) code was developed for NASA/MSFC (Marshall Space Flight Center) to perform liquid rocket engine performance calculations. This code is a 2D/axisymmetric full Navier-Stokes (NS) solver with fully coupled finite-rate chemistry and Eulerian treatment of liquid fuel and/or oxidizer droplets. One of the advantages of this code has been the resemblance of its input file to the JANNAF (Joint Army Navy NASA Air Force Interagency Propulsion Committee) standard TDK code, and its automatic grid generation for JANNAF-defined combustion chamber wall geometry. These options minimize the learning effort for TDK users and make the code a good candidate for performing engineering calculations. Although the LTCP code was developed for liquid rocket engines, it is a general-purpose code and has been used for solving many engineering problems. However, the single-zone formulation of the LTCP has limited the code's applicability to problems with complex geometry. Furthermore, the computational time becomes prohibitively large for high-resolution problems with chemistry, a two-equation turbulence model, and two-phase flow. To overcome these limitations, the LTCP code is rewritten to include a multi-zone capability with domain decomposition that makes it suitable for parallel processing, i.e., enabling the code to run every zone or sub-domain on a separate processor. This can reduce the run time by a factor of 6 to 8, depending on the problem.
Injecting Artificial Memory Errors Into a Running Computer Program
NASA Technical Reports Server (NTRS)
Bornstein, Benjamin J.; Granat, Robert A.; Wagstaff, Kiri L.
2008-01-01
Single-event upsets (SEUs) or bitflips are computer memory errors caused by radiation. BITFLIPS (Basic Instrumentation Tool for Fault Localized Injection of Probabilistic SEUs) is a computer program that deliberately injects SEUs into another computer program, while the latter is running, for the purpose of evaluating the fault tolerance of that program. BITFLIPS was written as a plug-in extension of the open-source Valgrind debugging and profiling software. BITFLIPS can inject SEUs into any program that can be run on the Linux operating system, without needing to modify the program's source code. Further, if access to the original program source code is available, BITFLIPS offers fine-grained control over exactly when and which areas of memory (as specified via program variables) will be subjected to SEUs. The rate of injection of SEUs is controlled by specifying either a fault probability or a fault rate based on memory size and radiation exposure time, in units of SEUs per byte per second. BITFLIPS can also log each SEU that it injects and, if program source code is available, report the magnitude of effect of the SEU on a floating-point value or other program variable.
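Bit-level fault injection is straightforward to emulate at the application level; a tiny sketch that flips random bits in a float64 array at a given per-bit fault probability (BITFLIPS itself works at the Valgrind level, instrumenting an unmodified binary, so this is only a conceptual illustration):

    import numpy as np

    def inject_seus(data, p_bit_flip, rng):
        """Flip each bit of the float64 array independently with probability
        p_bit_flip, by XORing the underlying 64-bit words."""
        words = data.view(np.uint64)
        for i in range(words.size):
            for bit in range(64):
                if rng.random() < p_bit_flip:
                    words[i] ^= np.uint64(1) << np.uint64(bit)

    rng = np.random.default_rng(42)
    x = np.linspace(0.0, 1.0, 8)
    print("before:", x)
    inject_seus(x, p_bit_flip=0.01, rng=rng)
    print("after: ", x)  # a flipped exponent bit can change a value drastically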
Quantum error correction in crossbar architectures
NASA Astrophysics Data System (ADS)
Helsen, Jonas; Steudtner, Mark; Veldhorst, Menno; Wehner, Stephanie
2018-07-01
A central challenge for the scaling of quantum computing systems is the need to control all qubits in the system without a large overhead. A solution for this problem in classical computing comes in the form of so-called crossbar architectures. Recently we made a proposal for a large-scale quantum processor (Li et al arXiv:1711.03807 (2017)) to be implemented in silicon quantum dots. This system features a crossbar control architecture which limits parallel single-qubit control, but allows the scheme to overcome control scaling issues that form a major hurdle to large-scale quantum computing systems. In this work, we develop a language that makes it possible to easily map quantum circuits to crossbar systems, taking into account their architecture and control limitations. Using this language we show how to map well known quantum error correction codes such as the planar surface and color codes in this limited control setting with only a small overhead in time. We analyze the logical error behavior of this surface code mapping for estimated experimental parameters of the crossbar system and conclude that logical error suppression to a level useful for real quantum computation is feasible.
NASA Astrophysics Data System (ADS)
Schmieschek, S.; Shamardin, L.; Frijters, S.; Krüger, T.; Schiller, U. D.; Harting, J.; Coveney, P. V.
2017-08-01
We introduce the lattice-Boltzmann code LB3D, version 7.1. Building on a parallel program and supporting tools which have enabled research utilising high performance computing resources for nearly two decades, LB3D version 7 provides a subset of the research code functionality as an open source project. Here, we describe the theoretical basis of the algorithm as well as computational aspects of the implementation. The software package is validated against simulations of meso-phases resulting from self-assembly in ternary fluid mixtures comprising immiscible and amphiphilic components such as water-oil-surfactant systems. The impact of the surfactant species on the dynamics of spinodal decomposition is tested, and quantitative measurement of the permeability of a body centred cubic (BCC) model porous medium for a simple binary mixture is described. Single-core performance and scaling behaviour of the code are reported for simulations on current supercomputer architectures.
Assessment of Reduced-Kinetics Mechanisms for Combustion of Jet Fuel in CFD Applications
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Kundu, Krihna P.; Yungster, Shaye J.
2014-01-01
A computational effort was undertaken to analyze the details of fluid flow in Lean-Direct Injection (LDI) combustors for next-generation LDI design. The National Combustion Code (NCC) was used to perform reacting-flow computations on single-element LDI injector configurations. The feasibility of using a reduced chemical-kinetics approach, which optimizes the reaction rates and species to model the emissions characteristics typical of lean-burning gas-turbine combustors, was assessed. The assessments were performed with Reynolds-Averaged Navier-Stokes (RANS) and Time-Filtered Navier-Stokes (TFNS) time integration, with a Lagrangian spray model, within the NCC code. The NCC predictions for EINOx and combustor exit temperature were compared with experimental data for two different single-element LDI injector configurations, with 60deg and 45deg axially swept swirler vanes. The effects of turbulence-chemistry interaction on the predicted flow in a typical LDI combustor were studied with detailed comparisons of NCC TFNS results with experimental data.
NASA Technical Reports Server (NTRS)
Knauber, R. N.
1982-01-01
A FORTRAN-coded computer program which computes the capture transient of a launch vehicle upper stage at the ignition and/or separation event is presented. It is for a single degree-of-freedom on-off reaction jet attitude control system. The Monte Carlo method is used to determine the statistics of key parameters at the outcome of the event. Aerodynamic and booster-induced disturbances, vehicle and control system characteristics, and initial conditions are treated as random variables. By appropriate selection of input data, the pitch, yaw, and roll axes can be analyzed. The transient response of a single deterministic case can also be computed. The program is currently set up on a CDC CYBER 175 computer system but is compatible with ANSI FORTRAN computer language. This routine has been used over the past fifteen (15) years for the SCOUT Launch Vehicle and has been run on RECOMP III, IBM 7090, IBM 360/370, CDC 6600, and CDC CYBER 175 computers with little modification.
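The Monte Carlo treatment described can be sketched compactly: sample random initial conditions and disturbances, integrate a single-axis on-off control law, and collect statistics on the outcome. The sketch below is a generic illustration with placeholder dynamics and numbers, not the SCOUT program.

    import numpy as np

    def capture_transient(theta0, rate0, torque_dist, rng,
                          inertia=500.0, t_jet=40.0, deadband=0.01,
                          dt=0.01, t_end=10.0):
        """Single-axis on-off (bang-bang) reaction-jet control with a
        rate-plus-attitude switching line; returns the peak attitude error."""
        theta, rate, peak = theta0, rate0, abs(theta0)
        for _ in range(int(t_end / dt)):
            s = theta + 0.5 * rate            # simple switching function
            u = -np.sign(s) * t_jet if abs(s) > deadband else 0.0
            rate += (u + torque_dist) / inertia * dt
            theta += rate * dt
            peak = max(peak, abs(theta))
        return peak

    rng = np.random.default_rng(7)
    peaks = [
        capture_transient(rng.normal(0.0, 0.05),   # initial attitude error, rad
                          rng.normal(0.0, 0.02),   # initial rate, rad/s
                          rng.normal(0.0, 5.0),    # disturbance torque, N*m
                          rng)
        for _ in range(2000)
    ]
    print("mean peak error:", np.mean(peaks), " 99th pct:", np.percentile(peaks, 99))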
Ancient DNA sequence revealed by error-correcting codes.
Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo
2015-07-10
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.
Development of an extensible dual-core wireless sensing node for cyber-physical systems
NASA Astrophysics Data System (ADS)
Kane, Michael; Zhu, Dapeng; Hirose, Mitsuhito; Dong, Xinjun; Winter, Benjamin; Häckell, Mortiz; Lynch, Jerome P.; Wang, Yang; Swartz, A.
2014-04-01
The introduction of wireless telemetry into the design of monitoring and control systems has been shown to reduce system costs while simplifying installations. To date, wireless nodes proposed for sensing and actuation in cyberphysical systems have been designed using microcontrollers with one computational pipeline (i.e., single-core microcontrollers). While concurrent code execution can be implemented on single-core microcontrollers, concurrency is emulated by splitting the pipeline's resources to support multiple threads of code execution. For many applications, this approach to multi-threading is acceptable in terms of speed and function. However, some applications such as feedback controls demand deterministic timing of code execution and maximum computational throughput. For these applications, the adoption of multi-core processor architectures represents one effective solution. Multi-core microcontrollers have multiple computational pipelines that can execute embedded code in parallel and can be interrupted independent of one another. In this study, a new wireless platform named Martlet is introduced with a dual-core microcontroller adopted in its design. The dual-core microcontroller design allows Martlet to dedicate one core to standard wireless sensor operations while the other core is reserved for embedded data processing and real-time feedback control law execution. Another distinct feature of Martlet is a standardized hardware interface that allows specialized daughter boards (termed wing boards) to be interfaced to the Martlet baseboard. This extensibility opens the opportunity to encapsulate specialized sensing and actuation functions in a wing board without altering the design of Martlet. In addition to describing the design of Martlet, a few example wings are detailed, along with experiments showing the Martlet's ability to monitor and control physical systems such as wind turbines and buildings.
Support for Systematic Code Reviews with the SCRUB Tool
NASA Technical Reports Server (NTRS)
Holzmann, Gerald J.
2010-01-01
SCRUB is a code review tool that supports both large, team-based software development efforts (e.g., for mission software) and individual tasks. The tool was developed at JPL to support a new, streamlined code review process that combines human-generated review reports with program-generated review reports from a customizable range of state-of-the-art source code analyzers. The leading commercial tools include Codesonar, Coverity, and Klocwork, each of which can achieve a reasonably low rate of false-positives in the warnings that they generate. The time required to analyze code with these tools can vary greatly. In each case, however, the tools produce results that would be difficult to realize with human code inspections alone. There is little overlap in the results produced by the different analyzers, and each analyzer used generally increases the effectiveness of the overall effort. The SCRUB tool allows all reports to be accessed through a single, uniform interface (see figure) that facilitates browsing code and reports. Improvements over existing software include significant simplification, and leveraging of a range of commercial, static source code analyzers in a single, uniform framework. The tool runs as a small stand-alone application, avoiding the security problems related to tools based on Web browsers. A developer or reviewer, for instance, must have already obtained access rights to a code base before that code can be browsed and reviewed with the SCRUB tool. The tool cannot open any files or folders to which the user does not already have access. This means that the tool does not need to enforce or administer any additional security policies. The analysis results presented through the SCRUB tool's user interface are always computed off-line, given that, especially for larger projects, this computation can take longer than appropriate for interactive tool use. The recommended code review process that is supported by the SCRUB tool consists of three phases: Code Review, Developer Response, and Closeout Resolution. In the Code Review phase, all tool-based analysis reports are generated, and specific comments from expert code reviewers are entered into the SCRUB tool. In the second phase, Developer Response, the developer is asked to respond to each comment and tool-report that was produced, either agreeing or disagreeing to provide a fix that addresses the issue that was raised. In the third phase, Closeout Resolution, all disagreements are discussed in a meeting of all parties involved, and a resolution is made for all disagreements. The first two phases generally take one week each, and the third phase is concluded in a single closeout meeting.
Fault-tolerance in Two-dimensional Topological Systems
NASA Astrophysics Data System (ADS)
Anderson, Jonas T.
This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. I then develop an integer program that analyzes this structure and determines the most likely set of errors consistent with the observed syndrome values. I implement this integer program to find the threshold for depolarizing noise on small versions of these triangular codes. Because the threshold for magic-state distillation is likely to be higher than this value and because logical
Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C
2009-01-01
Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment, involving digital band-pass filtering and background noise subtraction followed by 3D Fourier reconstruction. This process is rather slow on a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four dual-core AMD Opteron shared-memory processors to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration achieved a speedup factor of 46.66 compared with the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through the local intranet (NIHnet). The experimental results demonstrate that parallel computing provides a source of high computational power for obtaining biophysical parameters from 3D EPR oximetric imaging, almost in real time.
NASA Technical Reports Server (NTRS)
Lam, David W.
1995-01-01
The transonic performance of a dual-throat, single-expansion-ramp nozzle (SERN) was investigated with a PARC computational fluid dynamics (CFD) code, an external flow Navier-Stokes solver. The nozzle configuration was from a conceptual Mach 5 cruise aircraft powered by four air-breathing turboramjets. Initial test cases used the two-dimensional version of PARC in Euler mode to investigate the effect of geometric variation on transonic performance. Additional cases used the two-dimensional version in viscous mode and the three-dimensional version in both Euler and viscous modes. Results of the analysis indicate low nozzle performance and a highly three-dimensional nozzle flow at transonic conditions. In another comparative study using the PARC code, a single-throat SERN configuration for which experimental data were available at transonic conditions was used to validate the results of the over/under turboramjet nozzle.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woods, Nathan; Menikoff, Ralph
2017-02-03
Equilibrium thermodynamics underpins many of the technologies used throughout theoretical physics, yet verification of the various theoretical models in the open literature remains challenging. EOSlib provides a single, consistent, verifiable implementation of these models in a single, easy-to-use software package. It consists of three parts: a software library implementing various published equation-of-state (EOS) models; a database of fitting parameters for various materials for these models; and a number of useful utility functions for simplifying thermodynamic calculations such as computing Hugoniot curves or Riemann problem solutions. Ready availability of this library will enable reliable code-to-code testing of equation-of-state implementations, as well as a starting point for more rigorous verification work. EOSlib also provides a single, consistent API for its analytic and tabular EOS models, which simplifies the process of comparing models for a particular application.
Dewaraja, Yuni K; Ljungberg, Michael; Majumdar, Amitava; Bose, Abhijit; Koral, Kenneth F
2002-02-01
This paper reports the implementation of the SIMIND Monte Carlo code on an IBM SP2 distributed memory parallel computer. Basic aspects of running Monte Carlo particle transport calculations on parallel architectures are described. Our parallelization is based on equally partitioning photons among the processors and uses the Message Passing Interface (MPI) library for interprocessor communication and the Scalable Parallel Random Number Generator (SPRNG) to generate uncorrelated random number streams. These parallelization techniques are also applicable to other distributed memory architectures. A linear increase in computing speed with the number of processors is demonstrated for up to 32 processors. This speed-up is especially significant in Single Photon Emission Computed Tomography (SPECT) simulations involving higher energy photon emitters, where explicit modeling of the phantom and collimator is required. For (131)I, the accuracy of the parallel code is demonstrated by comparing simulated and experimental SPECT images from a heart/thorax phantom. Clinically realistic SPECT simulations using the voxel-man phantom are carried out to assess scatter and attenuation correction.
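The partitioning scheme is simple to sketch with mpi4py: each rank simulates its share of photon histories with an independently seeded generator, and the tallies are summed on the root. Per-rank seeding here is a simplification standing in for SPRNG's uncorrelated streams, and the "history" is a mock attenuation test, not SIMIND physics.

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    N_TOTAL = 1_000_000
    n_local = N_TOTAL // size + (1 if rank < N_TOTAL % size else 0)

    # Independent per-rank stream (a stand-in for SPRNG's guarantees).
    rng = np.random.default_rng([20240101, rank])

    # Mock "history": count photons surviving an exponential attenuation test.
    optical_depths = rng.exponential(1.0, n_local)
    local_detected = int(np.count_nonzero(optical_depths > 2.0))

    total_detected = comm.reduce(local_detected, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"detected fraction ~ {total_detected / N_TOTAL:.4f}")  # ~ exp(-2)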
Kim, Dong-Sun; Kwon, Jin-San
2014-01-01
Research on real-time health systems has received great attention in recent years, and the need for high-quality personal multichannel medical signal compression for personal medical product applications is increasing. The international MPEG-4 audio lossless coding (ALS) standard supports a joint channel-coding scheme for improving the compression performance of multichannel signals, and it is a very efficient compression method for multichannel biosignals. However, the computational complexity of such a multichannel coding scheme is significantly greater than that of other lossless audio encoders. In this paper, we present a multichannel hardware encoder based on a low-complexity joint-coding technique and a shared multiplier scheme for portable devices. A joint-coding decision method and a reference channel selection scheme are modified for a low-complexity joint coder. The proposed joint-coding decision method determines the optimized joint-coding operation based on the relationship between the cross correlation of residual signals and the compression ratio. The reference channel selection is designed to select a channel for the entropy coding of the joint coding. The hardware encoder operates at a 40 MHz clock frequency and supports two-channel parallel encoding for the multichannel monitoring system. Experimental results show that the compression ratio increases by 0.06%, whereas the computational complexity decreases by 20.72% compared to the MPEG-4 ALS reference software encoder. In addition, the compression ratio increases by about 11.92% compared to the single-channel based biosignal lossless data compressor. PMID:25237900
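The cross-correlation-based decision can be sketched in a few lines: if two channels' prediction residuals are strongly correlated, coding one channel plus the inter-channel difference costs fewer bits than coding both independently. This is a conceptual sketch only; the threshold and the bit estimate are illustrative, not the MPEG-4 ALS rules.

    import numpy as np

    def est_bits(x):
        """Rough per-sample bit cost of entropy-coding a residual signal,
        via the differential entropy of a Laplacian fit (illustrative)."""
        b = np.mean(np.abs(x)) + 1e-12
        return np.log2(2.0 * np.e * b)

    def choose_joint_coding(res_a, res_b, corr_threshold=0.7):
        """Decide whether channel B should be coded jointly against channel A."""
        corr = np.corrcoef(res_a, res_b)[0, 1]
        if abs(corr) < corr_threshold:
            return "independent", est_bits(res_b)
        return "joint", est_bits(res_b - res_a)  # code the inter-channel difference

    rng = np.random.default_rng(3)
    common = rng.laplace(0.0, 10.0, 10_000)          # shared physiological component
    res_a = common + rng.laplace(0.0, 1.0, 10_000)
    res_b = common + rng.laplace(0.0, 1.0, 10_000)
    print(choose_joint_coding(res_a, res_b))          # joint coding wins here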
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.
Meng, Bowen; Pratx, Guillem; Xing, Lei
2011-12-01
Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Speedup of reconstruction time was found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. The root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10^-7. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
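The Map/Reduce split described maps naturally onto two small functions; a toy in-process skeleton (plain Python standing in for Hadoop, with a trivial placeholder "backprojection" so the aggregation pattern is visible):

    import numpy as np
    from functools import reduce

    VOLUME_SHAPE = (64, 64, 64)

    def map_backproject(projection_subset):
        """Map: filter and 'backproject' a subset of projections into a
        partial volume (a smeared placeholder here, not real FDK)."""
        partial = np.zeros(VOLUME_SHAPE)
        for proj in projection_subset:
            partial += proj.mean()  # stand-in for filtered backprojection
        return partial

    def reduce_volumes(vol_a, vol_b):
        """Reduce: aggregate partial backprojections into the whole volume."""
        return vol_a + vol_b

    rng = np.random.default_rng(0)
    projections = [rng.random((64, 64)) for _ in range(360)]
    subsets = [projections[i::8] for i in range(8)]      # 8 "mappers"

    partials = map(map_backproject, subsets)
    volume = reduce(reduce_volumes, partials)
    print(volume.shape, float(volume.mean()))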
Geometrical-optics code for computing the optical properties of large dielectric spheres.
Zhou, Xiaobing; Li, Shusun; Stamnes, Knut
2003-07-20
Absorption of electromagnetic radiation by absorptive dielectric spheres such as snow grains in the near-infrared part of the solar spectrum cannot be neglected when radiative properties of snow are computed. Thus a new, to our knowledge, geometrical-optics code is developed to compute scattering and absorption cross sections of large dielectric particles of arbitrary complex refractive index. The number of internal reflections and transmissions are truncated on the basis of the ratio of the irradiance incident at the nth interface to the irradiance incident at the first interface for a specific optical ray. Thus the truncation number is a function of the angle of incidence. Phase functions for both near- and far-field absorption and scattering of electromagnetic radiation are calculated directly at any desired scattering angle by using a hybrid algorithm based on the bisection and Newton-Raphson methods. With these methods a large sphere's absorption and scattering properties of light can be calculated for any wavelength from the ultraviolet to the microwave regions. Assuming that large snow meltclusters (1-cm order), observed ubiquitously in the snow cover during summer, can be characterized as spheres, one may compute absorption and scattering efficiencies and the scattering phase function on the basis of this geometrical-optics method. A geometrical-optics method for sphere (GOMsphere) code is developed and tested against Wiscombe's Mie scattering code (MIE0) and a Monte Carlo code for a range of size parameters. GOMsphere can be combined with MIE0 to calculate the single-scattering properties of dielectric spheres of any size.
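The truncation criterion can be illustrated with a small sketch: follow the internally reflected irradiance ratio ray by ray and stop once it falls below a tolerance. Unlike the full code, this uses normal-incidence Fresnel coefficients only, and the ice-like optical constants are illustrative values.

    import numpy as np

    def truncation_number(n_re, n_im, radius, wavelength, tol=1e-6):
        """Number of internal reflections to keep before the remaining ray
        irradiance drops below `tol` (normal incidence, diameter-length chords)."""
        r = abs((n_re - 1.0) / (n_re + 1.0)) ** 2        # Fresnel reflectance
        alpha = 4.0 * np.pi * n_im / wavelength          # absorption coefficient
        leg = np.exp(-alpha * 2.0 * radius)              # loss along one chord
        ratio, n = (1.0 - r) * leg, 0
        while ratio > tol:
            ratio *= r * leg                             # one more internal bounce
            n += 1
        return n

    # Ice-like index at a near-IR wavelength (illustrative values).
    print(truncation_number(n_re=1.31, n_im=2.0e-6, radius=5e-3, wavelength=1.0e-6))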
New decoding methods of interleaved burst error-correcting codes
NASA Astrophysics Data System (ADS)
Nakano, Y.; Kasahara, M.; Namekawa, T.
1983-04-01
A probabilistic method of single burst error correction, using the syndrome correlation of the subcodes that constitute the interleaved code, is presented. This method makes it possible to realize a high burst-error-correction capability with reduced decoding delay. Generalizing this method yields a probabilistic method of multiple (m-fold) burst error correction. After estimating the burst error positions using the syndrome correlation of subcodes that are interleaved m-fold burst-error-detecting codes, this second method corrects erasure errors in each subcode and thus m-fold burst errors. The performance of these two methods is analyzed via computer simulation, and their effectiveness is demonstrated.
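For readers unfamiliar with interleaving, a toy Python sketch shows why it disperses a channel burst across subcodes; the paper's probabilistic syndrome-correlation decoder itself is not reproduced here.

    def interleave(codewords):
        # Write m codewords row-wise, read column-wise: a burst of
        # length b in the channel then touches at most ceil(b/m)
        # symbols of each subcode.
        m, n = len(codewords), len(codewords[0])
        return [codewords[i % m][i // m] for i in range(m * n)]

    def deinterleave(stream, m):
        n = len(stream) // m
        return [[stream[j * m + i] for j in range(n)] for i in range(m)]

    words = [["a0", "a1", "a2", "a3"],
             ["b0", "b1", "b2", "b3"],
             ["c0", "c1", "c2", "c3"]]
    sent = interleave(words)        # a0 b0 c0 a1 b1 c1 ...
    assert deinterleave(sent, 3) == words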
CoreTSAR: Core Task-Size Adapting Runtime
Scogland, Thomas R. W.; Feng, Wu-chun; Rountree, Barry; ...
2014-10-27
Heterogeneity continues to increase at all levels of computing, with the rise of accelerators such as GPUs, FPGAs, and other co-processors into everything from desktops to supercomputers. As a consequence, efficiently managing such disparate resources has become increasingly complex. CoreTSAR seeks to reduce this complexity by adaptively worksharing parallel-loop regions across compute resources without requiring any transformation of the code within the loop. Our results show performance improvements of up to three-fold over a current state-of-the-art heterogeneous task scheduler, as well as linear performance scaling from a single GPU to four GPUs for many codes. In addition, CoreTSAR demonstrates a robust ability to adapt to both a variety of workloads and underlying system configurations.
GPU accelerated implementation of NCI calculations using promolecular density.
Rubez, Gaëtan; Etancelin, Jean-Matthieu; Vigouroux, Xavier; Krajecki, Michael; Boisson, Jean-Charles; Hénon, Eric
2017-05-30
The NCI approach is a modern tool to reveal chemical noncovalent interactions. It is particularly attractive to describe ligand-protein binding. A custom implementation of NCI using promolecular density is presented. It is designed to leverage the computational power of NVIDIA graphics processing unit (GPU) accelerators through the CUDA programming model. The performance of three versions of the code is examined on a test set of 144 systems. NCI calculations are particularly well suited to the GPU architecture, which drastically reduces the computational time. On a single compute node, the dual-GPU version leads to a 39-fold improvement for the biggest instance compared to the optimal OpenMP parallel run (C code, icc compiler) with 16 CPU cores. Energy consumption measurements carried out on both CPU and GPU NCI tests show that the GPU approach provides substantial energy savings. © 2017 Wiley Periodicals, Inc.
Implementation of radiation shielding calculation methods. Volume 2: Seminar/Workshop notes
NASA Technical Reports Server (NTRS)
Capo, M. A.; Disney, R. K.
1971-01-01
Detailed descriptions are presented of the input data for each of the MSFC computer codes applied to the analysis of a realistic nuclear-propelled vehicle. The analytical techniques employed include cross-section data preparation, one- and two-dimensional discrete ordinates transport, point kernel, and single-scatter methods.
Reed Solomon codes for error control in byte organized computer memory systems
NASA Technical Reports Server (NTRS)
Lin, S.; Costello, D. J., Jr.
1984-01-01
A problem in designing semiconductor memories is to provide some measure of error control without requiring excessive coding overhead or decoding time. In LSI and VLSI technology, memories are often organized on a multiple bit (or byte) per chip basis. For example, some 256K-bit DRAMs are organized as 32K 8-bit bytes. Byte-oriented codes such as Reed-Solomon (RS) codes can provide efficient low-overhead error control for such memories. However, the standard iterative algorithm for decoding RS codes is too slow for these applications. Special decoding techniques for extended single- and double-error-correcting RS codes, capable of high-speed operation, are presented. These techniques are designed to find the error locations and the error values directly from the syndrome, without having to use the iterative algorithm to find the error locator polynomial.
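The flavor of such direct-from-syndrome decoding can be illustrated for single-error correction of an RS code over GF(16). This is a generic textbook sketch, not the decoder circuits of the paper: for a single symbol error e at position i, the syndromes satisfy S1 = e·α^i and S2 = e·α^(2i), so the location and value follow from table look-ups rather than an iterative algorithm.

    # GF(16) arithmetic via exp/log tables (primitive polynomial x^4+x+1).
    EXP, LOG = [0] * 30, [0] * 16
    x = 1
    for k in range(15):
        EXP[k] = EXP[k + 15] = x
        LOG[x] = k
        x <<= 1
        if x & 0x10:
            x ^= 0x13

    def gmul(a, b):
        return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

    def syndrome(r, j):
        # S_j = r(alpha^j), evaluated by Horner's rule.
        s = 0
        for coef in reversed(r):
            s = gmul(s, EXP[j]) ^ coef
        return s

    def correct_single_error(r):
        # Assumes at most one symbol error. For error e at position i,
        # S1 = e*alpha^i and S2 = e*alpha^(2i), so alpha^i = S2/S1 and
        # e = S1^2/S2; no error-locator polynomial is needed.
        s1, s2 = syndrome(r, 1), syndrome(r, 2)
        if s1 == 0 and s2 == 0:
            return r                        # no error detected
        pos = (LOG[s2] - LOG[s1]) % 15      # i from alpha^i = S2/S1
        val = EXP[(2 * LOG[s1] - LOG[s2]) % 15]
        r = list(r)
        r[pos] ^= val
        return r

    r = [0] * 15          # the all-zero word is a valid codeword
    r[4] = 9              # inject a single symbol error
    assert correct_single_error(r) == [0] * 15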
Algorithm and code development for unsteady three-dimensional Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Obayashi, Shigeru
1994-01-01
Aeroelastic tests involve high cost and risk. An aeroelastic wind-tunnel experiment is an order of magnitude more expensive than a comparable experiment involving only aerodynamics. By complementing the wind-tunnel experiments with numerical simulations, the overall cost of aircraft development can be considerably reduced. In order to compute aeroelastic phenomena accurately, it is necessary to solve the unsteady Euler/Navier-Stokes equations simultaneously with the structural equations of motion. These equations accurately describe the flow phenomena for aeroelastic applications. At ARC, the code ENSAERO is being developed for computing the unsteady aerodynamics and aeroelasticity of aircraft by solving the Euler/Navier-Stokes equations. The purpose of this cooperative agreement was to enhance ENSAERO in both algorithmic and geometric capabilities. During the last five years, the algorithms of the code have been enhanced extensively by using high-resolution upwind algorithms and efficient implicit solvers. The zonal capability of the code has been extended from a one-to-one grid interface to a mismatching unsteady zonal interface. The geometric capability of the code has been extended from a single oscillating wing case to a full-span wing-body configuration with oscillating control surfaces. Each time a new capability was added, a proper validation case was simulated, and the capability of the code was demonstrated.
A generic archive protocol and an implementation
NASA Technical Reports Server (NTRS)
Jordan, J. M.; Jennings, D. G.; Mcglynn, T. A.; Ruggiero, N. G.; Serlemitsos, T. A.
1992-01-01
Archiving vast amounts of data has become a major part of every scientific space mission today. The Generic Archive/Retrieval Services Protocol (GRASP) addresses the question of how to archive the data collected in an environment where the underlying hardware archives may be rapidly changing. GRASP is a device-independent specification defining a set of functions for storing and retrieving data from an archive, as well as other support functions. GRASP is divided into two levels: the Transfer Interface and the Action Interface. The Transfer Interface is computer/archive-independent code, while the Action Interface contains code dedicated to each archive/computer addressed. Implementations of the GRASP specification are currently available for DECstations running Ultrix, Sparcstations running SunOS, and microVAX/VAXstation 3100's. The underlying archive is assumed to function as a standard Unix or VMS file system. The code, written in C, is a single suite of files. Preprocessing commands define the machine-unique code sections in the device interface. The implementation was written, to the greatest extent possible, using only ANSI standard C functions.
NASA Astrophysics Data System (ADS)
Athron, Peter; Balázs, Csaba; Dal, Lars A.; Edsjö, Joakim; Farmer, Ben; Gonzalo, Tomás E.; Kvellestad, Anders; McKay, James; Putze, Antje; Rogan, Chris; Scott, Pat; Weniger, Christoph; White, Martin
2018-01-01
We present the GAMBIT modules SpecBit, DecayBit and PrecisionBit. Together they provide a new framework for linking publicly available spectrum generators, decay codes and other precision observable calculations in a physically and statistically consistent manner. This allows users to automatically run various combinations of existing codes as if they were a single package. The modular design allows software packages fulfilling the same role to be exchanged freely at runtime, with the results presented in a common format that can easily be passed to downstream dark matter, collider and flavour codes. These modules constitute an essential part of the broader GAMBIT framework, a major new software package for performing global fits. In this paper we present the observable calculations, data, and likelihood functions implemented in the three modules, as well as the conventions and assumptions used in interfacing them with external codes. We also present 3-BIT-HIT, a command-line utility for computing mass spectra, couplings, decays and precision observables in the MSSM, which shows how the three modules can easily be used independently of GAMBIT.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code.
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it were part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946
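The dry-run idea, a single process doing its share of the work as if it were one rank of many with communication stubbed out, can be sketched generically in Python. This is illustrative only, not NEST code; the class and method names are invented for the example.

    class DryRunComm:
        # Mimics the local view of a parallel run: the single process
        # behaves as rank `rank` of `size` ranks, and communication
        # calls become no-ops that only record would-be traffic.
        def __init__(self, rank, size):
            self.rank, self.size = rank, size
            self.bytes_sent = 0

        def exchange(self, payload):
            # A real run would call something like MPI_Alltoall here;
            # the dry run only accounts for what would have been sent.
            self.bytes_sent += len(payload) * (self.size - 1)
            return []   # no remote data arrives in a dry run

    def simulate(n_neurons, comm):
        # Each rank owns an equal slice of the network, exactly as in
        # the corresponding parallel run.
        local = range(comm.rank, n_neurons, comm.size)
        spikes = [n for n in local if n % 7 == 0]   # stand-in dynamics
        comm.exchange(spikes)
        return len(spikes), comm.bytes_sent

    comm = DryRunComm(rank=3, size=1024)   # pretend to be 1 of 1024
    print(simulate(10**6, comm))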
NASA Astrophysics Data System (ADS)
Xiao, Fei; Liu, Bo; Zhang, Lijia; Xin, Xiangjun; Zhang, Qi; Tian, Qinghua; Tian, Feng; Wang, Yongjun; Rao, Lan; Ullah, Rahat; Zhao, Feng; Li, Deng'ao
2018-02-01
A rate-adaptive multilevel coded modulation (RA-MLC) scheme based on fixed code length, together with a corresponding decoding scheme, is proposed. The RA-MLC scheme combines multilevel coded modulation with binary linear block codes at the transmitter. Bit division, coding, optional interleaving, and modulation are carried out according to a preset rule, and the signal is then transmitted through a standard single-mode fiber span of 100 km. The receiver improves decoding accuracy by passing soft information between layers, which enhances performance. Simulations were carried out for an intensity modulation-direct detection optical communication system using MATLAB®. Results show that the RA-MLC scheme can achieve a bit error rate of 1E-5 when the optical signal-to-noise ratio is 20.7 dB. It also reduced the number of decoders by 72% and realized 22 rate adaptations without significantly increasing the computing time. The coding gain is increased by 7.3 dB at BER=1E-3.
Parallel hyperbolic PDE simulation on clusters: Cell versus GPU
NASA Astrophysics Data System (ADS)
Rostrup, Scott; De Sterck, Hans
2010-12-01
Increasingly, high-performance computing is looking towards data-parallel computational devices to enhance computational performance. Two technologies that have received significant attention are IBM's Cell Processor and NVIDIA's CUDA programming model for graphics processing unit (GPU) computing. In this paper we investigate the acceleration of parallel hyperbolic partial differential equation simulation on structured grids with explicit time integration on clusters with Cell and GPU backends. The message passing interface (MPI) is used for communication between nodes at the coarsest level of parallelism. Optimizations of the simulation code at the several finer levels of parallelism that the data-parallel devices provide are described in terms of data layout, data flow and data-parallel instructions. Optimized Cell and GPU performance are compared with reference code performance on a single x86 central processing unit (CPU) core in single and double precision. We further compare the CPU, Cell and GPU platforms on a chip-to-chip basis, and compare performance on single cluster nodes with two CPUs, two Cell processors or two GPUs in a shared memory configuration (without MPI). We finally compare performance on clusters with 32 CPUs, 32 Cell processors, and 32 GPUs using MPI. Our GPU cluster results use NVIDIA Tesla GPUs with GT200 architecture, but some preliminary results on recently introduced NVIDIA GPUs with the next-generation Fermi architecture are also included. This paper provides computational scientists and engineers who are considering porting their codes to accelerator environments with insight into how structured grid based explicit algorithms can be optimized for clusters with Cell and GPU accelerators. It also provides insight into the speed-up that may be gained on current and future accelerator architectures for this class of applications.
Program summary
Program title: SWsolver
Catalogue identifier: AEGY_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGY_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPL v3
No. of lines in distributed program, including test data, etc.: 59 168
No. of bytes in distributed program, including test data, etc.: 453 409
Distribution format: tar.gz
Programming language: C, CUDA
Computer: Parallel computing clusters. Individual compute nodes may consist of x86 CPU, Cell processor, or x86 CPU with attached NVIDIA GPU accelerator.
Operating system: Linux
Has the code been vectorised or parallelized?: Yes. Tested on 1-128 x86 CPU cores, 1-32 Cell processors, and 1-32 NVIDIA GPUs.
RAM: Tested on problems requiring up to 4 GB per compute node.
Classification: 12
External routines: MPI, CUDA, IBM Cell SDK
Nature of problem: MPI-parallel simulation of the shallow water equations using a high-resolution 2D hyperbolic equation solver on regular Cartesian grids for x86 CPU, Cell processor, and NVIDIA GPU (CUDA).
Solution method: SWsolver provides three implementations of a high-resolution 2D shallow water equation solver on regular Cartesian grids, for CPU, Cell processor, and NVIDIA GPU. Each implementation uses MPI to divide work across a parallel computing cluster.
Additional comments: Sub-program numdiff is used for the test run.
A crystallographic model for nickel base single crystal alloys
NASA Technical Reports Server (NTRS)
Dame, L. T.; Stouffer, D. C.
1988-01-01
The purpose of this research is to develop a tool for the mechanical analysis of nickel-base single-crystal superalloys, specifically Rene N4, used in gas turbine engine components. This objective is achieved by developing a rate-dependent anisotropic constitutive model and implementing it in a nonlinear three-dimensional finite-element code. The constitutive model is developed from metallurgical concepts utilizing a crystallographic approach. An extension of Schmid's law is combined with the Bodner-Partom equations to model the inelastic tension/compression asymmetry and orientation-dependence in octahedral slip. Schmid's law is used to approximate the inelastic response of the material in cube slip. The constitutive equations model the tensile behavior, creep response and strain-rate sensitivity of the single-crystal superalloys. Methods for deriving the material constants from standard tests are also discussed. The model is implemented in a finite-element code, and the computed and experimental results are compared for several orientations and loading conditions.
1975-09-01
This report assumes familiarity with the GIFT and MAGIC computer codes. EDIT-COMGEOM is a FORTRAN computer code that converts the target description data used by the MAGIC computer code into the target description data format used by the GIFT computer code.
A numerical study of incompressible juncture flows
NASA Technical Reports Server (NTRS)
Kwak, D.; Rogers, S. E.; Kaul, U. K.; Chang, J. L. C.
1986-01-01
The laminar, steady juncture flow around single or multiple posts mounted between two flat plates is simulated using the three-dimensional incompressible Navier-Stokes code INS3D. The three-dimensional separation of the boundary layer and the subsequent formation and development of the horseshoe vortex are computed. The computed flow compares favorably with experimental observations. The recent numerical study to understand and quantify the juncture flow relevant to the Space Shuttle main engine power head is summarized.
Is the orthographic/phonological onset a single unit in reading aloud?
Mousikou, Petroula; Coltheart, Max; Saunders, Steven; Yen, Lisa
2010-02-01
Two main theories of visual word recognition have been developed regarding the way orthographic units in printed words map onto phonological units in spoken words. One theory suggests that a string of single letters or letter clusters corresponds to a string of phonemes (Coltheart, 1978; Venezky, 1970), while the other suggests that a string of single letters or letter clusters corresponds to coarser phonological units, for example, onsets and rimes (Treiman & Chafetz, 1987). These theoretical assumptions were critical for the development of coding schemes in prominent computational models of word recognition and reading aloud. In a reading-aloud study, we tested whether the human reading system represents the orthographic/phonological onset of printed words and nonwords as single units or as separate letters/phonemes. Our results, which favored a letter-coding rather than an onset-coding scheme, were successfully simulated by the dual-route cascaded (DRC) model (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001). A separate experiment was carried out to further adjudicate between two versions of the DRC model.
The COBAIN (COntact Binary Atmospheres with INterpolation) Code for Radiative Transfer
NASA Astrophysics Data System (ADS)
Kochoska, Angela; Prša, Andrej; Horvat, Martin
2018-01-01
Standard binary star modeling codes make use of pre-existing solutions of the radiative transfer equation in stellar atmospheres. The various model atmospheres available today are consistently computed for single stars, under different assumptions - plane-parallel or spherical atmosphere approximation, local thermodynamical equilibrium (LTE) or non-LTE (NLTE), etc. However, they are nonetheless being applied to contact binary atmospheres by populating the surface corresponding to each component separately and neglecting any mixing that would typically occur at the contact boundary. In addition, single stellar atmosphere models do not take into account irradiance from a companion star, which can pose a serious problem when modeling close binaries. 1D atmosphere models are also solved under the assumption of an atmosphere in hydrodynamical equilibrium, which is not necessarily the case for contact atmospheres, as the potentially different densities and temperatures can give rise to flows that play a key role in the heat and radiation transfer. To resolve the issue of erroneous modeling of contact binary atmospheres using single star atmosphere tables, we have developed a generalized radiative transfer code for computation of the normal emergent intensity of a stellar surface, given its geometry and internal structure. The code uses a regular mesh of equipotential surfaces in a discrete set of spherical coordinates, which are then used to interpolate the values of the structural quantities (density, temperature, opacity) at any given point inside the mesh. The radiative transfer equation is numerically integrated in a set of directions spanning the unit sphere around each point and iterated until the intensity values for all directions and all mesh points converge within a given tolerance. We have found that this approach, albeit computationally expensive, is the only one that can reproduce the intensity distribution of the non-symmetric contact binary atmosphere, and it can be used with any existing or new model of the structure of contact binaries. We present results on several test objects and future prospects of the implementation in state-of-the-art binary star modeling software.
HYDRA-II: A hydrothermal analysis computer code: Volume 3, Verification/validation assessments
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCann, R.A.; Lowery, P.S.
1987-10-01
HYDRA-II is a hydrothermal computer code capable of three-dimensional analysis of coupled conduction, convection, and thermal radiation problems. This code is especially appropriate for simulating the steady-state performance of spent fuel storage systems. The code has been evaluated for this application for the US Department of Energy's Commercial Spent Fuel Management Program. HYDRA-II provides a finite difference solution in cartesian coordinates to the equations governing the conservation of mass, momentum, and energy. A cylindrical coordinate system may also be used to enclose the cartesian coordinate system. This exterior coordinate system is useful for modeling cylindrical cask bodies. The difference equations for conservation of momentum are enhanced by the incorporation of directional porosities and permeabilities that aid in modeling solid structures whose dimensions may be smaller than the computational mesh. The equation for conservation of energy permits modeling of orthotropic physical properties and film resistances. Several automated procedures are available to model radiation transfer within enclosures and from fuel rod to fuel rod. The documentation of HYDRA-II is presented in three separate volumes. Volume I - Equations and Numerics describes the basic differential equations, illustrates how the difference equations are formulated, and gives the solution procedures employed. Volume II - User's Manual contains code flow charts, discusses the code structure, provides detailed instructions for preparing an input file, and illustrates the operation of the code by means of a model problem. This volume, Volume III - Verification/Validation Assessments, provides a comparison between the analytical solution and the numerical simulation for problems with a known solution. This volume also documents comparisons between the results of simulations of single- and multiassembly storage systems and actual experimental data. 11 refs., 55 figs., 13 tabs.
Superconducting quantum circuits at the surface code threshold for fault tolerance.
Barends, R; Kelly, J; Megrant, A; Veitia, A; Sank, D; Jeffrey, E; White, T C; Mutus, J; Fowler, A G; Campbell, B; Chen, Y; Chen, Z; Chiaro, B; Dunsworth, A; Neill, C; O'Malley, P; Roushan, P; Vainsencher, A; Wenner, J; Korotkov, A N; Cleland, A N; Martinis, John M
2014-04-24
A quantum computer can solve hard problems, such as prime factoring, database searching and quantum simulation, at the cost of needing to protect fragile quantum states from error. Quantum error correction provides this protection by distributing a logical state among many physical quantum bits (qubits) by means of quantum entanglement. Superconductivity is a useful phenomenon in this regard, because it allows the construction of large quantum circuits and is compatible with microfabrication. For superconducting qubits, the surface code approach to quantum computing is a natural choice for error correction, because it uses only nearest-neighbour coupling and rapidly cycled entangling gates. The gate fidelity requirements are modest: the per-step fidelity threshold is only about 99 per cent. Here we demonstrate a universal set of logic gates in a superconducting multi-qubit processor, achieving an average single-qubit gate fidelity of 99.92 per cent and a two-qubit gate fidelity of up to 99.4 per cent. This places Josephson quantum computing at the fault-tolerance threshold for surface code error correction. Our quantum processor is a first step towards the surface code, using five qubits arranged in a linear array with nearest-neighbour coupling. As a further demonstration, we construct a five-qubit Greenberger-Horne-Zeilinger state using the complete circuit and full set of gates. The results demonstrate that Josephson quantum computing is a high-fidelity technology, with a clear path to scaling up to large-scale, fault-tolerant quantum circuits.
NASA Technical Reports Server (NTRS)
Burley, J. R., II; Carlson, J. R.; Henderson, W. P.
1986-01-01
Static pressure measurements were made on the afterbody, nozzle, and tails of a generic single-engine axisymmetric fighter configuration. Data were recorded at Mach numbers of 0.6, 0.9, and 1.2. Nozzle pressure ratio (NPR) was varied from 1.0 to 8.0, and angle of attack was varied from -3 deg to 9 deg. Experimental data were compared with numerical results from two state-of-the-art computer codes.
NASA Technical Reports Server (NTRS)
Hoang, Triem T.; OConnell, Tamara; Ku, Jentung
2004-01-01
Loop Heat Pipes (LHPs) have proven themselves as reliable and robust heat transport devices for spacecraft thermal control systems. So far, the LHPs in earth-orbiting satellites have performed as expected. Conventional LHPs usually consist of a single capillary pump for heat acquisition and a single condenser for heat rejection. Multiple-pump/multiple-condenser LHPs have been shown to function very well in ground testing. Nevertheless, test results for a dual-pump/condenser LHP also revealed that it behaved in a complicated manner because of interactions between the pumps and condensers. Clearly, more research is needed before such systems are ready for 0-g deployment. One research area that compels immediate attention is the analytical modeling of LHPs, particularly of transient phenomena. Modeling a single-pump/single-condenser LHP is difficult enough. Only a handful of computer codes are available for both steady-state and transient simulations of conventional LHPs. No previous effort was made to develop an analytical model (or even a complete theory) to predict the operational behavior of multiple-pump/multiple-condenser LHP systems. The current research project offers a basic theory of multiple-pump/multiple-condenser LHP operation. From it, a computer code was developed to predict the LHP saturation temperature in accordance with the system operating and environmental conditions.
New class of photonic quantum error correction codes
NASA Astrophysics Data System (ADS)
Silveri, Matti; Michael, Marios; Brierley, R. T.; Salmilehto, Juha; Albert, Victor V.; Jiang, Liang; Girvin, S. M.
We present a new class of quantum error correction codes for applications in quantum memories, communication and scalable computation. These codes are constructed from a finite superposition of Fock states and can exactly correct errors that are polynomial up to a specified degree in creation and destruction operators. Equivalently, they can perform approximate quantum error correction to any given order in time step for the continuous-time dissipative evolution under these errors. The codes are related to two-mode photonic codes but offer the advantage of requiring only a single photon mode to correct loss (amplitude damping), as well as the ability to correct other errors, e.g. dephasing. Our codes are also similar in spirit to photonic "cat codes" but have several advantages including smaller mean occupation number and exact rather than approximate orthogonality of the code words. We analyze how the rate of uncorrectable errors scales with the code complexity and discuss the unitary control for the recovery process. These codes are realizable with current superconducting qubit technology and can increase the fidelity of photonic quantum communication and memories.
Development of small scale cluster computer for numerical analysis
NASA Astrophysics Data System (ADS)
Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.
2017-09-01
In this study, two personal computers were successfully networked together to form a small-scale cluster. Each machine has a multicore processor with four cores, giving the cluster eight cores in total. The cluster runs Ubuntu 14.04 Linux with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test verified that the computers could pass the required information without any problem; it was done using a simple MPI "hello" program written in C. The performance test was conducted to show that the calculation performance of the cluster is much better than that of a single-CPU computer. In this performance test, four runs were made with the same code, using a single node, 2 processors, 4 processors, and 8 processors. The results show that with additional processors the time required to solve the problem decreases, shortening to roughly half when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer from commodity hardware that delivers higher computing power than a single-CPU machine; this can be beneficial for research that requires high computing power, especially numerical analyses such as finite element analysis, computational fluid dynamics, and computational physics.
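The communication test described above is essentially an MPI hello-world. The authors wrote theirs in C; an equivalent sketch using Python's mpi4py binding would be:

    # Run with, e.g.: mpiexec -n 8 python hello.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    print("Hello from rank %d of %d on %s"
          % (comm.Get_rank(), comm.Get_size(), MPI.Get_processor_name()))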
Performance of a supercharged direct-injection stratified-charge rotary combustion engine
NASA Technical Reports Server (NTRS)
Bartrand, Timothy A.; Willis, Edward A.
1990-01-01
A zero-dimensional thermodynamic performance computer model for direct-injection stratified-charge rotary combustion engines was modified and run for a single rotor supercharged engine. Operating conditions for the computer runs were a single boost pressure and a matrix of speeds, loads and engine materials. A representative engine map is presented showing the predicted range of efficient operation. After discussion of the engine map, a number of engine features are analyzed individually. These features are: heat transfer and the influence insulating materials have on engine performance and exhaust energy; intake manifold pressure oscillations and interactions with the combustion chamber; and performance losses and seal friction. Finally, code running times and convergence data are presented.
Revisiting Molecular Dynamics on a CPU/GPU system: Water Kernel and SHAKE Parallelization.
Ruymgaart, A Peter; Elber, Ron
2012-11-13
We report Graphics Processing Unit (GPU) and OpenMP parallel implementations of water-specific force calculations and of bond constraints for use in molecular dynamics simulations. We focus on a typical laboratory computing environment in which a CPU with a few cores is attached to a GPU. We discuss the design of the code in detail and illustrate performance comparable to that of highly optimized codes such as GROMACS. Besides speed, our code shows excellent energy conservation. Utilization of water-specific lists allows efficient calculation of non-bonded interactions that include water molecules, and results in a speed-up factor of more than 40 on the GPU compared to code optimized on a single CPU core, for systems larger than 20,000 atoms. This is a four-fold increase over the factor of 10 reported in our initial GPU implementation, which did not include a water-specific code. Another optimization is the implementation of constrained dynamics entirely on the GPU. The routine, which enforces constraints on all bonds, runs in parallel on multiple OpenMP cores or entirely on the GPU. It is based on a Conjugate Gradient solution for the Lagrange multipliers (CG SHAKE). The GPU implementation is partially in double precision and requires no communication with the CPU during the execution of the SHAKE algorithm. The (parallel) implementation of SHAKE allows an increase of the time step to 2.0 fs while maintaining excellent energy conservation. Interestingly, CG SHAKE is faster than the usual bond relaxation algorithm even on a single core if high accuracy is expected. The significant speedup of the optimized components transfers the computational bottleneck of the MD calculation to the reciprocal-space part of Particle Mesh Ewald (PME).
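CG SHAKE solves a symmetric positive-definite linear system for the Lagrange multipliers at each step. The kernel involved is the textbook conjugate-gradient iteration, sketched below in Python; this is not the authors' GPU code, and in the real application A couples the bond constraints while x holds the multipliers.

    import numpy as np

    def conjugate_gradient(A, b, tol=1e-10, max_iter=200):
        # Solve A x = b for symmetric positive-definite A.
        x = np.zeros_like(b)
        r = b - A @ x
        p = r.copy()
        rs = r @ r
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p
            rs = rs_new
        return x

    # Toy 2-constraint system standing in for the bond-coupling matrix.
    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    b = np.array([1.0, 2.0])
    lam = conjugate_gradient(A, b)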
NASA Astrophysics Data System (ADS)
Hadade, Ioan; di Mare, Luca
2016-08-01
Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range of architectural features such as SIMD for data parallel execution or threads for core parallelism. The exploitation of multi-level parallelism is therefore crucial for achieving superior performance on current and future processors. This paper presents the performance tuning of a multiblock CFD solver on Intel SandyBridge and Haswell multicore CPUs and the Intel Xeon Phi Knights Corner coprocessor. Code optimisations have been applied on two computational kernels exhibiting different computational patterns: the update of flow variables and the evaluation of the Roe numerical fluxes. We discuss at length the code transformations required for achieving efficient SIMD computations for both kernels across the selected devices, including SIMD shuffles and transpositions for flux stencil computations and global memory transformations. Core parallelism is expressed through threading based on a number of domain decomposition techniques, together with optimisations pertaining to alleviating NUMA effects found in multi-socket compute nodes. Results are correlated with the Roofline performance model in order to assess their efficiency for each distinct architecture. We report significant speedups for single-thread execution across both kernels: 2-5X on the multicore CPUs and 14-23X on the Xeon Phi coprocessor. Computations at full node and chip concurrency deliver a factor of three speedup on the multicore processors and up to 24X on the Xeon Phi manycore coprocessor.
Unsteady Cascade Aerodynamic Response Using a Multiphysics Simulation Code
NASA Technical Reports Server (NTRS)
Lawrence, C.; Reddy, T. S. R.; Spyropoulos, E.
2000-01-01
The multiphysics code Spectrum(TM) is applied to calculate the unsteady aerodynamic pressures of an oscillating cascade of airfoils representing a blade row of a turbomachinery component. Multiphysics simulation is based on a single computational framework for the modeling of multiple interacting physical phenomena, in the present case the interaction between fluids and structures. Interaction constraints are enforced in a fully coupled manner using the augmented-Lagrangian method. The arbitrary Lagrangian-Eulerian method is utilized to account for deformable fluid domains resulting from blade motions. Unsteady pressures are calculated for a cascade designated as the tenth standard configuration, undergoing plunging and pitching oscillations. The predicted unsteady pressures are compared with those obtained from an unsteady Euler code referred to in the literature. The Spectrum(TM) code predictions showed good correlation for the cases considered.
A comparison of skyshine computational methods.
Hertel, Nolan E; Sweezy, Jeremy E; Shultis, J Kenneth; Warkentin, J Karl; Rose, Zachary J
2005-01-01
A variety of methods employing radiation transport and point-kernel codes have been used to model two skyshine problems. The first problem is a 1 MeV point source of photons on the surface of the earth inside a 2 m tall and 1 m radius silo having black walls. The skyshine radiation downfield from the point source was estimated with and without a 30-cm-thick concrete lid on the silo. The second benchmark problem is to estimate the skyshine radiation downfield from 12 cylindrical canisters emplaced in a low-level radioactive waste trench. The canisters are filled with ion-exchange resin with a representative radionuclide loading, largely 60Co, 134Cs and 137Cs. The solution methods include use of the MCNP code to solve the problem by directly employing variance reduction techniques, the single-scatter point kernel code GGG-GP, the QADMOD-GP point kernel code, the COHORT Monte Carlo code, the NAC International version of the SKYSHINE-III code, the KSU hybrid method and the associated KSU skyshine codes.
GASPRNG: GPU accelerated scalable parallel random number generator library
NASA Astrophysics Data System (ADS)
Gao, Shuang; Peterson, Gregory D.
2013-04-01
Graphics processors represent a promising technology for accelerating computational science applications. Many computational science applications require fast and scalable random number generation with good statistical properties, so they use the Scalable Parallel Random Number Generators library (SPRNG). We present the GPU Accelerated SPRNG library (GASPRNG) to accelerate SPRNG in GPU-based high performance computing systems. GASPRNG includes code for a host CPU and CUDA code for execution on NVIDIA graphics processing units (GPUs), along with a programming interface to support various usage models for pseudorandom numbers and computational science applications executing on the CPU, GPU, or both. This paper describes the implementation approach used to produce high performance and also describes how to use the programming interface. The programming interface allows a user to use GASPRNG the same way as SPRNG on traditional serial or parallel computers, as well as to develop tightly coupled programs executing primarily on the GPU. We also describe how to install GASPRNG and use it. To help illustrate linking with GASPRNG, various demonstration codes are included for the different usage models. GASPRNG on a single GPU shows up to 280x speedup over SPRNG on a single CPU core and is able to scale for larger systems in the same manner as SPRNG. Because GASPRNG generates identical streams of pseudorandom numbers as SPRNG, users can be confident about the quality of GASPRNG for scalable computational science applications.
Program summary
Catalogue identifier: AEOI_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEOI_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: UTK license
No. of lines in distributed program, including test data, etc.: 167900
No. of bytes in distributed program, including test data, etc.: 1422058
Distribution format: tar.gz
Programming language: C and CUDA
Computer: Any PC or workstation with NVIDIA GPU (tested on Fermi GTX480, Tesla C1060, Tesla M2070)
Operating system: Linux with CUDA version 4.0 or later. Should also run on MacOS, Windows, or UNIX.
Has the code been vectorized or parallelized?: Yes. Parallelized using MPI directives.
RAM: 512 MB-732 MB (main memory on host CPU, depending on the data type of random numbers) / 512 MB (GPU global memory)
Classification: 4.13, 6.5
Nature of problem: Many computational science applications are able to consume large numbers of random numbers. For example, Monte Carlo simulations are able to consume limitless random numbers for the computation as long as resources for the computing are supported. Moreover, parallel computational science applications require independent streams of random numbers to attain statistically significant results. The SPRNG library provides this capability, but at a significant computational cost. The GASPRNG library presented here accelerates the generators of independent streams of random numbers using graphical processing units (GPUs).
Solution method: Multiple copies of random number generators in GPUs allow a computational science application to consume large numbers of random numbers from independent, parallel streams. GASPRNG is a random number generator library that allows a computational science application to employ multiple copies of random number generators to boost performance. Users can interface GASPRNG with software code executing on microprocessors and/or GPUs.
Running time: The tests provided take a few minutes to run.
Geospace simulations using modern accelerator processor technology
NASA Astrophysics Data System (ADS)
Germaschewski, K.; Raeder, J.; Larson, D. J.
2009-12-01
OpenGGCM (Open Geospace General Circulation Model) is a well-established numerical code simulating the Earth's space environment. The most computing-intensive part is the MHD (magnetohydrodynamics) solver that models the plasma surrounding Earth and its interaction with Earth's magnetic field and the solar wind flowing in from the sun. Like other global magnetosphere codes, OpenGGCM's realism is currently limited by computational constraints on grid resolution. OpenGGCM has been ported to make use of the added computational power of modern accelerator-based processor architectures, in particular the Cell processor. The Cell architecture is a novel inhomogeneous multicore architecture capable of achieving up to 230 GFlops on a single chip. The University of New Hampshire recently acquired a PowerXCell 8i based computing cluster, and here we report initial performance results of OpenGGCM. Realizing the high theoretical performance of the Cell processor is a programming challenge, though. We implemented the MHD solver using a multi-level parallelization approach: on the coarsest level, the problem is distributed to processors based upon the usual domain decomposition approach. Then, on each processor, the problem is divided into 3D columns, each of which is handled by the memory-limited SPEs (synergistic processing elements) slice by slice. Finally, SIMD instructions are used to fully exploit the SIMD FPUs in each SPE. Memory management must be handled explicitly by the code, using DMA to move data from main memory to the per-SPE local store and vice versa. We use a modern technique, automatic code generation, which shields the application programmer from having to deal with all of the implementation details just described, keeping the code much more easily maintainable. Our preliminary results indicate excellent performance, a speed-up of a factor of 30 compared to the unoptimized version.
Physical Processes and Applications of the Monte Carlo Radiative Energy Deposition (MRED) Code
NASA Astrophysics Data System (ADS)
Reed, Robert A.; Weller, Robert A.; Mendenhall, Marcus H.; Fleetwood, Daniel M.; Warren, Kevin M.; Sierawski, Brian D.; King, Michael P.; Schrimpf, Ronald D.; Auden, Elizabeth C.
2015-08-01
MRED is a Python-language scriptable computer application that simulates radiation transport. It is the computational engine for the on-line tool CRÈME-MC. MRED is based on C++ code from Geant4 with additional Fortran components to simulate electron transport and nuclear reactions with high precision. We provide a detailed description of the structure of MRED and the implementation of the simulation of physical processes used to simulate radiation effects in electronic devices and circuits. Extensive discussion and references are provided that illustrate the validation of models used to implement specific simulations of relevant physical processes. Several applications of MRED are summarized that demonstrate its ability to predict and describe basic physical phenomena associated with irradiation of electronic circuits and devices. These include effects from single particle radiation (including both direct ionization and indirect ionization effects), dose enhancement effects, and displacement damage effects. MRED simulations have also helped to identify new single event upset mechanisms not previously observed by experiment, but since confirmed, including upsets due to muons and energetic electrons.
Systematic network coding for two-hop lossy transmissions
NASA Astrophysics Data System (ADS)
Li, Ye; Blostein, Steven; Chan, Wai-Yip
2015-12-01
In this paper, we consider network transmissions over a single or multiple parallel two-hop lossy paths. These scenarios occur in applications such as sensor networks or WiFi offloading. Random linear network coding (RLNC), where previously received packets are re-encoded at intermediate nodes and forwarded, is known to be a capacity-achieving approach for these networks. However, a major drawback of RLNC is its high encoding and decoding complexity. In this work, a systematic network coding method is proposed. We show through both analysis and simulation that the proposed method achieves higher end-to-end rate as well as lower computational cost than RLNC for finite field sizes and finite-sized packet transmissions.
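As a toy illustration of the re-encoding RLNC performs at a relay, the Python sketch below draws a random GF(2) combination (bytewise XOR) of the buffered packets. The systematic variant proposed in the paper additionally forwards the original packets uncoded first, which is what cuts the encoding and decoding cost; the details here are illustrative, not the paper's scheme.

    import random

    def rlnc_encode(packets, seed=None):
        # Produce one coded packet: a random GF(2) combination (XOR) of
        # everything the relay has received, plus its coefficient vector.
        rng = random.Random(seed)
        coeffs = [rng.randint(0, 1) for _ in packets]
        if not any(coeffs):
            coeffs[0] = 1      # avoid the useless all-zero combination
        coded = bytes(len(packets[0]))
        for c, p in zip(coeffs, packets):
            if c:
                coded = bytes(a ^ b for a, b in zip(coded, p))
        return coeffs, coded

    packets = [b"abcd", b"efgh", b"ijkl"]
    coeffs, coded = rlnc_encode(packets, seed=1)
    # The receiver collects coded packets until the coefficient vectors
    # span GF(2)^3, then recovers the originals by Gaussian elimination.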
Chemical application of diffusion quantum Monte Carlo
NASA Technical Reports Server (NTRS)
Reynolds, P. J.; Lester, W. A., Jr.
1984-01-01
The diffusion quantum Monte Carlo (QMC) method gives a stochastic solution to the Schroedinger equation. This approach is receiving increasing attention in chemical applications as a result of its high accuracy. However, reducing statistical uncertainty remains a priority because chemical effects are often obtained as small differences of large numbers. As an example, the singlet-triplet splitting of the energy of the methylene molecule CH2 is given. The QMC algorithm was implemented on the CYBER 205, first as a direct transcription of the algorithm running on the VAX 11/780, and second by explicitly writing vector code for all loops longer than a crossover length C. The speed of the codes relative to one another as a function of C, and relative to the VAX, is discussed. The computational time dependence obtained versus the number of basis functions is discussed and compared with that obtained from traditional quantum chemistry codes on traditional computer architectures.
Cookbook Recipe to Simulate Seawater Intrusion with Standard MODFLOW
NASA Astrophysics Data System (ADS)
Schaars, F.; Bakker, M.
2012-12-01
We developed a cookbook recipe to simulate steady interface flow in multi-layer coastal aquifers with regular groundwater codes such as standard MODFLOW. The main step in the recipe is a simple transformation of the hydraulic conductivities and thicknesses of the aquifers. Standard groundwater codes may be applied to compute the head distribution in the aquifer using the transformed parameters. For example, for flow in a single unconfined aquifer, the hydraulic conductivity needs to be multiplied by 41 and the base of the aquifer needs to be set to mean sea level (for a relative seawater density of 1.025). Once the head distribution is obtained, the Ghijben-Herzberg relationship is applied to compute the depth of the interface. The recipe may be applied to quite general settings, including spatially variable aquifer properties. Any standard groundwater code may be used, as long as it can simulate unconfined flow where the transmissivity is a linear function of the head. The proposed recipe is benchmarked successfully against a number of analytic and numerical solutions.
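A worked example of the recipe's two steps for a single unconfined aquifer, assuming a relative seawater density of 1.025 so that the density contrast is delta = 0.025 and the conductivity multiplier is 1 + 1/delta = 41:

    # Parameter transformation and interface depth, single unconfined aquifer.
    rho_f, rho_s = 1000.0, 1025.0
    delta = (rho_s - rho_f) / rho_f        # 0.025

    k = 10.0                               # true hydraulic conductivity, m/d
    k_equiv = k * (1 + 1 / delta)          # = 410; use this in MODFLOW,
                                           # with the aquifer base at MSL

    # After the standard constant-density model returns a freshwater head
    # h above mean sea level, Ghijben-Herzberg gives the interface depth:
    h = 0.8                                # computed head, m above MSL
    interface_depth = h / delta            # = 40 * h = 32 m below MSL
    print(k_equiv, interface_depth)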
General purpose molecular dynamics simulations fully implemented on graphics processing units
NASA Astrophysics Data System (ADS)
Anderson, Joshua A.; Lorenz, Chris D.; Travesset, A.
2008-05-01
Graphics processing units (GPUs), originally developed for rendering real-time effects in computer games, now provide unprecedented computational power for scientific applications. In this paper, we develop a general purpose molecular dynamics code that runs entirely on a single GPU. It is shown that our GPU implementation provides performance equivalent to that of a fast 30-processor-core distributed-memory cluster. Our results show that GPUs already provide an inexpensive alternative to such clusters, and we discuss the implications for the future.
Eddylicious: A Python package for turbulent inflow generation
NASA Astrophysics Data System (ADS)
Mukha, Timofey; Liefvendahl, Mattias
2018-01-01
A Python package for generating inflow for scale-resolving computer simulations of turbulent flow is presented. The purpose of the package is to unite existing inflow generation methods in a single code-base and make them accessible to users of various Computational Fluid Dynamics (CFD) solvers. The currently existing functionality consists of an accurate inflow generation method suitable for flows with a turbulent boundary layer inflow and input/output routines for coupling with the open-source CFD solver OpenFOAM.
The Analysis of Visual Motion: From Computational Theory to Neuronal Mechanisms.
1986-12-01
A simple method for computing the relativistic Compton scattering kernel for radiative transfer
NASA Technical Reports Server (NTRS)
Prasad, M. K.; Kershaw, D. S.; Beason, J. D.
1986-01-01
Correct computation of the Compton scattering kernel (CSK), defined to be the Klein-Nishina differential cross section averaged over a relativistic Maxwellian electron distribution, is reported. The CSK is analytically reduced to a single integral, which can then be rapidly evaluated using a power series expansion, asymptotic series, and rational approximation for sigma(s). The CSK calculation has application to production codes that aim at understanding certain astrophysical, laser fusion, and nuclear weapons effects phenomena.
Langevin, Christian D.; Shoemaker, W. Barclay; Guo, Weixing
2003-01-01
SEAWAT-2000 is the latest release of the SEAWAT computer program for simulation of three-dimensional, variable-density, transient ground-water flow in porous media. SEAWAT-2000 was designed by combining a modified version of MODFLOW-2000 and MT3DMS into a single computer program. The code was developed using the MODFLOW-2000 concept of a process, which is defined as "part of the code that solves a fundamental equation by a specified numerical method." SEAWAT-2000 contains all of the processes distributed with MODFLOW-2000 and also includes the Variable-Density Flow Process (as an alternative to the constant-density Ground-Water Flow Process) and the Integrated MT3DMS Transport Process. Processes may be active or inactive, depending on simulation objectives; however, not all processes are compatible. For example, the Sensitivity and Parameter Estimation Processes are not compatible with the Variable-Density Flow and Integrated MT3DMS Transport Processes. The SEAWAT-2000 computer code was tested with the common variable-density benchmark problems and also with problems representing evaporation from a salt lake and rotation of immiscible fluids.
External-Compression Supersonic Inlet Design Code
NASA Technical Reports Server (NTRS)
Slater, John W.
2011-01-01
A computer code named SUPIN has been developed to perform aerodynamic design and analysis of external-compression supersonic inlets. The baseline set of inlets includes axisymmetric pitot, two-dimensional single-duct, axisymmetric outward-turning, and two-dimensional bifurcated-duct inlets. The aerodynamic methods are based on low-fidelity analytical and numerical procedures. The geometric methods are based on planar geometry elements. SUPIN has three modes of operation: 1) generate the inlet geometry from an explicit set of geometry information, 2) size and design the inlet geometry and analyze the aerodynamic performance, and 3) compute the aerodynamic performance of a specified inlet geometry. The aerodynamic performance quantities include inlet flow rates, total pressure recovery, and drag. The geometry output from SUPIN includes inlet dimensions, cross-sectional areas, coordinates of planar profiles, and surface grids suitable for input to grid generators for analysis by computational fluid dynamics (CFD) methods. The input data file for SUPIN and the output file from SUPIN are text (ASCII) files. The surface grid files are output as formatted Plot3D or stereolithography (STL) files. SUPIN executes in batch mode and is available as a Microsoft Windows executable and Fortran95 source code with a makefile for Linux.
Large-scale structural optimization
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, J.
1983-01-01
Problems encountered by aerospace designers in attempting to optimize whole aircraft are discussed, along with possible solutions. Large-scale optimization, as opposed to component-by-component optimization, is hindered by computational cost, software inflexibility, concentration on a single design methodology rather than on trade-offs, and the incompatibility of large-scale optimization with single-program, single-computer methods. The software problem can be approached by placing the full analysis outside of the optimization loop; full analysis is then performed only periodically. Problem-dependent software can be removed from the generic code using a systems programming technique and made to embody the definitions of design variables, objective function, and design constraints. Trade-off algorithms can be used at the design points to obtain quantitative answers. Finally, decomposing the large-scale problem into independent subproblems allows systematic optimization of the problems by an organization of people and machines.
Redundant disk arrays: Reliable, parallel secondary storage. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Gibson, Garth Alan
1990-01-01
During the past decade, advances in processor and memory technology have given rise to increases in computational performance that far outstrip increases in the performance of secondary storage technology. Coupled with emerging small-disk technology, disk arrays provide the cost, volume, and capacity of current disk subsystems and, by leveraging parallelism, many times their performance. Unfortunately, arrays of small disks may have much higher failure rates than the single large disks they replace. Redundant arrays of inexpensive disks (RAID) use simple redundancy schemes to provide high data reliability. The data encoding, performance, and reliability of redundant disk arrays are investigated. Organizing redundant data into a disk array is treated as a coding problem. Among the alternatives examined, codes as simple as parity are shown to effectively correct single, self-identifying disk failures.
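The parity encoding referred to is the familiar XOR construction: one parity disk holds the bytewise XOR of the data disks, and because a disk failure is self-identifying, the lost contents are simply the XOR of all the survivors. A minimal Python sketch:

    from functools import reduce

    def parity(blocks):
        # Parity block = bytewise XOR across the given disks.
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

    disks = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    disks.append(parity(disks))            # last disk holds parity

    # Disk 1 fails; since we know WHICH disk is gone (self-identifying),
    # its contents equal the XOR of all surviving disks, parity included.
    survivors = [d for i, d in enumerate(disks) if i != 1]
    recovered = parity(survivors)
    assert recovered == b"\x10\x20"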
Accelerating calculations of RNA secondary structure partition functions using GPUs
2013-01-01
Background: RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probabilities using parallel graphics processing units (GPUs). Results: Calculation of the probabilities of base pairs in RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision and, for some hardware, restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code. Conclusions: Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision. The source code is integrated into the RNAstructure software package and available for download at http://rna.urmc.rochester.edu. PMID:24180434
Monte Carlo event generators in atomic collisions: A new tool to tackle the few-body dynamics
NASA Astrophysics Data System (ADS)
Ciappina, M. F.; Kirchner, T.; Schulz, M.
2010-04-01
We present a set of routines to produce theoretical event files, for both single and double ionization of atoms by ion impact, based on a Monte Carlo event generator (MCEG) scheme. Such event files are the theoretical counterpart of the data obtained from a kinematically complete experiment; i.e. they contain the momentum components of all collision fragments for a large number of ionization events. Among the advantages of working with theoretical event files is the possibility to incorporate the conditions present in a real experiment, such as the uncertainties in the measured quantities. Additionally, by manipulating them it is possible to generate any type of cross sections, especially those that are usually too complicated to compute with conventional methods due to a lack of symmetry. Consequently, the numerical effort of such calculations is dramatically reduced. We show examples for both single and double ionization, with special emphasis on a new data analysis tool, called four-body Dalitz plots, developed very recently. Program summary. Program title: MCEG. Catalogue identifier: AEFV_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEFV_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 2695. No. of bytes in distributed program, including test data, etc.: 18 501. Distribution format: tar.gz. Programming language: FORTRAN 77 with parallelization directives using scripting. Computer: Single machines using Linux and Linux servers/clusters (with cores of any clock speed, cache memory, and word size). Operating system: Linux (any version and flavor) with FORTRAN 77 compilers. Has the code been vectorized or parallelized?: Yes. RAM: 64-128 kBytes (the codes are very CPU intensive). Classification: 2.6. Nature of problem: The code deals with single and double ionization of atoms by ion impact. Conventional theoretical approaches aim at a direct calculation of the corresponding cross sections. This has the important shortcoming that it is difficult to account for the experimental conditions when comparing results to measured data. In contrast, the present code generates theoretical event files of the same type as are obtained in a real experiment. From these event files any type of cross sections can be easily extracted. The theoretical schemes are based on distorted wave formalisms for both processes of interest. Solution method: The codes employ a Monte Carlo event generator based on theoretical formalisms to generate event files for both single and double ionization. One of the main advantages of having access to theoretical event files is the possibility of adding the conditions present in real experiments (parameter uncertainties, environmental conditions, etc.) and of incorporating additional physics in the resulting event files (e.g. elastic scattering or other interactions absent in the underlying calculations). Additional comments: The computational time can be dramatically reduced if a large number of processors is used. Since the codes have no communication between processes, it is possible to achieve an efficiency of 100% (in practice reduced by queuing wait times). Running time: Times vary according to the process to be simulated (single or double ionization), the number of processors, and the type of theoretical model.
Typical running times range from several hours up to a few weeks.
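To make the event-file idea concrete, the following toy Python sketch (not the distributed FORTRAN 77 code; the model distribution and resolution value are invented for illustration) samples ionization events, smears them with an assumed experimental resolution, and histograms an arbitrary cross section from the same events:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "theory": sample electron momenta for single-ionization events from a
# model distribution (a stand-in for the distorted-wave calculation).
n_events = 100_000
event_file = rng.normal(0.0, 0.8, size=(n_events, 3))   # one row per event

# Fold in an assumed experimental resolution by smearing each component --
# the kind of condition a direct cross-section calculation cannot easily add.
resolution = 0.1                                          # hypothetical, a.u.
measured = event_file + rng.normal(scale=resolution, size=event_file.shape)

# Any cross section can now be histogrammed from the same events, e.g. a
# longitudinal momentum distribution (arbitrary units):
counts, edges = np.histogram(measured[:, 2], bins=60, range=(-3, 3))
dsigma_dpz = counts / n_events / np.diff(edges)
```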
Comparison Between Simulated and Experimentally Measured Performance of a Four Port Wave Rotor
NASA Technical Reports Server (NTRS)
Paxson, Daniel E.; Wilson, Jack; Welch, Gerard E.
2007-01-01
Performance and operability testing has been completed on a laboratory-scale, four-port wave rotor of the type suitable for use as a topping cycle on a gas turbine engine. Many design aspects and performance estimates for the wave rotor were determined using a time-accurate, one-dimensional, computational fluid dynamics-based simulation code developed specifically for wave rotors. The code follows a single rotor passage as it moves past the various ports, which in this reference frame become boundary conditions. This paper compares wave rotor performance predicted with the code to that measured during laboratory testing. Both on- and off-design operating conditions were examined. Overall, the match between code and rig was found to be quite good. At operating points where there were disparities, the assumption of larger than expected internal leakage rates successfully realigned code predictions and laboratory measurements. Possible mechanisms for such leakage rates are discussed.
Toward performance portability of the Albany finite element analysis code using the Kokkos library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Demeshko, Irina; Watkins, Jerry; Tezaur, Irina K.
2018-02-05
Performance portability on heterogeneous high-performance computing (HPC) systems is a major challenge faced today by code developers: parallel code needs to be executed correctly as well as with high performance on machines with different architectures, operating systems, and software libraries. The finite element method (FEM) is a popular and flexible method for discretizing partial differential equations arising in a wide variety of scientific, engineering, and industrial applications that require HPC. This paper presents some preliminary results pertaining to our development of a performance portable implementation of the FEM-based Albany code. Performance portability is achieved using the Kokkos library. We present performance results for the Aeras global atmosphere dynamical core module in Albany. Finally, numerical experiments show that our single code implementation gives reasonable performance across three multicore/many-core architectures: NVIDIA Graphics Processing Units (GPUs), Intel Xeon Phis, and multicore CPUs.
Aerodynamic-structural model of offwind yacht sails
NASA Astrophysics Data System (ADS)
Mairs, Christopher M.
An aerodynamic-structural model of offwind yacht sails was created that is useful in predicting sail forces. Two sails were examined experimentally and computationally at several wind angles to explore a variety of flow regimes. The accuracy of the numerical solutions was measured by comparing to experimental results. The two sails examined were a Code 0 and a reaching asymmetric spinnaker. During experiment, balance, wake, and sail shape data were recorded for both sails in various configurations. Two computational steps were used to evaluate the computational model. First, an aerodynamic flow model that includes viscosity effects was used to examine the experimental flying shapes that were recorded. Second, the aerodynamic model was combined with a nonlinear, structural, finite element analysis (FEA) model. The aerodynamic and structural models were used iteratively to predict final flying shapes of offwind sails, starting with the design shapes. The Code 0 has relatively low camber and is used at small angles of attack. It was examined experimentally and computationally at a single angle of attack in two trim configurations, a baseline and overtrimmed setting. Experimentally, the Code 0 was stable and maintained large flow attachment regions. The digitized flying shapes from experiment were examined in the aerodynamic model. Force area predictions matched experimental results well. When the aerodynamic-structural tool was employed, the predictive capability was slightly worse. The reaching asymmetric spinnaker has higher camber and operates at higher angles of attack than the Code 0. Experimentally and computationally, it was examined at two angles of attack. Like the Code 0, at each wind angle, baseline and overtrimmed settings were examined. Experimentally, sail oscillations and large flow detachment regions were encountered. The computational analysis began by examining the experimental flying shapes in the aerodynamic model. In the baseline setting, the computational force predictions were fair at both wind angles examined. Force predictions were much improved in the overtrimmed setting when the sail was highly stalled and more stable. The same trends in force prediction were seen when employing the aerodynamic-structural model. Predictions were good to fair in the baseline setting but improved in the overtrimmed configuration.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jared Stimson
FORENSIC ANALYSIS OF WINDOWS® VIRTUAL MEMORY INCORPORATING THE SYSTEM'S PAGEFILE. Computer forensics is concerned with the use of computer investigation and analysis techniques in order to collect evidence suitable for presentation in court. The examination of volatile memory is a relatively new but important area in computer forensics. More recently, criminals are becoming more forensically aware and are now able to compromise computers without accessing the hard disk of the target computer. This means that the traditional incident response practice of pulling the plug will destroy the only evidence of the crime. While some techniques are available for acquiring the contents of main memory, few exist which can analyze these data in a meaningful way. One reason for this is how memory is managed by the operating system. Data belonging to one process can be distributed arbitrarily across physical memory or the hard disk, making it very difficult to recover useful information. This report will focus on how these disparate sources of information can be combined to give a single, contiguous address space for each process. Using address translation, a tool is developed to reconstruct the virtual address space of a process by combining a physical memory dump with the page-file on the hard disk. COUNTERINTELLIGENCE THROUGH MALICIOUS CODE ANALYSIS. As computer network technology continues to grow, so does the reliance on this technology for everyday business functionality. To appeal to customers and employees alike, businesses are seeking an increased online presence, and to increase productivity the same businesses are computerizing their day-to-day operations. The combination of a publicly accessible interface to the business's network and the increase in the amount of intellectual property present on these networks presents serious risks. All of this intellectual property now faces constant attacks from a wide variety of malicious software that is intended to uncover company and government secrets. Every year billions of dollars are invested in preventing and recovering from the introduction of malicious code into a system. However, there is little research being done on leveraging these attacks for counterintelligence opportunities. With the ever-increasing number of vulnerable computers on the Internet, the task of attributing these attacks to an organization or a single person is a daunting one. This thesis will demonstrate the idea of intentionally running a piece of malicious code in a secure environment in order to gain counterintelligence on an attacker.
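The reconstruction step described above is, at its core, page-table-driven address translation against two backing stores. A toy Python sketch of the idea (the page-table format here is invented for illustration and is not the real Windows PTE layout):

```python
PAGE = 4096

def read_page(entry, ram_dump, pagefile):
    """Fetch one page via a toy page-table entry: ('ram'|'swap', frame) or
    None for a page that was never allocated."""
    if entry is None:
        return b"\x00" * PAGE
    location, frame = entry
    source = ram_dump if location == "ram" else pagefile
    return source[frame * PAGE:(frame + 1) * PAGE]

def rebuild_address_space(page_table, ram_dump, pagefile):
    """Concatenate the pages into one contiguous per-process address space."""
    return b"".join(read_page(e, ram_dump, pagefile) for e in page_table)

# Toy data: 3 RAM frames, 2 pagefile frames; virtual pages 0..2 of a process
# live in RAM frame 1, pagefile frame 0, and nowhere, respectively.
ram_dump = bytes([0, 1, 2]) * PAGE
pagefile = b"AB" * PAGE
table = [("ram", 1), ("swap", 0), None]
space = rebuild_address_space(table, ram_dump, pagefile)
assert len(space) == 3 * PAGE
```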
Multi-Core Processor Memory Contention Benchmark Analysis Case Study
NASA Technical Reports Server (NTRS)
Simon, Tyler; McGalliard, James
2009-01-01
Multi-core processors dominate current mainframe, server, and high performance computing (HPC) systems. This paper provides synthetic kernel and natural benchmark results from an HPC system at the NASA Goddard Space Flight Center that illustrate the performance impacts of multi-core (dual- and quad-core) vs. single-core processor systems. Analysis of processor design, application source code, and synthetic and natural test results all indicate that multi-core processors can suffer from significant memory subsystem contention compared to similar single-core processors.
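Contention of this kind can be exposed with a memory-bound microbenchmark in the spirit of the synthetic kernels mentioned above. A hedged Python sketch (array size and worker counts are arbitrary choices, not those of the study): per-worker copy bandwidth that drops as concurrent workers are added points to contention in the shared memory subsystem.

```python
import time
import numpy as np
from multiprocessing import Pool

N = 20_000_000  # doubles per worker; large enough to defeat the caches

def copy_kernel(_):
    """Memory-bound kernel: stream one large array through the memory system."""
    src = np.ones(N, dtype=np.float64)
    t0 = time.perf_counter()
    dst = src.copy()
    dt = time.perf_counter() - t0
    return (src.nbytes + dst.nbytes) / dt / 1e9   # GB/s, read + write

if __name__ == "__main__":
    for workers in (1, 2, 4):
        with Pool(workers) as pool:
            rates = pool.map(copy_kernel, range(workers))
        print(f"{workers} workers: {sum(rates) / len(rates):.1f} GB/s per worker")
```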
NASA Astrophysics Data System (ADS)
Sanna, N.; Baccarelli, I.; Morelli, G.
2009-12-01
SCELib is a computer program which implements the Single Center Expansion (SCE) method to describe molecular electronic densities and the interaction potentials between a charged projectile (electron or positron) and a target molecular system. The first version (CPC Catalog identifier ADMG_v1_0) was submitted to the CPC Program Library in 2000, and version 2.0 (ADMG_v2_0) was submitted in 2004. We here announce the new release 3.0, which presents additional features with respect to the previous versions, aiming at significantly enhancing its capabilities to deal with larger molecular systems. SCELib 3.0 allows for ab initio effective core potential (ECP) calculations of the molecular wavefunctions to be used in the SCE method in addition to the standard all-electron description of the molecule. The list of supported architectures has been updated and the code has been ported to platforms based on accelerating coprocessors, such as the NVIDIA GPGPU, and the new parallel model adopted is able to efficiently run on a mixed many-core computing system. Program summary. Program title: SCELib3.0. Catalogue identifier: ADMG_v3_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADMG_v3_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 2 018 862. No. of bytes in distributed program, including test data, etc.: 4 955 014. Distribution format: tar.gz. Programming language: C. Compilers used: xlc V8.x, Intel C V10.x, Portland Group V7.x, nvcc V2.x. Computer: All SMP platforms based on AIX, Linux and SUNOS operating systems over SPARC, POWER, Intel Itanium2, X86, em64t and Opteron processors. Operating system: SUNOS, IBM AIX, Linux RedHat (Enterprise), Linux SuSE (SLES). Has the code been vectorized or parallelized?: Yes, 1 to 32 (CPU or GPU) used. RAM: Up to 32 GB depending on the molecular system and runtime parameters. Classification: 16.5. Catalogue identifier of previous version: ADMG_v2_0. Journal reference of previous version: Comput. Phys. Comm. 162 (2004) 51. External routines: CUDA libraries (SDK V2.x). Does the new version supersede the previous version?: Yes. Nature of problem: In this set of codes an efficient procedure is implemented to describe the wavefunction and related molecular properties of a polyatomic molecular system within the Single Center of Expansion (SCE) approximation. The resulting SCE wavefunction, electron density, and electrostatic and correlation/polarization potentials can then be used in a wide variety of applications, such as electron-molecule scattering calculations, quantum chemistry studies, biomodelling and drug design. Solution method: The polycentric Hartree-Fock solution for a molecule of arbitrary geometry, based on linear combinations of Gaussian-Type Orbitals (GTOs), is expanded over a single center, typically the Center Of Mass (C.O.M.), by means of a Gauss-Legendre/Chebyshev quadrature over the θ,φ angular coordinates. The resulting SCE numerical wavefunction is then used to calculate the one-particle electron density, the electrostatic potential, and two different models for the correlation/polarization potentials induced by the impinging electron, which have the correct asymptotic behavior for the leading dipole molecular polarizabilities.
Reasons for new version: The present release of SCELib allows the study of larger molecular systems with respect to the previous versions by means of theoretical and technological advances, with the first implementation of the code over a many-core computing system. Summary of revisions: The major features added with respect to SCELib Version 2.0 are as follows. Molecular wavefunctions obtained via the Los Alamos (Hay and Wadt) LAN ECP plus DZ description of the inner-shell electrons (on Na-La, Hf-Bi elements) [1] can now be single-center-expanded; the addition required modifications of (i) the filtering code readgau, (ii) the main reading function setinp, (iii) the sphint code (including changes to the CalcMO code), (iv) the densty code, and (v) the vst code. The classes of platforms supported now include two more architectures based on accelerated coprocessors: the NVIDIA G-series GPGPU and the ClearSpeed e720 (the ClearSpeed version is experimental, an initial preliminary porting of the sphint() function not intended for production runs; see the code documentation for additional detail). A single-precision representation for real numbers in the SCE mapping of the GTOs (sphint code) has been implemented in the new code. The Ih symmetry point group for the molecular systems has been added to those already allowed in the SCE procedure. The orientation of the molecular axis system for the Cs (planar) symmetry has been changed in accord with the standard orientation adopted by the latest version of the quantum chemistry code (Gaussian 03 [2]) used to generate the input multi-centre molecular wavefunctions (z-axis perpendicular to the symmetry plane); the abelian subgroup for the Cs point group has been changed from C1 to Cs. Atomic basis functions including g-type GTOs can now be single-center-expanded. Restrictions: Depending on the molecular system under study and on the operating conditions, the program may or may not fit into available RAM memory. In such cases a feature of the program is to memory-map a disk file in order to efficiently access the memory data through a disk device. The parallel GP-GPU implementation limits the number of CPU threads to the number of GPU cores present. Running time: The execution time strongly depends on the molecular target description and on the hardware/OS chosen; it is directly proportional to the (r,θ,φ) grid size and to the number of angular basis functions used. Thus, from the program printout of the main arrays' memory occupancy, the user can approximately derive the expected computer time needed for a given calculation executed in serial mode. For parallel executions the overall efficiency must be further taken into account, and this depends on the number of processors used as well as on the parallel architecture chosen, so a simple general law cannot at present be determined. References: [1] P.J. Hay, W.R. Wadt, J. Chem. Phys. 82 (1985) 270; W.R. Wadt, P.J. Hay, J. Chem. Phys. 82 (1985) 284; P.J. Hay, W.R. Wadt, J. Chem. Phys. 82 (1985) 299. [2] M.J. Frisch et al., Gaussian 03, revision C.02, Gaussian, Inc., Wallingford, CT, 2004.
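The angular quadrature at the heart of the SCE procedure can be illustrated in miniature. The following Python sketch (a toy axially symmetric m = 0 case, not SCELib itself) projects an off-center Gaussian onto Legendre polynomials with Gauss-Legendre quadrature and verifies that re-summing the single-center expansion reproduces the function:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def f(cos_theta, r=1.0, d=0.5, alpha=1.2):
    """Toy axially symmetric density: an s-type Gaussian centered a distance
    d off the expansion center, sampled on a sphere of radius r."""
    r2 = r**2 + d**2 - 2.0 * r * d * cos_theta    # |r*rhat - d*zhat|^2
    return np.exp(-alpha * r2)

lmax = 12
x, w = leggauss(64)                               # quadrature in cos(theta)

# Single-center coefficients c_l = (2l+1)/2 * Integral f(x) P_l(x) dx
coeffs = [(2 * l + 1) / 2.0 * np.sum(w * f(x) * legval(x, np.eye(l + 1)[l]))
          for l in range(lmax + 1)]

# Re-summing the expansion reproduces the original function at the nodes.
recon = sum(c * legval(x, np.eye(l + 1)[l]) for l, c in enumerate(coeffs))
assert np.allclose(recon, f(x), atol=1e-6)
```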
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chien, T.H.; Domanus, H.M.; Sha, W.T.
1993-02-01
The COMMIX-PPC computer program is an extended and improved version of earlier COMMIX codes and is specifically designed for evaluating the thermal performance of power plant condensers. The COMMIX codes are general-purpose computer programs for the analysis of fluid flow and heat transfer in complex industrial systems. In COMMIX-PPC, two major features have been added to previously published COMMIX codes. One feature is the incorporation of one-dimensional equations of conservation of mass, momentum, and energy on the tube side, and the proper accounting for the thermal interaction between shell and tube side through the porous-medium approach. The other added feature is the extension of the three-dimensional conservation equations for shell-side flow to treat the flow of a multicomponent medium. COMMIX-PPC is designed to perform steady-state and transient, three-dimensional analysis of fluid flow with heat transfer in a power plant condenser. However, the code is designed in a generalized fashion so that, with some modification, it can be used to analyze processes in any heat exchanger or other single-phase engineering applications. Volume I (Equations and Numerics) of this report describes in detail the basic equations, formulation, solution procedures, and models for the phenomena. Volume II (User's Guide and Manual) contains the input instructions, flow charts, sample problems, and descriptions of available options and boundary conditions.
NASA Astrophysics Data System (ADS)
Fabien-Ouellet, Gabriel; Gloaguen, Erwan; Giroux, Bernard
2017-03-01
Full Waveform Inversion (FWI) aims at recovering the elastic parameters of the Earth by matching recordings of the ground motion with the direct solution of the wave equation. Modeling the wave propagation for realistic scenarios is computationally intensive, which limits the applicability of FWI. The current hardware evolution brings increasing parallel computing power that can speed up the computations in FWI. However, to take advantage of the diversity of parallel architectures presently available, new programming approaches are required. In this work, we explore the use of OpenCL to develop a portable code that can take advantage of the many parallel processor architectures now available. We present a program called SeisCL for 2D and 3D viscoelastic FWI in the time domain. The code computes the forward and adjoint wavefields using finite differences and outputs the gradient of the misfit function given by the adjoint state method. To demonstrate the code portability on different architectures, the performance of SeisCL is tested on three different devices: Intel CPUs, NVidia GPUs, and Intel Xeon Phi. Results show that the use of GPUs with OpenCL can speed up the computations by nearly two orders of magnitude over a single-threaded application on the CPU. Although OpenCL allows code portability, we show that some device-specific optimization is still required to get the best performance out of a specific architecture. Using OpenCL in conjunction with MPI allows the domain decomposition of large models on several devices located on different nodes of a cluster. For large enough models, the speedup of the domain decomposition varies quasi-linearly with the number of devices. Finally, we investigate two different approaches to compute the gradient by the adjoint state method and show the significant advantages of using OpenCL for FWI.
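The forward-modeling kernel that dominates such FWI workloads is a stencil update of exactly the kind that maps well to GPUs. A minimal NumPy sketch of time-domain finite-difference wave propagation (1D constant-density acoustic, second order, with an assumed Ricker source; SeisCL itself is 2D/3D viscoelastic in OpenCL):

```python
import numpy as np

# 1-D constant-density acoustic wave equation, 2nd-order FD in space/time:
# u_tt = c^2 u_xx + s(t) at the source node.
nx, nt, dx, dt = 400, 900, 5.0, 5e-4
c = np.full(nx, 2000.0)                    # m/s; toy velocity model
assert (c * dt / dx).max() < 1.0           # CFL stability condition

t = np.arange(nt) * dt
f0 = 25.0                                  # Hz; assumed Ricker peak frequency
wavelet = (1 - 2 * (np.pi * f0 * (t - 0.04))**2) \
          * np.exp(-(np.pi * f0 * (t - 0.04))**2)

u_prev, u = np.zeros(nx), np.zeros(nx)
src, rec = 50, 350
seismogram = np.zeros(nt)                  # the recording FWI tries to match
for it in range(nt):
    lap = np.zeros(nx)
    lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
    u_prev, u = u, 2 * u - u_prev + (c * dt)**2 * lap
    u[src] += wavelet[it] * dt**2
    seismogram[it] = u[rec]
```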
A CFD Study of Turbojet and Single-Throat Ramjet Ejector Interaction
NASA Technical Reports Server (NTRS)
Chang, Ing; Hunter, Louis
1996-01-01
Supersonic ejector-diffuser systems have application in driving an advanced airbreathing propulsion system, consisting of turbojet engines acting as the primary and a single throat ramjet acting as the secondary. The turbojet engines are integrated into the single throat ramjet to minimize variable geometry and eliminate redundant propulsion components. The result is a simple, lightweight system that is operable from takeoff to high Mach numbers. At this high Mach number (approximately Mach 3.0), the turbojets are turned off and the high-speed ramjet/scramjet takes over and drives the vehicle to Mach 6.0. The turbojet-ejector-ramjet system consists of nonafterburning turbojet engines with ducting canted at 20 degrees to supply supersonic flow (downstream of the CD nozzle) to the horizontal ramjet duct at a given supply total pressure and temperature. Two conditions were modelled by a 2-D full Navier-Stokes code at Mach 2.0. The code modelled the Fabri choke as well as the non-Fabri, noncritical case, using a computational throat to supply the back pressure. The results, which primarily predict the secondary mass flow rate and the mixed conditions at the ejector exit, were in reasonable agreement with the 1-D cycle code (TBCC).
Execution of a parallel edge-based Navier-Stokes solver on commodity graphics processor units
NASA Astrophysics Data System (ADS)
Corral, Roque; Gisbert, Fernando; Pueblas, Jesus
2017-02-01
The implementation of an edge-based three-dimensional Reynolds-Averaged Navier-Stokes solver for unstructured grids able to run on multiple graphics processing units (GPUs) is presented. Loops over edges, which are the most time-consuming part of the solver, have been written to exploit the massively parallel capabilities of GPUs. Non-blocking communications between parallel processes and between the GPU and the central processing unit (CPU) have been used to enhance code scalability. The code is written using a mixture of C++ and OpenCL, to allow the execution of the source code on GPUs. The Message Passing Interface (MPI) library is used to allow the parallel execution of the solver on multiple GPUs. A comparative study of the solver parallel performance is carried out using a cluster of CPUs and another of GPUs. It is shown that a single GPU is up to 64 times faster than a single CPU core. The parallel scalability of the solver is mainly degraded due to the loss of computing efficiency of the GPU when the size of the case decreases. However, for large enough grid sizes, the scalability is strongly improved. A cluster featuring commodity GPUs and a high bandwidth network is ten times less costly and consumes 33% less energy than a CPU-based cluster with an equivalent computational power.
The path toward HEP High Performance Computing
NASA Astrophysics Data System (ADS)
Apostolakis, John; Brun, René; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro
2014-06-01
High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of peak. Although several successful attempts have been made to port selected codes on GPUs, no major HEP code suite has a "High Performance" implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try making the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single-threaded version, together with sub-optimal handling of event processing tails. Besides this, the low level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit best from the recent technology evolution in computing.
Multi-stage decoding for multi-level block modulation codes
NASA Technical Reports Server (NTRS)
Lin, Shu
1991-01-01
In this paper, we investigate various types of multi-stage decoding for multi-level block modulation codes, in which the decoding of a component code at each stage can be either soft-decision or hard-decision, maximum likelihood or bounded-distance. Error performance of codes is analyzed for a memoryless additive channel based on various types of multi-stage decoding, and upper bounds on the probability of an incorrect decoding are derived. Based on our study and computation results, we find that, if component codes of a multi-level modulation code and types of decoding at various stages are chosen properly, high spectral efficiency and large coding gain can be achieved with reduced decoding complexity. In particular, we find that the difference in performance between the suboptimum multi-stage soft-decision maximum likelihood decoding of a modulation code and the single-stage optimum decoding of the overall code is very small: only a fraction of a dB loss in SNR at a probability of an incorrect decoding of a block of 10^(-6). Multi-stage decoding of multi-level modulation codes really offers a way to achieve the best of three worlds: bandwidth efficiency, coding gain, and decoding complexity.
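The staging idea can be shown on a toy two-level 4-ASK scheme with set-partition labeling: the least significant bit, which selects the widely spaced subset, carries a stronger component code and is decoded first; the second stage then decides the remaining bit within the chosen subset. The code parameters below are invented for illustration and are far simpler than the block modulation codes analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def amp(b1, b0):
    """4-ASK with set-partition labeling: b0 selects the widely spaced
    subset {-3,+1} or {-1,+3}; b1 selects the point within the subset."""
    return 2.0 * (2 * b1 + b0) - 3.0

def encode(u0, u1):
    """Level 0 (LSB): (3,1) repetition code on bit u0; level 1: 3 uncoded bits."""
    return np.array([amp(b1, u0) for b1 in u1])

def decode(y):
    """Stage 1: soft-decision ML decoding of the LSB repetition code.
    Stage 2: per-symbol hard decision on the MSB, reusing the stage 1 result."""
    metric = [sum(min((yi - amp(b1, b0)) ** 2 for b1 in (0, 1)) for yi in y)
              for b0 in (0, 1)]
    u0 = int(metric[1] < metric[0])
    u1 = [int((yi - amp(1, u0)) ** 2 < (yi - amp(0, u0)) ** 2) for yi in y]
    return u0, u1

# One block over an AWGN channel:
u0, u1 = 1, [0, 1, 1]
y = encode(u0, u1) + rng.normal(scale=0.5, size=3)
print(decode(y))   # recovers (1, [0, 1, 1]) with high probability
```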
Multiphysics Code Demonstrated for Propulsion Applications
NASA Technical Reports Server (NTRS)
Lawrence, Charles; Melis, Matthew E.
1998-01-01
The utility of multidisciplinary analysis tools for aeropropulsion applications is being investigated at the NASA Lewis Research Center. The goal of this project is to apply Spectrum, a multiphysics code developed by Centric Engineering Systems, Inc., to simulate multidisciplinary effects in turbomachinery components. Many engineering problems today involve detailed computer analyses to predict the thermal, aerodynamic, and structural response of a mechanical system as it undergoes service loading. Analysis of aerospace structures generally requires attention in all three disciplinary areas to adequately predict component service behavior, and in many cases, the results from one discipline substantially affect the outcome of the other two. There are numerous computer codes currently available in the engineering community to perform such analyses in each of these disciplines. Many of these codes are developed and used in-house by a given organization, and many are commercially available. However, few, if any, of these codes are designed specifically for multidisciplinary analyses. The Spectrum code has been developed for performing fully coupled fluid, thermal, and structural analyses on a mechanical system with a single simulation that accounts for all simultaneous interactions, thus eliminating the requirement for running a large number of sequential, separate, disciplinary analyses. The Spectrum code has a true multiphysics analysis capability, which improves analysis efficiency as well as accuracy. Centric Engineering, Inc., working with a team of Lewis and AlliedSignal Engines engineers, has been evaluating Spectrum for a variety of propulsion applications including disk quenching, drum cavity flow, aeromechanical simulations, and a centrifugal compressor flow simulation.
Development of a dynamic coupled hydro-geomechanical code and its application to induced seismicity
NASA Astrophysics Data System (ADS)
Miah, Md Mamun
This research describes the importance of hydro-geomechanical coupling in the geologic subsurface environment arising from fluid injection at geothermal plants, large-scale geological CO2 sequestration for climate mitigation, enhanced oil recovery, and hydraulic fracturing during well construction in the oil and gas industries. A sequential computational code is developed to capture the multiphysics interaction behavior by linking a flow simulation code, TOUGH2, and a geomechanics modeling code, PyLith. The numerical formulation of each code is discussed to demonstrate their modeling capabilities. The computational framework involves sequential coupling and the solution of two sub-problems: fluid flow through fractured and porous media, and reservoir geomechanics. For each time step of the flow calculation, the pressure field is passed to the geomechanics code to compute the effective stress field and fault slips. A simplified permeability model is implemented in the code that accounts for the permeability of porous and saturated rocks subject to confining stresses. The accuracy of the TOUGH-PyLith coupled simulator is tested by simulating Terzaghi's 1D consolidation problem. The modeling capability of coupled poroelasticity is validated by benchmarking it against Mandel's problem. The code is used to simulate both quasi-static and dynamic earthquake nucleation and slip distribution on a fault from the combined effect of far-field tectonic loading and fluid injection, using an appropriate fault constitutive friction model. Results from the quasi-static induced earthquake simulations show a delayed response in earthquake nucleation. This is attributed to the increased total stress in the domain and to not accounting for pressure on the fault. However, this issue is resolved in the final chapter in simulating a single-event earthquake dynamic rupture. Simulation results show that fluid pressure has a positive effect on slip nucleation and subsequent crack propagation. This is confirmed by a sensitivity analysis showing that an increase in injection well distance results in delayed slip nucleation and rupture propagation on the fault.
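The sequential coupling described above can be sketched schematically: each flow step hands its pressure field to the mechanics step, which forms Biot effective stresses and checks a Coulomb friction criterion on the fault. The following toy Python loop uses stand-ins for TOUGH2 and PyLith; all parameter values are invented for illustration:

```python
import numpy as np

def flow_step(p, dt, k, q):
    """Stand-in for the flow code (TOUGH2's role): explicit 1-D pressure
    diffusion with an injection source term q."""
    lap = np.zeros_like(p)
    lap[1:-1] = p[2:] - 2.0 * p[1:-1] + p[:-2]
    return p + dt * (k * lap + q)

def geomech_step(p_fault, sigma_n, tau, alpha=0.8, mu_f=0.6):
    """Stand-in for the geomechanics code (PyLith's role): Biot effective
    normal stress on the fault and a Coulomb friction check."""
    sigma_eff = sigma_n - alpha * p_fault
    return sigma_eff, tau > mu_f * sigma_eff

# Sequential coupling: every flow step hands its pressure to the mechanics step.
p = np.zeros(101)
q = np.zeros(101); q[50] = 0.5            # injector at the fault location
sigma_n, tau = 10.0, 5.0                   # toy far-field load (e.g. MPa)
for step in range(2000):
    p = flow_step(p, dt=0.2, k=1.0, q=q)
    sigma_eff, slipping = geomech_step(p[50], sigma_n, tau)
    if slipping:
        print(f"Coulomb failure on the fault at step {step}")
        break
```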
Shapiro, Matthew L.
2017-01-01
Memory can inform goal-directed behavior by linking current opportunities to past outcomes. The orbitofrontal cortex (OFC) may guide value-based responses by integrating the history of stimulus–reward associations into expected outcomes, representations of predicted hedonic value and quality. Alternatively, the OFC may rapidly compute flexible “online” reward predictions by associating stimuli with the latest outcome. OFC neurons develop predictive codes when rats learn to associate arbitrary stimuli with outcomes, but the extent to which predictive coding depends on most recent events and the integrated history of rewards is unclear. To investigate how reward history modulates OFC activity, we recorded OFC ensembles as rats performed spatial discriminations that differed only in the number of rewarded trials between goal reversals. The firing rate of single OFC neurons distinguished identical behaviors guided by different goals. When >20 rewarded trials separated goal switches, OFC ensembles developed stable and anticorrelated population vectors that predicted overall choice accuracy and the goal selected in single trials. When <10 rewarded trials separated goal switches, OFC population vectors decorrelated rapidly after each switch, but did not develop anticorrelated firing patterns or predict choice accuracy. The results show that, whereas OFC signals respond rapidly to contingency changes, they predict choices only when reward history is relatively stable, suggesting that consecutive rewarded episodes are needed for OFC computations that integrate reward history into expected outcomes. SIGNIFICANCE STATEMENT Adapting to changing contingencies and making decisions engages the orbitofrontal cortex (OFC). Previous work shows that OFC function can either improve or impair learning depending on reward stability, suggesting that OFC guides behavior optimally when contingencies apply consistently. The mechanisms that link reward history to OFC computations remain obscure. Here, we examined OFC unit activity as rodents performed tasks controlled by contingencies that varied reward history. When contingencies were stable, OFC neurons signaled past, present, and pending events; when contingencies were unstable, past and present coding persisted, but predictive coding diminished. The results suggest that OFC mechanisms require stable contingencies across consecutive episodes to integrate reward history, represent predicted outcomes, and inform goal-directed choices. PMID:28115481
Time-Shifted Boundary Conditions Used for Navier-Stokes Aeroelastic Solver
NASA Technical Reports Server (NTRS)
Srivastava, Rakesh
1999-01-01
Under the Advanced Subsonic Technology (AST) Program, an aeroelastic analysis code (TURBO-AE) based on Navier-Stokes equations is currently under development at NASA Lewis Research Center's Machine Dynamics Branch. For a blade row, aeroelastic instability can occur in any of the possible interblade phase angles (IBPAs). Analyzing small IBPAs is very computationally expensive because a large number of blade passages must be simulated. To reduce the computational cost of these analyses, we used time-shifted, or phase-lagged, boundary conditions in the TURBO-AE code. These conditions can be used to reduce the computational domain to a single blade passage by requiring the boundary conditions across the passage to be lagged depending on the IBPA being analyzed. The time-shifted boundary conditions currently implemented are based on the direct-store method. This method requires large amounts of data to be stored over a period of the oscillation cycle. On CRAY computers this is not a major problem because solid-state devices can be used for fast input and output to read and write the data onto a disk instead of storing it in core memory.
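A minimal sketch of the direct-store idea, assuming a fixed number of time steps per oscillation cycle: boundary data for the single simulated passage are recorded in a circular buffer over one period and read back shifted by the interblade phase angle. The buffer layout and parameter values are illustrative, not TURBO-AE's actual storage scheme:

```python
import numpy as np

n_per_period = 360               # time steps per blade oscillation cycle
ibpa_deg = 90                    # interblade phase angle under analysis
lag = round(ibpa_deg / 360 * n_per_period)

history = np.zeros(n_per_period)         # circular buffer ("direct store")

def lagged_boundary(step, current_value):
    """Store this step's boundary data; return the value from `lag` steps
    earlier. Before one full period is stored, the buffer's initial state
    is returned, mirroring the physical startup transient."""
    history[step % n_per_period] = current_value
    return history[(step - lag) % n_per_period]

# Toy march: the periodic boundary signal comes back shifted by the IBPA.
for step in range(3 * n_per_period):
    u = np.sin(2 * np.pi * step / n_per_period)   # stand-in boundary data
    u_opposite = lagged_boundary(step, u)
```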
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greenough, Jeffrey A.; de Supinski, Bronis R.; Yates, Robert K.
2005-04-25
We describe the performance of the block-structured Adaptive Mesh Refinement (AMR) code Raptor on the 32k node IBM BlueGene/L computer. This machine represents a significant step forward towards petascale computing. As such, it presents Raptor with many challenges for utilizing the hardware efficiently. In terms of performance, Raptor shows excellent weak and strong scaling when running in single level mode (no adaptivity). Hardware performance monitors show Raptor achieves an aggregate performance of 3.0 Tflops in the main integration kernel on the 32k system. Results from preliminary AMR runs on a prototype astrophysical problem demonstrate the efficiency of the current software when running at large scale. The BG/L system is enabling a physics problem to be considered that represents a factor of 64 increase in overall size compared to the largest ones of this type computed to date. Finally, we provide a description of the development work currently underway to address our inefficiencies.
Experimental QR code optical encryption: noise-free data recovering.
Barrera, John Fredy; Mira-Agudelo, Alejandro; Torroba, Roberto
2014-05-15
We report, to our knowledge for the first time, the experimental implementation of a quick response (QR) code as a "container" in an optical encryption system. A joint transform correlator architecture in an interferometric configuration is chosen as the experimental scheme. As the implementation is not possible in a single step, a multiplexing procedure to encrypt the QR code of the original information is applied. Once the QR code is correctly decrypted, the speckle noise present in the recovered QR code is eliminated by a simple digital procedure. Finally, the original information is retrieved completely free of any kind of degradation after reading the QR code. Additionally, we propose and implement a new protocol in which the reception of the encrypted QR code and its decryption, the digital block processing, and the reading of the decrypted QR code are performed employing only one device (smartphone, tablet, or computer). The overall method proves to produce an outcome attractive enough to make the adoption of the technique a plausible option. Experimental results are presented to demonstrate the practicality of the proposed security system.
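The abstract does not spell out the digital speckle-cleanup procedure, but a plausible stand-in is median filtering followed by re-binarization, since QR modules are binary and the symbology's built-in error correction absorbs what little damage remains. A hedged Python sketch (filter size, threshold rule, and noise model are all assumptions for illustration):

```python
import numpy as np
from scipy.ndimage import median_filter

def clean_decrypted_qr(img):
    """Suppress speckle in a decrypted QR image, then re-binarize it.
    A hypothetical stand-in for the paper's digital block processing."""
    smoothed = median_filter(img.astype(float), size=3)
    threshold = smoothed.mean()          # QR images are roughly bimodal
    return (smoothed > threshold).astype(np.uint8)

# Toy demonstration: a binary pattern corrupted by multiplicative speckle.
rng = np.random.default_rng(3)
qr = rng.integers(0, 2, (64, 64)).repeat(4, axis=0).repeat(4, axis=1)
speckled = qr * rng.exponential(1.0, qr.shape)       # speckle-like noise
recovered = clean_decrypted_qr(speckled)
print("module error rate:", np.mean(recovered != qr))
```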
NASA Astrophysics Data System (ADS)
van Heerwaarden, Chiel C.; van Stratum, Bart J. H.; Heus, Thijs; Gibbs, Jeremy A.; Fedorovich, Evgeni; Mellado, Juan Pedro
2017-08-01
This paper describes MicroHH 1.0, a new and open-source (www.microhh.org) computational fluid dynamics code for the simulation of turbulent flows in the atmosphere. It is primarily made for direct numerical simulation but also supports large-eddy simulation (LES). The paper covers the description of the governing equations, their numerical implementation, and the parameterizations included in the code. Furthermore, the paper presents the validation of the dynamical core in the form of convergence and conservation tests, and comparison of simulations of channel flows and slope flows against well-established test cases. The full numerical model, including the associated parameterizations for LES, has been tested for a set of cases under stable and unstable conditions, under the Boussinesq and anelastic approximations, and with dry and moist convection under stationary and time-varying boundary conditions. The paper presents performance tests showing good scaling from 256 to 32 768 processes. The graphical processing unit (GPU)-enabled version of the code can reach a speedup of more than an order of magnitude for simulations that fit in the memory of a single GPU.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wasserman, H.J.
1996-02-01
The second generation of the Digital Equipment Corp. (DEC) DECchip Alpha AXP microprocessor is referred to as the 21164. From the viewpoint of numerically-intensive computing, the primary difference between it and its predecessor, the 21064, is that the 21164 has twice the multiply/add throughput per clock period (CP): a maximum of two floating point operations (FLOPS) per CP vs. one for the 21064. The AlphaServer 8400 is a shared-memory multiprocessor server system that can accommodate up to 12 CPUs and up to 14 GB of memory. In this report we compare single processor performance of the 8400 system with that of the International Business Machines Corp. (IBM) RISC System/6000 POWER-2 microprocessor running at 66 MHz, the Silicon Graphics, Inc. (SGI) MIPS R8000 microprocessor running at 75 MHz, and the Cray Research, Inc. CRAY J90. The performance comparison is based on a set of Fortran benchmark codes that represent a portion of the Los Alamos National Laboratory supercomputer workload. The advantage of using these codes is that they span a wide range of computational characteristics, such as vectorizability, problem size, and memory access pattern. The primary disadvantage of using them is that detailed, quantitative analysis of performance behavior of all codes on all machines is difficult. One important addition to the benchmark set appears for the first time in this report. Whereas the older version was written for a vector processor, the newer version is more optimized for microprocessor architectures. Therefore, we have for the first time an opportunity to measure performance on a single application using implementations that expose the respective strengths of vector and superscalar architectures. All results in this report are from single processors. A subsequent article will explore shared-memory multiprocessing performance of the 8400 system.
Recent Progress in the Development of a Multi-Layer Green's Function Code for Ion Beam Transport
NASA Technical Reports Server (NTRS)
Tweed, John; Walker, Steven A.; Wilson, John W.; Tripathi, Ram K.
2008-01-01
To meet the challenge of future deep space programs, an accurate and efficient engineering code for analyzing the shielding requirements against high-energy galactic heavy radiation is needed. To address this need, a new Green's function code capable of simulating high charge and energy ions with either laboratory or space boundary conditions is currently under development. The computational model consists of combinations of physical perturbation expansions based on the scales of atomic interaction, multiple scattering, and nuclear reactive processes with use of the Neumann-asymptotic expansions with non-perturbative corrections. The code contains energy loss due to straggling, nuclear attenuation, nuclear fragmentation with energy dispersion and downshifts. Previous reports show that the new code accurately models the transport of ion beams through a single slab of material. Current research efforts are focused on enabling the code to handle multiple layers of material and the present paper reports on progress made towards that end.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sprague, Michael A.
Enabled by petascale supercomputing, the next generation of computer models for wind energy will simulate a vast range of scales and physics, spanning from turbine structural dynamics and blade-scale turbulence to mesoscale atmospheric flow. A single model covering all scales and physics is not feasible. Thus, these simulations will require the coupling of different models/codes, each for different physics, interacting at their domain boundaries.
Computational Ion Optics Design Evaluations
NASA Technical Reports Server (NTRS)
Malone, Shane P.; Soulas, George C.
2004-01-01
Ion optics computational models are invaluable tools in the design of ion optics systems. In this study a new computational model developed by an outside vendor for use at the NASA Glenn Research Center (GRC) is presented. This computational model is a gun code that has been modified to model the plasma sheaths both upstream and downstream of the ion optics. The model handles multiple species (e.g. singly and doubly-charged ions) and includes a charge-exchange model to support erosion estimations. The model uses commercially developed solid design and meshing software to allow high flexibility in ion optics geometric configurations. The results from this computational model are applied to the NEXT project to investigate the effects of crossover impingement erosion seen during the 2000-hour wear test.
A Numerical Study of the Effects of Curvature and Convergence on Dilution Jet Mixing
NASA Technical Reports Server (NTRS)
Holdeman, J. D.; Reynolds, R.; White, C.
1987-01-01
An analytical program was conducted to assemble and assess a three-dimensional turbulent viscous flow computer code capable of analyzing the flow field in the transition liners of small gas turbine engines. This code is of the TEACH type with hybrid numerics, and uses the power law and SIMPLER algorithms, an orthogonal curvilinear coordinate system, and an algebraic Reynolds stress turbulence model. The assessments performed in this study, consistent with results in the literature, showed that in its present form this code is capable of predicting trends and qualitative results. The assembled code was used to perform a numerical experiment to investigate the effects of curvature and convergence in the transition liner on the mixing of single and opposed rows of cool dilution jets injected into a hot mainstream flow.
Energetics of Single Substitutional Impurities in NiTi
NASA Technical Reports Server (NTRS)
Good, Brian S.; Noebe, Ronald
2003-01-01
Shape-memory alloys are of considerable current interest, with applications ranging from stents to Mars rover components. In this work, we present results on the energetics of single substitutional impurities in B2 NiTi. Specifically, energies of Pd, Pt, Zr and Hf impurities at both Ni and Ti sites are computed. All energies are computed using the CASTEP ab initio code, and, for comparison, using the quantum approximate energy method of Bozzolo, Ferrante and Smith. Atomistic relaxation in the vicinity of the impurities is investigated via quantum approximate Monte Carlo simulation, and in cases where the relaxation is found to be important, the resulting relaxations are applied to the ab initio calculations. We compare our results with available experimental work.
A GPU-accelerated implicit meshless method for compressible flows
NASA Astrophysics Data System (ADS)
Zhang, Jia-Le; Ma, Zhi-Hua; Chen, Hong-Quan; Cao, Cheng
2018-05-01
This paper develops a recently proposed GPU-based two-dimensional explicit meshless method (Ma et al., 2014) by devising and implementing an efficient parallel LU-SGS implicit algorithm to further improve the computational efficiency. The capability of the original 2D meshless code is extended to deal with 3D complex compressible flow problems. To resolve the inherent data dependency of the standard LU-SGS method, which causes thread-racing conditions destabilizing numerical computation, a generic rainbow coloring method is presented and applied to organize the computational points into different groups by painting neighboring points with different colors. The original LU-SGS method is modified and parallelized accordingly to perform calculations in a color-by-color manner. The CUDA Fortran programming model is employed to develop the key kernel functions to apply boundary conditions, calculate time steps, evaluate residuals as well as advance and update the solution in the temporal space. A series of two- and three-dimensional test cases including compressible flows over single- and multi-element airfoils and an M6 wing are carried out to verify the developed code. The obtained solutions agree well with experimental data and other computational results reported in the literature. Detailed analysis on the performance of the developed code reveals that the developed CPU-based implicit meshless method is at least four to eight times faster than its explicit counterpart. The computational efficiency of the implicit method could be further improved by ten to fifteen times on the GPU.
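A greedy sequential version of such a coloring is easy to state: visit the points and give each one the smallest color not already used by a neighbor, then sweep the groups color by color. The sketch below is a minimal illustration of the structure, not the paper's exact method:

```python
def rainbow_coloring(neighbors):
    """Greedy coloring: give each point the smallest color unused by its
    already-colored neighbors. Points of one color share no edges, so a
    whole color group can be relaxed concurrently (e.g. one GPU launch)."""
    colors = {}
    for p in sorted(neighbors, key=lambda q: -len(neighbors[q])):
        used = {colors[q] for q in neighbors[p] if q in colors}
        colors[p] = next(c for c in range(len(neighbors)) if c not in used)
    return colors

# Hypothetical meshless stencil connectivity (point index -> neighbor list):
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1, 4], 4: [2, 3]}
colors = rainbow_coloring(adj)

# LU-SGS-style sweep, color by color; no dependencies inside a group.
for c in sorted(set(colors.values())):
    group = [p for p, pc in colors.items() if pc == c]
    # relax(group)  # placeholder for the per-color kernel
```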
Investigations of flowfields found in typical combustor geometries
NASA Technical Reports Server (NTRS)
Lilley, D. G.
1982-01-01
Measurements and computations are being applied to an axisymmetric swirling flow, emerging from swirl vanes at angle phi, entering a large chamber test section via a sudden expansion of various side-wall angles alpha. New features are: the turbulence measurements are being performed on swirling as well as nonswirling flow; and all measurements and computations are also being performed on a confined jet flowfield with realistic downstream blockage. Recent activity falls into three categories: (1) Time-mean flowfield characterization by five-hole pitot probe measurements and by flow visualization; (2) Turbulence measurements by a variety of single- and multi-wire hot-wire probe techniques; and (3) Flowfield computations using the computer code developed during the previous year's research program.
Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schmalz, Mark S
2011-07-24
Statement of Problem - The Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly-sequential processors has been represented as a graph G. Mathematical transformations, applied to G, produce a graph representation G′ for a high-performance architecture. Key computational and data movement kernels of the application were analyzed/optimized for parallel execution using the mapping G → G′, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy. Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in the solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout DOE, military, and commercial sectors: the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), Government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphics Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - The Department of Energy has many simulation codes that must compute faster to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion, for high-performance computing systems.
NASA Technical Reports Server (NTRS)
Boyd, David D. Jr.
2009-01-01
Preliminary aerodynamic and performance predictions for an active twist rotor for a HART-II type of configuration are performed using a computational fluid dynamics (CFD) code, OVERFLOW2, and a computational structural dynamics (CSD) code, CAMRAD-II. These codes are loosely coupled to compute a consistent set of aerodynamics and elastic blade motions. Resultant aerodynamic and blade motion data are then used in the Ffowcs Williams-Hawkings solver, PSU-WOPWOP, to compute noise on an observer plane under the rotor. Active twist of the rotor blade is achieved in CAMRAD-II by application of a periodic torsional moment couple (of equal and opposite sign) at the blade root and tip at a specified frequency and amplitude. To provide confidence in these particular active twist predictions, for which no measured data is available, the rotor system geometry and computational set up examined here are identical to those used in a previous successful Higher Harmonic Control (HHC) computational study. For a single frequency equal to three times the blade passage frequency (3P), active twist is applied across a range of control phase angles at two different amplitudes. Predicted results indicate that there are control phase angles where the maximum mid-frequency noise level and the 4P non-rotating hub vibrations can be reduced, potentially both at the same time. However, these calculated reductions are predicted to come with a performance penalty in the form of a reduction in rotor lift-to-drag ratio, due to an increase in rotor profile power.
NASA Technical Reports Server (NTRS)
Radhakrishnan, Krishnan
1993-01-01
A detailed analysis of the accuracy of several techniques recently developed for integrating stiff ordinary differential equations is presented. The techniques include two general-purpose codes, EPISODE and LSODE, developed for an arbitrary system of ordinary differential equations, and three specialized codes, CHEMEQ, CREK1D, and GCKP4, developed specifically to solve chemical kinetic rate equations. The accuracy study is made by application of these codes to two practical combustion kinetics problems. Both problems describe adiabatic, homogeneous, gas-phase chemical reactions at constant pressure, and include all three combustion regimes: induction, heat release, and equilibration. To illustrate the error variation in the different combustion regimes, the species are divided into three types (reactants, intermediates, and products), and error versus time plots are presented for each species type and the temperature. These plots show that CHEMEQ is the most accurate code during induction and early heat release. During late heat release and equilibration, however, the other codes are more accurate. A single global quantity, a mean integrated root-mean-square error, that measures the average error incurred in solving the complete problem is used to compare the accuracy of the codes. Among the codes examined, LSODE is the most accurate for solving chemical kinetics problems. It is also the most efficient code, in the sense that it requires the least computational work to attain a specified accuracy level. An important finding is that use of the algebraic enthalpy conservation equation to compute the temperature can be more accurate and efficient than integrating the temperature differential equation.
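The report's exact error norm is not reproduced here, but one plausible reading of a mean integrated root-mean-square error, computed against a tight-tolerance reference solution, looks like the following Python sketch (the metric's details are an assumption for illustration):

```python
import numpy as np

def mean_integrated_rms_error(y_test, y_ref, floor=1e-30):
    """One plausible form of a mean integrated RMS error (assumed, not the
    report's exact formula): per-variable RMS of relative errors over all
    output times, averaged over the species and the temperature."""
    rel = (y_test - y_ref) / np.maximum(np.abs(y_ref), floor)
    per_variable_rms = np.sqrt(np.mean(rel ** 2, axis=0))
    return per_variable_rms.mean()

# y[:, j]: trajectory of species j (or temperature) at shared output times,
# computed once by the solver under test and once with tight tolerances.
t = np.linspace(0.0, 1.0, 50)[:, None]
y_ref = np.hstack([np.exp(-10 * t), 1 - np.exp(-10 * t)])
y_test = y_ref * (1 + 1e-3 * np.sin(20 * t))     # stand-in solver output
print(mean_integrated_rms_error(y_test, y_ref))  # ~1e-3
```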
Computational multispectral video imaging [Invited].
Wang, Peng; Menon, Rajesh
2018-01-01
Multispectral imagers reveal information unperceivable to humans and conventional cameras. Here, we demonstrate a compact single-shot multispectral video-imaging camera by placing a micro-structured diffractive filter in close proximity to the image sensor. The diffractive filter converts spectral information to a spatial code on the sensor pixels. Following a calibration step, this code can be inverted via regularization-based linear algebra to compute the multispectral image. We experimentally demonstrated spectral resolution of 9.6 nm within the visible band (430-718 nm). We further show that the spatial resolution is enhanced by over 30% compared with the case without the diffractive filter. We also demonstrate Vis-IR imaging with the same sensor. Because no absorptive color filters are utilized, sensitivity is preserved as well. Finally, the diffractive filters can be easily manufactured using optical lithography and replication techniques.
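The inversion step described above ("regularization-based linear algebra") can be sketched as Tikhonov-regularized least squares applied to a synthetic sensing matrix; the sizes, noise level, and regularization weight below are stand-ins, not the calibrated system of the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n_pix, n_spec = 200, 30          # sensor pixels, spectral unknowns
    A = rng.standard_normal((n_pix, n_spec))   # calibrated spatial-spectral code
    x_true = np.abs(rng.standard_normal(n_spec))
    b = A @ x_true + 0.01 * rng.standard_normal(n_pix)   # noisy measurement

    lam = 1e-2                       # regularization weight
    # Solve min ||Ax - b||^2 + lam*||x||^2 via the augmented least-squares system.
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n_spec)])
    b_aug = np.concatenate([b, np.zeros(n_spec)])
    x_hat = np.linalg.lstsq(A_aug, b_aug, rcond=None)[0]

    print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))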
Wu, Xin; Koslowski, Axel; Thiel, Walter
2012-07-10
In this work, we demonstrate that semiempirical quantum chemical calculations can be accelerated significantly by leveraging the graphics processing unit (GPU) as a coprocessor on a hybrid multicore CPU-GPU computing platform. Semiempirical calculations using the MNDO, AM1, PM3, OM1, OM2, and OM3 model Hamiltonians were systematically profiled for three types of test systems (fullerenes, water clusters, and solvated crambin) to identify the most time-consuming sections of the code. The corresponding routines were ported to the GPU and optimized employing both existing library functions and a GPU kernel that carries out a sequence of noniterative Jacobi transformations during pseudodiagonalization. The overall computation times for single-point energy calculations and geometry optimizations of large molecules were reduced by one order of magnitude for all methods, as compared to runs on a single CPU core.
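The pseudodiagonalization kernel mentioned above is built from sequences of 2x2 Jacobi rotations. As a rough, CPU-only sketch of that building block (not the ported GPU kernel itself), a cyclic Jacobi sweep over a small symmetric matrix looks like this:

    import numpy as np

    def jacobi_sweep(F):
        """One cyclic sweep of Jacobi rotations, each zeroing one off-diagonal
        element of the symmetric matrix F (later rotations perturb it again)."""
        F = F.copy()
        n = F.shape[0]
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(F[p, q]) < 1e-14:
                    continue
                theta = 0.5 * np.arctan2(2 * F[p, q], F[q, q] - F[p, p])
                c, s = np.cos(theta), np.sin(theta)
                R = np.eye(n)
                R[p, p] = R[q, q] = c
                R[p, q], R[q, p] = s, -s
                F = R.T @ F @ R
        return F

    A = np.random.default_rng(1).standard_normal((6, 6))
    A = 0.5 * (A + A.T)
    for _ in range(10):
        A = jacobi_sweep(A)
    print(np.round(np.diag(A), 4))   # converges toward the eigenvalues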
DistMap: a toolkit for distributed short read mapping on a Hadoop cluster.
Pandey, Ram Vinay; Schlötterer, Christian
2013-01-01
With the rapid and steady increase of next generation sequencing data output, the mapping of short reads has become a major data analysis bottleneck. On a single computer, it can take several days to map the vast quantity of reads produced from a single Illumina HiSeq lane. In an attempt to ameliorate this bottleneck we present a new tool, DistMap - a modular, scalable and integrated workflow to map reads in the Hadoop distributed computing framework. DistMap is easy to use, currently supports nine different short read mapping tools and can be run on all Unix-based operating systems. It accepts reads in FASTQ format as input and provides mapped reads in a SAM/BAM format. DistMap supports both paired-end and single-end reads thereby allowing the mapping of read data produced by different sequencing platforms. DistMap is available from http://code.google.com/p/distmap/
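DistMap's input stage rests on a simple fact: a FASTQ file splits cleanly into independent chunks of four-line records that separate map tasks can align in parallel. A minimal sketch of such a splitter, with a hypothetical chunk size and scheduler call (this is not DistMap's actual code):

    from itertools import islice

    def fastq_chunks(path, reads_per_chunk=250_000):
        """Yield lists of lines; each read is the standard 4 FASTQ lines."""
        with open(path) as fh:
            while True:
                chunk = list(islice(fh, 4 * reads_per_chunk))
                if not chunk:
                    return
                yield chunk

    # Each chunk would be shipped to a Hadoop map task running the chosen mapper:
    # for i, chunk in enumerate(fastq_chunks("lane1.fastq")):
    #     submit_map_task(i, chunk)        # hypothetical scheduler call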
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grebennikov, A.N.; Zhitnik, A.K.; Zvenigorodskaya, O.A.
1995-12-31
In conformity with the protocol of the Workshop under Contract "Assessment of RBMK reactor safety using modern Western Codes", VNIIEF performed a neutronics computation series to compare Western and VNIIEF codes and assess whether VNIIEF codes are suitable for RBMK type reactor safety assessment computation. The work was carried out in close collaboration with M.I. Rozhdestvensky and L.M. Podlazov, NIKIET employees. The effort involved: (1) cell computations with the WIMS and EKRAN codes (an improved modification of the LOMA code) and the S-90 code (VNIIEF Monte Carlo), covering cell, polycell, and burnup computation; (2) 3D computation of static states with the KORAT-3D and NEU codes and comparison with results of computation with the NESTLE code (USA). The computations were performed in the geometry and using the neutron constants presented by the American party; (3) 3D computation of neutron kinetics with the KORAT-3D and NEU codes. These computations were performed in two formulations, both developed in collaboration with NIKIET. The formulation of the first problem agrees as closely as possible with one of the NESTLE problems and imitates gas bubble travel through a core. The second problem is a model of the RBMK as a whole with imitation of control and protection system (CPS) control movement in a core.
Vector Sum Excited Linear Prediction (VSELP) speech coding at 4.8 kbps
NASA Technical Reports Server (NTRS)
Gerson, Ira A.; Jasiuk, Mark A.
1990-01-01
Code Excited Linear Prediction (CELP) speech coders exhibit good performance at data rates as low as 4800 bps. The major drawback to CELP type coders is their large computational requirements. The Vector Sum Excited Linear Prediction (VSELP) speech coder utilizes a codebook with a structure which allows for a very efficient search procedure. Other advantages of the VSELP codebook structure are discussed, and a detailed description of a 4.8 kbps VSELP coder is given. This coder is an improved version of the VSELP algorithm, which finished first in the NSA's evaluation of the 4.8 kbps speech coders. The coder uses a subsample-resolution single-tap long-term predictor, a single VSELP excitation codebook, a novel gain quantizer which is robust to channel errors, and a new adaptive pre/postfilter arrangement.
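The vector-sum construction at the heart of VSELP can be sketched directly: each codevector is a ±1 combination of M basis vectors, so 2^M entries are searchable without ever being stored. The basis size, dimensions, and the brute-force search below are illustrative; the real coder exploits the structure to avoid exhaustive search.

    import numpy as np
    from itertools import product

    rng = np.random.default_rng(2)
    M, dim = 7, 40                       # 2^7 = 128 codevectors of length 40
    basis = rng.standard_normal((M, dim))

    def best_codevector(target):
        """Exhaustive search over sign patterns; returns (signs, codevector)."""
        best, best_err = None, np.inf
        for signs in product((-1.0, 1.0), repeat=M):
            cv = np.asarray(signs) @ basis      # vector sum of basis vectors
            err = np.sum((target - cv) ** 2)
            if err < best_err:
                best, best_err = (signs, cv), err
        return best

    signs, cv = best_codevector(rng.standard_normal(dim))
    print("chosen sign pattern:", signs)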
Constructing LDPC Codes from Loop-Free Encoding Modules
NASA Technical Reports Server (NTRS)
Divsalar, Dariush; Dolinar, Samuel; Jones, Christopher; Thorpe, Jeremy; Andrews, Kenneth
2009-01-01
A method of constructing certain low-density parity-check (LDPC) codes by use of relatively simple loop-free coding modules has been developed. The subclasses of LDPC codes to which the method applies include accumulate-repeat-accumulate (ARA) codes, accumulate-repeat-check-accumulate codes, and the codes described in Accumulate-Repeat-Accumulate-Accumulate Codes (NPO-41305), NASA Tech Briefs, Vol. 31, No. 9 (September 2007), page 90. All of the affected codes can be characterized as serial/parallel (hybrid) concatenations of such relatively simple modules as accumulators, repetition codes, differentiators, and punctured single-parity check codes. These are error-correcting codes suitable for use in a variety of wireless data-communication systems that include noisy channels. These codes can also be characterized as hybrid turbo-like codes that have projected graph or protograph representations (for example, see figure); these characteristics make it possible to design high-speed iterative decoders that utilize belief-propagation algorithms. The present method comprises two related submethods for constructing LDPC codes from simple loop-free modules with circulant permutations. The first submethod is an iterative encoding method based on the erasure-decoding algorithm. The computations required by this method are well organized because they involve a parity-check matrix having a block-circulant structure. The second submethod involves the use of block-circulant generator matrices. The encoders of this method are very similar to those of recursive convolutional codes. Some encoders according to this second submethod have been implemented in a small field-programmable gate array that operates at a speed of 100 megasymbols per second. By use of density evolution (a computational-simulation technique for analyzing performances of LDPC codes), it has been shown through some examples that as the block size goes to infinity, low iterative decoding thresholds close to channel capacity limits can be achieved for the codes of the type in question having low maximum variable node degrees. The decoding thresholds in these examples are lower than those of the best-known unstructured irregular LDPC codes constrained to have the same maximum node degrees. Furthermore, the present method enables the construction of codes of any desired rate with thresholds that stay uniformly close to their respective channel capacity thresholds.
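The block-circulant structure underlying both submethods can be illustrated by assembling a toy parity-check matrix from circulant permutation blocks (shifted identities) and zero blocks; the protograph and shift values below are invented for illustration and are not one of the ARA designs in the article.

    import numpy as np

    def circulant(size, shift):
        """size x size circulant permutation matrix (identity rolled by shift)."""
        return np.roll(np.eye(size, dtype=int), shift, axis=1)

    Z = 8                                     # circulant (lifting) size
    proto = [[0, 3, -1],                      # -1 marks an all-zero block;
             [5, -1, 1]]                      # nonnegative entries are shifts

    H = np.block([[circulant(Z, s) if s >= 0 else np.zeros((Z, Z), dtype=int)
                   for s in row] for row in proto])
    print(H.shape)                            # (16, 24): a rate-1/3 toy code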
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, T; Lin, H; Xu, X
Purpose: (1) To perform phase space (PS) based source modeling for Tomotherapy and Varian TrueBeam 6 MV Linacs, (2) to examine the accuracy and performance of the ARCHER Monte Carlo code on a heterogeneous computing platform with Many Integrated Core coprocessors (MIC, aka Xeon Phi) and GPUs, and (3) to explore software micro-optimization methods. Methods: The patient-specific source of the Tomotherapy and Varian TrueBeam Linacs was modeled using the PS approach. For the helical Tomotherapy case, the PS data were calculated in our previous study (Su et al. 2014 41(7) Medical Physics). For the single-view Varian TrueBeam case, we analytically derived them from the raw patient-independent PS data in IAEA’s database, partial geometry information of the jaw and MLC, as well as the fluence map. The phantom was generated from DICOM images. The Monte Carlo simulation was performed by the ARCHER-MIC and GPU codes, which were benchmarked against a modified parallel DPM code. Software micro-optimization was systematically conducted and was focused on SIMD vectorization of tight for-loops and data prefetch, with the ultimate goal of increasing 512-bit register utilization and reducing memory access latency. Results: Dose calculation was performed for two clinical cases, a Tomotherapy-based prostate cancer treatment and a TrueBeam-based left breast treatment. ARCHER was verified against the DPM code. The statistical uncertainty of the dose to the PTV was less than 1%. Using double precision, the total wall time of the multithreaded CPU code on an X5650 CPU was 339 seconds for the Tomotherapy case and 131 seconds for the TrueBeam case, while on 3 5110P MICs it was reduced to 79 and 59 seconds, respectively. The single-precision GPU code on a K40 GPU took 45 seconds for the Tomotherapy dose calculation. Conclusion: We have extended ARCHER, the MIC- and GPU-based Monte Carlo dose engine, to Tomotherapy and TrueBeam dose calculations.
Comparison of liquid rocket engine base region heat flux computations using three turbulence models
NASA Technical Reports Server (NTRS)
Kumar, Ganesh N.; Griffith, Dwaine O., II; Prendergast, Maurice J.; Seaford, C. M.
1993-01-01
The flow in the base region of launch vehicles is characterized by flow separation, flow reversals, and reattachment. Computation of the convective heat flux in the base region and on the nozzle external surface of Space Shuttle Main Engine and Space Transportation Main Engine (STME) is an important part of defining base region thermal environments. Several turbulence models were incorporated in a CFD code and validated for flow and heat transfer computations in the separated and reattaching regions associated with subsonic and supersonic flows over backward facing steps. Heat flux computations in the base region of a single STME engine and a single S1C engine were performed using three different wall functions as well as a renormalization-group based k-epsilon model. With the very limited data available, the computed values are seen to be of the right order of magnitude. Based on the validation comparisons, it is concluded that all the turbulence models studied have predicted the reattachment location and the velocity profiles at various axial stations downstream of the step very well.
NASA Astrophysics Data System (ADS)
Wang, Zhi-peng; Zhang, Shuai; Liu, Hong-zhao; Qin, Yi
2014-12-01
Based on a phase retrieval algorithm and the QR code, a new optical encryption technology that only needs to record one intensity distribution is proposed. In this encryption process, the QR code is first generated from the information to be encrypted; the generated QR code is then placed in the input plane of a 4-f system and subjected to double random phase encryption. Because only one intensity distribution in the output plane is recorded as the ciphertext, the encryption process is greatly simplified. In the decryption process, the corresponding QR code is retrieved using a phase retrieval algorithm. A priori information about the QR code is used as a support constraint in the input plane, which helps solve the stagnation problem. The original information can be recovered without distortion by scanning the QR code. The encryption process can be implemented either optically or digitally, and the decryption process uses a digital method. In addition, the security of the proposed optical encryption technology is analyzed. Theoretical analysis and computer simulations show that this optical encryption system is invulnerable to various attacks and suitable for harsh transmission conditions.
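The forward (encryption) path described above reduces numerically to double random phase encoding in a simulated 4-f system, keeping only the output intensity. A minimal numpy sketch, with a random binary image standing in for the QR code (the phase-retrieval decryption step is not reproduced here):

    import numpy as np

    rng = np.random.default_rng(3)
    N = 64
    qr = rng.integers(0, 2, (N, N)).astype(float)     # stand-in for a QR code

    phase1 = np.exp(2j * np.pi * rng.random((N, N)))  # input-plane random phase
    phase2 = np.exp(2j * np.pi * rng.random((N, N)))  # Fourier-plane random phase

    field = np.fft.ifft2(np.fft.fft2(qr * phase1) * phase2)
    ciphertext = np.abs(field) ** 2                   # recorded intensity only
    print(ciphertext.shape, ciphertext.dtype)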
Spiking network simulation code for petascale computers.
Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M; Plesser, Hans E; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz
2014-01-01
Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today.
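The storage idea described above can be caricatured in a few lines: each compute node groups its local synapses by type into structure-of-arrays buffers, since for a given source neuron it typically holds at most one synapse per type. Field names are illustrative, not NEST's actual internals.

    from collections import defaultdict
    import numpy as np

    class LocalSynapseStore:
        def __init__(self):
            # type -> struct-of-arrays holding only the synapses whose target
            # neuron lives on this compute node
            self.by_type = defaultdict(lambda: {"src": [], "tgt": [], "w": []})

        def add(self, syn_type, src_gid, tgt_local, weight):
            s = self.by_type[syn_type]
            s["src"].append(src_gid)
            s["tgt"].append(tgt_local)
            s["w"].append(weight)

        def finalize(self):
            # freeze into contiguous arrays for cache-friendly spike delivery
            for s in self.by_type.values():
                for k in s:
                    s[k] = np.asarray(s[k])

    store = LocalSynapseStore()
    store.add("static", src_gid=101, tgt_local=7, weight=0.3)
    store.add("stdp",   src_gid=205, tgt_local=2, weight=1.1)
    store.finalize()
    print({t: len(s["w"]) for t, s in store.by_type.items()})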
JETSPIN: A specific-purpose open-source software for simulations of nanofiber electrospinning
NASA Astrophysics Data System (ADS)
Lauricella, Marco; Pontrelli, Giuseppe; Coluzza, Ivan; Pisignano, Dario; Succi, Sauro
2015-12-01
We present the open-source computer program JETSPIN, specifically designed to simulate the electrospinning process of nanofibers. Its capabilities are shown with proper reference to the underlying model, as well as a description of the relevant input variables and associated test-case simulations. The various interactions included in the electrospinning model implemented in JETSPIN are discussed in detail. The code is designed to exploit different computational architectures, from single to parallel processor workstations. This paper provides an overview of JETSPIN, focusing primarily on its structure, parallel implementations, functionality, performance, and availability.
Smart command recognizer (SCR) - For development, test, and implementation of speech commands
NASA Technical Reports Server (NTRS)
Simpson, Carol A.; Bunnell, John W.; Krones, Robert R.
1988-01-01
The SCR, a rapid prototyping system for the development, testing, and implementation of speech commands in a flight simulator or test aircraft, is described. A single unit performs all functions needed during these three phases of system development, while the use of common software and speech command data structure files greatly reduces the preparation time for successive development phases. As a smart peripheral to a simulation or flight host computer, the SCR interprets the pilot's spoken input and passes command codes to the simulation or flight computer.
Development and application of the GIM code for the Cyber 203 computer
NASA Technical Reports Server (NTRS)
Stalnaker, J. F.; Robinson, M. A.; Rawlinson, E. G.; Anderson, P. G.; Mayne, A. W.; Spradley, L. W.
1982-01-01
The GIM computer code for fluid dynamics research was developed. Enhancement of the computer code, implicit algorithm development, turbulence model implementation, chemistry model development, interactive input module coding, and wing/body flowfield computation are described. The GIM quasi-parabolic code development was completed, and the code was used to compute a number of example cases. Turbulence models, algebraic and differential equations, were added to the basic viscous code. An equilibrium reacting chemistry model and an implicit finite difference scheme were also added. Development was completed on the interactive module for generating the input data for GIM. Solutions for inviscid hypersonic flow over a wing/body configuration are also presented.
grid-model: Semi-numerical reionization code
NASA Astrophysics Data System (ADS)
Hutter, Anne
2018-05-01
grid-model computes the time and spatially dependent ionization of neutral hydrogen (HI), neutral (HeI) and singly ionized helium (HeII) in the intergalactic medium (IGM). It accounts for recombinations and provides different descriptions for the photoionization rate that are used to calculate the residual HI fraction in ionized regions. The ionizing emissivity is directly derived from the RT simulation spectra.
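The residual-HI calculation mentioned above comes down to a photoionization-recombination balance in each cell. A minimal single-cell sketch with illustrative rate values (not grid-model's own constants or numerics):

    # Evolve the ionized fraction x: dx/dt = Gamma*(1 - x) - alpha_B*n_e*x,
    # with n_e = n_H * x inside an ionized region. All values are illustrative.
    Gamma   = 1e-12        # photoionization rate [1/s]
    alpha_B = 2.6e-13      # case-B recombination coefficient [cm^3/s]
    n_H     = 1e-4         # hydrogen number density [1/cm^3]

    x  = 0.999             # start nearly fully ionized
    dt = 1e10              # time step [s]
    for _ in range(10_000):
        x += dt * (Gamma * (1.0 - x) - alpha_B * n_H * x * x)

    print(f"residual neutral fraction 1 - x = {1.0 - x:.3e}")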
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Paul T.; Heroux, Michael A.; Barrett, Richard F.
The performance of a large-scale, production-quality science and engineering application (‘app’) is often dominated by a small subset of the code. Even within that subset, computational and data access patterns are often repeated, so that an even smaller portion can represent the performance-impacting features. If application developers, parallel computing experts, and computer architects can together identify this representative subset and then develop a small mini-application (‘miniapp’) that can capture these primary performance characteristics, then this miniapp can be used both to improve the performance of the app and to provide a tool for co-design for the high-performance computing community. However, a critical question is whether a miniapp can effectively capture key performance behavior of an app. This study provides a comparison of an implicit finite element semiconductor device modeling app on unstructured meshes with an implicit finite element miniapp on unstructured meshes. The goal is to assess whether the miniapp is predictive of the performance of the app. Finally, single compute node performance is compared, as well as scaling up to 16,000 cores. Results indicate that the miniapp can be reasonably predictive of the performance characteristics of the app for a single iteration of the solver on a single compute node.
Reactor Dosimetry Applications Using RAPTOR-M3G:. a New Parallel 3-D Radiation Transport Code
NASA Astrophysics Data System (ADS)
Longoni, Gianluca; Anderson, Stanwood L.
2009-08-01
The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper discusses the development of RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based on domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained; Section 3 addresses the parallel performance of the code; and Section 4 concludes this paper with final remarks and future work.
Novel Scalable 3-D MT Inverse Solver
NASA Astrophysics Data System (ADS)
Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.
2016-12-01
We present a new, robust and fast, three-dimensional (3-D) magnetotelluric (MT) inverse solver. As a forward modelling engine, the highly scalable solver extrEMe [1] is used. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits an adjoint-sources approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem setup. To parameterize an inverse domain, a mask approach is implemented, which means that one can merge any subset of forward-modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments, carried out on platforms ranging from modern laptops to high-performance clusters, demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.
Evaluation of RAPID for a UNF cask benchmark problem
NASA Astrophysics Data System (ADS)
Mascolino, Valerio; Haghighat, Alireza; Roskoff, Nathan J.
2017-09-01
This paper examines the accuracy and performance of the RAPID (Real-time Analysis for Particle transport and In-situ Detection) code system for the simulation of a used nuclear fuel (UNF) cask. RAPID is capable of determining the eigenvalue, subcritical multiplication, and pin-wise, axially-dependent fission density throughout a UNF cask. We study the source convergence based on the analysis of the different parameters used in an eigenvalue calculation in the MCNP Monte Carlo code. For this study, we consider a single assembly surrounded by absorbing plates with reflective boundary conditions. Based on the best combination of eigenvalue parameters, a reference MCNP solution for the single assembly is obtained. RAPID results are in excellent agreement with the reference MCNP solutions, while requiring significantly less computation time (i.e., minutes vs. days). A similar set of eigenvalue parameters is used to obtain a reference MCNP solution for the whole UNF cask. Because of time limitations, the MCNP results near the cask boundaries have significant uncertainties. Except for these, the RAPID results are in excellent agreement with the MCNP predictions, and its computation time is significantly lower: 35 seconds on 1 core versus 9.5 days on 16 cores.
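The eigenvalue problem at stake here (k-effective) can be illustrated with power iteration on a small fission-matrix stand-in: k is the dominant eigenvalue and the iterated vector is the fission-source distribution. The 3x3 matrix below is a toy, not a cask model.

    import numpy as np

    F = np.array([[0.9, 0.2, 0.0],
                  [0.2, 0.8, 0.2],
                  [0.0, 0.2, 0.7]])
    s = np.ones(3) / 3                       # initial fission source
    for _ in range(200):
        s_new = F @ s
        k = s_new.sum() / s.sum()            # eigenvalue estimate
        s = s_new / s_new.sum()              # renormalize the source

    print(f"k = {k:.5f}")                    # dominant eigenvalue of F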
Han, Min Cheol; Yeom, Yeon Soo; Lee, Hyun Su; Shin, Bangho; Kim, Chan Hyeong; Furuta, Takuya
2018-05-04
In this study, the multi-threading performance of the Geant4, MCNP6, and PHITS codes was evaluated as a function of the number of threads (N) and the complexity of the tetrahedral-mesh phantom. For this, three tetrahedral-mesh phantoms of varying complexity (simple, moderately complex, and highly complex) were prepared and implemented in the three different Monte Carlo codes, in photon and neutron transport simulations. Subsequently, for each case, the initialization time, calculation time, and memory usage were measured as a function of the number of threads used in the simulation. It was found that for all codes, the initialization time significantly increased with the complexity of the phantom, but not with the number of threads. Geant4 exhibited much longer initialization time than the other codes, especially for the complex phantom (MRCP). The improvement of computation speed due to the use of a multi-threaded code was calculated as the speed-up factor, the ratio of the computation speed on a multi-threaded code to the computation speed on a single-threaded code. Geant4 showed the best multi-threading performance among the codes considered in this study, with the speed-up factor almost linearly increasing with the number of threads, reaching ~30 when N = 40. PHITS and MCNP6 showed a much smaller increase of the speed-up factor with the number of threads. For PHITS, the speed-up factors were low when N = 40. For MCNP6, the increase of the speed-up factors was better, but they were still less than ~10 when N = 40. As for memory usage, Geant4 was found to use more memory than the other codes. In addition, compared to that of the other codes, the memory usage of Geant4 more rapidly increased with the number of threads, reaching as high as ~74 GB when N = 40 for the complex phantom (MRCP). It is notable that compared to that of the other codes, the memory usage of PHITS was much lower, regardless of both the complexity of the phantom and the number of threads, hardly increasing with the number of threads for the MRCP.
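The speed-up factor used throughout this comparison is simply the ratio of single-threaded to multi-threaded computation time; a trivial helper with placeholder timings (not the paper's data) makes the definition concrete.

    def speedup(t_single, t_multi):
        """Speed-up factor: single-threaded time over multi-threaded time."""
        return t_single / t_multi

    # e.g., a run taking 3600 s on one thread and 150 s on 40 threads
    print(f"speed-up factor: {speedup(3600.0, 150.0):.1f}x")  # 24.0x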
Verification of a three-dimensional viscous flow analysis for a single stage compressor
NASA Astrophysics Data System (ADS)
Matsuoka, Akinori; Hashimoto, Keisuke; Nozaki, Osamu; Kikuchi, Kazuo; Fukuda, Masahiro; Tamura, Atsuhiro
1992-12-01
A transonic flowfield around the rotor blades of a highly loaded single-stage axial compressor was numerically analyzed with a three-dimensional compressible Navier-Stokes code using a Chakravarthy and Osher type total variation diminishing (TVD) scheme. A stage analysis that calculates the flowfields around the inlet guide vane (IGV) and the rotor blades simultaneously was carried out. Compared with design values and experimental data, the computed results show slight quantitative differences, but the numerical calculation reproduces well the pressure-rise characteristics of the compressor and its flow pattern, including a strong shock surface.
Lim, Kwang Soo; Baldoví, José J; Jiang, ShangDa; Koo, Bong Ho; Kang, Dong Won; Lee, Woo Ram; Koh, Eui Kwan; Gaita-Ariño, Alejandro; Coronado, Eugenio; Slota, Michael; Bogani, Lapo; Hong, Chang Seop
2017-05-01
Controlling the coordination sphere of lanthanoid complexes is a challenging but critical step toward controlling their relaxation properties. Here we present the synthesis of hexacoordinated dysprosium single-molecule magnets, where tripodal ligands achieve a near-perfect octahedral coordination. We perform a complete experimental and theoretical investigation of their magnetic properties, including a full single-crystal magnetic anisotropy analysis. The combination of electrostatic and crystal-field computational tools (the SIMPRE and CONDON codes) allows us to explain the static behavior of these systems in detail.
CHOLLA: A New Massively Parallel Hydrodynamics Code for Astrophysical Simulation
NASA Astrophysics Data System (ADS)
Schneider, Evan E.; Robertson, Brant E.
2015-04-01
We present Computational Hydrodynamics On ParaLLel Architectures (Cholla), a new three-dimensional hydrodynamics code that harnesses the power of graphics processing units (GPUs) to accelerate astrophysical simulations. Cholla models the Euler equations on a static mesh using state-of-the-art techniques, including the unsplit Corner Transport Upwind algorithm, a variety of exact and approximate Riemann solvers, and multiple spatial reconstruction techniques including the piecewise parabolic method (PPM). Using GPUs, Cholla evolves the fluid properties of thousands of cells simultaneously and can update over 10 million cells per GPU-second while using an exact Riemann solver and PPM reconstruction. Owing to the massively parallel architecture of GPUs and the design of the Cholla code, astrophysical simulations with physically interesting grid resolutions (≳256³) can easily be computed on a single device. We use the Message Passing Interface library to extend calculations onto multiple devices and demonstrate nearly ideal scaling beyond 64 GPUs. A suite of test problems highlights the physical accuracy of our modeling and provides a useful comparison to other codes. We then use Cholla to simulate the interaction of a shock wave with a gas cloud in the interstellar medium, showing that the evolution of the cloud is highly dependent on its density structure. We reconcile the computed mixing time of a turbulent cloud with a realistic density distribution destroyed by a strong shock with the existing analytic theory for spherical cloud destruction by describing the system in terms of its median gas density.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chien, T.H.; Domanus, H.M.; Sha, W.T.
1993-02-01
The COMMIX-PPC computer program is an extended and improved version of earlier COMMIX codes and is specifically designed for evaluating the thermal performance of power plant condensers. The COMMIX codes are general-purpose computer programs for the analysis of fluid flow and heat transfer in complex industrial systems. In COMMIX-PPC, two major features have been added to previously published COMMIX codes. One feature is the incorporation of one-dimensional equations of conservation of mass, momentum, and energy on the tube side and the proper accounting for the thermal interaction between shell and tube side through the porous-medium approach. The other added feature is the extension of the three-dimensional conservation equations for shell-side flow to treat the flow of a multicomponent medium. COMMIX-PPC is designed to perform steady-state and transient, three-dimensional analysis of fluid flow with heat transfer in a power plant condenser. However, the code is designed in a generalized fashion so that, with some modification, it can be used to analyze processes in any heat exchanger or other single-phase engineering applications. Volume I (Equations and Numerics) of this report describes in detail the basic equations, formulation, solution procedures, and models for the phenomena. Volume II (User's Guide and Manual) contains the input instructions, flow charts, sample problems, and descriptions of available options and boundary conditions.
Development of a New System for Transport Simulation and Analysis at General Atomics
NASA Astrophysics Data System (ADS)
St. John, H. E.; Peng, Q.; Freeman, J.; Crotinger, J.
1997-11-01
General Atomics has begun a long-term program to improve all aspects of experimental data analysis related to DIII-D. The objective is to make local and visiting physicists as productive as possible, with only a small investment in training, by developing intuitive, sophisticated interfaces to existing and newly created computer programs. Here we describe our initial work and results of a pilot project in this program. The pilot project is a collaborative effort between LLNL and GA which will ultimately result in the merger of Corsica and ONETWO (and selected modules from other codes) into a new advanced transport code system. The initial goal is to produce a graphical user interface to the transport code ONETWO which will couple to a programmable (steerable) front end designed for the transport system. This will be an object-oriented scheme written primarily in Python. The programmable application will integrate existing C, C++, and Fortran methods in a single computational paradigm. Its most important feature is the use of plug-in physics modules which will allow a high degree of customization.
Space station integrated wall design and penetration damage control
NASA Technical Reports Server (NTRS)
Coronado, A. R.; Gibbins, M. N.; Wright, M. A.; Stern, P. H.
1987-01-01
The analysis code BUMPER executes a numerical solution to the problem of calculating the probability of no penetration (PNP) of a spacecraft subject to man-made orbital debris or meteoroid impact. The code was developed on a DEC VAX 11/780 computer running the Virtual Memory System (VMS) operating system and is written in FORTRAN 77 with no VAX extensions. To help illustrate the steps involved, a single sample analysis is performed. The example used is the space station reference configuration. The finite element model (FEM) of this configuration is relatively complex but demonstrates many BUMPER features. The computer tools and guidelines are described for constructing a FEM for the space station under consideration. The methods used to analyze the sensitivity of PNP to variations in design are described. Ways are suggested for developing contour plots of the sensitivity study data. Additional BUMPER analysis examples are provided, including FEMs, command inputs, and data outputs. The mathematical theory used as the basis for the code is described, and the data flow within the analysis is illustrated.
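The PNP quantity can be sketched from Poisson statistics: if element i of the finite element model expects N_i penetrating impacts over the mission, then PNP = exp(-Σ N_i). The areas, fluxes, and mission duration below are invented for illustration, and this is a simplified rendering of the approach, not BUMPER's actual implementation.

    import numpy as np

    area  = np.array([2.0, 5.5, 1.2])     # element areas [m^2]
    flux  = np.array([1e-6, 4e-7, 9e-7])  # penetrating-impact flux [1/m^2/yr]
    years = 10.0

    N = area * flux * years               # expected penetrations per element
    pnp = np.exp(-N.sum())
    print(f"PNP over {years:.0f} yr: {pnp:.6f}")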
GeNN: a code generation framework for accelerated brain simulations
NASA Astrophysics Data System (ADS)
Yavuz, Esin; Turner, James; Nowotny, Thomas
2016-01-01
Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/.
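The code-generation approach can be illustrated in miniature: a neuron model described as data is rendered into compilable source text. The model dictionary, template, and generated C kernel below are far simpler than GeNN's actual CUDA output and are purely illustrative.

    lif_model = {
        "name": "lif",
        "params": {"tau": 20.0, "v_rest": -70.0, "v_thresh": -55.0},
    }

    TEMPLATE = """\
    void update_{name}(float *V, const float *I, int n, float dt) {{
        const float tau = {tau}f, v_rest = {v_rest}f, v_thresh = {v_thresh}f;
        for (int i = 0; i < n; ++i) {{
            float v = V[i];
            v += dt / tau * ((v_rest - v) + I[i]);
            if (v > v_thresh) v = v_rest;   /* fire and reset */
            V[i] = v;
        }}
    }}
    """

    source = TEMPLATE.format(name=lif_model["name"], **lif_model["params"])
    print(source)   # this C text would be written out and compiled by a toolchain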
Computing Temperatures in Optically Thick Protoplanetary Disks
NASA Technical Reports Server (NTRS)
Capuder, Lawrence F., Jr.
2011-01-01
We worked with a Monte Carlo radiative transfer code to simulate the transfer of energy through protoplanetary disks, where planet formation occurs. The code tracks photons from the star into the disk, through scattering, absorption, and re-emission, until they escape to infinity. High optical depths in the disk interior dominate the computation time because it takes the photon packet many interactions to get out of the region. Regions of high optical depth also receive few photons and therefore do not have well-estimated temperatures. We applied a modified random walk (MRW) approximation to treat high optical depths and speed up the Monte Carlo calculations. The MRW is implemented by calculating the average number of interactions the photon packet will undergo in diffusing within a single cell of the spatial grid and then updating the packet position, packet frequencies, and local radiation absorption rate appropriately. The MRW approximation was then tested for accuracy and speed compared to the original code. We determined that MRW provides accurate answers in Monte Carlo radiative transfer simulations. The speed gained from using MRW is shown to be proportional to the disk mass.
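A deliberately simplified sketch of the MRW idea: deep inside an optically thick cell, many individual scatterings are replaced by one diffusion step to the surface of a sphere of radius R, with the crude estimate n ≈ (R/mfp)² for the number of interactions. The real MRW uses proper diffusion-theory distributions rather than these shortcuts, so everything here is an illustrative assumption.

    import numpy as np

    rng = np.random.default_rng(4)

    def mrw_step(pos, R, mfp, albedo):
        """Move a packet to the surface of a sphere of radius R around `pos`.
        Returns (new_position, fraction_of_energy_absorbed_in_the_cell)."""
        n_interactions = max(1, int((R / mfp) ** 2))  # crude diffusion estimate
        absorbed = 1.0 - albedo ** n_interactions     # energy deposited locally
        direction = rng.standard_normal(3)
        direction /= np.linalg.norm(direction)        # isotropic exit direction
        return pos + R * direction, absorbed

    new_pos, absorbed = mrw_step(np.zeros(3), R=1.0, mfp=1e-3, albedo=0.999999)
    print(new_pos, f"absorbed fraction: {absorbed:.3f}")   # ~0.632 here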
Attacks on quantum key distribution protocols that employ non-ITS authentication
NASA Astrophysics Data System (ADS)
Pacher, C.; Abidin, A.; Lorünser, T.; Peev, M.; Ursin, R.; Zeilinger, A.; Larsson, J.-Å.
2016-01-01
We demonstrate how adversaries with large computing resources can break quantum key distribution (QKD) protocols which employ a particular message authentication code suggested previously. This authentication code, featuring low key consumption, is not information-theoretically secure (ITS), since for each message the eavesdropper has intercepted she is able to send a different message from a set of messages that she can calculate by finding collisions of a cryptographic hash function. However, when this authentication code was introduced, it was shown to prevent straightforward man-in-the-middle (MITM) attacks against QKD protocols. In this paper, we prove that the set of messages that collide with any given message under this authentication code contains, with high probability, a message that has small Hamming distance to any other given message. Based on this fact, we present extended MITM attacks against different versions of BB84 QKD protocols using the addressed authentication code; for three protocols, we describe every single action taken by the adversary. For all protocols, the adversary can obtain complete knowledge of the key, and for most protocols her success probability in doing so approaches unity. Since the attacks work against all authentication methods which allow colliding messages to be calculated, the underlying building blocks of the presented attacks expose the potential pitfalls arising as a consequence of non-ITS authentication in QKD post-processing. We propose countermeasures that increase the eavesdropper's demand for computational power, and we also prove necessary and sufficient conditions for upgrading the discussed authentication code to the ITS level.
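A toy demonstration of why a short, non-ITS tag invites forgery: with a b-bit tag, hashing on the order of 2^b candidate messages yields a different message carrying the same tag. The 16-bit tag, the message format, and the assumption that the adversary can evaluate the tag function are all illustrative simplifications of the paper's setting, not its actual construction.

    import hashlib

    def tag(msg: bytes, key: bytes, bits: int = 16) -> int:
        """Toy authentication tag: top `bits` bits of a keyed SHA-256."""
        h = hashlib.sha256(key + msg).digest()
        return int.from_bytes(h, "big") >> (256 - bits)

    key = b"shared-secret"
    genuine = b"basis=Z, indices=1,5,9"
    t = tag(genuine, key)

    # Brute-force a colliding forged message with the same 16-bit tag.
    for i in range(1 << 20):
        forged = b"basis=X, indices=%d" % i
        if forged != genuine and tag(forged, key) == t:
            print("collision found:", forged)
            break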
NASA Technical Reports Server (NTRS)
Elrad, Tzilla (Editor); Filman, Robert E. (Editor); Bader, Atef (Editor)
2001-01-01
Computer science has experienced an evolution in programming languages and systems from the crude assembly and machine codes of the earliest computers through concepts such as formula translation, procedural programming, structured programming, functional programming, logic programming, and programming with abstract data types. Each of these steps in programming technology has advanced our ability to achieve clear separation of concerns at the source code level. Currently, the dominant programming paradigm is object-oriented programming - the idea that one builds a software system by decomposing a problem into objects and then writing the code of those objects. Such objects abstract together behavior and data into a single conceptual and physical entity. Object-orientation is reflected in the entire spectrum of current software development methodologies and tools - we have OO methodologies, analysis and design tools, and OO programming languages. Writing complex applications such as graphical user interfaces, operating systems, and distributed applications while maintaining comprehensible source code has been made possible with OOP. Success at developing simpler systems leads to aspirations for greater complexity. Object orientation is a clever idea, but has certain limitations. We are now seeing that many requirements do not decompose neatly into behavior centered on a single locus. Object technology has difficulty localizing concerns involving global constraints and pandemic behaviors, appropriately segregating concerns, and applying domain-specific knowledge. Post-object programming (POP) mechanisms that look to increase the expressiveness of the OO paradigm are a fertile arena for current research. Examples of POP technologies include domain-specific languages, generative programming, generic programming, constraint languages, reflection and metaprogramming, feature-oriented development, views/viewpoints, and asynchronous message brokering. (Czarnecki and Eisenecker's book includes a good survey of many of these technologies.)
Evaluation of the finite element fuel rod analysis code (FRANCO)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, K.; Feltus, M.A.
1994-12-31
Knowledge of the temperature distribution in a nuclear fuel rod is required to predict the behavior of fuel elements under operating conditions. The thermal and mechanical properties and performance characteristics are strongly dependent on the temperature, which can vary greatly inside the fuel rod. A detailed model of fuel rod behavior can be described by various numerical methods, including the finite element approach. The finite element method has been successfully used in many engineering applications, including nuclear piping and reactor component analysis. However, fuel pin analysis has traditionally been carried out with finite difference codes, with the exception of the Electric Power Research Institute's FREY code, which was developed for mainframe execution. This report describes FRANCO, a finite element fuel rod analysis code capable of computing the temperature distribution and mechanical deformation of a single light water reactor fuel rod.
Matrix-Product-State Algorithm for Finite Fractional Quantum Hall Systems
NASA Astrophysics Data System (ADS)
Liu, Zhao; Bhatt, R. N.
2015-09-01
Exact diagonalization is a powerful tool to study fractional quantum Hall (FQH) systems. However, its capability is limited by the exponentially increasing computational cost. In order to overcome this difficulty, density-matrix-renormalization-group (DMRG) algorithms were developed for much larger system sizes. Very recently, it was realized that some model FQH states have an exact matrix-product-state (MPS) representation. Motivated by this, here we report an MPS code, which is closely related to, but different from, the traditional DMRG language, for finite FQH systems on the cylinder geometry. By representing the many-body Hamiltonian as a matrix-product-operator (MPO) and using single-site update and density matrix correction, we show that our code can efficiently search for the ground state of various FQH systems. We also compare the performance of our code with traditional DMRG. The possible generalization of our code to infinite FQH systems and other physical systems is also discussed.
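The MPS language used above can be made concrete with a generic numpy sketch: the state is a chain of rank-3 tensors, and quantities such as the norm follow from sweeping transfer contractions. This ignores the quantum Hall specifics (Landau orbitals, symmetry sectors) handled by the actual code.

    import numpy as np

    rng = np.random.default_rng(5)
    d, D, L = 2, 4, 6                 # physical dim, bond dim, chain length
    mps = [rng.standard_normal((1 if i == 0 else D, d,
                                1 if i == L - 1 else D)) for i in range(L)]

    def mps_norm(tensors):
        # E accumulates the contracted "transfer" environment from the left
        E = np.ones((1, 1))
        for A in tensors:
            # E_{a,b} A_{a,s,c} conj(A)_{b,s,d} -> E_{c,d}
            E = np.einsum("ab,asc,bsd->cd", E, A, A.conj())
        return np.sqrt(E[0, 0].real)

    print("norm:", mps_norm(mps))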
BINGO: a code for the efficient computation of the scalar bi-spectrum
NASA Astrophysics Data System (ADS)
Hazra, Dhiraj Kumar; Sriramkumar, L.; Martin, Jérôme
2013-05-01
We present a new and accurate Fortran code, the BI-spectra and Non-Gaussianity Operator (BINGO), for the efficient numerical computation of the scalar bi-spectrum and the non-Gaussianity parameter fNL in single field inflationary models involving the canonical scalar field. The code can calculate all the different contributions to the bi-spectrum and the parameter fNL for an arbitrary triangular configuration of the wavevectors. Focusing first on the equilateral limit, we illustrate the accuracy of BINGO by comparing the results from the code with the spectral dependence of the bi-spectrum expected in power law inflation. Then, considering an arbitrary triangular configuration, we contrast the numerical results with the analytical expression available in the slow roll limit, for, say, the case of the conventional quadratic potential. Considering a non-trivial scenario involving deviations from slow roll, we compare the results from the code with the analytical results that have recently been obtained in the case of the Starobinsky model in the equilateral limit. As an immediate application, we utilize BINGO to examine the power of the non-Gaussianity parameter fNL to discriminate between various inflationary models that admit departures from slow roll and lead to similar features in the scalar power spectrum. We close with a summary and discussion on the implications of the results we obtain.
NASA Astrophysics Data System (ADS)
Ueda, Yoshikatsu; Omura, Yoshiharu; Kojima, Hiro
Spacecraft observation is essentially a "one-point measurement", while numerical simulation can reproduce a whole system of physical processes on a computer. By performing particle simulations of plasma wave instabilities and calculating the correlation of waves and particles observed at a single point, we examine how well we can infer the characteristics of the whole system from a one-point measurement. We perform various simulation runs with different plasma parameters using the one-dimensional electromagnetic particle code (KEMPO1) and calculate 'E dot v' or other moments at a single point. We find good correlation between the measurement and the macroscopic fluctuations of the total simulation region. We make use of the results of the computer experiments in the system design of a new instrument, the 'One-chip Wave Particle Interaction Analyzer (OWPIA)'.
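The one-point wave-particle correlation is straightforward to compute once field and velocity samples at a grid point are in hand; a sketch with synthetic signals standing in for KEMPO1 output (the sign convention and interpretation of the mean are simplified assumptions here):

    import numpy as np

    rng = np.random.default_rng(7)
    t = np.linspace(0.0, 20 * np.pi, 4000)
    E = np.cos(t) + 0.1 * rng.standard_normal(t.size)        # wave field at x0
    v = np.cos(t - 0.3) + 0.1 * rng.standard_normal(t.size)  # particle velocity

    w = E * v                               # instantaneous energy-exchange term
    # A nonzero time average indicates systematic wave-particle energy transfer.
    print("time-averaged E.v =", w.mean())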
Towards a high performance geometry library for particle-detector simulations
Apostolakis, J.; Bandieramonte, M.; Bitzes, G.; ...
2015-05-22
Thread-parallelization and single-instruction multiple data (SIMD) "vectorisation" of software components in HEP computing have become a necessity to fully benefit from current and future computing hardware. In this context, the Geant-Vector/GPU simulation project aims to re-engineer current software for the simulation of the passage of particles through detectors in order to increase the overall event throughput. As one of the core modules in this area, the geometry library plays a central role, and vectorising its algorithms will be one of the cornerstones towards achieving good CPU performance. Here, we report on the progress made in vectorising the shape primitives, as well as in applying new C++ template based optimizations of existing code available in the Geant4, ROOT or USolids geometry libraries. We will focus on a presentation of our software development approach that aims to provide optimized code for all use cases of the library (e.g., single particle and many-particle APIs) and to support different architectures (CPU and GPU) while keeping the code base small, manageable and maintainable. We report on a generic and templated C++ geometry library as a continuation of the AIDA USolids project. As a result, the experience gained with these developments will be beneficial to other parts of the simulation software, such as for the optimization of the physics library, and possibly to other parts of the experiment software stack, such as reconstruction and analysis.
TOUGH2_MP: A parallel version of TOUGH2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Keni; Wu, Yu-Shu; Ding, Chris
2003-04-09
TOUGH2_MP is a massively parallel version of TOUGH2. It was developed for running on distributed-memory parallel computers to solve large simulation problems that may not be tractable for the standard, single-CPU TOUGH2 code. The new code implements an efficient massively parallel scheme, while preserving the full capacity and flexibility of the original TOUGH2 code. The new software uses the METIS software package for grid partitioning and the AZTEC software package for linear-equation solving. The standard message-passing interface is adopted for communication among processors. Numerical performance of the current version of the code has been tested on CRAY-T3E and IBM RS/6000 SP platforms. In addition, the parallel code has been successfully applied to real field problems of multi-million-cell simulations for three-dimensional multiphase and multicomponent fluid and heat flow, as well as solute transport. In this paper, we review the development of TOUGH2_MP, and discuss its basic features, modules, and their applications.
Dynamic Divisive Normalization Predicts Time-Varying Value Coding in Decision-Related Circuits
LoFaro, Thomas; Webb, Ryan; Glimcher, Paul W.
2014-01-01
Normalization is a widespread neural computation, mediating divisive gain control in sensory processing and implementing a context-dependent value code in decision-related frontal and parietal cortices. Although decision-making is a dynamic process with complex temporal characteristics, most models of normalization are time-independent and little is known about the dynamic interaction of normalization and choice. Here, we show that a simple differential equation model of normalization explains the characteristic phasic-sustained pattern of cortical decision activity and predicts specific normalization dynamics: value coding during initial transients, time-varying value modulation, and delayed onset of contextual information. Empirically, we observe these predicted dynamics in saccade-related neurons in monkey lateral intraparietal cortex. Furthermore, such models naturally incorporate a time-weighted average of past activity, implementing an intrinsic reference-dependence in value coding. These results suggest that a single network mechanism can explain both transient and sustained decision activity, emphasizing the importance of a dynamic view of normalization in neural coding. PMID:25429145
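A compact numerical rendering of this model class, under the assumption of a single normalization pool: rates relax toward a divisively normalized value code, producing the phasic peak followed by a sustained plateau. Parameter values are illustrative, not fitted to the LIP data.

    import numpy as np

    # tau * dR_i/dt = -R_i + V_i / (sigma + w * sum_j R_j)
    tau, sigma, w, dt = 50.0, 1.0, 1.0, 1.0      # ms, arbitrary units
    V = np.array([10.0, 5.0, 1.0])               # values of three options
    R = np.zeros(3)
    trace = []
    for _ in range(400):                          # 400 ms of simulated time
        R += dt / tau * (-R + V / (sigma + w * R.sum()))
        trace.append(R.copy())

    trace = np.array(trace)
    print("peak (phasic) response :", trace.max(axis=0).round(3))
    print("final (sustained) level:", trace[-1].round(3))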
MCNP capabilities for nuclear well logging calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forster, R.A.; Little, R.C.; Briesmeister, J.F.
The Los Alamos Radiation Transport Code System (LARTCS) consists of state-of-the-art Monte Carlo and discrete ordinates transport codes and data libraries. This paper discusses how the general-purpose continuous-energy Monte Carlo code MCNP (Monte Carlo Neutron Photon), part of the LARTCS, provides a computational predictive capability for many applications of interest to the nuclear well logging community. The generalized three-dimensional geometry of MCNP is well suited for borehole-tool models. SABRINA, another component of the LARTCS, is a graphics code that can be used to interactively create a complex MCNP geometry. Users can define many source and tally characteristics with standard MCNP features. The time-dependent capability of the code is essential when modeling pulsed sources. Problems with neutrons, photons, and electrons as either single particles or coupled particles can be calculated with MCNP. The physics of neutron and photon transport and interactions is modeled in detail using the latest available cross-section data.
a Proposed Benchmark Problem for Scatter Calculations in Radiographic Modelling
NASA Astrophysics Data System (ADS)
Jaenisch, G.-R.; Bellon, C.; Schumm, A.; Tabary, J.; Duvauchelle, Ph.
2009-03-01
Code validation is a permanent concern in computer modelling and has been addressed repeatedly in eddy current and ultrasonic modelling. A good benchmark problem is sufficiently simple to be taken into account by various codes without strong requirements on geometry representation capabilities, focuses on few or even a single aspect of the problem at hand to ease interpretation and to avoid compound errors compensating one another, yields a quantitative result, and is experimentally accessible. In this paper we address code validation for one aspect of radiographic modelling: the prediction of scattered radiation. Many NDT applications cannot neglect scattered radiation, so the scatter calculation is important for faithfully simulating the inspection situation. Our benchmark problem covers the wall thickness range of 10 to 50 mm for single-wall inspections, with energies ranging from 100 to 500 keV in the first stage, and up to 1 MeV with wall thicknesses up to 70 mm in the extended stage. A simple plate geometry is sufficient for this purpose, and the scatter data are compared on the photon level, without a film model, which allows for comparisons with reference codes like MCNP. We compare results of three Monte Carlo codes (McRay, Sindbad and Moderato) as well as an analytical first-order scattering code (VXI), and compare them against results obtained with MCNP. The comparison with an analytical scatter model provides insight into the application domain where this kind of approach can successfully replace Monte Carlo calculations.
Kaliman, Ilya A; Krylov, Anna I
2017-04-30
A new hardware-agnostic contraction algorithm for tensors of arbitrary symmetry and sparsity is presented. The algorithm is implemented as a stand-alone open-source code, libxm. This code is also integrated with the general tensor library libtensor and with the Q-Chem quantum-chemistry package. An overview of the algorithm, its implementation, and benchmarks are presented. Similarly to other tensor software, the algorithm exploits efficient matrix multiplication libraries and assumes that tensors are stored in a block-tensor form. The distinguishing features of the algorithm are: (i) efficient repackaging of the individual blocks into large matrices and back, which affords efficient graphics processing unit (GPU)-enabled calculations without modifications of higher-level codes; (ii) fully asynchronous data transfer between disk storage and fast memory. The algorithm enables canonical all-electron coupled-cluster and equation-of-motion coupled-cluster calculations with single and double substitutions (CCSD and EOM-CCSD) with over 1000 basis functions on a single quad-GPU machine. We show that the algorithm exhibits the predicted theoretical scaling for canonical CCSD calculations, O(N^6), irrespective of the data size on disk. © 2017 Wiley Periodicals, Inc.
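The repack-then-multiply idea in point (i) can be shown in miniature. The sketch below is purely illustrative and assumes a dense matrix split into B x B blocks: each pair of blocks is copied into contiguous buffers and then multiplied, which is the step a real code would hand to a batched CPU or GPU GEMM; libxm's actual handling of symmetry, sparsity, and disk streaming is far more involved.

```cpp
#include <cstdio>
#include <vector>

constexpr int N = 8;                        // full matrix dimension
constexpr int B = 4;                        // block dimension

// Copy block (bi, bj) of an N x N matrix into a contiguous B x B buffer,
// the "repackaging" step that turns the inner product into a plain GEMM.
void pack(const std::vector<double>& src, int bi, int bj, double* buf) {
    for (int i = 0; i < B; ++i)
        for (int j = 0; j < B; ++j)
            buf[i * B + j] = src[(bi * B + i) * N + (bj * B + j)];
}

int main() {
    std::vector<double> A(N * N), Bm(N * N), C(N * N, 0.0);
    for (int i = 0; i < N * N; ++i) { A[i] = i % 5; Bm[i] = i % 3; }
    double a[B * B], b[B * B];
    for (int bi = 0; bi < N / B; ++bi)
        for (int bj = 0; bj < N / B; ++bj)
            for (int bk = 0; bk < N / B; ++bk) {
                pack(A, bi, bk, a);         // repackage both operand blocks
                pack(Bm, bk, bj, b);        // (a real code batches these for
                                            // the GPU and streams from disk)
                for (int i = 0; i < B; ++i)
                    for (int j = 0; j < B; ++j) {
                        double s = 0.0;
                        for (int k = 0; k < B; ++k)
                            s += a[i * B + k] * b[k * B + j];
                        C[(bi * B + i) * N + (bj * B + j)] += s;
                    }
            }
    printf("C[0][0] = %g\n", C[0]);         // spot check of the contraction
}
```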
Steady-State Computation of Constant Rotational Rate Dynamic Stability Derivatives
NASA Technical Reports Server (NTRS)
Park, Michael A.; Green, Lawrence L.
2000-01-01
Dynamic stability derivatives are essential to predicting the open and closed loop performance, stability, and controllability of aircraft. Computational determination of constant-rate dynamic stability derivatives (derivatives of aircraft forces and moments with respect to constant rotational rates) is currently performed indirectly with finite differencing of multiple time-accurate computational fluid dynamics solutions. Typical time-accurate solutions require excessive amounts of computational time to complete. Formulating Navier-Stokes (N-S) equations in a rotating noninertial reference frame and applying an automatic differentiation tool to the modified code has the potential for directly computing these derivatives with a single, much faster steady-state calculation. The ability to rapidly determine static and dynamic stability derivatives by computational methods can benefit multidisciplinary design methodologies and reduce dependency on wind tunnel measurements. The CFL3D thin-layer N-S computational fluid dynamics code was modified for this study to allow calculations on complex three-dimensional configurations with constant rotation rate components in all three axes. These CFL3D modifications also have direct application to rotorcraft and turbomachinery analyses. The modified CFL3D steady-state calculation is a new capability that showed excellent agreement with results calculated by a similar formulation. The application of automatic differentiation to CFL3D allows the static stability and body-axis rate derivatives to be calculated quickly and exactly.
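The mechanism behind applying automatic differentiation to a flow solver is easiest to see with dual numbers, the core of forward-mode AD. The sketch below differentiates a made-up moment coefficient with respect to a roll rate in a single evaluation; the aerodynamic function is invented purely for illustration, whereas the tools applied to CFL3D operate on the full solver.

```cpp
#include <cmath>
#include <cstdio>

// Forward-mode AD with dual numbers: carry (value, derivative) through every
// operation, so one evaluation yields both f and df/dp exactly.
struct Dual { double v, d; };
Dual operator+(Dual a, Dual b) { return {a.v + b.v, a.d + b.d}; }
Dual operator*(Dual a, Dual b) { return {a.v * b.v, a.v * b.d + a.d * b.v}; }
Dual sin(Dual a) { return {std::sin(a.v), std::cos(a.v) * a.d}; }

// Invented stand-in for a moment coefficient as a function of a constant
// roll rate p; a real application differentiates the flow solver itself.
Dual rollMoment(Dual p) {
    Dual half{0.5, 0.0}, k{0.02, 0.0};
    return half * sin(p) + k * p * p;
}

int main() {
    Dual p{0.3, 1.0};                       // seed: dp/dp = 1
    Dual m = rollMoment(p);
    printf("Cl = %.5f, dCl/dp = %.5f\n", m.v, m.d);
    // dCl/dp matches 0.5*cos(0.3) + 2*0.02*0.3 without finite differencing
}
```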
SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX/80
NASA Astrophysics Data System (ADS)
Kamat, Manohar P.; Watson, Brian C.
1992-02-01
The results of a research activity aimed at providing a finite element capability for analyzing turbo-machinery bladed-disk assemblies in a vector/parallel processing environment are summarized. Analysis of aircraft turbofan engines is very computationally intensive. The performance limit of modern-day computers with a single processing unit was estimated at 3 billion floating-point operations per second (3 gigaflops). In view of this limit of a sequential unit, performance rates higher than 3 gigaflops can be achieved only through vectorization and/or parallelization, as on the Alliant FX/80. Accordingly, the efforts of this critically needed research were geared towards developing and evaluating parallel finite element methods for static and vibration analysis. A special-purpose code, named with the acronym SAPNEW, performs static and eigen analysis of multi-degree-of-freedom blade models built up from flat thin shell elements.
Guo, Weixing; Langevin, C.D.
2002-01-01
This report documents a computer program (SEAWAT) that simulates variable-density, transient, ground-water flow in three dimensions. The source code for SEAWAT was developed by combining MODFLOW and MT3DMS into a single program that solves the coupled flow and solute-transport equations. The SEAWAT code follows a modular structure, and thus, new capabilities can be added with only minor modifications to the main program. SEAWAT reads and writes standard MODFLOW and MT3DMS data sets, although some extra input may be required for some SEAWAT simulations. This means that many of the existing pre- and post-processors can be used to create input data sets and analyze simulation results. Users familiar with MODFLOW and MT3DMS should have little difficulty applying SEAWAT to problems of variable-density ground-water flow.
Vectors a Fortran 90 module for 3-dimensional vector and dyadic arithmetic
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brock, B.C.
1998-02-01
A major advance contained in the new Fortran 90 language standard is the ability to define new data types and the operators associated with them. Writing computer code to implement computations with real and complex three-dimensional vectors and dyadics is greatly simplified if the equations can be implemented directly, without the need to code the vector arithmetic explicitly. The Fortran 90 module described here defines new data types for real and complex 3-dimensional vectors and dyadics, along with the common operations needed to work with these objects. Routines to allow convenient initialization and output of the new types are also included. In keeping with the philosophy of data abstraction, the details of the implementation of the data types are kept private, and the functions and operators are made generic to simplify the combining of real, complex, single- and double-precision vectors and dyadics.
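The module itself is Fortran 90, but the data-abstraction pattern it describes, define the type once, overload the operators, then transcribe equations directly, carries over to any language with operator overloading. A minimal C++ analogue (our sketch, not the module's interface) looks like this:

```cpp
#include <cstdio>

// Define a 3-vector type once, overload the operators, and equations
// transcribe directly into code without explicit component arithmetic.
struct Vec3 { double x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator*(double s, Vec3 a) { return {s * a.x, s * a.y, s * a.z}; }
double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

int main() {
    Vec3 E{1, 0, 0}, H{0, 1, 0};
    Vec3 S = cross(E, H);               // written exactly as the math reads
    printf("S = (%g, %g, %g), E.H = %g\n", S.x, S.y, S.z, dot(E, H));
}
```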
Constructing Neuronal Network Models in Massively Parallel Environments.
Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus
2017-01-01
Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.
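The allocator effect described above can be sketched compactly. In the toy below (our illustration, not NEST code), each construction thread fills its share of connections inside a thread-private monotonic arena (std::pmr), so object creation avoids contention on a single global heap, which is the essence of the thread-optimized allocator fix.

```cpp
#include <cstdio>
#include <memory_resource>
#include <thread>
#include <vector>

// Toy illustration (not NEST code): each thread constructs its shard of the
// "network" inside a private memory arena, avoiding contention on one global
// heap during thread-parallel network creation. Build with -pthread.
void buildShard(int tid, std::size_t nNeurons) {
    std::pmr::monotonic_buffer_resource arena;       // thread-private pool
    std::pmr::vector<int> synapses{&arena};          // stand-in container
    synapses.reserve(nNeurons * 100);
    for (std::size_t n = 0; n < nNeurons; ++n)
        for (int s = 0; s < 100; ++s)                // 100 fake targets/neuron
            synapses.push_back(static_cast<int>(n) ^ s);
    printf("thread %d created %zu connections\n", tid, synapses.size());
}

int main() {
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t)
        workers.emplace_back(buildShard, t, std::size_t{10000});
    for (auto& w : workers) w.join();
}
```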
An implementation of a tree code on a SIMD, parallel computer
NASA Technical Reports Server (NTRS)
Olson, Kevin M.; Dorband, John E.
1994-01-01
We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k-processor Maspar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally interacting disk galaxies using 65,536 particles. We also simulate the formation of structure in an expanding model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single-instruction, multiple-data (SIMD) computers can be used for these simulations. The cost/performance ratio for SIMD machines like the Maspar MP-1 makes them an extremely attractive alternative to either vector processors or large multiple-instruction, multiple-data (MIMD) parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.
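The balanced construction described above, sort, halve, recurse through x, y and z, condenses into a few lines. The sketch below is a serial stand-in (the MP-1 implementation used fast parallel sorts) with invented coordinates:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

struct Particle { double x[3]; };

// Sort the span along one axis, split it into equal halves, and recurse with
// the next axis: every leaf holds one particle, so the tree is balanced.
void build(std::vector<Particle>& p, int lo, int hi, int axis, int depth) {
    if (hi - lo <= 1) return;                        // leaf: one particle
    std::sort(p.begin() + lo, p.begin() + hi,
              [axis](const Particle& a, const Particle& b) {
                  return a.x[axis] < b.x[axis];
              });
    int mid = (lo + hi) / 2;                         // equal halves => balance
    printf("%*snode [%d,%d) split on axis %d\n", 2 * depth, "", lo, hi, axis);
    build(p, lo, mid, (axis + 1) % 3, depth + 1);
    build(p, mid, hi, (axis + 1) % 3, depth + 1);
}

int main() {
    std::vector<Particle> p(8);
    for (int i = 0; i < 8; ++i)                      // arbitrary test points
        p[i] = {{std::fmod(i * 0.37, 1.0), std::fmod(i * 0.61, 1.0),
                 std::fmod(i * 0.17, 1.0)}};
    build(p, 0, static_cast<int>(p.size()), 0, 0);
}
```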
Heat transfer in rocket engine combustion chambers and regeneratively cooled nozzles
NASA Technical Reports Server (NTRS)
1993-01-01
A conjugate heat transfer computational fluid dynamics (CFD) model to describe regenerative cooling in the main combustion chamber and nozzle and in the injector faceplate region for a launch vehicle class liquid rocket engine was developed. An injector model for sprays which treats the fluid as a variable density, single-phase media was formulated, incorporated into a version of the FDNS code, and used to simulate the injector flow typical of that in the Space Shuttle Main Engine (SSME). Various chamber related heat transfer analyses were made to verify the predictive capability of the conjugate heat transfer analysis provided by the FDNS code. The density based version of the FDNS code with the real fluid property models developed was successful in predicting the streamtube combustion of individual injector elements.
Statistical Analysis of CFD Solutions from the Drag Prediction Workshop
NASA Technical Reports Server (NTRS)
Hemsch, Michael J.
2002-01-01
A simple, graphical framework is presented for robust statistical evaluation of results obtained from N-Version testing of a series of RANS CFD codes. The solutions were obtained by a variety of code developers and users for the June 2001 Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration used for the computational tests is the DLR-F4 wing-body combination, previously tested in several European wind tunnels and for which a previous N-Version test had been conducted. The statistical framework is used to evaluate code results for (1) a single cruise design point, (2) drag polars and (3) drag rise. The paper concludes with a discussion of the meaning of the results, especially with respect to predictability, validation, and reporting of solutions.
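At its core, a robust evaluation of N-version results reduces to a consensus estimate plus a scatter band. The sketch below uses invented numbers and a median/MAD estimator of our choosing, not the workshop data or the paper's exact framework, to show the idea for a single design point:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// Median of a sample (taken by value so the caller's order is preserved).
double median(std::vector<double> v) {
    std::sort(v.begin(), v.end());
    std::size_t n = v.size();
    return n % 2 ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

int main() {
    // one drag value per code/participant at a single cruise design point
    std::vector<double> cd = {0.0285, 0.0291, 0.0288, 0.0302, 0.0287, 0.0295};
    double m = median(cd);
    std::vector<double> dev;
    for (double c : cd) dev.push_back(std::fabs(c - m));
    double mad = median(dev);                    // robust scatter measure
    printf("median Cd = %.4f, MAD = %.4f\n", m, mad);
    printf("flag solutions outside [%.4f, %.4f]\n", m - 3 * mad, m + 3 * mad);
}
```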
Efficient Network Coding-Based Loss Recovery for Reliable Multicast in Wireless Networks
NASA Astrophysics Data System (ADS)
Chi, Kaikai; Jiang, Xiaohong; Ye, Baoliu; Horiguchi, Susumu
Recently, network coding has been applied to the loss recovery of reliable multicast in wireless networks [19], where multiple lost packets are XOR-ed together as one packet and forwarded via a single retransmission, resulting in a significant reduction of bandwidth consumption. In this paper, we first prove that maximizing the number of lost packets for XOR-ing, which is the key part of the available network coding-based reliable multicast schemes, is actually an NP-complete problem. To address this limitation, we then propose an efficient heuristic algorithm for finding an approximately optimal solution to this optimization problem. Furthermore, we show that the packet-coding principle of maximizing the number of lost packets for XOR-ing sometimes cannot fully exploit the potential coding opportunities, and we then propose new heuristic-based schemes with a new coding principle. Simulation results demonstrate that the heuristic-based schemes have very low computational complexity and can achieve almost the same transmission efficiency as the current coding-based high-complexity schemes. Furthermore, the heuristic-based schemes with the new coding principle not only have very low complexity, but also slightly outperform the current high-complexity ones.
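The constraint that makes one XOR retransmission decodable is that no receiver may be missing more than one of the combined packets; each receiver then XORs the retransmission with the packets it already holds to recover its missing one. A greedy sketch of that selection (our toy heuristic on an invented loss pattern, not the paper's algorithm) follows:

```cpp
#include <cstdio>
#include <vector>

int main() {
    // lost[r][p] = true if receiver r lost packet p (invented loss pattern)
    std::vector<std::vector<bool>> lost = {
        {true,  false, false, true },
        {false, true,  false, false},
        {false, false, true,  true },
    };
    const int nRcv = static_cast<int>(lost.size());
    const int nPkt = static_cast<int>(lost[0].size());
    std::vector<int> xorSet;                 // packets combined into one frame
    std::vector<int> inSet(nRcv, 0);         // per-receiver losses in the set
    for (int p = 0; p < nPkt; ++p) {
        bool lostBySomeone = false, decodable = true;
        for (int r = 0; r < nRcv; ++r)
            if (lost[r][p]) {
                lostBySomeone = true;
                if (inSet[r] == 1) decodable = false;   // would miss two
            }
        if (lostBySomeone && decodable) {    // greedily grow the XOR set
            xorSet.push_back(p);
            for (int r = 0; r < nRcv; ++r) inSet[r] += lost[r][p] ? 1 : 0;
        }
    }
    printf("retransmit XOR of packets:");
    for (int p : xorSet) printf(" %d", p);   // here: 0 1 2, so one frame
    printf("\n");                            // heals one loss per receiver
}
```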
NEAMS Update. Quarterly Report for October - December 2011.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bradley, K.
2012-02-16
The Advanced Modeling and Simulation Office within the DOE Office of Nuclear Energy (NE) has been charged with revolutionizing the design tools used to build nuclear power plants during the next 10 years. To accomplish this, the DOE has brought together the national laboratories, U.S. universities, and the nuclear energy industry to establish the Nuclear Energy Advanced Modeling and Simulation (NEAMS) Program. The mission of NEAMS is to modernize computer modeling of nuclear energy systems and improve the fidelity and validity of modeling results using contemporary software environments and high-performance computers. NEAMS will create a set of engineering-level codes aimed at designing and analyzing the performance and safety of nuclear power plants and reactor fuels. The truly predictive nature of these codes will be achieved by modeling the governing phenomena at the spatial and temporal scales that dominate the behavior. These codes will be executed within a simulation environment that orchestrates code integration with respect to spatial meshing, computational resources, and execution to give the user a common 'look and feel' for setting up problems and displaying results. NEAMS is building upon a suite of existing simulation tools, including those developed by the federal Scientific Discovery through Advanced Computing and Advanced Simulation and Computing programs. NEAMS also draws upon existing simulation tools for materials and nuclear systems, although many of these are limited in terms of scale, applicability, and portability (their ability to be integrated into contemporary software and hardware architectures). NEAMS investments have directly and indirectly supported additional NE research and development programs, including those devoted to waste repositories, safeguarded separations systems, and long-term storage of used nuclear fuel. NEAMS is organized into two broad efforts, each comprising four elements. The quarterly highlights for October-December 2011 are: (1) Version 1.0 of AMP, the fuel assembly performance code, was tested on the JAGUAR supercomputer and released on November 1, 2011; a detailed discussion of this new simulation tool is given. (2) A coolant sub-channel model and a preliminary UO2 smeared-cracking model were implemented in BISON, the single-pin fuel code; more information on how these models were developed and benchmarked is given. (3) The Object Kinetic Monte Carlo model was implemented to account for nucleation events in meso-scale simulations, and a discussion of the significance of this advance is given. (4) The SHARP neutronics module, PROTEUS, was expanded to be applicable to all types of reactors, and a discussion of the importance of PROTEUS is given. (5) A plan has been finalized for integrating the high-fidelity, three-dimensional reactor code SHARP with both the systems-level code RELAP7 and the fuel assembly code AMP; this is a new initiative. (6) Work began to evaluate the applicability of AMP to the problem of dry storage of used fuel and to define a relevant problem to test the applicability. (7) A code to obtain phonon spectra from the force-constant matrix for a crystalline lattice has been completed; this important bridge between subcontinuum and continuum phenomena is discussed. (8) Benchmarking was begun on the meso-scale, finite-element fuels code MARMOT to validate its new variable-splitting algorithm. (9) A very computationally demanding simulation of diffusion-driven nucleation of new microstructural features has been completed; an explanation of the difficulty of this simulation is given. (10) Experiments were conducted with deformed steel to validate a crystal plasticity finite-element code for body-centered cubic iron. (11) The Capability Transfer Roadmap was completed and published as an internal laboratory technical report. (12) The AMP fuel assembly code input generator was integrated into the NEAMS Integrated Computational Environment (NiCE); more details on the planned NEAMS computing environment are given. (13) The NEAMS program website (neams.energy.gov) is nearly ready to launch.
NASA Technical Reports Server (NTRS)
Juang, Hann-Ming Henry; Tao, Wei-Kuo; Zeng, Xi-Ping; Shie, Chung-Lin; Simpson, Joanne; Lang, Steve
2004-01-01
The capability for massively parallel programming (MPP) using a message passing interface (MPI) has been implemented into a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The design for the MPP with MPI uses the concept of maintaining a similar code structure between the whole domain and the portions after decomposition; hence the model follows the same integration for single and multiple tasks (CPUs). It also requires minimal changes to the original code, so it is easily modified and/or managed by model developers and users who have little knowledge of MPP. The entire model domain can be sliced into a one- or two-dimensional decomposition with a halo regime, which is overlaid on the partial domains. The halo regime requires that no data be fetched across tasks during the computational stage, but it must be updated before the next computational stage through data exchange via MPI. For reproducibility, transposing data among tasks is required for the spectral transform (Fast Fourier Transform, FFT), which is used in the anelastic version of the model for solving the pressure equation. The performance of the MPI-implemented codes (i.e., the compressible and anelastic versions) was tested on three different computing platforms. The major results are: 1) both versions scale with about 99% parallel efficiency up to 256 tasks, but not at 512 tasks; 2) the anelastic version has better speedup and efficiency because it requires more computation than the compressible version; 3) equal or approximately equal numbers of slices in the x- and y-directions provide the fastest integration due to fewer data exchanges; and 4) one-dimensional slices in the x-direction result in the slowest integration due to the need for more memory relocation for computation.
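The halo discipline described above, compute on local data only, then exchange boundary strips via MPI before the next stage, looks like this in a minimal 1-D form. This is our sketch, not GCE model code; compile with an MPI wrapper (e.g. mpicxx) and launch with mpirun.

```cpp
#include <mpi.h>
#include <cstdio>
#include <vector>

// Minimal 1-D analogue of a halo update: each task owns an interior slab
// plus one-cell halos, refreshes the halos from its neighbours, and only
// then runs a stencil that touches halo cells.
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    const int n = 8;                           // interior cells per task
    std::vector<double> u(n + 2, rank);        // u[0], u[n+1] are halos
    int left  = (rank - 1 + size) % size;      // periodic neighbours
    int right = (rank + 1) % size;
    // send first interior cell left, receive right halo from the right, etc.
    MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                 &u[n + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[n], 1, MPI_DOUBLE, right, 1,
                 &u[0], 1, MPI_DOUBLE, left, 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    // one computational stage using only local data (halos now valid)
    double lap0 = u[0] - 2 * u[1] + u[2];
    printf("rank %d: stencil at first interior cell = %g\n", rank, lap0);
    MPI_Finalize();
}
```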
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tang, Guoping; D'Azevedo, Ed F; Zhang, Fan
2010-01-01
Calibration of groundwater models involves hundreds to thousands of forward solutions, each of which may solve many transient coupled nonlinear partial differential equations, resulting in a computationally intensive problem. We describe a hybrid MPI/OpenMP approach that exploits two levels of parallelism in software and hardware to reduce calibration time on multi-core computers. HydroGeoChem 5.0 (HGC5) is parallelized using OpenMP for direct solutions for a reactive transport model application and a field-scale coupled flow and transport model application. In the reactive transport model, a single parallelizable loop is identified that accounts for over 97% of the total computational time, using GPROF. Addition of a few lines of OpenMP compiler directives to the loop yields a speedup of about 10 on a 16-core compute node. For the field-scale model, parallelizable loops in 14 of the 174 HGC5 subroutines that require 99% of the execution time are identified. As these loops are parallelized incrementally, the scalability is found to be limited by a loop where Cray PAT detects over 90% cache miss rates. With this loop rewritten, a speedup similar to the first application is achieved. The OpenMP-parallelized code can be run efficiently on multiple workstations in a network or on multiple compute nodes of a cluster as slaves using parallel PEST to speed up model calibration. To run calibration on clusters as a single task, the Levenberg-Marquardt algorithm is added to HGC5 with the Jacobian calculation and lambda search parallelized using MPI. With this hybrid approach, 100-200 compute cores are used to reduce the calibration time from weeks to a few hours for these two applications. This approach is applicable to most existing groundwater model codes for many applications.
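As a concrete picture of what "a few lines of OpenMP compiler directives" means, the sketch below parallelizes a stand-in for the dominant loop; the chemistry is invented, and the one-line pragma is the entire change (build with -fopenmp):

```cpp
#include <cstdio>
#include <vector>

// Toy stand-in for a dominant reactive-transport loop: iterations are
// independent, so a single OpenMP directive parallelizes it and nothing
// else in the code needs to move.
int main() {
    const int n = 1'000'000;
    std::vector<double> conc(n, 1.0), rate(n);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i) {
        // per-cell reaction update; no iteration reads another's data
        rate[i] = -0.5 * conc[i] * conc[i];
        conc[i] += 1e-3 * rate[i];
    }
    printf("conc[0] = %f\n", conc[0]);
}
```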
Comparison of two LES codes for wind turbine wake studies
NASA Astrophysics Data System (ADS)
Sarlak, H.; Pierella, F.; Mikkelsen, R.; Sørensen, J. N.
2014-06-01
A third blind-test comparison was conducted in Norway in 2013, comparing numerical predictions of the rotor Cp and Ct and of the wake profiles with experimental results. The only large eddy simulation (LES) entry, from the Technical University of Denmark (DTU) using the in-house CFD solver EllipSys3D, proved the most reliable of the participating models at capturing the wake profiles and turbulence intensities downstream of the turbine. The workshop therefore recommended investigating other LES codes and comparing their performance with EllipSys3D. The aim of this paper is to compare two CFD solvers, DTU's in-house code EllipSys3D and the open-source toolbox OpenFOAM, for a set of actuator-line-based LES computations. Two types of simulations are performed: the wake behind a single rotor and the wake behind a cluster of three inline rotors. Results are compared in terms of velocity deficit, turbulence kinetic energy and eddy viscosity. Both codes predict similar near-wake flow structures, with the exception of OpenFOAM's simulations without the subgrid-scale model. The differences grow with increasing distance from the upstream rotor. From the single-rotor simulations, EllipSys3D is found to predict a slower wake recovery in the case of uniform laminar flow. From the 3-rotor computations, the difference between the codes is smaller, as the disturbance created by the downstream rotors causes the wake structures to break down into more homogeneous flow. It is finally observed that the OpenFOAM computations are more sensitive to the SGS models.
MCNP (Monte Carlo Neutron Photon) capabilities for nuclear well logging calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forster, R.A.; Little, R.C.; Briesmeister, J.F.
The Los Alamos Radiation Transport Code System (LARTCS) consists of state-of-the-art Monte Carlo and discrete ordinates transport codes and data libraries. The general-purpose continuous-energy Monte Carlo code MCNP (Monte Carlo Neutron Photon), part of the LARTCS, provides a computational predictive capability for many applications of interest to the nuclear well logging community. The generalized three-dimensional geometry of MCNP is well suited for borehole-tool models. SABRINA, another component of the LARTCS, is a graphics code that can be used to interactively create a complex MCNP geometry. Users can define many source and tally characteristics with standard MCNP features. The time-dependent capability of the code is essential when modeling pulsed sources. Problems with neutrons, photons, and electrons as either single particles or coupled particles can be calculated with MCNP. The physics of neutron and photon transport and interactions is modeled in detail using the latest available cross-section data. A rich collection of variance reduction features can greatly increase the efficiency of a calculation. MCNP is written in FORTRAN 77 and has been run on a variety of computer systems from scientific workstations to supercomputers. The next production version of MCNP will include features such as continuous-energy electron transport and a multitasking option. Areas of ongoing research of interest to the well logging community include angle biasing, adaptive Monte Carlo, improved discrete ordinates capabilities, and discrete ordinates/Monte Carlo hybrid development. Los Alamos has requested approval by the Department of Energy to create a Radiation Transport Computational Facility under their User Facility Program to increase external interactions with industry, universities, and other government organizations. 21 refs.
Planet formation: is it good or bad to have a stellar companion?
NASA Astrophysics Data System (ADS)
Marzari, F.; Thebault, P.; Scholl, H.
2010-04-01
Planet formation in binary star systems is a complex issue due to the gravitational perturbations of the companion star. One of the crucial steps of the core-accretion model is planetesimal accretion into large protoplanets which finally coalesce into planets. In a planetesimal swarm surrounding the primary star, the average mutual impact velocity determines whether larger bodies form or the population is ground down to dust, halting the planet formation process. This velocity is strongly influenced by the companion's gravitational pull and by gas drag. The combined effect of these two forces may act for or against planet formation, making extrasolar planets around binary stars less likely than, or as likely as, planets around single stars. Planetesimal accretion in binaries has been studied so far with two different approaches. N-body codes based on the assumption that the disk is axisymmetric are very cost-effective, since they allow the study of the mutual relative velocity with limited CPU usage. A large number of planetesimal trajectories can be computed, making it possible to outline the regions around the star where planet formation is possible. The main limitation of the N-body codes is the axisymmetric assumption. The companion's perturbations affect not only the planetesimal orbits, but also the gaseous disk, by forcing spiral density waves. In addition, the overall shape of the disk changes from circular to elliptic. Hybrid codes have recently been developed which solve the equations for the disk with a hydrodynamical grid code and use the computed gas density and velocity vector to calculate an accurate value of the gas drag force on the planetesimals. These codes are more complex and may compute the trajectories of only a limited number of planetesimals.
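A stripped-down version of the force balance at the heart of these codes is gravity from the primary plus a drag toward the local circular gas velocity. The sketch below is our toy, in GM = 1 units with a linear drag law and the companion's perturbation omitted; it shows drag damping an initially eccentric planetesimal orbit, the mechanism that, with a companion present, sets the mutual impact velocities.

```cpp
#include <cmath>
#include <cstdio>

// Toy planetesimal orbit around the primary (GM = 1) with linear gas drag
// toward the local circular gas velocity; constants are illustrative only.
int main() {
    double x = 1.0, y = 0.0, vx = 0.0, vy = 1.1;   // slightly eccentric start
    const double k = 0.05, dt = 1e-3;              // drag coefficient, step
    for (int i = 0; i < 20000; ++i) {
        double r = std::sqrt(x * x + y * y);
        double ax = -x / (r * r * r);              // gravity of the primary
        double ay = -y / (r * r * r);
        double vcirc = 1.0 / std::sqrt(r);         // circular gas flow speed
        double gx = -vcirc * y / r, gy = vcirc * x / r;
        ax += -k * (vx - gx);                      // drag toward gas velocity
        ay += -k * (vy - gy);
        vx += ax * dt; vy += ay * dt;
        x += vx * dt;  y += vy * dt;
        if (i % 5000 == 0) printf("t = %5.1f  r = %.4f\n", i * dt, r);
    }
}
```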
Fabrication and evaluation of cold/formed/weldbrazed beta-titanium skin-stiffened compression panels
NASA Technical Reports Server (NTRS)
Royster, D. M.; Bales, T. T.; Davis, R. C.; Wiant, H. R.
1983-01-01
The room temperature and elevated temperature buckling behavior of cold formed beta titanium hat shaped stiffeners joined by weld brazing to alpha-beta titanium skins was determined. A preliminary set of single stiffener compression panels were used to develop a data base for material and panel properties. These panels were tested at room temperature and 316 C (600 F). A final set of multistiffener compression panels were fabricated for room temperature tests by the process developed in making the single stiffener panels. The overall geometrical dimensions for the multistiffener panels were determined by the structural sizing computer code PASCO. The data presented from the panel tests include load shortening curves, local buckling strengths, and failure loads. Experimental buckling loads are compared with the buckling loads predicted by the PASCO code. Material property data obtained from tests of ASTM standard dogbone specimens are also presented.
Implementation of a 3D mixing layer code on parallel computers
NASA Technical Reports Server (NTRS)
Roe, K.; Thakur, R.; Dang, T.; Bogucz, E.
1995-01-01
This paper summarizes our progress and experience in the development of a Computational Fluid Dynamics code on parallel computers to simulate three-dimensional, spatially developing mixing layers. In this initial study, the three-dimensional time-dependent Euler equations are solved using a finite-volume explicit time-marching algorithm. The code was first programmed in Fortran 77 for sequential computers. It was then converted for use on parallel computers using the conventional message-passing technique; we have not been able to compile the code with the present version of HPF compilers.
2nd Generation QUATARA Flight Computer Project
NASA Technical Reports Server (NTRS)
Falker, Jay; Keys, Andrew; Fraticelli, Jose Molina; Capo-Iugo, Pedro; Peeples, Steven
2015-01-01
Single-core flight computer boards have been designed, developed, and tested (DD&T) to be flown in small satellites for the last few years. In this project, a prototype flight computer will be designed as a distributed multi-core system containing four microprocessors running code in parallel. This flight computer will be capable of performing multiple computationally intensive tasks such as processing digital and/or analog data, controlling actuator systems, managing cameras, operating robotic manipulators, and transmitting/receiving from/to a ground station. In addition, this flight computer will be designed to be fault tolerant, both by creating a robust physical hardware connection and by using a software voting scheme to determine the processors' performance. This voting scheme will leverage the work done for the Space Launch System (SLS) flight software. The prototype flight computer will be constructed with Commercial Off-The-Shelf (COTS) components, which are estimated to survive for two years in a low-Earth orbit.
Qiao, Shan; Jackson, Edward; Coussios, Constantin C.; Cleveland, Robin O.
2016-01-01
Nonlinear acoustics plays an important role in both diagnostic and therapeutic applications of biomedical ultrasound and a number of research and commercial software packages are available. In this manuscript, predictions of two solvers available in a commercial software package, pzflex, one using the finite-element-method (FEM) and the other a pseudo-spectral method, spectralflex, are compared with measurements and the Khokhlov-Zabolotskaya-Kuznetsov (KZK) Texas code (a finite-difference time-domain algorithm). The pzflex methods solve the continuity equation, momentum equation and equation of state where they account for nonlinearity to second order whereas the KZK code solves a nonlinear wave equation with a paraxial approximation for diffraction. Measurements of the field from a single element 3.3 MHz focused transducer were compared with the simulations and there was good agreement for the fundamental frequency and the harmonics; however the FEM pzflex solver incurred a high computational cost to achieve equivalent accuracy. In addition, pzflex results exhibited non-physical oscillations in the spatial distribution of harmonics when the amplitudes were relatively low. It was found that spectralflex was able to accurately capture the nonlinear fields at reasonable computational cost. These results emphasize the need to benchmark nonlinear simulations before using codes as predictive tools. PMID:27914432
Progress Towards a Rad-Hydro Code for Modern Computing Architectures LA-UR-10-02825
NASA Astrophysics Data System (ADS)
Wohlbier, J. G.; Lowrie, R. B.; Bergen, B.; Calef, M.
2010-11-01
We are entering an era of high performance computing where data movement is the overwhelming bottleneck to scalable performance, as opposed to the speed of floating-point operations per processor. All multi-core hardware paradigms, whether heterogeneous or homogeneous, be it the Cell processor, GPGPU, or multi-core x86, share this common trait. In multi-physics applications such as inertial confinement fusion or astrophysics, one may be solving multi-material hydrodynamics with tabular equation of state data lookups, radiation transport, nuclear reactions, and charged particle transport in a single time cycle. The algorithms are intensely data dependent, e.g., EOS, opacity, nuclear data, and multi-core hardware memory restrictions are forcing code developers to rethink code and algorithm design. For the past two years LANL has been funding a small effort referred to as Multi-Physics on Multi-Core to explore ideas for code design as pertaining to inertial confinement fusion and astrophysics applications. The near term goals of this project are to have a multi-material radiation hydrodynamics capability, with tabular equation of state lookups, on cartesian and curvilinear block structured meshes. In the longer term we plan to add fully implicit multi-group radiation diffusion and material heat conduction, and block structured AMR. We will report on our progress to date.
Lee, Anthony; Yau, Christopher; Giles, Michael B.; Doucet, Arnaud; Holmes, Christopher C.
2011-01-01
We present a case-study on the utility of graphics cards to perform massively parallel simulation of advanced Monte Carlo methods. Graphics cards, containing multiple Graphics Processing Units (GPUs), are self-contained parallel computational devices that can be housed in conventional desktop and laptop computers and can be thought of as prototypes of the next generation of many-core processors. For certain classes of population-based Monte Carlo algorithms they offer massively parallel simulation, with the added advantage over conventional distributed multi-core processors that they are cheap, easily accessible, easy to maintain, easy to code, dedicated local devices with low power consumption. On a canonical set of stochastic simulation examples including population-based Markov chain Monte Carlo methods and Sequential Monte Carlo methods, we find speedups from 35- to 500-fold over conventional single-threaded computer code. Our findings suggest that GPUs have the potential to facilitate the growth of statistical modelling into complex data-rich domains through the availability of cheap and accessible many-core computation. We believe the speedup we observe should motivate wider use of parallelizable simulation methods and greater methodological attention to their design. PMID:22003276
Efficient simulation of incompressible viscous flow over multi-element airfoils
NASA Technical Reports Server (NTRS)
Rogers, Stuart E.; Wiltberger, N. Lyn; Kwak, Dochan
1992-01-01
The incompressible, viscous, turbulent flow over single and multi-element airfoils is numerically simulated in an efficient manner by solving the incompressible Navier-Stokes equations. The computer code uses the method of pseudo-compressibility with an upwind-differencing scheme for the convective fluxes and an implicit line-relaxation solution algorithm. The motivation for this work includes interest in studying the high-lift take-off and landing configurations of various aircraft. In particular, accurate computation of lift and drag at various angles of attack, up to stall, is desired. Two different turbulence models are tested in computing the flow over an NACA 4412 airfoil; an accurate prediction of stall is obtained. The approach used for multi-element airfoils involves the use of multiple zones of structured grids fitted to each element. Two different approaches are compared: a patched system of grids, and an overlaid Chimera system of grids. Computational results are presented for two-element, three-element, and four-element airfoil configurations. Excellent agreement with experimental surface pressure coefficients is seen. The code converges in less than 200 iterations, requiring on the order of one minute of CPU time (on a CRAY YMP) per element in the airfoil configuration.
NASA Astrophysics Data System (ADS)
Menthe, R. W.; McColgan, C. J.; Ladden, R. M.
1991-05-01
The Unified AeroAcoustic Program (UAAP) code calculates the airloads on a single-rotation prop-fan, or propeller, and couples these airloads with an acoustic radiation theory to provide estimates of near-field or far-field noise levels. The steady airloads can also be used to calculate the nonuniform velocity components in the propeller wake. The airloads are calculated using a three-dimensional compressible panel method which considers the effects of thin, cambered, multiple blades which may be highly swept. These airloads may be either steady or unsteady. The acoustic model uses the blade thickness distribution and the steady or unsteady aerodynamic loads to calculate the acoustic radiation. The user's manual for the UAAP code is divided into five sections: general code description; input description; output description; system description; and error codes. The user must have access to the IMSL10 libraries (MATH and SFUN) for the numerous calls made for Bessel functions and matrix inversion. For plotted output, users must modify the dummy calls to plotting routines included in the code to system-specific calls appropriate to the user's installation.
A Measurement and Simulation Based Methodology for Cache Performance Modeling and Tuning
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
We present a cache performance modeling methodology that facilitates the tuning of uniprocessor cache performance for applications executing on shared memory multiprocessors by accurately predicting the effects of source-code-level modifications. Measurements on a single processor are initially used to identify the parts of the code where cache utilization improvements may significantly impact overall performance. Cache simulation based on trace-driven techniques can be carried out without gathering detailed address traces. The minimal runtime information needed for modeling the cache performance of a selected code block includes: base virtual addresses of arrays, virtual addresses of variables, and loop bounds for that code block. The rest of the information is obtained from the source code. We show that the cache performance predictions are as reliable as those obtained through trace-driven simulations. This technique is particularly helpful for exploring various "what-if" scenarios regarding the cache performance impact of alternative code structures. We explain and validate this methodology using a simple matrix-matrix multiplication program. We then apply it to predict and tune the cache performance of two realistic scientific applications taken from the Computational Fluid Dynamics (CFD) domain.
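The matrix-multiply example the methodology is validated on also illustrates why such predictions matter: two loop orders with identical arithmetic can have very different cache behaviour. The micro-benchmark below is ours, not the paper's code; compile with optimization (e.g. -O2) and compare the two timings.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <vector>

// ijk and ikj loop orders do the same arithmetic but stride through B
// differently, so their cache behaviour, and runtime, differ measurably.
int main() {
    const int n = 512;
    std::vector<double> A(n * n, 1.0), B(n * n, 2.0), C(n * n, 0.0);
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < n; ++i)            // ijk: B accessed with stride n
        for (int j = 0; j < n; ++j) {
            double s = 0.0;
            for (int k = 0; k < n; ++k) s += A[i * n + k] * B[k * n + j];
            C[i * n + j] = s;
        }
    auto t1 = std::chrono::steady_clock::now();
    std::fill(C.begin(), C.end(), 0.0);
    for (int i = 0; i < n; ++i)            // ikj: all accesses unit-stride
        for (int k = 0; k < n; ++k) {
            double a = A[i * n + k];
            for (int j = 0; j < n; ++j) C[i * n + j] += a * B[k * n + j];
        }
    auto t2 = std::chrono::steady_clock::now();
    using ms = std::chrono::duration<double, std::milli>;
    printf("ijk: %.1f ms, ikj: %.1f ms\n",
           ms(t1 - t0).count(), ms(t2 - t1).count());
}
```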
NASA Technical Reports Server (NTRS)
Quirk, James J.
1992-01-01
In this paper we describe an approach for dealing with arbitrarily complex, two-dimensional geometries: the so-called cartesian boundary method. Conceptually, the cartesian boundary method is quite simple. Solid bodies blank out areas of a background cartesian mesh, and the resultant cut cells are singled out for special attention. However, there are several obstacles that must be overcome in order to achieve a practical scheme. We present a general strategy that overcomes these obstacles, together with some details of our successful conversion of an adaptive mesh algorithm from a body-fitted code to a cartesian boundary code.
Anisn-Dort Neutron-Gamma Flux Intercomparison Exercise for a Simple Testing Model
NASA Astrophysics Data System (ADS)
Boehmer, B.; Konheiser, J.; Borodkin, G.; Brodkin, E.; Egorov, A.; Kozhevnikov, A.; Zaritsky, S.; Manturov, G.; Voloschenko, A.
2003-06-01
The ability of transport codes ANISN, DORT, ROZ-6, MCNP and TRAMO, as well as nuclear data libraries BUGLE-96, ABBN-93, VITAMIN-B6 and ENDF/B-6 to deliver consistent gamma and neutron flux results was tested in the calculation of a one-dimensional cylindrical model consisting of a homogeneous core and an outer zone with a single material. Model variants with H2O, Fe, Cr and Ni in the outer zones were investigated. The results are compared with MCNP-ENDF/B-6 results. Discrepancies are discussed. The specified test model is proposed as a computational benchmark for testing calculation codes and data libraries.
Parallel Semi-Implicit Spectral Element Atmospheric Model
NASA Astrophysics Data System (ADS)
Fournier, A.; Thomas, S.; Loft, R.
2001-05-01
The shallow-water equations (SWE) have long been used to test atmospheric-modeling numerical methods. The SWE contain essential wave-propagation and nonlinear effects of more complete models. We present a semi-implicit (SI) improvement of the Spectral Element Atmospheric Model to solve the SWE (SEAM, Taylor et al. 1997, Fournier et al. 2000, Thomas & Loft 2000). SE methods are h-p finite element methods combining the geometric flexibility of size-h finite elements with the accuracy of degree-p spectral methods. Our work suggests that exceptional parallel-computation performance is achievable by a General-Circulation-Model (GCM) dynamical core, even at modest climate-simulation resolutions (>1°). The code derivation involves weak variational formulation of the SWE, Gauss(-Lobatto) quadrature over the collocation points, and Legendre cardinal interpolators. Appropriate weak variation yields a symmetric positive-definite Helmholtz operator. To meet the Ladyzhenskaya-Babuska-Brezzi inf-sup condition and avoid spurious modes, we use a staggered grid. The SI scheme combines leapfrog and Crank-Nicolson schemes for the nonlinear and linear terms, respectively. The localization of operations to elements ideally fits the method to cache-based microprocessor computer architectures: derivatives are computed as collections of small (8x8), naturally cache-blocked matrix-vector products. SEAM also has desirable boundary-exchange communication, like finite-difference models. Timings on the IBM SP and Compaq ES40 supercomputers indicate that the SI code (20-min timestep) requires 1/3 the CPU time of the explicit code (2-min timestep) for T42 resolutions. Both codes scale nearly linearly out to 400 processors. We achieved single-processor performance up to 30% of peak for both codes on the 375-MHz IBM Power-3 processors. Fast computation and linear scaling lead to a useful climate-simulation dycore only if enough model time is computed per unit wall-clock time. An efficient SI solver is essential to substantially increase this rate. Parallel preconditioning for an iterative conjugate-gradient elliptic solver is described. We are building a GCM dycore capable of 200 GFLOPS sustained performance on clustered RISC/cache architectures using hybrid MPI/OpenMP programming.
Studies on Vapor Adsorption Systems
NASA Technical Reports Server (NTRS)
Shamsundar, N.; Ramotowski, M.
1998-01-01
The project consisted of performing experiments on single and dual bed vapor adsorption systems, thermodynamic cycle optimization, and thermal modeling. The work was described in a technical paper that appeared in conference proceedings and a Master's thesis, which were previously submitted to NASA. The present report describes some additional thermal modeling work done subsequently, and includes listings of computer codes developed during the project. Recommendations for future work are provided.
40 CFR 194.23 - Models and computer codes.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 40 Protection of Environment 26 2013-07-01 2013-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 40 Protection of Environment 26 2012-07-01 2011-07-01 true Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 40 Protection of Environment 25 2014-07-01 2014-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.
Code of Federal Regulations, 2011 CFR
2011-07-01
... 40 Protection of Environment 25 2011-07-01 2011-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
Advanced Pellet-Cladding Interaction Modeling using the US DOE CASL Fuel Performance Code: Peregrine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Montgomery, Robert O.; Capps, Nathan A.; Sunderland, Dion J.
The US DOE's Consortium for Advanced Simulation of LWRs (CASL) program has undertaken an effort to enhance and develop modeling and simulation tools for a virtual reactor application, including high-fidelity neutronics, fluid flow/thermal hydraulics, and fuel and material behavior. The fuel performance analysis efforts aim to provide 3-dimensional capabilities for single and multiple rods to assess safety margins and the impact of plant operation and fuel rod design on the fuel thermo-mechanical-chemical behavior, including Pellet-Cladding Interaction (PCI) failures and CRUD-Induced Localized Corrosion (CILC) failures in PWRs. [1-3] The CASL fuel performance code, Peregrine, is an engineering-scale code that is built upon the MOOSE/ELK/FOX computational FEM framework, which is also common to the fuel modeling framework BISON [4,5]. Peregrine uses both 2-D and 3-D geometric fuel rod representations and contains a materials properties and fuel behavior model library for the UO2 and Zircaloy system common to PWR fuel, derived from both open literature sources and the FALCON code [6]. The primary purpose of Peregrine is to accurately calculate the thermal, mechanical, and chemical processes active throughout a single fuel rod during operation in a reactor, for both steady-state and off-normal conditions.
Simulation of LHC events on a million threads
NASA Astrophysics Data System (ADS)
Childers, J. T.; Uram, T. D.; LeCompte, T. J.; Papka, M. E.; Benjamin, D. P.
2015-12-01
Demand for Grid resources is expected to double during LHC Run II as compared to Run I; the capacity of the Grid, however, will not double. The HEP community must consider how to bridge this computing gap by targeting larger compute resources and using the available compute resources as efficiently as possible. Argonne's Mira, the fifth fastest supercomputer in the world, can run roughly five times the number of parallel processes that the ATLAS experiment typically uses on the Grid. We ported Alpgen, a serial x86 code, to run as a parallel application under MPI on the Blue Gene/Q architecture. By analysis of the Alpgen code, we reduced the memory footprint to allow running 64 threads per node, utilizing the four hardware threads available per core on the PowerPC A2 processor. Event generation and unweighting, typically run as independent serial phases, are coupled together in a single job in this scenario, reducing intermediate writes to the filesystem. By these optimizations, we have successfully run LHC proton-proton physics event generation at the scale of a million threads, filling two-thirds of Mira.
X-33 Aerodynamic and Aeroheating Computations for Wind Tunnel and Flight Conditions
NASA Technical Reports Server (NTRS)
Hollis, Brian R.; Thompson, Richard A.; Murphy, Kelly J.; Nowak, Robert J.; Riley, Christopher J.; Wood, William A.; Alter, Stephen J.; Prabhu, Ramadas K.
1999-01-01
This report provides an overview of hypersonic Computational Fluid Dynamics research conducted at the NASA Langley Research Center to support the Phase II development of the X-33 vehicle. The X-33, which is being developed by Lockheed-Martin in partnership with NASA, is an experimental Single-Stage-to-Orbit demonstrator that is intended to validate critical technologies for a full-scale Reusable Launch Vehicle. As part of the development of the X-33, CFD codes have been used to predict the aerodynamic and aeroheating characteristics of the vehicle. Laminar and turbulent predictions were generated for the X-33 vehicle using two finite-volume, Navier-Stokes solvers. Inviscid solutions were also generated with an Euler code. Computations were performed for Mach numbers of 4.0 to 10.0 at angles of attack from 10 deg to 48 deg with body flap deflections of 0, 10, and 20 deg. Comparisons between predictions and wind tunnel aerodynamic and aeroheating data are presented in this paper. Aeroheating and aerodynamic predictions for flight conditions are also presented.
Unsteady Aero Computation of a 1 1/2 Stage Large Scale Rotating Turbine
NASA Technical Reports Server (NTRS)
To, Wai-Ming
2012-01-01
This report documents the work performed for the Subsonic Rotary Wing Project under NASA's Fundamental Aeronautics Program. It was funded through Task Number NNC10E420T under GESS-2 Contract NNC06BA07B in the period of 10/1/2010 to 8/31/2011. The objective of the task is to provide support for the development of variable speed power turbine technology through application of computational fluid dynamics analyses. This includes work elements in mesh generation, multistage URANS simulations, and post-processing of the simulation results for comparison with the experimental data. The unsteady CFD calculations were performed with the TURBO code running in multistage single passage (phase lag) mode. Meshes for the blade rows were generated with the NASA-developed TCGRID code. The United Technologies Research Center's 1 1/2 stage Large Scale Rotating Turbine was selected as the candidate engine configuration for this computational effort because of the completeness and availability of its data. The CFD performance is assessed and improvements are recommended for future research in this area.
Badal, Andreu; Badano, Aldo
2009-11-01
Monte Carlo simulations of radiation transport are computationally intensive and may require long computing times. The authors introduce a new paradigm for the acceleration of Monte Carlo simulations: the use of a graphics processing unit (GPU) as the main computing device instead of a central processing unit (CPU). A GPU-based Monte Carlo code that simulates photon transport in a voxelized geometry with the accurate physics models from PENELOPE has been developed using the CUDA programming model (NVIDIA Corporation, Santa Clara, CA). An outline of the new code and a sample x-ray imaging simulation with an anthropomorphic phantom are presented. A 27-fold speedup was obtained using a GPU compared to a single-core CPU. The reported results show that GPUs are currently a good alternative to CPUs for the simulation of radiation transport. Since the performance of GPUs is currently increasing at a faster pace than that of CPUs, the advantages of GPU-based software are likely to be more pronounced in the future.
Optical aberration correction for simple lenses via sparse representation
NASA Astrophysics Data System (ADS)
Cui, Jinlin; Huang, Wei
2018-04-01
Simple lenses with spherical surfaces are lightweight, inexpensive, highly flexible, and easy to process. However, they suffer from optical aberrations that limit high-quality photography. In this study, we propose a set of computational photography techniques based on sparse signal representation to remove optical aberrations, thereby allowing the recovery of images captured through a single-lens camera. The primary advantage of the proposed method is that many point spread functions, calibrated in advance at different depths, can be used to restore images in a short time; this addresses the excessive processing time that a large number of point spread functions otherwise imposes on non-blind deconvolution methods. The optical design software CODE V is used to examine the reliability of the proposed method by simulation. The simulation results reveal that the suggested method outperforms the traditional methods, and the performance of a single-lens camera is significantly enhanced both qualitatively and perceptually. In particular, the prior information obtained from CODE V can be used to process real images from a single-lens camera, which provides a convenient and accurate way to obtain the point spread functions of single-lens cameras.
CFD Based Design of a Filming Injector for N+3 Combustors
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Mongia, Hukam; Lee, Phil
2016-01-01
An effort was undertaken to perform CFD analysis of fluid flow in Lean-Direct Injection (LDI) combustors with axial swirl-venturi elements for next-generation LDI-3 combustor design. The National Combustion Code (NCC) was used to perform non-reacting and two-phase reacting flow computations for a newly designed pre-filming LDI-3 fuel injector, in single-injector and five-injector array configurations. All computations were performed with a consistent approach of mesh optimization, spray modeling, ignition, and kinetics modeling. Computational predictions of the aerodynamics of the single injector were used to arrive at an optimized main-injector design that meets effective area and fuel-air mixing criteria. Emissions (EINOx) characteristics were predicted for a medium-power engine cycle condition and will be compared with experimental measurements when they become available. The use of a PDF-like turbulence-chemistry interaction model with NCC's Time-Filtered Navier-Stokes (TFNS) solver is shown to produce a significant impact on the CFD results when compared with a laminar-chemistry TFNS approach for the five-injector computations.
Computer Description of Black Hawk Helicopter
1979-06-01
Combinatorial Geometry Models; Black Hawk Helicopter; GIFT Computer Code; Geometric Description of Targets. 20. ABSTRACT: ...description was made using the technique of combinatorial geometry (COM-GEOM) and will be used as input to the GIFT computer code... The data used by the COVART computer code was generated by the Geometric Information for Targets (GIFT) computer code. This report documents...
Multi-GPU accelerated three-dimensional FDTD method for electromagnetic simulation.
Nagaoka, Tomoaki; Watanabe, Soichi
2011-01-01
Numerical simulation with a numerical human model using the finite-difference time domain (FDTD) method has recently been performed in a number of fields in biomedical engineering. To improve the method's calculation speed and realize large-scale computing with the numerical human model, we adapted three-dimensional FDTD code to a multi-GPU environment using the Compute Unified Device Architecture (CUDA). In this study, NVIDIA Tesla C2070 boards were used as the GPGPUs. The performance of multiple GPUs is evaluated in comparison with that of a single GPU and a vector supercomputer. The calculation speed with four GPUs was approximately 3.5 times faster than with a single GPU, and was slightly (approximately 1.3 times) slower than with the supercomputer. The calculation speed of the three-dimensional FDTD method using GPUs improves significantly as the number of GPUs increases.
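The computational core of any FDTD solver is the leapfrog update of the electric and magnetic fields. The single-threaded 1-D sketch below is an illustrative miniature, not the authors' 3-D multi-GPU code; it shows the update loops that a CUDA version would distribute across GPU threads and devices.

```c
/* Minimal single-thread sketch of the 1-D FDTD (Yee) leapfrog update in
 * normalized units; a 3-D multi-GPU code decomposes the grid into slabs
 * and runs these loops as CUDA kernels, but the arithmetic is the same. */
#include <math.h>
#include <stdio.h>

#define NX 200
#define NSTEPS 500

int main(void) {
    static double ez[NX], hy[NX - 1];   /* staggered E and H fields */
    const double c = 0.5;               /* Courant number (<= 1 in 1-D) */

    for (int n = 0; n < NSTEPS; ++n) {
        for (int i = 0; i < NX - 1; ++i)          /* H update */
            hy[i] += c * (ez[i + 1] - ez[i]);
        for (int i = 1; i < NX - 1; ++i)          /* E update */
            ez[i] += c * (hy[i] - hy[i - 1]);
        /* soft Gaussian source injected at the grid center */
        ez[NX / 2] += exp(-(n - 30.0) * (n - 30.0) / 100.0);
    }
    printf("ez[NX/4] = %g\n", ez[NX / 4]);
    return 0;
}
```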
User manual for semi-circular compact range reflector code: Version 2
NASA Technical Reports Server (NTRS)
Gupta, Inder J.; Burnside, Walter D.
1987-01-01
A computer code has been developed at the Ohio State University ElectroScience Laboratory to analyze a semi-circular paraboloidal reflector with or without a rolled edge at the top and a skirt at the bottom. The code can be used to compute the total near field of the reflector or its individual components at a given distance from the center of the paraboloid. The code computes the fields along a radial, horizontal, vertical or axial cut at that distance. Thus, it is very effective in computing the size of the sweet spot for a semi-circular compact range reflector. This report describes the operation of the code. Various input and output statements are explained. Some results obtained using the computer code are presented to illustrate the code's capability as well as being samples of input/output sets.
Implementing Molecular Dynamics for Hybrid High Performance Computers - 1. Short Range Forces
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, W Michael; Wang, Peng; Plimpton, Steven J
The use of accelerators such as general-purpose graphics processing units (GPGPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines: 1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, 2) minimizing the amount of code that must be ported for efficient acceleration, 3) utilizing the available processing power from both many-core CPUs and accelerators, and 4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS. We describe algorithms for efficient short range force calculation on hybrid high performance machines. We describe a new approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPGPUs and 180 CPU cores.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Celik, I.; Chattree, M.
1988-07-01
An assessment of the theoretical and numerical aspects of the computer code, PCGC-2, is made, and the results of the application of this code to the Morgantown Energy Technology Center (METC) advanced gasification facility entrained-flow reactor, ''the gasifier,'' are presented. PCGC-2 is a code suitable for simulating pulverized coal combustion or gasification under axisymmetric (two-dimensional) flow conditions. The governing equations for the gas and particulate phase have been reviewed. The numerical procedure and the related programming difficulties have been elucidated. A single-particle model similar to the one used in PCGC-2 has been developed, programmed, and applied to some simple situations in order to gain insight into the physics of coal particle heat-up, devolatilization, and char oxidation processes. PCGC-2 was applied to the METC entrained-flow gasifier to study numerically the flash pyrolysis of coal, and gasification of coal with steam or carbon dioxide. The results from the simulations are compared with measurements. The gas and particle residence times, particle temperature, and mass component history were also calculated and the results were analyzed. The results provide useful information for understanding the fundamentals of coal gasification and for assessment of experimental results obtained using the reactor considered. 69 refs., 35 figs., 23 tabs.
Hanford meteorological station computer codes: Volume 9, The quality assurance computer codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burk, K.W.; Andrews, G.L.
1989-02-01
The Hanford Meteorological Station (HMS) was established in 1944 on the Hanford Site to collect and archive meteorological data and provide weather forecasts and related services for Hanford Site operations. The station is located approximately 1/2 mile east of the 200 West Area and is operated by PNL for the US Department of Energy. Meteorological data are collected from various sensors and equipment located on and off the Hanford Site. These data are stored in data bases on the Digital Equipment Corporation (DEC) VAX 11/750 at the HMS (hereafter referred to as the HMS computer). Files from those data bases are routinely transferred to the Emergency Management System (EMS) computer at the Unified Dose Assessment Center (UDAC). To ensure the quality and integrity of the HMS data, a set of Quality Assurance (QA) computer codes has been written. The codes will be routinely used by the HMS system manager or the data base custodian. The QA codes provide detailed output files that will be used in correcting erroneous data. The following sections in this volume describe the implementation and operation of the QA computer codes. The appendices contain detailed descriptions, flow charts, and source code listings of each computer code. 2 refs.
Composite blade structural analyzer (COBSTRAN) user's manual
NASA Technical Reports Server (NTRS)
Aiello, Robert A.
1989-01-01
The installation and use of the computer code COBSTRAN (COmposite Blade STRuctural ANalyzer), developed for the design and analysis of composite turbofan, turboprop, and wind turbine blades, are described. This code combines composite mechanics and laminate theory with an internal data base of fiber and matrix properties. Inputs to the code are constituent fiber and matrix material properties, factors reflecting the fabrication process, composite geometry, and blade geometry. COBSTRAN performs the micromechanics, macromechanics, and laminate analyses of these fiber composites. COBSTRAN generates a NASTRAN model with equivalent anisotropic homogeneous material properties. Stress output from NASTRAN is used to calculate individual ply stresses, strains, interply stresses, thru-the-thickness stresses, and failure margins. Curved panel structures may be modeled provided the curvature of a cross-section is defined by a single-value function. COBSTRAN is written in FORTRAN 77.
NASA Astrophysics Data System (ADS)
Shoemaker, Deirdre; Smith, Kenneth; Schnetter, Erik; Fiske, David; Laguna, Pablo; Pullin, Jorge
2002-04-01
Recently, stationary black holes have been successfully simulated for times of up to approximately 600-1000M, where M is the mass of the black hole. Considering that the expected burst of gravitational radiation from a binary black hole merger would last approximately 200-500M, black hole codes are approaching the point where simulations of mergers may be feasible. We will present two types of simulations of single black holes obtained with a code based on the Baumgarte-Shapiro-Shibata-Nakamura formulation of the Einstein evolution equations. One type of simulation addresses the stability properties of stationary black hole evolutions. The second type demonstrates the ability of our code to move a black hole through the computational domain. This is accomplished by shifting the stationary black hole solution to a coordinate system in which the location of the black hole is time dependent.
Andrade, Xavier; Aspuru-Guzik, Alán
2013-10-08
We discuss the application of graphical processing units (GPUs) to accelerate real-space density functional theory (DFT) calculations. To make our implementation efficient, we have developed a scheme to expose the data parallelism available in the DFT approach; this is applied to the different procedures required for a real-space DFT calculation. We present results for current-generation GPUs from AMD and Nvidia, which show that our scheme, implemented in the free code Octopus, can reach a sustained performance of up to 90 GFlops for a single GPU, representing a significant speed-up when compared to the CPU version of the code. Moreover, for some systems, our implementation can outperform a GPU Gaussian basis set code, showing that the real-space approach is a competitive alternative for DFT simulations on GPUs.
Software engineering and automatic continuous verification of scientific software
NASA Astrophysics Data System (ADS)
Piggott, M. D.; Hill, J.; Farrell, P. E.; Kramer, S. C.; Wilson, C. R.; Ham, D.; Gorman, G. J.; Bond, T.
2011-12-01
Software engineering of scientific code is challenging for a number of reasons, including pressure to publish and a lack of awareness among scientists of the pitfalls of software engineering. The Applied Modelling and Computation Group at Imperial College is a diverse group of researchers that employ best-practice software engineering methods whilst developing open source scientific software. Our main code is Fluidity - a multi-purpose computational fluid dynamics (CFD) code that can be used for a wide range of scientific applications, from earth-scale mantle convection, through basin-scale ocean dynamics, to laboratory-scale classic CFD problems, and is coupled to a number of other codes including nuclear radiation and solid modelling. Our software development infrastructure consists of a number of free tools that could be employed by any group that develops scientific code, and has been developed over a number of years with many lessons learnt. A single code base is developed by over 30 people, for which we use Bazaar for revision control, making good use of its strong branching and merging capabilities. Using features of Canonical's Launchpad platform, such as code review, blueprints for designing features, and bug reporting, gives the group, partners, and other Fluidity users an easy-to-use platform to collaborate, and allows the induction of new members of the group into an environment where software development forms a central part of their work. The code repository is coupled to an automated test and verification system which performs over 20,000 tests, including unit tests, short regression tests, code verification, and large parallel tests. Included in these tests are build tests on HPC systems, including local and UK National HPC services. Testing the code in this manner leads to a continuous verification process, not a discrete event performed once development has ceased. Much of the code verification is done via the "gold standard" of comparison to analytical solutions via the method of manufactured solutions. By developing and verifying code in tandem we avoid a number of pitfalls in scientific software development, and advocate similar procedures for other scientific code applications.
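The method of manufactured solutions mentioned above can be illustrated with a toy check that is independent of Fluidity's actual test harness: pick an exact solution, derive the forcing term analytically, and confirm that the discrete operator's residual shrinks at the expected order under mesh refinement.

```c
/* Illustrative method-of-manufactured-solutions check (not Fluidity's
 * actual test suite): choose u(x) = sin(pi x), so that -u'' = f with
 * f = pi^2 sin(pi x), then verify that the second-order discrete
 * Laplacian applied to the exact solution reproduces f with an error
 * that falls by ~4x per mesh doubling. */
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

static double max_residual(int n) {
    double h = 1.0 / n, err = 0.0;
    for (int i = 1; i < n; ++i) {
        double x = i * h;
        double lap = (sin(M_PI*(x-h)) - 2.0*sin(M_PI*x) + sin(M_PI*(x+h))) / (h*h);
        double r = fabs(-lap - M_PI*M_PI*sin(M_PI*x));   /* vs manufactured f */
        if (r > err) err = r;
    }
    return err;
}

int main(void) {
    /* Residual should drop by ~4x per mesh doubling (second order). */
    for (int n = 16; n <= 256; n *= 2)
        printf("n=%4d  residual=%.3e\n", n, max_residual(n));
    return 0;
}
```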
Effective Vectorization with OpenMP 4.5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huber, Joseph N.; Hernandez, Oscar R.; Lopez, Matthew Graham
This paper describes how the Single Instruction Multiple Data (SIMD) model and its extensions in OpenMP work, and how these are implemented in different compilers. Modern processors are highly parallel computational machines which often include multiple processors capable of executing several instructions in parallel. Understanding SIMD and executing instructions in parallel allows the processor to achieve higher performance without increasing the power required to run it. SIMD instructions can significantly reduce the runtime of code by executing a single operation on large groups of data. The SIMD model is so integral to the processor's potential performance that, if SIMD is not utilized, less than half of the processor is ever actually used. Unfortunately, using SIMD instructions is a challenge in higher level languages because most programming languages do not have a way to describe them. Most compilers are capable of vectorizing code by using the SIMD instructions, but there are many code features important for SIMD vectorization that the compiler cannot determine at compile time. OpenMP attempts to solve this by extending the C++/C and Fortran programming languages with compiler directives that express SIMD parallelism. OpenMP is used to pass hints to the compiler about the code to be executed in SIMD. This is a key resource for making optimized code, but it does not change whether or not the code can use SIMD operations. However, in many cases critical functions are limited by a poor understanding of how SIMD instructions are actually implemented, as SIMD can be implemented through vector instructions or simultaneous multi-threading (SMT). We have found that it is often the case that code cannot be vectorized, or is vectorized poorly, because the programmer does not have sufficient knowledge of how SIMD instructions work.
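A minimal example of the directives under discussion: the omp simd pragma below asserts that the loop iterations are independent, letting a compiler vectorize a reduction it might not otherwise prove safe. The kernel itself is purely illustrative.

```c
/* Minimal OpenMP SIMD example: the pragma asserts iteration
 * independence so the compiler can emit vector instructions. */
#include <stdio.h>

#define N 1024

int main(void) {
    float a[N], b[N], sum = 0.0f;
    for (int i = 0; i < N; ++i) { a[i] = i * 0.5f; b[i] = i * 0.25f; }

    #pragma omp simd reduction(+:sum)
    for (int i = 0; i < N; ++i)
        sum += a[i] * b[i];          /* vectorized dot product */

    printf("dot = %f\n", sum);
    return 0;
}
```

With gcc this compiles with -fopenmp (or -fopenmp-simd to enable only the SIMD directives); the directive is a hint plus an assertion, so correctness still rests on the loop being genuinely data-parallel.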
NASA Astrophysics Data System (ADS)
Fagan, Mike; Dueben, Peter; Palem, Krishna; Carver, Glenn; Chantry, Matthew; Palmer, Tim; Schlacter, Jeremy
2017-04-01
It has been shown that a mixed precision approach that judiciously replaces double precision with single precision calculations can speed up global simulations. In particular, a mixed precision variation of the Integrated Forecast System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF) showed virtually the same quality model results as the standard double precision version (Vana et al., Single precision in weather forecasting models: An evaluation with the IFS, Monthly Weather Review, in print). In this study, we perform detailed measurements of savings in computing time and energy using a mixed precision variation of the OpenIFS model, analogous to the IFS variation used in Vana et al. We (1) present results of energy measurements for simulations in single and double precision using Intel's RAPL technology, (2) conduct a scaling study to quantify the effects that increasing model resolution has on both energy dissipation and computing cycles, (3) analyze the differences between single-core and multicore processing, and (4) compare the effects of different compiler technologies on the mixed precision OpenIFS code. In particular, we compare Intel icc/ifort with GNU gcc/gfortran.
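On Linux, RAPL counters of the kind used for such measurements are commonly exposed through the powercap sysfs interface. The sketch below reads the package energy counter before and after a workload; the file path is an assumption about the measurement host, and the authors' actual instrumentation of OpenIFS may differ.

```c
/* Hedged sketch of reading package energy via the Linux powercap
 * interface exposed by the intel_rapl driver. The sysfs path is an
 * assumption about the target system; counter wrap-around (bounded by
 * max_energy_range_uj) is ignored here for brevity. */
#include <stdio.h>

static long long read_energy_uj(void) {
    FILE *f = fopen("/sys/class/powercap/intel-rapl:0/energy_uj", "r");
    long long uj = -1;
    if (f) {
        if (fscanf(f, "%lld", &uj) != 1) uj = -1;
        fclose(f);
    }
    return uj;   /* microjoules since counter start, -1 on failure */
}

int main(void) {
    long long e0 = read_energy_uj();
    /* ... run the workload to be measured here ... */
    long long e1 = read_energy_uj();
    if (e0 >= 0 && e1 >= 0)
        printf("energy used: %.3f J\n", (e1 - e0) * 1e-6);
    return 0;
}
```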
User's manual for semi-circular compact range reflector code
NASA Technical Reports Server (NTRS)
Gupta, Inder J.; Burnside, Walter D.
1986-01-01
A computer code was developed to analyze a semi-circular paraboloidal reflector antenna with a rolled edge at the top and a skirt at the bottom. The code can be used to compute the total near field of the antenna or its individual components at a given distance from the center of the paraboloid. Thus, it is very effective in computing the size of the sweet spot for RCS or antenna measurement. The operation of the code is described. Various input and output statements are explained. Some results obtained using the computer code are presented to illustrate the code's capability as well as being samples of input/output sets.
Highly fault-tolerant parallel computation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spielman, D.A.
We re-introduce the coded model of fault-tolerant computation in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations in which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time t on w processors can be performed reliably on a faulty machine in the coded model using w log^{O(1)} w processors and time t log^{O(1)} w. The failure probability of the computation will be at most t · exp(-w^{1/4}). The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in O(n log^{O(1)} n) sequential time and are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.
An emulator for minimizing computer resources for finite element analysis
NASA Technical Reports Server (NTRS)
Melosh, R.; Utku, S.; Islam, M.; Salama, M.
1984-01-01
A computer code, SCOPE, has been developed for predicting the computer resources required for a given analysis code, computer hardware, and structural problem. The cost of running the code is a small fraction (about 3 percent) of the cost of performing the actual analysis. However, its accuracy in predicting the CPU and I/O resources depends intrinsically on the accuracy of calibration data that must be developed once for the computer hardware and the finite element analysis code of interest. Testing of the SCOPE code on the AMDAHL 470 V/8 computer and the ELAS finite element analysis program indicated small I/O errors (3.2 percent), larger CPU errors (17.8 percent), and negligible total errors (1.5 percent).
Modeling Laser Damage Thresholds Using the Thompson-Gerstman Model
2014-10-01
The Thompson-Gerstman model was intended to be a modular tool fit for integration into other computational models. This adds usability to the standalone code...
Implementation of BT, SP, LU, and FT of NAS Parallel Benchmarks in Java
NASA Technical Reports Server (NTRS)
Schultz, Matthew; Frumkin, Michael; Jin, Hao-Qiang; Yan, Jerry
2000-01-01
A number of Java features make it an attractive but debatable choice for High Performance Computing. We have implemented the benchmarks BT, SP, LU, and FT, which operate on a single structured grid, in Java. The performance and scalability of the Java code show that significant improvements in Java compiler technology and in Java thread implementation are necessary for Java to compete with Fortran in HPC applications.
Finite Element Modeling of Coupled Flexible Multibody Dynamics and Liquid Sloshing
2006-09-01
tanks is presented. The semi-discrete combined solid and fluid equations of motion are integrated using a time-accurate parallel explicit solver... Incompressible fluid flow in a moving/deforming container, including accurate modeling of the free-surface, turbulence, and viscous effects... paper, a single computational code which uses a time-accurate explicit solution procedure is used to solve both the solid and fluid equations of
Increasing processor utilization during parallel computation rundown
NASA Technical Reports Server (NTRS)
Jones, W. H.
1986-01-01
Some parallel processing environments provide for asynchronous execution and completion of general purpose parallel computations from a single computational phase. When all the computations from such a phase are complete, a new parallel computational phase is begun. Depending upon the granularity of the parallel computations to be performed, there may be a shortage of available work as a particular computational phase draws to a close (computational rundown). This can result in the waste of computing resources and the delay of the overall problem. In many practical instances, strict sequential ordering of phases of parallel computation is not totally required. In such cases, the beginning of one phase can be correctly computed before the end of a previous phase is completed. This allows additional work to be generated somewhat earlier to keep computing resources busy during each computational rundown. The conditions under which this can occur are identified and the frequency of occurrence of such overlapping in an actual parallel Navier-Stokes code is reported. A language construct is suggested and possible control strategies for the management of such computational phase overlapping are discussed.
NASA Astrophysics Data System (ADS)
Petrenko, A.; Ofek, N.; Vlastakis, B.; Sun, L.; Leghtas, Z.; Heeres, R.; Sliwa, K. M.; Mirrahimi, M.; Jiang, L.; Devoret, M. H.; Schoelkopf, R. J.
2015-03-01
Realizing a working quantum computer requires overcoming the many challenges that come with coupling large numbers of qubits to perform logical operations. These include improving coherence times, achieving high gate fidelities, and correcting for the inevitable errors that will occur throughout the duration of an algorithm. While impressive progress has been made in all of these areas, the difficulty of combining these ingredients to demonstrate an error-protected logical qubit, comprised of many physical qubits, still remains formidable. With its large Hilbert space, superior coherence properties, and single dominant error channel (single photon loss), a superconducting 3D resonator acting as a resource for a quantum memory offers a hardware-efficient alternative to multi-qubit codes [Leghtas et al., PRL 2013]. Here we build upon recent work on cat-state encoding [Vlastakis et al., Science 2013] and photon-parity jumps [Sun et al., 2014] by exploring the effects of sequential measurements on a cavity state. Employing a transmon qubit dispersively coupled to two superconducting resonators in a cQED architecture, we further explore the application of parity measurements to characterizing such a hybrid qubit/cat-state architecture. In so doing, we demonstrate the promise of integrating cat states as central constituents of future quantum codes.
Quantitative analysis of biomedical samples using synchrotron radiation microbeams
NASA Astrophysics Data System (ADS)
Ektessabi, Ali; Shikine, Shunsuke; Yoshida, Sohei
2001-07-01
X-ray fluorescence (XRF) using a synchrotron radiation (SR) microbeam was applied to investigate distributions and concentrations of elements in single neurons of patients with neurodegenerative diseases. In this paper we introduce a computer code that has been developed to quantify the trace elements and matrix elements at the single cell level. This computer code has been used in studies of several important neurodegenerative diseases such as Alzheimer's disease (AD), Parkinson's disease (PD) and parkinsonism-dementia complex (PDC), as well as in basic biological experiments to determine the elemental changes in cells due to incorporation of foreign metal elements. The substantia nigra (SN) tissue obtained from autopsy specimens of patients with Guamanian parkinsonism-dementia complex (PDC) and control cases was examined. Quantitative XRF analysis showed that neuromelanin granules of Parkinsonian SN contained higher levels of Fe than those of the control; the concentrations were in the ranges of 2300-3100 ppm and 2000-2400 ppm, respectively. In contrast, Zn and Ni in neuromelanin granules of SN tissue from the PDC case were lower than those of the control. In particular, Zn was less than 40 ppm in SN tissue from the PDC case, while it was 560-810 ppm in the control. These changes are considered to be closely related to the neurodegeneration and cell death.
Optimizing the inner loop of the gravitational force interaction on modern processors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Warren, Michael S
2010-12-08
We have achieved superior performance on multiple generations of the fastest supercomputers in the world with our hashed oct-tree N-body code (HOT), spanning almost two decades and garnering multiple Gordon Bell Prizes for significant achievement in parallel processing. Execution time for our N-body code is largely influenced by the force calculation in the inner loop. Improvements to the inner loop using SSE3 instructions have enabled the calculation of over 200 million gravitational interactions per second per processor on a 2.6 GHz Opteron, for a computational rate of over 7 Gflops in single precision (70% of peak). We obtain optimal performance on some processors (including the Cell) by decomposing the reciprocal square root function required for a gravitational interaction into a table lookup, Chebyshev polynomial interpolation, and Newton-Raphson iteration, using the algorithm of Karp. By unrolling the loop by a factor of six, and using SPU intrinsics to compute on vectors, we obtain performance of over 16 Gflops on a single Cell SPE. Aggregated over the 8 SPEs on a Cell processor, the overall performance is roughly 130 Gflops. In comparison, the ordinary C version of our inner loop only obtains 1.6 Gflops per SPE with the spuxlc compiler.
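The reciprocal square root decomposition described above can be illustrated in simplified form: seed an estimate, then refine it with Newton-Raphson steps. Here a crude bit-level estimate stands in for the paper's table lookup plus Chebyshev interpolation, so the constants and accuracy are illustrative rather than HOT's.

```c
/* Simplified version of the inner-loop idea described above: refine an
 * initial reciprocal-square-root estimate with Newton-Raphson. The
 * paper seeds the iteration from a table lookup plus Chebyshev
 * interpolation; a coarse bit-level estimate stands in here. */
#include <stdio.h>
#include <string.h>

static float rsqrt_nr(float x) {
    float y;
    unsigned int i;
    memcpy(&i, &x, sizeof i);
    i = 0x5f3759dfu - (i >> 1);          /* crude initial estimate */
    memcpy(&y, &i, sizeof y);
    y = y * (1.5f - 0.5f * x * y * y);   /* Newton-Raphson step 1 */
    y = y * (1.5f - 0.5f * x * y * y);   /* step 2: near single precision */
    return y;
}

int main(void) {
    /* Softened gravitational interaction using the fast rsqrt. */
    float dx = 1.0f, dy = 2.0f, dz = 3.0f, eps2 = 1e-4f, m = 5.0f;
    float r2 = dx*dx + dy*dy + dz*dz + eps2;
    float rinv = rsqrt_nr(r2);
    float f = m * rinv * rinv * rinv;    /* m / r^3 scales (dx,dy,dz) */
    printf("fx=%g fy=%g fz=%g\n", f*dx, f*dy, f*dz);
    return 0;
}
```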
Exact Rayleigh scattering calculations for use with the Nimbus-7 Coastal Zone Color Scanner.
Gordon, H R; Brown, J W; Evans, R H
1988-03-01
For improved analysis of Coastal Zone Color Scanner (CZCS) imagery, the radiance reflected from a plane-parallel atmosphere and flat sea surface in the absence of aerosols (Rayleigh radiance) has been computed with an exact multiple scattering code, i.e., including polarization. The results indicate that the single scattering approximation normally used to compute this radiance can cause errors of up to 5% for small and moderate solar zenith angles. At large solar zenith angles, such as encountered in the analysis of high-latitude imagery, the errors can become much larger, e.g., >10% in the blue band. The single scattering error also varies along individual scan lines. Comparison with multiple scattering computations using scalar transfer theory, i.e., ignoring polarization, shows that scalar theory can yield errors of approximately the same magnitude as single scattering when compared with exact computations at small to moderate values of the solar zenith angle. The exact computations can be easily incorporated into CZCS processing algorithms, and, for application to future instruments with higher radiometric sensitivity, a scheme is developed with which the effect of variations in the surface pressure can be easily and accurately included in the exact computation of the Rayleigh radiance. Direct application of these computations to CZCS imagery indicates that accurate atmospheric corrections can be made with solar zenith angles at least as large as 65 degrees and probably up to at least 70 degrees with a more sensitive instrument. This suggests that the new Rayleigh radiance algorithm should produce more consistent pigment retrievals, particularly at high latitudes.
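For orientation, the single scattering approximation being tested has the familiar form L_r ≈ F0·τ_r·P(Θ)/(4π·cosθ_v), with the Rayleigh phase function P(Θ) = (3/4)(1 + cos²Θ). The sketch below omits the surface-reflection terms and normalization details of the full CZCS formulation, so every value in it is illustrative.

```c
/* Back-of-envelope sketch of the single-scattering Rayleigh radiance
 * approximation that the exact multiple-scattering results are compared
 * against. Surface-reflection terms and the full CZCS normalization are
 * omitted; treat all symbols and values here as illustrative. */
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

/* Rayleigh phase function (unpolarized). */
static double rayleigh_phase(double cos_theta) {
    return 0.75 * (1.0 + cos_theta * cos_theta);
}

int main(void) {
    double F0    = 1.0;    /* extraterrestrial irradiance (arbitrary units) */
    double tau   = 0.25;   /* Rayleigh optical thickness (blue band, approx.) */
    double th_v  = 30.0 * M_PI / 180.0;   /* viewing zenith angle */
    double Theta = 120.0 * M_PI / 180.0;  /* scattering angle */

    /* L_r ~ F0 * tau * P(Theta) / (4*pi*cos(theta_v)) for small tau. */
    double L = F0 * tau * rayleigh_phase(cos(Theta)) / (4.0 * M_PI * cos(th_v));
    printf("single-scattering Rayleigh radiance ~ %g\n", L);
    return 0;
}
```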
A generalized one-dimensional computer code for turbomachinery cooling passage flow calculations
NASA Technical Reports Server (NTRS)
Kumar, Ganesh N.; Roelke, Richard J.; Meitner, Peter L.
1989-01-01
A generalized one-dimensional computer code for analyzing the flow and heat transfer in turbomachinery cooling passages was developed. This code is capable of handling rotating cooling passages with turbulators, 180 degree turns, pin fins, finned passages, by-pass flows, tip cap impingement flows, and flow branching. The code is an extension of a one-dimensional code developed by P. Meitner. In the subject code, correlations for both heat transfer coefficient and pressure loss computations were developed to model each of the above mentioned types of coolant passage. The code has the capability of independently computing the friction factor and heat transfer coefficient on each side of a rectangular passage. Either the mass flow at the inlet to the channel or the exit plane pressure can be specified. For a specified inlet total temperature, inlet total pressure, and exit static pressure, the code computes the flow rates through the main branch and the subbranches, and the flow through the tip cap for impingement cooling, in addition to computing the coolant pressure, temperature, and heat transfer coefficient distribution in each coolant flow branch. Predictions from the subject code for both nonrotating and rotating passages agree well with experimental data. The code was used to analyze the cooling passage of a research cooled radial rotor.
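Correlation-based heat transfer modeling of the kind described usually reduces to evaluating a Nusselt-number correlation per flow element. The sketch below uses the widely known Dittus-Boelter correlation, Nu = 0.023 Re^0.8 Pr^0.4; whether this specific correlation appears in the subject code is an assumption.

```c
/* One widely used correlation of the kind such one-dimensional cooling
 * passage codes employ: the Dittus-Boelter equation for turbulent pipe
 * flow with heating, Nu = 0.023 Re^0.8 Pr^0.4. Parameter values below
 * are illustrative coolant-passage conditions. */
#include <math.h>
#include <stdio.h>

static double htc_dittus_boelter(double Re, double Pr, double k, double Dh) {
    double Nu = 0.023 * pow(Re, 0.8) * pow(Pr, 0.4);
    return Nu * k / Dh;    /* heat transfer coefficient, W/(m^2 K) */
}

int main(void) {
    double Re = 5.0e4, Pr = 0.7;     /* turbulent, air-like coolant */
    double k = 0.05, Dh = 0.004;     /* W/(m K); hydraulic diameter, m */
    printf("h = %.1f W/(m^2 K)\n", htc_dittus_boelter(Re, Pr, k, Dh));
    return 0;
}
```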
NASA Astrophysics Data System (ADS)
Hur, Min Young; Verboncoeur, John; Lee, Hae June
2014-10-01
Particle-in-cell (PIC) simulations offer high fidelity for plasma devices that require transient kinetic modeling, compared with fluid simulations. They use fewer approximations of the plasma kinetics but require many particles and grid cells to obtain meaningful results, so simulation time grows in proportion to the number of particles. Therefore, PIC simulation needs high performance computing. In this research, a graphics processing unit (GPU) is adopted for high performance computing of PIC simulations of low temperature discharge plasmas. GPUs have many-core processors and high memory bandwidth compared with a central processing unit (CPU). NVIDIA GeForce GPUs, with hundreds of cost-effective cores, were used for the test. The PIC code algorithm is divided into two modules, a field solver and a particle mover. The particle mover is divided into four routines: move, boundary, Monte Carlo collision (MCC), and deposit. Overall, the GPU code solves particle motions as well as the electrostatic potential in two-dimensional geometry almost 30 times faster than a single-CPU code. This work was supported by the Korea Institute of Science Technology Information.
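Two of the four mover routines named above, move and deposit, can be sketched serially for a 1-D electrostatic case; the boundary and MCC steps are largely omitted and all parameters are illustrative. A GPU version maps the same per-particle loops onto many CUDA threads.

```c
/* Serial sketch of the "move" and "deposit" PIC routines in 1-D: a
 * leapfrog push in an electrostatic field, then linear (cloud-in-cell)
 * charge deposition to the grid. Illustrative only; the field solver,
 * MCC, and full boundary handling are omitted. */
#include <stdio.h>

#define NP 1000
#define NG 64

int main(void) {
    double x[NP], v[NP], rho[NG] = {0}, E[NG] = {0};
    double dt = 0.1, dx = 1.0, qm = -1.0, q = 1.0;

    for (int p = 0; p < NP; ++p) { x[p] = (p % NG) * dx; v[p] = 0.0; }

    /* move: leapfrog push using the field at the particle's cell */
    for (int p = 0; p < NP; ++p) {
        int c = (int)(x[p] / dx) % NG;
        v[p] += qm * E[c] * dt;
        x[p] += v[p] * dt;
        if (x[p] < 0) x[p] += NG * dx;          /* periodic boundary */
        if (x[p] >= NG * dx) x[p] -= NG * dx;
    }

    /* deposit: linear weighting of charge to the two nearest nodes */
    for (int p = 0; p < NP; ++p) {
        int c = (int)(x[p] / dx);
        double w = x[p] / dx - c;
        rho[c % NG] += q * (1.0 - w);
        rho[(c + 1) % NG] += q * w;
    }
    printf("rho[0] = %g\n", rho[0]);
    return 0;
}
```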
Development of an integrated BEM approach for hot fluid structure interaction
NASA Technical Reports Server (NTRS)
Dargush, G. F.; Banerjee, P. K.; Shi, Y.
1991-01-01
The development of a comprehensive fluid-structure interaction capability within a boundary element computer code is described. This new capability is implemented in a completely general manner, so that quite arbitrary geometry, material properties and boundary conditions may be specified. Thus, a single analysis code can be used to run structures-only problems, fluids-only problems, or the combined fluid-structure problem. In all three cases, steady or transient conditions can be selected, with or without thermal effects. Nonlinear analyses can be solved via direct iteration or by employing a modified Newton-Raphson approach. A number of detailed numerical examples are included at the end of these two sections to validate the formulations and to emphasize both the accuracy and generality of the computer code. A brief review of the recent applicable boundary element literature is included for completeness. The fluid-structure interaction facility is discussed. Once again, several examples are provided to highlight this unique capability. A collection of potential boundary element applications that have been uncovered as a result of work related to the present grant is given. For most of those problems, satisfactory analysis techniques do not currently exist.
Efficient Proximity Computation Techniques Using ZIP Code Data for Smart Cities †
Murdani, Muhammad Harist; Hong, Bonghee
2018-01-01
In this paper, we are interested in computing ZIP code proximity from two perspectives, proximity between two ZIP codes (Ad-Hoc) and neighborhood proximity (Top-K). Such a computation can be used for ZIP code-based target marketing as one of the smart city applications. A naïve approach to this computation is the usage of the distance between ZIP codes. We redefine a distance metric combining the centroid distance with the intersecting road network between ZIP codes by using a weighted sum method. Furthermore, we prove that the results of our combined approach conform to the characteristics of distance measurement. We have proposed a general and heuristic approach for computing Ad-Hoc proximity, while for computing Top-K proximity, we have proposed a general approach only. Our experimental results indicate that our approaches are verifiable and effective in reducing the execution time and search space. PMID:29587366
Efficient Proximity Computation Techniques Using ZIP Code Data for Smart Cities †.
Murdani, Muhammad Harist; Kwon, Joonho; Choi, Yoon-Ho; Hong, Bonghee
2018-03-24
In this paper, we are interested in computing ZIP code proximity from two perspectives, proximity between two ZIP codes (Ad-Hoc) and neighborhood proximity (Top-K). Such a computation can be used for ZIP code-based target marketing as one of the smart city applications. A naïve approach to this computation is the usage of the distance between ZIP codes. We redefine a distance metric combining the centroid distance with the intersecting road network between ZIP codes by using a weighted sum method. Furthermore, we prove that the results of our combined approach conform to the characteristics of distance measurement. We have proposed a general and heuristic approach for computing Ad-Hoc proximity, while for computing Top-K proximity, we have proposed a general approach only. Our experimental results indicate that our approaches are verifiable and effective in reducing the execution time and search space.
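A hedged sketch of the weighted-sum idea in the abstract: combine the centroid distance between two ZIP code regions with a term derived from their intersecting road network. The weight and the road-network term below are illustrative assumptions, not the paper's calibrated definitions.

```c
/* Illustrative weighted-sum proximity between two ZIP code regions:
 * a centroid-distance term blended with a road-network term. The
 * specific road term and weight here are assumptions, not the
 * paper's calibrated metric. */
#include <math.h>
#include <stdio.h>

typedef struct {
    double cx, cy;           /* centroid offset between the two ZIPs, km */
    double shared_road_km;   /* length of intersecting road network */
} ZipPair;

static double zip_proximity(ZipPair p, double w) {
    double centroid = hypot(p.cx, p.cy);            /* km between centroids */
    double road = 1.0 / (1.0 + p.shared_road_km);   /* more roads -> closer */
    return w * centroid + (1.0 - w) * road;         /* weighted sum */
}

int main(void) {
    ZipPair p = { 3.0, 4.0, 12.5 };
    printf("proximity = %.3f\n", zip_proximity(p, 0.7));
    return 0;
}
```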
NASA Astrophysics Data System (ADS)
Mylnikova, Anna; Yasyukevich, Yury; Yasyukevich, Anna
2017-04-01
We have developed a technique for vertical total electron content (TEC) and differential code bias (DCB) estimation using data from a single GPS/GLONASS station. The algorithm is based on TEC expansion into Taylor series in space and time (TayAbsTEC). We validate the technique using Global Ionospheric Maps (GIM) computed by the Center for Orbit Determination in Europe (CODE) and the Jet Propulsion Laboratory (JPL). We compared differences between absolute vertical TEC (VTEC) from GIM and VTEC evaluated by TayAbsTEC for 2009 (solar activity minimum, sunspot number about 0) and 2014 (solar activity maximum, sunspot number about 110). Since there is a difference between VTEC from CODE and VTEC from JPL, we compare TayAbsTEC VTEC with both of them. We found that TayAbsTEC VTEC is closer to CODE VTEC than to JPL VTEC. The difference between TayAbsTEC VTEC and GIM VTEC is more noticeable for the solar activity maximum (2014) than for the solar activity minimum (2009) for both CODE and JPL. The distribution of VTEC differences is close to a Gaussian distribution, so we conclude that the results of TayAbsTEC are in agreement with GIM VTEC. We also compared DCBs evaluated by TayAbsTEC and DCBs from GIM computed by CODE. The TayAbsTEC DCBs are in good agreement with CODE DCBs for GPS satellites, but differ noticeably for GLONASS. We used the DCBs to correct slant TEC to find out which DCBs give better results. Slant TEC correction with CODE DCBs produces negative, nonphysical TEC values; slant TEC correction with TayAbsTEC DCBs does not produce such artifacts. The technique we developed is used for VTEC and DCB calculation using only data from local GPS/GLONASS networks. The evaluated VTEC data are in the GIM framework, which is convenient when various data analyses are made.
NASA Astrophysics Data System (ADS)
Xue, Xinwei; Cheryauka, Arvi; Tubbs, David
2006-03-01
CT imaging in interventional and minimally-invasive surgery requires high-performance computing solutions that meet operational room demands, healthcare business requirements, and the constraints of a mobile C-arm system. The computational requirements of clinical procedures using CT-like data are increasing rapidly, mainly due to the need for rapid access to medical imagery during critical surgical procedures. The highly parallel nature of the Radon transform and CT algorithms enables embedded computing solutions utilizing a parallel processing architecture to realize a significant gain of computational intensity with comparable hardware and program coding/testing expenses. In this paper, using sample 2D and 3D CT problems, we explore the programming challenges and the potential benefits of embedded computing using commodity hardware components. The accuracy and performance results obtained on three computational platforms - a single CPU, a single GPU, and a solution based on FPGA technology - have been analyzed. We have shown that hardware-accelerated CT image reconstruction can be achieved with levels of noise and feature clarity similar to program execution on a CPU, while gaining a performance increase of one or more orders of magnitude. 3D cone-beam or helical CT reconstruction and a variety of volumetric image processing applications will benefit from similar accelerations.
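The data parallelism that makes CT reconstruction amenable to GPUs and FPGAs is visible in the parallel-beam backprojection loop sketched below: every pixel accumulates independently over all projection angles. This is a generic textbook kernel, not the authors' implementation, and filtering of the projections is assumed to have happened beforehand.

```c
/* Generic parallel-beam backprojection inner loop; each pixel sums the
 * (assumed pre-filtered) sinogram along its ray for every angle. It is
 * this triply nested, per-pixel-independent loop that maps naturally
 * onto GPU threads or FPGA pipelines. */
#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define NPIX 64     /* image is NPIX x NPIX */
#define NANG 90     /* projection angles */
#define NDET 96     /* detector bins per angle */

int main(void) {
    static float sino[NANG][NDET];   /* filtered sinogram (zeros here) */
    static float img[NPIX][NPIX];

    for (int a = 0; a < NANG; ++a) {
        double th = a * M_PI / NANG, c = cos(th), s = sin(th);
        for (int iy = 0; iy < NPIX; ++iy) {
            for (int ix = 0; ix < NPIX; ++ix) {
                double x = ix - NPIX / 2.0, y = iy - NPIX / 2.0;
                int t = (int)lround(x * c + y * s + NDET / 2.0);
                if (t >= 0 && t < NDET)
                    img[iy][ix] += sino[a][t];   /* accumulate along rays */
            }
        }
    }
    printf("img[32][32] = %g\n", img[NPIX / 2][NPIX / 2]);
    return 0;
}
```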
Volume accumulator design analysis computer codes
NASA Technical Reports Server (NTRS)
Whitaker, W. D.; Shimazaki, T. T.
1973-01-01
The computer codes VANEP and VANES were written and used to aid in the design and performance calculation of the volume accumulator units (VAU) for the 5-kWe reactor thermoelectric system. VANEP computes the VAU design which meets the primary coolant loop VAU volume and pressure performance requirements. VANES computes the performance of the VAU design, determined from the VANEP code, at the conditions of the secondary coolant loop. The codes can also compute the performance characteristics of the VAUs under conditions of possible modes of failure which still permit continued system operation.
"Hour of Code": Can It Change Students' Attitudes toward Programming?
ERIC Educational Resources Information Center
Du, Jie; Wimmer, Hayden; Rada, Roy
2016-01-01
The Hour of Code is a one-hour introduction to computer science organized by Code.org, a non-profit dedicated to expanding participation in computer science. This study investigated the impact of the Hour of Code on students' attitudes towards computer programming and their knowledge of programming. A sample of undergraduate students from two…
Development of Unsteady Aerodynamic and Aeroelastic Reduced-Order Models Using the FUN3D Code
NASA Technical Reports Server (NTRS)
Silva, Walter A.; Vatsa, Veer N.; Biedron, Robert T.
2009-01-01
Recent significant improvements to the development of CFD-based unsteady aerodynamic reduced-order models (ROMs) are implemented into the FUN3D unstructured flow solver. These improvements include the simultaneous excitation of the structural modes of the CFD-based unsteady aerodynamic system via a single CFD solution, minimization of the error between the full CFD and the ROM unsteady aerodynamic solutions, and computation of a root locus plot of the aeroelastic ROM. Results are presented for a viscous version of the two-dimensional Benchmark Active Controls Technology (BACT) model and an inviscid version of the AGARD 445.6 aeroelastic wing using the FUN3D code.
Synergia: an accelerator modeling tool with 3-D space charge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amundson, James F.; Spentzouris, P.; /Fermilab
2004-07-01
High precision modeling of space-charge effects, together with accurate treatment of single-particle dynamics, is essential for designing future accelerators as well as optimizing the performance of existing machines. We describe Synergia, a high-fidelity parallel beam dynamics simulation package with fully three dimensional space-charge capabilities and a higher order optics implementation. We describe the computational techniques, the advanced human interface, and the parallel performance obtained using large numbers of macroparticles. We also perform code benchmarks comparing to semi-analytic results and other codes. Finally, we present initial results on particle tune spread, beam halo creation, and emittance growth in the Fermilab booster accelerator.
Multitasking for flows about multiple body configurations using the chimera grid scheme
NASA Technical Reports Server (NTRS)
Dougherty, F. C.; Morgan, R. L.
1987-01-01
The multitasking of a finite-difference scheme using multiple overset meshes is described. In this chimera, or multiple overset mesh, approach, a multiple body configuration is mapped using a major grid about the main component of the configuration, with minor overset meshes used to map each additional component. This type of code is well suited to multitasking. Both steady and unsteady two-dimensional computations are run on parallel processors on a CRAY X-MP/48, usually with one mesh per processor. Flow field results are compared with single processor results to demonstrate the feasibility of running multiple mesh codes on parallel processors and to show the increase in efficiency.
A high-speed BCI based on code modulation VEP
NASA Astrophysics Data System (ADS)
Bin, Guangyu; Gao, Xiaorong; Wang, Yijun; Li, Yun; Hong, Bo; Gao, Shangkai
2011-04-01
Recently, electroencephalogram-based brain-computer interfaces (BCIs) have attracted much attention in the fields of neural engineering and rehabilitation due to their noninvasiveness. However, the low communication speed of current BCI systems greatly limits their practical application. In this paper, we present a high-speed BCI based on code modulation of visual evoked potentials (c-VEP). Thirty-two target stimuli were modulated by a time-shifted binary pseudorandom sequence. A multichannel identification method based on canonical correlation analysis (CCA) was used for target identification. The online system achieved an average information transfer rate (ITR) of 108 ± 12 bits min-1 on five subjects with a maximum ITR of 123 bits min-1 for a single subject.
Development of an Aeroelastic Analysis Including a Viscous Flow Model
NASA Technical Reports Server (NTRS)
Keith, Theo G., Jr.; Bakhle, Milind A.
2001-01-01
Under this grant, Version 4 of the three-dimensional Navier-Stokes aeroelastic code (TURBO-AE) has been developed and verified. The TURBO-AE Version 4 aeroelastic code allows flutter calculations for a fan, compressor, or turbine blade row. This code models a vibrating three-dimensional bladed disk configuration and the associated unsteady flow (including shocks, and viscous effects) to calculate the aeroelastic instability using a work-per-cycle approach. Phase-lagged (time-shift) periodic boundary conditions are used to model the phase lag between adjacent vibrating blades. The direct-store approach is used for this purpose to reduce the computational domain to a single interblade passage. A disk storage option, implemented using direct access files, is available to reduce the large memory requirements of the direct-store approach. Other researchers have implemented 3D inlet/exit boundary conditions based on eigen-analysis. Appendix A: Aeroelastic calculations based on three-dimensional euler analysis. Appendix B: Unsteady aerodynamic modeling of blade vibration using the turbo-V3.1 code.
ANNA: A Convolutional Neural Network Code for Spectroscopic Analysis
NASA Astrophysics Data System (ADS)
Lee-Brown, Donald; Anthony-Twarog, Barbara J.; Twarog, Bruce A.
2018-01-01
We present ANNA, a Python-based convolutional neural network code for the automated analysis of stellar spectra. ANNA provides a flexible framework that allows atmospheric parameters such as temperature and metallicity to be determined with accuracies comparable to those of established but less efficient techniques. ANNA performs its parameterization extremely quickly; typically several thousand spectra can be analyzed in less than a second. Additionally, the code incorporates features which greatly speed up the training process necessary for the neural network to measure spectra accurately, resulting in a tool that can easily be run on a single desktop or laptop computer. Thus, ANNA is useful in an era when spectrographs increasingly have the capability to collect dozens to hundreds of spectra each night. This talk will cover the basic features included in ANNA and demonstrate its performance in two use cases: an open cluster abundance analysis involving several hundred spectra, and a metal-rich field star study. Applicability of the code to large survey datasets will also be discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, B.C.J.; Sha, W.T.; Doria, M.L.
1980-11-01
The governing equations, i.e., conservation equations for mass, momentum, and energy, are solved as a boundary-value problem in space and an initial-value problem in time. The BODYFIT-1FE code uses the technique of boundary-fitted coordinate systems, where all the physical boundaries are transformed to be coincident with constant coordinate lines in the transformed space. By using this technique, one can prescribe boundary conditions accurately without interpolation. The transformed governing equations in terms of the boundary-fitted coordinates are then solved by using an implicit cell-by-cell procedure with a choice of either central or upwind convective derivatives. It is a true benchmark rod-bundle code without invoking any assumptions in the case of laminar flow. However, for turbulent flow, some empiricism must be employed due to the closure problem of turbulence modeling. The detailed velocity and temperature distributions calculated from the code can be used to benchmark and calibrate empirical coefficients employed in subchannel codes and porous-medium analyses.
CASL VMA FY16 Milestone Report (L3:VMA.VUQ.P13.07) Westinghouse Mixing with COBRA-TF
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gordon, Natalie
2016-09-30
COBRA-TF (CTF) is a low-resolution code currently maintained as CASL's subchannel analysis tool. CTF operates as a two-phase, compressible code over a mesh comprised of subchannels and axially discretized nodes. In part because CTF is a low-resolution code, simulation run time is not computationally expensive, only on the order of minutes. High-resolution codes such as STAR-CCM+ can be used to train lower-fidelity codes such as CTF. Unlike STAR-CCM+, CTF has no turbulence model, only a two-phase turbulent mixing coefficient, β. β can be set to a constant value or calculated in terms of Reynolds number using an empirical correlation. Results from STAR-CCM+ can be used to inform the appropriate value of β. Once β is calibrated, CTF runs can be an inexpensive alternative to costly STAR-CCM+ runs for scoping analyses. Based on the results of CTF runs, STAR-CCM+ can be run for specific parameters of interest. CASL areas of application are CIPS for single-phase analysis and DNB-CTF for two-phase analysis.
Dynamic divisive normalization predicts time-varying value coding in decision-related circuits.
Louie, Kenway; LoFaro, Thomas; Webb, Ryan; Glimcher, Paul W
2014-11-26
Normalization is a widespread neural computation, mediating divisive gain control in sensory processing and implementing a context-dependent value code in decision-related frontal and parietal cortices. Although decision-making is a dynamic process with complex temporal characteristics, most models of normalization are time-independent and little is known about the dynamic interaction of normalization and choice. Here, we show that a simple differential equation model of normalization explains the characteristic phasic-sustained pattern of cortical decision activity and predicts specific normalization dynamics: value coding during initial transients, time-varying value modulation, and delayed onset of contextual information. Empirically, we observe these predicted dynamics in saccade-related neurons in monkey lateral intraparietal cortex. Furthermore, such models naturally incorporate a time-weighted average of past activity, implementing an intrinsic reference-dependence in value coding. These results suggest that a single network mechanism can explain both transient and sustained decision activity, emphasizing the importance of a dynamic view of normalization in neural coding.
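A model of the kind described can be sketched as a forward-Euler integration of τ·dR/dt = -R + V/(σ + Σ_j w·R_j), in which each unit relaxes toward its input divided by the pooled network activity. The equation form and all parameter values here are illustrative assumptions rather than the paper's fitted model.

```c
/* Euler-integration sketch of a dynamic divisive normalization model in
 * the spirit of the one described: each unit's rate relaxes toward its
 * input value divided by a semisaturation constant plus the summed
 * activity of the pool. Parameters are illustrative only. */
#include <stdio.h>

#define NUNITS 3
#define NSTEPS 200

int main(void) {
    double R[NUNITS] = {0};
    double V[NUNITS] = {1.0, 2.0, 4.0};                /* input values */
    double tau = 50.0, sigma = 1.0, w = 1.0, dt = 1.0; /* ms units */

    for (int t = 0; t < NSTEPS; ++t) {
        double pool = 0.0;
        for (int i = 0; i < NUNITS; ++i) pool += w * R[i];
        for (int i = 0; i < NUNITS; ++i) {
            /* tau * dR/dt = -R + V / (sigma + pool) */
            double dR = (-R[i] + V[i] / (sigma + pool)) / tau;
            R[i] += dt * dR;
        }
    }
    for (int i = 0; i < NUNITS; ++i)
        printf("R[%d] = %.4f\n", i, R[i]);   /* normalized steady rates */
    return 0;
}
```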
Talking about Code: Integrating Pedagogical Code Reviews into Early Computing Courses
ERIC Educational Resources Information Center
Hundhausen, Christopher D.; Agrawal, Anukrati; Agarwal, Pawan
2013-01-01
Given the increasing importance of soft skills in the computing profession, there is good reason to provide students with more opportunities to learn and practice those skills in undergraduate computing courses. Toward that end, we have developed an active learning approach for computing education called the "Pedagogical Code Review"…
Exact Rayleigh scattering calculations for use with the Nimbus-7 Coastal Zone Color Scanner
NASA Technical Reports Server (NTRS)
Gordon, Howard R.; Brown, James W.; Evans, Robert H.
1988-01-01
The radiance reflected from a plane-parallel atmosphere and flat sea surface in the absence of aerosols has been determined with an exact multiple scattering code to improve the analysis of Nimbus-7 CZCS imagery. It is shown that the single scattering approximation normally used to compute this radiance can result in errors of up to 5 percent for small and moderate solar zenith angles. A scheme to include the effect of variations in the surface pressure in the exact computation of the Rayleigh radiance is discussed. The results of an application of these computations to CZCS imagery suggest that accurate atmospheric corrections can be obtained for solar zenith angles at least as large as 65 deg.
Guidelines for developing vectorizable computer programs
NASA Technical Reports Server (NTRS)
Miner, E. W.
1982-01-01
Some fundamental principles for developing computer programs which are compatible with array-oriented computers are presented. The emphasis is on basic techniques for structuring computer codes which are applicable in FORTRAN and do not require a special programming language or exact a significant penalty on a scalar computer. Researchers who are using numerical techniques to solve problems in engineering can apply these basic principles and thus develop transportable computer programs (in FORTRAN) which contain much vectorizable code. The vector architecture of the ASC is discussed so that the requirements of array processing can be better appreciated. The "vectorization" of a finite-difference viscous shock-layer code is used as an example to illustrate the benefits and some of the difficulties involved. Increases in computing speed with vectorization are illustrated with results from the viscous shock-layer code and from a finite-element shock tube code. The applicability of these principles was substantiated through running programs on other computers with array-associated computing characteristics, such as the Hewlett-Packard (H-P) 1000-F.
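The structuring principle the report describes, writing loops so that iterations are independent and can be processed as whole arrays, can be illustrated outside FORTRAN. Below is a numpy sketch (our own illustration, not code from the report) contrasting a scalar point-by-point loop with the equivalent whole-array form that a vectorizing compiler or array library can pipeline:

```python
# Element-wise smoothing written two ways: a scalar loop over grid points,
# and an equivalent whole-array expression amenable to vector processing.
# Purely illustrative; not code from the viscous shock-layer program.
import numpy as np

def smooth_loop(u, eps=0.1):
    out = u.copy()
    for i in range(1, len(u) - 1):          # scalar: one point per iteration
        out[i] = u[i] + eps * (u[i-1] - 2.0*u[i] + u[i+1])
    return out

def smooth_vector(u, eps=0.1):
    out = u.copy()
    # whole-array form: every interior point updated in one vectorizable sweep
    out[1:-1] = u[1:-1] + eps * (u[:-2] - 2.0*u[1:-1] + u[2:])
    return out

u = np.random.rand(10_000)
assert np.allclose(smooth_loop(u), smooth_vector(u))
```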
The Helicopter Antenna Radiation Prediction Code (HARP)
NASA Technical Reports Server (NTRS)
Klevenow, F. T.; Lynch, B. G.; Newman, E. H.; Rojas, R. G.; Scheick, J. T.; Shamansky, H. T.; Sze, K. Y.
1990-01-01
The first nine months effort in the development of a user oriented computer code, referred to as the HARP code, for analyzing the radiation from helicopter antennas is described. The HARP code uses modern computer graphics to aid in the description and display of the helicopter geometry. At low frequencies the helicopter is modeled by polygonal plates, and the method of moments is used to compute the desired patterns. At high frequencies the helicopter is modeled by a composite ellipsoid and flat plates, and computations are made using the geometrical theory of diffraction. The HARP code will provide a user friendly interface, employing modern computer graphics, to aid the user to describe the helicopter geometry, select the method of computation, construct the desired high or low frequency model, and display the results.
Enhanced fault-tolerant quantum computing in d-level systems.
Campbell, Earl T
2014-12-05
Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.
Convergence acceleration of the Proteus computer code with multigrid methods
NASA Technical Reports Server (NTRS)
Demuren, A. O.; Ibraheem, S. O.
1992-01-01
Presented here is the first part of a study to implement convergence acceleration techniques based on the multigrid concept in the Proteus computer code. A review is given of previous studies on the implementation of multigrid methods in computer codes for compressible flow analysis. Also presented is a detailed stability analysis of upwind and central-difference based numerical schemes for solving the Euler and Navier-Stokes equations. Results are given of a convergence study of the Proteus code on computational grids of different sizes. The results presented here form the foundation for the implementation of multigrid methods in the Proteus code.
CELCAP: A Computer Model for Cogeneration System Analysis
NASA Technical Reports Server (NTRS)
1985-01-01
A description of the CELCAP cogeneration analysis program is presented. A detailed description of the methodology used by the Naval Civil Engineering Laboratory in developing the CELCAP code and the procedures for analyzing cogeneration systems for a given user are given. The four engines modeled in CELCAP are: gas turbine with exhaust heat boiler, diesel engine with waste heat boiler, single automatic-extraction steam turbine, and back-pressure steam turbine. Both the design point and part-load performances are taken into account in the engine models. The load model describes how the hourly electric and steam demand of the user is represented by 24 hourly profiles. The economic model describes how the annual and life-cycle operating costs that include the costs of fuel, purchased electricity, and operation and maintenance of engines and boilers are calculated. The CELCAP code structure and principal functions of the code are described to how the various components of the code are related to each other. Three examples of the application of the CELCAP code are given to illustrate the versatility of the code. The examples shown represent cases of system selection, system modification, and system optimization.
Liu, Ruxiu; Wang, Ningquan; Kamili, Farhan; Sarioglu, A Fatih
2016-04-21
Numerous biophysical and biochemical assays rely on spatial manipulation of particles/cells as they are processed on lab-on-a-chip devices. Analysis of spatially distributed particles on these devices typically requires microscopy negating the cost and size advantages of microfluidic assays. In this paper, we introduce a scalable electronic sensor technology, called microfluidic CODES, that utilizes resistive pulse sensing to orthogonally detect particles in multiple microfluidic channels from a single electrical output. Combining the techniques from telecommunications and microfluidics, we route three coplanar electrodes on a glass substrate to create multiple Coulter counters producing distinct orthogonal digital codes when they detect particles. We specifically design a digital code set using the mathematical principles of Code Division Multiple Access (CDMA) telecommunication networks and can decode signals from different microfluidic channels with >90% accuracy through computation even if these signals overlap. As a proof of principle, we use this technology to detect human ovarian cancer cells in four different microfluidic channels fabricated using soft lithography. Microfluidic CODES offers a simple, all-electronic interface that is well suited to create integrated, low-cost lab-on-a-chip devices for cell- or particle-based assays in resource-limited settings.
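A toy sketch of the CDMA idea behind microfluidic CODES, assuming Walsh (Hadamard) sequences as the orthogonal code set (the device's actual codes are designed around its electrode geometry): overlapping per-channel signatures summed on a single output are separated by correlating against each channel's code:

```python
# CDMA-style encoding/decoding of multiple channels on one shared output.
# Walsh (Hadamard) codes are our stand-in for the device's designed code set.
import numpy as np

codes = np.array([[ 1,  1,  1,  1],
                  [ 1, -1,  1, -1],
                  [ 1,  1, -1, -1],
                  [ 1, -1, -1,  1]], dtype=float)   # orthogonal rows

# Particles in channels 1 and 3 contribute their codes (scaled by pulse
# amplitude) to the single electrical output; the events overlap in time.
signal = 1.0 * codes[1] + 0.6 * codes[3]

# Decode by correlating the shared output with each channel's code.
scores = codes @ signal / codes.shape[1]
detected = np.where(np.abs(scores) > 0.3)[0]
print(scores, detected)                              # channels 1 and 3 stand out
```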
Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing
NASA Astrophysics Data System (ADS)
Amooie, M. A.; Moortgat, J.
2017-12-01
We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with a fast Quad Core 1.2 GHz ARMv8 64-bit processor, 1 GB of RAM, and a 32 GB microSD card for local storage. Therefore, the cluster has a total RAM of 128 GB that is distributed on the individual nodes and a flash capacity of 4 TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between each node. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively parallelized scalable code. We present benchmarking results for the computational performance across various numbers of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and feasible learning platform for challenging engineering and scientific problems.
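The flavor of message-passing programming such a cluster teaches, as a short mpi4py sketch (the abstract specifies MPI but not mpi4py; that choice and the toy workload are our assumptions): each node computes a strided partial sum and a reduction combines the results on the root node.

```python
# Run with: mpiexec -n 4 python partial_sum.py
# Each rank (e.g., each RPi node) takes a strided share of the work;
# a reduction gathers the partial sums on rank 0.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n = 1_000_000
local = sum(i * i for i in range(rank, n, size))   # this rank's share
total = comm.reduce(local, op=MPI.SUM, root=0)

if rank == 0:
    print("sum of squares:", total)
```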
Computation of viscous blast wave flowfields
NASA Technical Reports Server (NTRS)
Atwood, Christopher A.
1991-01-01
A method to determine unsteady solutions of the Navier-Stokes equations was developed and applied. The structured, finite-volume, approximately factored implicit scheme uses Newton subiterations to obtain the spatially and temporally second-order accurate time history of the interaction of blast waves with stationary targets. The inviscid flux is evaluated using MacCormack's modified Steger-Warming flux or Roe flux-difference splittings with total variation diminishing limiters, while the viscous flux is computed using central differences. The use of implicit boundary conditions in conjunction with a telescoping in time and space method permitted solutions to this strongly unsteady class of problems. Comparisons of numerical, analytical, and experimental results were made in two and three dimensions. These comparisons revealed accurate wave speed resolution with nonoscillatory discontinuity capturing. The purpose of this effort was to address the three-dimensional, viscous blast-wave problem. Test cases were undertaken to reveal these methods' weaknesses in three regimes: (1) viscous-dominated flow; (2) complex unsteady flow; and (3) three-dimensional flow. Comparisons of these computations to analytic and experimental results provided initial validation of the resultant code. Additional details on the numerical method and on the validation can be found in the appendix. Presently, the code is capable of single-zone computations with selection of any permutation of solid-wall or flow-through boundaries.
Green's function methods in heavy ion shielding
NASA Technical Reports Server (NTRS)
Wilson, John W.; Costen, Robert C.; Shinn, Judy L.; Badavi, Francis F.
1993-01-01
An analytic solution to the heavy ion transport in terms of Green's function is used to generate a highly efficient computer code for space applications. The efficiency of the computer code is accomplished by a nonperturbative technique extending Green's function over the solution domain. The computer code can also be applied to accelerator boundary conditions to allow code validation in laboratory experiments.
Multi-blocking strategies for the INS3D incompressible Navier-Stokes code
NASA Technical Reports Server (NTRS)
Gatlin, Boyd
1990-01-01
With the continuing development of bigger and faster supercomputers, computational fluid dynamics (CFD) has become a useful tool for real-world engineering design and analysis. However, the number of grid points necessary to resolve realistic flow fields numerically can easily exceed the memory capacity of available computers. In addition, geometric shapes of flow fields, such as those in the Space Shuttle Main Engine (SSME) power head, may be impossible to fill with continuous grids upon which to obtain numerical solutions to the equations of fluid motion. The solution to this dilemma is simply to decompose the computational domain into subblocks of manageable size. Computer codes that are single-block by construction can be modified to handle multiple blocks, but ad-hoc changes in the FORTRAN have to be made for each geometry treated. For engineering design and analysis, what is needed is generalization so that the blocking arrangement can be specified by the user. INS3D is a computer program for the solution of steady, incompressible flow problems. It is used frequently to solve engineering problems in the CFD Branch at Marshall Space Flight Center. INS3D uses an implicit solution algorithm and the concept of artificial compressibility to provide the necessary coupling between the pressure field and the velocity field. The development of generalized multi-block capability in INS3D is described.
NASA Technical Reports Server (NTRS)
Anderson, O. L.; Chiappetta, L. M.; Edwards, D. E.; Mcvey, J. B.
1982-01-01
A user's manual describing the operation of three computer codes (ADD code, PTRAK code, and VAPDIF code) is presented. The general features of the computer codes, the input/output formats, run streams, and sample input cases are described.
Sun, Zhoutong; Lonsdale, Richard; Li, Guangyue; Reetz, Manfred T
2016-10-04
Saturation mutagenesis at sites lining the binding pockets of enzymes constitutes a viable protein engineering technique for enhancing or inverting stereoselectivity. Statistical analysis shows that oversampling in the screening step (the bottleneck) increases astronomically as the number of residues in the randomization site increases, which is the reason why reduced amino acid alphabets have been employed, in addition to splitting large sites into smaller ones. Limonene epoxide hydrolase (LEH) has previously served as the experimental platform in these methodological efforts, enabling comparisons between single-code saturation mutagenesis (SCSM) and triple-code saturation mutagenesis (TCSM), which employ only one or three amino acids, respectively, as building blocks. In this study the comparative platform is extended by exploring the efficacy of double-code saturation mutagenesis (DCSM), in which the reduced amino acid alphabet consists of two members, chosen according to the principles of rational design on the basis of structural information. The hydrolytic desymmetrization of cyclohexene oxide is used as the model reaction, with formation of either (R,R)- or (S,S)-cyclohexane-1,2-diol. DCSM proves to be clearly superior to the likewise tested SCSM, affording both R,R- and S,S-selective mutants. These variants are also good catalysts in reactions of further substrates. Docking computations reveal the basis of enantioselectivity.
Novelo-Casanova, D. A.; Lee, W.H.K.
1991-01-01
Using simulated coda waves, the resolution of the single-scattering model to extract coda Q (Qc) and its power law frequency dependence was tested. The back-scattering model of Aki and Chouet (1975) and the single isotropic-scattering model of Sato (1977) were examined. The results indicate that: (1) the input Qc models are reasonably well approximated by the two methods; (2) almost equal Qc values are recovered when the techniques sample the same coda windows; (3) low Qc models are well estimated in the frequency domain from the early and late part of the coda; and (4) models with high Qc values are more accurately extracted from late coda measurements.
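For readers unfamiliar with the back-scattering model of Aki and Chouet (1975), coda Q enters through A(f, t) = S(f) t⁻¹ exp(−π f t / Qc), so Qc is recovered from the slope of ln(A·t) versus lapse time. A synthetic-data sketch (our illustration; the paper's simulated codas are more elaborate):

```python
# Extract coda Q from the single back-scattering decay model:
#   A(f, t) = S(f) * t**-1 * exp(-pi*f*t/Qc)
# so ln(A*t) is linear in t with slope -pi*f/Qc.
import numpy as np

f = 6.0                                    # centre frequency, Hz
qc_true = 150.0
t = np.linspace(20.0, 60.0, 200)           # lapse times in the coda window, s
amp = 5.0 * t**-1 * np.exp(-np.pi * f * t / qc_true)

slope, _ = np.polyfit(t, np.log(amp * t), 1)
qc_est = -np.pi * f / slope
print(qc_est)                              # recovers ~150
```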
Automated apparatus and method of generating native code for a stitching machine
NASA Technical Reports Server (NTRS)
Miller, Jeffrey L. (Inventor)
2000-01-01
A computer system automatically generates CNC code for a stitching machine. The computer determines the locations of a present stitching point and a next stitching point. If a constraint is not found between the present stitching point and the next stitching point, the computer generates code for making a stitch at the next stitching point. If a constraint is found, the computer generates code for changing a condition (e.g., direction) of the stitching machine's stitching head.
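A minimal sketch of the generate-or-redirect logic the abstract describes; the point representation, the constraint test, and the G/M words below are hypothetical, not taken from the patent:

```python
# Emit a stitch block unless a constraint lies between the current and next
# stitching points, in which case emit a head-condition change first.
def generate_cnc(points, constraint_between):
    lines = []
    for cur, nxt in zip(points, points[1:]):
        if constraint_between(cur, nxt):
            # constraint found: change a condition of the stitching head
            lines.append("M50 ; change stitching-head condition (hypothetical M-code)")
        # no constraint (or after the change): stitch at the next point
        lines.append(f"G01 X{nxt[0]:.3f} Y{nxt[1]:.3f} ; stitch")
    return "\n".join(lines)

# Toy constraint: a barrier along x = 5 lies between the two points.
barrier = lambda a, b: (a[0] - 5.0) * (b[0] - 5.0) < 0
print(generate_cnc([(0.0, 0.0), (10.0, 0.0), (10.0, 5.0)], barrier))
```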
Computer codes developed and under development at Lewis
NASA Technical Reports Server (NTRS)
Chamis, Christos C.
1992-01-01
The objective of this summary is to provide a brief description of: (1) codes developed or under development at LeRC; and (2) the development status of IPACS with some typical early results. The computer codes that have been developed and/or are under development at LeRC are listed in the accompanying charts. This list includes: (1) the code acronym; (2) select physics descriptors; (3) current enhancements; and (4) present (9/91) code status with respect to its availability and documentation. The computer codes list is grouped by related functions such as: (1) composite mechanics; (2) composite structures; (3) integrated and 3-D analysis; (4) structural tailoring; and (5) probabilistic structural analysis. These codes provide a broad computational simulation infrastructure (technology base-readiness) for assessing the structural integrity/durability/reliability of propulsion systems. These codes serve two other very important functions: they provide an effective means of technology transfer; and they constitute a depository of corporate memory.
Characterizing a four-qubit planar lattice for arbitrary error detection
NASA Astrophysics Data System (ADS)
Chow, Jerry M.; Srinivasan, Srikanth J.; Magesan, Easwar; Córcoles, A. D.; Abraham, David W.; Gambetta, Jay M.; Steffen, Matthias
2015-05-01
Quantum error correction will be a necessary component towards realizing scalable quantum computers with physical qubits. Theoretically, it is possible to perform arbitrarily long computations if the error rate is below a threshold value. The two-dimensional surface code permits relatively high fault-tolerant thresholds at the ~1% level, and only requires a latticed network of qubits with nearest-neighbor interactions. Superconducting qubits have continued to steadily improve in coherence, gate, and readout fidelities, to become a leading candidate for implementation into larger quantum networks. Here we describe characterization experiments and calibration of a system of four superconducting qubits arranged in a planar lattice, amenable to the surface code. Insights into the particular qubit design and comparison between simulated parameters and experimentally determined parameters are given. Single- and two-qubit gate tune-up procedures are described and results for simultaneously benchmarking pairs of two-qubit gates are given. All controls are eventually used for an arbitrary error detection protocol described in separate work [Corcoles et al., Nature Communications, 6, 2015].
Bendor, Daniel
2015-01-01
In auditory cortex, temporal information within a sound is represented by two complementary neural codes: a temporal representation based on stimulus-locked firing and a rate representation, where discharge rate co-varies with the timing between acoustic events but lacks a stimulus-synchronized response. Using a computational neuronal model, we find that stimulus-locked responses are generated when sound-evoked excitation is combined with strong, delayed inhibition. In contrast to this, a non-synchronized rate representation is generated when the net excitation evoked by the sound is weak, which occurs when excitation is coincident and balanced with inhibition. Using single-unit recordings from awake marmosets (Callithrix jacchus), we validate several model predictions, including differences in the temporal fidelity, discharge rates and temporal dynamics of stimulus-evoked responses between neurons with rate and temporal representations. Together these data suggest that feedforward inhibition provides a parsimonious explanation of the neural coding dichotomy observed in auditory cortex.
Investigation of flowfields found in typical combustor geometries
NASA Technical Reports Server (NTRS)
Lilley, D. G.
1985-01-01
Activities undertaken during the entire course of research are summarized. Studies were concerned with experimental and theoretical research on 2-D axisymmetric geometries under low speed, nonreacting, turbulent, swirling flow conditions typical of gas turbine and ramjet combustion chambers. They included recirculation zone characterization, time-mean and turbulence simulation in swirling recirculating flow, sudden and gradual expansion flowfields, and further complexities and parameter influences. The study included the investigation of: a complete range of swirl strengths; swirler performance; downstream contraction nozzle sizes and locations; expansion ratios; and inlet side-wall angles. Their individual and combined effects on the test section flowfield were observed, measured, and characterized. Experimental methods included flow visualization (with smoke and neutrally-buoyant helium-filled soap bubbles), five-hole pitot probe time-mean velocity field measurements, and single-, double-, and triple-wire hot-wire anemometry measurements of time-mean velocities and normal and shear Reynolds stresses. Computational methods included development of the STARPIC code from the primitive-variable TEACH computer code, and its use in flowfield prediction and turbulence model development.
NASA Technical Reports Server (NTRS)
Hall, Edward J.; Delaney, Robert A.; Bettner, James L.
1991-01-01
The primary objective was the development of a time dependent 3-D Euler/Navier-Stokes aerodynamic analysis to predict unsteady compressible transonic flows about ducted and unducted propfan propulsion systems at angle of attack. The resulting computer codes are referred to as Advanced Ducted Propfan Analysis Codes (ADPAC). A computer program user's manual is presented for the ADPAC. Aerodynamic calculations were based on a four stage Runge-Kutta time marching finite volume solution technique with added numerical dissipation. A time accurate implicit residual smoothing operator was used for unsteady flow predictions. For unducted propfans, a single H-type grid was used to discretize each blade passage of the complete propeller. For ducted propfans, a coupled system of five grid blocks utilizing an embedded C grid about the cowl leading edge was used to discretize each blade passage. Grid systems were generated by a combined algebraic/elliptic algorithm developed specifically for ducted propfans. Numerical calculations were compared with experimental data for both ducted and unducted flows.
FY16 ASME High Temperature Code Activities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Swindeman, M. J.; Jetter, R. I.; Sham, T. -L.
2016-09-01
One of the objectives of the ASME high temperature Code activities is to develop and validate both improvements and the basic features of Section III, Division 5, Subsection HB, Subpart B (HBB). The overall scope of this task is to develop a computer program to be used to assess whether or not a specific component under specified loading conditions will satisfy the elevated temperature design requirements for Class A components in HBB. There are many features and alternative paths of varying complexity in HBB. The initial focus of this task is a basic path through the various options for a single reference material, 316H stainless steel. However, the program will be structured for eventual incorporation of all the features and permitted materials of HBB. Since this task has recently been initiated, this report focuses on the description of the initial path forward and an overall description of the approach to computer program development.
Advanced complex trait analysis.
Gray, A; Stewart, I; Tenesa, A
2012-12-01
The Genome-wide Complex Trait Analysis (GCTA) software package can quantify the contribution of genetic variation to phenotypic variation for complex traits. However, as datasets of interest continue to increase in size, GCTA becomes increasingly computationally prohibitive. We present an adapted version, Advanced Complex Trait Analysis (ACTA), demonstrating dramatically improved performance. We restructure the genetic relationship matrix (GRM) estimation phase of the code and introduce the highly optimized parallel Basic Linear Algebra Subprograms (BLAS) library combined with manual parallelization and optimization. We introduce the Linear Algebra PACKage (LAPACK) library into the restricted maximum likelihood (REML) analysis stage. For a test case with 8999 individuals and 279,435 single nucleotide polymorphisms (SNPs), we reduce the total runtime, using a compute node with two multi-core Intel Nehalem CPUs, from ∼17 h to ∼11 min. The source code is fully available under the GNU Public License, along with Linux binaries. For more information see http://www.epcc.ed.ac.uk/software-products/acta.
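The GRM phase that ACTA restructures reduces to one large matrix product: with a standardized genotype matrix Z (individuals × SNPs), the relationship matrix is A = ZZᵀ/m. A numpy sketch with toy sizes (numpy delegates the product to an optimized BLAS, the same lever ACTA pulls in its own code):

```python
# Genetic relationship matrix from standardized genotypes: A = Z @ Z.T / m.
import numpy as np

rng = np.random.default_rng(0)
n, m = 500, 2000                                # individuals, SNPs (toy sizes)
p = rng.uniform(0.05, 0.5, size=m)              # allele frequencies
geno = rng.binomial(2, p, size=(n, m)).astype(float)   # 0/1/2 genotype calls

z = (geno - 2 * p) / np.sqrt(2 * p * (1 - p))   # standardize each SNP column
grm = z @ z.T / m                               # one BLAS matrix-matrix call
print(grm.shape, grm.diagonal().mean())         # diagonal ~1 in expectation
```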
NASA Technical Reports Server (NTRS)
Sackett, L. L.; Edelbaum, T. N.; Malchow, H. L.
1974-01-01
This manual is a guide for using a computer program which calculates time-optimal trajectories for high- and low-thrust geocentric transfers. Either SEP or NEP may be assumed, and a one- or two-impulse, fixed total delta-V, initial high-thrust phase may be included. Also, a single impulse of specified delta-V may be included after the low-thrust phase. The low-thrust phase utilizes equinoctial orbital elements to avoid the classical singularities and Kryloff-Boguliuboff averaging to help insure more rapid computation time. The program is written in FORTRAN 4 in double precision for use on an IBM 360 computer. The manual includes a description of the problem treated, input/output information, examples of runs, and source code listings.
A precise goniometer/tensiometer using a low cost single-board computer
NASA Astrophysics Data System (ADS)
Favier, Benoit; Chamakos, Nikolaos T.; Papathanasiou, Athanasios G.
2017-12-01
Measuring the surface tension and the Young contact angle of a droplet is extremely important for many industrial applications. Here, considering the booming interest for small and cheap but precise experimental instruments, we have constructed a low-cost contact angle goniometer/tensiometer, based on a single-board computer (Raspberry Pi). The device runs an axisymmetric drop shape analysis (ADSA) algorithm written in Python. The code, here named DropToolKit, was developed in-house. We initially present the mathematical framework of our algorithm and then we validate our software tool against other well-established ADSA packages, including the commercial ramé-hart DROPimage Advanced as well as the DropAnalysis plugin in ImageJ. After successfully testing for various combinations of liquids and solid surfaces, we concluded that our prototype device would be highly beneficial for industrial applications as well as for scientific research in wetting phenomena compared to the commercial solutions.
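The forward problem at the heart of any ADSA tool is integrating the axisymmetric Young-Laplace equations for the drop profile; a fitting loop over the apex radius b and surface tension γ then matches the computed profile to the extracted drop edge. The sketch below shows only the integration step, with illustrative parameter values; it is our own illustration, not DropToolKit source:

```python
# Axisymmetric Young-Laplace profile in arc-length form:
#   dx/ds = cos(phi), dz/ds = sin(phi),
#   dphi/ds = 2/b + (d_rho*g/gamma)*z - sin(phi)/x,
# with the apex limit sin(phi)/x -> 1/b as s -> 0.
import numpy as np
from scipy.integrate import solve_ivp

def profile(b=1.0e-3, gamma=0.072, d_rho=998.0, g=9.81, s_max=6e-3):
    c = d_rho * g / gamma                      # capillary constant
    def rhs(s, y):
        x, z, phi = y
        curv = np.sin(phi) / x if x > 1e-12 else 1.0 / b   # apex limit
        return [np.cos(phi), np.sin(phi), 2.0 / b + c * z - curv]
    return solve_ivp(rhs, [0.0, s_max], [1e-12, 0.0, 0.0], max_step=1e-5)

sol = profile()
print(sol.y[0].max(), sol.y[1].max())          # max radius and height of profile
```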
Laser targets compensate for limitations in inertial confinement fusion drivers
NASA Astrophysics Data System (ADS)
Kilkenny, J. D.; Alexander, N. B.; Nikroo, A.; Steinman, D. A.; Nobile, A.; Bernat, T.; Cook, R.; Letts, S.; Takagi, M.; Harding, D.
2005-10-01
Success in inertial confinement fusion (ICF) requires sophisticated, characterized targets. The increasing fidelity of three-dimensional (3D), radiation hydrodynamic computer codes has made it possible to design targets for ICF which can compensate for limitations in the existing single-shot laser and Z-pinch ICF drivers. Developments in ICF target fabrication technology allow more esoteric target designs to be fabricated. At present, requirements call for new deterministic nanomaterial fabrication at the micro scale.
Robust Airborne Networking Extensions (RANGE)
2008-02-01
The IMUNES [13] project provides entire network stack virtualization and topology control inside a single FreeBSD machine. The emulated topology… A computer with an Ethernet connection, or a Linux virtual machine on some other (e.g., Windows) operating system, should work.
2.1 Patching the source code
Generic algorithms for high performance scalable geocomputing
NASA Astrophysics Data System (ADS)
de Jong, Kor; Schmitz, Oliver; Karssenberg, Derek
2016-04-01
During the last decade, the characteristics of computing hardware have changed a lot. For example, instead of a single general purpose CPU core, personal computers nowadays contain multiple cores per CPU and often general purpose accelerators, like GPUs. Additionally, compute nodes are often grouped together to form clusters or a supercomputer, providing enormous amounts of compute power. For existing earth simulation models to be able to use modern hardware platforms, their compute intensive parts must be rewritten. This can be a major undertaking and may involve many technical challenges. Compute tasks must be distributed over CPU cores, offloaded to hardware accelerators, or distributed to different compute nodes. And ideally, all of this should be done in such a way that the compute task scales well with the hardware resources. This presents two challenges: 1) how to make good use of all the compute resources and 2) how to make these compute resources available for developers of simulation models, who may not (want to) have the required technical background for distributing compute tasks. The first challenge requires the use of specialized technology (e.g. threads, OpenMP, MPI, OpenCL, CUDA). The second challenge requires the abstraction of the logic handling the distribution of compute tasks from the model-specific logic, hiding the technical details from the model developer. To assist the model developer, we are developing a C++ software library (called Fern) containing algorithms that can use all CPU cores available in a single compute node (distributing tasks over multiple compute nodes will be done at a later stage). The algorithms are grid-based (finite difference) and include local and spatial operations such as convolution filters. The algorithms handle distribution of the compute tasks to CPU cores internally. In the resulting model, the low-level details of how this is done are separated from the model-specific logic representing the modeled system. This contrasts with practices in which code for distributing compute tasks is mixed with model-specific code, and results in a more maintainable model. For flexibility and efficiency, the algorithms are configurable at compile time with respect to the following aspects: data type, value type, no-data handling, input value domain handling, and output value range handling. This makes the algorithms usable in very different contexts, without the need for making intrusive changes to existing models when using them. Applications that benefit from using the Fern library include the construction of forward simulation models in (global) hydrology (e.g. PCR-GLOBWB (Van Beek et al. 2011)), ecology, geomorphology, or land use change (e.g. PLUC (Verstegen et al. 2014)), and manipulation of hyper-resolution land surface data such as digital elevation models and remote sensing data. Using the Fern library, we have also created an add-on to the PCRaster Python Framework (Karssenberg et al. 2010) allowing its users to speed up their spatio-temporal models, sometimes by changing just a single line of Python code in their model. In our presentation we will give an overview of the design of the algorithms, providing examples of different contexts where they can be used to replace existing sequential algorithms, including the PCRaster environmental modeling software (www.pcraster.eu). We will show how the algorithms can be configured to behave differently when necessary.
References:
Karssenberg, D., Schmitz, O., Salamon, P., De Jong, K. and Bierkens, M.F.P., 2010. A software framework for construction of process-based stochastic spatio-temporal models and data assimilation. Environmental Modelling & Software, 25, pp. 489-502.
Van Beek, L.P.H., Wada, Y. and Bierkens, M.F.P., 2011. Global monthly water stress: 1. Water balance and water availability. Water Resources Research, 47.
Verstegen, J.A., Karssenberg, D., van der Hilst, F. and Faaij, A.P.C., 2014. Identifying a land use change cellular automaton by Bayesian data assimilation. Environmental Modelling & Software, 53, pp. 121-136.
DOE Office of Scientific and Technical Information (OSTI.GOV)
David Andrs; Ray Berry; Derek Gaston
The document contains the simulation results of a steady-state model PWR problem with the RELAP-7 code. The RELAP-7 code is the next generation nuclear reactor system safety analysis code being developed at Idaho National Laboratory (INL). The code is based on INL's modern scientific software development framework, MOOSE (Multi-Physics Object-Oriented Simulation Environment). This report summarizes the initial results of simulating a model steady-state single-phase PWR problem using the current version of the RELAP-7 code. The major purpose of this demonstration simulation is to show that the RELAP-7 code can be rapidly developed to simulate single-phase reactor problems. RELAP-7 is a new project started on October 1, 2011. It will become the main reactor systems simulation toolkit for RISMC (Risk Informed Safety Margin Characterization) and the next generation tool in the RELAP reactor safety/systems analysis application series (the replacement for RELAP5). The key to the success of RELAP-7 is the simultaneous advancement of physical models, numerical methods, and software design while maintaining a solid user perspective. Physical models include both PDEs (Partial Differential Equations) and ODEs (Ordinary Differential Equations) and experiment-based closure models. RELAP-7 will eventually utilize well-posed governing equations for multiphase flow, which can be strictly verified. Closure models used in RELAP5 and newly developed models will be reviewed and selected to reflect the progress made during the past three decades. RELAP-7 uses modern numerical methods, which allow implicit time integration, higher-order schemes in both time and space, and strongly coupled multi-physics simulations. RELAP-7 is written in the object-oriented programming language C++. Its development follows modern software design paradigms. The code is easy to read, develop, maintain, and couple with other codes. Most importantly, the modern software design allows the RELAP-7 code to evolve with time. RELAP-7 is a MOOSE-based application. MOOSE is a framework for solving computational engineering problems in a well-planned, managed, and coordinated way. By leveraging millions of lines of open source software packages, such as PETSc (a nonlinear solver library developed at Argonne National Laboratory) and libMesh (a finite element analysis package developed at the University of Texas), MOOSE significantly reduces the expense and time required to develop new applications. Numerical integration methods and mesh management for parallel computation are provided by MOOSE. Therefore, RELAP-7 code developers only need to focus on physics and user experience. By using the MOOSE development environment, the RELAP-7 code is developed by following the same modern software design paradigms used for other MOOSE development efforts. There are currently over 20 different MOOSE-based applications, ranging from 3-D transient neutron transport and detailed 3-D transient fuel performance analysis to long-term material aging. Multi-physics and multi-dimensional analysis capabilities can be obtained by coupling RELAP-7 with other MOOSE-based applications and by leveraging capabilities developed by other DOE programs. This allows the focus of RELAP-7 to be restricted to systems analysis-type simulations and gives priority to retaining and significantly extending RELAP5's capabilities.
FPGA acceleration of rigid-molecule docking codes
Sukhwani, B.; Herbordt, M.C.
2011-01-01
Modelling the interactions of biological molecules, or docking, is critical both to understanding basic life processes and to designing new drugs. The field programmable gate array (FPGA) based acceleration of a recently developed, complex, production docking code is described. The authors found that it is necessary to extend their previous three-dimensional (3D) correlation structure in several ways, most significantly to support simultaneous computation of several correlation functions. The result for small-molecule docking is a 100-fold speed-up of a section of the code that represents over 95% of the original run-time. An additional 2% is accelerated through a previously described method, yielding a total acceleration of 36× over a single core and 10× over a quad-core. This approach is found to be an ideal complement to graphics processing unit (GPU) based docking, which excels in the protein–protein domain.
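The kernel being accelerated here is a 3D correlation over all relative translations of two molecular grids, which the convolution theorem reduces to three FFTs; a toy numpy sketch of that scoring step (the grids and shapes are illustrative, not the production docking code):

```python
# Grid-based rigid docking scored by 3D correlation: correlate receptor and
# ligand occupancy grids over every translation at once via FFTs.
import numpy as np

n = 32
receptor = np.zeros((n, n, n)); receptor[10:20, 10:20, 10:20] = 1.0
ligand = np.zeros((n, n, n)); ligand[0:4, 0:4, 0:4] = 1.0

# One correlation function, evaluated for all relative translations in one shot.
score = np.fft.ifftn(np.fft.fftn(receptor) * np.conj(np.fft.fftn(ligand))).real
best = np.unravel_index(np.argmax(score), score.shape)
print(best, score[best])    # translation maximizing grid overlap
```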
Kapeller, Christoph; Kamada, Kyousuke; Ogawa, Hiroshi; Prueckl, Robert; Scharinger, Josef; Guger, Christoph
2014-01-01
A brain-computer-interface (BCI) allows the user to control a device or software with brain activity. Many BCIs rely on visual stimuli with constant stimulation cycles that elicit steady-state visual evoked potentials (SSVEP) in the electroencephalogram (EEG). This EEG response can be generated with a LED or a computer screen flashing at a constant frequency, and similar EEG activity can be elicited with pseudo-random stimulation sequences on a screen (code-based BCI). Using electrocorticography (ECoG) instead of EEG promises higher spatial and temporal resolution and leads to more dominant evoked potentials due to visual stimulation. This work is focused on BCIs based on visual evoked potentials (VEP) and their capability as a continuous control interface for augmentation of video applications. One 35-year-old female subject with implanted subdural grids participated in the study. The task was to select one out of four visual targets, while each was flickering with a code sequence. After a calibration run including 200 code sequences, a linear classifier was used during an evaluation run to identify the selected visual target based on the generated code-based VEPs over 20 trials. Multiple ECoG buffer lengths were tested and the subject reached a mean online classification accuracy of 99.21% for a window length of 3.15 s. Finally, the subject performed an unsupervised free run in combination with visual feedback of the current selection. Additionally, an algorithm was implemented that allowed suppression of false positive selections, which let the subject start and stop the BCI at any time. The code-based BCI system attained very high online accuracy, which makes this approach very promising for control applications where a continuous control signal is needed.
Porting ONETEP to graphical processing unit-based coprocessors. 1. FFT box operations.
Wilkinson, Karl; Skylaris, Chris-Kriton
2013-10-30
We present the first graphical processing unit (GPU) coprocessor-enabled version of the Order-N Electronic Total Energy Package (ONETEP) code for linear-scaling first principles quantum mechanical calculations on materials. This work focuses on porting to the GPU the parts of the code that involve atom-localized fast Fourier transform (FFT) operations. These are among the most computationally intensive parts of the code and are used in core algorithms such as the calculation of the charge density, the local potential integrals, the kinetic energy integrals, and the nonorthogonal generalized Wannier function gradient. We have found that direct porting of the isolated FFT operations did not provide any benefit. Instead, it was necessary to tailor the port to each of the aforementioned algorithms to optimize data transfer to and from the GPU. A detailed discussion of the methods used and tests of the resulting performance are presented, which show that individual steps in the relevant algorithms are accelerated by a significant amount. However, the transfer of data between the GPU and host machine is a significant bottleneck in the reported version of the code. In addition, an initial investigation into a dynamic precision scheme for the ONETEP energy calculation has been performed to take advantage of the enhanced single precision capabilities of GPUs. The methods used here result in no disruption to the existing code base. Furthermore, as the developments reported here concern the core algorithms, they will benefit the full range of ONETEP functionality. Our use of a directive-based programming model ensures portability to other forms of coprocessors and will allow this work to form the basis of future developments to the code designed to support emerging high-performance computing platforms.
Edge Simulation Laboratory Progress and Plans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cohen, R
The Edge Simulation Laboratory (ESL) is a project to develop a gyrokinetic code for MFE edge plasmas based on continuum (Eulerian) techniques. ESL is a base-program activity of OFES, with an allied algorithm research activity funded by the OASCR base math program. ESL OFES funds directly support about 0.8 FTE of career staff at LLNL, a postdoc and a small fraction of an FTE at GA, and a graduate student at UCSD. In addition, the allied OASCR program funds about 1/2 FTE each in the computations directorates at LBNL and LLNL. OFES ESL funding for LLNL and UCSD began in fall 2005, while funding for GA and the math team began about a year ago. ESL's continuum approach is a complement to the PIC-based methods of the CPES Project, and was selected (1) because of concerns about noise issues associated with PIC in the high-density-contrast environment of the edge pedestal, (2) to be able to exploit advanced numerical methods developed for fluid codes, and (3) to build upon the successes of core continuum gyrokinetic codes such as GYRO, GS2 and GENE. The ESL project presently has three components: TEMPEST, a full-f, full-geometry (single-null divertor, or arbitrary-shape closed flux surfaces) code in E, μ (energy, magnetic-moment) coordinates; EGK, a simple-geometry rapid-prototype code; and the math component, which is developing and implementing algorithms for a next-generation code. Progress would be accelerated if we could find funding for a fourth, computer science, component, which would develop software infrastructure, provide user support, and address needs for data handling and analysis. We summarize the status and plans for the three funded activities.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zizin, M. N.; Zimin, V. G.; Zizina, S. N., E-mail: zizin@adis.vver.kiae.ru
2010-12-15
The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit of the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.
NASA Astrophysics Data System (ADS)
Zizin, M. N.; Zimin, V. G.; Zizina, S. N.; Kryakvin, L. V.; Pitilimov, V. A.; Tereshonok, V. A.
2010-12-01
The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit of the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.
Users manual and modeling improvements for axial turbine design and performance computer code TD2-2
NASA Technical Reports Server (NTRS)
Glassman, Arthur J.
1992-01-01
Computer code TD2 computes design point velocity diagrams and performance for multistage, multishaft, cooled or uncooled, axial flow turbines. This streamline analysis code was recently modified to upgrade modeling related to turbine cooling and to the internal loss correlation. These modifications are presented in this report along with descriptions of the code's expanded input and output. This report serves as the users manual for the upgraded code, which is named TD2-2.
An Object-Oriented Approach to Writing Computational Electromagnetics Codes
NASA Technical Reports Server (NTRS)
Zimmerman, Martin; Mallasch, Paul G.
1996-01-01
Presently, most computer software development in the Computational Electromagnetics (CEM) community employs the structured programming paradigm, particularly using the Fortran language. Other segments of the software community began switching to an Object-Oriented Programming (OOP) paradigm in recent years to help ease design and development of highly complex codes. This paper examines design of a time-domain numerical analysis CEM code using the OOP paradigm, comparing OOP code and structured programming code in terms of software maintenance, portability, flexibility, and speed.
The Fermilab lattice supercomputer project
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fischler, M.; Atac, R.; Cook, A.
1989-02-01
The ACPMAPS system is a highly cost effective, local memory MIMD computer targeted at algorithm development and production running for gauge theory on the lattice. The machine consists of a compound hypercube of crates, each of which is a full crossbar switch containing several processors. The processing nodes are single-board array processors based on the Weitek XL chip set, each with a peak power of 20 MFLOPS and supported by 8 MBytes of data memory. The system currently being assembled has a peak power of 5 GFLOPS, delivering performance at approximately $250/MFLOP. The system is programmable in C and Fortran. An underpinning of software routines (CANOPY) provides an easy and natural way of coding lattice problems, such that the details of parallelism, communication, and system architecture are transparent to the user. CANOPY can easily be ported to any single-CPU or MIMD system which supports C, and allows the coding of typical applications with very little effort.
NASA Technical Reports Server (NTRS)
Becker, Jeffrey C.
1995-01-01
The Thinking Machines CM-5 platform was designed to run single program, multiple data (SPMD) applications, i.e., to run a single binary across all nodes of a partition, with each node possibly operating on different data. Certain classes of applications, such as multi-disciplinary computational fluid dynamics codes, are facilitated by the ability to have subsets of the partition nodes running different binaries. In order to extend the CM-5 system software to permit such applications, a multi-program loader was developed. This system is based on the dld loader which was originally developed for workstations. This paper provides a high level description of dld, and describes how it was ported to the CM-5 to provide support for multi-binary applications. Finally, it elaborates how the loader has been used to implement the CM-5 version of MPIRUN, a portable facility for running multi-disciplinary/multi-zonal MPI (Message-Passing Interface Standard) codes.
SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes
NASA Astrophysics Data System (ADS)
Homann, Holger; Laenen, Francois
2018-03-01
The numerical study of physical problems often requires integrating the dynamics of a large number of particles evolving according to a given set of equations. Particles are characterized by the information they carry, such as an identity, a position, and other properties. There are, generally speaking, two different possibilities for handling particles in high performance computing (HPC) codes. The concept of an Array of Structures (AoS) is in the spirit of the object-oriented programming (OOP) paradigm in that the particle information is implemented as a structure. Here, an object (realization of the structure) represents one particle and a set of many particles is stored in an array. In contrast, using the concept of a Structure of Arrays (SoA), a single structure holds several arrays, each representing one property (such as the identity) of the whole set of particles. The AoS approach is often implemented in HPC codes due to its handiness and flexibility. For a class of problems, however, it is known that the performance of SoA is much better than that of AoS. We confirm this observation for our particle problem. Using a benchmark we show that on modern Intel Xeon processors the SoA implementation is typically several times faster than the AoS one. On Intel's MIC co-processors the performance gap even attains a factor of ten. The same is true for GPU computing, using both computational and multi-purpose GPUs. Combining performance and handiness, we present the library SoAx that has optimal performance (on CPUs, MICs, and GPUs) while providing the same handiness as AoS. For this, SoAx uses modern C++ design techniques such as template metaprogramming, which allows code for user-defined heterogeneous data structures to be generated automatically.
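The contrast in miniature, transposed to Python since SoAx itself is C++ (the particle properties and the update step are our toy example): the SoA form exposes one contiguous, homogeneous array per property that vector hardware can stream, while AoS scatters each property across objects:

```python
# AoS versus SoA for a set of particles (illustrative Python stand-in).
import numpy as np
from dataclasses import dataclass

@dataclass
class Particle:              # AoS: one object per particle
    ident: int
    x: float
    v: float

n = 100_000
aos = [Particle(i, 0.0, 1.0) for i in range(n)]

soa_x = np.zeros(n)          # SoA: one array per property
soa_v = np.ones(n)

def step_aos(dt=0.1):
    for p in aos:            # strided accesses through many small objects
        p.x += dt * p.v

def step_soa(dt=0.1):
    global soa_x
    soa_x = soa_x + dt * soa_v   # single contiguous, vectorizable array op

step_aos(); step_soa()
```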
Accelerated GPU based SPECT Monte Carlo simulations.
Garcia, Marie-Paule; Bert, Julien; Benoit, Didier; Bardiès, Manuel; Visvikis, Dimitris
2016-06-07
Monte Carlo (MC) modelling is widely used in the field of single photon emission computed tomography (SPECT) as it is a reliable technique to simulate very high quality scans. This technique provides very accurate modelling of the radiation transport and particle interactions in a heterogeneous medium. Various MC codes exist for nuclear medicine imaging simulations. Recently, new strategies exploiting the computing capabilities of graphical processing units (GPU) have been proposed. This work aims at evaluating the accuracy of such GPU implementation strategies in comparison to standard MC codes in the context of SPECT imaging. GATE was considered the reference MC toolkit and used to evaluate the performance of newly developed GPU Geant4-based Monte Carlo simulation (GGEMS) modules for SPECT imaging. Radioisotopes with different photon energies were used with these various CPU and GPU Geant4-based MC codes in order to assess the best strategy for each configuration. Three different isotopes were considered: (99m)Tc, (111)In and (131)I, using a low energy high resolution (LEHR) collimator, a medium energy general purpose (MEGP) collimator and a high energy general purpose (HEGP) collimator, respectively. Point source, uniform source, cylindrical phantom and anthropomorphic phantom acquisitions were simulated using a model of the GE Infinia II 3/8" gamma camera. Both simulation platforms yielded a similar system sensitivity and image statistical quality for the various combinations. The overall acceleration factor between the GATE and GGEMS platforms derived from the same cylindrical phantom acquisition was between 18 and 27 for the different radioisotopes. Besides, a full MC simulation using an anthropomorphic phantom showed the full potential of the GGEMS platform, with a resulting acceleration factor up to 71. The good agreement with reference codes and the acceleration factors obtained support the use of GPU implementation strategies for improving the computational efficiency of SPECT imaging simulations.
Underworld - Bringing a Research Code to the Classroom
NASA Astrophysics Data System (ADS)
Moresi, L. N.; Mansour, J.; Giordani, J.; Farrington, R.; Kaluza, O.; Quenette, S.; Woodcock, R.; Squire, G.
2017-12-01
While there are many reasons to celebrate the passing of punch card programming and flickering green screens, the loss of the sense of wonder at the very existence of computers and the calculations they make possible should not be numbered among them. Computers have become so familiar that students are often unaware that formal and careful design of algorithms and their implementations remains a valuable and important skill that has to be learned and practiced to achieve expertise and genuine understanding. In teaching geodynamics and geophysics at undergraduate level, we aimed to be able to bring our research tools into the classroom - even when those tools are advanced, parallel research codes that we typically deploy on hundreds or thousands of processors, and we wanted to teach not just the physical concepts that are modelled by these codes but a sense of familiarity with computational modelling and the ability to discriminate a reliable model from a poor one. The underworld code (www.underworldcode.org) was developed for modelling plate-scale fluid mechanics and studying problems in lithosphere dynamics. Though specialised for this task, underworld has a straightforward python user interface that allows it to run within the environment of jupyter notebooks on a laptop (at modest resolution, of course). The python interface was developed for adaptability in addressing new research problems, but also lends itself to integration into a python-driven learning environment. To manage the heavy demands of installing and running underworld in a teaching laboratory, we have developed a workflow in which we install docker containers in the cloud which support a number of students to run their own environment independently. We share our experience blending notebooks and static webpages into a single web environment, and we explain how we designed our graphics and analysis tools to allow notebook "scripts" to be queued and run on a supercomputer.
Performance analysis of parallel gravitational N-body codes on large GPU clusters
NASA Astrophysics Data System (ADS)
Huang, Si-Yi; Spurzem, Rainer; Berczik, Peter
2016-01-01
We compare the performance of two very different parallel gravitational N-body codes for astrophysical simulations on large Graphics Processing Unit (GPU) clusters, both of which are pioneers in their own fields as well as on certain mutual scales - NBODY6++ and Bonsai. We carry out benchmarks of the two codes by analyzing their performance, accuracy and efficiency through the modeling of structure decomposition and timing measurements. We find that both codes are heavily optimized to leverage the computational potential of GPUs, as their performance has approached half of the maximum single precision performance of the underlying GPU cards. With such performance we predict that a speed-up of 200 - 300 can be achieved when up to 1k processors and GPUs are employed simultaneously. We discuss quantitative comparisons of the two codes, finding that in the same test cases Bonsai adopts larger time steps as well as larger relative energy errors than NBODY6++, typically 10 - 50 times larger, depending on the chosen parameters of the codes. Although the two codes are built for different astrophysical applications, in specified conditions they may overlap in performance at certain physical scales, thus allowing the user to choose either one by fine-tuning parameters accordingly.
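Both codes ultimately spend their time evaluating pairwise gravitational accelerations. Below is a numpy sketch of the O(N²) direct-summation kernel (with softening, an illustrative choice on our part) that codes like NBODY6++ offload to GPUs, typically in single precision, which is why half of peak FP32 is the yardstick above:

```python
# Direct-summation gravitational accelerations: a_i = sum_j m_j (x_j - x_i)/|r|^3.
import numpy as np

def accelerations(pos, mass, eps=1e-2):
    dx = pos[None, :, :] - pos[:, None, :]    # pairwise separations r_ij
    r2 = (dx**2).sum(-1) + eps**2             # softened squared distances
    inv_r3 = r2**-1.5
    np.fill_diagonal(inv_r3, 0.0)             # exclude self-interaction
    return (dx * (mass[None, :, None] * inv_r3[:, :, None])).sum(axis=1)

rng = np.random.default_rng(1)
pos = rng.standard_normal((1024, 3))
mass = np.full(1024, 1.0 / 1024)
print(accelerations(pos, mass).shape)         # (1024, 3)
```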
Extremely accurate sequential verification of RELAP5-3D
Mesina, George L.; Aumiller, David L.; Buschman, Francis X.
2015-11-19
Large computer programs like RELAP5-3D solve complex systems of governing, closure and special process equations to model the underlying physics of nuclear power plants. Further, these programs incorporate many other features for physics, input, output, data management, user-interaction, and post-processing. For software quality assurance, the code must be verified and validated before being released to users. For RELAP5-3D, verification and validation are restricted to nuclear power plant applications. Verification means ensuring that the program is built right by checking that it meets its design specifications, comparing coding to algorithms and equations and comparing calculations against analytical solutions and the method of manufactured solutions. Sequential verification performs these comparisons initially, but thereafter only compares code calculations between consecutive code versions to demonstrate that no unintended changes have been introduced. Recently, an automated, highly accurate sequential verification method has been developed for RELAP5-3D. The method also tests that no unintended consequences result from code development in the following code capabilities: repeating a timestep advancement, continuing a run from a restart file, multiple cases in a single code execution, and modes of coupled/uncoupled operation. In conclusion, mathematical analyses of the adequacy of the checks used in the comparisons are provided.
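A minimal sketch of what a sequential-verification check can look like in practice: compare a new version's calculations against the previous version's stored results and flag any drift beyond round-off. The file format and tolerances here are our assumptions, not RELAP5-3D's actual tooling:

```python
# Compare results from consecutive code versions within round-off tolerances.
import numpy as np

def verify(baseline_csv, candidate_csv, rtol=1e-12, atol=1e-14):
    base = np.loadtxt(baseline_csv, delimiter=",")
    cand = np.loadtxt(candidate_csv, delimiter=",")
    if base.shape != cand.shape:
        return False, "shape mismatch"
    ok = np.allclose(base, cand, rtol=rtol, atol=atol)
    worst = np.max(np.abs(base - cand))
    return ok, f"max abs difference {worst:.3e}"

# Hypothetical usage against stored results from two consecutive versions:
# ok, msg = verify("baseline_results.csv", "candidate_results.csv")
```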
Application of a single-fluid model for the steam condensing flow prediction
NASA Astrophysics Data System (ADS)
Smołka, K.; Dykas, S.; Majkut, M.; Strozik, M.
2016-10-01
Computational algorithms for modelling steam flows with a non-equilibrium condensation process are among the results of many years of research conducted in the Institute of Power Engineering and Turbomachinery of the Silesian University of Technology. In parallel with theoretical and numerical research, work was also started on experimental testing of the steam condensing flow. This paper presents a comparison of calculations of a flow field modelled by means of a single-fluid model using both an in-house CFD code and the commercial Ansys CFX v16.2 software package. The calculation results are compared to in-house experimental testing.
Performance of a plasma fluid code on the Intel parallel computers
NASA Technical Reports Server (NTRS)
Lynch, V. E.; Carreras, B. A.; Drake, J. B.; Leboeuf, J. N.; Liewer, P.
1992-01-01
One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchstone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchstone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel (sigma) machine gives an improvement factor close to 64 over the single-processor CRAY-2.
Computer Description of the Field Artillery Ammunition Supply Vehicle
1983-04-01
This report presents a Combinatorial Geometry (COM-GEOM) target description of the Field Artillery Ammunition Supply Vehicle. The "Geometric Information for Targets" (GIFT) computer code accepts the COM-GEOM description as input and generates target vulnerability data.
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2011 CFR
2011-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2012 CFR
2012-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2014 CFR
2014-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2010 CFR
2010-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar data produced for...
NASA Astrophysics Data System (ADS)
Derkachov, G.; Jakubczyk, T.; Jakubczyk, D.; Archer, J.; Woźniak, M.
2017-07-01
Utilising the Compute Unified Device Architecture (CUDA) platform for Graphics Processing Units (GPUs) enables significant reduction of computation time at a moderate cost, by means of parallel computing. In the paper [Jakubczyk et al., Opto-Electron. Rev., 2016] we reported using GPU for Mie scattering inverse problem solving (up to 800-fold speed-up). Here we report the development of two subroutines utilising GPU at data preprocessing stages for the inversion procedure: (i) A subroutine, based on ray tracing, for finding the spherical aberration correction function. (ii) A subroutine performing the conversion of an image to a 1D distribution of light intensity versus azimuth angle (i.e. scattering diagram), fed from a movie-reading CPU subroutine running in parallel. All subroutines are incorporated in the PikeReader application, which we make available in a GitHub repository. PikeReader returns a sequence of intensity distributions versus a common azimuth angle vector, corresponding to the recorded movie. We obtained an overall ∼400-fold speed-up of calculations at data preprocessing stages using CUDA codes running on GPU in comparison to single-thread MATLAB-only code running on CPU.
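The image-to-scattering-diagram conversion described above reduces, in essence, to binning pixel intensities by azimuth angle about the beam centre. Below is a minimal NumPy sketch of that step; the function name, the centre convention and the bin count are hypothetical illustrations, not taken from PikeReader.

```python
import numpy as np

def scattering_diagram(image, cx, cy, n_bins=360):
    """Mean pixel intensity versus azimuth angle about centre (cx, cy)."""
    ny, nx = image.shape
    y, x = np.indices((ny, nx))
    phi = np.arctan2(y - cy, x - cx)                   # azimuth of each pixel
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    idx = np.clip(np.digitize(phi.ravel(), edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=image.ravel(), minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, sums / np.maximum(counts, 1)       # azimuth vector, intensities

# Example on a synthetic frame:
frame = np.random.rand(480, 640)
phi, intensity = scattering_diagram(frame, cx=320.0, cy=240.0)
```

On a GPU the same binning maps naturally onto an atomic-add histogram kernel, which is presumably where much of the reported speed-up comes from.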
Digital Plasma Control System for Alcator C-Mod
NASA Astrophysics Data System (ADS)
Ferrara, M.; Wolfe, S.; Stillerman, J.; Fredian, T.; Hutchinson, I.
2004-11-01
A digital plasma control system (DPCS) has been designed to replace the present C-Mod system, which is based on a hybrid analog-digital computer. The initial implementation of DPCS comprises two 64-channel, 16-bit, low-latency cPCI digitizers, each with 16 analog outputs, controlled by a rack-mounted single-processor Linux server, which also serves as the compute engine. A prototype system employing three older 32-channel digitizers was tested during the 2003-04 campaign. The hybrid's linear PID feedback system was emulated by IDL code executing a synchronous loop, using the same target waveforms and control parameters. Reliable real-time operation was accomplished under a standard Linux OS (RH9) by locking memory and disabling interrupts during the plasma pulse. The DPCS-computed outputs agreed to within a few percent with those produced by the hybrid system, except for discrepancies due to offsets and non-ideal behavior of the hybrid circuitry. The system operated reliably, with no sample loss, at more than twice the 10 kHz design specification, providing extra time for implementing more advanced control algorithms. The code is fault-tolerant and produces consistent output waveforms even with 10% sample loss.
Lim, Chun Shen; Brown, Chris M
2017-01-01
Structured RNA elements may control virus replication, transcription and translation, and their distinct features are being exploited by novel antiviral strategies. Viral RNA elements continue to be discovered using combinations of experimental and computational analyses. However, the wealth of sequence data, notably from deep viral RNA sequencing, viromes, and metagenomes, necessitates computational approaches being used as an essential discovery tool. In this review, we describe practical approaches being used to discover functional RNA elements in viral genomes. In addition to success stories in new and emerging viruses, these approaches have revealed some surprising new features of well-studied viruses e.g., human immunodeficiency virus, hepatitis C virus, influenza, and dengue viruses. Some notable discoveries were facilitated by new comparative analyses of diverse viral genome alignments. Importantly, comparative approaches for finding RNA elements embedded in coding and non-coding regions differ. With the exponential growth of computer power we have progressed from stem-loop prediction on single sequences to cutting edge 3D prediction, and from command line to user friendly web interfaces. Despite these advances, many powerful, user friendly prediction tools and resources are underutilized by the virology community.
Tracking Debris Shed by a Space-Shuttle Launch Vehicle
NASA Technical Reports Server (NTRS)
Stuart, Phillip C.; Rogers, Stuart E.
2009-01-01
The DEBRIS software predicts the trajectories of debris particles shed by a space-shuttle launch vehicle during ascent, to aid in assessing potential harm to the space-shuttle orbiter and crew. The user specifies the location of release and other initial conditions for a debris particle. DEBRIS tracks the particle within an overset grid system by means of a computational fluid dynamics (CFD) simulation of the local flow field and a ballistic simulation that takes account of the mass of the particle and its aerodynamic properties in the flow field. The computed particle trajectory is stored in a file to be post-processed by other software for viewing and analyzing the trajectory. DEBRIS supplants a prior debris tracking code that took about 15 minutes to calculate a single particle trajectory: DEBRIS can calculate 1,000 trajectories in about 20 seconds on a desktop computer. Other improvements over the prior code include adaptive time-stepping to ensure accuracy, forcing at least one step per grid cell to ensure resolution of all CFD-resolved flow features, ability to simulate rebound of debris from surfaces, extensive error checking, a built-in suite of test cases, and dynamic allocation of memory.
Final Technical Report for Department of Energy award number DE-FG02-06ER54882, Revised
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eggleston, Dennis L.
The research reported here involves studies of radial particle transport in a cylindrical, low-density Malmberg-Penning non-neutral plasma trap. The research is primarily experimental but involves careful comparisons to analytical theory and includes the results of a single-particle computer code. The transport is produced by applied electric fields that break the cylindrical symmetry of the trap, hence the term "asymmetry-induced transport." Our computer studies have revealed the importance of a previously ignored class of particles that become trapped in the asymmetry potential. In many common situations these particles exhibit large radial excursions and dominate the radial transport. On the experimental side, we have developed new data analysis techniques that allowed us to determine the magnetic field dependence of the transport and to place empirical constraints on the form of the transport equation. Experiments designed to test the computer code results gave varying degrees of agreement, with further work being necessary to understand the results. This work expands our knowledge of the varied mechanisms of cross-magnetic-field transport and should be of use to other workers studying plasma confinement.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Badal, Andreu; Badano, Aldo
Purpose: It is a known fact that Monte Carlo simulations of radiation transport are computationally intensive and may require long computing times. The authors introduce a new paradigm for the acceleration of Monte Carlo simulations: the use of a graphics processing unit (GPU) as the main computing device instead of a central processing unit (CPU). Methods: A GPU-based Monte Carlo code that simulates photon transport in a voxelized geometry with the accurate physics models from PENELOPE has been developed using the CUDA programming model (NVIDIA Corporation, Santa Clara, CA). Results: An outline of the new code and a sample x-ray imaging simulation with an anthropomorphic phantom are presented. A remarkable 27-fold speed-up factor was obtained using a GPU compared to a single core CPU. Conclusions: The reported results show that GPUs are currently a good alternative to CPUs for the simulation of radiation transport. Since the performance of GPUs is currently increasing at a faster pace than that of CPUs, the advantages of GPU-based software are likely to be more pronounced in the future.
BlazeDEM3D-GPU: A Large Scale DEM simulation code for GPUs
NASA Astrophysics Data System (ADS)
Govender, Nicolin; Wilke, Daniel; Pizette, Patrick; Khinast, Johannes
2017-06-01
Accurately predicting the dynamics of particulate materials is of importance to numerous scientific and industrial areas, with applications ranging across particle scales from powder flow to ore crushing. Computational discrete element simulation is a viable option to aid in the understanding of particulate dynamics and the design of devices such as mixers, silos and ball mills, as laboratory scale tests come at a significant cost. However, the computational time required for an industrial scale simulation consisting of tens of millions of particles can run to months on large CPU clusters, making the Discrete Element Method (DEM) infeasible for industrial applications. Simulations are therefore typically restricted to tens of thousands of particles with highly detailed particle shapes, or a few million particles with often oversimplified particle shapes. However, a number of applications require accurate representation of the particle shape to capture the macroscopic behaviour of the particulate system. In this paper we give an overview of the recent extensions to the open source GPU based DEM code, BlazeDEM3D-GPU, that can simulate millions of polyhedra and tens of millions of spheres on a desktop computer with a single or multiple GPUs.
NVIDIA OptiX ray-tracing engine as a new tool for modelling medical imaging systems
NASA Astrophysics Data System (ADS)
Pietrzak, Jakub; Kacperski, Krzysztof; Cieślar, Marek
2015-03-01
The most accurate technique to model the X- and gamma radiation path through a numerically defined object is the Monte Carlo simulation, which follows single photons according to their interaction probabilities. A simplified and much faster approach, which just integrates total interaction probabilities along selected paths, is known as ray tracing. Both techniques are used in medical imaging for simulating real imaging systems and as projectors required in iterative tomographic reconstruction algorithms. These approaches are ready for massive parallel implementation e.g. on Graphics Processing Units (GPU), which can greatly accelerate the computation time at a relatively low cost. In this paper we describe the application of the NVIDIA OptiX ray-tracing engine, popular in professional graphics and rendering applications, as a new powerful tool for X- and gamma ray-tracing in medical imaging. It allows the implementation of a variety of physical interactions of rays with pixel-, mesh- or NURBS-based objects, and recording any required quantities, like path integrals, interaction sites, deposited energies, and others. Using the OptiX engine we have implemented a code for rapid Monte Carlo simulations of Single Photon Emission Computed Tomography (SPECT) imaging, as well as the ray-tracing projector, which can be used in reconstruction algorithms. The engine generates efficient, scalable and optimized GPU code, ready to run on multi-GPU heterogeneous systems. We have compared the results of our simulations with the GATE package. With the OptiX engine the computation time of a Monte Carlo simulation can be reduced from days to minutes.
MPIRUN: A Portable Loader for Multidisciplinary and Multi-Zonal Applications
NASA Technical Reports Server (NTRS)
Fineberg, Samuel A.; Woodrow, Thomas S. (Technical Monitor)
1994-01-01
Multidisciplinary and multi-zonal applications are an important class of applications in the area of Computational Aerosciences. In these codes, two or more distinct parallel programs or copies of a single program are utilized to model a single problem. To support such applications, it is common to use a programming model where a program is divided into several single program multiple data stream (SPMD) applications, each of which solves the equations for a single physical discipline or grid zone. These SPMD applications are then bound together to form a single multidisciplinary or multi-zonal program in which the constituent parts communicate via point-to-point message passing routines. One method for implementing the message passing portion of these codes is with the new Message Passing Interface (MPI) standard. Unfortunately, this standard only specifies the message passing portion of an application, but does not specify any portable mechanisms for loading an application. MPIRUN was developed to provide a portable means for loading MPI programs, and was specifically targeted at multidisciplinary and multi-zonal applications. Programs using MPIRUN for loading and MPI for message passing are then portable between all machines supported by MPIRUN. MPIRUN is currently implemented for the Intel iPSC/860, TMC CM5, IBM SP-1 and SP-2, Intel Paragon, and workstation clusters. Further, MPIRUN is designed to be simple enough to port easily to any system supporting MPI.
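MPI's standard way to carve one job into several SPMD sub-applications is communicator splitting; a loader like MPIRUN can then hand each part its own communicator. A minimal sketch in Python with mpi4py (not MPIRUN itself; the two-zone, four-rank layout is a hypothetical example):

```python
from mpi4py import MPI

world = MPI.COMM_WORLD

# Hypothetical layout: the first 4 ranks model zone 0, the rest zone 1.
zone = 0 if world.Get_rank() < 4 else 1

# Each zone gets a private communicator for its intra-zone message
# passing, mimicking one SPMD sub-application per grid zone.
zone_comm = world.Split(color=zone, key=world.Get_rank())

print(f"world rank {world.Get_rank()} -> zone {zone}, "
      f"zone rank {zone_comm.Get_rank()} of {zone_comm.Get_size()}")

# Point-to-point coupling between zones still travels over `world`.
```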
NASA Technical Reports Server (NTRS)
Harper, Warren
1989-01-01
Two electromagnetic scattering codes, NEC-BSC and ESP3, were delivered and installed on a NASA VAX computer for use by Marshall Space Flight Center antenna design personnel. The work included updating the existing codes and certain supplementary software, installing the codes on a computer to be delivered to the customer, providing a capability for graphic display of the data computed by the codes, and assisting the customer in the solution of specific problems that demonstrate the use of the codes. With the exception of one code revision, all of these tasks were performed.
Performance of MCNP4A on seven computing platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendricks, J.S.; Brockhoff, R.C.
1994-12-31
The performance of seven computer platforms has been evaluated with the MCNP4A Monte Carlo radiation transport code. For the first time we report timing results using MCNP4A and its new test set and libraries. Comparisons are made on platforms not available to us in previous MCNP timing studies. By using MCNP4A and its 25-problem test set, a widely-used and readily-available physics production code is used; the timing comparison is not limited to a single "typical" problem, demonstrating the problem dependence of timing results; the results are reproducible at the more than 100 installations around the world using MCNP; comparison of performance of other computer platforms to the ones tested in this study is possible because we present raw data rather than normalized results; and a measure of the increase in performance of computer hardware and software over the past two years is possible. The computer platforms reported are the Cray-YMP 8/64, IBM RS/6000-560, Sun Sparc10, Sun Sparc2, HP/9000-735, 4-processor 100 MHz Silicon Graphics ONYX, and Gateway 2000 model 4DX2-66V PC. In 1991 a timing study of MCNP4, the predecessor to MCNP4A, was conducted using ENDF/B-V cross-section libraries, which are export protected. The new study is based upon the new MCNP 25-problem test set, which utilizes internationally available data. MCNP4A, its test problems and the test data library are available from the Radiation Shielding and Information Center in Oak Ridge, Tennessee, or from the NEA Data Bank in Saclay, France. Anyone with the same workstation and compiler can get the same test problem sets, the same library files, and the same MCNP4A code from RSIC or NEA and replicate our results. And, because we report raw data, comparison of the performance of other compute platforms and compilers can be made.
Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2
NASA Technical Reports Server (NTRS)
Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad
1995-01-01
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can effectively be parallelized on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs to match the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the computational cost is dominated by the FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.
Parcels v0.9: prototyping a Lagrangian ocean analysis framework for the petascale age
NASA Astrophysics Data System (ADS)
Lange, Michael; van Sebille, Erik
2017-11-01
As ocean general circulation models (OGCMs) move into the petascale age, where the output of single simulations exceeds petabytes of storage space, tools to analyse the output of these models will need to scale up too. Lagrangian ocean analysis, where virtual particles are tracked through hydrodynamic fields, is an increasingly popular way to analyse OGCM output, by mapping pathways and connectivity of biotic and abiotic particulates. However, the current software stack of Lagrangian ocean analysis codes is not dynamic enough to cope with the increasing complexity, scale and need for customization of use-cases. Furthermore, most community codes are developed for stand-alone use, making it a nontrivial task to integrate virtual particles at runtime of the OGCM. Here, we introduce the new Parcels code, which was designed from the ground up to be sufficiently scalable to cope with petascale computing. We highlight its API design that combines flexibility and customization with the ability to optimize for HPC workflows, following the paradigm of domain-specific languages. Parcels is primarily written in Python, utilizing the wide range of tools available in the scientific Python ecosystem, while generating low-level C code and using just-in-time compilation for performance-critical computation. We show a worked-out example of its API, and validate the accuracy of the code against seven idealized test cases. This version 0.9 of Parcels is focused on laying out the API, with future work concentrating on support for curvilinear grids, optimization, efficiency and at-runtime coupling with OGCMs.
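The kind of high-level workflow Parcels aims at can be sketched in a few lines of its Python API. The fragment below follows the documented FieldSet/ParticleSet/AdvectionRK4 pattern for an idealized uniform eastward flow; the exact from_data signature has shifted between releases, so treat this as indicative of the API style rather than version-exact:

```python
from datetime import timedelta
import numpy as np
from parcels import FieldSet, ParticleSet, JITParticle, AdvectionRK4

# Idealized uniform eastward flow on a small rectilinear grid.
lon = np.linspace(0.0, 1.0, 20)
lat = np.linspace(0.0, 1.0, 20)
U = np.ones((20, 20))
V = np.zeros((20, 20))
fieldset = FieldSet.from_data({"U": U, "V": V}, {"lon": lon, "lat": lat})

# Two virtual particles, advected with 4th-order Runge-Kutta; JITParticle
# selects the generated-C, just-in-time-compiled path noted in the abstract.
pset = ParticleSet(fieldset=fieldset, pclass=JITParticle,
                   lon=[0.1, 0.2], lat=[0.5, 0.5])
pset.execute(AdvectionRK4, runtime=timedelta(days=1),
             dt=timedelta(minutes=5))
```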
Complexity, information loss, and model building: from neuro- to cognitive dynamics
NASA Astrophysics Data System (ADS)
Arecchi, F. Tito
2007-06-01
A scientific problem described within a given code is mapped to a corresponding computational problem. We call (algorithmic) complexity the bit length of the shortest instruction which solves the problem. Deterministic chaos in general affects a dynamical system, making the corresponding problem experimentally and computationally heavy, since one must reset the initial conditions at a rate higher than that of information loss (Kolmogorov entropy). One can control chaos by adding to the system new degrees of freedom (information swapping: information lost by chaos is replaced by that arising from the new degrees of freedom). This implies a change of code, or a new augmented model. Within a single code, changing hypotheses is equivalent to fixing different sets of control parameters, each with a different a-priori probability, to be then confirmed and transformed into an a-posteriori probability via Bayes' theorem. Sequential application of Bayes' rule is nothing other than the Darwinian strategy in evolutionary biology. The sequence is a steepest ascent algorithm, which stops once maximum probability has been reached. At this point the hypothesis exploration stops. By changing code (and hence the set of relevant variables) one can start again to formulate new classes of hypotheses. We call semantic complexity the number of accessible scientific codes, or models, that describe a situation. It is however a fuzzy concept, insofar as this number changes due to interaction of the operator with the system under investigation. These considerations are illustrated with reference to a cognitive task, starting from synchronization of neuron arrays in a perceptual area and tracing the putative path toward model building.
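The sequential Bayes update that the abstract equates with a Darwinian steepest-ascent search is easy to make concrete. A minimal Python sketch with two invented hypotheses about a coin's bias (all numbers are illustrative):

```python
import numpy as np

# Two hypotheses about a coin: fair (P(heads)=0.5) or biased (P(heads)=0.8),
# each starting with an a-priori probability of 0.5.
priors = np.array([0.5, 0.5])
p_heads = np.array([0.5, 0.8])

observations = [1, 1, 0, 1, 1, 1]   # 1 = heads, 0 = tails
posterior = priors.copy()
for obs in observations:
    likelihood = p_heads if obs == 1 else 1.0 - p_heads
    posterior = posterior * likelihood  # Bayes' rule, unnormalized
    posterior /= posterior.sum()        # a-posteriori probabilities
    print(posterior)

# The posterior climbs toward the better-supported hypothesis and the
# ascent stops once its probability saturates -- the stopping point at
# which, per the abstract, one must change code to explore new hypotheses.
```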
Software for Brain Network Simulations: A Comparative Study
Tikidji-Hamburyan, Ruben A.; Narayana, Vikram; Bozkus, Zeki; El-Ghazawi, Tarek A.
2017-01-01
Numerical simulations of brain networks are a critical part of our efforts in understanding brain functions under pathological and normal conditions. For several decades, the community has developed many software packages and simulators to accelerate research in computational neuroscience. In this article, we select the three most popular simulators, as determined by the number of models in the ModelDB database - NEURON, GENESIS, and BRIAN - and perform an independent evaluation of these simulators. In addition, we study NEST, one of the lead simulators of the Human Brain Project. First, we study them based on one of the most important characteristics, the range of supported models. Our investigation reveals that brain network simulators may be biased toward supporting a specific set of models. However, all simulators tend to expand the supported range of models by providing a universal environment for the computational study of individual neurons and brain networks. Next, our investigations on the characteristics of computational architecture and efficiency indicate that all simulators compile the most computationally intensive procedures into binary code, with the aim of maximizing their computational performance. However, not all simulators provide the simplest method for module development and/or guarantee efficient binary code. Third, a study of their amenability for high-performance computing reveals that NEST can almost transparently map an existing model onto a cluster or multicore computer, while NEURON requires code modification if a model developed for a single computer has to be mapped onto a computational cluster. Interestingly, parallelization is the weakest characteristic of BRIAN, which provides no support for cluster computations and limited support for multicore computers. Fourth, we identify the level of user support and frequency of usage for all simulators. Finally, we carry out an evaluation using two case studies: a large network with simplified neural and synaptic models and a small network with detailed models. These two case studies allow us to avoid any bias toward a particular software package. The results indicate that BRIAN provides the most concise language for both cases considered. Furthermore, as expected, NEST mostly favors large network models, while NEURON is better suited for detailed models. Overall, the case studies reinforce our general observation that simulators have a bias in their computational performance toward specific types of brain network models. PMID:28775687
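As an illustration of the conciseness attributed to BRIAN above, here is a small leaky integrate-and-fire network in the Brian 2 Python syntax; the network and its parameters are invented for illustration and are not the case-study models of the article:

```python
from brian2 import NeuronGroup, Synapses, run, ms, mV

# 100 leaky integrate-and-fire neurons driven toward 20 mV with a 10 ms
# membrane time constant, firing at 15 mV and resetting to 0 mV.
eqs = "dv/dt = (-v + 20*mV) / (10*ms) : volt"
group = NeuronGroup(100, eqs, threshold="v > 15*mV",
                    reset="v = 0*mV", method="exact")

# Sparse random excitatory coupling (10% connection probability).
syn = Synapses(group, group, on_pre="v += 0.5*mV")
syn.connect(p=0.1)

run(100*ms)   # Brian compiles the model to optimized code before running
```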
NASA Astrophysics Data System (ADS)
Rodriguez, M.; Brualla, L.
2018-04-01
Monte Carlo simulation of radiation transport is computationally demanding when reasonably low statistical uncertainties of the estimated quantities are required. Therefore, it can benefit to a large extent from high-performance computing. This work is aimed at assessing the performance of the first generation of the many-integrated-core architecture (MIC) Xeon Phi coprocessor with respect to that of a CPU consisting of a double 12-core Xeon processor in Monte Carlo simulation of coupled electron-photon showers. The comparison was made twofold: first, through a suite of basic tests including parallel versions of the random number generators Mersenne Twister and a modified implementation of RANECU. These tests were addressed to establish a baseline comparison between both devices. Secondly, through the pDPM code developed in this work. pDPM is a parallel version of the Dose Planning Method (DPM) program for fast Monte Carlo simulation of radiation transport in voxelized geometries. A variety of techniques addressed to obtain a large scalability on the Xeon Phi were implemented in pDPM. Maximum scalabilities of 84.2× and 107.5× were obtained in the Xeon Phi for simulations of electron and photon beams, respectively. Nevertheless, in none of the tests involving radiation transport did the Xeon Phi perform better than the CPU. The disadvantage of the Xeon Phi with respect to the CPU stems from the low performance of the single core of the former. A single core of the Xeon Phi was more than 10 times less efficient than a single core of the CPU for all radiation transport simulations.
New estimates of the CMB angular power spectra from the WMAP 5 year low-resolution data
NASA Astrophysics Data System (ADS)
Gruppuso, A.; de Rosa, A.; Cabella, P.; Paci, F.; Finelli, F.; Natoli, P.; de Gasperis, G.; Mandolesi, N.
2009-11-01
A quadratic maximum likelihood (QML) estimator is applied to the Wilkinson Microwave Anisotropy Probe (WMAP) 5 year low-resolution maps to compute the cosmic microwave background angular power spectra (APS) at large scales for both temperature and polarization. Estimates and error bars for the six APS are provided up to l = 32 and compared, when possible, to those obtained by the WMAP team, without finding any inconsistency. The conditional likelihood slices are also computed for the C_l of all six power spectra from l = 2 to 10 through a pixel-based likelihood code. Both codes treat the covariance for (T, Q, U) in a single matrix without employing any approximation. The inputs of both codes (foreground-reduced maps, related covariances and masks) are provided by the WMAP team. The peaks of the likelihood slices are always consistent with the QML estimates within the error bars; however, an excellent agreement occurs when the QML estimates are used as a fiducial power spectrum instead of the best-fitting theoretical power spectrum. By the full computation of the conditional likelihood on the estimated spectra, the value of the temperature quadrupole C^{TT}_{l=2} is found to be less than 2σ away from the WMAP 5 year Λ cold dark matter best-fitting value. The BB spectrum is found to be well consistent with zero, and upper limits on the B modes are provided. The parity-odd signals TB and EB are found to be consistent with zero.
Coding for Single-Line Transmission
NASA Technical Reports Server (NTRS)
Madison, L. G.
1983-01-01
Digital transmission code combines data and clock signals into a single waveform. MADCODE needs four standard integrated circuits in the generator and converter plus five small discrete components. MADCODE allows simple coding and decoding for transmission of digital signals over a single line.
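The abstract does not give the MADCODE waveform itself, so as a hedged illustration of the general idea - folding the clock into the data so one line suffices - here is Manchester coding, a standard self-clocking line code (explicitly not MADCODE), in Python:

```python
def manchester_encode(bits):
    """Encode bits so every bit cell contains a mid-cell transition
    (IEEE 802.3 convention: 0 -> high-low, 1 -> low-high)."""
    return [half for b in bits for half in ((0, 1) if b else (1, 0))]

def manchester_decode(halves):
    """Recover the bits from consecutive half-cell pairs."""
    return [1 if pair == (0, 1) else 0
            for pair in zip(halves[::2], halves[1::2])]

bits = [1, 0, 1, 1, 0]
waveform = manchester_encode(bits)      # single self-clocking waveform
assert manchester_decode(waveform) == bits
```

Because every bit cell is guaranteed a transition, the receiver can regenerate the clock from the waveform itself, which is the property a single-line code like MADCODE must provide.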
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2013 CFR
2013-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... funds; (ii) Studies, analyses, test data, or similar data produced for this contract, when the study...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eslinger, Paul W.; Aaberg, Rosanne L.; Lopresti, Charles A.
2004-09-14
This document contains detailed user instructions for the suite of utility codes developed for Rev. 1 of the Systems Assessment Capability, which perform many supporting functions.
Validation of Extended MHD Models using MST RFP Plasmas
NASA Astrophysics Data System (ADS)
Jacobson, C. M.; Chapman, B. E.; Craig, D.; McCollam, K. J.; Sovinec, C. R.
2016-10-01
Significant effort has been devoted to improvement of computational models used in fusion energy sciences. Rigorous validation of these models is necessary in order to increase confidence in their ability to predict the performance of future devices. MST is a well diagnosed reversed-field pinch (RFP) capable of operation over a wide range of parameters. In particular, the Lundquist number S, a key parameter in resistive magnetohydrodynamics (MHD), can be varied over a wide range and provide substantial overlap with MHD RFP simulations. MST RFP plasmas are simulated using both DEBS, a nonlinear single-fluid visco-resistive MHD code, and NIMROD, a nonlinear extended MHD code, with S ranging from 10^4 to 5×10^4 for single-fluid runs, with the magnetic Prandtl number Pm = 1. Experiments with plasma current IP ranging from 60 kA to 500 kA result in S from 4×10^4 to 8×10^6. Validation metric comparisons are presented, focusing on how magnetic fluctuations b scale with S. Single-fluid NIMROD results give b ∝ S^(-0.21), and experiments give b ∝ S^(-0.28) for the dominant m = 1, n = 6 mode. Preliminary two-fluid NIMROD results are also presented. Work supported by US DOE.
Semantic Interoperability for Computational Mineralogy: Experiences of the eMinerals Consortium
NASA Astrophysics Data System (ADS)
Walker, A. M.; White, T. O.; Dove, M. T.; Bruin, R. P.; Couch, P. A.; Tyer, R. P.
2006-12-01
The use of atomic scale computer simulation of minerals to obtain information for geophysics and environmental science has grown enormously over the past couple of decades. It is now routine to probe mineral behavior in the Earth's deep interior and in the surface environment by borrowing methods and simulation codes from computational chemistry and physics. It is becoming increasingly important to use methods embodied in more than one of these codes to solve any single scientific problem. However, scientific codes are rarely designed for easy interoperability and data exchange; data formats are often code-specific, poorly documented and fragile, liable to frequent change between software versions, and even compiler versions. This means that the scientist's simple desire to use the methodological approaches offered by multiple codes is frustrated, and even the sharing of data between collaborators becomes fraught with difficulties. The eMinerals consortium was formed in the early stages of the UK eScience program with the aim of developing the tools needed to apply atomic scale simulation to environmental problems in a grid-enabled world, and to harness the computational power offered by grid technologies to address some outstanding mineralogical problems. One example of the kind of problem we can tackle is the origin of the compressibility anomaly in silica glass. By passing data directly between simulation and analysis tools we were able to probe this effect in more detail than has previously been possible and have shown how the anomaly is related to the details of the amorphous structure. In order to approach this kind of problem we have constructed a mini-grid, a small scale and extensible combined compute- and data-grid that allows the execution of many calculations in parallel, and the transparent storage of semantically-rich marked-up result data. Importantly, we automatically capture multiple kinds of metadata and key results from each calculation. We believe that the lessons learned and tools developed will be useful in many areas of science beyond computational mineralogy. Key tools that will be described include: a pure Fortran XML library (FoX) that presents XPath, SAX and DOM interfaces as well as permitting the easy production of valid XML from legacy Fortran programs; a job submission framework that automatically schedules calculations to remote grid resources, handles data staging and metadata capture; and a tool (AgentX) that maps concepts from an ontology onto locations in documents of various formats, which we use to enable data exchange.
Development of a model and computer code to describe solar grade silicon production processes
NASA Technical Reports Server (NTRS)
Gould, R. K.; Srivastava, R.
1979-01-01
Two computer codes were developed for describing flow reactors in which high purity, solar grade silicon is produced via reduction of gaseous silicon halides. The first is the CHEMPART code, an axisymmetric, marching code which treats two-phase flows with models describing detailed gas-phase chemical kinetics, particle formation, and particle growth. It can be used to describe flow reactors in which reactants mix, react, and form a particulate phase. Detailed radial gas-phase composition, temperature, velocity, and particle size distribution profiles are computed. Also, deposition of heat, momentum, and mass (either particulate or vapor) on reactor walls is described. The second code is a modified version of the GENMIX boundary layer code, which is used to compute rates of heat, momentum, and mass transfer to the reactor walls. This code lacks the detailed chemical kinetics and particle handling features of the CHEMPART code but has the virtue of running much more rapidly than CHEMPART, while treating the phenomena occurring in the boundary layer in more detail.
Large scale in vivo recordings to study neuronal biophysics.
Giocomo, Lisa M
2015-06-01
Over the last several years, technological advances have enabled researchers to more readily observe single-cell membrane biophysics in awake, behaving animals. Studies utilizing these technologies have provided important insights into the mechanisms generating functional neural codes in both sensory and non-sensory cortical circuits. Crucial for a deeper understanding of how membrane biophysics control circuit dynamics, however, is a continued effort to move toward large scale studies of membrane biophysics, in terms of the numbers of neurons and ion channels examined. Future work faces a number of theoretical and technical challenges on this front, but recent technological developments hold great promise for a larger scale understanding of how membrane biophysics contribute to circuit coding and computation.
NASA Technical Reports Server (NTRS)
Hou, Gene
2004-01-01
The focus of this research is on the development of analysis and sensitivity analysis equations for nonlinear, transient heat transfer problems modeled by a p-version, time-discontinuous finite element approximation. The resulting matrix equation of the state equation is simply of the form A(x)x = c, representing a single-step, time-marching scheme. The Newton-Raphson method is used to solve the nonlinear equation. Examples are first provided to demonstrate the accuracy characteristics of the resultant finite element approximation. A direct differentiation approach is then used to compute the thermal sensitivities of a nonlinear heat transfer problem. The report shows that only minimal coding effort is required to enhance the analysis code with the sensitivity analysis capability.
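A scalar toy problem makes the claim of minimal extra coding concrete: the Newton-Raphson tangent used to solve A(x)x = c is reused unchanged for the direct-differentiation sensitivity. In this sketch the parameterization a(x; p) = p + 0.1x and all numbers are invented for illustration:

```python
# Toy nonlinear "state equation" a(x; p) * x = c with a(x; p) = p + 0.1*x,
# standing in for the solution-dependent matrix A(x).
p, c = 2.0, 10.0
residual = lambda x: (p + 0.1 * x) * x - c
tangent = lambda x: p + 0.2 * x           # d/dx of a(x)*x

# Newton-Raphson iteration for the state x.
x = 1.0
for _ in range(50):
    dx = -residual(x) / tangent(x)
    x += dx
    if abs(dx) < 1e-12:
        break

# Direct differentiation: d/dp of a(x; p)*x = c gives
# tangent(x) * dx/dp + x = 0, reusing the converged tangent.
dxdp = -x / tangent(x)
print(f"state x = {x:.6f}, sensitivity dx/dp = {dxdp:.6f}")
```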
CIFOG: Cosmological Ionization Fields frOm Galaxies
NASA Astrophysics Data System (ADS)
Hutter, Anne
2018-03-01
CIFOG is a versatile MPI-parallelised semi-numerical tool to perform simulations of the Epoch of Reionization. From a set of evolving cosmological gas density and ionizing emissivity fields, it computes the time and spatially dependent ionization of neutral hydrogen (HI), neutral (HeI) and singly ionized helium (HeII) in the intergalactic medium (IGM). The code accounts for HII, HeII, HeIII recombinations, and provides different descriptions for the photoionization rate that are used to calculate the residual HI fraction in ionized regions. This tool has been designed to be coupled to semi-analytic galaxy formation models or hydrodynamical simulations. The modular fashion of the code allows the user to easily introduce new descriptions for recombinations and the photoionization rate.
An extensible circuit QED architecture for quantum computation
NASA Astrophysics Data System (ADS)
Dicarlo, Leo
Realizing a logical qubit robust to single errors in its constituent physical elements is an immediate challenge for quantum information processing platforms. A longer-term challenge will be achieving quantum fault tolerance, i.e., improving logical qubit resilience by increasing redundancy in the underlying quantum error correction code (QEC). In QuTech, we target these challenges in collaboration with industrial and academic partners. I will present the circuit QED quantum hardware, room-temperature control electronics, and software components of the complete architecture. I will show the extensibility of each component to the Surface-17 and -49 circuits needed to reach the objectives with surface-code QEC, and provide an overview of latest developments. Research funded by IARPA and Intel Corporation.
Modeling and comparative study of fluid velocities in heterogeneous rocks
NASA Astrophysics Data System (ADS)
Hingerl, Ferdinand F.; Romanenko, Konstantin; Pini, Ronny; Balcom, Bruce; Benson, Sally
2013-04-01
Detailed knowledge of the distribution of effective porosity and fluid velocities in heterogeneous rock samples is crucial for understanding and predicting spatially resolved fluid residence times and kinetic reaction rates of fluid-rock interactions. The applicability of conventional MRI techniques to sedimentary rocks is limited by internal magnetic field gradients and short spin relaxation times. The approach developed at the UNB MRI Centre combines the 13-interval Alternating-Pulsed-Gradient Stimulated-Echo (APGSTE) scheme and three-dimensional Single Point Ramped Imaging with T1 Enhancement (SPRITE). These methods were designed to reduce the errors due to effects of background gradients and fast transverse relaxation. SPRITE is largely immune to time-evolution effects resulting from background gradients, paramagnetic impurities and chemical shift. Using these techniques, we measured quantitative 3D porosity maps and single-phase fluid velocity fields in sandstone core samples. We then evaluated the applicability of the Kozeny-Carman relationship for modeling the measured fluid velocity distributions in sandstone samples showing meso-scale heterogeneities, using two different modeling approaches with the MRI maps as reference points. For the first approach, we applied the Kozeny-Carman relationship to the porosity distributions and computed respective permeability maps, which in turn provided input for a CFD simulation - using the Stanford CFD code GPRS - to compute averaged velocity maps. The latter were then compared to the measured velocity maps. For the second approach, the measured velocity distributions were used as input for inversely computing permeabilities with the GPRS CFD code. The computed permeabilities were then correlated with those based on the porosity maps and the Kozeny-Carman relationship. The findings of the comparative modeling study are discussed, and its potential impact on the modeling of fluid residence times and kinetic reaction rates of fluid-rock interactions in rocks containing meso-scale heterogeneities is reviewed.
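The porosity-to-permeability step of the first approach is a direct application of the Kozeny-Carman relationship. A NumPy sketch, in which the grain diameter d50, the shape constant c, and the porosity field are hypothetical placeholders rather than values from this study:

```python
import numpy as np

def kozeny_carman(porosity, d50=2.0e-4, c=180.0):
    """Permeability [m^2] from porosity via Kozeny-Carman:
    k = d50^2 * phi^3 / (c * (1 - phi)^2)."""
    phi = np.asarray(porosity)
    return (d50**2 / c) * phi**3 / (1.0 - phi)**2

# Map a voxelized porosity field straight to a permeability field,
# e.g. as input for a CFD velocity simulation.
phi_map = np.random.uniform(0.15, 0.25, size=(32, 32, 32))
k_map = kozeny_carman(phi_map)
```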
A hybrid gyrokinetic ion and isothermal electron fluid code for astrophysical plasma
NASA Astrophysics Data System (ADS)
Kawazura, Y.; Barnes, M.
2018-05-01
This paper describes a new code for simulating astrophysical plasmas that solves a hybrid model composed of gyrokinetic ions (GKI) and an isothermal electron fluid (ITEF) Schekochihin et al. (2009) [9]. This model captures ion kinetic effects that are important near the ion gyro-radius scale, while electron kinetic effects are ordered out by an electron-ion mass ratio expansion. The code is developed by incorporating the ITEF approximation into AstroGK, an Eulerian δf gyrokinetics code specialized to a slab geometry Numata et al. (2010) [41]. The new code treats the linear terms in the ITEF equations implicitly while the nonlinear terms are treated explicitly. We show linear and nonlinear benchmark tests to prove the validity and applicability of the simulation code. Since the fast electron timescale is eliminated by the mass ratio expansion, the Courant-Friedrichs-Lewy condition is much less restrictive than in full gyrokinetic codes; the present hybrid code runs ∼2√(m_i/m_e) ∼ 100 times faster than AstroGK with a single ion species and kinetic electrons, where m_i/m_e is the ion-electron mass ratio. The improvement of the computational time makes it feasible to execute ion scale gyrokinetic simulations with a high velocity space resolution and to run multiple simulations to determine the dependence of turbulent dynamics on parameters such as electron-ion temperature ratio and plasma beta.
Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi; Mao, Youdong
2017-01-01
Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.
Comparison of two computer codes for crack growth analysis: NASCRAC Versus NASA/FLAGRO
NASA Technical Reports Server (NTRS)
Stallworth, R.; Meyers, C. A.; Stinson, H. C.
1989-01-01
Results are presented from the comparison study of two computer codes for crack growth analysis - NASCRAC and NASA/FLAGRO. The two computer codes gave compatible, conservative results when the part-through-crack analysis solutions were compared against experimental test data. Results showed good correlation between the codes for the through-crack-at-a-lug solution, for which NASA/FLAGRO gave the most conservative results.
Computational Predictions of the Performance of Wright 'Bent End' Propellers
NASA Technical Reports Server (NTRS)
Wang, Xiang-Yu; Ash, Robert L.; Bobbitt, Percy J.; Prior, Edwin (Technical Monitor)
2002-01-01
Computational analyses of two 1911 Wright brothers 'Bent End' wooden propeller reproductions have been performed and compared with experimental test results from the Langley Full Scale Wind Tunnel. The purpose of the analysis was to check the consistency of the experimental results and to validate the reliability of the tests. This report is one part of a project on the propeller performance research of the Wright 'Bent End' propellers, intended to document the Wright brothers' pioneering propeller design contributions. Two computer codes were used in the computational predictions. The FLO-MG Navier-Stokes code is a CFD (Computational Fluid Dynamics) code based on the Navier-Stokes equations. It is mainly used to compute the lift coefficient and the drag coefficient at specified angles of attack at different radii. Those calculated data are the intermediate results of the computation and a part of the necessary input for the Propeller Design Analysis Code (based on the Adkins and Liebeck method), which is a propeller design code used to compute the propeller thrust coefficient, the propeller power coefficient and the propeller propulsive efficiency.
Dual Coding Theory Explains Biphasic Collective Computation in Neural Decision-Making.
Daniels, Bryan C; Flack, Jessica C; Krakauer, David C
2017-01-01
A central question in cognitive neuroscience is how unitary, coherent decisions at the whole organism level can arise from the distributed behavior of a large population of neurons with only partially overlapping information. We address this issue by studying neural spiking behavior recorded from a multielectrode array with 169 channels during a visual motion direction discrimination task. It is well known that in this task there are two distinct phases in neural spiking behavior. Here we show Phase I is a distributed or incompressible phase in which uncertainty about the decision is substantially reduced by pooling information from many cells. Phase II is a redundant or compressible phase in which numerous single cells contain all the information present at the population level in Phase I, such that the firing behavior of a single cell is enough to predict the subject's decision. Using an empirically grounded dynamical modeling framework, we show that in Phase I large cell populations with low redundancy produce a slow timescale of information aggregation through critical slowing down near a symmetry-breaking transition. Our model indicates that increasing collective amplification in Phase II leads naturally to a faster timescale of information pooling and consensus formation. Based on our results and others in the literature, we propose that a general feature of collective computation is a "coding duality" in which there are accumulation and consensus formation processes distinguished by different timescales.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curry, Matthew L.; Ferreira, Kurt Brian; Pedretti, Kevin Thomas Tauke
2012-03-01
This report documents thirteen of Sandia's contributions to the Computational Systems and Software Environment (CSSE) within the Advanced Simulation and Computing (ASC) program between fiscal years 2009 and 2012. It describes their impact on ASC applications. Most contributions are implemented in lower software levels, allowing for application improvement without source code changes. Improvements are identified in such areas as reduced run time, characterizing power usage, and Input/Output (I/O). Other experiments are more forward looking, demonstrating potential bottlenecks using mini-application versions of the legacy codes and simulating their network activity on Exascale-class hardware. The purpose of this report is to prove that the team has completed milestone 4467 - Demonstration of a Legacy Application's Path to Exascale. Cielo is expected to be the last capability system on which existing ASC codes can run without significant modifications. This assertion will be tested to determine where the breaking point is for an existing highly scalable application. The goal is to stretch the performance boundaries of the application by applying recent CSSE R&D in areas such as resilience, power, I/O, visualization services, SMARTMAP, lightweight kernels (LWKs), virtualization, simulation, and feedback loops. Dedicated system time reservations and/or CCC allocations will be used to quantify the impact of system-level changes to extend the life and performance of the ASC code base. Finally, a simulation of anticipated exascale-class hardware will be performed using SST to supplement the calculations. Determine where the breaking point is for an existing highly scalable application: Chapter 15 presented the CSSE work that sought to identify the breaking point in two ASC legacy applications - Charon and CTH. Their mini-app versions were also employed to complete the task. There is no single breaking point, as more than one issue was found with the two codes. The results were that applications can expect to encounter performance issues related to the computing environment, system software, and algorithms. Careful profiling of runtime performance will be needed to identify the source of an issue, in strong combination with knowledge of system software and application source code.
Proceduracy: Computer Code Writing in the Continuum of Literacy
ERIC Educational Resources Information Center
Vee, Annette
2010-01-01
This dissertation looks at computer programming through the lens of literacy studies, building from the concept of code as a written text with expressive and rhetorical power. I focus on the intersecting technological and social factors of computer code writing as a literacy--a practice I call "proceduracy". Like literacy, proceduracy is a human…
Computer Code Aids Design Of Wings
NASA Technical Reports Server (NTRS)
Carlson, Harry W.; Darden, Christine M.
1993-01-01
The AERO2S computer code was developed to aid design engineers in the selection and evaluation of aerodynamically efficient wing/canard and wing/horizontal-tail configurations that include simple hinged-flap systems. The code rapidly estimates the longitudinal aerodynamic characteristics of conceptual airplane lifting-surface arrangements. It was developed in FORTRAN V on a CDC 6000 computer system and ported to the MS-DOS environment.
Cloud Computing for Complex Performance Codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin
This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 demonstrated that complex codes, on several differently configured servers, could run and compute trivial small-scale problems in a commercial cloud infrastructure. Phase 2 focused on proving that non-trivial large-scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
APC: A New Code for Atmospheric Polarization Computations
NASA Technical Reports Server (NTRS)
Korkin, Sergey V.; Lyapustin, Alexei I.; Rozanov, Vladimir V.
2014-01-01
A new polarized radiative transfer code, Atmospheric Polarization Computations (APC), is described. The code is based on separation of the diffuse light field into anisotropic and smooth (regular) parts. The anisotropic part is computed analytically. The smooth regular part is computed numerically using the discrete ordinates method. Vertical stratification of the atmosphere, common types of bidirectional surface reflection, and scattering by spherical particles or spheroids are included. Particular consideration is given to computation of the bidirectional polarization distribution function (BPDF) of the waved ocean surface.
Lappala, E.G.; Healy, R.W.; Weeks, E.P.
1987-01-01
This report documents a FORTRAN computer code for solving problems involving variably saturated single-phase flow in porous media. The flow equation is written with total hydraulic potential as the dependent variable, which allows straightforward treatment of both saturated and unsaturated conditions. The spatial derivatives in the flow equation are approximated by central differences, and time derivatives are approximated either by a fully implicit backward-difference scheme or by a centered-difference scheme. Nonlinear conductance and storage terms may be linearized using either an explicit method or an implicit Newton-Raphson method. Relative hydraulic conductivity is evaluated at cell boundaries by using either full upstream weighting, the arithmetic mean, or the geometric mean of values from adjacent cells. Nonlinear boundary conditions treated by the code include infiltration, evaporation, and seepage faces. Extraction by plant roots that is caused by atmospheric demand is included as a nonlinear sink term. These nonlinear boundary and sink terms are linearized implicitly. The code has been verified against several one-dimensional linear problems for which analytical solutions exist and against two nonlinear problems that have been simulated with other numerical models. A complete listing of data-entry requirements, together with data entry and results for three example problems, is provided. (USGS)
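A minimal sketch of the three intercell conductivity weightings just listed; function and variable names are ours for illustration, not taken from the documented FORTRAN code:

```c
/* Sketch: the three intercell relative-conductivity weighting options
 * described above, for two adjacent cells i and j. Illustrative only. */
#include <math.h>

enum Weighting { FULL_UPSTREAM, ARITHMETIC_MEAN, GEOMETRIC_MEAN };

/* kr_i, kr_j: relative hydraulic conductivities of the adjacent cells;
 * h_i, h_j: total hydraulic potentials (flow runs from high to low head). */
double intercell_kr(double kr_i, double kr_j, double h_i, double h_j,
                    enum Weighting w)
{
    switch (w) {
    case FULL_UPSTREAM:   return (h_i >= h_j) ? kr_i : kr_j;
    case ARITHMETIC_MEAN: return 0.5 * (kr_i + kr_j);
    case GEOMETRIC_MEAN:  return sqrt(kr_i * kr_j);
    }
    return 0.0;
}
```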
NASA Technical Reports Server (NTRS)
Plesea, Lucian
2006-01-01
A computer program automatically builds large, full-resolution mosaics of multispectral images of Earth landmasses from images acquired by Landsat 7, complete with matching of colors and blending between adjacent scenes. While the code has been used extensively for Landsat, it could also be used for other data sources. A single mosaic of as many as 8,000 scenes, represented by more than 5 terabytes of data and the largest set produced in this work, demonstrated what the code could do to provide global coverage. The program first statistically analyzes input images to determine areas of coverage and data-value distributions. It then transforms the input images from their original universal transverse Mercator coordinates to other geographical coordinates, with scaling. It applies a first-order polynomial brightness correction to each band in each scene. It uses a data-mask image for selecting data and blending of input scenes. Under control by a user, the program can be made to operate on small parts of the output image space, with check-point and restart capabilities. The program runs on SGI IRIX computers. It is capable of parallel processing using shared-memory code, large memories, and tens of central processing units. It can retrieve input data and store output data at locations remote from the processors on which it is executed.
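One plausible realization of the first-order polynomial brightness correction described above is moment matching against the reference mosaic in the overlap region; the moment-matching rule here is our assumption for illustration, not the program's documented fit:

```c
/* Hedged sketch: fit a per-band linear correction (corrected = gain * raw
 * + offset) so a scene's mean and standard deviation match those of the
 * reference data in the overlap region. */
typedef struct { double gain, offset; } LinearCorrection;

LinearCorrection fit_brightness(double scene_mean, double scene_std,
                                double ref_mean, double ref_std)
{
    LinearCorrection c;
    c.gain   = (scene_std > 0.0) ? ref_std / scene_std : 1.0;
    c.offset = ref_mean - c.gain * scene_mean;
    return c;
}
```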
Parametric Design of Injectors for LDI-3 Combustors
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Mongia, Hukam; Lee, Phil
2015-01-01
Application of a partially calibrated National Combustion Code (NCC) for providing guidance in the design of the 3rd generation of the Lean-Direct Injection (LDI) multi-element combustion configuration (LDI-3) is summarized. NCC was used to perform non-reacting and two-phase reacting flow computations on several LDI-3 injector configurations in a single-element and a five-element injector array. All computations were performed with a consistent approach for mesh-generation, turbulence, spray simulations, ignition and chemical kinetics-modeling. Both qualitative and quantitative assessment of the computed flowfield characteristics of the several design options led to selection of an optimal injector LDI-3 design that met all the requirements including effective area, aerodynamics and fuel-air mixing criteria. Computed LDI-3 emissions (namely, NOx, CO and UHC) will be compared with the prior generation LDI-2 combustor experimental data at relevant engine cycle conditions.
Computational Models of Anterior Cingulate Cortex: At the Crossroads between Prediction and Effort.
Vassena, Eliana; Holroyd, Clay B; Alexander, William H
2017-01-01
In the last two decades the anterior cingulate cortex (ACC) has become one of the most investigated areas of the brain. Extensive neuroimaging evidence suggests countless functions for this region, ranging from conflict and error coding to social cognition, pain, and effortful control. In response to this burgeoning amount of data, a proliferation of computational models has tried to characterize the neurocognitive architecture of ACC. Early seminal models provided a computational explanation for a relatively circumscribed set of empirical findings, mainly accounting for EEG and fMRI evidence. More recent models have focused on ACC's contribution to effortful control. In parallel to these developments, several proposals attempted to explain within a single computational framework a wider variety of empirical findings that span different cognitive processes and experimental modalities. Here we critically evaluate these modeling attempts, highlighting the continued need to reconcile the array of disparate ACC observations within a coherent, unifying framework.
Hypercube matrix computation task
NASA Technical Reports Server (NTRS)
Calalo, Ruel H.; Imbriale, William A.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Lyons, James R.; Manshadi, Farzin; Patterson, Jean E.
1988-01-01
A major objective of the Hypercube Matrix Computation effort at the Jet Propulsion Laboratory (JPL) is to investigate the applicability of a parallel computing architecture to the solution of large-scale electromagnetic scattering problems. Three scattering analysis codes are being implemented and assessed on a JPL/California Institute of Technology (Caltech) Mark 3 Hypercube. The codes, which utilize different underlying algorithms, give a means of evaluating the general applicability of this parallel architecture. The three analysis codes being implemented are a frequency domain method of moments code, a time domain finite difference code, and a frequency domain finite elements code. These analysis capabilities are being integrated into an electromagnetics interactive analysis workstation which can serve as a design tool for the construction of antennas and other radiating or scattering structures. The first two years of work on the Hypercube Matrix Computation effort are summarized, including both new developments and results as well as work previously reported in the Hypercube Matrix Computation Task: Final Report for 1986 to 1987 (JPL Publication 87-18).
NASA Technical Reports Server (NTRS)
Norment, H. G.
1980-01-01
Calculations can be performed for any atmospheric conditions and for all water drop sizes, from the smallest cloud droplet to large raindrops. Any subsonic, external, non-lifting flow can be accommodated; flow into, but not through, inlets also can be simulated. Experimental water drop drag relations are used in the water drop equations of motion, and effects of gravity settling are included. Seven codes are described: (1) a code used to debug and plot body surface description data; (2) a code that processes the body surface data to yield the potential flow field; (3) a code that computes flow velocities at arrays of points in space; (4) a code that computes water drop trajectories from an array of points in space; (5) a code that computes water drop trajectories and fluxes to arbitrary target points; (6) a code that computes water drop trajectories tangent to the body; and (7) a code that produces stereo pair plots which include both the body and the trajectories. Code descriptions include operating instructions, card inputs and printouts for example problems, and listings of the FORTRAN codes. Accuracy of the calculations is discussed, and trajectory calculation results are compared with prior calculations and with experimental data.
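A minimal sketch of the kind of water drop equation of motion such codes integrate: drag toward the local air velocity plus gravity settling. The Schiller-Naumann drag correlation below stands in for the experimental drag relations mentioned above, and all names are our assumptions:

```c
/* Sketch: one explicit Euler step of a water drop's velocity v in an air
 * flow with local velocity u, drop diameter d, time step dt. */
#include <math.h>

#define RHO_AIR   1.225    /* kg/m^3 */
#define RHO_WATER 1000.0   /* kg/m^3 */
#define MU_AIR    1.81e-5  /* Pa.s   */
#define G         9.81     /* m/s^2  */

void drop_step(double v[3], const double u[3], double d, double dt)
{
    double rel[3], vrel = 0.0;
    for (int i = 0; i < 3; i++) { rel[i] = u[i] - v[i]; vrel += rel[i] * rel[i]; }
    vrel = sqrt(vrel);

    double Re = RHO_AIR * vrel * d / MU_AIR;   /* drop Reynolds number */
    double cd = (Re > 0.0) ? 24.0 / Re * (1.0 + 0.15 * pow(Re, 0.687)) : 0.0;
    double k  = 0.75 * cd * RHO_AIR * vrel / (RHO_WATER * d);

    for (int i = 0; i < 3; i++)
        v[i] += dt * k * rel[i];   /* drag acceleration toward air velocity */
    v[2] -= dt * G;                /* gravity settling, z up */
}
```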
Utilizing GPUs to Accelerate Turbomachinery CFD Codes
NASA Technical Reports Server (NTRS)
MacCalla, Weylin; Kulkarni, Sameer
2016-01-01
GPU computing has established itself as a way to accelerate parallel codes in the high performance computing world. This work focuses on speeding up APNASA, a legacy CFD code used at NASA Glenn Research Center, while also drawing conclusions about the nature of GPU computing and the requirements to make GPGPU worthwhile on legacy codes. Rewriting and restructuring of the source code was avoided to limit the introduction of new bugs. The code was profiled and investigated for parallelization potential, then OpenACC directives were used to indicate parallel parts of the code. The use of OpenACC directives was not able to reduce the runtime of APNASA on either the NVIDIA Tesla discrete graphics card, or the AMD accelerated processing unit. Additionally, it was found that in order to justify the use of GPGPU, the amount of parallel work being done within a kernel would have to greatly exceed the work being done by any one portion of the APNASA code. It was determined that in order for an application like APNASA to be accelerated on the GPU, it should not be modular in nature, and the parallel portions of the code must contain a large portion of the code's computation time.
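To make the directive-based approach concrete, here is a minimal sketch of the OpenACC style described above, applied to a generic vector-update loop; the routine is illustrative, not APNASA source:

```c
/* Sketch: an OpenACC directive marks the loop for GPU offload without
 * restructuring the surrounding code, mirroring the approach above. */
void update(int n, const double *restrict a, const double *restrict b,
            double *restrict c)
{
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + 0.5 * b[i];
}
```

As the abstract notes, a kernel this small cannot amortize the host-device data transfers; that is precisely why the work concluded that the parallel work per kernel must greatly exceed any single portion of the APNASA computation.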
PASCO: Structural panel analysis and sizing code: Users manual - Revised
NASA Technical Reports Server (NTRS)
Anderson, M. S.; Stroud, W. J.; Durling, B. J.; Hennessy, K. W.
1981-01-01
A computer code denoted PASCO is described for analyzing and sizing uniaxially stiffened composite panels. Buckling and vibration analyses are carried out with a linked plate analysis computer code denoted VIPASA, which is included in PASCO. Sizing is based on nonlinear mathematical programming techniques and employs a computer code denoted CONMIN, also included in PASCO. Design requirements considered are initial buckling, material strength, stiffness and vibration frequency. A user's manual for PASCO is presented.
Computation of Reacting Flows in Combustion Processes
NASA Technical Reports Server (NTRS)
Keith, Theo G., Jr.; Chen, Kuo-Huey
1997-01-01
The main objective of this research was to develop an efficient three-dimensional computer code for chemically reacting flows. The main computer code developed is ALLSPD-3D, a program for the calculation of three-dimensional, chemically reacting flows with sprays. The ALLSPD code employs a coupled, strongly implicit solution procedure for turbulent spray combustion flows. A stochastic droplet model and an efficient method for treatment of the spray source terms in the gas-phase equations are used to calculate the evaporating liquid sprays. The chemistry treatment in the code is general enough that an arbitrary number of reactions and species can be defined by the user. Also, the code is written in generalized curvilinear coordinates with both multi-block and flexible internal blockage capabilities to handle complex geometries. In addition, for general industrial combustion applications, the code provides both dilution and transpiration cooling capabilities. The ALLSPD algorithm, which employs preconditioning and eigenvalue rescaling techniques, is capable of providing efficient solutions for flows with a wide range of Mach numbers. Although written for three-dimensional flows in general, the code can be used for two-dimensional and axisymmetric flow computations as well. The code is written so that it can be run on various computer platforms (supercomputers, workstations, and parallel processors), and the GUI (Graphical User Interface) should provide a user-friendly tool for setting up and running the code.
ALCBEAM - Neutral beam formation and propagation code for beam-based plasma diagnostics
NASA Astrophysics Data System (ADS)
Bespamyatnov, I. O.; Rowan, W. L.; Liao, K. T.
2012-03-01
ALCBEAM is a new three-dimensional neutral beam formation and propagation code. It was developed to support the beam-based diagnostics installed on the Alcator C-Mod tokamak. The purpose of the code is to provide reliable estimates of the local beam equilibrium parameters, such as beam energy fractions, density profiles, and excitation populations. The code effectively unifies the ion beam formation, extraction, and neutralization processes with beam attenuation and excitation in plasma and neutral gas and with beam stopping by the beam apertures. This paper describes the physical processes treated by the code, along with the computational methods employed. The description concludes with an example simulation of beam penetration into an Alcator C-Mod plasma. The code is being used successfully on the Alcator C-Mod tokamak and is expected to be valuable in support of beam-based diagnostics in most other tokamak environments.

Program summary
Program title: ALCBEAM
Catalogue identifier: AEKU_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEKU_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 66 459
No. of bytes in distributed program, including test data, etc.: 7 841 051
Distribution format: tar.gz
Programming language: IDL
Computer: Workstation, PC
Operating system: Linux
RAM: 1 GB
Classification: 19.2
Nature of problem: Neutral beams are commonly used to heat and/or diagnose high-temperature magnetically confined laboratory plasmas. An accurate neutral beam characterization is required for beam-based measurements of plasma properties. Beam parameters such as density distribution, energy composition, and atomic excited populations of the beam atoms need to be known.
Solution method: A neutral beam is initially formed as an ion beam, which is extracted from the ion source by high voltage applied to the extraction and accelerating grids. The current distribution of a single beamlet emitted from a single pore of the ion optical system (IOS) depends on the shape of the plasma boundary in the emission region. The total beam extracted by the IOS is calculated at every point of a 3D mesh as the sum of the contributions from each grid pore. The code effectively unifies the ion beam formation, extraction, and neutralization processes with neutral beam attenuation and excitation in plasma and neutral gas and with beam stopping by the beam apertures.
Running time: 10 min for a standard run.
Software for Better Documentation of Other Software
NASA Technical Reports Server (NTRS)
Pinedo, John
2003-01-01
The Literate Programming Extraction Engine is a Practical Extraction and Reporting Language- (PERL-)based computer program that facilitates and simplifies the implementation of a concept of self-documented literate programming in a fashion tailored to the typical needs of scientists. The advantage for the programmer is that documentation and source code are written side-by-side in the same file, reducing the likelihood that the documentation will be inconsistent with the code and improving the verification that the code performs its intended functions. The advantage for the user is the knowledge that the documentation matches the software because they come from the same file. This program unifies the documentation process for a variety of programming languages, including C, C++, and several versions of FORTRAN. This program can process the documentation in any markup language, and incorporates the LaTeX typesetting software. The program includes sample Makefile scripts for automating both the code-compilation (when appropriate) and documentation-generation processes into a single command-line statement. Also included are macro instructions for the Emacs display-editor software, making it easy for a programmer to toggle between editing in a code or a documentation mode.
Advanced Pellet Cladding Interaction Modeling Using the US DOE CASL Fuel Performance Code: Peregrine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jason Hales; Various
The US DOE's Consortium for Advanced Simulation of LWRs (CASL) program has undertaken an effort to enhance and develop modeling and simulation tools for a virtual reactor application, including high-fidelity neutronics, fluid flow/thermal hydraulics, and fuel and material behavior. The fuel performance analysis efforts aim to provide 3-dimensional capabilities for single and multiple rods to assess safety margins and the impact of plant operation and fuel rod design on the fuel thermomechanical-chemical behavior, including Pellet-Cladding Interaction (PCI) failures and CRUD-Induced Localized Corrosion (CILC) failures in PWRs [1-3]. The CASL fuel performance code, Peregrine, is an engineering-scale code built upon the MOOSE/ELK/FOX computational FEM framework, which is also common to the fuel modeling framework BISON [4,5]. Peregrine uses both 2-D and 3-D geometric fuel rod representations and contains a materials properties and fuel behavior model library for the UO2 and Zircaloy system common to PWR fuel, derived from both open literature sources and the FALCON code [6]. The primary purpose of Peregrine is to accurately calculate the thermal, mechanical, and chemical processes active throughout a single fuel rod during operation in a reactor, for both steady state and off-normal conditions.
NASA Technical Reports Server (NTRS)
Dame, L. T.; Stouffer, D. C.
1986-01-01
A tool for the mechanical analysis of nickel-base single crystal superalloys, specifically Rene N4, used in gas turbine engine components is developed. This is achieved by a rate-dependent anisotropic constitutive model implemented in a nonlinear three-dimensional finite element code. The constitutive model is developed from metallurgical concepts utilizing a crystallographic approach. A non-Schmid's law formulation is used to model the tension/compression asymmetry and orientation dependence in octahedral slip. Schmid's law is a good approximation to the inelastic response of the material in cube slip. The constitutive equations model the tensile behavior, creep response, and strain rate sensitivity of these alloys. Methods for deriving the material constants from standard tests are presented. The finite element implementation utilizes an initial strain method and twenty-noded isoparametric solid elements. The ability to model piecewise linear load histories is included in the finite element code. The constitutive equations are accurately and economically integrated using a second-order Adams-Moulton predictor-corrector method with a dynamic time incrementing procedure. Computed results from the finite element code are compared with experimental data for tensile, creep, and cyclic tests at 760 deg C. The strain rate sensitivity and stress relaxation capabilities of the model are evaluated.
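As a minimal sketch of the second-order Adams-Moulton predictor-corrector named above, shown for a generic scalar ODE y' = f(t, y) rather than the constitutive equations themselves; all names are ours:

```c
/* Sketch: AB2 predictor followed by a trapezoidal (AM2) corrector.
 * f_prev is f at the previous time level; seed it with f(t0, y0) on
 * the first step. Step size h would be chosen by the dynamic time
 * incrementing procedure mentioned above. */
typedef double (*Rhs)(double t, double y);

double am2_step(Rhs f, double t, double y, double f_prev, double h,
                double *f_out)
{
    double fn = f(t, y);
    double yp = y + h * (1.5 * fn - 0.5 * f_prev);   /* Adams-Bashforth 2 predictor */
    double yc = y + 0.5 * h * (fn + f(t + h, yp));   /* Adams-Moulton 2 corrector   */
    *f_out = fn;   /* history for the next step's predictor */
    return yc;
}
```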
Automated generation of lattice QCD Feynman rules
NASA Astrophysics Data System (ADS)
Hart, A.; von Hippel, G. M.; Horgan, R. R.; Müller, E. H.
2009-12-01
The derivation of the Feynman rules for lattice perturbation theory from actions and operators is complicated, especially for highly improved actions such as HISQ. This task is, however, both important and particularly suitable for automation. We describe a suite of software to generate and evaluate Feynman rules for a wide range of lattice field theories with gluons and (relativistic and/or heavy) quarks. Our programs are capable of dealing with actions as complicated as (m)NRQCD and HISQ. Automated differentiation methods are used to calculate also the derivatives of Feynman diagrams.

Program summary
Program title: HiPPy, HPsrc
Catalogue identifier: AEDX_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDX_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPLv2 (see Additional comments below)
No. of lines in distributed program, including test data, etc.: 513 426
No. of bytes in distributed program, including test data, etc.: 4 893 707
Distribution format: tar.gz
Programming language: Python, Fortran95
Computer: HiPPy: single-processor workstations. HPsrc: single-processor workstations and MPI-enabled multi-processor systems
Operating system: HiPPy: any for which Python v2.5.x is available. HPsrc: any for which a standards-compliant Fortran95 compiler is available
Has the code been vectorised or parallelised?: Yes
RAM: Problem specific, typically less than 1 GB for either code
Classification: 4.4, 11.5
Nature of problem: Derivation and use of perturbative Feynman rules for complicated lattice QCD actions.
Solution method: An automated expansion method implemented in Python (HiPPy) and code to use the expansions to generate Feynman rules in Fortran95 (HPsrc).
Restrictions: No general restrictions. Specific restrictions are discussed in the text.
Additional comments: The HiPPy and HPsrc codes are released under the second version of the GNU General Public Licence (GPL v2). Therefore anyone is free to use or modify the code for their own calculations. As part of the licensing, we ask that any publications including results from the use of this code or of modifications of it cite Refs. [1,2] as well as this paper. Finally, we also ask that details of these publications, as well as of any bugs or required or useful improvements of this core code, be communicated to us.
Running time: Very problem specific, depending on the complexity of the Feynman rules and the number of integration points. Typically between a few minutes and several weeks. The installation tests provided with the program code take only a few seconds to run.
References:
[1] A. Hart, G.M. von Hippel, R.R. Horgan, L.C. Storoni, Automatically generating Feynman rules for improved lattice field theories, J. Comput. Phys. 209 (2005) 340-353, doi:10.1016/j.jcp.2005.03.010, arXiv:hep-lat/0411026.
[2] M. Lüscher, P. Weisz, Efficient numerical techniques for perturbative lattice gauge theory computations, Nucl. Phys. B 266 (1986) 309, doi:10.1016/0550-3213(86)90094-5.
NASA Rotor 37 CFD Code Validation: Glenn-HT Code
NASA Technical Reports Server (NTRS)
Ameri, Ali A.
2010-01-01
In order to advance the goals of NASA aeronautics programs, it is necessary to continuously evaluate and improve the computational tools used for research and design at NASA. One such code is Glenn-HT, which is used at NASA Glenn Research Center (GRC) for turbomachinery computations. Although the code has been thoroughly validated for turbine heat transfer computations, it has not been utilized for compressors. In this work, Glenn-HT was used to compute the flow in a transonic compressor, and comparisons were made to experimental data. The results presented here are in good agreement with these data. Most of the measures of performance are well within the measurement uncertainties, and the exit profiles of interest agree with the experimental measurements.
New Class of Quantum Error-Correcting Codes for a Bosonic Mode
NASA Astrophysics Data System (ADS)
Michael, Marios H.; Silveri, Matti; Brierley, R. T.; Albert, Victor V.; Salmilehto, Juha; Jiang, Liang; Girvin, S. M.
2016-07-01
We construct a new class of quantum error-correcting codes for a bosonic mode, which are advantageous for applications in quantum memories, communication, and scalable computation. These "binomial quantum codes" are formed from a finite superposition of Fock states weighted with binomial coefficients. The binomial codes can exactly correct errors that are polynomial up to a specific degree in bosonic creation and annihilation operators, including amplitude damping and displacement noise as well as boson addition and dephasing errors. For realistic continuous-time dissipative evolution, the codes can perform approximate quantum error correction to any given order in the time step between error detection measurements. We present an explicit approximate quantum error recovery operation based on projective measurements and unitary operations. The binomial codes are tailored for detecting boson loss and gain errors by means of measurements of the generalized number parity. We discuss optimization of the binomial codes and demonstrate that by relaxing the parity structure, codes with even lower unrecoverable error rates can be achieved. The binomial codes are related to existing two-mode bosonic codes, but offer the advantage of requiring only a single bosonic mode to correct amplitude damping as well as the ability to correct other errors. Our codes are similar in spirit to "cat codes" based on superpositions of the coherent states but offer several advantages such as smaller mean boson number, exact rather than approximate orthonormality of the code words, and an explicit unitary operation for repumping energy into the bosonic mode. The binomial quantum codes are realizable with current superconducting circuit technology, and they should prove useful in other quantum technologies, including bosonic quantum memories, photonic quantum communication, and optical-to-microwave up- and down-conversion.
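For concreteness, the simplest member of this family, quoted here from standard treatments of bosonic codes rather than from the abstract itself, protects against a single boson loss:

\[
|\bar{0}\rangle = \frac{|0\rangle + |4\rangle}{\sqrt{2}}, \qquad |\bar{1}\rangle = |2\rangle .
\]

Both code words have even photon-number parity and equal mean photon number \(\bar{n} = 2\), so a single loss maps the code space onto an orthogonal odd-parity subspace; measuring the generalized number parity therefore detects the error without revealing, or disturbing, the encoded logical state.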
Operations analysis (study 2.1). Program listing for the LOVES computer code
NASA Technical Reports Server (NTRS)
Wray, S. T., Jr.
1974-01-01
A listing of the LOVES computer program is presented. The program is coded partially in SIMSCRIPT and FORTRAN. This version of LOVES is compatible with both the CDC 7600 and the UNIVAC 1108 computers. The code has been compiled, loaded, and executed successfully on the EXEC 8 system for the UNIVAC 1108.
Optimization of Car Body under Constraints of Noise, Vibration, and Harshness (NVH), and Crash
NASA Technical Reports Server (NTRS)
Kodiyalam, Srinivas; Yang, Ren-Jye; Sobieszczanski-Sobieski, Jaroslaw (Editor)
2000-01-01
To be competitive on today's market, cars have to be as light as possible while meeting Noise, Vibration, and Harshness (NVH) requirements and conforming to Government-mandated crash survival regulations. The latter are difficult to meet because they involve very compute-intensive, nonlinear analysis; e.g., the code RADIOSS, capable of simulating the dynamics and the geometrical and material nonlinearities of a thin-walled car structure in crash, would require over 12 days of elapsed time for a single design of a 390K elastic-degrees-of-freedom model if executed on a single processor of the state-of-the-art SGI Origin2000 computer. Of course, in optimization that crash analysis would have to be invoked many times. Needless to say, that has rendered such optimization intractable until now. The advent of computers that comprise large numbers of concurrently operating processors has created a new environment wherein the above optimization, and other engineering problems heretofore regarded as intractable, may be solved. The procedure is a piecewise-approximation-based method that uses a sensitivity-based Taylor series approximation model for NVH and a polynomial response surface model for crash. In that method the NVH constraints are evaluated using a finite element code (MSC/NASTRAN) that yields the constraint values and their derivatives with respect to design variables. The crash constraints are evaluated using the explicit code RADIOSS on the Origin 2000, operating on 256 processors simultaneously, to generate data for a polynomial response surface in the design variable domain. The NVH constraints and their derivatives, combined with the response surface for the crash constraints, form an approximation to the system analysis (surrogate analysis) that enables a cycle of multidisciplinary optimization within move limits. In the inner loop, the NVH sensitivities are recomputed to update the NVH approximation model while keeping the crash response surface constant. In every outer loop, the crash response surface approximation is updated, including a gradual increase in the order of the response surface and the response surface extension in the direction of the search. In this optimization task, the NVH discipline has 30 design variables and the crash discipline has 20 design variables, 10 of which are common to both disciplines. In order to construct a linear response surface for the crash discipline constraints, a minimum of 21 design points would have to be analyzed using the RADIOSS code. On a single processor of the Origin 2000 that amount of computing would require over 9 months. In this work, these runs were carried out concurrently on the Origin 2000 using multiple processors, ranging from 8 to 16, for each crash (RADIOSS) analysis; wall times for a single RADIOSS analysis were compared across processor counts and across two common data-placement procedures within the allotted memories. The initial design was infeasible, with NVH-discipline static torsion constraint violations of over 10%. The final optimized design is feasible, with a weight reduction of 15 kg compared to the initial design.
This work demonstrates how advanced methodology for optimization combined with the technology of concurrent processing enables applications that until now were out of reach because of very long time-to-solution.
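The surrogate structure described above can be written compactly. The notation is ours: g is a generic constraint function, x the design-variable vector, and x_0 the current design point.

\[
\tilde{g}_{\mathrm{NVH}}(x) = g(x_0) + \nabla g(x_0)^{\mathsf{T}}(x - x_0), \qquad
\tilde{g}_{\mathrm{crash}}(x) = \beta_0 + \sum_{i=1}^{20} \beta_i x_i .
\]

The 21 coefficients of the linear crash surface account for the minimum of 21 RADIOSS design points quoted above: the beta coefficients are fit by least squares to the concurrently computed crash analyses, while the NVH gradient comes directly from MSC/NASTRAN sensitivity output.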
ERIC Educational Resources Information Center
Knowlton, Marie; Wetzel, Robin
2006-01-01
This study compared the length of text in English Braille American Edition, the Nemeth code, and the computer braille code with the Unified English Braille Code (UEBC)--also known as Unified English Braille (UEB). The findings indicate that differences in the length of text are dependent on the type of material that is transcribed and the grade…
A MATLAB based 3D modeling and inversion code for MT data
NASA Astrophysics Data System (ADS)
Singh, Arun; Dehiya, Rahul; Gupta, Pravin K.; Israil, M.
2017-07-01
The development of a MATLAB-based computer code, AP3DMT, for modeling and inversion of 3D magnetotelluric (MT) data is presented. The code comprises two independent components: a grid generator code and a modeling/inversion code. The grid generator code performs model discretization and acts as an interface by generating various I/O files. The inversion code performs the core computations in modular form: forward modeling, data functionals, sensitivity computations, and regularization. These modules can be readily extended to other similar inverse problems, such as Controlled-Source EM (CSEM). The modular structure of the code provides a framework useful for the implementation of new applications and inversion algorithms. The use of MATLAB and its libraries makes the code compact and user friendly. The code has been validated on several published models. To demonstrate its versatility and capabilities, the results of inversion for two complex models are presented.
Three-Dimensional Nacelle Aeroacoustics Code With Application to Impedance Eduction
NASA Technical Reports Server (NTRS)
Watson, Willie R.
2000-01-01
A three-dimensional nacelle acoustics code that accounts for uniform mean flow and variable surface impedance liners is developed. The code is linked to a commercial version of the NASA-developed General Purpose Solver (for the solution of linear systems of equations) in order to obtain the capability to study high-frequency waves that may require millions of grid points for resolution. Detailed single-processor statistics for the performance of the solver in rigid- and soft-wall ducts are presented. Over the range of frequencies of current interest in nacelle liner research, noise attenuation levels predicted by the code were in excellent agreement with those predicted by mode theory. The equation solver is memory efficient, requiring only a small fraction of the memory available on modern computers. As an application, the code is combined with an optimization algorithm and used to educe the impedance spectrum of a ceramic liner. The primary problem with using the code to perform optimization studies at frequencies above 1 kHz is the excessive CPU time, a major portion of which is matrix assembly. It is recommended that research be directed toward the development of a rapid sparse assembler and exploitation of the multiprocessor capability of the solver to further reduce CPU time.
Error-Transparent Quantum Gates for Small Logical Qubit Architectures
NASA Astrophysics Data System (ADS)
Kapit, Eliot
2018-02-01
One of the largest obstacles to building a quantum computer is gate error, where the physical evolution of the state of a qubit or group of qubits during a gate operation does not match the intended unitary transformation. Gate error stems from a combination of control errors and random single qubit errors from interaction with the environment. While great strides have been made in mitigating control errors, intrinsic qubit error remains a serious problem that limits gate fidelity in modern qubit architectures. Simultaneously, recent developments of small error-corrected logical qubit devices promise significant increases in logical state lifetime, but translating those improvements into increases in gate fidelity is a complex challenge. In this Letter, we construct protocols for gates on and between small logical qubit devices which inherit the parent device's tolerance to single qubit errors which occur at any time before or during the gate. We consider two such devices, a passive implementation of the three-qubit bit flip code, and the author's own [E. Kapit, Phys. Rev. Lett. 116, 150501 (2016), 10.1103/PhysRevLett.116.150501] very small logical qubit (VSLQ) design, and propose error-tolerant gate sets for both. The effective logical gate error rate in these models displays superlinear error reduction with linear increases in single qubit lifetime, proving that passive error correction is capable of increasing gate fidelity. Using a standard phenomenological noise model for superconducting qubits, we demonstrate a realistic, universal one- and two-qubit gate set for the VSLQ, with error rates an order of magnitude lower than those for same-duration operations on single qubits or pairs of qubits. These developments further suggest that incorporating small logical qubits into a measurement based code could substantially improve code performance.
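In its standard textbook form, the three-qubit bit flip code mentioned above encodes

\[
|\bar{0}\rangle = |000\rangle, \qquad |\bar{1}\rangle = |111\rangle ,
\]

and a single bit flip \(X_i\) is diagnosed by measuring the stabilizers \(Z_1 Z_2\) and \(Z_2 Z_3\); the two-bit syndrome identifies the flipped qubit, which is then corrected without measuring, and hence without disturbing, the logical state. The passive implementation considered in the Letter performs this correction autonomously rather than through explicit measurement cycles.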
Sparse bursts optimize information transmission in a multiplexed neural code.
Naud, Richard; Sprekeler, Henning
2018-06-22
Many cortical neurons combine the information ascending and descending the cortical hierarchy. In the classical view, this information is combined nonlinearly to give rise to a single firing-rate output, which collapses all input streams into one. We analyze the extent to which neurons can simultaneously represent multiple input streams by using a code that distinguishes spike timing patterns at the level of a neural ensemble. Using computational simulations constrained by experimental data, we show that cortical neurons are well suited to generate such multiplexing. Interestingly, this neural code maximizes information for short and sparse bursts, a regime consistent with in vivo recordings. Neurons can also demultiplex this information, using specific connectivity patterns. The anatomy of the adult mammalian cortex suggests that these connectivity patterns are used by the nervous system to maintain sparse bursting and optimal multiplexing. Contrary to firing-rate coding, our findings indicate that the physiology and anatomy of the cortex may be interpreted as optimizing the transmission of multiple independent signals to different targets.
Planet-disc interactions with Discontinuous Galerkin Methods using GPUs
NASA Astrophysics Data System (ADS)
Velasco Romero, David A.; Veiga, Maria Han; Teyssier, Romain; Masset, Frédéric S.
2018-05-01
We present a two-dimensional Cartesian code based on high-order discontinuous Galerkin methods, implemented to run in parallel over multiple GPUs. A simple planet-disc setup is used to compare the behaviour of our code against that of the FARGO3D code with a polar mesh. We use the time dependence of the torque exerted by the disc on the planet as a means to quantify the numerical viscosity of the code. We find that the numerical viscosity of the Keplerian flow can be as low as a few \(10^{-8} r^2 \Omega\), where \(r\) and \(\Omega\) are respectively the local orbital radius and frequency, for fifth-order schemes and a resolution of \(\sim 10^{-2} r\). Although for a single-disc problem a solution of low numerical viscosity can be obtained at lower computational cost with FARGO3D (which is nearly an order of magnitude faster than a fifth-order method), discontinuous Galerkin methods appear promising for obtaining solutions of low numerical viscosity in more complex situations where the flow cannot be captured on a polar or spherical mesh concentric with the disc.
NASA Technical Reports Server (NTRS)
Katz, Daniel
2004-01-01
PVM Wrapper is a software library that makes it possible for code that utilizes the Parallel Virtual Machine (PVM) software library to run using the message-passing interface (MPI) software library, without needing to rewrite the entire code. PVM and MPI are the two most common software libraries used for applications that involve passing of messages among parallel computers. Since about 1996, MPI has been the de facto standard. Codes written when PVM was popular often feature patterns of {"initsend," "pack," "send"} and {"receive," "unpack"} calls. In many cases, these calls are not contiguous and one set of calls may even exist over multiple subroutines. These characteristics make it difficult to obtain equivalent functionality via a single MPI "send" call. Because PVM Wrapper is written to run with MPI-1.2, some PVM functions are not permitted and must be replaced, a task that requires some programming expertise. The "pvm_spawn" and "pvm_parent" function calls are not replaced, but a programmer can use "mpirun" and knowledge of the ranks of parent and child tasks with supplied macroinstructions to enable execution of codes that use "pvm_spawn" and "pvm_parent."
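A hedged illustration of the translation problem described above: a typical PVM initsend/pack/send sequence and the single MPI call carrying the same payload. The PVM calls shown are the real PVM 3 API, but the mapping is conceptual, not PVM Wrapper's actual internals:

```c
/* Sketch: sending n integers to a peer, PVM style vs. MPI style. */
#include <mpi.h>

void send_block(int dest, int tag, int *data, int n)
{
/*  PVM pattern (the three calls may be scattered across subroutines):
 *      pvm_initsend(PvmDataDefault);
 *      pvm_pkint(data, n, 1);
 *      pvm_send(dest_tid, tag);
 */
    /* MPI equivalent, possible only once the whole message is assembled
     * in one place, which is exactly why non-contiguous PVM call patterns
     * are hard to translate: */
    MPI_Send(data, n, MPI_INT, dest, tag, MPI_COMM_WORLD);
}
```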
High-speed reacting flow simulation using USA-series codes
NASA Astrophysics Data System (ADS)
Chakravarthy, S. R.; Palaniswamy, S.
In this paper, the finite-rate chemistry (FRC) formulation for the USA-series of codes and three sets of validations are presented. The USA-series computational fluid dynamics (CFD) codes are based on Unified Solution Algorithms, including explicit and implicit formulations, factorization and relaxation approaches, time-marching and space-marching methodologies, etc., in order to be able to solve a very wide class of CFD problems using a single framework. Euler or Navier-Stokes equations are solved using a finite-volume treatment with upwind Total Variation Diminishing discretization for the inviscid terms. Perfect and real gas options are available, including equilibrium and nonequilibrium chemistry. This capability has been widely used to study various problems including Space Shuttle exhaust plumes, National Aerospace Plane (NASP) designs, etc. (1) Numerical solutions are presented showing the full range of possible solutions to steady detonation wave problems. (2) A comparison between the solution obtained by the USA code and the Generalized Kinetics Analysis Program (GKAP) is shown for supersonic combustion in a duct. (3) Simulation of combustion in a supersonic shear layer is shown to have reasonable agreement with experimental observations.
New t-gap insertion-deletion-like metrics for DNA hybridization thermodynamic modeling.
D'yachkov, Arkadii G; Macula, Anthony J; Pogozelski, Wendy K; Renz, Thomas E; Rykov, Vyacheslav V; Torney, David C
2006-05-01
We discuss the concept of t-gap block isomorphic subsequences and use it to describe new abstract string metrics that are similar to the Levenshtein insertion-deletion metric. Some of the metrics that we define can be used to model a thermodynamic distance function on single-stranded DNA sequences. Our model captures a key aspect of the nearest neighbor thermodynamic model for hybridized DNA duplexes. One version of our metric gives the maximum number of stacked pairs of hydrogen-bonded nucleotide base pairs that can be present in any secondary structure in a hybridized DNA duplex without pseudoknots. Thermodynamic distance functions are important components in the construction of DNA codes, and DNA codes are important components in biomolecular computing, nanotechnology, and other biotechnical applications that employ DNA hybridization assays. We show how our new distances can be calculated by using a dynamic programming method, and we derive a Varshamov-Gilbert-like lower bound on the size of some of the codes using these distance functions as constraints.
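As a baseline, here is a minimal sketch of the classical insertion-deletion distance computed by dynamic programming; the t-gap block isomorphic metrics generalize this kind of recurrence, and nothing below reproduces the paper's actual algorithm:

```c
/* Sketch: indel-only (Levenshtein-like) edit distance between strings
 * s and t, using a full (m+1) x (n+1) DP table. */
#include <stdlib.h>
#include <string.h>

int indel_distance(const char *s, const char *t)
{
    size_t m = strlen(s), n = strlen(t);
    int *d = malloc((m + 1) * (n + 1) * sizeof *d);
    #define D(i, j) d[(i) * (n + 1) + (j)]

    for (size_t i = 0; i <= m; i++) D(i, 0) = (int)i;  /* delete all of s */
    for (size_t j = 0; j <= n; j++) D(0, j) = (int)j;  /* insert all of t */

    for (size_t i = 1; i <= m; i++)
        for (size_t j = 1; j <= n; j++) {
            int del = D(i - 1, j) + 1, ins = D(i, j - 1) + 1;
            D(i, j) = del < ins ? del : ins;
            if (s[i - 1] == t[j - 1] && D(i - 1, j - 1) < D(i, j))
                D(i, j) = D(i - 1, j - 1);   /* matching symbols cost nothing */
        }

    int result = D(m, n);
    free(d);
    return result;
    #undef D
}
```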
The implementation of an aeronautical CFD flow code onto distributed memory parallel systems
NASA Astrophysics Data System (ADS)
Ierotheou, C. S.; Forsey, C. R.; Leatham, M.
2000-04-01
The parallelization of an industrially important in-house computational fluid dynamics (CFD) code for calculating the airflow over complex aircraft configurations using the Euler or Navier-Stokes equations is presented. The code discussed is the flow solver module of the SAUNA CFD suite. This suite uses a novel grid system that may include block-structured hexahedral or pyramidal grids, unstructured tetrahedral grids, or a hybrid combination of both. To assist in the rapid convergence to a solution, a number of convergence acceleration techniques are employed, including implicit residual smoothing and a multigrid full approximation storage (FAS) scheme. Key features of the parallelization approach are the use of domain decomposition and encapsulated message passing to enable execution in parallel using a single programme multiple data (SPMD) paradigm. In the case where a hybrid grid is used, a unified grid partitioning scheme is employed to define the decomposition of the mesh. The parallel code has been tested using both structured and hybrid grids on a number of different distributed memory parallel systems and is now routinely used to perform industrial-scale aeronautical simulations.
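As an illustration of the encapsulated message-passing pattern described above, here is a minimal halo-exchange sketch for a one-dimensional domain decomposition; the names and the 1-D layout are our assumptions, whereas the SAUNA solver partitions hybrid three-dimensional grids:

```c
/* Sketch: each SPMD partition exchanges one layer of ghost cells with its
 * neighbours before a solver iteration. u[0] and u[nlocal+1] are ghost
 * cells; u[1..nlocal] is owned data. */
#include <mpi.h>

void exchange_halos(double *u, int nlocal, MPI_Comm comm)
{
    int rank, size;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* Send first owned cell left, receive right neighbour's into the
     * right ghost cell; then the mirror-image exchange. */
    MPI_Sendrecv(&u[1],          1, MPI_DOUBLE, left,  0,
                 &u[nlocal + 1], 1, MPI_DOUBLE, right, 0,
                 comm, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[nlocal],     1, MPI_DOUBLE, right, 1,
                 &u[0],          1, MPI_DOUBLE, left,  1,
                 comm, MPI_STATUS_IGNORE);
}
```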