Volume accumulator design analysis computer codes
NASA Technical Reports Server (NTRS)
Whitaker, W. D.; Shimazaki, T. T.
1973-01-01
The computer codes, VANEP and VANES, were written and used to aid in the design and performance calculation of the volume accumulator units (VAU) for the 5-kwe reactor thermoelectric system. VANEP computes the VAU design which meets the primary coolant loop VAU volume and pressure performance requirements. VANES computes the performance of the VAU design, determined from the VANEP code, at the conditions of the secondary coolant loop. The codes can also compute the performance characteristics of the VAU's under conditions of possible modes of failure which still permit continued system operation.
NASA Technical Reports Server (NTRS)
Logan, Terry G.
1994-01-01
The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Some performance results are obtained on the CM-5 with 32, 62, 128 nodes along with those on the Cray-YMP with a single processor. The comparison indicates that the parallel CM-FORTRAN code nearly matches or outperforms the equivalent serial FORTRAN code for some cases.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grebennikov, A.N.; Zhitnik, A.K.; Zvenigorodskaya, O.A.
1995-12-31
In conformity with the protocol of the Workshop under Contract "Assessment of RBMK reactor safety using modern Western Codes", VNIIEF performed a neutronics computation series to compare western and VNIIEF codes and assess whether VNIIEF codes are suitable for RBMK type reactor safety assessment computation. The work was carried out in close collaboration with M.I. Rozhdestvensky and L.M. Podlazov, NIKIET employees. The effort involved: (1) cell, polycell, and burnup computations with the WIMS code, the EKRAN code (an improved modification of the LOMA code), and the S-90 code (VNIIEF Monte Carlo); (2) 3D computation of static states with the KORAT-3D and NEU codes and comparison with results of computation with the NESTLE code (USA). The computations were performed in the geometry and using the neutron constants presented by the American party; (3) 3D computation of neutron kinetics with the KORAT-3D and NEU codes. These computations were performed in two formulations, both being developed in collaboration with NIKIET. The first problem is formulated to agree as closely as possible with one of the NESTLE problems and imitates gas bubble travel through a core. The second problem is a model of the RBMK as a whole with imitation of the movement of control and protection system (CPS) controls in a core.
Users manual and modeling improvements for axial turbine design and performance computer code TD2-2
NASA Technical Reports Server (NTRS)
Glassman, Arthur J.
1992-01-01
Computer code TD2 computes design point velocity diagrams and performance for multistage, multishaft, cooled or uncooled, axial flow turbines. This streamline analysis code was recently modified to upgrade modeling related to turbine cooling and to the internal loss correlation. These modifications are presented in this report along with descriptions of the code's expanded input and output. This report serves as the users manual for the upgraded code, which is named TD2-2.
Enhanced fault-tolerant quantum computing in d-level systems.
Campbell, Earl T
2014-12-05
Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.
The adaption and use of research codes for performance assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liebetrau, A.M.
1987-05-01
Models of real-world phenomena are developed for many reasons. The models are usually, if not always, implemented in the form of a computer code. The characteristics of a code are determined largely by its intended use. Realizations or implementations of detailed mathematical models of complex physical and/or chemical processes are often referred to as research or scientific (RS) codes. Research codes typically require large amounts of computing time. One example of an RS code is a finite-element code for solving complex systems of differential equations that describe mass transfer through some geologic medium. Considerable computing time is required because computations are done at many points in time and/or space. Codes used to evaluate the overall performance of real-world physical systems are called performance assessment (PA) codes. Performance assessment codes are used to conduct simulated experiments involving systems that cannot be directly observed. Thus, PA codes usually involve repeated simulations of system performance in situations that preclude the use of conventional experimental and statistical methods. 3 figs.
Cloud Computing for Complex Performance Codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin
This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate complex codes, on several differently configured servers, could run and compute trivial small scale problems in a commercial cloud infrastructure. Phase 2 focused on proving non-trivial large scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
Computational Predictions of the Performance of Wright 'Bent End' Propellers
NASA Technical Reports Server (NTRS)
Wang, Xiang-Yu; Ash, Robert L.; Bobbitt, Percy J.; Prior, Edwin (Technical Monitor)
2002-01-01
Computational analyses of two 1911 Wright brothers 'Bent End' wooden propeller reproductions have been performed and compared with experimental test results from the Langley Full Scale Wind Tunnel. The purpose of the analysis was to check the consistency of the experimental results and to validate the reliability of the tests. This report is one part of the project on the propeller performance research of the Wright 'Bent End' propellers, intended to document the Wright brothers' pioneering propeller design contributions. Two computer codes were used in the computational predictions. The FLO-MG Navier-Stokes code is a CFD (Computational Fluid Dynamics) code based on the Navier-Stokes equations. It is mainly used to compute the lift coefficient and the drag coefficient at specified angles of attack at different radii. Those calculated data are the intermediate results of the computation and a part of the necessary input for the Propeller Design Analysis Code (based on the Adkins and Liebeck method), which is a propeller design code used to compute the propeller thrust coefficient, the propeller power coefficient and the propeller propulsive efficiency.
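The three outputs named in this abstract are linked by a standard textbook relation: propulsive efficiency equals the advance ratio times the thrust coefficient divided by the power coefficient. The following is a minimal sketch of that relation only, not of the Propeller Design Analysis Code; the function name, the default air density, and all input values are illustrative assumptions.

```python
def propeller_coefficients(thrust, power, velocity, rev_per_s, diameter, rho=1.225):
    """Return (J, C_T, C_P, eta) from the standard non-dimensional definitions.

    thrust [N], power [W], velocity [m/s], rev_per_s [1/s], diameter [m],
    rho [kg/m^3]. All inputs here are hypothetical, not values from the report.
    """
    advance_ratio = velocity / (rev_per_s * diameter)          # J = V / (n D)
    c_t = thrust / (rho * rev_per_s**2 * diameter**4)          # C_T = T / (rho n^2 D^4)
    c_p = power / (rho * rev_per_s**3 * diameter**5)           # C_P = P / (rho n^3 D^5)
    efficiency = advance_ratio * c_t / c_p                     # eta = J C_T / C_P = T V / P
    return advance_ratio, c_t, c_p, efficiency


# Illustrative numbers only.
J, c_t, c_p, eta = propeller_coefficients(
    thrust=500.0, power=10000.0, velocity=15.0, rev_per_s=7.0, diameter=2.6
)
print(f"J={J:.3f}  C_T={c_t:.4f}  C_P={c_p:.4f}  eta={eta:.3f}")
```

Because the density, rotation rate, and diameter cancel, the efficiency reduces to thrust times flight speed divided by shaft power, which gives a quick sanity check on any computed coefficient pair.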
Fast H.264/AVC FRExt intra coding using belief propagation.
Milani, Simone
2011-01-01
In the H.264/AVC FRExt coder, the coding performance of Intra coding significantly outperforms previous still image coding standards, like JPEG2000, thanks to a massive use of spatial prediction. Unfortunately, the adoption of an extensive set of predictors induces a significant increase in the computational complexity required by the rate-distortion optimization routine. The paper presents a complexity reduction strategy that aims at reducing the computational load of Intra coding with a small loss in compression performance. The proposed algorithm relies on selecting a reduced set of prediction modes according to their probabilities, which are estimated by adopting a belief-propagation procedure. Experimental results show that the proposed method permits saving up to 60% of the coding time required by an exhaustive rate-distortion optimization method with a negligible loss in performance. Moreover, it permits an accurate control of the computational complexity, unlike other methods where the computational complexity depends upon the coded sequence.
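The idea of restricting the rate-distortion search to the most probable prediction modes can be illustrated with a small sketch. The probabilities are taken as given here (the paper estimates them by belief propagation, which is not reproduced), and the cumulative-coverage cutoff and the function name are assumptions made for illustration, not the paper's selection rule.

```python
def select_modes(mode_probs, coverage=0.9):
    """Keep the most probable modes until their total probability reaches `coverage`.

    mode_probs: dict mapping a mode index to its estimated probability.
    The coverage threshold is a hypothetical tuning knob, not from the paper.
    """
    ranked = sorted(mode_probs.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for mode, prob in ranked:
        selected.append(mode)
        cumulative += prob
        if cumulative >= coverage:
            break
    return selected


# Made-up probabilities for four candidate modes.
probs = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}
print(select_modes(probs, coverage=0.9))
```

Only the returned modes would then be passed to the expensive rate-distortion evaluation, so a lower coverage value trades compression performance for coding time.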
Global Magnetohydrodynamic Simulation Using High Performance FORTRAN on Parallel Computers
NASA Astrophysics Data System (ADS)
Ogino, T.
High Performance Fortran (HPF) is one of the modern and common techniques for achieving high-performance parallel computation. We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer. The MHD code was fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code could be shown to be almost comparable to that of VPP Fortran. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops with an efficiency of 76.5% on the VPP5000/56 in vector and parallel computation that permitted comparison with catalog values. We have concluded that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.
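The abstract quotes a speed of over 400 Gflops and an efficiency figure of 76.5 on the VPP5000/56. Reading that figure as a percentage of peak, and assuming 56 processing elements with a peak of 9.6 Gflops each (the per-element peak is an assumption, not stated in the abstract), the two numbers are consistent, as this short check suggests:

```python
def sustained_gflops(n_pe, peak_per_pe_gflops, efficiency):
    """Sustained speed = number of PEs x peak per PE x fraction of peak achieved."""
    return n_pe * peak_per_pe_gflops * efficiency


# Assumed machine parameters: 56 PEs at 9.6 Gflops peak each.
speed = sustained_gflops(n_pe=56, peak_per_pe_gflops=9.6, efficiency=0.765)
print(f"{speed:.1f} Gflops sustained")
```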
Design and optimization of a portable LQCD Monte Carlo code using OpenACC
NASA Astrophysics Data System (ADS)
Bonati, Claudio; Coscetti, Simone; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Calore, Enrico; Schifano, Sebastiano Fabio; Silvi, Giorgio; Tripiccione, Raffaele
The present panorama of HPC architectures is extremely heterogeneous, ranging from traditional multi-core CPU processors, supporting a wide class of applications but delivering moderate computing performance, to many-core Graphics Processor Units (GPUs), exploiting aggressive data-parallelism and delivering higher performance for streaming computing applications. In this scenario, code portability (and performance portability) becomes necessary for easy maintainability of applications; this is very relevant in scientific computing where code changes are very frequent, making it tedious and error-prone to keep different code versions aligned. In this work, we present the design and optimization of a state-of-the-art production-level LQCD Monte Carlo application, using the directive-based OpenACC programming model. OpenACC abstracts parallel programming to a descriptive level, relieving programmers from specifying how codes should be mapped onto the target architecture. We describe the implementation of a code fully written in OpenACC, and show that we are able to target several different architectures, including state-of-the-art traditional CPUs and GPUs, with the same code. We also measure performance, evaluating the computing efficiency of our OpenACC code on several architectures, comparing with GPU-specific implementations and showing that a good level of performance-portability can be reached.
Modeling Improvements and Users Manual for Axial-flow Turbine Off-design Computer Code AXOD
NASA Technical Reports Server (NTRS)
Glassman, Arthur J.
1994-01-01
An axial-flow turbine off-design performance computer code used for preliminary studies of gas turbine systems was modified and calibrated based on the experimental performance of large aircraft-type turbines. The flow- and loss-model modifications and calibrations are presented in this report. Comparisons are made between computed performances and experimental data for seven turbines over wide ranges of speed and pressure ratio. This report also serves as the users manual for the revised code, which is named AXOD.
Design geometry and design/off-design performance computer codes for compressors and turbines
NASA Technical Reports Server (NTRS)
Glassman, Arthur J.
1995-01-01
This report summarizes some NASA Lewis (i.e., government owned) computer codes capable of being used for airbreathing propulsion system studies to determine the design geometry and to predict the design/off-design performance of compressors and turbines. These are not CFD codes; velocity-diagram energy and continuity computations are performed fore and aft of the blade rows using meanline, spanline, or streamline analyses. Losses are provided by empirical methods. Both axial-flow and radial-flow configurations are included.
Performance measures for transform data coding.
NASA Technical Reports Server (NTRS)
Pearl, J.; Andrews, H. C.; Pratt, W. K.
1972-01-01
This paper develops performance criteria for evaluating transform data coding schemes under computational constraints. Computational constraints that conform with the proposed basis-restricted model give rise to suboptimal coding efficiency characterized by a rate-distortion relation R(D) similar in form to the theoretical rate-distortion function. Numerical examples of this performance measure are presented for Fourier, Walsh, Haar, and Karhunen-Loeve transforms.
Binary weight distributions of some Reed-Solomon codes
NASA Technical Reports Server (NTRS)
Pollara, F.; Arnold, S.
1992-01-01
The binary weight distributions of the (7,5) and (15,9) Reed-Solomon (RS) codes and their duals are computed using the MacWilliams identities. Several mappings of symbols to bits are considered and those offering the largest binary minimum distance are found. These results are then used to compute bounds on the soft-decoding performance of these codes in the presence of additive Gaussian noise. These bounds are useful for finding large binary block codes with good performance and for verifying the performance obtained by specific soft-coding algorithms presently under development.
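The MacWilliams identities give the weight distribution of a dual code directly from that of the original code. For a binary linear [n, k] code with weight distribution A_i, the dual has B_j = 2^(-k) * sum_i A_i K_j(i), where K_j(i) is a Krawtchouk polynomial. The sketch below applies this to the binary [7,4] Hamming code as a small stand-in; it does not reproduce the binary images of the Reed-Solomon codes studied in the paper, and the function names are illustrative.

```python
from math import comb


def krawtchouk(n, j, i):
    """Binary Krawtchouk polynomial K_j(i) for length n."""
    return sum((-1) ** s * comb(i, s) * comb(n - i, j - s) for s in range(j + 1))


def dual_weight_distribution(weights, k):
    """Weight distribution of the dual of a binary [n, k] code via MacWilliams.

    weights[i] is the number of codewords of Hamming weight i (length n + 1).
    """
    n = len(weights) - 1
    code_size = 2 ** k
    dual = []
    for j in range(n + 1):
        total = sum(weights[i] * krawtchouk(n, j, i) for i in range(n + 1))
        if total % code_size != 0:
            raise ValueError("input is not a valid weight distribution for this k")
        dual.append(total // code_size)
    return dual


# [7,4] Hamming code: 1 word of weight 0, 7 of weight 3, 7 of weight 4, 1 of weight 7.
hamming_weights = [1, 0, 0, 7, 7, 0, 0, 1]
print(dual_weight_distribution(hamming_weights, k=4))
```

The result is the distribution of the [7,3] simplex code: the zero word plus seven words of weight 4. Applying the function to that output with k=3 recovers the Hamming distribution, which is a convenient consistency check.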
Validation of the NCC Code for Staged Transverse Injection and Computations for a RBCC Combustor
NASA Technical Reports Server (NTRS)
Ajmani, Kumud; Liu, Nan-Suey
2005-01-01
The NCC code was validated for a case involving staged transverse injection into Mach 2 flow behind a rearward facing step. Comparisons were made with experimental data and with solutions from the FPVortex code. The NCC code was then used to perform computations to study fuel-air mixing for the combustor of a candidate rocket based combined cycle engine geometry. Comparisons with a one-dimensional analysis and a three-dimensional code (VULCAN) were performed to assess the qualitative and quantitative performance of the NCC solver.
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2011 CFR
2011-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2012 CFR
2012-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2014 CFR
2014-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2010 CFR
2010-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar data produced for...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eslinger, Paul W.; Aaberg, Rosanne L.; Lopresti, Charles A.
2004-09-14
This document contains detailed user instructions for a suite of utility codes developed for Rev. 1 of the Systems Assessment Capability. The suite of computer codes for Rev. 1 of Systems Assessment Capability performs many functions.
On the error statistics of Viterbi decoding and the performance of concatenated codes
NASA Technical Reports Server (NTRS)
Miller, R. L.; Deutsch, L. J.; Butman, S. A.
1981-01-01
Computer simulation results are presented on the performance of convolutional codes of constraint lengths 7 and 10 concatenated with the (255, 223) Reed-Solomon code (a proposed NASA standard). These results indicate that as much as 0.8 dB can be gained by concatenating this Reed-Solomon code with a (10, 1/3) convolutional code, instead of the (7, 1/2) code currently used by the DSN. A mathematical model of Viterbi decoder burst-error statistics is developed and is validated through additional computer simulations.
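The coding gain quoted above comes at the cost of a lower overall code rate, since the rate of a concatenated scheme is the product of the outer and inner code rates. A small sketch of that arithmetic for the two configurations named in the abstract (the helper name is illustrative):

```python
from fractions import Fraction


def concatenated_rate(outer_n, outer_k, inner_rate):
    """Overall rate of an (outer_n, outer_k) outer code with a rate-`inner_rate` inner code."""
    return Fraction(outer_k, outer_n) * inner_rate


rate_7_half = concatenated_rate(255, 223, Fraction(1, 2))    # (7, 1/2) inner code
rate_10_third = concatenated_rate(255, 223, Fraction(1, 3))  # (10, 1/3) inner code
print(rate_7_half, float(rate_7_half))
print(rate_10_third, float(rate_10_third))
```

The (10, 1/3) configuration therefore transmits about 1.5 times as many channel symbols per information bit as the (7, 1/2) configuration.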
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2013 CFR
2013-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... funds; (ii) Studies, analyses, test data, or similar data produced for this contract, when the study...
NASA Astrophysics Data System (ADS)
Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu
2015-07-01
The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Starting from TOUGHREACT, a well-established code for simulating subsurface multi-phase flow and reactive transport problems, we developed a high-performance computing code, THC-MP, for massively parallel computers, which greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structure and implemented the data initialization and exchange between the computing nodes and the core solving module using a hybrid parallel iterative and direct solver. Numerical accuracy of THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from parallel computing and sequential computing (original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results demonstrate the enhanced performance of THC-MP on parallel computing facilities.
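The domain decomposition and data-exchange steps described above, and the verification against a sequential run, can be illustrated on a toy problem. The sketch below is a one-dimensional smoothing step split into subdomains with one ghost (halo) cell per side; it is an assumption-laden illustration of the general technique, not the THC-MP scheme, and it runs the subdomains in a loop rather than on separate nodes.

```python
def jacobi_step(u):
    """One smoothing step on a 1D grid; the two end values are held fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def decomposed_step(u, n_parts):
    """Same step, computed subdomain by subdomain with one-cell halos."""
    n = len(u)
    bounds = [(p * n // n_parts, (p + 1) * n // n_parts) for p in range(n_parts)]
    result = u[:]
    for lo, hi in bounds:
        # "Halo exchange": copy one neighbouring cell on each side into the local array.
        g_lo, g_hi = max(lo - 1, 0), min(hi + 1, n)
        local = jacobi_step(u[g_lo:g_hi])
        # Write back only the cells this subdomain owns, skipping the global boundary.
        for i in range(max(lo, 1), min(hi, n - 1)):
            result[i] = local[i - g_lo]
    return result


grid = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
# Verification in the spirit of the abstract: decomposed result equals the sequential one.
print(decomposed_step(grid, n_parts=3) == jacobi_step(grid))
```

In a distributed implementation the halo copy becomes a message between neighbouring nodes, and the comparison against the sequential result is the kind of accuracy check the abstract reports.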
Navier-Stokes and Comprehensive Analysis Performance Predictions of the NREL Phase VI Experiment
NASA Technical Reports Server (NTRS)
Duque, Earl P. N.; Burklund, Michael D.; Johnson, Wayne
2003-01-01
A vortex lattice code, CAMRAD II, and a Reynolds-Averaged Navier-Stokes code, OVERFLOW-D2, were used to predict the aerodynamic performance of a two-bladed horizontal axis wind turbine. All computations were compared with experimental data that was collected at the NASA Ames Research Center 80- by 120-Foot Wind Tunnel. Computations were performed for both axial and yawed operating conditions. Various stall delay models and dynamic stall models were used by the CAMRAD II code. Comparisons between the experimental data and computed aerodynamic loads show that the OVERFLOW-D2 code can accurately predict the power and spanwise loading of a wind turbine rotor.
A MATLAB based 3D modeling and inversion code for MT data
NASA Astrophysics Data System (ADS)
Singh, Arun; Dehiya, Rahul; Gupta, Pravin K.; Israil, M.
2017-07-01
The development of a MATLAB based computer code, AP3DMT, for modeling and inversion of 3D Magnetotelluric (MT) data is presented. The code comprises two independent components: grid generator code and modeling/inversion code. The grid generator code performs model discretization and acts as an interface by generating various I/O files. The inversion code performs core computations in modular form - forward modeling, data functionals, sensitivity computations and regularization. These modules can be readily extended to other similar inverse problems like Controlled-Source EM (CSEM). The modular structure of the code provides a framework useful for implementation of new applications and inversion algorithms. The use of MATLAB and its libraries makes it more compact and user friendly. The code has been validated on several published models. To demonstrate its versatility and capabilities the results of inversion for two complex models are presented.
NASA Technical Reports Server (NTRS)
Weed, Richard Allen; Sankar, L. N.
1994-01-01
An increasing amount of research activity in computational fluid dynamics has been devoted to the development of efficient algorithms for parallel computing systems. The increasing performance to price ratio of engineering workstations has led to research to develop procedures for implementing a parallel computing system composed of distributed workstations. This thesis proposal outlines an ongoing research program to develop efficient strategies for performing three-dimensional flow analysis on distributed computing systems. The PVM parallel programming interface was used to modify an existing three-dimensional flow solver, the TEAM code developed by Lockheed for the Air Force, to function as a parallel flow solver on clusters of workstations. Steady flow solutions were generated for three different wing and body geometries to validate the code and evaluate code performance. The proposed research will extend the parallel code development to determine the most efficient strategies for unsteady flow simulations.
NASA Technical Reports Server (NTRS)
Fishbach, L. H.
1979-01-01
The computational techniques utilized to determine the optimum propulsion systems for future aircraft applications and to identify system tradeoffs and technology requirements are described. The characteristics and use of the following computer codes are discussed: (1) NNEP - a very general cycle analysis code that can assemble an arbitrary matrix of fans, turbines, ducts, shafts, etc., into a complete gas turbine engine and compute on- and off-design thermodynamic performance; (2) WATE - a preliminary design procedure for calculating engine weight using the component characteristics determined by NNEP; (3) POD DRG - a table look-up program to calculate wave and friction drag of nacelles; (4) LIFCYC - a computer code developed to calculate life cycle costs of engines based on the output from WATE; and (5) INSTAL - a computer code developed to calculate installation effects, inlet performance and inlet weight. Examples are given to illustrate how these computer techniques can be applied to analyze and optimize propulsion system fuel consumption, weight, and cost for representative types of aircraft and missions.
Developments in REDES: The rocket engine design expert system
NASA Technical Reports Server (NTRS)
Davidian, Kenneth O.
1990-01-01
The Rocket Engine Design Expert System (REDES) is being developed at NASA-Lewis to collect, automate, and perpetuate the existing expertise of performing a comprehensive rocket engine analysis and design. Currently, REDES uses the rigorous JANNAF methodology to analyze the performance of the thrust chamber and perform computational studies of liquid rocket engine problems. The following computer codes were included in REDES: a gas properties program named GASP, a nozzle design program named RAO, a regenerative cooling channel performance evaluation code named RTE, and the JANNAF standard liquid rocket engine performance prediction code TDK (including performance evaluation modules ODE, ODK, TDE, TDK, and BLM). Computational analyses are being conducted by REDES to provide solutions to liquid rocket engine thrust chamber problems. REDES is built in the Knowledge Engineering Environment (KEE) expert system shell and runs on a Sun 4/110 computer.
Developments in REDES: The Rocket Engine Design Expert System
NASA Technical Reports Server (NTRS)
Davidian, Kenneth O.
1990-01-01
The Rocket Engine Design Expert System (REDES) was developed at NASA-Lewis to collect, automate, and perpetuate the existing expertise of performing a comprehensive rocket engine analysis and design. Currently, REDES uses the rigorous JANNAF methodology to analyze the performance of the thrust chamber and perform computational studies of liquid rocket engine problems. The following computer codes were included in REDES: a gas properties program named GASP; a nozzle design program named RAO; a regenerative cooling channel performance evaluation code named RTE; and the JANNAF standard liquid rocket engine performance prediction code TDK (including performance evaluation modules ODE, ODK, TDE, TDK, and BLM). Computational analyses are being conducted by REDES to provide solutions to liquid rocket engine thrust chamber problems. REDES was built in the Knowledge Engineering Environment (KEE) expert system shell and runs on a Sun 4/110 computer.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Evans, Thomas; Hamilton, Steven; Slattery, Stuart
Profugus is an open-source mini-application (mini-app) for radiation transport and reactor applications. It contains the fundamental computational kernels used in the Exnihilo code suite from Oak Ridge National Laboratory. However, Exnihilo is production code with a substantial user base. Furthermore, Exnihilo is export controlled. This makes collaboration with computer scientists and computer engineers difficult. Profugus is designed to bridge that gap. By encapsulating the core numerical algorithms in an abbreviated code base that is open-source, computer scientists can analyze the algorithms and easily make code-architectural changes to test performance without compromising the production code values of Exnihilo. Profugus is not meant to be production software with respect to problem analysis. The computational kernels in Profugus are designed to analyze performance, not correctness. Nonetheless, users of Profugus can set up and run problems with enough real-world features to be useful as proof-of-concept for actual production work.
Selection of a computer code for Hanford low-level waste engineered-system performance assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
McGrail, B.P.; Mahoney, L.A.
Planned performance assessments for the proposed disposal of low-level waste (LLW) glass produced from remediation of wastes stored in underground tanks at Hanford, Washington will require calculations of radionuclide release rates from the subsurface disposal facility. These calculations will be done with the aid of computer codes. Currently available computer codes were ranked in terms of the feature sets implemented in the code that match a set of physical, chemical, numerical, and functional capabilities needed to assess release rates from the engineered system. The needed capabilities were identified from an analysis of the important physical and chemical processes expected to affect LLW glass corrosion and the mobility of radionuclides. The highest ranked computer code was found to be the ARES-CT code developed at PNL for the US Department of Energy for evaluation of arid land disposal sites.
PerSEUS: Ultra-Low-Power High Performance Computing for Plasma Simulations
NASA Astrophysics Data System (ADS)
Doxas, I.; Andreou, A.; Lyon, J.; Angelopoulos, V.; Lu, S.; Pritchett, P. L.
2017-12-01
Peta-op SupErcomputing Unconventional System (PerSEUS) aims to explore the use for High Performance Scientific Computing (HPC) of ultra-low-power mixed signal unconventional computational elements developed by Johns Hopkins University (JHU), and demonstrate that capability on both fluid and particle Plasma codes. We will describe the JHU Mixed-signal Unconventional Supercomputing Elements (MUSE), and report initial results for the Lyon-Fedder-Mobarry (LFM) global magnetospheric MHD code, and a UCLA general purpose relativistic Particle-In-Cell (PIC) code.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hall, D.G.; Watkins, J.C.
This report documents an evaluation of the TRAC-PF1/MOD1 reactor safety analysis computer code during computer simulations of feedwater line break transients. The experimental data base for the evaluation included the results of three bottom feedwater line break tests performed in the Semiscale Mod-2C test facility. The tests modeled 14.3% (S-FS-7), 50% (S-FS-11), and 100% (S-FS-6B) breaks. The test facility and the TRAC-PF1/MOD1 model used in the calculations are described. Evaluations of the accuracy of the calculations are presented in the form of comparisons of measured and calculated histories of selected parameters associated with the primary and secondary systems. In addition to evaluating the accuracy of the code calculations, the computational performance of the code during the simulations was assessed. A conclusion was reached that the code is capable of making feedwater line break transient calculations efficiently, but there is room for significant improvements in the simulations that were performed. Recommendations are made for follow-on investigations to determine how to improve future feedwater line break calculations and for code improvements to make the code easier to use.
Los Alamos radiation transport code system on desktop computing platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Briesmeister, J.F.; Brinkley, F.W.; Clark, B.A.
The Los Alamos Radiation Transport Code System (LARTCS) consists of state-of-the-art Monte Carlo and discrete ordinates transport codes and data libraries. These codes were originally developed many years ago and have undergone continual improvement. With a large initial effort and continued vigilance, the codes are easily portable from one type of hardware to another. The performance of scientific workstations (SWS) has evolved to the point that such platforms can be used routinely to perform sophisticated radiation transport calculations. As personal computer (PC) performance approaches that of the SWS, the hardware options for desktop radiation transport calculations expand considerably. The current status of the radiation transport codes within the LARTCS is described: MCNP, SABRINA, LAHET, ONEDANT, TWODANT, TWOHEX, and ONELD. Specifically, the authors discuss hardware systems on which the codes run and present code performance comparisons for various machines.
Performance assessment of KORAT-3D on the ANL IBM-SP computer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alexeyev, A.V.; Zvenigorodskaya, O.A.; Shagaliev, R.M.
1999-09-01
The TENAR code is currently being developed at the Russian Federal Nuclear Center (VNIIEF) as a coupled dynamics code for the simulation of transients in VVER and RBMK systems and other nuclear systems. The neutronic module in this code system is KORAT-3D. This module is also one of the most computationally intensive components of the code system. A parallel version of KORAT-3D has been implemented to achieve the goal of obtaining transient solutions in reasonable computational time, particularly for RBMK calculations that involve the application of >100,000 nodes. An evaluation of the KORAT-3D code performance was recently undertaken on the Argonne National Laboratory (ANL) IBM ScalablePower (SP) parallel computer located in the Mathematics and Computer Science Division of ANL. At the time of the study, the ANL IBM-SP computer had 80 processors. This study was conducted under the auspices of a technical staff exchange program sponsored by the International Nuclear Safety Center (INSC).
Airfoil Vibration Dampers program
NASA Technical Reports Server (NTRS)
Cook, Robert M.
1991-01-01
The Airfoil Vibration Damper program has consisted of an analysis phase and a testing phase. During the analysis phase, a state-of-the-art computer code was developed, which can be used to guide designers in the placement and sizing of friction dampers. The use of this computer code was demonstrated by performing representative analyses on turbine blades from the High Pressure Oxidizer Turbopump (HPOTP) and High Pressure Fuel Turbopump (HPFTP) of the Space Shuttle Main Engine (SSME). The testing phase of the program consisted of performing friction damping tests on two different cantilever beams. Data from these tests provided an empirical check on the accuracy of the computer code developed in the analysis phase. Results of the analysis and testing showed that the computer code can accurately predict the performance of friction dampers. In addition, a valuable set of friction damping data was generated, which can be used to aid in the design of friction dampers, as well as provide benchmark test cases for future code developers.
Multidisciplinary High-Fidelity Analysis and Optimization of Aerospace Vehicles. Part 1; Formulation
NASA Technical Reports Server (NTRS)
Walsh, J. L.; Townsend, J. C.; Salas, A. O.; Samareh, J. A.; Mukhopadhyay, V.; Barthelemy, J.-F.
2000-01-01
An objective of the High Performance Computing and Communication Program at the NASA Langley Research Center is to demonstrate multidisciplinary shape and sizing optimization of a complete aerospace vehicle configuration by using high-fidelity, finite element structural analysis and computational fluid dynamics aerodynamic analysis in a distributed, heterogeneous computing environment that includes high performance parallel computing. A software system has been designed and implemented to integrate a set of existing discipline analysis codes, some of them computationally intensive, into a distributed computational environment for the design of a high-speed civil transport configuration. The paper describes the engineering aspects of formulating the optimization by integrating these analysis codes and associated interface codes into the system. The discipline codes are integrated by using the Java programming language and a Common Object Request Broker Architecture (CORBA) compliant software product. A companion paper presents currently available results.
NASA Technical Reports Server (NTRS)
Walowit, Jed A.; Shapiro, Wilbur
2005-01-01
The SPIRALI code predicts the performance characteristics of incompressible cylindrical and face seals with or without the inclusion of spiral grooves. Performance characteristics include load capacity (for face seals), leakage flow, power requirements and dynamic characteristics in the form of stiffness, damping and apparent mass coefficients in 4 degrees of freedom for cylindrical seals and 3 degrees of freedom for face seals. These performance characteristics are computed as functions of seal and groove geometry, load or film thickness, running and disturbance speeds, fluid viscosity, and boundary pressures. A derivation of the equations governing the performance of turbulent, incompressible, spiral groove cylindrical and face seals along with a description of their solution is given. The computer codes are described, including an input description, sample cases, and comparisons with results of other codes.
NASA Astrophysics Data System (ADS)
Moon, Hongsik
What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared by their performance on benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced.
This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
NASA Technical Reports Server (NTRS)
Capo, M. A.; Disney, R. K.
1971-01-01
The work performed in the following areas is summarized: (1) A realistic nuclear-propelled vehicle was analyzed using the Marshall Space Flight Center computer code package. This code package includes one and two dimensional discrete ordinate transport, point kernel, and single scatter techniques, as well as cross section preparation and data processing codes. (2) Techniques were developed to improve the automated data transfer in the coupled computation method of the computer code package and improve the utilization of this code package on the Univac-1108 computer system. (3) The MSFC master data libraries were updated.
NASA Technical Reports Server (NTRS)
Chan, J. S.; Freeman, J. A.
1984-01-01
The viscous, axisymmetric flow in the thrust chamber of the space shuttle main engine (SSME) was computed on the CRAY 205 computer using the general interpolants method (GIM) code. Results show that the Navier-Stokes codes can be used for these flows to study trends and viscous effects as well as determine flow patterns; but further research and development is needed before they can be used as production tools for nozzle performance calculations. The GIM formulation, numerical scheme, and computer code are described. The actual SSME nozzle computation showing grid points, flow contours, and flow parameter plots is discussed. The computer system and run times/costs are detailed.
Program optimizations: The interplay between power, performance, and energy
Leon, Edgar A.; Karlin, Ian; Grant, Ryan E.; ...
2016-05-16
Practical considerations for future supercomputer designs will impose limits on both instantaneous power consumption and total energy consumption. Working within these constraints while providing the maximum possible performance, application developers will need to optimize their code for speed alongside power and energy concerns. This paper analyzes the effectiveness of several code optimizations including loop fusion, data structure transformations, and global allocations. A per component measurement and analysis of different architectures is performed, enabling the examination of code optimizations on different compute subsystems. Using an explicit hydrodynamics proxy application from the U.S. Department of Energy, LULESH, we show how code optimizations impact different computational phases of the simulation. This provides insight for simulation developers into the best optimizations to use during particular simulation compute phases when optimizing code for future supercomputing platforms. Here, we examine and contrast both x86 and Blue Gene architectures with respect to these optimizations.
Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D; Volz, Kerstin
2017-06-01
We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space. Copyright © 2017 Elsevier B.V. All rights reserved.
Computer codes for thermal analysis of a solid rocket motor nozzle
NASA Technical Reports Server (NTRS)
Chauhan, Rajinder Singh
1988-01-01
A number of computer codes are available for performing thermal analysis of solid rocket motor nozzles. The Aerotherm Chemical Equilibrium (ACE) computer program can be used to perform one-dimensional gas expansion to determine the state of the gas at each location of a nozzle. The ACE outputs can be used as input to a computer program called Momentum/Energy Integral Technique (MEIT) for predicting boundary layer development, shear, and heating on the surface of the nozzle. The output from MEIT can be used as input to another computer program called Aerotherm Charring Material Thermal Response and Ablation Program (CMA). This program is used to calculate the ablation or decomposition response of the nozzle material. A code called Failure Analysis Nonlinear Thermal and Structural Integrated Code (FANTASTIC) is also likely to be used for performing thermal analysis of solid rocket motor nozzles after the program is duly verified. A part of the verification work on FANTASTIC was done by using one- and two-dimensional heat transfer examples with known answers. An attempt was made to prepare input for performing thermal analysis of the CCT nozzle using the FANTASTIC computer code. The CCT nozzle problem will first be solved by using ACE, MEIT, and CMA. The same problem will then be solved using FANTASTIC. These results will then be compared for verification of FANTASTIC.
An emulator for minimizing computer resources for finite element analysis
NASA Technical Reports Server (NTRS)
Melosh, R.; Utku, S.; Islam, M.; Salama, M.
1984-01-01
A computer code, SCOPE, has been developed for predicting the computer resources required for a given analysis code, computer hardware, and structural problem. The cost of running the code is a small fraction (about 3 percent) of the cost of performing the actual analysis. However, its accuracy in predicting the CPU and I/O resources depends intrinsically on the accuracy of calibration data that must be developed once for the computer hardware and the finite element analysis code of interest. Testing of the SCOPE code on the AMDAHL 470 V/8 computer and the ELAS finite element analysis program indicated small I/O errors (3.2 percent), larger CPU errors (17.8 percent), and negligible total errors (1.5 percent).
NASA Technical Reports Server (NTRS)
Hanebutte, Ulf R.; Joslin, Ronald D.; Zubair, Mohammad
1994-01-01
The implementation and the performance of a parallel spatial direct numerical simulation (PSDNS) code are reported for the IBM SP1 supercomputer. The spatially evolving disturbances that are associated with laminar-to-turbulent transition in three-dimensional boundary-layer flows are computed with the PSDNS code. By remapping the distributed data structure during the course of the calculation, optimized serial library routines can be utilized that substantially increase the computational performance. Although the remapping incurs a high communication penalty, the parallel efficiency of the code remains above 40% for all performed calculations. By using appropriate compile options and optimized library routines, the serial code achieves 52-56 Mflops on a single node of the SP1 (45% of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a 'real world' simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP for the same simulation. The scalability information provides estimated computational costs that match the actual costs relative to changes in the number of grid points.
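The two figures of merit quoted in the abstract above, parallel efficiency and fraction of theoretical peak, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the timings are hypothetical, and the 125 Mflops per-node peak is an assumption chosen for illustration (the abstract states only the 45% figure).

```python
def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Speedup (t_serial / t_parallel) divided by the processor count."""
    return (t_serial / t_parallel) / n_procs


def fraction_of_peak(sustained_mflops, peak_mflops):
    """Sustained floating-point rate as a fraction of the theoretical peak."""
    return sustained_mflops / peak_mflops


# Hypothetical timings (not from the report): 100 s serial, 25 s on 8 nodes.
eff = parallel_efficiency(100.0, 25.0, 8)

# Assumed per-node peak of 125 Mflops (an assumption, not stated in the abstract).
frac = fraction_of_peak(56.0, 125.0)

print(f"parallel efficiency: {eff:.2f}")
print(f"fraction of peak:    {frac:.3f}")
```

With these placeholder numbers the speedup is 4 on 8 nodes, so the efficiency is 0.5; a code that stays "above 40%" in this sense delivers at least 0.4 of the ideal speedup.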
Unaligned instruction relocation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bertolli, Carlo; O'Brien, John K.; Sallenave, Olivier H.
In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
Unaligned instruction relocation
Bertolli, Carlo; O'Brien, John K.; Sallenave, Olivier H.; Sura, Zehra N.
2018-01-23
In one embodiment, a computer-implemented method includes receiving source code to be compiled into an executable file for an unaligned instruction set architecture (ISA). Aligned assembled code is generated, by a computer processor. The aligned assembled code complies with an aligned ISA and includes aligned processor code for a processor and aligned accelerator code for an accelerator. A first linking pass is performed on the aligned assembled code, including relocating a first relocation target in the aligned accelerator code that refers to a first object outside the aligned accelerator code. Unaligned assembled code is generated in accordance with the unaligned ISA and includes unaligned accelerator code for the accelerator and unaligned processor code for the processor. A second linking pass is performed on the unaligned assembled code, including relocating a second relocation target outside the unaligned accelerator code that refers to an object in the unaligned accelerator code.
Extension, validation and application of the NASCAP code
NASA Technical Reports Server (NTRS)
Katz, I.; Cassidy, J. J., III; Mandell, M. J.; Schnuelle, G. W.; Steen, P. G.; Parks, D. E.; Rotenberg, M.; Alexander, J. H.
1979-01-01
Numerous extensions were made to the NASCAP code. They fall into three categories: a greater range of definable objects, a more sophisticated computational model, and simplified code structure and usage. An important validation of NASCAP was performed using a new two dimensional computer code (TWOD). An interactive code (MATCHG) was written to compare material parameter inputs with charging results. The first major application of NASCAP was performed on the SCATHA satellite. Shadowing and charging calculations were completed. NASCAP was installed at the Air Force Geophysics Laboratory, where researchers plan to use it to interpret SCATHA data.
Multi-Zone Liquid Thrust Chamber Performance Code with Domain Decomposition for Parallel Processing
NASA Technical Reports Server (NTRS)
Navaz, Homayun K.
2002-01-01
Computational Fluid Dynamics (CFD) has considerably evolved in the last decade. There are many computer programs that can perform computations on viscous internal or external flows with chemical reactions. CFD has become a commonly used tool in the design and analysis of gas turbines, ramjet combustors, turbo-machinery, inlet ducts, rocket engines, jet interaction, missile, and ramjet nozzles. One of the problems of interest to NASA has always been the performance prediction for rocket and air-breathing engines. Due to the complexity of flow in these engines it is necessary to resolve the flowfield into a fine mesh to capture quantities like turbulence and heat transfer. However, calculation on a high-resolution grid requires a prohibitively large computational time that can downgrade the value of CFD for practical engineering calculations. The Liquid Thrust Chamber Performance (LTCP) code was developed for NASA/MSFC (Marshall Space Flight Center) to perform liquid rocket engine performance calculations. This code is a 2D/axisymmetric full Navier-Stokes (NS) solver with fully coupled finite rate chemistry and Eulerian treatment of liquid fuel and/or oxidizer droplets. One of the advantages of this code has been the resemblance of its input file to the JANNAF (Joint Army Navy NASA Air Force Interagency Propulsion Committee) standard TDK code, and its automatic grid generation for JANNAF defined combustion chamber wall geometry. These options minimize the learning effort for TDK users, and make the code a good candidate for performing engineering calculations. Although the LTCP code was developed for liquid rocket engines, it is a general-purpose code and has been used for solving many engineering problems. However, the single zone formulation of the LTCP has limited the code's applicability to problems with complex geometry.
Furthermore, the computational time becomes prohibitively large for high-resolution problems with chemistry, two-equation turbulence model, and two-phase flow. To overcome these limitations, the LTCP code is rewritten to include the multi-zone capability with domain decomposition that makes it suitable for parallel processing, i.e., enabling the code to run every zone or sub-domain on a separate processor. This can reduce the run time by a factor of 6 to 8, depending on the problem.
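As a rough way to reason about the run-time reduction quoted above, the sketch below applies Amdahl's law to a solver whose zones run on separate processors. This is a generic textbook model, not the LTCP authors' analysis; the parallel fractions and the processor count are illustrative assumptions, and the model ignores inter-zone communication cost.

```python
def amdahl_speedup(parallel_fraction, n_procs):
    """Amdahl's law: ideal speedup when only part of the work parallelizes.

    parallel_fraction: share of serial run time spent in work that can be
                       split across zones (0.0 to 1.0)
    n_procs:           number of processors, one zone per processor
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_procs)


# Illustrative values only: 8 zones on 8 processors.
for p in (0.95, 0.98, 1.00):
    print(f"parallel fraction {p:.2f}: speedup {amdahl_speedup(p, 8):.2f}")
```

Under these assumptions, even a few percent of serial work keeps an eight-processor run noticeably below an eightfold speedup, which is one reason a reported speedup can depend on the problem.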
NASA Technical Reports Server (NTRS)
Harper, Warren
1989-01-01
Two electromagnetic scattering codes, NEC-BSC and ESP3, were delivered and installed on a NASA VAX computer for use by Marshall Space Flight Center antenna design personnel. The tasks were to update the existing codes and certain supplementary software, install the codes on a computer that will be delivered to the customer, provide capability for graphic display of the data computed by the codes, and assist the customer in the solution of specific problems that demonstrate the use of the codes. With the exception of one code revision, all of these tasks were performed.
Error Suppression for Hamiltonian-Based Quantum Computation Using Subsystem Codes
NASA Astrophysics Data System (ADS)
Marvian, Milad; Lidar, Daniel A.
2017-01-01
We present general conditions for quantum error suppression for Hamiltonian-based quantum computation using subsystem codes. This involves encoding the Hamiltonian performing the computation using an error detecting subsystem code and the addition of a penalty term that commutes with the encoded Hamiltonian. The scheme is general and includes the stabilizer formalism of both subspace and subsystem codes as special cases. We derive performance bounds and show that complete error suppression results in the large penalty limit. To illustrate the power of subsystem-based error suppression, we introduce fully two-local constructions for protection against local errors of the swap gate of adiabatic gate teleportation and the Ising chain in a transverse field.
Error Suppression for Hamiltonian-Based Quantum Computation Using Subsystem Codes.
Marvian, Milad; Lidar, Daniel A
2017-01-20
We present general conditions for quantum error suppression for Hamiltonian-based quantum computation using subsystem codes. This involves encoding the Hamiltonian performing the computation using an error detecting subsystem code and the addition of a penalty term that commutes with the encoded Hamiltonian. The scheme is general and includes the stabilizer formalism of both subspace and subsystem codes as special cases. We derive performance bounds and show that complete error suppression results in the large penalty limit. To illustrate the power of subsystem-based error suppression, we introduce fully two-local constructions for protection against local errors of the swap gate of adiabatic gate teleportation and the Ising chain in a transverse field.
NASA Technical Reports Server (NTRS)
Tsuchiya, T.; Murthy, S. N. B.
1982-01-01
A computer code is presented for the prediction of off-design axial flow compressor performance with water ingestion. Four processes were considered to account for the aero-thermo-mechanical interactions during operation with air-water droplet mixture flow: (1) blade performance change, (2) centrifuging of water droplets, (3) heat and mass transfer process between the gaseous and the liquid phases and (4) droplet size redistribution due to break-up. Stage and compressor performance are obtained by a stage stacking procedure using representative velocity diagrams at the rotor inlet and outlet mean radii. The Code has options for performance estimation with (1) mixtures of gas and (2) gas-water droplet mixtures, and therefore can take into account the humidity present in ambient conditions. A test case illustrates the method of using the Code. The Code follows closely the methodology and architecture of the NASA-STGSTK Code for the estimation of axial-flow compressor performance with air flow.
Lattice surgery on the Raussendorf lattice
NASA Astrophysics Data System (ADS)
Herr, Daniel; Paler, Alexandru; Devitt, Simon J.; Nori, Franco
2018-07-01
Lattice surgery is a method to perform quantum computation fault-tolerantly by using operations on boundary qubits between different patches of the planar code. This technique allows for universal planar code computation without eliminating the intrinsic two-dimensional nearest-neighbor properties of the surface code that eases physical hardware implementations. Lattice surgery approaches to algorithmic compilation and optimization have been demonstrated to be more resource efficient for resource-intensive components of a fault-tolerant algorithm, and consequently may be preferable over braid-based logic. Lattice surgery can be extended to the Raussendorf lattice, providing a measurement-based approach to the surface code. In this paper we describe how lattice surgery can be performed on the Raussendorf lattice and therefore give a viable alternative to computation using braiding in measurement-based implementations of topological codes.
Predicting the Performance of an Axial-Flow Compressor
NASA Technical Reports Server (NTRS)
Steinke, R. J.
1986-01-01
Stage-stacking computer code (STGSTK) developed for predicting off-design performance of multi-stage axial-flow compressors. Code uses meanline stage-stacking method. Stage and cumulative compressor performance calculated from representative meanline velocity diagrams located at rotor inlet and outlet meanline radii. Numerous options available within code. Code developed so users can modify correlations to suit their needs.
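The bookkeeping behind "cumulative compressor performance" can be sketched as below. This is not the STGSTK algorithm, which derives stage performance from meanline velocity diagrams and correlations; here each stage's pressure ratio and adiabatic efficiency are simply given as hypothetical inputs, a constant specific-heat ratio is assumed, and the stages are stacked using standard ideal-gas relations.

```python
GAMMA = 1.4  # assumed constant ratio of specific heats for air


def stack_stages(stages, gamma=GAMMA):
    """Accumulate stage performance into overall compressor performance.

    stages: list of (pressure_ratio, adiabatic_efficiency) per stage
            (hypothetical inputs for illustration)
    Returns (overall pressure ratio, overall temperature ratio,
             overall adiabatic efficiency).
    """
    exponent = (gamma - 1.0) / gamma
    overall_pr = 1.0
    overall_tr = 1.0
    for pr, eta in stages:
        # Actual stage temperature ratio from the ideal rise and efficiency.
        stage_tr = 1.0 + (pr ** exponent - 1.0) / eta
        overall_pr *= pr
        overall_tr *= stage_tr
    overall_eta = (overall_pr ** exponent - 1.0) / (overall_tr - 1.0)
    return overall_pr, overall_tr, overall_eta


# Two identical hypothetical stages.
pr, tr, eta = stack_stages([(1.5, 0.9), (1.5, 0.9)])
print(f"overall PR {pr:.3f}, TR {tr:.4f}, efficiency {eta:.4f}")
```

In this model the overall adiabatic efficiency of two stacked stages comes out slightly below the per-stage value, a well-known consequence of stacking stages with equal adiabatic efficiency.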
NASA Astrophysics Data System (ADS)
Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide
2015-09-01
The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
High-performance computational fluid dynamics: a custom-code approach
NASA Astrophysics Data System (ADS)
Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.
2016-07-01
We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eyler, L L; Trent, D S; Budden, M J
During the course of the TEMPEST computer code development a concurrent effort was conducted to assess the code's performance and the validity of computed results. The results of this work are presented in this document. The principal objective of this effort was to assure the code's computational correctness for a wide range of hydrothermal phenomena typical of fast breeder reactor application. 47 refs., 94 figs., 6 tabs.
Computer algorithm for coding gain
NASA Technical Reports Server (NTRS)
Dodd, E. E.
1974-01-01
Development of a computer algorithm for coding gain for use in an automated communications link design system. Using an empirical formula which defines coding gain as used in space communications engineering, an algorithm is constructed on the basis of available performance data for nonsystematic convolutional encoding with soft-decision (eight-level) Viterbi decoding.
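The quantity being computed can be illustrated with the textbook definition of coding gain: the reduction in required Eb/N0 (in dB) at a fixed bit error rate relative to an uncoded link. The sketch below is not the report's empirical formula. It assumes uncoded coherent BPSK as the reference, and the coded requirement passed in is a hypothetical placeholder standing in for tabulated decoder performance data.

```python
import math


def bpsk_ber(ebn0_db):
    """Bit error rate of uncoded coherent BPSK on an AWGN channel."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))


def required_ebn0_db(target_ber, lo=-5.0, hi=20.0, tol=1e-6):
    """Eb/N0 in dB at which uncoded BPSK reaches target_ber (bisection).

    BER falls monotonically as Eb/N0 rises, so bisection converges.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bpsk_ber(mid) > target_ber:
            lo = mid  # too many errors: need more signal
        else:
            hi = mid
    return 0.5 * (lo + hi)


def coding_gain_db(target_ber, coded_ebn0_db):
    """Coding gain: uncoded requirement minus coded requirement, in dB."""
    return required_ebn0_db(target_ber) - coded_ebn0_db


# Hypothetical coded requirement of 4.5 dB at BER = 1e-5 (placeholder value).
uncoded = required_ebn0_db(1e-5)
gain = coding_gain_db(1e-5, 4.5)
print(f"uncoded requirement: {uncoded:.2f} dB, coding gain: {gain:.2f} dB")
```

In an automated link design system, the placeholder would be replaced by a lookup or fitted formula over measured decoder performance, which is the role the abstract assigns to its empirical formula.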
Experimental and analytical comparison of flowfields in a 110 N (25 lbf) H2/O2 rocket
NASA Technical Reports Server (NTRS)
Reed, Brian D.; Penko, Paul F.; Schneider, Steven J.; Kim, Suk C.
1991-01-01
A gaseous hydrogen/gaseous oxygen 110 N (25 lbf) rocket was examined through the RPLUS code using the full Navier-Stokes equations with finite rate chemistry. Performance tests were conducted on the rocket in an altitude test facility. Preliminary parametric analyses were performed for a range of mixture ratios and fuel film cooling percentages. It is shown that the computed values of specific impulse and characteristic exhaust velocity follow the trend of the experimental data. Specific impulse computed by the code is lower than the comparable test values by about two to three percent. The computed characteristic exhaust velocity values are lower than the comparable test values by three to four percent. Thrust coefficients computed by the code are found to be within two percent of the measured values. It is concluded that the discrepancy between computed and experimental performance values could not be attributed to experimental uncertainty.
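The three performance parameters compared above are linked by the standard relation Isp = c* · C_F / g0, so relative deviations in c* and C_F combine multiplicatively into the deviation in specific impulse. The sketch below illustrates that arithmetic only. The input values are hypothetical, and the sign chosen for the thrust-coefficient deviation is an assumption (the abstract says only "within two percent").

```python
G0 = 9.80665  # standard gravitational acceleration, m/s^2


def specific_impulse(c_star, thrust_coeff):
    """Specific impulse in seconds from c* (m/s) and thrust coefficient."""
    return c_star * thrust_coeff / G0


def isp_deviation(c_star_dev, thrust_coeff_dev):
    """Relative Isp deviation implied by relative deviations in c* and C_F."""
    return (1.0 + c_star_dev) * (1.0 + thrust_coeff_dev) - 1.0


# Hypothetical case: c* computed 3.5% low, C_F computed 1% high.
dev = isp_deviation(-0.035, 0.01)
print(f"implied Isp deviation: {dev * 100:.2f} %")

# Hypothetical magnitudes, for units only: c* = 2300 m/s, C_F = 1.8.
print(f"Isp: {specific_impulse(2300.0, 1.8):.1f} s")
```

Under these assumed inputs the implied specific impulse deviation is about -2.5 percent, the same order as the two to three percent shortfall the abstract reports.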
NASA Technical Reports Server (NTRS)
Goodwin, Sabine A.; Raj, P.
1999-01-01
Progress to date towards the development and validation of a fast, accurate and cost-effective aeroelastic method for advanced parallel computing platforms such as the IBM SP2 and the SGI Origin 2000 is presented in this paper. The ENSAERO code, developed at the NASA-Ames Research Center has been selected for this effort. The code allows for the computation of aeroelastic responses by simultaneously integrating the Euler or Navier-Stokes equations and the modal structural equations of motion. To assess the computational performance and accuracy of the ENSAERO code, this paper reports the results of the Navier-Stokes simulations of the transonic flow over a flexible aeroelastic wing body configuration. In addition, a forced harmonic oscillation analysis in the frequency domain and an analysis in the time domain are done on a wing undergoing a rigid pitch and plunge motion. Finally, to demonstrate the ENSAERO flutter-analysis capability, aeroelastic Euler and Navier-Stokes computations on an L-1011 wind tunnel model including pylon, nacelle and empennage are underway. All computational solutions are compared with experimental data to assess the level of accuracy of ENSAERO. As the computations described above are performed, a meticulous log of computational performance in terms of wall clock time, execution speed, memory and disk storage is kept. Code scalability is also demonstrated by studying the impact of varying the number of processors on computational performance on the IBM SP2 and the Origin 2000 systems.
Gigaflop performance on a CRAY-2: Multitasking a computational fluid dynamics application
NASA Technical Reports Server (NTRS)
Tennille, Geoffrey M.; Overman, Andrea L.; Lambiotte, Jules J.; Streett, Craig L.
1991-01-01
The methodology is described for converting a large, long-running applications code that executed on a single processor of a CRAY-2 supercomputer to a version that executed efficiently on multiple processors. Although the conversion of every application is different, a discussion of the types of modification used to achieve gigaflop performance is included to assist others in the parallelization of applications for CRAY computers, especially those that were developed for other computers. An existing application, from the discipline of computational fluid dynamics, that had utilized over 2000 hrs of CPU time on CRAY-2 during the previous year was chosen as a test case to study the effectiveness of multitasking on a CRAY-2. The nature of dominant calculations within the application indicated that a sustained computational rate of 1 billion floating-point operations per second, or 1 gigaflop, might be achieved. The code was first analyzed and modified for optimal performance on a single processor in a batch environment. After optimal performance on a single CPU was achieved, the code was modified to use multiple processors in a dedicated environment. The results of these two efforts were merged into a single code that had a sustained computational rate of over 1 gigaflop on a CRAY-2. Timings and analysis of performance are given for both single- and multiple-processor runs.
Scheduling Operations for Massive Heterogeneous Clusters
NASA Technical Reports Server (NTRS)
Humphrey, John; Spagnoli, Kyle
2013-01-01
High-performance computing (HPC) programming has become increasingly difficult with the advent of hybrid supercomputers consisting of multicore CPUs and accelerator boards such as the GPU. Manual tuning of software to achieve high performance on this type of machine has been performed by programmers. This is needlessly difficult and prone to being invalidated by new hardware, new software, or changes in the underlying code. A system was developed for task-based representation of programs, which when coupled with a scheduler and runtime system, allows for many benefits, including higher performance and utilization of computational resources, easier programming and porting, and adaptations of code during runtime. The system consists of a method of representing computer algorithms as a series of data-dependent tasks. The series forms a graph, which can be scheduled for execution on many nodes of a supercomputer efficiently by a computer algorithm. The schedule is executed by a dispatch component, which is tailored to understand all of the hardware types that may be available within the system. The scheduler is informed by a cluster mapping tool, which generates a topology of available resources and their strengths and communication costs. Software is decoupled from its hardware, which aids in porting to future architectures. A computer algorithm schedules all operations, which for systems of high complexity (i.e., most NASA codes), cannot be performed optimally by a human. The system aids in reducing repetitive code, such as communication code, and aids in the reduction of redundant code across projects. It adds new features to code automatically, such as recovering from a lost node or the ability to modify the code while running. In this project, the innovators at the time of this reporting intend to develop two distinct technologies that build upon each other and both of which serve as building blocks for more efficient HPC usage. First is the scheduling and dynamic execution framework, and the second is scalable linear algebra libraries that are built directly on the former.
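The idea of representing a program as a graph of data-dependent tasks and letting an algorithm place them on hardware can be illustrated with a minimal sketch. This is not the system described in the abstract: it is a generic greedy list scheduler, the names `schedule`, `tasks`, and `deps` are hypothetical, all workers are assumed identical, and communication costs and hardware types are ignored.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def schedule(tasks, deps, num_workers):
    """Greedy list scheduling of a task graph (illustrative only).

    tasks: dict mapping task name -> run time (hypothetical cost units)
    deps:  dict mapping task name -> set of tasks that must finish first
    Returns a dict mapping task name -> (worker, start, finish).
    """
    # Visit tasks so that every prerequisite is placed before its dependents.
    order = TopologicalSorter({t: deps.get(t, ()) for t in tasks}).static_order()
    worker_free = [0.0] * num_workers  # time at which each worker becomes idle
    placed = {}
    for name in order:
        # A task cannot start before all of its inputs have been produced.
        ready_at = max((placed[d][2] for d in deps.get(name, ())), default=0.0)
        # Pick the worker on which this task can start earliest.
        worker = min(range(num_workers),
                     key=lambda w: max(worker_free[w], ready_at))
        start = max(worker_free[worker], ready_at)
        finish = start + tasks[name]
        worker_free[worker] = finish
        placed[name] = (worker, start, finish)
    return placed


# Hypothetical four-task program: two independent steps between load and merge.
tasks = {"load": 2, "fft": 3, "filter": 1, "merge": 1}
deps = {"fft": {"load"}, "filter": {"load"}, "merge": {"fft", "filter"}}
plan = schedule(tasks, deps, num_workers=2)
for name, (worker, start, finish) in sorted(plan.items(), key=lambda kv: kv[1][1]):
    print(f"{name:7s} worker={worker} start={start} finish={finish}")
```

A production scheduler of the kind the abstract describes would additionally weigh per-device strengths and transfer costs from the cluster map; the sketch only shows why a dependency graph is enough information for placement to be automated.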
NASA Technical Reports Server (NTRS)
Campbell, David; Wysong, Ingrid; Kaplan, Carolyn; Mott, David; Wadsworth, Dean; VanGilder, Douglas
2000-01-01
An AFRL/NRL team has recently been selected to develop a scalable, parallel, reacting, multidimensional (SUPREM) Direct Simulation Monte Carlo (DSMC) code for the DoD user community under the High Performance Computing Modernization Office (HPCMO) Common High Performance Computing Software Support Initiative (CHSSI). This paper will introduce the JANNAF Exhaust Plume community to this three-year development effort and present the overall goals, schedule, and current status of this new code.
Hypercube matrix computation task
NASA Technical Reports Server (NTRS)
Calalo, R.; Imbriale, W.; Liewer, P.; Lyons, J.; Manshadi, F.; Patterson, J.
1987-01-01
The Hypercube Matrix Computation (Year 1986-1987) task investigated the applicability of a parallel computing architecture to the solution of large scale electromagnetic scattering problems. Two existing electromagnetic scattering codes were selected for conversion to the Mark III Hypercube concurrent computing environment. They were selected so that the underlying numerical algorithms utilized would be different, thereby providing a more thorough evaluation of the appropriateness of the parallel environment for these types of problems. The first code was a frequency domain method of moments solution, NEC-2, developed at Lawrence Livermore National Laboratory. The second code was a time domain finite difference solution of Maxwell's equations to solve for the scattered fields. Once the codes were implemented on the hypercube and verified to obtain correct solutions by comparing the results with those from sequential runs, several measures were used to evaluate the performance of the two codes. First, a comparison was provided of the problem size possible on the hypercube with 128 megabytes of memory for a 32-node configuration with that available in a typical sequential user environment of 4 to 8 megabytes. Then, the performance of the codes was analyzed for the computational speedup attained by the parallel architecture.
Parallelized direct execution simulation of message-passing parallel programs
NASA Technical Reports Server (NTRS)
Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.
1994-01-01
As massively parallel computers proliferate, there is growing interest in finding ways by which the performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
Performance of a parallel code for the Euler equations on hypercube computers
NASA Technical Reports Server (NTRS)
Barszcz, Eric; Chan, Tony F.; Jesperson, Dennis C.; Tuminaro, Raymond S.
1990-01-01
The performance of hypercubes was evaluated on a computational fluid dynamics problem, and the parallel environment issues that must be addressed were considered, such as algorithm changes, implementation choices, programming effort, and programming environment. The evaluation focuses on a widely used fluid dynamics code, FLO52, which solves the two dimensional steady Euler equations describing flow around an airfoil. The code development experience is described, including interacting with the operating system, utilizing the message-passing communication system, and code modifications necessary to increase parallel efficiency. Results from two hypercube parallel computers (a 16-node iPSC/2 and a 512-node NCUBE/ten) are discussed and compared. In addition, a mathematical model of the execution time was developed as a function of several machine and algorithm parameters. This model accurately predicts the actual run times obtained and is used to explore the performance of the code in interesting but not yet physically realizable regions of the parameter space. Based on this model, predictions about future hypercubes are made.
NASA Technical Reports Server (NTRS)
Chan, William M.
1995-01-01
Algorithms and computer code developments were performed for the overset grid approach to solving computational fluid dynamics problems. The techniques developed are applicable to compressible Navier-Stokes flow for any general complex configurations. The computer codes developed were tested on different complex configurations with the Space Shuttle launch vehicle configuration as the primary test bed. General, efficient and user-friendly codes were produced for grid generation, flow solution and force and moment computation.
PHoToNs–A parallel heterogeneous and threads oriented code for cosmological N-body simulation
NASA Astrophysics Data System (ADS)
Wang, Qiao; Cao, Zong-Yan; Gao, Liang; Chi, Xue-Bin; Meng, Chen; Wang, Jie; Wang, Long
2018-06-01
We introduce a new code for cosmological simulations, PHoToNs, which incorporates features for performing massive cosmological simulations on heterogeneous high performance computer (HPC) systems and threads oriented programming. PHoToNs adopts a hybrid scheme to compute gravitational force, with the conventional Particle-Mesh (PM) algorithm to compute the long-range force, the Tree algorithm to compute the short range force and the direct summation Particle-Particle (PP) algorithm to compute gravity from very close particles. A self-similar space-filling Peano-Hilbert curve is used to decompose the computing domain. Threads programming is advantageously used to more flexibly manage the domain communication, PM calculation and synchronization, as well as Dual Tree Traversal on the CPU+MIC platform. PHoToNs scales well, and the efficiency of the PP kernel reaches 68.6% of peak performance on MIC and 74.4% on CPU platforms. We also test the accuracy of the code against the widely used Gadget-2 code and find excellent agreement.
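Decomposing a domain along a space-filling curve can be sketched in a few lines. The example below is an illustration only, not code from PHoToNs: it uses the well-known two-dimensional Hilbert index (the simulation itself is three-dimensional), and the names `hilbert_index` and `decompose` as well as the grid size are assumptions made for the sketch. Cells are sorted by their position along the curve and cut into contiguous pieces, so each process receives a spatially compact region.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve on an n-by-n grid.

    n must be a power of two; 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def decompose(n, num_ranks):
    """Split the n-by-n grid into num_ranks contiguous pieces of the curve."""
    cells = sorted(((x, y) for x in range(n) for y in range(n)),
                   key=lambda c: hilbert_index(n, c[0], c[1]))
    chunk = -(-len(cells) // num_ranks)  # ceiling division
    return [cells[i:i + chunk] for i in range(0, len(cells), chunk)]


# Hypothetical 8x8 grid shared among 4 processes.
for rank, piece in enumerate(decompose(8, 4)):
    print(rank, piece[:4], "...")
```

Because consecutive cells along the curve are always neighbours in space, cutting the sorted list at any point yields subdomains with relatively short boundaries, which is the property that makes this ordering attractive for load balancing.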
[Series: Medical Applications of the PHITS Code (2): Acceleration by Parallel Computing].
Furuta, Takuya; Sato, Tatsuhiko
2015-01-01
Time-consuming Monte Carlo dose calculation becomes feasible owing to the development of computer technology. However, the recent development is due to emergence of the multi-core high performance computers. Therefore, parallel computing becomes a key to achieve good performance of software programs. A Monte Carlo simulation code PHITS contains two parallel computing functions, the distributed-memory parallelization using protocols of message passing interface (MPI) and the shared-memory parallelization using open multi-processing (OpenMP) directives. Users can choose the two functions according to their needs. This paper gives the explanation of the two functions with their advantages and disadvantages. Some test applications are also provided to show their performance using a typical multi-core high performance workstation.
NASA Astrophysics Data System (ADS)
Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha; Kalinkin, Alexander A.
2017-02-01
Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates using a more incremental approach and is a culmination of several modernization efforts of the legacy code MFIX, which is an open-source computational fluid dynamics code that has evolved over several decades, widely used in multiphase flows and still being developed by the National Energy Technology Laboratory. Two different modernization approaches, 'bottom-up' and 'top-down', are illustrated. Preliminary results show up to 8.5x improvement at the selected kernel level with the first approach, and up to 50% improvement in total simulated time with the latter, for the demonstration cases and target HPC systems employed.
Design of convolutional tornado code
NASA Astrophysics Data System (ADS)
Zhou, Hui; Yang, Yao; Gao, Hongmin; Tan, Lu
2017-09-01
As a linear block code, the traditional tornado (tTN) code is inefficient in burst-erasure environments, and its multi-level structure may lead to high encoding/decoding complexity. This paper presents a convolutional tornado (cTN) code which is able to improve the burst-erasure protection capability by applying the convolution property to the tTN code, and to reduce computational complexity by abrogating the multi-level structure. The simulation results show that the cTN code can provide better packet loss protection performance with lower computational complexity than the tTN code.
A high performance scientific cloud computing environment for materials simulations
NASA Astrophysics Data System (ADS)
Jorissen, K.; Vila, F. D.; Rehr, J. J.
2012-09-01
We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
Interfacing Computer Aided Parallelization and Performance Analysis
NASA Technical Reports Server (NTRS)
Jost, Gabriele; Jin, Haoqiang; Labarta, Jesus; Gimenez, Judit; Biegel, Bryan A. (Technical Monitor)
2003-01-01
When porting sequential applications to parallel computer architectures, the program developer will typically go through several cycles of source code optimization and performance analysis. We have started a project to develop an environment where the user can jointly navigate through program structure and performance data information in order to make efficient optimization decisions. In a prototype implementation we have interfaced the CAPO computer aided parallelization tool with the Paraver performance analysis tool. We describe both tools and their interface and give an example for how the interface helps within the program development cycle of a benchmark code.
Nonuniform code concatenation for universal fault-tolerant quantum computing
NASA Astrophysics Data System (ADS)
Nikahd, Eesa; Sedighi, Mehdi; Saheb Zamani, Morteza
2017-09-01
Using transversal gates is a straightforward and efficient technique for fault-tolerant quantum computing. Since transversal gates alone cannot be computationally universal, they must be combined with other approaches such as magic state distillation, code switching, or code concatenation to achieve universality. In this paper we propose an alternative approach for universal fault-tolerant quantum computing, mainly based on the code concatenation approach proposed in [T. Jochym-O'Connor and R. Laflamme, Phys. Rev. Lett. 112, 010505 (2014), 10.1103/PhysRevLett.112.010505], but in a nonuniform fashion. The proposed approach is described based on nonuniform concatenation of the 7-qubit Steane code with the 15-qubit Reed-Muller code, as well as the 5-qubit code with the 15-qubit Reed-Muller code, which lead to two 49-qubit and 47-qubit codes, respectively. These codes can correct any arbitrary single physical error with the ability to perform a universal set of fault-tolerant gates, without using magic state distillation.
LDPC decoder with a limited-precision FPGA-based floating-point multiplication coprocessor
NASA Astrophysics Data System (ADS)
Moberly, Raymond; O'Sullivan, Michael; Waheed, Khurram
2007-09-01
Implementing the sum-product algorithm in an FPGA with an embedded processor invites us to consider a tradeoff between computational precision and computational speed. The algorithm, known outside of the signal processing community as Pearl's belief propagation, is used for iterative soft-decision decoding of LDPC codes. We determined the feasibility of a coprocessor that will perform product computations. Our FPGA-based coprocessor (design) performs computer algebra with significantly less precision than the standard (e.g. integer, floating-point) operations of general purpose processors. Using synthesis, targeting a 3,168 LUT Xilinx FPGA, we show that key components of a decoder are feasible and that the full single-precision decoder could be constructed using a larger part. Soft-decision decoding by the iterative belief propagation algorithm is impacted both positively and negatively by a reduction in the precision of the computation. Reducing precision reduces the coding gain, but the limited-precision computation can operate faster. A proposed solution offers custom logic to perform computations with less precision, yet uses the floating-point format to interface with the software. Simulation results show the achievable coding gain. Synthesis results help theorize the full capacity and performance of an FPGA-based coprocessor.
NASA Technical Reports Server (NTRS)
Clement, J. D.; Kirby, K. D.
1973-01-01
Exploratory calculations were performed for several gas core breeder reactor configurations. The computational method involved the use of the MACH-1 one dimensional diffusion theory code and the THERMOS integral transport theory code for thermal cross sections. Computations were performed to analyze thermal breeder concepts and nonbreeder concepts. Analysis of breeders was restricted to the (U-233)-Th breeding cycle, and computations were performed to examine a range of parameters. These parameters include U-233 to hydrogen atom ratio in the gaseous cavity, carbon to thorium atom ratio in the breeding blanket, cavity size, and blanket size.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clouse, C. J.; Edwards, M. J.; McCoy, M. G.
2015-07-07
Through its Advanced Scientific Computing (ASC) and Inertial Confinement Fusion (ICF) code development efforts, Lawrence Livermore National Laboratory (LLNL) provides a world leading numerical simulation capability for the National HED/ICF program in support of the Stockpile Stewardship Program (SSP). In addition the ASC effort provides high performance computing platform capabilities upon which these codes are run. LLNL remains committed to, and will work with, the national HED/ICF program community to help insure numerical simulation needs are met and to make those capabilities available, consistent with programmatic priorities and available resources.
Verifying a computational method for predicting extreme ground motion
Harris, R.A.; Barall, M.; Andrews, D.J.; Duan, B.; Ma, S.; Dunham, E.M.; Gabriel, A.-A.; Kaneko, Y.; Kase, Y.; Aagaard, Brad T.; Oglesby, D.D.; Ampuero, J.-P.; Hanks, T.C.; Abrahamson, N.
2011-01-01
In situations where seismological data is rare or nonexistent, computer simulations may be used to predict ground motions caused by future earthquakes. This is particularly practical in the case of extreme ground motions, where engineers of special buildings may need to design for an event that has not been historically observed but which may occur in the far-distant future. Once the simulations have been performed, however, they still need to be tested. The SCEC-USGS dynamic rupture code verification exercise provides a testing mechanism for simulations that involve spontaneous earthquake rupture. We have performed this examination for the specific computer code that was used to predict maximum possible ground motion near Yucca Mountain. Our SCEC-USGS group exercises have demonstrated that the specific computer code that was used for the Yucca Mountain simulations produces similar results to those produced by other computer codes when tackling the same science problem. We also found that the 3D ground motion simulations produced smaller ground motions than the 2D simulations.
NASA Rotor 37 CFD Code Validation: Glenn-HT Code
NASA Technical Reports Server (NTRS)
Ameri, Ali A.
2010-01-01
In order to advance the goals of NASA aeronautics programs, it is necessary to continuously evaluate and improve the computational tools used for research and design at NASA. One such code is the Glenn-HT code which is used at NASA Glenn Research Center (GRC) for turbomachinery computations. Although the code has been thoroughly validated for turbine heat transfer computations, it has not been utilized for compressors. In this work, Glenn-HT was used to compute the flow in a transonic compressor and comparisons were made to experimental data. The results presented here are in good agreement with this data. Most of the measures of performance are well within the measurement uncertainties and the exit profiles of interest agree with the experimental measurements.
Parallel Computation of the Jacobian Matrix for Nonlinear Equation Solvers Using MATLAB
NASA Technical Reports Server (NTRS)
Rose, Geoffrey K.; Nguyen, Duc T.; Newman, Brett A.
2017-01-01
Demonstrating speedup for parallel code on a multicore shared memory PC can be challenging in MATLAB due to underlying parallel operations that are often opaque to the user. This can limit potential for improvement of serial code even for the so-called embarrassingly parallel applications. One such application is the computation of the Jacobian matrix inherent to most nonlinear equation solvers. Computation of this matrix represents the primary bottleneck in nonlinear solver speed such that commercial finite element (FE) and multi-body-dynamic (MBD) codes attempt to minimize computations. A timing study using MATLAB's Parallel Computing Toolbox was performed for numerical computation of the Jacobian. Several approaches for implementing parallel code were investigated while only the single program multiple data (spmd) method using composite objects provided positive results. Parallel code speedup is demonstrated but the goal of linear speedup through the addition of processors was not achieved due to PC architecture.
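The reason the Jacobian is regarded as embarrassingly parallel is that each column depends on a single perturbed input and can be computed independently. The sketch below shows that structure in Python rather than MATLAB's spmd construct; the function names, the central-difference step, and the example residual are assumptions made for illustration. Threads in standard CPython will not by themselves speed up pure-Python arithmetic, so the example demonstrates the decomposition, not a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def jacobian_column(f, x, j, h=1e-6):
    """Central-difference estimate of column j of the Jacobian of f at x."""
    x_plus = list(x)
    x_minus = list(x)
    x_plus[j] += h
    x_minus[j] -= h
    return [(a - b) / (2.0 * h) for a, b in zip(f(x_plus), f(x_minus))]


def jacobian(f, x, workers=4):
    """Assemble the Jacobian by evaluating every column as a separate task."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        columns = list(pool.map(lambda j: jacobian_column(f, x, j),
                                range(len(x))))
    # Columns were computed independently; transpose to row-major form.
    return [list(row) for row in zip(*columns)]


def residual(x):
    """Hypothetical nonlinear system used only for the demonstration."""
    return [x[0] ** 2 + x[1], 3.0 * x[0] * x[1]]


J = jacobian(residual, [1.0, 2.0])
for row in J:
    print([round(v, 6) for v in row])
```

For this residual the analytic Jacobian at (1, 2) is [[2, 1], [6, 3]], which gives a quick check on the finite-difference result.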
Parallelization of Finite Element Analysis Codes Using Heterogeneous Distributed Computing
NASA Technical Reports Server (NTRS)
Ozguner, Fusun
1996-01-01
Performance gains in computer design are quickly consumed as users seek to analyze larger problems to a higher degree of accuracy. Innovative computational methods, such as parallel and distributed computing, seek to multiply the power of existing hardware technology to satisfy the computational demands of large applications. In the early stages of this project, experiments were performed using two large, coarse-grained applications, CSTEM and METCAN. These applications were parallelized on an Intel iPSC/860 hypercube. It was found that the overall speedup was very low, due to large, inherently sequential code segments present in the applications. The overall execution time T(sub par), of the application is dependent on these sequential segments. If these segments make up a significant fraction of the overall code, the application will have a poor speedup measure.
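The dependence of overall speedup on the sequential segments can be made concrete with the standard Amdahl's law model, which is assumed here as an interpretation of the abstract rather than taken from the report. If a fraction s of the single-processor run time is inherently sequential, the ideal speedup on p processors is 1 / (s + (1 - s) / p). The function name and the sample fractions below are illustrative.

```python
def amdahl_speedup(serial_fraction, processors):
    """Ideal speedup when serial_fraction of the run time cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


# Hypothetical application in which a quarter of the work is sequential.
for p in (1, 8, 64, 512):
    print(p, round(amdahl_speedup(0.25, p), 2))
```

However many processors are added, the speedup in this model never exceeds 1 / s, which is why large sequential segments lead to the poor speedup measure the abstract reports.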
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Peiyuan; Brown, Timothy; Fullmer, William D.
Five benchmark problems are developed and simulated with the computational fluid dynamics and discrete element model code MFiX. The benchmark problems span dilute and dense regimes, consider statistically homogeneous and inhomogeneous (both clusters and bubbles) particle concentrations and a range of particle and fluid dynamic computational loads. Several variations of the benchmark problems are also discussed to extend the computational phase space to cover granular (particles only), bidisperse and heat transfer cases. A weak scaling analysis is performed for each benchmark problem and, in most cases, the scalability of the code appears reasonable up to approximately 10^3 cores. Profiling of the benchmark problems indicates that the most substantial computational time is being spent on particle-particle force calculations, drag force calculations and interpolating between discrete particle and continuum fields. Hardware performance analysis was also carried out, showing significant Level 2 cache miss ratios and a rather low degree of vectorization. These results are intended to serve as a baseline for future developments to the code as well as a preliminary indicator of where to best focus performance optimizations.
NASA Astrophysics Data System (ADS)
Ethier, Stephane; Lin, Zhihong
2001-10-01
Earlier this year, the National Energy Research Scientific Computing center (NERSC) took delivery of the second most powerful computer in the world. With its 2,528 processors, each running at a peak performance of 1.5 GFlops, this IBM SP machine has a theoretical performance of almost 3.8 TFlops. To efficiently harness such computing power in one single code is not an easy task and requires a good knowledge of the computer's architecture. Here we present the steps that we followed to improve our gyrokinetic micro-turbulence code GTC in order to take advantage of the new 16-way shared memory nodes of the NERSC IBM SP. Performance results are shown as well as details about the improved mixed-mode MPI-OpenMP model that we use. The enhancements to the code allowed us to tackle much bigger problem sizes, getting closer to our goal of simulating an ITER-size tokamak with both kinetic ions and electrons. (This work is supported by DOE Contract No. DE-AC02-76CH03073 (PPPL), and in part by the DOE Fusion SciDAC Project.)
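The quoted theoretical performance follows directly from the processor count and the per-processor peak rate. The short check below only reproduces that arithmetic from the two figures given in the abstract; the variable names are illustrative.

```python
processors = 2528                 # processor count quoted in the abstract
peak_per_processor_gflops = 1.5   # peak rate per processor quoted in the abstract

# Aggregate theoretical peak, converted from GFlops to TFlops.
total_tflops = processors * peak_per_processor_gflops / 1000.0
print(f"{total_tflops:.3f} TFlops")
```

The product is 3,792 GFlops, consistent with the "almost 3.8 TFlops" stated above.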
Enhancement of the Probabilistic CEramic Matrix Composite ANalyzer (PCEMCAN) Computer Code
NASA Technical Reports Server (NTRS)
Shah, Ashwin
2000-01-01
This report represents a final technical report for Order No. C-78019-J entitled "Enhancement of the Probabilistic Ceramic Matrix Composite Analyzer (PCEMCAN) Computer Code." The scope of the enhancement relates to including the probabilistic evaluation of the D-Matrix terms in the MAT2 and MAT9 material properties cards (available in the CEMCAN code) for MSC/NASTRAN. Technical activities performed during the time period of June 1, 1999 through September 3, 1999 have been summarized, and the final version of the enhanced PCEMCAN code and revisions to the User's Manual are delivered along with it. Discussions related to the performed activities were held with the NASA Project Manager during the performance period. The enhanced capabilities have been demonstrated using sample problems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tournier, J.; El-Genk, M.S.; Huang, L.
1999-01-01
The Institute of Space and Nuclear Power Studies at the University of New Mexico has developed a computer simulation of cylindrical geometry alkali metal thermal-to-electric converter cells using a standard Fortran 77 computer code. The objective and use of this code was to compare the experimental measurements with computer simulations, upgrade the model as appropriate, and conduct investigations of various methods to improve the design and performance of the devices for improved efficiency, durability, and longer operational lifetime. The Institute of Space and Nuclear Power Studies participated in vacuum testing of PX series alkali metal thermal-to-electric converter cells and developed the alkali metal thermal-to-electric converter Performance Evaluation and Analysis Model. This computer model consisted of a sodium pressure loss model, a cell electrochemical and electric model, and a radiation/conduction heat transfer model. The code closely predicted the operation and performance of a wide variety of PX series cells which led to suggestions for improvements to both lifetime and performance. The code provides valuable insight into the operation of the cell, predicts parameters of components within the cell, and is a useful tool for predicting both the transient and steady state performance of systems of cells.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zizin, M. N.; Zimin, V. G.; Zizina, S. N., E-mail: zizin@adis.vver.kiae.ru
2010-12-15
The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit of the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.
NASA Astrophysics Data System (ADS)
Zizin, M. N.; Zimin, V. G.; Zizina, S. N.; Kryakvin, L. V.; Pitilimov, V. A.; Tereshonok, V. A.
2010-12-01
The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit of the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.
NASA Technical Reports Server (NTRS)
Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization, 3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction and 6) machine specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1. Parallelizing tools and compiler evaluation. 2. Code cleanup and serial optimization using automated scripts. 3. Development of a code generator for performance prediction. 4. Automated partitioning. 5. Automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.
CPMIP: measurements of real computational performance of Earth system models in CMIP6
NASA Astrophysics Data System (ADS)
Balaji, Venkatramani; Maisonnave, Eric; Zadeh, Niki; Lawrence, Bryan N.; Biercamp, Joachim; Fladrich, Uwe; Aloisio, Giovanni; Benson, Rusty; Caubel, Arnaud; Durachta, Jeffrey; Foujols, Marie-Alice; Lister, Grenville; Mocavero, Silvia; Underwood, Seth; Wright, Garrett
2017-01-01
A climate model represents a multitude of processes on a variety of timescales and space scales: a canonical example of multi-physics multi-scale modeling. The underlying climate system is physically characterized by sensitive dependence on initial conditions, and natural stochastic variability, so very long integrations are needed to extract signals of climate change. Algorithms generally possess weak scaling and can be I/O and/or memory-bound. Such weak-scaling, I/O, and memory-bound multi-physics codes present particular challenges to computational performance. Traditional metrics of computational efficiency such as performance counters and scaling curves do not tell us enough about real sustained performance from climate models on different machines. They also do not provide a satisfactory basis for comparative information across models. We introduce a set of metrics that can be used for the study of computational performance of climate (and Earth system) models. These measures do not require specialized software or specific hardware counters, and should be accessible to anyone. They are independent of platform and underlying parallel programming models. We show how these metrics can be used to measure actually attained performance of Earth system models on different machines, and identify the most fruitful areas of research and development for performance engineering. We present results for these measures for a diverse suite of models from several modeling centers, and propose to use these measures as a basis for a CPMIP, a computational performance model intercomparison project (MIP).
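As a rough illustration of measures that need no specialized software or hardware counters, the sketch below computes two figures commonly quoted for climate models, simulated years per day (SYPD) and core-hours per simulated year (CHSY), from wall-clock time and core count alone. The function names and the sample run are hypothetical and are not results from the paper.

```python
def simulated_years_per_day(simulated_years: float, wallclock_hours: float) -> float:
    """Throughput: simulated years completed per 24 hours of wall-clock time."""
    return simulated_years * 24.0 / wallclock_hours


def core_hours_per_simulated_year(cores: int, wallclock_hours: float,
                                  simulated_years: float) -> float:
    """Cost: core-hours consumed for each simulated year."""
    return cores * wallclock_hours / simulated_years


if __name__ == "__main__":
    # Hypothetical run: 10 simulated years in 48 wall-clock hours on 1000 cores.
    print("SYPD:", simulated_years_per_day(10.0, 48.0))
    print("CHSY:", core_hours_per_simulated_year(1000, 48.0, 10.0))
```

Both quantities depend only on information that any batch system reports, which is what makes them comparable across machines and models.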
Porting plasma physics simulation codes to modern computing architectures using the
NASA Astrophysics Data System (ADS)
Germaschewski, Kai; Abbott, Stephen
2015-11-01
Available computing power has continued to grow exponentially even after single-core performance saturated in the last decade. The increase has since been driven by more parallelism, both using more cores and having more parallelism in each core, e.g. in GPUs and Intel Xeon Phi. Adapting existing plasma physics codes is challenging, in particular as there is no single programming model that covers current and future architectures. We will introduce the open-source
Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha; ...
2017-03-20
Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates using a more incremental approach and is a culmination of several modernization efforts of the legacy code MFIX, which is an open-source computational fluid dynamics code that has evolved over several decades, is widely used in multiphase flows, and is still being developed by the National Energy Technology Laboratory. Two different modernization approaches, 'bottom-up' and 'top-down', are illustrated. Preliminary results show up to 8.5x improvement at the selected kernel level with the first approach, and up to 50% improvement in total simulated time with the latter, for the demonstration cases and target HPC systems employed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha
Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates using a more incremental approach and is a culmination of several modernization efforts of the legacy code MFIX, which is an open-source computational fluid dynamics code that has evolved over several decades, is widely used in multiphase flows, and is still being developed by the National Energy Technology Laboratory. Two different modernization approaches, 'bottom-up' and 'top-down', are illustrated. Preliminary results show up to 8.5x improvement at the selected kernel level with the first approach, and up to 50% improvement in total simulated time with the latter, for the demonstration cases and target HPC systems employed.
NASA Technical Reports Server (NTRS)
Walowit, Jed A.; Shapiro, Wilbur
2005-01-01
This is the source listing of the computer code SPIRALI which predicts the performance characteristics of incompressible cylindrical and face seals with or without the inclusion of spiral grooves. Performance characteristics include load capacity (for face seals), leakage flow, power requirements and dynamic characteristics in the form of stiffness, damping and apparent mass coefficients in 4 degrees of freedom for cylindrical seals and 3 degrees of freedom for face seals. These performance characteristics are computed as functions of seal and groove geometry, load or film thickness, running and disturbance speeds, fluid viscosity, and boundary pressures.
Performance of the Widely-Used CFD Code OVERFLOW on the Pleiades Supercomputer
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.
2017-01-01
Computational performance studies were made for NASA's widely used Computational Fluid Dynamics code OVERFLOW on the Pleiades Supercomputer. Two test cases were considered: a full launch vehicle with a grid of 286 million points and a full rotorcraft model with a grid of 614 million points. Computations using up to 8000 cores were run on Sandy Bridge and Ivy Bridge nodes. Performance was monitored using times reported in the day files from the Portable Batch System utility. Results for two grid topologies are presented and compared in detail. Observations and suggestions for future work are made.
NASA Technical Reports Server (NTRS)
Steinke, R. J.
1982-01-01
A FORTRAN computer code is presented for off-design performance prediction of axial-flow compressors. Stage and compressor performance is obtained by a stage-stacking method that uses representative velocity diagrams at rotor inlet and outlet meanline radii. The code has options for: (1) direct user input or calculation of nondimensional stage characteristics; (2) adjustment of stage characteristics for off-design speed and blade setting angle; (3) adjustment of rotor deviation angle for off-design conditions; and (4) SI or U.S. customary units. Correlations from experimental data are used to model real flow conditions. Calculations are compared with experimental data.
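The basic bookkeeping behind stage stacking can be sketched as follows: the outlet state of one stage is the inlet state of the next, so the overall pressure and temperature ratios are the products of the stage values. This is a minimal sketch assuming each stage is already summarized by a total-pressure ratio and an adiabatic efficiency. The `Stage` class, the constant ratio of specific heats, and the sample values are illustrative assumptions; the report's velocity-diagram calculations and experimental correlations are not reproduced.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

GAMMA = 1.4  # assumed constant ratio of specific heats (air)


@dataclass
class Stage:
    pressure_ratio: float  # stage total-pressure ratio (hypothetical input)
    efficiency: float      # stage adiabatic efficiency (hypothetical input)


def stack_stages(stages: Iterable[Stage]) -> Tuple[float, float, float]:
    """Return overall pressure ratio, temperature ratio, and adiabatic efficiency."""
    exponent = (GAMMA - 1.0) / GAMMA
    overall_pr = 1.0
    overall_tr = 1.0
    for stage in stages:
        # Ideal (isentropic) temperature rise, divided by efficiency for the actual rise.
        ideal_rise = stage.pressure_ratio ** exponent - 1.0
        stage_tr = 1.0 + ideal_rise / stage.efficiency
        # Each stage's outlet feeds the next stage's inlet, so ratios multiply.
        overall_pr *= stage.pressure_ratio
        overall_tr *= stage_tr
    overall_eff = (overall_pr ** exponent - 1.0) / (overall_tr - 1.0)
    return overall_pr, overall_tr, overall_eff


if __name__ == "__main__":
    pr, tr, eff = stack_stages([Stage(1.5, 0.9), Stage(1.4, 0.9)])
    print(f"overall PR={pr:.3f}, TR={tr:.4f}, efficiency={eff:.4f}")
```

In this toy case the overall adiabatic efficiency comes out slightly below the stage efficiency, which is the expected behaviour when stages of equal efficiency are stacked.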
Final Report for ALCC Allocation: Predictive Simulation of Complex Flow in Wind Farms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barone, Matthew F.; Ananthan, Shreyas; Churchfield, Matt
This report documents work performed using ALCC computing resources granted under a proposal submitted in February 2016, with the resource allocation period spanning the period July 2016 through June 2017. The award allocation was 10.7 million processor-hours at the National Energy Research Scientific Computing Center. The simulations performed were in support of two projects: the Atmosphere to Electrons (A2e) project, supported by the DOE EERE office; and the Exascale Computing Project (ECP), supported by the DOE Office of Science. The project team for both efforts consists of staff scientists and postdocs from Sandia National Laboratories and the National Renewable Energy Laboratory. At the heart of these projects is the open-source computational-fluid-dynamics (CFD) code, Nalu. Nalu solves the low-Mach-number Navier-Stokes equations using an unstructured-grid discretization. Nalu leverages the open-source Trilinos solver library and the Sierra Toolkit (STK) for parallelization and I/O. This report documents baseline computational performance of the Nalu code on problems of direct relevance to the wind plant physics application - namely, Large Eddy Simulation (LES) of an atmospheric boundary layer (ABL) flow and wall-modeled LES of a flow past a static wind turbine rotor blade. Parallel performance of Nalu and its constituent solver routines residing in the Trilinos library has been assessed previously under various campaigns. However, both Nalu and Trilinos have been, and remain, in active development and resources have not been available previously to rigorously track code performance over time. With the initiation of the ECP, it is important to establish and document baseline code performance on the problems of interest. This will allow the project team to identify and target any deficiencies in performance, as well as highlight any performance bottlenecks as we exercise the code on a greater variety of platforms and at larger scales. The current study is rather modest in scale, examining performance on problem sizes of O(100 million) elements and core counts up to 8k cores. This will be expanded as more computational resources become available to the projects.
NASA Technical Reports Server (NTRS)
Lou, John; Ferraro, Robert; Farrara, John; Mechoso, Carlos
1996-01-01
An analysis is presented of several factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on massively parallel computer systems. Several modifications to the original parallel AGCM code aimed at improving its numerical efficiency, interprocessor communication cost, load-balance and issues affecting single-node code performance are discussed.
Reliability model of a monopropellant auxiliary propulsion system
NASA Technical Reports Server (NTRS)
Greenberg, J. S.
1971-01-01
A mathematical model and associated computer code has been developed which computes the reliability of a monopropellant blowdown hydrazine spacecraft auxiliary propulsion system as a function of time. The propulsion system is used to adjust or modify the spacecraft orbit over an extended period of time. The multiple orbit corrections are the multiple objectives which the auxiliary propulsion system is designed to achieve. Thus the reliability model computes the probability of successfully accomplishing each of the desired orbit corrections. To accomplish this, the reliability model interfaces with a computer code that models the performance of a blowdown (unregulated) monopropellant auxiliary propulsion system. The computer code acts as a performance model and as such gives an accurate time history of the system operating parameters. The basic timing and status information is passed on to and utilized by the reliability model which establishes the probability of successfully accomplishing the orbit corrections.
A Computational Study of an Oscillating VR-12 Airfoil with a Gurney Flap
NASA Technical Reports Server (NTRS)
Rhee, Myung
2004-01-01
Computations of the flow over an oscillating airfoil with a Gurney flap are performed using a Reynolds-Averaged Navier-Stokes code and compared with recent experimental data. The experimental results have been generated for different sizes of the Gurney flaps. The computations are focused mainly on a configuration. The baseline airfoil without a Gurney flap is computed and compared with the experiments in both steady and unsteady cases for the purpose of initial testing of the code performance. The computations are carried out with different turbulence models. Effects of the grid refinement are also examined and unsteady cases, in addition to the assessment of solver effects. The results of the comparisons of steady lift and drag computations indicate that the code is reasonably accurate in the attached flow of the steady condition but largely overpredicts the lift and underpredicts the drag in the higher angle steady flow.
NASA Astrophysics Data System (ADS)
Walker, Ernest; Chen, Xinjia; Cooper, Reginald L.
2010-04-01
An arbitrarily accurate approach is used to determine the bit-error rate (BER) performance for generalized asynchronous DS-CDMA systems, in Gaussian noise with Rayleigh fading. In this paper, and the sequel, new theoretical work has been contributed which substantially enhances existing performance analysis formulations. Major contributions include: substantial computational complexity reduction, including a priori BER accuracy bounding; an analytical approach that facilitates performance evaluation for systems with arbitrary spectral spreading distributions, with non-uniform transmission delay distributions. Using prior results, augmented by these enhancements, a generalized DS-CDMA system model is constructed and used to evaluate the BER performance in a variety of scenarios. In this paper, the generalized system modeling was used to evaluate the performance of both Walsh-Hadamard (WH) and Walsh-Hadamard-seeded zero-correlation-zone (WH-ZCZ) coding. The selection of these codes was informed by the observation that WH codes contain N spectral spreading values (0 to N - 1), one for each code sequence; while WH-ZCZ codes contain only two spectral spreading values (N/2 - 1, N/2); where N is the sequence length in chips. Since these codes span the spectral spreading range for DS-CDMA coding, by invoking an induction argument, the generalization of the system model is sufficiently supported. The results in this paper, and the sequel, support the claim that an arbitrarily accurate performance analysis for DS-CDMA systems can be evaluated over the full range of binary coding, with minimal computational complexity.
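The observation that length-N Walsh-Hadamard codes take every value from 0 to N - 1, one per sequence, can be checked numerically if the "spectral spreading value" of a sequence is read as its number of sign changes (its sequency). That reading is an assumption made here for illustration, and the paper's exact definition may differ. The sketch builds the code set with the Sylvester construction; the function names are hypothetical.

```python
from typing import List


def sylvester_hadamard(n: int) -> List[List[int]]:
    """Build an n x n Hadamard matrix (n a power of two) by Sylvester doubling."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # H_2m = [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def sign_changes(seq: List[int]) -> int:
    """Count chip-to-chip sign changes in a +/-1 sequence."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)


if __name__ == "__main__":
    n = 8
    codes = sylvester_hadamard(n)
    for row in codes:
        print(row, "sign changes:", sign_changes(row))
    # Each count from 0 to n - 1 appears exactly once across the code set.
    print(sorted(sign_changes(row) for row in codes) == list(range(n)))
```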
Highly fault-tolerant parallel computation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spielman, D.A.
We re-introduce the coded model of fault-tolerant computation in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations in which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time $t$ on $w$ processors can be performed reliably on a faulty machine in the coded model using $w \log^{O(1)} w$ processors and time $t \log^{O(1)} w$. The failure probability of the computation will be at most $t \cdot \exp(-w^{1/4})$. The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in $O(n \log^{O(1)} n)$ sequential time and are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.
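To get a feel for the failure bound quoted above, t times exp(-w^(1/4)), the following sketch evaluates it for a few machine sizes. The function name and the sample values of t and w are arbitrary illustrations, and the constants hidden in the abstract's asymptotic statements are not modelled.

```python
import math


def failure_bound(t: float, w: float) -> float:
    """Upper bound t * exp(-w**(1/4)) on the failure probability, capped at 1."""
    return min(1.0, t * math.exp(-(w ** 0.25)))


if __name__ == "__main__":
    t = 1000.0  # hypothetical running time in steps
    for w in (16, 10**4, 10**6, 10**8):
        print(f"w = {w:>9}: bound = {failure_bound(t, w):.3e}")
```

For small w the bound is vacuous (it exceeds 1 and is capped), while for large w it falls off rapidly, which is the sense in which the simulated machine is highly reliable.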
Adiabatic topological quantum computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cesare, Chris; Landahl, Andrew J.; Bacon, Dave
Topological quantum computing promises error-resistant quantum computation without active error correction. However, there is a worry that during the process of executing quantum gates by braiding anyons around each other, extra anyonic excitations will be created that will disorder the encoded quantum information. Here, we explore this question in detail by studying adiabatic code deformations on Hamiltonians based on topological codes, notably Kitaev’s surface codes and the more recently discovered color codes. We develop protocols that enable universal quantum computing by adiabatic evolution in a way that keeps the energy gap of the system constant with respect to the computation size and introduces only simple local Hamiltonian interactions. This allows one to perform holonomic quantum computing with these topological quantum computing systems. The tools we develop allow one to go beyond numerical simulations and understand these processes analytically.
Adiabatic topological quantum computing
Cesare, Chris; Landahl, Andrew J.; Bacon, Dave; ...
2015-07-31
Topological quantum computing promises error-resistant quantum computation without active error correction. However, there is a worry that during the process of executing quantum gates by braiding anyons around each other, extra anyonic excitations will be created that will disorder the encoded quantum information. Here, we explore this question in detail by studying adiabatic code deformations on Hamiltonians based on topological codes, notably Kitaev’s surface codes and the more recently discovered color codes. We develop protocols that enable universal quantum computing by adiabatic evolution in a way that keeps the energy gap of the system constant with respect to the computation size and introduces only simple local Hamiltonian interactions. This allows one to perform holonomic quantum computing with these topological quantum computing systems. The tools we develop allow one to go beyond numerical simulations and understand these processes analytically.
Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint.
Gao, Zhi; Lao, Mingjie; Sang, Yongsheng; Wen, Fei; Ramesh, Bharath; Zhai, Ruifang
2018-05-06
Light detection and ranging (LiDAR) sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency.
Improved Helicopter Rotor Performance Prediction through Loose and Tight CFD/CSD Coupling
NASA Astrophysics Data System (ADS)
Ickes, Jacob C.
Helicopters and other Vertical Take-Off or Landing (VTOL) vehicles exhibit an interesting combination of structural dynamic and aerodynamic phenomena which together drive the rotor performance. The combination of factors involved make simulating the rotor a challenging and multidisciplinary effort, and one which is still an active area of interest in the industry because of the money and time it could save during design. Modern tools allow the prediction of rotorcraft physics from first principles. Analysis of the rotor system with this level of accuracy provides the understanding necessary to improve its performance. There has historically been a divide between the comprehensive codes which perform aeroelastic rotor simulations using simplified aerodynamic models, and the very computationally intensive Navier-Stokes Computational Fluid Dynamics (CFD) solvers. As computer resources become more available, efforts have been made to replace the simplified aerodynamics of the comprehensive codes with the more accurate results from a CFD code. The objective of this work is to perform aeroelastic rotorcraft analysis using first-principles simulations for both fluids and structural predictions using tools available at the University of Toledo. Two separate codes are coupled together in both loose coupling (data exchange on a periodic interval) and tight coupling (data exchange each time step) schemes. To allow the coupling to be carried out in a reliable and efficient way, a Fluid-Structure Interaction code was developed which automatically performs primary functions of loose and tight coupling procedures. Flow phenomena such as transonics, dynamic stall, locally reversed flow on a blade, and Blade-Vortex Interaction (BVI) were simulated in this work. Results of the analysis show aerodynamic load improvement due to the inclusion of the CFD-based airloads in the structural dynamics analysis of the Computational Structural Dynamics (CSD) code. 
Improvements came in the form of improved peak/trough magnitude prediction, better phase prediction of these locations, and a predicted signal with a frequency content more like the flight test data than the CSD code acting alone. Additionally, a tight coupling analysis was performed as a demonstration of the capability and unique aspects of such an analysis. This work shows that away from the center of the flight envelope, the aerodynamic modeling of the CSD code can be replaced with a more accurate set of predictions from a CFD code with an improvement in the aerodynamic results. The better predictions come at substantially increased computational costs between 1,000 and 10,000 processor-hours.
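The difference between the two coupling schemes described above (data exchange on a periodic interval versus every time step) can be sketched with toy stand-ins for the solvers. Everything in the block is hypothetical: `cfd_airload` and `csd_step` are placeholder functions with made-up linear behaviour, not the codes used in this work, and the sketch only shows when data is exchanged.

```python
from typing import Tuple


def cfd_airload(deflection: float) -> float:
    """Toy 'CFD' solver: the airload drops as the blade deflects."""
    return 1.0 - 0.5 * deflection


def csd_step(deflection: float, airload: float, dt: float) -> float:
    """Toy 'CSD' solver: the deflection relaxes toward the applied airload."""
    return deflection + dt * (airload - deflection)


def run_coupled(n_steps: int, dt: float, exchange_every: int) -> Tuple[float, int]:
    """Advance the structure; refresh the airload every `exchange_every` steps.

    exchange_every == 1 mimics tight coupling (exchange each time step);
    a larger value mimics loose coupling (exchange on a periodic interval).
    Returns the final deflection and the number of airload evaluations.
    """
    deflection = 0.0
    airload = cfd_airload(deflection)
    exchanges = 1
    for step in range(1, n_steps + 1):
        deflection = csd_step(deflection, airload, dt)
        if step % exchange_every == 0:
            airload = cfd_airload(deflection)
            exchanges += 1
    return deflection, exchanges


if __name__ == "__main__":
    print("tight:", run_coupled(400, 0.05, exchange_every=1))
    print("loose:", run_coupled(400, 0.05, exchange_every=40))
```

In this toy problem both schemes settle near the same equilibrium deflection (2/3), but the loose scheme calls the expensive "CFD" function far less often, which is the usual motivation for using it on periodic, steady flight conditions.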
Advances in Engineering Software for Lift Transportation Systems
NASA Astrophysics Data System (ADS)
Kazakoff, Alexander Borisoff
2012-03-01
This paper attempts the computer modelling of ropeway ski lift systems. The logic in these systems is based on a travel form between the two terminals, which operates with high capacity cabins, chairs, gondolas or draw-bars. The computer codes AUTOCAD, MATLAB and Compaq Visual Fortran version 6.6 are used in the computer modelling. The modelling is organized in two stages. The first stage is the organization of the ground relief profile and the design of the lift system as a whole, according to the terrain profile and the climatic and atmospheric conditions. The ground profile is prepared by the geodesists and is presented in an AUTOCAD view. The next step is the design of the lift itself, which is performed by programmes using the computer code MATLAB. The second stage of the computer modelling is performed after the optimization of the co-ordinates and the lift profile using the computer code MATLAB. Then the co-ordinates and the parameters are inserted into a program written in Compaq Visual Fortran version 6.6, which calculates 171 lift parameters, organized in 42 tables. The objective of the work presented in this paper is the computer modelling of the design and parameter derivation of ropeway systems and their computer variation and optimization.
Development of MCNPX-ESUT computer code for simulation of neutron/gamma pulse height distribution
NASA Astrophysics Data System (ADS)
Abolfazl Hosseini, Seyed; Vosoughi, Naser; Zangian, Mehdi
2015-05-01
In this paper, the development of the MCNPX-ESUT (MCNPX-Energy Engineering of Sharif University of Technology) computer code for simulation of neutron/gamma pulse height distribution is reported. Since liquid organic scintillators like NE-213 are well suited and routinely used for spectrometry in mixed neutron/gamma fields, this type of detectors is selected for simulation in the present study. The proposed algorithm for simulation includes four main steps. The first step is the modeling of the neutron/gamma particle transport and their interactions with the materials in the environment and detector volume. In the second step, the number of scintillation photons due to charged particles such as electrons, alphas, protons and carbon nuclei in the scintillator material is calculated. In the third step, the transport of scintillation photons in the scintillator and lightguide is simulated. Finally, the resolution corresponding to the experiment is considered in the last step of the simulation. Unlike the similar computer codes like SCINFUL, NRESP7 and PHRESP, the developed computer code is applicable to both neutron and gamma sources. Hence, the discrimination of neutron and gamma in the mixed fields may be performed using the MCNPX-ESUT computer code. The main feature of MCNPX-ESUT computer code is that the neutron/gamma pulse height simulation may be performed without needing any sort of post processing. In the present study, the pulse height distributions due to a monoenergetic neutron/gamma source in NE-213 detector using MCNPX-ESUT computer code is simulated. The simulated neutron pulse height distributions are validated through comparing with experimental data (Gohil et al. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 664 (2012) 304-309.) and the results obtained from similar computer codes like SCINFUL, NRESP7 and Geant4. 
The simulated gamma pulse height distribution for a 137Cs source is also compared with the experimental data.
Pretest aerosol code comparisons for LWR aerosol containment tests LA1 and LA2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wright, A.L.; Wilson, J.H.; Arwood, P.C.
The Light-Water-Reactor (LWR) Aerosol Containment Experiments (LACE) are being performed in Richland, Washington, at the Hanford Engineering Development Laboratory (HEDL) under the leadership of an international project board and the Electric Power Research Institute. These tests have two objectives: (1) to investigate, at large scale, the inherent aerosol retention behavior in LWR containments under simulated severe accident conditions, and (2) to provide an experimental data base for validating aerosol behavior and thermal-hydraulic computer codes. Aerosol computer-code comparison activities are being coordinated at the Oak Ridge National Laboratory. For each of the six LACE tests, "pretest" calculations (for code-to-code comparisons) and "posttest" calculations (for code-to-test data comparisons) are being performed. The overall goals of the comparison effort are (1) to provide code users with experience in applying their codes to LWR accident-sequence conditions and (2) to evaluate and improve the code models.
Shared prefetching to reduce execution skew in multi-threaded systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eichenberger, Alexandre E; Gunnels, John A
Mechanisms are provided for optimizing code to perform prefetching of data into a shared memory of a computing device that is shared by a plurality of threads that execute on the computing device. A memory stream of a portion of code that is shared by the plurality of threads is identified. A set of prefetch instructions is distributed across the plurality of threads. Prefetch instructions are inserted into the instruction sequences of the plurality of threads such that each instruction sequence has a separate sub-portion of the set of prefetch instructions, thereby generating optimized code. Executable code is generated based on the optimized code and stored in a storage device. The executable code, when executed, performs the prefetches associated with the distributed set of prefetch instructions in a shared manner across the plurality of threads.
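One simple way to picture each instruction sequence receiving a separate sub-portion of the prefetch set is a round-robin split of the prefetch addresses among the threads. The sketch below is only an illustration under that assumption: the record does not say how the set is partitioned, and `distribute_prefetches` and the 64-byte line size are hypothetical.

```python
from typing import List


def distribute_prefetches(addresses: List[int], n_threads: int) -> List[List[int]]:
    """Split a shared stream's prefetch addresses round-robin across threads.

    Every address is assigned to exactly one thread, so the threads together
    cover the whole stream without issuing duplicate prefetches.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be at least 1")
    return [addresses[i::n_threads] for i in range(n_threads)]


if __name__ == "__main__":
    # Hypothetical stream: eight cache lines of 64 bytes each, four threads.
    stream = list(range(0, 512, 64))
    for tid, part in enumerate(distribute_prefetches(stream, 4)):
        print(f"thread {tid} prefetches {part}")
```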
Capacity Maximizing Constellations
NASA Technical Reports Server (NTRS)
Barsoum, Maged; Jones, Christopher
2010-01-01
Some non-traditional signal constellations have been proposed for transmission of data over the Additive White Gaussian Noise (AWGN) channel using such channel-capacity-approaching codes as low-density parity-check (LDPC) or turbo codes. Computational simulations have shown performance gains of more than 1 dB over traditional constellations. These gains could be translated to bandwidth-efficient communications, variously, over longer distances, using less power, or using smaller antennas. The proposed constellations have been used in a bit-interleaved coded modulation system employing state-of-the-art LDPC codes. In computational simulations, these constellations were shown to afford performance gains over traditional constellations as predicted by the gap between the parallel decoding capacity of the constellations and the Gaussian capacity.
Parallelization of ARC3D with Computer-Aided Tools
NASA Technical Reports Server (NTRS)
Jin, Haoqiang; Hribar, Michelle; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
A series of efforts have been devoted to investigating methods of porting and parallelizing applications quickly and efficiently for new architectures, such as the SGI Origin 2000 and Cray T3E. This report presents the parallelization of a CFD application, ARC3D, using the computer-aided tools, CAPTools. Steps of parallelizing this code and requirements for achieving better performance are discussed. The generated parallel version has achieved reasonably good performance, for example, a speedup of 30 on 36 Cray T3E processors. However, this performance could not be obtained without modification of the original serial code. It is suggested that in many cases improving the serial code and performing necessary code transformations are important parts of the automated parallelization process, although user intervention in many of these parts is still necessary. Nevertheless, development and improvement of useful software tools, such as CAPTools, can help trim down many tedious parallelization details and improve the processing efficiency.
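The quoted figure of a speedup of 30 on 36 processors can be put in context with a little arithmetic. The sketch below computes the parallel efficiency and, assuming Amdahl's law describes the run (an assumption, not something the abstract states), the serial fraction implied by that speedup. The function names are illustrative.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors


def implied_serial_fraction(speedup, processors):
    """Serial fraction that Amdahl's law would need to give this speedup
    (the Karp-Flatt experimentally determined serial fraction)."""
    if processors < 2:
        raise ValueError("need at least 2 processors")
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


def amdahl_speedup(serial_fraction, processors):
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


efficiency = parallel_efficiency(30.0, 36)    # about 0.83
fraction = implied_serial_fraction(30.0, 36)  # a bit under 0.6 percent
```

Under that assumption, well under one percent of the work remaining serial is enough to account for the gap between 30 and the ideal 36.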
Thermoelectric pump performance analysis computer code
NASA Technical Reports Server (NTRS)
Johnson, J. L.
1973-01-01
A computer program is presented that was used to analyze and design dual-throat electromagnetic dc conduction pumps for the 5-kwe ZrH reactor thermoelectric system. In addition to a listing of the code and corresponding identification of symbols, the bases for this analytical model are provided.
ABINIT: Plane-Wave-Based Density-Functional Theory on High Performance Computers
NASA Astrophysics Data System (ADS)
Torrent, Marc
2014-03-01
For several years, a continuous effort has been made to adapt electronic structure codes based on Density-Functional Theory to future computing architectures. Among these codes, ABINIT is based on a plane-wave description of the wave functions, which allows it to treat systems of any kind. Porting such a code to petascale architectures poses difficulties related to the many-body nature of the DFT equations. To improve the performance of ABINIT, especially for standard LDA/GGA ground-state and response-function calculations, several strategies have been followed. A full multi-level MPI parallelisation scheme has been implemented, exploiting all possible levels and distributing both computation and memory. It allows the number of distributed processes to be increased and could not have been achieved without a strong restructuring of the code. The core algorithm used to solve the eigenproblem ("Locally Optimal Blocked Conjugate Gradient"), a Blocked-Davidson-like algorithm, is based on a distribution of processes combining plane-waves and bands. In addition to the distributed-memory parallelization, a full hybrid scheme has been implemented, using standard shared-memory directives (OpenMP/OpenACC) or porting some time-consuming code sections to Graphics Processing Units (GPU). As no simple performance model exists, the complexity of use has increased; the code efficiency strongly depends on the distribution of processes among the numerous levels. ABINIT is able to predict the performance of several process distributions and automatically choose the most favourable one. In addition, a big effort has been made to analyse the performance of the code on petascale architectures, showing which sections of code have to be improved; they are all related to matrix algebra (diagonalisation, orthogonalisation). The different strategies employed to improve the code scalability will be described. They are based on an exploration of new diagonalization algorithms, as well as the use of external optimized libraries. Part of this work has been supported by the European PRACE project (Partnership for Advanced Computing in Europe) in the framework of its work package 8.
A performance comparison of the Cray-2 and the Cray X-MP
NASA Technical Reports Server (NTRS)
Schmickley, Ronald; Bailey, David H.
1986-01-01
A suite of thirteen large Fortran benchmark codes was run on Cray-2 and Cray X-MP supercomputers. These codes were a mix of compute-intensive scientific application programs (mostly Computational Fluid Dynamics) and some special vectorized computation exercise programs. For the general class of programs tested on the Cray-2, most of which were not specially tuned for speed, the floating point operation rates varied under a variety of system load configurations from 40 percent up to 125 percent of X-MP performance rates. It is concluded that the Cray-2, in the original system configuration studied (without memory pseudo-banking), will run untuned Fortran code, on average, at about 70 percent of X-MP speeds.
Performance Analysis, Modeling and Scaling of HPC Applications and Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhatele, Abhinav
2016-01-13
Efficient use of supercomputers at DOE centers is vital for maximizing system throughput, minimizing energy costs and enabling science breakthroughs faster. This requires complementary efforts along several directions to optimize the performance of scientific simulation codes and the underlying runtimes and software stacks. This in turn requires providing scalable performance analysis tools and modeling techniques that can provide feedback to physicists and computer scientists developing the simulation codes and runtimes respectively. The PAMS project is using time allocations on supercomputers at ALCF, NERSC and OLCF to further the goals described above by performing research along the following fronts: 1. Scaling Study of HPC applications; 2. Evaluation of Programming Models; 3. Hardening of Performance Tools; 4. Performance Modeling of Irregular Codes; and 5. Statistical Analysis of Historical Performance Data. We are a team of computer and computational scientists funded by both DOE/NNSA and DOE/ASCR programs such as ECRP, XStack (Traleika Glacier, PIPER), ExaOSR (ARGO), SDMAV II (MONA) and PSAAP II (XPACC). This allocation will enable us to study big data issues when analyzing performance on leadership computing class systems and to assist the HPC community in making the most effective use of these resources.
The MCNP6 Analytic Criticality Benchmark Suite
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Forrest B.
2016-06-16
Analytical benchmarks provide an invaluable tool for verifying computer codes used to simulate neutron transport. Several collections of analytical benchmark problems [1-4] are used routinely in the verification of production Monte Carlo codes such as MCNP® [5,6]. Verification of a computer code is a necessary prerequisite to the more complex validation process. The verification process confirms that a code performs its intended functions correctly. The validation process involves determining the absolute accuracy of code results vs. nature. In typical validations, results are computed for a set of benchmark experiments using a particular methodology (code, cross-section data with uncertainties, and modeling) and compared to the measured results from the set of benchmark experiments. The validation process determines bias, bias uncertainty, and possibly additional margins. Verification is generally performed by the code developers, while validation is generally performed by code users for a particular application space. The VERIFICATION_KEFF suite of criticality problems [1,2] was originally a set of 75 criticality problems found in the literature for which exact analytical solutions are available. Even though the spatial and energy detail is necessarily limited in analytical benchmarks, typically to a few regions or energy groups, the exact solutions obtained can be used to verify that the basic algorithms, mathematics, and methods used in complex production codes perform correctly. The present work has focused on revisiting this benchmark suite. A thorough review of the problems resulted in discarding some of them as not suitable for MCNP benchmarking. For the remaining problems, many of them were reformulated to permit execution in either multigroup mode or in the normal continuous-energy mode for MCNP. Execution of the benchmarks in continuous-energy mode provides a significant advance to MCNP verification methods.
Real-time computer treatment of THz passive device images with the high image quality
NASA Astrophysics Data System (ADS)
Trofimov, Vyacheslav A.; Trofimov, Vladislav V.
2012-06-01
We demonstrate real-time computer code that significantly improves the quality of images captured by a passive THz imaging system. The code is not designed for one THz passive device only: it can be applied to any such device, and to active THz imaging systems as well. We applied our code to the computer processing of images captured by four passive THz imaging devices manufactured by different companies. It should be stressed that processing images produced by devices from different companies usually requires different spatial filters. The performance of the current version of the computer code is greater than one image per second for a THz image having more than 5000 pixels and 24-bit number representation. Processing a single THz image simultaneously produces about 20 images corresponding to various spatial filters. The computer code allows the number of pixels in processed images to be increased without noticeable reduction of image quality. The performance of the computer code can be increased many times by using parallel algorithms for processing the image. We develop original spatial filters which allow one to see objects with sizes less than 2 cm. The imagery is produced by passive THz imaging devices which captured images of objects hidden under opaque clothes. For images with high noise we develop an approach which suppresses the noise after computer processing, yielding a good-quality image. To illustrate the efficiency of the developed approach we demonstrate the detection of liquid explosive, ordinary explosive, a knife, a pistol, a metal plate, a CD, ceramics, chocolate and other objects hidden under opaque clothes. The results demonstrate the high efficiency of our approach for the detection of hidden objects and are a very promising solution for the security problem.
Improved neutron activation prediction code system development
NASA Technical Reports Server (NTRS)
Saqui, R. M.
1971-01-01
Two integrated neutron activation prediction code systems have been developed by modifying and integrating existing computer programs to perform the necessary computations to determine neutron induced activation gamma ray doses and dose rates in complex geometries. Each of the two systems is comprised of three computational modules. The first program module computes the spatial and energy distribution of the neutron flux from an input source and prepares input data for the second program which performs the reaction rate, decay chain and activation gamma source calculations. A third module then accepts input prepared by the second program to compute the cumulative gamma doses and/or dose rates at specified detector locations in complex, three-dimensional geometries.
Elliptical orbit performance computer program
NASA Technical Reports Server (NTRS)
Myler, T. R.
1981-01-01
A FORTRAN coded computer program which generates and plots elliptical orbit performance capability of space boosters for presentation purposes is described. Orbital performance capability of space boosters is typically presented as payload weight as a function of perigee and apogee altitudes. The parameters are derived from a parametric computer simulation of the booster flight which yields the payload weight as a function of velocity and altitude at insertion. The process of converting from velocity and altitude to apogee and perigee altitude and plotting the results as a function of payload weight is mechanized with the ELOPE program. The program theory, user instruction, input/output definitions, subroutine descriptions and detailed FORTRAN coding information are included.
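The conversion from insertion velocity and altitude to apogee and perigee altitudes can be sketched with two-body (Keplerian) relations. This is a minimal illustration, not the ELOPE implementation: it assumes a spherical Earth with the constants shown, and it adds a flight-path-angle argument (zero by default) that the abstract does not mention.

```python
import math

MU_EARTH_KM3_S2 = 398600.4418  # Earth's gravitational parameter
R_EARTH_KM = 6378.137          # radius used as the altitude datum (assumption)


def apsis_altitudes(speed_km_s, altitude_km, flight_path_angle_deg=0.0):
    """Return (perigee_altitude_km, apogee_altitude_km) for an insertion state."""
    r = R_EARTH_KM + altitude_km
    # Specific orbital energy from the vis-viva relation.
    energy = 0.5 * speed_km_s ** 2 - MU_EARTH_KM3_S2 / r
    if energy >= 0.0:
        raise ValueError("insertion state is not on a closed (elliptical) orbit")
    semi_major = -MU_EARTH_KM3_S2 / (2.0 * energy)
    # Specific angular momentum; the angle is measured from the local horizontal.
    h = r * speed_km_s * math.cos(math.radians(flight_path_angle_deg))
    ecc = math.sqrt(max(0.0, 1.0 - h * h / (MU_EARTH_KM3_S2 * semi_major)))
    perigee = semi_major * (1.0 - ecc) - R_EARTH_KM
    apogee = semi_major * (1.0 + ecc) - R_EARTH_KM
    return perigee, apogee


# Hypothetical insertion state: 8.0 km/s, horizontal, at 200 km altitude.
perigee_km, apogee_km = apsis_altitudes(8.0, 200.0)
```

Evaluating this over a grid of insertion states, each paired with its payload weight from the booster simulation, would give the kind of table the program plots.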
Approximate maximum likelihood decoding of block codes
NASA Technical Reports Server (NTRS)
Greenberger, H. J.
1979-01-01
Approximate maximum likelihood decoding algorithms, based upon selecting a small set of candidate code words with the aid of the estimated probability of error of each received symbol, can give performance close to optimum with a reasonable amount of computation. By combining the best features of various algorithms and taking care to perform each step as efficiently as possible, a decoding scheme was developed which can decode codes which have better performance than those presently in use and yet not require an unreasonable amount of computation. The discussion of the details and tradeoffs of presently known efficient optimum and near optimum decoding algorithms leads, naturally, to the one which embodies the best features of all of them.
Computer code for the optimization of performance parameters of mixed explosive formulations.
Muthurajan, H; Sivabalan, R; Talawar, M B; Venugopalan, S; Gandhe, B R
2006-08-25
LOTUSES is a novel computer code, which has been developed for the prediction of various thermodynamic properties such as heat of formation, heat of explosion, volume of explosion gaseous products and other related performance parameters. In this paper, we report the LOTUSES (Version 1.4) code, which has been utilized for the optimization of various high explosives in different combinations to obtain the maximum possible velocity of detonation. The LOTUSES (Version 1.4) code varies the composition of mixed explosives automatically in the range of 1-100% and computes the oxygen balance as well as the velocity of detonation for various compositions in preset steps. Further, the code suggests the compositions for which the least oxygen balance and the higher velocity of detonation could be achieved. Presently, the code can be applied to two-component explosive compositions. The code has been validated with well-known explosives such as TNT, HNS, HNF, TATB, RDX, HMX, AN, DNA, CL-20 and TNAZ in different combinations. The new algorithm incorporated in LOTUSES (Version 1.4) enhances the efficiency and makes it a more powerful tool for the scientists/researchers working in the field of high energy materials/hazardous materials.
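The composition sweep can be illustrated for the oxygen-balance part only. The sketch below uses the standard oxygen-balance formula for a CHNO compound, OB% = -1600 (2C + H/2 - O) / M, and a mass-weighted average for a two-component mixture. It is not the LOTUSES algorithm: the function names are hypothetical, "least oxygen balance" is read here as closest to zero (an assumption), and velocity of detonation is left out because the abstract gives no formula for it.

```python
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula):
    """Molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def oxygen_balance(formula):
    """Oxygen balance in percent (to CO2 and H2O) for a CHNO compound."""
    c = formula.get("C", 0)
    h = formula.get("H", 0)
    o = formula.get("O", 0)
    return -1600.0 * (2 * c + h / 2.0 - o) / molar_mass(formula)


def sweep_two_component(formula_a, formula_b, step_percent=1):
    """List (mass percent of A, mixture oxygen balance) in preset steps."""
    ob_a, ob_b = oxygen_balance(formula_a), oxygen_balance(formula_b)
    rows = []
    for pct_a in range(step_percent, 100, step_percent):
        w = pct_a / 100.0
        rows.append((pct_a, w * ob_a + (1.0 - w) * ob_b))
    return rows


def closest_to_zero(rows):
    """Pick the composition whose oxygen balance is nearest zero."""
    return min(rows, key=lambda row: abs(row[1]))


TNT = {"C": 7, "H": 5, "N": 3, "O": 6}
AN = {"H": 4, "N": 2, "O": 3}  # ammonium nitrate, NH4NO3
best_pct_tnt, best_ob = closest_to_zero(sweep_two_component(TNT, AN))
```

For the TNT/AN pair the sweep lands near one part TNT to four parts AN by mass, where the strongly negative balance of TNT is offset by the positive balance of AN.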
Analysis of JSI TRIGA MARK II reactor physical parameters calculated with TRIPOLI and MCNP.
Henry, R; Tiselj, I; Snoj, L
2015-03-01
A new computational model of the JSI TRIGA Mark II research reactor was built for the TRIPOLI computer code and compared with the existing MCNP code model. The same modelling assumptions were used in order to check the differences between the mathematical models of both Monte Carlo codes. Differences between the TRIPOLI and MCNP predictions of keff were up to 100 pcm. Further validation was performed with analyses of the normalized reaction rates and computations of kinetic parameters for various core configurations. Copyright © 2014 Elsevier Ltd. All rights reserved.
A comparison of two central difference schemes for solving the Navier-Stokes equations
NASA Technical Reports Server (NTRS)
Maksymiuk, C. M.; Swanson, R. C.; Pulliam, T. H.
1990-01-01
Five viscous transonic airfoil cases were computed by two significantly different computational fluid dynamics codes: an explicit finite-volume algorithm with multigrid, and an implicit finite-difference approximate-factorization method with eigenvector diagonalization. Both methods are described in detail, and their performance on the test cases is compared. The codes utilized the same grids, turbulence model, and computer to provide the truest test of the algorithms. The two approaches produce very similar results, which, for attached flows, also agree well with experimental results; however, the explicit code is considerably faster.
NASA Technical Reports Server (NTRS)
Rutishauser, David
2006-01-01
The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. 
This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters that attempts to minimize execution time, while staying within resource constraints. The flexibility of using a custom reconfigurable implementation is exploited in a unique manner to leverage the lessons learned in vector supercomputer development. The vector processing framework is tailored to the application, with variable parameters that are fixed in traditional vector processing. Benchmark data that demonstrates the functionality and utility of the approach is presented. The benchmark data includes an identified bottleneck in a real case study example vector code, the NASA Langley Terminal Area Simulation System (TASS) application.
NASA Technical Reports Server (NTRS)
Norment, H. G.
1980-01-01
Calculations can be performed for any atmospheric conditions and for all water drop sizes, from the smallest cloud droplet to large raindrops. Any subsonic, external, non-lifting flow can be accommodated; flow into, but not through, inlets also can be simulated. Experimental water drop drag relations are used in the water drop equations of motion and effects of gravity settling are included. Seven codes are described: (1) a code used to debug and plot body surface description data; (2) a code that processes the body surface data to yield the potential flow field; (3) a code that computes flow velocities at arrays of points in space; (4) a code that computes water drop trajectories from an array of points in space; (5) a code that computes water drop trajectories and fluxes to arbitrary target points; (6) a code that computes water drop trajectories tangent to the body; and (7) a code that produces stereo pair plots which include both the body and trajectories. Code descriptions include operating instructions, card inputs and printouts for example problems, and listing of the FORTRAN codes. Accuracy of the calculations is discussed, and trajectory calculation results are compared with prior calculations and with experimental data.
Utilizing GPUs to Accelerate Turbomachinery CFD Codes
NASA Technical Reports Server (NTRS)
MacCalla, Weylin; Kulkarni, Sameer
2016-01-01
GPU computing has established itself as a way to accelerate parallel codes in the high performance computing world. This work focuses on speeding up APNASA, a legacy CFD code used at NASA Glenn Research Center, while also drawing conclusions about the nature of GPU computing and the requirements to make GPGPU worthwhile on legacy codes. Rewriting and restructuring of the source code was avoided to limit the introduction of new bugs. The code was profiled and investigated for parallelization potential, then OpenACC directives were used to indicate parallel parts of the code. The use of OpenACC directives was not able to reduce the runtime of APNASA on either the NVIDIA Tesla discrete graphics card, or the AMD accelerated processing unit. Additionally, it was found that in order to justify the use of GPGPU, the amount of parallel work being done within a kernel would have to greatly exceed the work being done by any one portion of the APNASA code. It was determined that in order for an application like APNASA to be accelerated on the GPU, it should not be modular in nature, and the parallel portions of the code must contain a large portion of the code's computation time.
Review and verification of CARE 3 mathematical model and code
NASA Technical Reports Server (NTRS)
Rose, D. M.; Altschul, R. E.; Manke, J. W.; Nelson, D. L.
1983-01-01
The CARE-III mathematical model and code verification performed by Boeing Computer Services were documented. The mathematical model was verified for permanent and intermittent faults. The transient fault model was not addressed. The code verification was performed on CARE-III, Version 3. A CARE III Version 4, which corrects deficiencies identified in Version 3, is being developed.
The TeraShake Computational Platform for Large-Scale Earthquake Simulations
NASA Astrophysics Data System (ADS)
Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas
Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, well-validated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulation phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM's BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.
Dongarra, Jack; Heroux, Michael A.; Luszczek, Piotr
2015-08-17
Here, we describe a new high-performance conjugate-gradient (HPCG) benchmark. HPCG is composed of computations and data-access patterns commonly found in scientific applications. HPCG strives for a better correlation to existing codes from the computational science domain and to be representative of their performance. Furthermore, HPCG is meant to help drive the computer system design and implementation in directions that will better impact future performance improvement.
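For readers unfamiliar with the method the benchmark is named after, the sketch below shows a plain conjugate-gradient iteration. It is a simplified stand-in, not the HPCG benchmark: it is unpreconditioned, written in pure Python, and applied to a small one-dimensional Laplacian chosen only for illustration. The sparse matrix-vector product, dot products and vector updates it contains are the kinds of computation and data-access patterns the abstract refers to.

```python
def laplacian_1d_matvec(x):
    """Apply the tridiagonal (-1, 2, -1) matrix to x without storing it."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] = 2.0 * x[i]
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A given only matvec."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the zero initial guess
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    for iteration in range(max_iter):
        if rs_old ** 0.5 <= tol:
            return x, iteration
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


# Hypothetical test problem: right-hand side built from a known solution of ones.
rhs = laplacian_1d_matvec([1.0] * 50)
solution, iterations = conjugate_gradient(laplacian_1d_matvec, rhs)
```

Each iteration is dominated by memory traffic rather than arithmetic, which is why a benchmark built on this pattern stresses a machine differently from a dense linear-algebra test.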
Review of numerical models to predict cooling tower performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, B.M.; Nomura, K.K.; Bartz, J.A.
1987-01-01
Four state-of-the-art computer models developed to predict the thermal performance of evaporative cooling towers are summarized. The formulation of these models, STAR and TEFERI (developed in Europe) and FACTS and VERA2D (developed in the U.S.), is summarized. A fifth code, based on Merkel analysis, is also discussed. Principal features of the codes, computation time and storage requirements are described. A discussion of model validation is also provided.
Computational Particle Dynamic Simulations on Multicore Processors (CPDMu) Final Report Phase I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schmalz, Mark S
2011-07-24
Statement of Problem - Department of Energy has many legacy codes for simulation of computational particle dynamics and computational fluid dynamics applications that are designed to run on sequential processors and are not easily parallelized. Emerging high-performance computing architectures employ massively parallel multicore architectures (e.g., graphics processing units) to increase throughput. Parallelization of legacy simulation codes is a high priority, to achieve compatibility, efficiency, accuracy, and extensibility. General Statement of Solution - A legacy simulation application designed for implementation on mainly-sequential processors has been represented as a graph $G$. Mathematical transformations, applied to $G$, produce a graph representation $\underline{G}$ for a high-performance architecture. Key computational and data movement kernels of the application were analyzed/optimized for parallel execution using the mapping $G \to \underline{G}$, which can be performed semi-automatically. This approach is widely applicable to many types of high-performance computing systems, such as graphics processing units or clusters comprised of nodes that contain one or more such units. Phase I Accomplishments - Phase I research decomposed/profiled computational particle dynamics simulation code for rocket fuel combustion into low and high computational cost regions (respectively, mainly sequential and mainly parallel kernels), with analysis of space and time complexity. Using the research team's expertise in algorithm-to-architecture mappings, the high-cost kernels were transformed, parallelized, and implemented on Nvidia Fermi GPUs. Measured speedups (GPU with respect to single-core CPU) were approximately 20-32X for realistic model parameters, without final optimization. Error analysis showed no loss of computational accuracy. 
Commercial Applications and Other Benefits - The proposed research will constitute a breakthrough in solution of problems related to efficient parallel computation of particle and fluid dynamics simulations. These problems occur throughout DOE, military and commercial sectors: the potential payoff is high. We plan to license or sell the solution to contractors for military and domestic applications such as disaster simulation (aerodynamic and hydrodynamic), Government agencies (hydrological and environmental simulations), and medical applications (e.g., in tomographic image reconstruction). Keywords - High-performance Computing, Graphic Processing Unit, Fluid/Particle Simulation. Summary for Members of Congress - Department of Energy has many simulation codes that must compute faster, to be effective. The Phase I research parallelized particle/fluid simulations for rocket combustion, for high-performance computing systems.
CFD Modeling of Free-Piston Stirling Engines
NASA Technical Reports Server (NTRS)
Ibrahim, Mounir B.; Zhang, Zhi-Guo; Tew, Roy C., Jr.; Gedeon, David; Simon, Terrence W.
2001-01-01
NASA Glenn Research Center (GRC) is funding Cleveland State University (CSU) to develop a reliable Computational Fluid Dynamics (CFD) code that can predict engine performance with the goal of significant improvements in accuracy when compared to one-dimensional (1-D) design code predictions. The funding also includes conducting code validation experiments at both the University of Minnesota (UMN) and CSU. In this paper a brief description of the work-in-progress is provided in the two areas (CFD and Experiments). Also, previous test results are compared with computational data obtained using (1) a 2-D CFD code obtained from Dr. Georg Scheuerer and further developed at CSU and (2) a multidimensional commercial code CFD-ACE+. The test data and computational results are for (1) a gas spring and (2) a single piston/cylinder with attached annular heat exchanger. The comparisons among the codes are discussed. The paper also discusses plans for conducting code validation experiments at CSU and UMN.
NASA Technical Reports Server (NTRS)
1991-01-01
The technical effort and computer code enhancements performed during the sixth year of the Probabilistic Structural Analysis Methods program are summarized. Various capabilities are described to probabilistically combine structural response and structural resistance to compute component reliability. A library of structural resistance models is implemented in the Numerical Evaluations of Stochastic Structures Under Stress (NESSUS) code that included fatigue, fracture, creep, multi-factor interaction, and other important effects. In addition, a user interface was developed for user-defined resistance models. An accurate and efficient reliability method was developed and was successfully implemented in the NESSUS code to compute component reliability based on user-selected response and resistance models. A risk module was developed to compute component risk with respect to cost, performance, or user-defined criteria. The new component risk assessment capabilities were validated and demonstrated using several examples. Various supporting methodologies were also developed in support of component risk assessment.
Improved Boundary Layer Module (BLM) for the Solid Performance Program (SPP)
NASA Astrophysics Data System (ADS)
Coats, D. E.; Cebeci, T.
1982-03-01
The requirements for a replacement to the Bartz boundary layer code, the standard method of computing the performance loss due to viscous effects by the solid performance program, were discussed by the propulsion community along with four nationally recognized boundary layer experts. A consensus was reached regarding the preferred features for the analysis of the replacement code. The major points that were agreed upon are: (1) finite difference methods are preferred over integral methods; (2) a single equation eddy viscosity model was considered to be adequate for the purpose of computing performance loss; (3) a variable grid capability in both coordinate directions would be required; (4) a proven finite difference algorithm which is not stability restricted should be used, that is, an implicit numerical scheme would be required; and (5) the replacement code should be able to compute both turbulent and laminar flows. The program should treat mass addition at the wall as well as being able to calculate a stagnation point starting line.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klasky, Marc Louis; Myers, Steven Charles; James, Michael R.
To facilitate the timely execution of System Threat Reviews (STRs) for DNDO, and also to develop a methodology for performing STRs, LANL performed comparisons of several radiation transport codes (MCNP, GADRAS, and Gamma-Designer) that have been previously utilized to compute radiation signatures. While each of these codes has strengths, it is of paramount interest to determine the limitations of each of the respective codes and also to identify the most time-efficient means by which to produce computational results, given the large number of parametric cases that are anticipated in performing STRs. These comparisons serve to identify regions of applicability for each code and provide estimates of uncertainty that may be anticipated. Furthermore, while performing these comparisons, the sensitivity of the results to modeling assumptions was also examined. These investigations serve to enable the creation of the LANL methodology for performing STRs. Given the wide variety of radiation test sources, scenarios, and detectors, LANL calculated comparisons of the following parameters: decay data, multiplicity, device (n,γ) leakages, and radiation transport through representative scenes and shielding. This investigation was performed to understand potential limitations utilizing specific codes for different aspects of the STR challenges.
Parallel Scaling Characteristics of Selected NERSC User ProjectCodes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skinner, David; Verdier, Francesca; Anand, Harsh
This report documents parallel scaling characteristics of NERSC user project codes between Fiscal Year 2003 and the first half of Fiscal Year 2004 (Oct 2002-March 2004). The codes analyzed cover 60% of all the CPU hours delivered during that time frame on seaborg, a 6080 CPU IBM SP and the largest parallel computer at NERSC. The scale in terms of concurrency and problem size of the workload is analyzed. Drawing on batch queue logs, performance data and feedback from researchers we detail the motivations, benefits, and challenges of implementing highly parallel scientific codes on current NERSC High Performance Computing systems. An evaluation and outlook of the NERSC workload for Allocation Year 2005 is presented.
Visual Computing Environment Workshop
NASA Technical Reports Server (NTRS)
Lawrence, Charles (Compiler)
1998-01-01
The Visual Computing Environment (VCE) is a framework for intercomponent and multidisciplinary computational simulations. Many current engineering analysis codes simulate various aspects of aircraft engine operation. For example, existing computational fluid dynamics (CFD) codes can model the airflow through individual engine components such as the inlet, compressor, combustor, turbine, or nozzle. Currently, these codes are run in isolation, making intercomponent and complete system simulations very difficult to perform. In addition, management and utilization of these engineering codes for coupled component simulations is a complex, laborious task, requiring substantial experience and effort. To facilitate multicomponent aircraft engine analysis, the CFD Research Corporation (CFDRC) is developing the VCE system. This system, which is part of NASA's Numerical Propulsion Simulation System (NPSS) program, can couple various engineering disciplines, such as CFD, structural analysis, and thermal analysis.
Analysis of a Distributed Pulse Power System Using a Circuit Analysis Code
1979-06-01
dose rate was then integrated to give a number that could be compared with measurements made using thermal luminescent dosimeters (TLD's). Since...NM 87117 AND THE BDM CORPORATION, ALBUQUERQUE, NM 87106 Abstract A sophisticated computer code (SCEPTRE), used to analyze electronic circuits...computer code (SCEPTRE), used to analyze electronic circuits, was used to evaluate the performance of a large flash X-ray machine. This device was
NASA Astrophysics Data System (ADS)
Burnett, W.
2016-12-01
The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the scalability of the Navy's Hybrid Coordinate Ocean Model (HYCOM), by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability. 
The DSRC will provide the DoD with future computing assets to initially operate the N-ESPC in 2019. This talk will further describe how DoD's HPCMP will ensure N-ESPC becomes operational, efficiently and effectively, using next-generation high performance computing.
Error Correction using Quantum Quasi-Cyclic Low-Density Parity-Check (LDPC) Codes
NASA Astrophysics Data System (ADS)
Jing, Lin; Brun, Todd; Quantum Research Team
Quasi-cyclic LDPC codes can approach the Shannon capacity and have efficient decoders. Hagiwara et al. (2007) presented a method to calculate parity check matrices with high girth. Two distinct, orthogonal matrices Hc and Hd are used. Using submatrices obtained from Hc and Hd by deleting rows, we can alter the code rate. The submatrix of Hc is used to correct Pauli X errors, and the submatrix of Hd to correct Pauli Z errors. We simulated this system for depolarizing noise on USC's High Performance Computing Cluster, and obtained the block error rate (BER) as a function of the error weight and code rate. From the rates of uncorrectable errors under different error weights we can extrapolate the BER to any small error probability. Our results show that this code family can perform reasonably well even at high code rates, thus considerably reducing the overhead compared to concatenated and surface codes. This makes these codes promising as storage blocks in fault-tolerant quantum computation.
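The role of orthogonality and row deletion in the abstract above can be illustrated with a small sketch. It assumes the standard CSS construction, in which the two check matrices must satisfy Hc·Hd^T = 0 over GF(2) and the number of encoded qubits is k = n − rank(Hc) − rank(Hd). The matrices used here are a stand-in (the Hamming [7,4] parity-check matrix, which gives the Steane code), not the quasi-cyclic matrices of the work itself, and the function names are hypothetical.

```python
import numpy as np

def gf2_rank(m):
    """Rank of a binary matrix over GF(2), by Gaussian elimination."""
    m = np.array(m, dtype=np.uint8) % 2
    rows, cols = m.shape
    rank = 0
    for col in range(cols):
        pivots = [r for r in range(rank, rows) if m[r, col]]
        if not pivots:
            continue
        p = pivots[0]
        m[[rank, p]] = m[[p, rank]]          # move the pivot row into place
        for r in range(rows):
            if r != rank and m[r, col]:
                m[r] ^= m[rank]              # clear the column elsewhere
        rank += 1
        if rank == rows:
            break
    return rank

def css_parameters(hc, hd):
    """Return (n, k) for a CSS code built from check matrices hc and hd."""
    hc = np.asarray(hc, dtype=int)
    hd = np.asarray(hd, dtype=int)
    if np.any((hc @ hd.T) % 2):
        raise ValueError("Hc and Hd are not orthogonal over GF(2)")
    n = hc.shape[1]
    return n, n - gf2_rank(hc) - gf2_rank(hd)

# Stand-in matrices (assumption): Hamming [7,4] checks used for both Hc and Hd.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])

full = css_parameters(H, H)                  # all rows kept
reduced = css_parameters(H[:-1], H[:-1])     # one row deleted from each
print(full, reduced)
```

Deleting rows cannot break orthogonality, because every remaining row of one matrix is still orthogonal to every remaining row of the other; it only removes checks, so k (and hence the rate k/n) goes up while the error-correcting power generally goes down.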
ERIC Educational Resources Information Center
Garneli, Varvara; Chorianopoulos, Konstantinos
2018-01-01
Various aspects of computational thinking (CT) could be supported by educational contexts such as simulations and video-games construction. In this field study, potential differences in student motivation and learning were empirically examined through students' code. For this purpose, we performed a teaching intervention that took place over five…
NASA Technical Reports Server (NTRS)
Veres, Joseph P.
2002-01-01
A high-fidelity simulation of a commercial turbofan engine has been created as part of the Numerical Propulsion System Simulation Project. The high-fidelity computer simulation utilizes computer models that were developed at NASA Glenn Research Center in cooperation with turbofan engine manufacturers. The average-passage (APNASA) Navier-Stokes based viscous flow computer code is used to simulate the 3D flow in the compressors and turbines of the advanced commercial turbofan engine. The 3D National Combustion Code (NCC) is used to simulate the flow and chemistry in the advanced aircraft combustor. The APNASA turbomachinery code and the NCC combustor code exchange boundary conditions at the interface planes at the combustor inlet and exit. This computer simulation technique can evaluate engine performance at steady operating conditions. The 3D flow models provide detailed knowledge of the airflow within the fan and compressor, the high and low pressure turbines, and the flow and chemistry within the combustor. The models simulate the performance of the engine at operating conditions that include sea level takeoff and the altitude cruise condition.
Fast Sparse Coding for Range Data Denoising with Sparse Ridges Constraint
Lao, Mingjie; Sang, Yongsheng; Wen, Fei; Zhai, Ruifang
2018-01-01
Light detection and ranging (LiDAR) sensors have been widely deployed on intelligent systems such as unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs) to perform localization, obstacle detection, and navigation tasks. Thus, research into range data processing with competitive performance in terms of both accuracy and efficiency has attracted increasing attention. Sparse coding has revolutionized signal processing and led to state-of-the-art performance in a variety of applications. However, dictionary learning, which plays the central role in sparse coding techniques, is computationally demanding, resulting in its limited applicability in real-time systems. In this study, we propose sparse coding algorithms with a fixed pre-learned ridge dictionary to realize range data denoising via leveraging the regularity of laser range measurements in man-made environments. Experiments on both synthesized data and real data demonstrate that our method obtains accuracy comparable to that of sophisticated sparse coding methods, but with much higher computational efficiency. PMID:29734793
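To make the idea of sparse coding against a fixed, pre-learned dictionary concrete, here is a minimal sketch. It uses ISTA (iterative soft-thresholding) for the lasso problem, which is a generic solver chosen for illustration and not necessarily the algorithm proposed in the paper; the ridge dictionary itself is not reproduced, so `D` is a placeholder for any fixed dictionary and `lam` is an assumed regularization weight.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise shrinkage operator used by ISTA."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(D, y, lam, n_iter=200):
    """Approximately minimize 0.5*||y - D a||^2 + lam*||a||_1 with D held fixed."""
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)         # gradient of the quadratic data term
        a = soft_threshold(a - grad / L, lam / L)
    return a

def denoise(D, y, lam):
    """Reconstruct the signal from its sparse code over the fixed dictionary."""
    return D @ ista(D, y, lam)

# Sanity check with an orthonormal dictionary, where the answer is known in
# closed form: the code is the soft-thresholded signal.
code = ista(np.eye(3), np.array([3.0, 0.5, -2.0]), lam=1.0)
print(code)
```

Because the dictionary is never updated, each new range scan costs only the sparse-coding iterations; the expensive dictionary-learning step is paid once, offline, which is the efficiency argument the abstract makes.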
Dexter - A one-dimensional code for calculating thermionic performance of long converters.
NASA Technical Reports Server (NTRS)
Sawyer, C. D.
1971-01-01
This paper describes a versatile code for computing the coupled thermionic electric-thermal performance of long thermionic converters in which the temperature and voltage variations cannot be neglected. The code is capable of accounting for a variety of external electrical connection schemes, coolant flow paths and converter failures by partial shorting. Example problem solutions are given.
NASA Technical Reports Server (NTRS)
Ryer, M. J.
1978-01-01
HAL/S is a computer programming language; it is a representation for algorithms which can be interpreted by either a person or a computer. HAL/S compilers transform blocks of HAL/S code into machine language which can then be directly executed by a computer. When the machine language is executed, the algorithm specified by the HAL/S code (source) is performed. This document describes how to read and write HAL/S source.
Fingerprinting Communication and Computation on HPC Machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peisert, Sean
2010-06-02
How do we identify what is actually running on high-performance computing systems? Names of binaries, dynamic libraries loaded, or other elements in a submission to a batch queue can give clues, but binary names can be changed, and libraries provide limited insight and resolution on the code being run. In this paper, we present a method for "fingerprinting" code running on HPC machines using elements of communication and computation. We then discuss how that fingerprint can be used to determine if the code is consistent with certain other types of codes, what a user usually runs, or what the user requested an allocation to do. In some cases, our techniques enable us to fingerprint HPC codes using runtime MPI data with a high degree of accuracy.
NASA Astrophysics Data System (ADS)
Bird, Robert; Nystrom, David; Albright, Brian
2017-10-01
The ability of scientific simulations to effectively deliver performant computation is increasingly being challenged by successive generations of high-performance computing architectures. Code development to support efficient computation on these modern architectures is both expensive and highly complex; if it is approached without due care, it may also not be directly transferable between subsequent hardware generations. Previous works have discussed techniques to support the process of adapting a legacy code for modern hardware generations, but despite the breakthroughs in the areas of mini-app development, portable performance, and cache-oblivious algorithms, the problem remains largely unsolved. In this work we demonstrate how a focus on platform-agnostic modern code development can be applied to Particle-in-Cell (PIC) simulations to facilitate effective scientific delivery. This work builds directly on our previous work optimizing VPIC, in which we replaced intrinsic-based vectorization with compiler-generated auto-vectorization to improve the performance and portability of VPIC. In this work we present the use of a specialized SIMD queue for processing some particle operations, and also preview a GPU-capable OpenMP variant of VPIC. Finally, we include a discussion of lessons learnt. Work performed under the auspices of the U.S. Dept. of Energy by the Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by the LANL LDRD program.
Performance of convolutional codes on fading channels typical of planetary entry missions
NASA Technical Reports Server (NTRS)
Modestino, J. W.; Mui, S. Y.; Reale, T. J.
1974-01-01
The performance of convolutional codes in fading channels typical of the planetary entry channel is examined in detail. The signal fading is due primarily to turbulent atmospheric scattering of the RF signal transmitted from an entry probe through a planetary atmosphere. Short constraint length convolutional codes are considered in conjunction with binary phase-shift keyed modulation and Viterbi maximum likelihood decoding, and for longer constraint length codes sequential decoding utilizing both the Fano and Zigangirov-Jelinek (ZJ) algorithms is considered. Careful consideration is given to the modeling of the channel in terms of a few meaningful parameters which can be correlated closely with theoretical propagation studies. For short constraint length codes the bit error probability performance was investigated as a function of E_b/N_0, parameterized by the fading channel parameters. For longer constraint length codes the effect of the fading channel parameters on the computational requirements of both the Fano and ZJ algorithms was examined. The effects of simple block interleaving in combatting the memory of the channel are explored, using the analytic approach or digital computer simulation.
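The "simple block interleaving" mentioned above can be sketched as follows. This is the textbook row-in, column-out block interleaver, shown as an illustration under assumed dimensions (3 rows by 4 columns); the interleaver depth and span used in the study are not given in the abstract.

```python
def interleave(symbols, rows, cols):
    """Write symbols row by row into a rows x cols array, read column by column."""
    if len(symbols) != rows * cols:
        raise ValueError("block must hold exactly rows * cols symbols")
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]

def deinterleave(symbols, rows, cols):
    """Inverse of interleave: restore the original symbol order."""
    if len(symbols) != rows * cols:
        raise ValueError("block must hold exactly rows * cols symbols")
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]

# Use the symbol's original index as its value so positions are easy to track.
data = list(range(12))
sent = interleave(data, rows=3, cols=4)

# A fade corrupts three consecutive channel positions (assumed burst).
burst_positions = [3, 4, 5]
hit_symbols = sorted(sent[i] for i in burst_positions)

print(sent)          # transmission order
print(hit_symbols)   # original indices of the corrupted symbols
```

After de-interleaving, the three corrupted symbols sit four positions apart in the decoder's input instead of side by side, so a decoder that assumes roughly independent errors (such as a Viterbi decoder) sees the burst as isolated errors.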
NASA Technical Reports Server (NTRS)
Schmidt, James F.
1995-01-01
An off-design axial-flow compressor code is presented and is available from COSMIC for predicting the aerodynamic performance maps of fans and compressors. Steady axisymmetric flow is assumed and the aerodynamic solution reduces to solving the two-dimensional flow field in the meridional plane. A streamline curvature method is used for calculating this flow field outside the blade rows. This code allows for bleed flows and the first five stators can be reset for each rotational speed, capabilities which are necessary for large multistage compressors. The accuracy of the off-design performance predictions depends upon the validity of the flow loss and deviation correlation models. These empirical correlations for the flow loss and deviation are used to model the real flow effects and the off-design code will compute through small reverse flow regions. The input to this off-design code is fully described and a user's example case for a two-stage fan is included with complete input and output data sets. Also, a comparison of the off-design code predictions with experimental data is included which generally shows good agreement.
Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Carter, Jonathan; Shalf, John; Skinner, David; Ethier, Stephane; Biswas, Rupak; Djomehri, Jahed; VanderWijngaart, Rob
2003-01-01
The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple optimizations. Finally, we evaluate the performance of several numerical codes from key scientific computing domains. Overall results demonstrate that the SX6 achieves high performance on a large fraction of our application suite and in many cases significantly outperforms the RISC-based architectures. However, certain classes of applications are not easily amenable to vectorization and would likely require extensive reengineering of both algorithm and implementation to utilize the SX6 effectively.
NASA Technical Reports Server (NTRS)
Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry
1998-01-01
Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using some parallelization tools and compilers. In this paper, we compare the performance of the hand-written NAS Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools, an interactive computer aided parallelization tool that generates message passing code; 2) the Portland Group's HPF compiler; and 3) compiler directives with the native FORTRAN 77 compiler on the SGI Origin2000.
Power-on performance predictions for a complete generic hypersonic vehicle configuration
NASA Technical Reports Server (NTRS)
Bennett, Bradford C.
1991-01-01
The Compressible Navier-Stokes (CNS) code was developed to compute external hypersonic flow fields. It has been applied to various hypersonic external flow applications. Here, the CNS code was modified to compute hypersonic internal flow fields. Calculations were performed on a Mach 18 sidewall compression inlet and on the Lewis Mach 5 inlet. The use of the ARC3D diagonal algorithm was evaluated for internal flows on the Mach 5 inlet flow. The initial modifications to the CNS code involved generalization of the boundary conditions and the addition of viscous terms in the second crossflow direction and modifications to the Baldwin-Lomax turbulence model for corner flows.
Overview of the NASA Glenn Flux Reconstruction Based High-Order Unstructured Grid Code
NASA Technical Reports Server (NTRS)
Spiegel, Seth C.; DeBonis, James R.; Huynh, H. T.
2016-01-01
A computational fluid dynamics code based on the flux reconstruction (FR) method is currently being developed at NASA Glenn Research Center to ultimately provide a large-eddy simulation capability that is both accurate and efficient for complex aeropropulsion flows. The FR approach offers a simple and efficient method that is easy to implement and accurate to an arbitrary order on common grid cell geometries. The governing compressible Navier-Stokes equations are discretized in time using various explicit Runge-Kutta schemes, with the default being the 3-stage/3rd-order strong stability preserving scheme. The code is written in modern Fortran (i.e., Fortran 2008) and parallelization is attained through MPI for execution on distributed-memory high-performance computing systems. An h-refinement study of the isentropic Euler vortex problem is able to empirically demonstrate the capability of the FR method to achieve super-accuracy for inviscid flows. Additionally, the code is applied to the Taylor-Green vortex problem, performing numerous implicit large-eddy simulations across a range of grid resolutions and solution orders. The solution found by a pseudo-spectral code is commonly used as a reference solution to this problem, and the FR code is able to reproduce this solution using approximately the same grid resolution. Finally, an examination of the code's performance demonstrates good parallel scaling, as well as an implementation of the FR method with a computational cost/degree-of-freedom/time-step that is essentially independent of the solution order of accuracy for structured geometries.
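The default time integrator named above, the 3-stage/3rd-order strong stability preserving Runge-Kutta scheme, is commonly written in Shu-Osher form. The sketch below applies that form to a scalar test equation u' = -u and estimates the observed order of accuracy by halving the step size. This is a time-step refinement on a toy problem, intended only to illustrate how an observed order is extracted from two error measurements; it is not the h-refinement study of the Euler vortex described in the abstract, and the function names are hypothetical.

```python
import math

def ssprk3_step(f, u, dt):
    """One step of the 3-stage, 3rd-order SSP Runge-Kutta scheme (Shu-Osher form)."""
    u1 = u + dt * f(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * f(u1))
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * f(u2))

def integrate(f, u0, t_end, n_steps):
    """Advance u' = f(u) from t = 0 to t_end with a fixed step."""
    u, dt = u0, t_end / n_steps
    for _ in range(n_steps):
        u = ssprk3_step(f, u, dt)
    return u

def observed_order(err_coarse, err_fine, refinement=2.0):
    """Convergence order implied by two errors at resolutions differing by `refinement`."""
    return math.log(err_coarse / err_fine) / math.log(refinement)

def decay(u):
    """Right-hand side of the test problem u' = -u (assumed for illustration)."""
    return -u

exact = math.exp(-1.0)
err_coarse = abs(integrate(decay, 1.0, 1.0, 20) - exact)
err_fine = abs(integrate(decay, 1.0, 1.0, 40) - exact)
order = observed_order(err_coarse, err_fine)
print(f"observed order ~ {order:.2f}")
```

Each stage is a convex combination of forward-Euler updates, which is what gives the scheme its strong stability preserving property under a suitable time-step restriction.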
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balkey, K.; Witt, F.J.; Bishop, B.A.
1995-06-01
Significant attention has been focused on the issue of reactor vessel pressurized thermal shock (PTS) for many years. Pressurized thermal shock transient events are characterized by a rapid cooldown at potentially high pressure levels that could lead to a reactor vessel integrity concern for some pressurized water reactors. As a result of regulatory and industry efforts in the early 1980s, a probabilistic risk assessment methodology has been established to address this concern. Probabilistic fracture mechanics analyses are performed as part of this methodology to determine conditional probability of significant flaw extension for given pressurized thermal shock events. While recent industry efforts are underway to benchmark probabilistic fracture mechanics computer codes that are currently used by the nuclear industry, Part I of this report describes the comparison of two independent computer codes used at the time of the development of the original U.S. Nuclear Regulatory Commission (NRC) pressurized thermal shock rule. The work that was originally performed in 1982 and 1983 to compare the U.S. NRC - VISA and Westinghouse (W) - PFM computer codes has been documented and is provided in Part I of this report. Part II of this report describes the results of more recent industry efforts to benchmark PFM computer codes used by the nuclear industry. This study was conducted as part of the USNRC-EPRI Coordinated Research Program for reviewing the technical basis for pressurized thermal shock (PTS) analyses of the reactor pressure vessel. The work focused on the probabilistic fracture mechanics (PFM) analysis codes and methods used to perform the PTS calculations. An in-depth review of the methodologies was performed to verify the accuracy and adequacy of the various codes. The review was structured around a series of benchmark sample problems to provide a specific context for discussion and examination of the fracture mechanics methodology.
Some Problems and Solutions in Transferring Ecosystem Simulation Codes to Supercomputers
NASA Technical Reports Server (NTRS)
Skiles, J. W.; Schulbach, C. H.
1994-01-01
Many computer codes for the simulation of ecological systems have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Recent recognition of ecosystem science as a High Performance Computing and Communications Program Grand Challenge area emphasizes supercomputers (both parallel and distributed systems) as the next set of tools for ecological simulation. Transferring ecosystem simulation codes to such systems is not a matter of simply compiling and executing existing code on the supercomputer since there are significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers. To more appropriately match the application to the architecture (necessary to achieve reasonable performance), the parallelism (if it exists) of the original application must be exploited. We discuss our work in transferring a general grassland simulation model (developed on a VAX in the FORTRAN computer programming language) to a Cray Y-MP. We show the Cray shared-memory vector-architecture, and discuss our rationale for selecting the Cray. We describe porting the model to the Cray and executing and verifying a baseline version, and we discuss the changes we made to exploit the parallelism in the application and to improve code execution. As a result, the Cray executed the model 30 times faster than the VAX 11/785 and 10 times faster than a Sun 4 workstation. We achieved an additional speed-up of approximately 30 percent over the original Cray run by using the compiler's vectorizing capabilities and the machine's ability to put subroutines and functions "in-line" in the code. With the modifications, the code still runs at only about 5% of the Cray's peak speed because it makes ineffective use of the vector processing capabilities of the Cray. We conclude with a discussion and future plans.
Evaluation of CFD to Determine Two-Dimensional Airfoil Characteristics for Rotorcraft Applications
NASA Technical Reports Server (NTRS)
Smith, Marilyn J.; Wong, Tin-Chee; Potsdam, Mark; Baeder, James; Phanse, Sujeet
2004-01-01
The efficient prediction of helicopter rotor performance, vibratory loads, and aeroelastic properties still relies heavily on the use of comprehensive analysis codes by the rotorcraft industry. These comprehensive codes utilize look-up tables to provide two-dimensional aerodynamic characteristics. Typically these tables are comprised of a combination of wind tunnel data, empirical data and numerical analyses. The potential to rely more heavily on numerical computations based on Computational Fluid Dynamics (CFD) simulations has become more of a reality with the advent of faster computers and more sophisticated physical models. The ability of five different CFD codes applied independently to predict the lift, drag and pitching moments of rotor airfoils is examined for the SC1095 airfoil, which is utilized in the UH-60A main rotor. Extensive comparisons with the results of ten wind tunnel tests are performed. These CFD computations are found to be as good as experimental data in predicting many of the aerodynamic performance characteristics. Four turbulence models were examined (Baldwin-Lomax, Spalart-Allmaras, Menter SST, and k-omega).
The path toward HEP High Performance Computing
NASA Astrophysics Data System (ADS)
Apostolakis, John; Brun, René; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro
2014-06-01
High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a "High Performance" implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the ROOT and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single-threaded version, together with sub-optimal handling of event processing tails. Besides this, the low-level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine-grained parallel approach. 
The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit best from the recent technology evolution in computing.
Efficient Helicopter Aerodynamic and Aeroacoustic Predictions on Parallel Computers
NASA Technical Reports Server (NTRS)
Wissink, Andrew M.; Lyrintzis, Anastasios S.; Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak
1996-01-01
This paper presents parallel implementations of two codes used in a combined CFD/Kirchhoff methodology to predict the aerodynamic and aeroacoustic properties of helicopters. The rotorcraft Navier-Stokes code, TURNS, computes the aerodynamic flowfield near the helicopter blades and the Kirchhoff acoustics code computes the noise in the far field, using the TURNS solution as input. The overall parallel strategy adds MPI message passing calls to the existing serial codes to allow for communication between processors. As a result, the total code modifications required for parallel execution are relatively small. The biggest bottleneck in running the TURNS code in parallel comes from the LU-SGS algorithm that solves the implicit system of equations. We use a new hybrid domain decomposition implementation of LU-SGS to obtain good parallel performance on the SP2. TURNS demonstrates excellent parallel speedups for quasi-steady and unsteady three-dimensional calculations of a helicopter blade in forward flight. The execution rate attained by the code on 114 processors is six times faster than the same cases run on one processor of the Cray C-90. The parallel Kirchhoff code also shows excellent parallel speedups and fast execution rates. As a performance demonstration, unsteady acoustic pressures are computed at 1886 far-field observer locations for a sample acoustics problem. The calculation requires over two hundred hours of CPU time on one C-90 processor but takes only a few hours on 80 processors of the SP2. The resultant far-field acoustic field is analyzed with state-of-the-art audio and video rendering of the propagating acoustic signals.
NASA Technical Reports Server (NTRS)
Bonhaus, Daryl L.; Wornom, Stephen F.
1991-01-01
Two codes which solve the 3-D Thin Layer Navier-Stokes (TLNS) equations are used to compute the steady state flow for two test cases representing typical finite wings at transonic conditions. Several grids of C-O topology and varying point densities are used to determine the effects of grid refinement. After a description of each code and test case, standards for determining code efficiency and accuracy are defined and applied to determine the relative performance of the two codes in predicting turbulent transonic wing flows. Comparisons of computed surface pressure distributions with experimental data are made.
NASA Astrophysics Data System (ADS)
Alameda, J. C.
2011-12-01
Development and optimization of computational science models, particularly on high performance computers and, with the advent of ubiquitous multicore processor systems, on practically every system, have been accomplished with basic software tools, typically command-line based compilers, debuggers, and performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP and MPI) to take full advantage of high performance computers with an increasing core count per shared memory node, have made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC), seeks to improve the Eclipse Parallel Tools Platform (PTP), an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project takes an application-centric view to improving PTP. We are using a set of scientific applications, each with a variety of challenges, both to drive further improvements to the applications themselves and to understand shortcomings in Eclipse PTP from an application developer's perspective, which in turn drives the list of improvements we seek to make. We are also partnering with performance tool providers to drive higher quality performance tool integration. 
We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into computational science and engineering codes. Finally, we are partnering with the lead PTP developers at IBM to ensure we are as effective as possible within the Eclipse community development. We are also conducting training and outreach to our user community, including conference BOF sessions, monthly user calls, and an annual user meeting, so that we can best inform the improvements we make to Eclipse PTP. With these activities we endeavor to encourage use of modern software engineering practices, as enabled through the Eclipse IDE, with computational science and engineering applications. These practices include proper use of source code repositories, tracking and rectifying issues, measuring and monitoring code performance changes against both optimizations and ever-changing software stacks and configurations on HPC systems, and ultimately the development and maintenance of testing suites -- things that have become commonplace in many software endeavors but have lagged in the development of science applications. In our view, the increased complexity of both HPC systems and science applications demands the use of better software engineering methods, preferably enabled by modern tools such as Eclipse PTP, to help the computational science community thrive as we evolve the HPC landscape.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ditmars, J.D.; Walbridge, E.W.; Rote, D.M.
1983-10-01
Repository performance assessment is analysis that identifies events and processes that might affect a repository system for isolation of radioactive waste, examines their effects on barriers to waste migration, and estimates the probabilities of their occurrence and their consequences. In 1983 Battelle Memorial Institute's Office of Nuclear Waste Isolation (ONWI) prepared two plans - one for performance assessment for a waste repository in salt and one for verification and validation of performance assessment technology. At the request of the US Department of Energy's Salt Repository Project Office (SRPO), Argonne National Laboratory reviewed those plans and prepared this report to advise SRPO of specific areas where ONWI's plans for performance assessment might be improved. This report presents a framework for repository performance assessment that clearly identifies the relationships among the disposal problems, the processes underlying the problems, the tools for assessment (computer codes), and the data. In particular, the relationships among important processes and 26 model codes available to ONWI are indicated. A common suggestion for computer code verification and validation is the need for specific and unambiguous documentation of the results of performance assessment activities. A major portion of this report consists of status summaries of 27 model codes indicated as potentially useful by ONWI. The code summaries focus on three main areas: (1) the code's purpose, capabilities, and limitations; (2) status of the elements of documentation and review essential for code verification and validation; and (3) proposed application of the code for performance assessment of salt repository systems. 15 references, 6 figures, 4 tables.
Using a multifrontal sparse solver in a high performance, finite element code
NASA Technical Reports Server (NTRS)
King, Scott D.; Lucas, Robert; Raefsky, Arthur
1990-01-01
We consider the performance of the finite element method on a vector supercomputer. The computationally intensive parts of the finite element method are typically the formation of the individual elements and the solution of the global stiffness matrix, both of which are vectorized in high performance codes. To further increase throughput, new algorithms are needed. We compare a multifrontal sparse solver to a traditional skyline solver in a finite element code on a vector supercomputer. The multifrontal solver uses the Multiple-Minimum Degree reordering heuristic to reduce the number of operations required to factor a sparse matrix and full matrix computational kernels (e.g., BLAS3) to enhance vector performance. The net result is an order-of-magnitude reduction in run time for a finite element application on one processor of a Cray X-MP.
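The benefit of a degree-based reordering can be illustrated on a toy problem. The sketch below is not the solver from the report: it uses a plain greedy minimum-degree ordering (not the Multiple-Minimum Degree variant), and the helper names `fill_in`, `minimum_degree_order`, and `arrow_graph` are hypothetical. It counts the fill entries created when the sparsity graph of an "arrow" matrix is eliminated in natural order versus minimum-degree order.

```python
from itertools import combinations


def fill_in(adjacency, order):
    """Count fill edges created by eliminating vertices in the given order."""
    # Work on a copy so the caller's graph is left untouched.
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill = 0
    for v in order:
        nbrs = graph.pop(v)
        for u in nbrs:
            graph[u].discard(v)
        # Eliminating v couples all of its remaining neighbours pairwise.
        for a, b in combinations(sorted(nbrs), 2):
            if b not in graph[a]:
                graph[a].add(b)
                graph[b].add(a)
                fill += 1
    return fill


def minimum_degree_order(adjacency):
    """Greedy minimum-degree ordering (ties broken by vertex label)."""
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    order = []
    while graph:
        v = min(graph, key=lambda x: (len(graph[x]), x))
        nbrs = graph.pop(v)
        for u in nbrs:
            graph[u].discard(v)
        # Update the elimination graph so later degrees reflect fill.
        for a, b in combinations(sorted(nbrs), 2):
            graph[a].add(b)
            graph[b].add(a)
        order.append(v)
    return order


def arrow_graph(n):
    """Sparsity graph of an 'arrow' matrix: vertex 0 is coupled to all others."""
    graph = {i: set() for i in range(n)}
    for i in range(1, n):
        graph[0].add(i)
        graph[i].add(0)
    return graph


if __name__ == "__main__":
    g = arrow_graph(6)
    print("natural order fill:", fill_in(g, list(range(6))))
    print("minimum-degree fill:", fill_in(g, minimum_degree_order(g)))
```

Eliminating the dense hub first fills in every pair of its neighbours, whereas the minimum-degree ordering defers the hub and creates no fill at all for this pattern; fewer fill entries mean fewer operations in the factorization.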
Dynamics of face and annular seals with two-phase flow
NASA Technical Reports Server (NTRS)
Hughes, William F.; Basu, Prithwish; Beatty, Paul A.; Beeler, Richard M.; Lau, Stephen
1988-01-01
A detailed study was made of face and annular seals under conditions where boiling, i.e., phase change of the leaking fluid, occurs within the seal. Many seals operate in this mode because of flashing due to pressure drop and/or heat input from frictional heating. Some of the distinctive behavior characteristics of two phase seals are discussed, particularly their axial stability. The main conclusions are that seals with two phase flow may be unstable if improperly balanced. Detailed theoretical analyses of low (laminar) and high (turbulent) leakage seals are presented along with computer codes, parametric studies, and in particular a simplified PC based code that allows for rapid performance prediction: calculations of stiffness coefficients, temperature and pressure distributions, and leakage rates for parallel and coned face seals. A simplified combined computer code for the performance prediction over the laminar and turbulent ranges of a two phase flow is described and documented. The analyses, results, and computer codes are summarized.
DEXTER: A one-dimensional code for calculating thermionic performance of long converters
NASA Technical Reports Server (NTRS)
Sawyer, C. D.
1971-01-01
A versatile code is described for computing the coupled thermionic electric-thermal performance of long thermionic converters in which the temperature and voltage variations cannot be neglected. The code is capable of accounting for a variety of external electrical connection schemes, coolant flow paths and converter failures by partial shorting. Example problem solutions are included along with a user's manual.
Analytical modeling of intumescent coating thermal protection system in a JP-5 fuel fire environment
NASA Technical Reports Server (NTRS)
Clark, K. J.; Shimizu, A. B.; Suchsland, K. E.; Moyer, C. B.
1974-01-01
The thermochemical response of Coating 313 when exposed to a fuel fire environment was studied to provide a tool for predicting the reaction time. The existing Aerotherm Charring Material Thermal Response and Ablation (CMA) computer program was modified to treat swelling materials. The modified code is now designated Aerotherm Transient Response of Intumescing Materials (TRIM) code. In addition, thermophysical property data for Coating 313 were analyzed and reduced for use in the TRIM code. An input data sensitivity study was performed, and performance tests of Coating 313/steel substrate models were carried out. The end product is a reliable computational model, the TRIM code, which was thoroughly validated for Coating 313. The tasks reported include: generation of input data, development of swell model and implementation in TRIM code, sensitivity study, acquisition of experimental data, comparisons of predictions with data, and predictions with intermediate insulation.
The Continual Intercomparison of Radiation Codes: Results from Phase I
NASA Technical Reports Server (NTRS)
Oreopoulos, Lazaros; Mlawer, Eli; Delamere, Jennifer; Shippert, Timothy; Cole, Jason; Iacono, Michael; Jin, Zhonghai; Li, Jiangnan; Manners, James; Raisanen, Petri;
2011-01-01
The computer codes that calculate the energy budget of solar and thermal radiation in Global Climate Models (GCMs), our most advanced tools for predicting climate change, have to be computationally efficient in order to not impose undue computational burden to climate simulations. By using approximations to gain execution speed, these codes sacrifice accuracy compared to more accurate, but also much slower, alternatives. International efforts to evaluate the approximate schemes have taken place in the past, but they have suffered from the drawback that the accurate standards were not validated themselves for performance. The manuscript summarizes the main results of the first phase of an effort called "Continual Intercomparison of Radiation Codes" (CIRC) where the cases chosen to evaluate the approximate models are based on observations and where we have ensured that the accurate models perform well when compared to solar and thermal radiation measurements. The effort is endorsed by international organizations such as the GEWEX Radiation Panel and the International Radiation Commission and has a dedicated website (i.e., http://circ.gsfc.nasa.gov) where interested scientists can freely download data and obtain more information about the effort's modus operandi and objectives. In a paper published in the March 2010 issue of the Bulletin of the American Meteorological Society only a brief overview of CIRC was provided with some sample results. In this paper the analysis of submissions of 11 solar and 13 thermal infrared codes relative to accurate reference calculations obtained by so-called "line-by-line" radiation codes is much more detailed. We demonstrate that, while performance of the approximate codes continues to improve, significant issues still remain to be addressed for satisfactory performance within GCMs. 
We hope that by identifying and quantifying shortcomings, the paper will help establish performance standards to objectively assess radiation code quality, and will guide the development of future phases of CIRC.
Spare a Little Change? Towards a 5-Nines Internet in 250 Lines of Code
2011-05-01
Performing organization: Carnegie Mellon University, School of Computer Science, Pittsburgh, PA 15213. Keywords: Internet reliability, BGP performance, Quagga. This document includes excerpts of the source code for the Linux operating system...
A CFD/CSD Interaction Methodology for Aircraft Wings
NASA Technical Reports Server (NTRS)
Bhardwaj, Manoj K.
1997-01-01
With advanced subsonic transports and military aircraft operating in the transonic regime, it is becoming important to determine the effects of the coupling between aerodynamic loads and elastic forces. Since aeroelastic effects can contribute significantly to the design of these aircraft, there is a strong need in the aerospace industry to predict these aero-structure interactions computationally. To perform static aeroelastic analysis in the transonic regime, high fidelity computational fluid dynamics (CFD) analysis tools must be used in conjunction with high fidelity computational structural dynamics (CSD) analysis tools due to the nonlinear behavior of the aerodynamics in the transonic regime. There is also a need to be able to use a wide variety of CFD and CSD tools to predict these aeroelastic effects in the transonic regime. Because source codes are not always available, it is necessary to couple the CFD and CSD codes without alteration of the source codes. In this study, an aeroelastic coupling procedure is developed which will perform static aeroelastic analysis using any CFD and CSD code with little code integration. The aeroelastic coupling procedure is demonstrated on an F/A-18 Stabilator using NASTD (an in-house McDonnell Douglas CFD code) and NASTRAN. In addition, the Aeroelastic Research Wing (ARW-2) is used for demonstration of the aeroelastic coupling procedure by using ENSAERO (NASA Ames Research Center CFD code) and a finite element wing-box code (developed as part of this research).
Efficient computation of kinship and identity coefficients on large pedigrees.
Cheng, En; Elliott, Brendan; Ozsoyoglu, Z Meral
2009-06-01
With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computing a set of generalized kinship coefficients. In this paper, we first propose a novel Path-Counting Formula for calculating generalized kinship coefficients, which is motivated by Wright's path-counting method for computing inbreeding coefficient. We then present an efficient and scalable scheme for calculating generalized kinship coefficients on large pedigrees using NodeCodes, a special encoding scheme for expediting the evaluation of queries on pedigree graph structures. Furthermore, we propose an improved scheme using Family NodeCodes for the computation of generalized kinship coefficients, which is motivated by the significant improvement of using Family NodeCodes for inbreeding coefficient over the use of NodeCodes. We also perform experiments for evaluating the efficiency of our method, and compare it with the performance of the traditional recursive algorithm for three individuals. Experimental results demonstrate that the resulting scheme is more scalable and efficient than the traditional recursive methods for computing generalized kinship coefficients.
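For orientation, the "traditional recursive algorithm" that such schemes are compared against can be sketched for the simplest case of two individuals. This is a minimal illustration, not the paper's Path-Counting Formula or NodeCodes scheme, and it does not cover the generalized (three-individual) coefficients; the pedigree and the names `PEDIGREE`, `ORDER`, `kinship`, and `inbreeding` are hypothetical.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (father, mother); founders map to (None, None).
# Individuals are listed so that parents always precede their children.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "E": ("C", "D"),
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Kinship coefficient of i and j by the classical two-individual recursion."""
    if i is None or j is None:
        return 0.0  # an unknown parent contributes no shared ancestry
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    # Recurse on the later-listed individual, which cannot be an ancestor of the other.
    if ORDER[i] < ORDER[j]:
        i, j = j, i
    father, mother = PEDIGREE[i]
    return 0.5 * (kinship(father, j) + kinship(mother, j))


def inbreeding(i):
    """Inbreeding coefficient of i: the kinship between its two parents."""
    father, mother = PEDIGREE[i]
    return kinship(father, mother)


if __name__ == "__main__":
    print("kinship(C, D) =", kinship("C", "D"))  # full siblings
    print("inbreeding(E) =", inbreeding("E"))    # child of full siblings
```

The recursion revisits the same ancestor pairs many times on deep pedigrees, which is why memoization is used here and why the abstract stresses scalability on large pedigrees.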
Computational Nuclear Physics and Post Hartree-Fock Methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lietz, Justin; Sam, Novario; Hjorth-Jensen, M.
We present a computational approach to infinite nuclear matter employing Hartree-Fock theory, many-body perturbation theory and coupled cluster theory. These lectures are closely linked with those of chapters 9, 10 and 11 and serve as input for the correlation functions employed in Monte Carlo calculations in chapter 9, the in-medium similarity renormalization group theory of dense fermionic systems of chapter 10 and the Green's function approach in chapter 11. We provide extensive code examples and benchmark calculations, thereby allowing readers to start writing their own codes. We start with an object-oriented serial code and end with discussions on strategies for porting the code to present and planned high-performance computing facilities.
Deployment of the OSIRIS EM-PIC code on the Intel Knights Landing architecture
NASA Astrophysics Data System (ADS)
Fonseca, Ricardo
2017-10-01
Electromagnetic particle-in-cell (EM-PIC) codes such as OSIRIS have found widespread use in modelling the highly nonlinear and kinetic processes that occur in several relevant plasma physics scenarios, ranging from astrophysical settings to high-intensity laser plasma interaction. Being computationally intensive, these codes require large scale HPC systems and a continuous effort in adapting the algorithm to new hardware and computing paradigms. In this work, we report on our efforts to deploy the OSIRIS code on the new Intel Knights Landing (KNL) architecture. Unlike the previous generation (Knights Corner), these boards are standalone systems and introduce several new features, including the new AVX-512 instructions and on-package MCDRAM. We will focus on the parallelization and vectorization strategies followed, as well as memory management, and present a detailed evaluation of code performance in comparison with the CPU code. This work was partially supported by Fundação para a Ciência e a Tecnologia (FCT), Portugal, through Grant No. PTDC/FIS-PLA/2940/2014.
Development and application of structural dynamics analysis capabilities
NASA Technical Reports Server (NTRS)
Heinemann, Klaus W.; Hozaki, Shig
1994-01-01
Extensive research activities were performed in the area of multidisciplinary modeling and simulation of aerospace vehicles that are relevant to NASA Dryden Flight Research Facility. The efforts involved theoretical development, computer coding, and debugging of the STARS code. New solution procedures were developed in such areas as structures, CFD, and graphics, among others. Furthermore, systems-oriented codes were developed for rendering the code truly multidisciplinary and rather automated in nature. Also, work was performed in pre- and post-processing of engineering analysis data.
NASA Technical Reports Server (NTRS)
Choo, Y. K.; Staiger, P. J.
1982-01-01
The code was designed to analyze performance at valves-wide-open design flow. The code can model conventional steam cycles as well as cycles that include such special features as process steam extraction and induction and feedwater heating by external heat sources. Convenience features and extensions to the special features were incorporated into the PRESTO code. The features are described, and detailed examples illustrating the use of both the original and the special features are given.
Comparison of computer codes for calculating dynamic loads in wind turbines
NASA Technical Reports Server (NTRS)
Spera, D. A.
1977-01-01
Seven computer codes for analyzing performance and loads in large, horizontal axis wind turbines were used to calculate blade bending moment loads for two operational conditions of the 100 kW Mod-0 wind turbine. Results were compared with test data on the basis of cyclic loads, peak loads, and harmonic contents. Four of the seven codes include rotor-tower interaction and three were limited to rotor analysis. With a few exceptions, all calculated loads were within 25 percent of nominal test data.
NASA Technical Reports Server (NTRS)
Norment, H. G.
1985-01-01
Subsonic, external flow about nonlifting bodies, lifting bodies or combinations of lifting and nonlifting bodies is calculated by a modified version of the Hess lifting code. Trajectory calculations can be performed for any atmospheric conditions and for all water drop sizes, from the smallest cloud droplet to large raindrops. Experimental water drop drag relations are used in the water drop equations of motion and effects of gravity settling are included. Inlet flow can be accommodated, and high Mach number compressibility effects are corrected for approximately. Seven codes are described: (1) a code used to debug and plot body surface description data; (2) a code that processes the body surface data to yield the potential flow field; (3) a code that computes flow velocities at arrays of points in space; (4) a code that computes water drop trajectories from an array of points in space; (5) a code that computes water drop trajectories and fluxes to arbitrary target points; (6) a code that computes water drop trajectories tangent to the body; and (7) a code that produces stereo pair plots which include both the body and trajectories. Accuracy of the calculations is discussed, and trajectory calculation results are compared with prior calculations and with experimental data.
Shared Memory Parallelization of an Implicit ADI-type CFD Code
NASA Technical Reports Server (NTRS)
Hauser, Th.; Huang, P. G.
1999-01-01
A parallelization study designed for ADI-type algorithms is presented using the OpenMP specification for shared-memory multiprocessor programming. Details of optimizations specifically addressed to cache-based computer architectures are described and performance measurements for the single and multiprocessor implementation are summarized. The paper demonstrates that optimization of memory access on a cache-based computer architecture controls the performance of the computational algorithm. A hybrid MPI/OpenMP approach is proposed for clusters of shared memory machines to further enhance the parallel performance. The method is applied to develop a new LES/DNS code, named LESTool. A preliminary DNS calculation of a fully developed channel flow at a friction Reynolds number of Re_tau = 180 has shown good agreement with existing data.
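ADI-type schemes replace one large implicit solve with sweeps of independent one-dimensional line solves, which for standard discretizations are typically tridiagonal. The kernel repeated along every grid line can be sketched as below; this is a generic Thomas-algorithm illustration under that assumption, not code from LESTool, and the function name `thomas_solve` is hypothetical.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are ignored.

    No pivoting is performed, so the matrix is assumed to be diagonally
    dominant (or otherwise safe for elimination without row exchanges).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: one pass down the line.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: one pass back up the line.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 1-D Laplacian-like system whose exact solution is [1, 2, 3, 4].
    lower = [0.0, -1.0, -1.0, -1.0]
    diag = [2.0, 2.0, 2.0, 2.0]
    upper = [-1.0, -1.0, -1.0, 0.0]
    rhs = [0.0, 0.0, 0.0, 5.0]
    print(thomas_solve(lower, diag, upper, rhs))
```

Each line solve is inherently sequential along its own direction, but the many lines in a sweep are independent of one another. That independence is the natural unit of work for shared-memory threads, and the direction in which lines traverse memory is where cache behaviour becomes decisive.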
Lewis Structures Technology, 1988. Volume 2: Structural Mechanics
NASA Technical Reports Server (NTRS)
1988-01-01
Lewis Structures Div. performs and disseminates results of research conducted in support of aerospace engine structures. These results have a wide range of applicability to practitioners of structural engineering mechanics beyond the aerospace arena. The engineering community was familiarized with the depth and range of research performed by the division and its academic and industrial partners. Sessions covered vibration control, fracture mechanics, ceramic component reliability, parallel computing, nondestructive evaluation, constitutive models and experimental capabilities, dynamic systems, fatigue and damage, wind turbines, hot section technology (HOST), aeroelasticity, structural mechanics codes, computational methods for dynamics, structural optimization, and applications of structural dynamics, and structural mechanics computer codes.
Instrument Systems Analysis and Verification Facility (ISAVF) users guide
NASA Technical Reports Server (NTRS)
Davis, J. F.; Thomason, J. O.; Wolfgang, J. L.
1985-01-01
The ISAVF is primarily an interconnected system of computers, special purpose real time hardware, and associated generalized software systems, which will permit Instrument System Analysts, Design Engineers and Instrument Scientists to perform trade off studies, specification development, instrument modeling, and verification of instrument hardware performance. It is not the intent of the ISAVF to duplicate or replace existing special purpose facilities such as the Code 710 Optical Laboratories or the Code 750 Test and Evaluation facilities. The ISAVF will provide data acquisition and control services for these facilities, as needed, using remote computer stations attached to the main ISAVF computers via dedicated communication lines.
A study of the optimization method used in the NAVY/NASA gas turbine engine computer code
NASA Technical Reports Server (NTRS)
Horsewood, J. L.; Pines, S.
1977-01-01
Sources of numerical noise affecting the convergence properties of Powell's Principal Axis Method of Optimization in the NAVY/NASA gas turbine engine computer code were investigated. The principal noise source discovered resulted from loose input tolerances used in terminating iterations performed in subroutine CALCFX to satisfy specified control functions. A minor source of noise was found to be introduced by an insufficient number of digits in stored coefficients used by subroutine THERM in polynomial expressions of thermodynamic properties. Tabular results of several computer runs are presented to show the effects on program performance of selective corrective actions taken to reduce noise.
Final report for the Tera Computer TTI CRADA
DOE Office of Scientific and Technical Information (OSTI.GOV)
Davidson, G.S.; Pavlakos, C.; Silva, C.
1997-01-01
Tera Computer and Sandia National Laboratories have completed a CRADA, which examined the Tera Multi-Threaded Architecture (MTA) for use with large codes of importance to industry and DOE. The MTA is an innovative architecture that uses parallelism to mask latency between memories and processors. The physical implementation is a parallel computer with high cross-section bandwidth and GaAs processors designed by Tera, which support many small computation threads and fast, lightweight context switches between them. When any thread blocks while waiting for memory accesses to complete, another thread immediately begins execution so that high CPU utilization is maintained. The Tera MTA parallel computer has a single, global address space, which is appealing when porting existing applications to a parallel computer. This ease of porting is further enabled by compiler technology that helps break computations into parallel threads. DOE and Sandia National Laboratories were interested in working with Tera to further develop this computing concept. While Tera Computer would continue the hardware development and compiler research, Sandia National Laboratories would work with Tera to ensure that their compilers worked well with important Sandia codes, most particularly CTH, a shock physics code used for weapon safety computations. In addition to that important code, Sandia National Laboratories would complete research on a robotic path planning code, SANDROS, which is important in manufacturing applications, and would evaluate the MTA performance on this code. Finally, Sandia would work directly with Tera to develop 3D visualization codes, which would be appropriate for use with the MTA. Each of these tasks has been completed to the extent possible, given that Tera has just completed the MTA hardware. All of the CRADA work had to be done on simulators.
Computational Aerodynamic Simulations of a Spacecraft Cabin Ventilation Fan Design
NASA Technical Reports Server (NTRS)
Tweedt, Daniel L.
2010-01-01
Quieter working environments for astronauts are needed if future long-duration space exploration missions are to be safe and productive. Ventilation and payload cooling fans are known to be dominant sources of noise, with the International Space Station being a good case in point. To address this issue cost effectively, early attention to fan design, selection, and installation has been recommended, leading to an effort by NASA to examine the potential for small-fan noise reduction by improving fan aerodynamic design. As a preliminary part of that effort, the aerodynamics of a cabin ventilation fan designed by Hamilton Sundstrand has been simulated using computational fluid dynamics codes, and the computed solutions analyzed to quantify various aspects of the fan aerodynamics and performance. Four simulations were performed at the design rotational speed: two at the design flow rate and two at off-design flow rates. Following a brief discussion of the computational codes, various aerodynamic- and performance-related quantities derived from the computed flow fields are presented along with relevant flow field details. The results show that the computed fan performance is in generally good agreement with stated design goals.
Hierarchical parallelisation of functional renormalisation group calculations - hp-fRG
NASA Astrophysics Data System (ADS)
Rohe, Daniel
2016-10-01
The functional renormalisation group (fRG) has evolved into a versatile tool in condensed matter theory for studying important aspects of correlated electron systems. Practical applications of the method often involve a high numerical effort, motivating the question of how far High Performance Computing (HPC) can advance the approach. In this work we report on a multi-level parallelisation of the underlying computational machinery and show that this can speed up the code by several orders of magnitude. This in turn can extend the applicability of the method to otherwise inaccessible cases. We exploit three levels of parallelisation: distributed computing by means of message passing (MPI), shared-memory computing using OpenMP, and vectorisation by means of SIMD (single-instruction-multiple-data) units. Results are provided for two distinct HPC platforms, namely the IBM-based BlueGene/Q system JUQUEEN and an Intel Sandy-Bridge-based development cluster. We discuss how certain issues and obstacles were overcome in the course of adapting the code. Most importantly, we conclude that this vast improvement can actually be accomplished by introducing only moderate changes to the code, such that this strategy may serve as a guideline for other researchers to likewise improve the efficiency of their codes.
Solution of the lossy nonlinear Tricomi equation with application to sonic boom focusing
NASA Astrophysics Data System (ADS)
Salamone, Joseph A., III
Sonic boom focusing theory has been augmented with new terms that account for mean flow effects in the direction of propagation and also for atmospheric absorption/dispersion caused by molecular relaxation of oxygen and nitrogen. The newly derived model equation was implemented numerically in a computer code. The computer code was numerically validated using a spectral solution for nonlinear propagation of a sinusoid through a lossy homogeneous medium. An additional numerical check was performed to verify the linear diffraction component of the code calculations. The computer code was experimentally validated using measured sonic boom focusing data from the NASA sponsored Superboom Caustic and Analysis Measurement Program (SCAMP) flight test. The computer code was in good agreement with both the numerical and experimental validation. The newly developed code was applied to examine the focusing of a NASA low-boom demonstration vehicle concept. The resulting pressure field was calculated for several supersonic climb profiles. The shaping efforts designed into the signatures were still somewhat evident despite the effects of sonic boom focusing.
Computational Infrastructure for Engine Structural Performance Simulation
NASA Technical Reports Server (NTRS)
Chamis, Christos C.
1997-01-01
Select computer codes developed over the years to simulate specific aspects of engine structures are described. These codes include blade impact integrated multidisciplinary analysis and optimization, progressive structural fracture, quantification of uncertainties for structural reliability and risk, benefits estimation of new technology insertion and hierarchical simulation of engine structures made from metal matrix and ceramic matrix composites. Collectively these codes constitute a unique infrastructure readiness to credibly evaluate new and future engine structural concepts throughout the development cycle from initial concept, to design and fabrication, to service performance and maintenance and repairs, and to retirement for cause and even to possible recycling. Stated differently, they provide 'virtual' concurrent engineering for engine structures total-life-cycle-cost.
NASA Technical Reports Server (NTRS)
Carlson, Harry W.; Darden, Christine M.; Mann, Michael J.
1990-01-01
Extensive correlations of computer code results with experimental data are employed to illustrate the use of a linearized theory, attached flow method for the estimation and optimization of the longitudinal aerodynamic performance of wing-canard and wing-horizontal tail configurations which may employ simple hinged flap systems. Use of an attached flow method is based on the premise that high levels of aerodynamic efficiency require a flow that is as nearly attached as circumstances permit. The results indicate that linearized theory, attached flow, computer code methods (modified to include estimated attainable leading-edge thrust and an approximate representation of vortex forces) provide a rational basis for the estimation and optimization of aerodynamic performance at subsonic speeds below the drag rise Mach number. Generally, good prediction of aerodynamic performance, as measured by the suction parameter, can be expected for near optimum combinations of canard or horizontal tail incidence and leading- and trailing-edge flap deflections at a given lift coefficient (conditions which tend to produce a predominantly attached flow).
A Rocket Engine Design Expert System
NASA Technical Reports Server (NTRS)
Davidian, Kenneth J.
1989-01-01
The overall structure and capabilities of an expert system designed to evaluate rocket engine performance are described. The expert system incorporates a JANNAF standard reference computer code to determine rocket engine performance and a state of the art finite element computer code to calculate the interactions between propellant injection, energy release in the combustion chamber, and regenerative cooling heat transfer. Rule-of-thumb heuristics were incorporated for the H2-O2 coaxial injector design, including a minimum gap size constraint on the total number of injector elements. One dimensional equilibrium chemistry was used in the energy release analysis of the combustion chamber. A 3-D conduction and/or 1-D advection analysis is used to predict heat transfer and coolant channel wall temperature distributions, in addition to coolant temperature and pressure drop. Inputting values to describe the geometry and state properties of the entire system is done directly from the computer keyboard. Graphical display of all output results from the computer code analyses is facilitated by menu selection of up to five dependent variables per plot.
A rocket engine design expert system
NASA Technical Reports Server (NTRS)
Davidian, Kenneth J.
1989-01-01
The overall structure and capabilities of an expert system designed to evaluate rocket engine performance are described. The expert system incorporates a JANNAF standard reference computer code to determine rocket engine performance and a state-of-the-art finite element computer code to calculate the interactions between propellant injection, energy release in the combustion chamber, and regenerative cooling heat transfer. Rule-of-thumb heuristics were incorporated for the hydrogen-oxygen coaxial injector design, including a minimum gap size constraint on the total number of injector elements. One-dimensional equilibrium chemistry was employed in the energy release analysis of the combustion chamber and three-dimensional finite-difference analysis of the regenerative cooling channels was used to calculate the pressure drop along the channels and the coolant temperature as it exits the coolant circuit. Inputting values to describe the geometry and state properties of the entire system is done directly from the computer keyboard. Graphical display of all output results from the computer code analyses is facilitated by menu selection of up to five dependent variables per plot.
High-Speed, Low-Cost Workstation for Computation-Intensive Statistics. Phase 1
1990-06-20
routine implementation and performance. 5 The two compiled versions given in the table were coded in an attempt to obtain an optimized compiled version...level statistics and linear algebra routines (BSAS and BLAS) that have been prototyped in this study. For each routine, both the C code (Turbo C...DISTRIBUTION/AVAILABILITY STATEMENT 12b. DISTRIBUTION CODE Unlimited distribution 13. ABSTRACT (Maximum 200 words) High-performance and low-cost
Topological quantum distillation.
Bombin, H; Martin-Delgado, M A
2006-11-03
We construct a class of topological quantum codes to perform quantum entanglement distillation. These codes implement the whole Clifford group of unitary operations in a fully topological manner and without selective addressing of qubits. This allows us to extend their application also to quantum teleportation, dense coding, and computation with magic states.
NASA Astrophysics Data System (ADS)
Rueda, Antonio J.; Noguera, José M.; Luque, Adrián
2016-02-01
In recent years GPU computing has gained wide acceptance as a simple low-cost solution for speeding up computationally expensive processing in many scientific and engineering applications. However, in most cases accelerating a traditional CPU implementation for a GPU is a non-trivial task that requires a thorough refactoring of the code and specific optimizations that depend on the architecture of the device. OpenACC is a promising technology that aims at reducing the effort required to accelerate C/C++/Fortran code on an attached multicore device. With this technology, the CPU code needs little more than a few compiler directives that identify the areas to be accelerated and the way in which data has to be moved between the CPU and GPU. Its potential benefits are multiple: better code readability, less development time, lower risk of errors and less dependency on the underlying architecture and future evolution of GPU technology. Our aim with this work is to evaluate the pros and cons of using OpenACC against native GPU implementations in computationally expensive hydrological applications, using the classic D8 algorithm of O'Callaghan and Mark for river network extraction as a case study. We implemented the flow accumulation step of this algorithm on the CPU, using OpenACC and two different CUDA versions, comparing the length and complexity of the code and its performance with different datasets. We find that although OpenACC cannot match the performance of an optimized CUDA implementation (3.5× slower on average), it provides a significant performance improvement over a CPU implementation (2-6×) with far simpler code and less implementation effort.
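For readers unfamiliar with the flow accumulation step mentioned above, the following is a minimal serial sketch in Python. It is not the authors' CPU, OpenACC, or CUDA code. It assumes the usual D8 rule (each cell drains to its steepest downslope neighbour among the eight adjacent cells) and counts each cell as contributing one unit, itself included; the function name `d8_flow_accumulation` and the tiny elevation grid are hypothetical.

```python
import math

# Offsets (row, col) to the eight neighbours of a cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_accumulation(dem):
    """Return, for each cell, the number of cells draining through it (itself included)."""
    rows, cols = len(dem), len(dem[0])

    # Step 1: flow direction. Each cell points to its steepest downslope
    # neighbour; diagonal drops are divided by sqrt(2). Pits and flats get None.
    target = {}
    for r in range(rows):
        for c in range(cols):
            best, best_slope = None, 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best, best_slope = (nr, nc), slope
            target[(r, c)] = best

    # Step 2: accumulation. Flow is strictly downhill, so visiting cells from
    # highest to lowest guarantees every upstream cell is handled first.
    acc = [[1] * cols for _ in range(rows)]
    order = sorted(target, key=lambda rc: dem[rc[0]][rc[1]], reverse=True)
    for r, c in order:
        t = target[(r, c)]
        if t is not None:
            acc[t[0]][t[1]] += acc[r][c]
    return acc


# Hypothetical 3x3 elevation grid sloping towards the bottom-right corner.
dem = [[9, 8, 7],
       [8, 6, 5],
       [7, 5, 1]]
acc = d8_flow_accumulation(dem)
```

The sort-then-propagate loop is inherently sequential, which gives some intuition for why a GPU version needs restructuring rather than a direct translation.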
Development of a cryogenic mixed fluid J-T cooling computer code, 'JTMIX'
NASA Technical Reports Server (NTRS)
Jones, Jack A.
1991-01-01
An initial study was performed for analyzing and predicting the temperatures and cooling capacities when mixtures of fluids are used in Joule-Thomson coolers and in heat pipes. A computer code, JTMIX, was developed for mixed gas J-T analysis for any fluid combination of neon, nitrogen, various hydrocarbons, argon, oxygen, carbon monoxide, carbon dioxide, and hydrogen sulfide. When used in conjunction with the NIST computer code, DDMIX, it has accurately predicted order-of-magnitude increases in J-T cooling capacities when various hydrocarbons are added to nitrogen, and it predicts nitrogen normal boiling point depressions to as low as 60 K when neon is added.
NASA Astrophysics Data System (ADS)
Eisenbach, Markus
The Locally Self-consistent Multiple Scattering (LSMS) code solves the first principles Density Functional theory Kohn-Sham equation for a wide range of materials with a special focus on metals, alloys and metallic nano-structures. It has traditionally exhibited near perfect scalability on massively parallel high performance computer architectures. We present our efforts to exploit GPUs to accelerate the LSMS code to enable first principles calculations of O(100,000) atoms and statistical physics sampling of finite temperature properties. Using the Cray XK7 system Titan at the Oak Ridge Leadership Computing Facility we achieve a sustained performance of 14.5PFlop/s and a speedup of 8.6 compared to the CPU only code. This work has been sponsored by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Material Sciences and Engineering Division and by the Office of Advanced Scientific Computing. This work used resources of the Oak Ridge Leadership Computing Facility, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Adaptive neural coding: from biological to behavioral decision-making
Louie, Kenway; Glimcher, Paul W.; Webb, Ryan
2015-01-01
Empirical decision-making in diverse species deviates from the predictions of normative choice theory, but why such suboptimal behavior occurs is unknown. Here, we propose that deviations from optimality arise from biological decision mechanisms that have evolved to maximize choice performance within intrinsic biophysical constraints. Sensory processing utilizes specific computations such as divisive normalization to maximize information coding in constrained neural circuits, and recent evidence suggests that analogous computations operate in decision-related brain areas. These adaptive computations implement a relative value code that may explain the characteristic context-dependent nature of behavioral violations of classical normative theory. Examining decision-making at the computational level thus provides a crucial link between the architecture of biological decision circuits and the form of empirical choice behavior. PMID:26722666
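As an illustration of how divisive normalization can make a value code context dependent, here is a minimal Python sketch. It assumes the common textbook form in which each option's response is its value divided by a constant plus the summed value of all options; the parameters `sigma` and `r_max`, the function name, and the example values are illustrative assumptions, not quantities from the paper.

```python
def divisive_normalization(values, sigma=1.0, r_max=1.0):
    """Normalized response for each option: r_max * v_i / (sigma + sum of all v_j)."""
    total = sigma + sum(values)
    return [r_max * v / total for v in values]


# Two options alone, then the same two options with a third added.
two = divisive_normalization([10.0, 8.0])
three = divisive_normalization([10.0, 8.0, 6.0])

# Gap between the representations of the two best options in each context.
gap_two = two[0] - two[1]
gap_three = three[0] - three[1]
```

Under these assumptions the gap between the two best options shrinks when the third option is present, even though their own values are unchanged.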
Toward performance portability of the Albany finite element analysis code using the Kokkos library
DOE Office of Scientific and Technical Information (OSTI.GOV)
Demeshko, Irina; Watkins, Jerry; Tezaur, Irina K.
Performance portability on heterogeneous high-performance computing (HPC) systems is a major challenge faced today by code developers: parallel code needs to be executed correctly as well as with high performance on machines with different architectures, operating systems, and software libraries. The finite element method (FEM) is a popular and flexible method for discretizing partial differential equations arising in a wide variety of scientific, engineering, and industrial applications that require HPC. This paper presents some preliminary results pertaining to our development of a performance portable implementation of the FEM-based Albany code. Performance portability is achieved using the Kokkos library. We present performance results for the Aeras global atmosphere dynamical core module in Albany. Finally, numerical experiments show that our single code implementation gives reasonable performance across three multicore/many-core architectures: NVIDIA Graphics Processing Units (GPUs), Intel Xeon Phis, and multicore CPUs.
Toward performance portability of the Albany finite element analysis code using the Kokkos library
Demeshko, Irina; Watkins, Jerry; Tezaur, Irina K.; ...
2018-02-05
Performance portability on heterogeneous high-performance computing (HPC) systems is a major challenge faced today by code developers: parallel code needs to be executed correctly as well as with high performance on machines with different architectures, operating systems, and software libraries. The finite element method (FEM) is a popular and flexible method for discretizing partial differential equations arising in a wide variety of scientific, engineering, and industrial applications that require HPC. This paper presents some preliminary results pertaining to our development of a performance portable implementation of the FEM-based Albany code. Performance portability is achieved using the Kokkos library. We present performance results for the Aeras global atmosphere dynamical core module in Albany. Finally, numerical experiments show that our single code implementation gives reasonable performance across three multicore/many-core architectures: NVIDIA Graphics Processing Units (GPUs), Intel Xeon Phis, and multicore CPUs.
Deploying electromagnetic particle-in-cell (EM-PIC) codes on Xeon Phi accelerators boards
NASA Astrophysics Data System (ADS)
Fonseca, Ricardo
2014-10-01
The complexity of the phenomena involved in several relevant plasma physics scenarios, where highly nonlinear and kinetic processes dominate, makes purely theoretical descriptions impossible. Further understanding of these scenarios requires detailed numerical modeling, but fully relativistic particle-in-cell codes such as OSIRIS are computationally intensive. The quest towards Exaflop computer systems has led to the development of HPC systems based on add-on accelerator cards, such as GPGPUs and more recently the Xeon Phi accelerators that power the current number 1 system in the world. These cards, also referred to as Intel Many Integrated Core Architecture (MIC), offer peak theoretical performances of >1 TFlop/s for general purpose calculations in a single board, and are receiving significant attention as an attractive alternative to CPUs for plasma modeling. In this work we report on our efforts towards the deployment of an EM-PIC code on a Xeon Phi architecture system. We will focus on the parallelization and vectorization strategies followed, and present a detailed evaluation of code performance in comparison with the CPU code.
Portable multi-node LQCD Monte Carlo simulations using OpenACC
NASA Astrophysics Data System (ADS)
Bonati, Claudio; Calore, Enrico; D'Elia, Massimo; Mesiti, Michele; Negro, Francesco; Sanfilippo, Francesco; Schifano, Sebastiano Fabio; Silvi, Giorgio; Tripiccione, Raffaele
This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization on multiple computing nodes using OpenACC to manage parallelism within the node, and OpenMPI to manage parallelism among the nodes. We first discuss the available strategies for maximizing performance, then describe selected relevant details of the code, and finally measure the level of performance and scaling that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly high level of performance for this application, but also compares with results measured on other processors.
NASA Astrophysics Data System (ADS)
Zhao, Shengmei; Wang, Le; Liang, Wenqiang; Cheng, Weiwen; Gong, Longyan
2015-10-01
In this paper, we propose a high-performance optical encryption (OE) scheme based on computational ghost imaging (GI) with QR codes and the compressive sensing (CS) technique, named the QR-CGI-OE scheme. N random phase screens, generated by Alice, form a secret key that is shared with the authorized user, Bob. The information is first encoded by Alice with a QR code, and the QR-coded image is then encrypted with the aid of a computational ghost imaging optical system. Here, the measurement results from the GI optical system's bucket detector are the encrypted information and are transmitted to Bob. With the key, Bob decrypts the encrypted information to obtain the QR-coded image with GI and CS techniques, and further recovers the information by QR decoding. The experimental and numerically simulated results show that authorized users can completely recover the original image, whereas eavesdroppers cannot acquire any information about the image even when the eavesdropping ratio (ER) is up to 60% at the given number of measurements. For the proposed scheme, the number of bits sent from Alice to Bob is reduced considerably and the robustness is enhanced significantly. Meanwhile, the number of measurements in the GI system is reduced and the quality of the reconstructed QR-coded image is improved.
Unsteady Full Annulus Simulations of a Transonic Axial Compressor Stage
NASA Technical Reports Server (NTRS)
Herrick, Gregory P.; Hathaway, Michael D.; Chen, Jen-Ping
2009-01-01
Two recent research endeavors in turbomachinery at NASA Glenn Research Center have focused on compression system stall inception and compression system aerothermodynamic performance. Physical experiment and computational research are ongoing in support of these research objectives. TURBO, an unsteady, three-dimensional, Navier-Stokes computational fluid dynamics code commissioned and developed by NASA, has been utilized, enhanced, and validated in support of these endeavors. In the research which follows, TURBO is shown to accurately capture compression system flow range (from choke to stall inception) and also to accurately calculate fundamental aerothermodynamic performance parameters. Rigorous full-annulus calculations are performed to validate TURBO's ability to simulate the unstable, unsteady, chaotic stall inception process; as part of these efforts, full-annulus calculations are also performed at a condition approaching choke to further document TURBO's capabilities to compute aerothermodynamic performance data and support a NASA code assessment effort.
Rotary engine performance computer program (RCEMAP and RCEMAPPC): User's guide
NASA Technical Reports Server (NTRS)
Bartrand, Timothy A.; Willis, Edward A.
1993-01-01
This report is a user's guide for a computer code that simulates the performance of several rotary combustion engine configurations. It is intended to assist prospective users in getting started with RCEMAP and/or RCEMAPPC. RCEMAP (Rotary Combustion Engine performance MAP generating code) is the mainframe version, while RCEMAPPC is a simplified subset designed for the personal computer, or PC, environment. Both versions are based on an open, zero-dimensional combustion system model for the prediction of instantaneous pressures, temperature, chemical composition and other in-chamber thermodynamic properties. Both versions predict overall engine performance and thermal characteristics, including bmep, bsfc, exhaust gas temperature, average material temperatures, and turbocharger operating conditions. Required inputs include engine geometry, materials, constants for use in the combustion heat release model, and turbomachinery maps. Illustrative examples and sample input files for both versions are included.
Programming for 1.6 Million cores: Early experiences with IBM's BG/Q SMP architecture
NASA Astrophysics Data System (ADS)
Glosli, James
2013-03-01
With the stall in clock cycle improvements a decade ago, the drive for computational performance has continued along a path of increasing core counts on a processor. The multi-core evolution has been expressed in both a symmetric multiprocessor (SMP) architecture and a CPU/GPU architecture. Debates rage in the high performance computing (HPC) community over which architecture best serves HPC. In this talk I will not attempt to resolve that debate but perhaps fuel it. I will discuss the experience of exploiting Sequoia, a 98304-node IBM Blue Gene/Q SMP at Lawrence Livermore National Laboratory. The advantages and challenges of leveraging the computational power of BG/Q will be detailed through the discussion of two applications. The first application is a molecular dynamics code called ddcMD. This is a code developed over the last decade at LLNL and ported to BG/Q. The second application is a cardiac modeling code called Cardioid. This is a code that was recently designed and developed at LLNL to exploit the fine scale parallelism of BG/Q's SMP architecture. Through the lenses of these efforts I'll illustrate the need to rethink how we express and implement our computational approaches. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
COMSAC: Computational Methods for Stability and Control. Part 2
NASA Technical Reports Server (NTRS)
Fremaux, C. Michael (Compiler); Hall, Robert M. (Compiler)
2004-01-01
The unprecedented advances being made in computational fluid dynamic (CFD) technology have demonstrated the powerful capabilities of codes in applications to civil and military aircraft. Used in conjunction with wind-tunnel and flight investigations, many codes are now routinely used by designers in diverse applications such as aerodynamic performance predictions and propulsion integration. Typically, these codes are most reliable for attached, steady, and predominantly turbulent flows. As a result of increasing reliability and confidence in CFD, wind-tunnel testing for some new configurations has been substantially reduced in key areas, such as wing trade studies for mission performance guarantees. Interest is now growing in the application of computational methods to other critical design challenges. One of the most important disciplinary elements for civil and military aircraft is prediction of stability and control characteristics. CFD offers the potential for significantly increasing the basic understanding, prediction, and control of flow phenomena associated with requirements for satisfactory aircraft handling characteristics.
NASA Technical Reports Server (NTRS)
Walsh, J. L.; Weston, R. P.; Samareh, J. A.; Mason, B. H.; Green, L. L.; Biedron, R. T.
2000-01-01
An objective of the High Performance Computing and Communication Program at the NASA Langley Research Center is to demonstrate multidisciplinary shape and sizing optimization of a complete aerospace vehicle configuration by using high-fidelity finite-element structural analysis and computational fluid dynamics aerodynamic analysis in a distributed, heterogeneous computing environment that includes high performance parallel computing. A software system has been designed and implemented to integrate a set of existing discipline analysis codes, some of them computationally intensive, into a distributed computational environment for the design of a high-speed civil transport configuration. The paper describes both the preliminary results from implementing and validating the multidisciplinary analysis and the results from an aerodynamic optimization. The discipline codes are integrated by using the Java programming language and a Common Object Request Broker Architecture compliant software product. A companion paper describes the formulation of the multidisciplinary analysis and optimization system.
NASA Technical Reports Server (NTRS)
Stack, S. H.
1981-01-01
A computer-aided design system has recently been developed specifically for the small research group environment. The system is implemented on a Prime 400 minicomputer linked with a CDC 6600 computer. The goal was to assign the minicomputer specific tasks, such as data input and graphics, thereby reserving the large mainframe computer for time-consuming analysis codes. The basic structure of the design system consists of GEMPAK, a computer code that generates detailed configuration geometry from a minimum of input; interface programs that reformat GEMPAK geometry for input to the analysis codes; and utility programs that simplify computer access and data interpretation. The working system has had a large positive impact on the quantity and quality of research performed by the originating group. This paper describes the system, the major factors that contributed to its particular form, and presents examples of its application.
HERCULES: A Pattern Driven Code Transformation System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kartsaklis, Christos; Hernandez, Oscar R; Hsu, Chung-Hsing
2012-01-01
New parallel computers are emerging, but developing efficient scientific code for them remains difficult. A scientist must manage not only the science-domain complexity but also the performance-optimization complexity. HERCULES is a code transformation system designed to help the scientist to separate the two concerns, which improves code maintenance, and facilitates performance optimization. The system combines three technologies, code patterns, transformation scripts and compiler plugins, to provide the scientist with an environment to quickly implement code transformations that suit his needs. Unlike existing code optimization tools, HERCULES is unique in its focus on user-level accessibility. In this paper we discuss the design, implementation and an initial evaluation of HERCULES.
Fundamental differences between optimization code test problems in engineering applications
NASA Technical Reports Server (NTRS)
Eason, E. D.
1984-01-01
The purpose here is to suggest that there is at least one fundamental difference between the problems used for testing optimization codes and the problems that engineers often need to solve; in particular, the level of precision that can be practically achieved in the numerical evaluation of the objective function, derivatives, and constraints. This difference affects the performance of optimization codes, as illustrated by two examples. Two classes of optimization problem were defined. Class One functions and constraints can be evaluated to a high precision that depends primarily on the word length of the computer. Class Two functions and/or constraints can only be evaluated to a moderate or a low level of precision for economic or modeling reasons, regardless of the computer word length. Optimization codes have not been adequately tested on Class Two problems. There are very few Class Two test problems in the literature, while there are literally hundreds of Class One test problems. The relative performance of two codes may be markedly different for Class One and Class Two problems. Less sophisticated direct search type codes may be less likely to be confused or to waste many function evaluations on Class Two problems. The analysis accuracy and minimization performance are related in a complex way that probably varies from code to code. On a problem where the analysis precision was varied over a range, the simple Hooke and Jeeves code was more efficient at low precision while the Powell code was more efficient at high precision.
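To make the direct-search idea concrete, here is a compact Python sketch of a Hooke and Jeeves style pattern search, applied to a quadratic whose value is rounded to two decimals as a stand-in for a low-precision ("Class Two") evaluation. This is a generic textbook variant under assumed defaults (`step`, `shrink`, `tol`), not the code tested in the report, and the test function is hypothetical.

```python
def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-6, max_evals=10_000):
    """Minimize f by exploratory and pattern moves; returns (x, f(x), evaluations)."""
    evals = 0

    def call(x):
        nonlocal evals
        evals += 1
        return f(x)

    def explore(start, f_start, h):
        # Try +h then -h along each coordinate, keeping any improvement.
        x, fx = list(start), f_start
        for i in range(len(x)):
            for delta in (h, -h):
                trial = list(x)
                trial[i] += delta
                ft = call(trial)
                if ft < fx:
                    x, fx = trial, ft
                    break
        return x, fx

    base = list(x0)
    f_base = call(base)
    while step > tol and evals < max_evals:
        x, fx = explore(base, f_base, step)
        if fx < f_base:
            # Pattern moves: keep jumping along the direction that just helped.
            while evals < max_evals:
                pattern = [2 * xi - bi for xi, bi in zip(x, base)]
                base, f_base = x, fx
                x2, fx2 = explore(pattern, call(pattern), step)
                if fx2 < f_base:
                    x, fx = x2, fx2
                else:
                    break
        else:
            step *= shrink  # no improvement at this scale: refine the mesh
    return base, f_base, evals


def exact(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def low_precision(x):
    # Same function, but only reported to two decimal places.
    return round(exact(x), 2)


x_exact, f_exact, n_exact = hooke_jeeves(exact, [0.0, 0.0])
x_low, f_low, n_low = hooke_jeeves(low_precision, [0.0, 0.0])
```

Because the search only compares function values and uses no derivatives, rounding the objective changes which trial points look better but does not break the method outright, which is consistent with the abstract's remark about less sophisticated direct search codes.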
Processor-in-memory-and-storage architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeBenedictis, Erik
A method and apparatus for performing reliable general-purpose computing. Each sub-core of a plurality of sub-cores of a processor core processes a same instruction at a same time. A code analyzer receives a plurality of residues that represents a code word corresponding to the same instruction and an indication of whether the code word is a memory address code or a data code from the plurality of sub-cores. The code analyzer determines whether the plurality of residues are consistent or inconsistent. The code analyzer and the plurality of sub-cores perform a set of operations based on whether the code word is a memory address code or a data code and a determination of whether the plurality of residues are consistent or inconsistent.
Computational fluid mechanics utilizing the variational principle of modeling damping seals
NASA Technical Reports Server (NTRS)
Abernathy, J. M.
1986-01-01
A computational fluid dynamics code for application to traditional incompressible flow problems has been developed. The method is actually a slight compressibility approach which takes advantage of the bulk modulus and finite sound speed of all real fluids. The finite element numerical analog uses a dynamic differencing scheme based, in part, on a variational principle for computational fluid dynamics. The code was developed in order to study the feasibility of damping seals for high speed turbomachinery. Preliminary seal analyses have been performed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trent, D.S.; Eyler, L.L.
In this study several aspects of simulating hydrogen distribution in geometric configurations relevant to reactor containment structures were investigated using the TEMPEST computer code. Of particular interest was the performance of the TEMPEST turbulence model in a density-stratified environment. Computed results illustrated that the TEMPEST numerical procedures predicted the measured phenomena with good accuracy under a variety of conditions and that the turbulence model used is a viable approach in complex turbulent flow simulation.
Extensible Computational Chemistry Environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
2012-08-09
ECCE provides a sophisticated graphical user interface, scientific visualization tools, and the underlying data management framework enabling scientists to efficiently set up calculations and store, retrieve, and analyze the rapidly growing volumes of data produced by computational chemistry studies. ECCE was conceived as part of the Environmental Molecular Sciences Laboratory construction to solve the problem of researchers being able to effectively utilize complex computational chemistry codes and massively parallel high performance compute resources. Bringing the power of these codes and resources to the desktops of researchers, and thus enabling world class research without users needing a detailed understanding of the inner workings of either the theoretical codes or the supercomputers needed to run them, was a grand challenge problem in the original version of the EMSL. ECCE allows collaboration among researchers using a web-based data repository where the inputs and results for all calculations done within ECCE are organized. ECCE is a first-of-a-kind end-to-end problem solving environment for all phases of computational chemistry research: setting up calculations with sophisticated GUI and direct manipulation visualization tools, submitting and monitoring calculations on remote high performance supercomputers without having to be familiar with the details of using these compute resources, and performing results visualization and analysis including creating publication quality images. ECCE is a suite of tightly integrated applications that are employed as the user moves through the modeling process.
Convolutional coding results for the MVM '73 X-band telemetry experiment
NASA Technical Reports Server (NTRS)
Layland, J. W.
1978-01-01
Results of simulation of several short-constraint-length convolutional codes using a noisy symbol stream obtained via the turnaround ranging channels of the MVM'73 spacecraft are presented. First operational use of this coding technique is on the Voyager mission. The relative performance of these codes in this environment is as previously predicted from computer-based simulations.
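For readers new to convolutional coding, the following Python sketch shows how a short-constraint-length encoder turns an input bit stream into coded symbols. It uses the standard textbook rate-1/2, constraint-length-3 code with generators 7 and 5 (octal) purely as an illustration; the abstract does not state which codes were simulated, and the function name is hypothetical.

```python
def conv_encode(bits, generators=(0b111, 0b101), constraint_length=3):
    """Rate 1/len(generators) convolutional encoder with zero-tail flushing."""
    mask = (1 << constraint_length) - 1
    state = 0  # shift register; the newest input bit sits in the lowest position
    out = []
    # Append K-1 zeros so the register returns to the all-zero state.
    for b in list(bits) + [0] * (constraint_length - 1):
        state = ((state << 1) | b) & mask
        for g in generators:
            # Each output symbol is the parity of the register bits tapped by g.
            out.append(bin(state & g).count("1") % 2)
    return out


coded = conv_encode([1, 0, 1, 1])
```

Each input bit influences `constraint_length` consecutive output pairs, which is the redundancy a decoder exploits when the received symbols are noisy.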
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garnier, Ch.; Mailhe, P.; Sontheimer, F.
2007-07-01
Fuel performance is a key factor for minimizing operating costs in nuclear plants. One of the important aspects of fuel performance is fuel rod design, based upon reliable tools able to verify the safety of current fuel solutions, prevent potential issues in new core managements and guide the invention of tomorrow's fuels. AREVA is developing its future global fuel rod code COPERNIC3, which is able to calculate the thermal-mechanical behavior of advanced fuel rods in nuclear plants. Some of the best practices to achieve this goal are described, by reviewing the three pillars of a fuel rod code: the database, the modelling and the computer and numerical aspects. First, the COPERNIC3 database content is described, accompanied by the tools developed to effectively exploit the data. Then an overview of the main modelling aspects is given, emphasizing the thermal, fission gas release and mechanical sub-models. In the last part, numerical solutions are detailed in order to increase the computational performance of the code, with a presentation of software configuration management solutions. (authors)
NASA Technical Reports Server (NTRS)
Morgan, Philip E.
2004-01-01
This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess scalability, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) so that it can effectively model antenna problems; apply lessons learned in the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Off-design performance analysis of MHD generator channels
NASA Technical Reports Server (NTRS)
Wilson, D. R.; Williams, T. S.
1980-01-01
A computer code for performing parametric design point calculations, and evaluating the off-design performance of MHD generators has been developed. The program is capable of analyzing Faraday, Hall, and DCW channels, including the effect of electrical shorting in the gas boundary layers and coal slag layers. Direct integration of the electrode voltage drops is included. The program can be run in either the design or off-design mode. Details of the computer code, together with results of a study of the design and off-design performance of the proposed ETF MHD generator are presented. Design point variations of pre-heat and stoichiometry were analyzed. The off-design study included variations in mass flow rate and oxygen enrichment.
Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers
Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne
2018-01-01
State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems. PMID:29503613
ACON: a multipurpose production controller for plasma physics codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Snell, C.
1983-01-01
ACON is a BCON controller designed to run large production codes on the CTSS Cray-1 or the LTSS 7600 computers. ACON can also be operated interactively, with input from the user's terminal. The controller can run one code or a sequence of up to ten codes during the same job. Options are available to get and save Mass storage files, to perform Historian file updating operations, to compile and load source files, and to send out print and film files. Special features include ability to retry after Mass failures, backup options for saving files, startup messages for the various codes, and ability to reserve specified amounts of computer time after successive code runs. ACON's flexibility and power make it useful for running a number of different production codes.
Automatic Processing of Reactive Polymers
NASA Technical Reports Server (NTRS)
Roylance, D.
1985-01-01
A series of process modeling computer codes were examined. The codes use finite element techniques to determine the time-dependent process parameters operative during nonisothermal reactive flows such as can occur in reaction injection molding or composites fabrication. The use of these analytical codes to perform experimental control functions is examined; since the models can determine the state of all variables everywhere in the system, they can be used in a manner similar to currently available experimental probes. A small but well instrumented reaction vessel in which fiber-reinforced plaques are cured using computer control and data acquisition was used. The finite element codes were also extended to treat this particular process.
Analysis of electrophoresis performance
NASA Technical Reports Server (NTRS)
Roberts, Glyn O.
1988-01-01
A flexible, efficient computer code is being developed to simulate electrophoretic separation phenomena, in either a cylindrical or a rectangular geometry. The code will compute the evolution in time of the concentrations of an arbitrary number of chemical species, and of the temperature, pH distribution, conductivity, electric field, and fluid motion. Use of nonuniform meshes and fast accurate implicit time-stepping will yield accurate answers at economical cost.
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code
NASA Astrophysics Data System (ADS)
Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O'Neill, B. J.; Nolting, C.; Edmon, P.; Donnert, J. M. F.; Jones, T. W.
2017-02-01
We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
Wake coupling to full potential rotor analysis code
NASA Technical Reports Server (NTRS)
Torres, Francisco J.; Chang, I-Chung; Oh, Byung K.
1990-01-01
The wake information from a helicopter forward flight code is coupled with two transonic potential rotor codes. The induced velocities for the near-, mid-, and far-wake geometries are extracted from a nonlinear rigid wake of a standard performance and analysis code. These, together with the corresponding inflow angles, computation points, and azimuth angles, are then incorporated into the transonic potential codes. The coupled codes can then provide an improved prediction of rotor blade loading at transonic speeds.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
NASA Astrophysics Data System (ADS)
Takaishi, Tetsuya
2015-01-01
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve a speedup similar to that of CUDA Fortran.
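The Hybrid Monte Carlo step named in the abstract above can be illustrated with a minimal serial sketch. It is not the authors' CUDA Fortran code and does not implement the RSV model: the target is an assumed one-dimensional standard normal, and the function name `hmc_sample`, the step size, and the trajectory length are all hypothetical choices made only for illustration.

```python
import math
import random

def hmc_sample(neg_log_prob, grad, x0, n_samples, step=0.2, n_leapfrog=10, seed=0):
    """Draw samples with Hybrid (Hamiltonian) Monte Carlo in one dimension.

    neg_log_prob(x) is the potential U(x); grad(x) is dU/dx.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)          # refresh the momentum
        x_new, p_new = x, p
        # Leapfrog integration: half momentum step, alternating full steps,
        # then a closing half momentum step.
        p_new -= 0.5 * step * grad(x_new)
        for i in range(n_leapfrog):
            x_new += step * p_new
            if i != n_leapfrog - 1:
                p_new -= step * grad(x_new)
        p_new -= 0.5 * step * grad(x_new)
        # Metropolis test on the change in total energy H = U + p^2 / 2.
        h_old = neg_log_prob(x) + 0.5 * p * p
        h_new = neg_log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples

# Assumed toy target: standard normal, U(x) = x^2 / 2.
draws = hmc_sample(lambda x: 0.5 * x * x, lambda x: x, x0=0.0, n_samples=2000)
print(sum(draws) / len(draws))
```

In a model with many latent volatility variables, the gradient and energy evaluations inside the leapfrog loop are the parts that would plausibly be spread across GPU threads; the sketch keeps everything scalar for readability.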
NASA Technical Reports Server (NTRS)
Leonardo, M.; Tsuchiya, T.; Murthy, S. N. B.
1982-01-01
A model for predicting the performance of a multi-spool axial-flow compressor with a fan during operation with water ingestion was developed incorporating several two-phase fluid flow effects as follows: (1) ingestion of water, (2) droplet interaction with blades and resulting changes in blade characteristics, (3) redistribution of water and water vapor due to centrifugal action, (4) heat and mass transfer processes, and (5) droplet size adjustment due to mass transfer and mechanical stability considerations. A computer program, called the PURDU-WINCOF code, was generated based on the model utilizing a one-dimensional formulation. An illustrative case serves to show the manner in which the code can be utilized and the nature of the results obtained.
NASA Astrophysics Data System (ADS)
Lee, Eun Seok
2000-10-01
Improved aerodynamic performance of a turbine cascade shape can be achieved by understanding the flow field associated with the stator-rotor interaction. In this research, an axial gas turbine airfoil cascade shape is optimized for improved aerodynamic performance by using an unsteady Navier-Stokes solver and a parallel genetic algorithm. The objective of the research is twofold: (1) to develop a computational fluid dynamics code having faster convergence rate and unsteady flow simulation capabilities, and (2) to optimize a turbine airfoil cascade shape with unsteady passing wakes for improved aerodynamic performance. The computer code solves the Reynolds averaged Navier-Stokes equations. It is based on the explicit, finite difference, Runge-Kutta time marching scheme and the Diagonalized Alternating Direction Implicit (DADI) scheme, with the Baldwin-Lomax algebraic and k-epsilon turbulence modeling. Improvements in the code focused on the cascade shape design capability, convergence acceleration and unsteady formulation. First, the inverse shape design method was implemented in the code to provide the design capability, where a surface transpiration concept was employed as an inverse technique to modify the geometry satisfying the user specified pressure distribution on the airfoil surface. Second, an approximation storage multigrid method was implemented as an acceleration technique. Third, the preconditioning method was adopted to speed up the convergence rate in solving the low Mach number flows. Finally, the implicit dual time stepping method was incorporated in order to simulate the unsteady flow-fields. For validation of the unsteady code, Stokes's second problem and Poiseuille flow were chosen, and the computed results were compared with the analytic solutions.
To test the code's ability to capture the natural unsteady flow phenomena, vortex shedding past a cylinder and the shock oscillation over a bicircular airfoil were simulated and compared with experiments and other research results. The rotor cascade shape optimization with unsteady passing wakes was performed to obtain an improved aerodynamic performance using the unsteady Navier-Stokes solver. Two objective functions were defined as minimization of total pressure loss and maximization of lift, while the mass flow rate was fixed. A parallel genetic algorithm was used as an optimizer and the penalty method was introduced. Each individual's objective function was computed simultaneously by using a 32 processor distributed memory computer. One optimization took about four days.
Design Trade-off Between Performance and Fault-Tolerance of Space Onboard Computers
NASA Astrophysics Data System (ADS)
Gorbunov, M. S.; Antonov, A. A.
2017-01-01
It is well known that there is a trade-off between performance and power consumption in onboard computers. The fault-tolerance is another important factor affecting performance, chip area and power consumption. Involving special SRAM cells and error-correcting codes is often too expensive with relation to the performance needed. We discuss the possibility of finding the optimal solutions for modern onboard computer for scientific apparatus focusing on multi-level cache memory design.
NASA Technical Reports Server (NTRS)
Klopfer, Goetz H.
1993-01-01
The work performed during the past year on this cooperative agreement covered two major areas and two lesser ones. The two major items included further development and validation of the Compressible Navier-Stokes Finite Volume (CNSFV) code and providing computational support for the Laminar Flow Supersonic Wind Tunnel (LFSWT). The two lesser items involve a Navier-Stokes simulation of an oscillating control surface at transonic speeds and improving the basic algorithm used in the CNSFV code for faster convergence rates and more robustness. The work done in all four areas is in support of the High Speed Research Program at NASA Ames Research Center.
Automated error correction in IBM quantum computer and explicit generalization
NASA Astrophysics Data System (ADS)
Ghosh, Debjit; Agarwal, Pratik; Pandey, Pratyush; Behera, Bikash K.; Panigrahi, Prasanta K.
2018-06-01
Construction of a fault-tolerant quantum computer remains a challenging problem due to unavoidable noise and fragile quantum states. However, this goal can be achieved by introducing quantum error-correcting codes. Here, we experimentally realize an automated error correction code and demonstrate the nondestructive discrimination of GHZ states in IBM 5-qubit quantum computer. After performing quantum state tomography, we obtain the experimental results with a high fidelity. Finally, we generalize the investigated code for maximally entangled n-qudit case, which could both detect and automatically correct any arbitrary phase-change error, or any phase-flip error, or any bit-flip error, or combined error of all types of error.
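As a rough illustration of how syndrome-based correction of a bit-flip error works, the sketch below simulates the textbook three-bit repetition code on classical bits. This is an assumed stand-in, not the circuit realized on the IBM device: it handles only a single bit flip on a computational-basis state, and the names `encode`, `syndrome`, and `correct` are hypothetical.

```python
def encode(bit):
    """Encode one logical bit into three physical bits (repetition code)."""
    return [bit, bit, bit]

def syndrome(block):
    """Two parity checks; they locate a single flip without reading the data."""
    return (block[0] ^ block[1], block[1] ^ block[2])

# Map each syndrome to the index of the bit to flip back (None = no error).
CORRECTION = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}

def correct(block):
    """Return a copy of the block with the single flagged bit flipped back."""
    flip = CORRECTION[syndrome(block)]
    fixed = list(block)
    if flip is not None:
        fixed[flip] ^= 1
    return fixed

# Inject one bit-flip error on each position in turn and recover the codeword.
for position in range(3):
    noisy = encode(1)
    noisy[position] ^= 1
    assert correct(noisy) == encode(1)
```

The parity checks play the role of the ancilla measurements in a quantum circuit: they reveal where the error sits while leaving the encoded value unread.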
Investigation of Near Shannon Limit Coding Schemes
NASA Technical Reports Server (NTRS)
Kwatra, S. C.; Kim, J.; Mo, Fan
1999-01-01
Turbo codes can deliver performance that is very close to the Shannon limit. This report investigates algorithms for convolutional turbo codes and block turbo codes. Both coding schemes can achieve performance near the Shannon limit. The performance of the schemes is obtained using computer simulations. There are three sections in this report. The first section is the introduction, which discusses the fundamentals of coding, block coding and convolutional coding. In the second section, the basic concepts of convolutional turbo codes are introduced and the performance of turbo codes, especially high rate turbo codes, is provided from the simulation results. After introducing all the parameters that help turbo codes achieve such good performance, it is concluded that output weight distribution should be the main consideration in designing turbo codes. Based on the output weight distribution, the performance bounds for turbo codes are given. Then, the relationships between the output weight distribution and factors such as the generator polynomial, interleaver and puncturing pattern are examined. The criterion for the best selection of system components is provided. The puncturing pattern algorithm is discussed in detail. Different puncturing patterns are compared for each high rate. For most of the high rate codes, the puncturing pattern does not show any significant effect on the code performance if a pseudo-random interleaver is used in the system. For some special rate codes with poor performance, an alternative puncturing algorithm is designed which restores their performance close to the Shannon limit. Finally, in section three, for iterative decoding of block codes, the method of building a trellis for block codes, the structure of the iterative decoding system and the calculation of extrinsic values are discussed.
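The role of a puncturing pattern in raising the code rate can be shown with a short sketch. It assumes a generic rate-1/3 turbo encoder output (one systematic stream and two parity streams) and a periodic pattern that always keeps the systematic bit; the patterns and the function names `puncture` and `code_rate` are illustrative assumptions, not the patterns studied in the report.

```python
from fractions import Fraction

def puncture(systematic, parity1, parity2, pattern):
    """Apply a periodic puncturing pattern to a rate-1/3 encoder output.

    pattern is a list of (keep_parity1, keep_parity2) flags, one pair per
    position in the period. The systematic bit is always transmitted.
    """
    out = []
    for t, s in enumerate(systematic):
        keep_p1, keep_p2 = pattern[t % len(pattern)]
        out.append(s)
        if keep_p1:
            out.append(parity1[t])
        if keep_p2:
            out.append(parity2[t])
    return out

def code_rate(pattern):
    """Information bits per transmitted bit over one pattern period."""
    period = len(pattern)
    sent = period + sum(k1 + k2 for k1, k2 in pattern)
    return Fraction(period, sent)

# Alternate the two parity streams: rate 1/3 becomes rate 1/2.
half_rate = [(1, 0), (0, 1)]
# Keep one parity bit for every two information bits: rate 2/3.
two_thirds = [(1, 0), (0, 0), (0, 1), (0, 0)]

print(code_rate(half_rate), code_rate(two_thirds))
print(puncture([1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0], half_rate))
```

Many different patterns give the same rate, which is why the choice among them becomes a design question for high rate codes.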
NASA Astrophysics Data System (ADS)
Eisenbach, Markus; Larkin, Jeff; Lutjens, Justin; Rennich, Steven; Rogers, James H.
2017-02-01
The Locally Self-consistent Multiple Scattering (LSMS) code solves the first principles Density Functional theory Kohn-Sham equation for a wide range of materials with a special focus on metals, alloys and metallic nano-structures. It has traditionally exhibited near perfect scalability on massively parallel high performance computer architectures. We present our efforts to exploit GPUs to accelerate the LSMS code to enable first principles calculations of O(100,000) atoms and statistical physics sampling of finite temperature properties. We reimplement the scattering matrix calculation for GPUs with a block matrix inversion algorithm that only uses accelerator memory. Using the Cray XK7 system Titan at the Oak Ridge Leadership Computing Facility we achieve a sustained performance of 14.5PFlop/s and a speedup of 8.6 compared to the CPU only code.
NASA Technical Reports Server (NTRS)
Bauer, Brent
1993-01-01
This paper discusses the development of a FORTRAN computer code to perform agility analysis on aircraft configurations. This code is to be part of the NASA-Ames ACSYNT (AirCraft SYNThesis) design code. This paper begins with a discussion of contemporary agility research in the aircraft industry and a survey of a few agility metrics. The methodology, techniques and models developed for the code are then presented. Finally, example trade studies using the agility module along with ACSYNT are illustrated. These trade studies were conducted using a Northrop F-20 Tigershark aircraft model. The studies show that the agility module is effective in analyzing the influence of common parameters such as thrust-to-weight ratio and wing loading on agility criteria. The module can compare the agility potential between different configurations. In addition, one study illustrates the module's ability to optimize a configuration's agility performance.
Eisenbach, Markus; Larkin, Jeff; Lutjens, Justin; ...
2016-07-12
The Locally Self-consistent Multiple Scattering (LSMS) code solves the first principles Density Functional theory Kohn–Sham equation for a wide range of materials with a special focus on metals, alloys and metallic nano-structures. It has traditionally exhibited near perfect scalability on massively parallel high performance computer architectures. In this paper, we present our efforts to exploit GPUs to accelerate the LSMS code to enable first principles calculations of O(100,000) atoms and statistical physics sampling of finite temperature properties. We reimplement the scattering matrix calculation for GPUs with a block matrix inversion algorithm that only uses accelerator memory. Finally, using the Cray XK7 system Titan at the Oak Ridge Leadership Computing Facility we achieve a sustained performance of 14.5PFlop/s and a speedup of 8.6 compared to the CPU only code.
Duct flow nonuniformities for Space Shuttle Main Engine (SSME)
NASA Technical Reports Server (NTRS)
1987-01-01
A three-duct Space Shuttle Main Engine (SSME) Hot Gas Manifold geometry code was developed for use. The methodology of the program is described, recommendations on its implementation made, and an input guide, input deck listing, and a source code listing provided. The code listing is strewn with an abundance of comments to assist the user in following its development and logic. A working source deck will be provided. A thorough analysis was made of the proper boundary conditions and chemistry kinetics necessary for an accurate computational analysis of the flow environment in the SSME fuel side preburner chamber during the initial startup transient. Pertinent results were presented to facilitate incorporation of these findings into an appropriate CFD code. The computation must be a turbulent computation, since the flow field turbulent mixing will have a profound effect on the chemistry. Because of the additional equations demanded by the chemistry model it is recommended that for expediency a simple algebraic mixing length model be adopted. Performing this computation for all or selected time intervals of the startup time will require an abundance of computer CPU time regardless of the specific CFD code selected.
CFD Based Computations of Flexible Helicopter Blades for Stability Analysis
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.
2011-01-01
As a collaborative effort among government aerospace research laboratories, an advanced version of a widely used computational fluid dynamics code, OVERFLOW, was recently released. This latest version includes additions to model flexible rotating multiple blades. In this paper, the OVERFLOW code is applied to improve the accuracy of airload computations from the linear lifting line theory that uses displacements from a beam model. Data transfers required at every revolution are managed through a Unix based script that runs jobs on large super-cluster computers. Results are demonstrated for the 4-bladed UH-60A helicopter. Deviations of computed data from flight data are evaluated. Fourier analysis post-processing that is suitable for aeroelastic stability computations is performed.
Generalized Advanced Propeller Analysis System (GAPAS). Volume 2: Computer program user manual
NASA Technical Reports Server (NTRS)
Glatt, L.; Crawford, D. R.; Kosmatka, J. B.; Swigart, R. J.; Wong, E. W.
1986-01-01
The Generalized Advanced Propeller Analysis System (GAPAS) computer code is described. GAPAS was developed to analyze advanced technology multi-bladed propellers which operate on aircraft with speeds up to Mach 0.8 and altitudes up to 40,000 feet. GAPAS includes technology for analyzing aerodynamic, structural, and acoustic performance of propellers. The computer code was developed for the CDC 7600 computer and is currently available for industrial use on the NASA Langley computer. A description of all the analytical models incorporated in GAPAS is included. Sample calculations are also described, as well as user requirements for modifying the analysis system. Computer system core requirements and running times are also discussed.
Balancing Particle and Mesh Computation in a Particle-In-Cell Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Worley, Patrick H; D'Azevedo, Eduardo; Hager, Robert
2016-01-01
The XGC1 plasma microturbulence particle-in-cell simulation code has both particle-based and mesh-based computational kernels that dominate performance. Both of these are subject to load imbalances that can degrade performance and that evolve during a simulation. Each separately can be addressed adequately, but optimizing just for one can introduce significant load imbalances in the other, degrading overall performance. A technique has been developed based on Golden Section Search that minimizes wallclock time given prior information on wallclock time, and on current particle distribution and mesh cost per cell, and also adapts to evolution in load imbalance in both particle and mesh work. In problems of interest this doubled the performance on full system runs on the XK7 at the Oak Ridge Leadership Computing Facility compared to load balancing only one of the kernels.
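Golden Section Search, on which the abstract above says the load-balancing technique is based, can be sketched generically as follows. The cost model `wallclock` and its single tuning parameter `w` are hypothetical stand-ins chosen so that the function is unimodal; they are not taken from XGC1.

```python
import math

def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Return an approximate minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0   # 1 / golden ratio, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0

def wallclock(w):
    """Hypothetical step time: one kernel slows and the other speeds up as
    the balancing weight w shifts work between them."""
    particle_time = 1.0 + 2.0 * w
    mesh_time = 3.0 - 1.5 * w
    return max(particle_time, mesh_time)

best_w = golden_section_minimize(wallclock, 0.0, 1.0)
print(best_w, wallclock(best_w))
```

Each iteration reuses one of the two previous evaluations, so only one new timing measurement is needed per step, which matters when every evaluation is an expensive simulation interval.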
Fault-Tolerant Computing: An Overview
1991-06-01
Addison Wesley, Reading, MA) 1984. [8] J. Wakerly, Error Detecting Codes, Self-Checking Circuits and Applications, (Elsevier North Holland, Inc., New York... applicable to bit-sliced organizations of hardware. In the first time step, the normal computation is performed on the operands and the results...for error detection and fault tolerance in parallel processor systems while performing specific computation-intensive applications [111. Contrary to
Next-generation acceleration and code optimization for light transport in turbid media using GPUs
Alerstam, Erik; Lo, William Chun Yip; Han, Tianyi David; Rose, Jonathan; Andersson-Engels, Stefan; Lilge, Lothar
2010-01-01
A highly optimized Monte Carlo (MC) code package for simulating light transport is developed on the latest graphics processing unit (GPU) built for general-purpose computing from NVIDIA - the Fermi GPU. In biomedical optics, the MC method is the gold standard approach for simulating light transport in biological tissue, both due to its accuracy and its flexibility in modelling realistic, heterogeneous tissue geometry in 3-D. However, the widespread use of MC simulations in inverse problems, such as treatment planning for PDT, is limited by their long computation time. Despite its parallel nature, optimizing MC code on the GPU has been shown to be a challenge, particularly when the sharing of simulation result matrices among many parallel threads demands the frequent use of atomic instructions to access the slow GPU global memory. This paper proposes an optimization scheme that utilizes the fast shared memory to resolve the performance bottleneck caused by atomic access, and discusses numerous other optimization techniques needed to harness the full potential of the GPU. Using these techniques, a widely accepted MC code package in biophotonics, called MCML, was successfully accelerated on a Fermi GPU by approximately 600x compared to a state-of-the-art Intel Core i7 CPU. A skin model consisting of 7 layers was used as the standard simulation geometry. To demonstrate the possibility of GPU cluster computing, the same GPU code was executed on four GPUs, showing a linear improvement in performance with an increasing number of GPUs. The GPU-based MCML code package, named GPU-MCML, is compatible with a wide range of graphics cards and is released as an open-source software in two versions: an optimized version tuned for high performance and a simplified version for beginners (http://code.google.com/p/gpumcml). PMID:21258498
Kuiper, L.K.
1985-01-01
A numerical code is documented for the simulation of variable density time dependent groundwater flow in three dimensions. The groundwater density, although variable with distance, is assumed to be constant in time. The Integrated Finite Difference grid elements in the code follow the geologic strata in the modeled area. If appropriate, the determination of hydraulic head in confining beds can be deleted to decrease computation time. The strongly implicit procedure (SIP), successive over-relaxation (SOR), and eight different preconditioned conjugate gradient (PCG) methods are used to solve the approximating equations. The use of the computer program that performs the calculations in the numerical code is emphasized. Detailed instructions are given for using the computer program, including input data formats. An example simulation and the Fortran listing of the program are included. (USGS)
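Of the solvers listed in the abstract above, successive over-relaxation (SOR) is the simplest to sketch. The example below applies it to a small, assumed tridiagonal system of the kind a finite-difference flow discretization produces; the matrix, right-hand side, and relaxation factor are illustrative and are not taken from the documented program.

```python
def sor_solve(A, b, omega=1.2, tol=1e-10, max_iter=10_000):
    """Solve A x = b by successive over-relaxation.

    A is a list of rows; convergence is assumed (for example, A symmetric
    positive definite with 0 < omega < 2).
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(n):
            # Gauss-Seidel value using the most recent neighbours.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel = (b[i] - sigma) / A[i][i]
            # Over-relax: move past the Gauss-Seidel value by the factor omega.
            new_xi = (1.0 - omega) * x[i] + omega * gauss_seidel
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            break
    return x

# Assumed 3-cell example whose exact solution is [1, 2, 3].
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
print(sor_solve(A, b))
```

Setting `omega` to 1 reduces the update to plain Gauss-Seidel; values between 1 and 2 usually reduce the iteration count for problems of this type.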
NASA Technical Reports Server (NTRS)
Carlson, Harry W.; Darden, Christine M.
1988-01-01
Extensive correlations of computer code results with experimental data are employed to illustrate the use of linearized theory attached flow methods for the estimation and optimization of the aerodynamic performance of simple hinged flap systems. Use of attached flow methods is based on the premise that high levels of aerodynamic efficiency require a flow that is as nearly attached as circumstances permit. A variety of swept wing configurations are considered ranging from fighters to supersonic transports, all with leading- and trailing-edge flaps for enhancement of subsonic aerodynamic efficiency. The results indicate that linearized theory attached flow computer code methods provide a rational basis for the estimation and optimization of flap system aerodynamic performance at subsonic speeds. The analysis also indicates that vortex flap design is not an opposing approach but is closely related to attached flow design concepts. The successful vortex flap design actually suppresses the formation of detached vortices to produce a small vortex which is restricted almost entirely to the leading edge flap itself.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greenough, Jeffrey A.; de Supinski, Bronis R.; Yates, Robert K.
2005-04-25
We describe the performance of the block-structured Adaptive Mesh Refinement (AMR) code Raptor on the 32k node IBM BlueGene/L computer. This machine represents a significant step forward towards petascale computing. As such, it presents Raptor with many challenges for utilizing the hardware efficiently. In terms of performance, Raptor shows excellent weak and strong scaling when running in single level mode (no adaptivity). Hardware performance monitors show Raptor achieves an aggregate performance of 3.0 Tflops in the main integration kernel on the 32k system. Results from preliminary AMR runs on a prototype astrophysical problem demonstrate the efficiency of the current software when running at large scale. The BG/L system is enabling a physics problem to be considered that represents a factor of 64 increase in overall size compared to the largest ones of this type computed to date. Finally, we provide a description of the development work currently underway to address our inefficiencies.
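The weak and strong scaling mentioned in the abstract above are usually summarized as efficiencies relative to a baseline run. The sketch below uses the common textbook definitions; the node counts and timings in the example are made up for illustration and are not Raptor measurements.

```python
def strong_scaling_efficiency(base_nodes, base_time, nodes, time):
    """Fixed total problem size: ideally time falls in proportion to 1/nodes."""
    return (base_nodes * base_time) / (nodes * time)

def weak_scaling_efficiency(base_time, time):
    """Fixed problem size per node: ideally time stays constant as nodes grow."""
    return base_time / time

# Hypothetical timings in seconds per step.
print(strong_scaling_efficiency(1024, 100.0, 2048, 55.0))
print(weak_scaling_efficiency(100.0, 125.0))
```

An efficiency of 1.0 means ideal scaling; values below 1.0 indicate the fraction of the added hardware that is doing useful work.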
External-Compression Supersonic Inlet Design Code
NASA Technical Reports Server (NTRS)
Slater, John W.
2011-01-01
A computer code named SUPIN has been developed to perform aerodynamic design and analysis of external-compression, supersonic inlets. The baseline set of inlets includes axisymmetric pitot, two-dimensional single-duct, axisymmetric outward-turning, and two-dimensional bifurcated-duct inlets. The aerodynamic methods are based on low-fidelity analytical and numerical procedures. The geometric methods are based on planar geometry elements. SUPIN has three modes of operation: 1) generate the inlet geometry from an explicit set of geometry information, 2) size and design the inlet geometry and analyze the aerodynamic performance, and 3) compute the aerodynamic performance of a specified inlet geometry. The aerodynamic performance quantities include inlet flow rates, total pressure recovery, and drag. The geometry output from SUPIN includes inlet dimensions, cross-sectional areas, coordinates of planar profiles, and surface grids suitable for input to grid generators for analysis by computational fluid dynamics (CFD) methods. The input data file for SUPIN and the output file from SUPIN are text (ASCII) files. The surface grid files are output as formatted Plot3D or stereolithography (STL) files. SUPIN executes in batch mode and is available as a Microsoft Windows executable and Fortran95 source code with a makefile for Linux.
Stress Analysis and Fracture in Nanolaminate Composites
NASA Technical Reports Server (NTRS)
Chamis, Christos C.
2008-01-01
A stress analysis is performed on a nanolaminate subjected to bending. A composite mechanics computer code that is based on constituent properties and nanoelement formulation is used to evaluate the nanolaminate stresses. The results indicate that the computer code is sufficient for the analysis. The results also show that when a stress concentration is present, the nanolaminate stresses exceed their corresponding matrix-dominated strengths and the nanofiber fracture strength.
Working research codes into fluid dynamics education: a science gateway approach
NASA Astrophysics Data System (ADS)
Mason, Lachlan; Hetherington, James; O'Reilly, Martin; Yong, May; Jersakova, Radka; Grieve, Stuart; Perez-Suarez, David; Klapaukh, Roman; Craster, Richard V.; Matar, Omar K.
2017-11-01
Research codes are effective for illustrating complex concepts in educational fluid dynamics courses: compared to textbook examples, an interactive three-dimensional visualisation can bring a problem to life! Various barriers, however, prevent the adoption of research codes in teaching: codes are typically created for highly-specific 'once-off' calculations and, as such, have no user interface and a steep learning curve. Moreover, a code may require access to high-performance computing resources that are not readily available in the classroom. This project allows academics to rapidly work research codes into their teaching via a minimalist 'science gateway' framework. The gateway is a simple, yet flexible, web interface allowing students to construct and run simulations, as well as view and share their output. Behind the scenes, the common operations of job configuration, submission, monitoring and post-processing are customisable at the level of shell scripting. In this talk, we demonstrate the creation of an example teaching gateway connected to the Code BLUE fluid dynamics software. Student simulations can be run via a third-party cloud computing provider or a local high-performance cluster. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).
WINCLR: a Computer Code for Heat Transfer and Clearance Calculation in a Compressor
NASA Technical Reports Server (NTRS)
Bose, T. K.; Murthy, S. N. B.
1994-01-01
One of the concerns during inclement weather operation of aircraft in rain and hail storm conditions is the nature and extent of changes in compressor casing clearance. An increase in clearance affects efficiency while a decrease may cause blade rubbing with the casing. The change in clearance is the result of geometrical dimensional changes in the blades, the casing and the rotor due to heat transfer between those parts and the two-phase working fluid. The heat transfer interacts nonlinearly with the performance of the compressor, and, therefore, the determination of clearance changes necessitates a simultaneous determination of change in performance of the compressor. A computer code, WINCLR, has been designed for the determination of casing clearance; it is operated interactively with the PURDU-WINCOF I code, designed previously for determining the performance of a compressor. A detailed description of the WINCLR code is provided in a companion report. The current report provides details of the code with an illustrative example of application to the case of a multistage compressor. It is found in the example case that under given ingestion and operational conditions, it is possible for a compressor to undergo changes in performance in the front stages and rubbing in the back stages.
Fault-tolerance in Two-dimensional Topological Systems
NASA Astrophysics Data System (ADS)
Anderson, Jonas T.
This thesis is a collection of ideas with the general goal of building, at least in the abstract, a local fault-tolerant quantum computer. The connection between quantum information and topology has proven to be an active area of research in several fields. The introduction of the toric code by Alexei Kitaev demonstrated the usefulness of topology for quantum memory and quantum computation. Many quantum codes used for quantum memory are modeled by spin systems on a lattice, with operators that extract syndrome information placed on vertices or faces of the lattice. It is natural to wonder whether the useful codes in such systems can be classified. This thesis presents work that leverages ideas from topology and graph theory to explore the space of such codes. Homological stabilizer codes are introduced and it is shown that, under a set of reasonable assumptions, any qubit homological stabilizer code is equivalent to either a toric code or a color code. Additionally, the toric code and the color code correspond to distinct classes of graphs. Many systems have been proposed as candidate quantum computers. It is very desirable to design quantum computing architectures with two-dimensional layouts and low complexity in parity-checking circuitry. Kitaev's surface codes provided the first example of codes satisfying this property. They provided a new route to fault tolerance with more modest overheads and thresholds approaching 1%. The recently discovered color codes share many properties with the surface codes, such as the ability to perform syndrome extraction locally in two dimensions. Some families of color codes admit a transversal implementation of the entire Clifford group. This work investigates color codes on the 4.8.8 lattice known as triangular codes. I develop a fault-tolerant error-correction strategy for these codes in which repeated syndrome measurements on this lattice generate a three-dimensional space-time combinatorial structure. 
I then develop an integer program that analyzes this structure and determines the most likely set of errors consistent with the observed syndrome values. I implement this integer program to find the threshold for depolarizing noise on small versions of these triangular codes. Because the threshold for magic-state distillation is likely to be higher than this value and because logical
NASA Astrophysics Data System (ADS)
Russkova, Tatiana V.
2017-11-01
One tool to improve the performance of Monte Carlo methods for numerical simulation of light transport in the Earth's atmosphere is parallel technology. A new algorithm oriented to parallel execution on CUDA-enabled NVIDIA graphics processors is discussed. The efficiency of parallelization is analyzed on the basis of calculating the upward and downward fluxes of solar radiation in both vertically homogeneous and inhomogeneous models of the atmosphere. The results of testing the new code under various atmospheric conditions, including continuous single-layered and multilayered clouds and selective molecular absorption, are presented. The results of testing the code using video cards with different compute capability are analyzed. It is shown that the changeover of computing from conventional PCs to the architecture of graphics processors gives more than a hundredfold increase in performance and fully reveals the capabilities of the technology used.
NASA Technical Reports Server (NTRS)
Mclennan, G. A.
1986-01-01
This report describes, and is a User's Manual for, a computer code (ANL/RBC) which calculates cycle performance for Rankine bottoming cycles extracting heat from a specified source gas stream. The code calculates cycle power and efficiency and the sizes for the heat exchangers, using tabular input of the properties of the cycle working fluid. An option is provided to calculate the costs of system components from user defined input cost functions. These cost functions may be defined in equation form or by numerical tabular data. A variety of functional forms have been included for these functions and they may be combined to create very general cost functions. An optional calculation mode can be used to determine the off-design performance of a system when operated away from the design-point, using the heat exchanger areas calculated for the design-point.
Implementing a strand of a scalable fault-tolerant quantum computing fabric.
Chow, Jerry M; Gambetta, Jay M; Magesan, Easwar; Abraham, David W; Cross, Andrew W; Johnson, B R; Masluk, Nicholas A; Ryan, Colm A; Smolin, John A; Srinivasan, Srikanth J; Steffen, M
2014-06-24
With favourable error thresholds and requiring only nearest-neighbour interactions on a lattice, the surface code is an error-correcting code that has garnered considerable attention. At the heart of this code is the ability to perform a low-weight parity measurement of local code qubits. Here we demonstrate high-fidelity parity detection of two code qubits via measurement of a third syndrome qubit. With high-fidelity gates, we generate entanglement distributed across three superconducting qubits in a lattice where each code qubit is coupled to two bus resonators. Via high-fidelity measurement of the syndrome qubit, we deterministically entangle the code qubits in either an even or odd parity Bell state, conditioned on the syndrome qubit state. Finally, to fully characterize this parity readout, we develop a measurement tomography protocol. The lattice presented naturally extends to larger networks of qubits, outlining a path towards fault-tolerant quantum computing.
The Navy/NASA Engine Program (NNEP89): A user's manual
NASA Technical Reports Server (NTRS)
Plencner, Robert M.; Snyder, Christopher A.
1991-01-01
An engine simulation computer code called NNEP89 was written to perform 1-D steady state thermodynamic analysis of turbine engine cycles. By using a very flexible method of input, a set of standard components are connected at execution time to simulate almost any turbine engine configuration that the user could imagine. The code was used to simulate a wide range of engine cycles from turboshafts and turboprops to air turborockets and supersonic cruise variable cycle engines. Off design performance is calculated through the use of component performance maps. A chemical equilibrium model is incorporated to adequately predict chemical dissociation as well as model virtually any fuel. NNEP89 is written in standard FORTRAN77 with clear structured programming and extensive internal documentation. The standard FORTRAN77 programming allows it to be installed onto most mainframe computers and workstations without modification. The NNEP89 code was derived from the Navy/NASA Engine program (NNEP). NNEP89 provides many improvements and enhancements to the original NNEP code and incorporates features which make it easier to use for the novice user. This is a comprehensive user's guide for the NNEP89 code.
NASA Technical Reports Server (NTRS)
Allison, Dennis O.; Waggoner, E. G.
1990-01-01
Computational predictions of the effects of wing contour modifications on maximum lift and transonic performance were made and verified against low speed and transonic wind tunnel data. This effort was part of a program to improve the maneuvering capability of the EA-6B electronics countermeasures aircraft, which evolved from the A-6 attack aircraft. The predictions were based on results from three computer codes which all include viscous effects: MCARF, a 2-D subsonic panel code; TAWFIVE, a transonic full potential code; and WBPPW, a transonic small disturbance potential flow code. The modifications were previously designed with the aid of these and other codes. The wing modifications consist of contour changes to the leading edge slats and trailing edge flaps and were designed for increased maximum lift with minimum effect on transonic performance. The predictions of the effects of the modifications are presented, with emphasis on verification through comparisons with wind tunnel data from the National Transonic Facility. Attention is focused on increments in low speed maximum lift and increments in transonic lift, pitching moment, and drag resulting from the contour modifications.
A Measurement and Simulation Based Methodology for Cache Performance Modeling and Tuning
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)
1998-01-01
We present a cache performance modeling methodology that facilitates the tuning of uniprocessor cache performance for applications executing on shared memory multiprocessors by accurately predicting the effects of source code level modifications. Measurements on a single processor are initially used for identifying parts of code where cache utilization improvements may significantly impact the overall performance. Cache simulation based on trace-driven techniques can be carried out without gathering detailed address traces. Minimal runtime information for modeling cache performance of a selected code block includes: base virtual addresses of arrays, virtual addresses of variables, and loop bounds for that code block. The rest of the information is obtained from the source code. We show that the cache performance predictions are as reliable as those obtained through trace-driven simulations. This technique is particularly helpful to the exploration of various "what-if" scenarios regarding the cache performance impact for alternative code structures. We explain and validate this methodology using a simple matrix-matrix multiplication program. We then apply this methodology to predict and tune the cache performance of two realistic scientific applications taken from the Computational Fluid Dynamics (CFD) domain.
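The idea of predicting cache behaviour from array base addresses and loop bounds alone can be illustrated with a minimal sketch. This is not the authors' tool: the function names, the direct-mapped cache model, the 8-byte element size, and the row-major layout are all assumptions chosen for illustration. The address stream of a matrix-matrix multiplication is regenerated from the loop nest rather than recorded as a trace, and then fed to a tiny cache simulator.

```python
from itertools import product


def matmul_addresses(n, bases, order="ijk", elem=8):
    """Yield the byte addresses touched by C[i][j] += A[i][k] * B[k][j].

    Only the array base addresses, the loop bound n and the loop order are
    needed; no recorded address trace is used (hypothetical helper).
    """
    base_a, base_b, base_c = bases
    for idx in product(range(n), repeat=3):
        v = dict(zip(order, idx))          # order[0] is the outermost loop
        i, j, k = v["i"], v["j"], v["k"]
        yield base_a + (i * n + k) * elem  # A[i][k], row-major
        yield base_b + (k * n + j) * elem  # B[k][j]
        yield base_c + (i * n + j) * elem  # C[i][j]


def count_misses(addresses, cache_bytes=4096, line_bytes=64):
    """Count misses in an assumed direct-mapped cache; returns (misses, accesses)."""
    n_sets = cache_bytes // line_bytes
    tags = [None] * n_sets                 # one resident line per set
    misses = accesses = 0
    for addr in addresses:
        line = addr // line_bytes
        s = line % n_sets
        accesses += 1
        if tags[s] != line:                # cold or conflict miss
            tags[s] = line
            misses += 1
    return misses, accesses
```

Calling `count_misses(matmul_addresses(n, bases, order="ikj"))` with a different `order` string is one way to explore a "what-if" loop interchange before touching the real source code.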
Scalable hybrid computation with spikes.
Sarpeshkar, Rahul; O'Halloran, Micah
2002-09-01
We outline a hybrid analog-digital scheme for computing with three important features that enable it to scale to systems of large complexity: First, like digital computation, which uses several one-bit precise logical units to collectively compute a precise answer to a computation, the hybrid scheme uses several moderate-precision analog units to collectively compute a precise answer to a computation. Second, frequent discrete signal restoration of the analog information prevents analog noise and offset from degrading the computation. And, third, a state machine enables complex computations to be created using a sequence of elementary computations. A natural choice for implementing this hybrid scheme is one based on spikes because spike-count codes are digital, while spike-time codes are analog. We illustrate how spikes afford easy ways to implement all three components of scalable hybrid computation. First, as an important example of distributed analog computation, we show how spikes can create a distributed modular representation of an analog number by implementing digital carry interactions between spiking analog neurons. Second, we show how signal restoration may be performed by recursive spike-count quantization of spike-time codes. And, third, we use spikes from an analog dynamical system to trigger state transitions in a digital dynamical system, which reconfigures the analog dynamical system using a binary control vector; such feedback interactions between analog and digital dynamical systems create a hybrid state machine (HSM). The HSM extends and expands the concept of a digital finite-state-machine to the hybrid domain. We present experimental data from a two-neuron HSM on a chip that implements error-correcting analog-to-digital conversion with the concurrent use of spike-time and spike-count codes. We also present experimental data from silicon circuits that implement HSM-based pattern recognition using spike-time synchrony. 
We outline how HSMs may be used to perform learning, vector quantization, spike pattern recognition and generation, and how they may be reconfigured.
Convolutional code performance in planetary entry channels
NASA Technical Reports Server (NTRS)
Modestino, J. W.
1974-01-01
The planetary entry channel is modeled for communication purposes representing turbulent atmospheric scattering effects. The performance of short and long constraint length convolutional codes is investigated in conjunction with coherent BPSK modulation and Viterbi maximum likelihood decoding. Algorithms for sequential decoding are studied in terms of computation and/or storage requirements as a function of the fading channel parameters. The performance of the coded coherent BPSK system is compared with the coded incoherent MFSK system. Results indicate that: some degree of interleaving is required to combat time correlated fading of channel; only modest amounts of interleaving are required to approach performance of memoryless channel; additional propagational results are required on the phase perturbation process; and the incoherent MFSK system is superior when phase tracking errors are considered.
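As a rough illustration of convolutional coding with Viterbi maximum likelihood decoding, the sketch below uses a short constraint length code (K = 3, rate 1/2, generators 7 and 5 in octal) with hard-decision decoding. The code choice and the function names are assumptions made for this example; the fading-channel model and the BPSK/MFSK modulation studied in the report are not reproduced.

```python
K = 3                      # constraint length (assumed for this sketch)
G = (0b111, 0b101)         # generator polynomials, rate 1/2


def parity(x):
    """Return the XOR of the bits of x."""
    return bin(x).count("1") & 1


def encode(bits):
    """Convolutionally encode a list of 0/1 bits, starting from the zero state."""
    state = 0              # the K-1 most recent input bits
    out = []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out


def viterbi_decode(received, n_bits):
    """Hard-decision Viterbi decoding using Hamming distance as branch metric."""
    n_states = 1 << (K - 1)
    inf = float("inf")
    metric = [0.0] + [inf] * (n_states - 1)   # encoder starts in state 0
    paths = [[] for _ in range(n_states)]
    for t in range(n_bits):
        r = received[2 * t:2 * t + 2]
        new_metric = [inf] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == inf:
                continue                       # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = [parity(reg & g) for g in G]
                nxt = reg >> 1
                m = metric[s] + sum(e != x for e, x in zip(expected, r))
                if m < new_metric[nxt]:        # keep the survivor path
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(n_states), key=lambda s: metric[s])
    return paths[best]
```

A message ending in K-1 zero bits returns the encoder to the zero state, which is the usual way to terminate the trellis.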
Geospace simulations using modern accelerator processor technology
NASA Astrophysics Data System (ADS)
Germaschewski, K.; Raeder, J.; Larson, D. J.
2009-12-01
OpenGGCM (Open Geospace General Circulation Model) is a well-established numerical code simulating the Earth's space environment. The most computing intensive part is the MHD (magnetohydrodynamics) solver that models the plasma surrounding Earth and its interaction with Earth's magnetic field and the solar wind flowing in from the sun. Like other global magnetosphere codes, OpenGGCM's realism is currently limited by computational constraints on grid resolution. OpenGGCM has been ported to make use of the added computational power of modern accelerator based processor architectures, in particular the Cell processor. The Cell architecture is a novel inhomogeneous multicore architecture capable of achieving up to 230 GFlops on a single chip. The University of New Hampshire recently acquired a PowerXCell 8i based computing cluster, and here we will report initial performance results of OpenGGCM. Realizing the high theoretical performance of the Cell processor is a programming challenge, though. We implemented the MHD solver using a multi-level parallelization approach: On the coarsest level, the problem is distributed to processors based upon the usual domain decomposition approach. Then, on each processor, the problem is divided into 3D columns, each of which is handled by the memory limited SPEs (synergistic processing elements) slice by slice. Finally, SIMD instructions are used to fully exploit the SIMD FPUs in each SPE. Memory management needs to be handled explicitly by the code, using DMA to move data from main memory to the per-SPE local store and vice versa. We use a modern technique, automatic code generation, which shields the application programmer from having to deal with all of the implementation details just described, keeping the code much more easily maintainable. Our preliminary results indicate excellent performance, a speed-up of a factor of 30 compared to the unoptimized version.
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mendygral, P. J.; Radcliffe, N.; Kandalla, K.
2017-02-01
We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
Manual of phosphoric acid fuel cell power plant optimization model and computer program
NASA Technical Reports Server (NTRS)
Lu, C. Y.; Alkasab, K. A.
1984-01-01
An optimized cost and performance model for a phosphoric acid fuel cell power plant system was derived and developed into a modular FORTRAN computer code. Cost, energy, mass, and electrochemical analyses were combined to develop a mathematical model for optimizing the steam to methane ratio in the reformer, hydrogen utilization in the PAFC plates per stack. The nonlinear programming code, COMPUTE, was used to solve this model, in which the method of mixed penalty function combined with Hooke and Jeeves pattern search was chosen to evaluate this specific optimization problem.
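A minimal sketch of a Hooke and Jeeves pattern search combined with a penalty function follows. It is not the COMPUTE code: the report uses a mixed penalty function, whereas this example swaps in a simple exterior quadratic penalty, the pattern move is simplified to one attempt per iteration, and all names, step sizes and the toy objective are assumptions.

```python
def explore(f, x, step):
    """Exploratory move: try +/- step along each coordinate, keep improvements."""
    x = list(x)
    fx = f(x)
    for i in range(len(x)):
        for d in (step, -step):
            trial = x[:]
            trial[i] += d
            ft = f(trial)
            if ft < fx:
                x, fx = trial, ft
                break
    return x, fx


def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-6, max_iter=10000):
    """Derivative-free minimisation by exploratory and pattern moves."""
    base = list(x0)
    fbase = f(base)
    for _ in range(max_iter):
        if step < tol:
            break
        new, fnew = explore(f, base, step)
        if fnew >= fbase:
            step *= shrink                       # no progress: refine the mesh
            continue
        # Pattern move: extrapolate along the direction that just worked.
        pattern = [2 * n - b for n, b in zip(new, base)]
        base, fbase = new, fnew
        cand, fcand = explore(f, pattern, step)
        if fcand < fbase:
            base, fbase = cand, fcand
    return base, fbase


def penalized(objective, constraints, mu):
    """Exterior quadratic penalty for constraints written as g(x) <= 0."""
    def f(x):
        return objective(x) + mu * sum(max(0.0, g(x)) ** 2 for g in constraints)
    return f


# Toy problem (assumed): minimise (x-2)^2 + (y-1)^2 subject to x + y <= 2.
toy = penalized(lambda p: (p[0] - 2) ** 2 + (p[1] - 1) ** 2,
                [lambda p: p[0] + p[1] - 2],
                mu=100.0)
```

Running `hooke_jeeves(toy, [0.0, 0.0])` should land close to the constrained optimum at (1.5, 0.5), slightly on the infeasible side because an exterior penalty only enforces the constraint approximately.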
Benchmarking of Neutron Production of Heavy-Ion Transport Codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence
Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required.
Benchmarking of Heavy Ion Transport Codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Remec, Igor; Ronningen, Reginald M.; Heilbronn, Lawrence
Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required.
A parallel-vector algorithm for rapid structural analysis on high-performance computers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.
1990-01-01
A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the 'loop unrolling' technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large-scale structural analyses performed on supercomputers demonstrate the accuracy and speed of the method.
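The role of column heights in a Choleski factorization can be sketched in a few lines. The following is a serial illustration under stated assumptions, not the report's parallel-vector FORTRAN code: the matrix is kept in dense nested lists for readability (a real variable-band scheme would pack each column into a one-dimensional array), and the function names are hypothetical. The factorization computes A = U^T U and skips every entry above the first nonzero of each column.

```python
import math


def column_heights(a):
    """Row index of the first nonzero in each column of the upper triangle."""
    first = []
    for j in range(len(a)):
        i = 0
        while i < j and a[i][j] == 0.0:
            i += 1
        first.append(i)
    return first


def skyline_cholesky(a):
    """Factor symmetric positive-definite A as U^T U, working only inside the skyline."""
    n = len(a)
    first = column_heights(a)
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(first[j], j + 1):
            k0 = max(first[i], first[j])        # skip zeros outside column heights
            s = a[i][j] - sum(u[k][i] * u[k][j] for k in range(k0, i))
            if i < j:
                u[i][j] = s / u[i][i]
            else:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                u[j][j] = math.sqrt(s)
    return u, first


def skyline_solve(u, first, b):
    """Solve U^T U x = b by forward and back substitution within the skyline."""
    n = len(b)
    y = [0.0] * n
    for j in range(n):                           # forward: U^T y = b
        s = b[j] - sum(u[k][j] * y[k] for k in range(first[j], j))
        y[j] = s / u[j][j]
    x = y[:]
    for j in range(n - 1, -1, -1):               # back: U x = y, column oriented
        x[j] /= u[j][j]
        for k in range(first[j], j):
            x[k] -= u[k][j] * x[j]
    return x
```

Fill-in during the factorization stays inside the skyline, which is why the column heights computed from A can be reused for U.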
Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes
NASA Technical Reports Server (NTRS)
Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)
2000-01-01
The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds some of the commercial tools.
NASA Astrophysics Data System (ADS)
Hadade, Ioan; di Mare, Luca
2016-08-01
Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range of architectural features such as SIMD for data parallel execution or threads for core parallelism. The exploitation of multi-level parallelism is therefore crucial for achieving superior performance on current and future processors. This paper presents the performance tuning of a multiblock CFD solver on Intel SandyBridge and Haswell multicore CPUs and the Intel Xeon Phi Knights Corner coprocessor. Code optimisations have been applied on two computational kernels exhibiting different computational patterns: the update of flow variables and the evaluation of the Roe numerical fluxes. We discuss at great length the code transformations required for achieving efficient SIMD computations for both kernels across the selected devices including SIMD shuffles and transpositions for flux stencil computations and global memory transformations. Core parallelism is expressed through threading based on a number of domain decomposition techniques together with optimisations pertaining to alleviating NUMA effects found in multi-socket compute nodes. Results are correlated with the Roofline performance model in order to assert their efficiency for each distinct architecture. We report significant speedups for single thread execution across both kernels: 2-5X on the multicore CPUs and 14-23X on the Xeon Phi coprocessor. Computations at full node and chip concurrency deliver a factor of three speedup on the multicore processors and up to 24X on the Xeon Phi manycore coprocessor.
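The Roofline model mentioned above bounds attainable performance by the smaller of the peak compute rate and the product of memory bandwidth and arithmetic intensity. A minimal sketch of that bound follows; the function names and any hardware figures passed in are placeholders, not values from the paper.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Attainable GFLOP/s: min(peak, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


def roofline_efficiency(measured_gflops, peak_gflops, bandwidth_gbs,
                        intensity_flops_per_byte):
    """Fraction of the Roofline bound that a measured kernel achieves."""
    bound = roofline_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte)
    return measured_gflops / bound
```

A kernel whose intensity lies below `peak_gflops / bandwidth_gbs` is bandwidth-bound under this model, so data-layout changes are likely to matter more than extra vectorisation.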
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dr. George L Mesina
Our ultimate goal is to create and maintain RELAP5-3D as the best software tool available to analyze nuclear power plants. This begins with writing excellent code and requires thorough testing. This document covers development of RELAP5-3D software, the behavior of the RELAP5-3D program that must be maintained, and code testing. RELAP5-3D must perform in a manner consistent with previous code versions with backward compatibility for the sake of the users. Thus file operations, code termination, input and output must remain consistent in form and content while adding appropriate new files, input and output as new features are developed. As computer hardware, operating systems, and other software change, RELAP5-3D must adapt and maintain performance. The code must be thoroughly tested to ensure that it continues to perform robustly on the supported platforms. The coding must be written in a consistent manner that makes the program easy to read to reduce the time and cost of development, maintenance and error resolution. The programming guidelines presented here are intended to institutionalize a consistent way of writing FORTRAN code for the RELAP5-3D computer program that will minimize errors and rework. A common format and organization of program units creates a unifying look and feel to the code. This in turn increases readability and reduces time required for maintenance, development and debugging. It also aids new programmers in reading and understanding the program. Therefore, when undertaking development of the RELAP5-3D computer program, the programmer must write computer code that follows these guidelines. This set of programming guidelines creates a framework of good programming practices, such as initialization, structured programming, and vector-friendly coding. It sets out formatting rules for lines of code, such as indentation, capitalization, spacing, etc. It creates limits on program units, such as subprograms, functions, and modules. 
It establishes documentation guidance on internal comments. The guidelines apply to both existing and new subprograms. They are written for both FORTRAN 77 and FORTRAN 95. The guidelines are not so rigorous as to inhibit a programmer’s unique style, but do restrict the variations in acceptable coding to create sufficient commonality that new readers will find the coding in each new subroutine familiar. It is recognized that this is a “living” document and must be updated as languages, compilers, and computer hardware and software evolve.
NASA Technical Reports Server (NTRS)
1988-01-01
The charter of the Structures Division is to perform and disseminate results of research conducted in support of aerospace engine structures. These results have a wide range of applicability to practitioners of structural engineering mechanics beyond the aerospace arena. The specific purpose of the symposium was to familiarize the engineering structures community with the depth and range of research performed by the division and its academic and industrial partners. Sessions covered vibration control, fracture mechanics, ceramic component reliability, parallel computing, nondestructive evaluation, constitutive models and experimental capabilities, dynamic systems, fatigue and damage, wind turbines, hot section technology (HOST), aeroelasticity, structural mechanics codes, computational methods for dynamics, structural optimization, and applications of structural dynamics, and structural mechanics computer codes.
Analysis and Simulation of Narrowband GPS Jamming Using Digital Excision Temporal Filtering.
1994-12-01
the sequence of stored values from the P- code sampled at a 20 MHz rate. When correlated with a reference vector of the same length to simulate a GPS ...rate required for the GPS signals, (20 MHz sampling rate for the P- code signal), the personal computer (PC) used run the simulation could not perform...This subroutine is used to perform a fast FFT based 168 biased cross correlation . Written by Capt Gerry Falen, USAF, 16 AUG 94 % start of code
A TDM link with channel coding and digital voice.
NASA Technical Reports Server (NTRS)
Jones, M. W.; Tu, K.; Harton, P. L.
1972-01-01
The features of a TDM (time-division multiplexed) link model are described. A PCM telemetry sequence was coded for error correction and multiplexed with a digitized voice channel. An all-digital implementation of a variable-slope delta modulation algorithm was used to digitize the voice channel. The results of extensive testing are reported. The measured coding gain and the system performance over a Gaussian channel are compared with theoretical predictions and computer simulations. Word intelligibility scores are reported as a measure of voice channel performance.
User's manual for COAST 4: a code for costing and sizing tokamaks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sink, D. A.; Iwinski, E. M.
1979-09-01
The purpose of this report is to document the computer program COAST 4 for the user/analyst. COAST, COst And Size Tokamak reactors, provides complete and self-consistent size models for the engineering features of D-T burning tokamak reactors and associated facilities involving a continuum of performance including highly beam driven through ignited plasma devices. TNS (The Next Step) devices with no tritium breeding or electrical power production are handled as well as power producing and fissile producing fusion-fission hybrid reactors. The code has been normalized with a TFTR calculation which is consistent with cost, size, and performance data published in the conceptual design report for that device. Information on code development, computer implementation and detailed user instructions are included in the text.
High Performance Radiation Transport Simulations on TITAN
DOE Office of Scientific and Technical Information (OSTI.GOV)
Baker, Christopher G; Davidson, Gregory G; Evans, Thomas M
2012-01-01
In this paper we describe the Denovo code system. Denovo solves the six-dimensional, steady-state, linear Boltzmann transport equation, of central importance to nuclear technology applications such as reactor core analysis (neutronics), radiation shielding, nuclear forensics and radiation detection. The code features multiple spatial differencing schemes, state-of-the-art linear solvers, the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm for inverting the transport operator, a new multilevel energy decomposition method scaling to hundreds of thousands of processing cores, and a modern, novel code architecture that supports straightforward integration of new features. In this paper we discuss the performance of Denovo on the 10--20 petaflop ORNL GPU-based system, Titan. We describe algorithms and techniques used to exploit the capabilities of Titan's heterogeneous compute node architecture and the challenges of obtaining good parallel performance for this sparse hyperbolic PDE solver containing inherently sequential computations. Numerical results demonstrating Denovo performance on early Titan hardware are presented.
Multiprocessing on supercomputers for computational aerodynamics
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; Mehta, Unmeel B.
1991-01-01
Little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPS or more) to improve turnaround time in computational aerodynamics. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, such improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) is applied through multitasking via a strategy that requires relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-Fortran-Unix interface. The existing code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor.
Time-Dependent Simulation of Incompressible Flow in a Turbopump Using Overset Grid Approach
NASA Technical Reports Server (NTRS)
Kiris, Cetin; Kwak, Dochan
2001-01-01
This paper reports the progress being made towards complete unsteady turbopump simulation capability by using overset grid systems. A computational model of a turbopump impeller is used as a test case for the performance evaluation of the MPI, hybrid MPI/OpenMP, and MLP versions of the INS3D code. Relative motion of the grid system for rotor-stator interaction was obtained by employing overset grid techniques. Unsteady computations for a turbopump, which contains 114 zones with 34.3 million grid points, are performed on Origin 2000 systems at NASA Ames Research Center. The approach taken for these simulations and the performance of the parallel versions of the code are presented.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mubarak, Misbah; Ross, Robert B.
This technical report describes the experiments performed to validate the MPI performance measurements reported by the CODES dragonfly network simulation with the Theta Cray XC system at the Argonne Leadership Computing Facility (ALCF).
NASA Technical Reports Server (NTRS)
Stevens, N. J.
1979-01-01
Cases where the charged-particle environment acts on the spacecraft (e.g., spacecraft charging phenomena) and cases where a system on the spacecraft causes the interaction (e.g., high voltage space power systems) are considered. Both categories were studied in ground simulation facilities to understand the processes involved and to measure the pertinent parameters. Computer simulations are based on the NASA Charging Analyzer Program (NASCAP) code. Analytical models are developed in this code and verified against the experimental data. Extrapolations from the small test samples to space conditions are made with this code. Typical results from laboratory and computer simulations are presented for both types of interactions. Extrapolations from these simulations to performance in space environments are discussed.
User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Earth Sciences Division; Zhang, Keni; Zhang, Keni
TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4.
This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.
Salko, Robert K.; Schmidt, Rodney C.; Avramova, Maria N.
2014-11-23
This study describes major improvements to the computational infrastructure of the CTF subchannel code so that full-core, pincell-resolved (i.e., one computational subchannel per real bundle flow channel) simulations can now be performed in much shorter run-times, either in stand-alone mode or as part of coupled-code multi-physics calculations. These improvements support the goals of the Department of Energy Consortium for Advanced Simulation of Light Water Reactors (CASL) Energy Innovation Hub to develop high fidelity multi-physics simulation tools for nuclear energy design and analysis.
NASA Astrophysics Data System (ADS)
Kudryavtsev, Alexey N.; Kashkovsky, Alexander V.; Borisov, Semyon P.; Shershnev, Anton A.
2017-10-01
In the present work a computer code, RCFS, for numerical simulation of chemically reacting compressible flows on hybrid CPU/GPU supercomputers is developed. It solves the 3D unsteady Euler equations for multispecies chemically reacting flows in general curvilinear coordinates using shock-capturing TVD schemes. Time advancement is carried out using explicit Runge-Kutta TVD schemes. The program implementation uses the CUDA application programming interface to perform GPU computations. Data is distributed between GPUs via a domain decomposition technique. The developed code is verified on a number of test cases, including supersonic flow over a cylinder.
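The explicit Runge-Kutta TVD time advancement named in the abstract above can be illustrated on a much simpler problem. The following is a minimal sketch, not the RCFS implementation: it assumes the second-order Shu-Osher (SSP) Runge-Kutta scheme, a first-order upwind operator, and 1D periodic linear advection, none of which are specified in the abstract. All function names are hypothetical.

```python
def upwind_rhs(u, a, dx):
    """Spatial operator L(u) = -a * du/dx, first-order upwind (assumes a > 0).

    Periodic boundaries: u[i - 1] with i = 0 wraps to the last cell.
    """
    return [-a * (u[i] - u[i - 1]) / dx for i in range(len(u))]


def tvd_rk2_step(u, a, dx, dt):
    """One step of the two-stage TVD (SSP) Runge-Kutta scheme.

    Stage 1: u1 = u + dt * L(u)
    Stage 2: u_new = 0.5 * u + 0.5 * (u1 + dt * L(u1))
    Each stage is a forward-Euler step, so the result is a convex
    combination of TVD updates when a * dt / dx <= 1.
    """
    l0 = upwind_rhs(u, a, dx)
    u1 = [ui + dt * li for ui, li in zip(u, l0)]
    l1 = upwind_rhs(u1, a, dx)
    return [0.5 * ui + 0.5 * (u1i + dt * l1i)
            for ui, u1i, l1i in zip(u, u1, l1)]


def total_variation(u):
    """Discrete total variation on a periodic grid."""
    return sum(abs(u[i] - u[i - 1]) for i in range(len(u)))
```

Advancing a square pulse with a Courant number of 0.5 should leave the total variation no larger than its initial value, which is the property the "TVD" label refers to.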
NASA Technical Reports Server (NTRS)
White, P. R.; Little, R. R.
1985-01-01
A research effort was undertaken to develop personal computer based software for vibrational analysis. The software was developed to analytically determine the natural frequencies and mode shapes for the uncoupled lateral vibrations of the blade and counterweight assemblies used in a single bladed wind turbine. The uncoupled vibration analysis was performed in both the flapwise and chordwise directions for static rotor conditions. The effects of rotation on the uncoupled flapwise vibration of the blade and counterweight assemblies were evaluated for various rotor speeds up to 90 rpm. The theory, used in the vibration analysis codes, is based on a lumped mass formulation for the blade and counterweight assemblies. The codes are general so that other designs can be readily analyzed. The input for the codes is generally interactive to facilitate usage. The output of the codes is both tabular and graphical. Listings of the codes are provided. Predicted natural frequencies of the first several modes show reasonable agreement with experimental results. The analysis codes were originally developed on a DEC PDP 11/34 minicomputer and then downloaded and modified to run on an ITT XTRA personal computer. Studies conducted to evaluate the efficiency of running the programs on a personal computer as compared with the minicomputer indicated that, with the proper combination of hardware and software options, the efficiency of using a personal computer exceeds that of a minicomputer.
The Julia programming language: the future of scientific computing
NASA Astrophysics Data System (ADS)
Gibson, John
2017-11-01
Julia is an innovative new open-source programming language for high-level, high-performance numerical computing. Julia combines the general-purpose breadth and extensibility of Python, the ease-of-use and numeric focus of Matlab, the speed of C and Fortran, and the metaprogramming power of Lisp. Julia uses type inference and just-in-time compilation to compile high-level user code to machine code on the fly. A rich set of numeric types and extensive numerical libraries are built-in. As a result, Julia is competitive with Matlab for interactive graphical exploration and with C and Fortran for high-performance computing. This talk interactively demonstrates Julia's numerical features and benchmarks Julia against C, C++, Fortran, Matlab, and Python on a spectral time-stepping algorithm for a 1d nonlinear partial differential equation. The Julia code is nearly as compact as Matlab and nearly as fast as Fortran. This material is based upon work supported by the National Science Foundation under Grant No. 1554149.
Analysis of internal flows relative to the space shuttle main engine
NASA Technical Reports Server (NTRS)
1987-01-01
Cooperative efforts between the Lockheed-Huntsville Computational Mechanics Group and the NASA-MSFC Computational Fluid Dynamics staff have resulted in improved capabilities for numerically simulating incompressible flows generic to the Space Shuttle Main Engine (SSME). A well established and documented CFD code was obtained, modified, and applied to laminar and turbulent flows of the type occurring in the SSME Hot Gas Manifold. The INS3D code was installed on the NASA-MSFC CRAY-XMP computer system and is currently being used by NASA engineers. Studies to perform a transient analysis of the FPB were conducted. The COBRA/TRAC code is recommended for simulating the transient flow of oxygen into the LOX manifold. Property data for modifying the code to represent LOX/GOX flow was collected. The ALFA code was developed and recommended for representing the transient combustion in the preburner. These two codes will couple through the transient boundary conditions to simulate the startup and/or shutdown of the fuel preburner. A study, NAS8-37461, is currently being conducted to implement this modeling effort.
Low Density Parity Check Codes Based on Finite Geometries: A Rediscovery and More
NASA Technical Reports Server (NTRS)
Kou, Yu; Lin, Shu; Fossorier, Marc
1999-01-01
Low density parity check (LDPC) codes with iterative decoding based on belief propagation achieve astonishing error performance close to the Shannon limit. No algebraic or geometric method for constructing these codes has been reported and they are largely generated by computer search. As a result, encoding of long LDPC codes is in general very complex. This paper presents two classes of high rate LDPC codes whose constructions are based on finite Euclidean and projective geometries, respectively. These classes of codes are cyclic and have good constraint parameters and minimum distances. Cyclic structure allows the use of linear feedback shift registers for encoding. These finite geometry LDPC codes achieve very good error performance with either soft-decision iterative decoding based on belief propagation or Gallager's hard-decision bit flipping algorithm. These codes can be punctured or extended to obtain other good LDPC codes. A generalization of these codes is also presented.
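As a rough illustration of hard-decision bit flipping on a cyclic, geometry-based parity-check matrix, the sketch below uses the seven lines of the Fano plane (the smallest projective plane) as parity checks. This toy code is chosen here for brevity and is not taken from the paper. The decoder flips the bits involved in the most failed checks, which is one common variant; Gallager's original formulation uses a fixed threshold. All names are hypothetical.

```python
N = 7
# Rows of H as index lists: cyclic shifts of the line {0, 1, 3} mod 7.
# Any two positions share exactly one line, so each check is "low density".
H = [sorted(((r + 0) % N, (r + 1) % N, (r + 3) % N)) for r in range(N)]


def syndrome(checks, word):
    """Parity of each check; all zeros means `word` satisfies every check."""
    return [sum(word[j] for j in row) % 2 for row in checks]


def bit_flip_decode(checks, received, max_iters=10):
    """Hard-decision bit flipping: flip the bits in the most failed checks."""
    word = list(received)
    for _ in range(max_iters):
        s = syndrome(checks, word)
        if not any(s):
            return word  # valid codeword reached
        fails = [0] * len(word)
        for row, bit in zip(checks, s):
            if bit:
                for j in row:
                    fails[j] += 1
        worst = max(fails)
        for j, count in enumerate(fails):
            if count == worst:
                word[j] ^= 1
    return word  # may still be invalid if decoding did not converge
```

With a single bit error, the three checks through the corrupted position fail while every other position sits in only one failed check, so one iteration restores the codeword.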
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, Tianzhen; Buhl, Fred; Haves, Philip
2008-09-20
EnergyPlus is a new generation building performance simulation program offering many new modeling capabilities and more accurate performance calculations integrating building components in sub-hourly time steps. However, EnergyPlus runs much slower than the current generation simulation programs. This has become a major barrier to its widespread adoption by the industry. This paper analyzed EnergyPlus run time from comprehensive perspectives to identify key issues and challenges of speeding up EnergyPlus: studying the historical trends of EnergyPlus run time based on the advancement of computers and code improvements to EnergyPlus, comparing EnergyPlus with DOE-2 to understand and quantify the run time differences, identifying key simulation settings and model features that have significant impacts on run time, and performing code profiling to identify which EnergyPlus subroutines consume the most run time. This paper provides recommendations to improve EnergyPlus run time from the modeler's perspective and adequate computing platforms. Suggestions of software code and architecture changes to improve EnergyPlus run time based on the code profiling results are also discussed.
Parallel community climate model: Description and user's guide
DOE Office of Scientific and Technical Information (OSTI.GOV)
Drake, J.B.; Flanery, R.E.; Semeraro, B.D.
This report gives an overview of a parallel version of the NCAR Community Climate Model, CCM2, implemented for MIMD massively parallel computers using a message-passing programming paradigm. The parallel implementation was developed on an Intel iPSC/860 with 128 processors and on the Intel Delta with 512 processors, and the initial target platform for the production version of the code is the Intel Paragon with 2048 processors. Because the implementation uses standard, portable message-passing libraries, the code has been easily ported to other multiprocessors supporting a message-passing programming paradigm. The parallelization strategy used is to decompose the problem domain into geographical patches and assign each processor the computation associated with a distinct subset of the patches. With this decomposition, the physics calculations involve only grid points and data local to a processor and are performed in parallel. Using parallel algorithms developed for the semi-Lagrangian transport, the fast Fourier transform and the Legendre transform, both physics and dynamics are computed in parallel with minimal data movement and modest change to the original CCM2 source code. Sequential or parallel history tapes are written and input files (in history tape format) are read sequentially by the parallel code to promote compatibility with production use of the model on other computer systems. A validation exercise has been performed with the parallel code and is detailed along with some performance numbers on the Intel Paragon and the IBM SP2. A discussion of reproducibility of results is included. A user's guide for the PCCM2 version 2.1 on the various parallel machines completes the report. Procedures for compilation, setup and execution are given. A discussion of code internals is included for those who may wish to modify and use the program in their own research.
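The patch-based decomposition described above can be sketched in a few lines. This is only an illustration under assumptions: the grid dimensions, patch sizes, and round-robin assignment are hypothetical choices, and the report's actual patch layout and mapping are not given in the abstract.

```python
def make_patches(nlat, nlon, patch_lat, patch_lon):
    """Split an nlat x nlon grid into rectangular (lat_range, lon_range) patches."""
    patches = []
    for lat0 in range(0, nlat, patch_lat):
        for lon0 in range(0, nlon, patch_lon):
            patches.append((range(lat0, min(lat0 + patch_lat, nlat)),
                            range(lon0, min(lon0 + patch_lon, nlon))))
    return patches


def assign_patches(patches, nprocs):
    """Give each processor a distinct subset of the patches (round-robin here)."""
    owner = {p: [] for p in range(nprocs)}
    for k, patch in enumerate(patches):
        owner[k % nprocs].append(patch)
    return owner


def local_points(owner, proc):
    """Grid points whose column physics would be computed on `proc`."""
    return [(i, j) for lats, lons in owner[proc] for i in lats for j in lons]
```

Because every grid point belongs to exactly one patch and every patch to exactly one processor, per-column physics needs no communication; only the transforms and transport steps would require data exchange.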
Joint Services Electronics Program Annual Progress Report.
1985-11-01
one symbol memory) adaptive Huffman codes were performed, and the compression achieved was compared with that of Ziv-Lempel coding. As was expected...MATERIALS 8 4. Information Systems 9 4.1 REAL TIME STATISTICAL DATA PROCESSING 9 4.2 DATA COMPRESSION for COMPUTER DATA STRUCTURES 9 5. PhD...a. Real Time Statistical Data Processing (T. Kailath) b. Data Compression for Computer Data Structures (J. Gill)
Validation of computational code UST3D by the example of experimental aerodynamic data
NASA Astrophysics Data System (ADS)
Surzhikov, S. T.
2017-02-01
Numerical simulation of the aerodynamic characteristics of the hypersonic vehicles X-33 and X-34, as well as a spherically blunted cone, is performed using unstructured meshes. It is demonstrated that the numerical predictions obtained with the computational code UST3D are in acceptable agreement with the experimental data for approximate parameters of the geometry of the hypersonic vehicles and in excellent agreement with data for the blunted cone.
Anisotropic Effects on Constitutive Model Parameters of Aluminum Alloys
2012-01-01
constants are required input to computer codes (LS-DYNA, DYNA3D or SPH) to accurately simulate fragment impact on structural components made of high...different temperatures. These model constants are required input to computer codes (LS-DYNA, DYNA3D or SPH) to accurately simulate fragment impact on... Naval Surface Warfare Center, 4104 Evans Way, Suite 102, Indian Head, MD 20640
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
SCALE--a modular code system for Standardized Computer Analyses Licensing Evaluation--has been developed by Oak Ridge National Laboratory at the request of the US Nuclear Regulatory Commission. The SCALE system utilizes well-established computer codes and methods within standard analysis sequences that (1) allow an input format designed for the occasional user and/or novice, (2) automate the data processing and coupling between modules, and (3) provide accurate and reliable results. System development has been directed at problem-dependent cross-section processing and analysis of criticality safety, shielding, heat transfer, and depletion/decay problems. Since the initial release of SCALE in 1980, the code system has been heavily used for evaluation of nuclear fuel facility and package designs. This revision documents Version 4.3 of the system.
Computational electronics and electromagnetics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shang, C. C.
The Computational Electronics and Electromagnetics thrust area at Lawrence Livermore National Laboratory serves as the focal point for engineering R&D activities for developing computer-based design, analysis, and tools for theory. Key representative applications include design of particle accelerator cells and beamline components; engineering analysis and design of high-power components, photonics, and optoelectronics circuit design; EMI susceptibility analysis; and antenna synthesis. The FY-96 technology-base effort focused code development on (1) accelerator design codes; (2) 3-D massively parallel, object-oriented time-domain EM codes; (3) material models; (4) coupling and application of engineering tools for analysis and design of high-power components; (5) 3-D spectral-domain CEM tools; and (6) enhancement of laser drilling codes. Joint efforts with the Power Conversion Technologies thrust area include development of antenna systems for compact, high-performance radar, in addition to novel, compact Marx generators. 18 refs., 25 figs., 1 tab.
Reference Solutions for Benchmark Turbulent Flows in Three Dimensions
NASA Technical Reports Server (NTRS)
Diskin, Boris; Thomas, James L.; Pandya, Mohagna J.; Rumsey, Christopher L.
2016-01-01
A grid convergence study is performed to establish benchmark solutions for turbulent flows in three dimensions (3D) in support of the turbulence-model verification campaign at the Turbulence Modeling Resource (TMR) website. The three benchmark cases are subsonic flows around a 3D bump and a hemisphere-cylinder configuration and a supersonic internal flow through a square duct. Reference solutions are computed for the Reynolds-Averaged Navier-Stokes equations with the Spalart-Allmaras turbulence model using a linear eddy-viscosity model for the external flows and a nonlinear eddy-viscosity model based on a quadratic constitutive relation for the internal flow. The study involves three widely used practical computational fluid dynamics codes developed and supported at NASA Langley Research Center: FUN3D, USM3D, and CFL3D. Reference steady-state solutions computed with these three codes on families of consistently refined grids are presented. Grid-to-grid and code-to-code variations are described in detail.
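For readers unfamiliar with grid convergence studies, one generic way to summarize results on a family of consistently refined grids is to estimate the observed order of accuracy and a Richardson-extrapolated value. The sketch below shows that textbook calculation; it is not stated in the abstract that the authors used this procedure, and the function names and the constant refinement ratio are assumptions.

```python
import math


def observed_order(f_coarse, f_medium, f_fine, r):
    """Observed order of accuracy from three grids with refinement ratio r.

    Assumes the error behaves like C * h**p and the three solutions
    converge monotonically.
    """
    return math.log(abs(f_coarse - f_medium) / abs(f_medium - f_fine)) / math.log(r)


def richardson_extrapolate(f_medium, f_fine, r, p):
    """Estimate of the grid-independent value from the two finest grids."""
    return f_fine + (f_fine - f_medium) / (r ** p - 1.0)
```

For example, a quantity such as a drag coefficient computed on grids with spacing h, h/2, and h/4 can be passed in as the coarse, medium, and fine values with r = 2.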
Remote control system for high-performance computer simulation of crystal growth by the PFC method
NASA Astrophysics Data System (ADS)
Pavlyuk, Evgeny; Starodumov, Ilya; Osipov, Sergei
2017-04-01
Modeling of the crystallization process by the phase field crystal (PFC) method is one of the important directions of modern computational materials science. In this paper, the practical side of computer simulation of the crystallization process by the PFC method is investigated. To solve problems using this method, it is necessary to use high-performance computing clusters, data storage systems and other, often expensive, complex computer systems. Access to such resources is often limited, unstable and accompanied by various administrative problems. In addition, the variety of software and settings of different computing clusters sometimes does not allow researchers to use unified program code, so the program code must be adapted for each configuration of the computer complex. The practical experience of the authors has shown that the creation of a special control system for computing, with the possibility of remote use, can greatly simplify the implementation of simulations and increase the performance of scientific research. In the current paper we show the principal idea of such a system and justify its efficiency.
GPU-accelerated phase-field simulation of dendritic solidification in a binary alloy
NASA Astrophysics Data System (ADS)
Yamanaka, Akinori; Aoki, Takayuki; Ogawa, Satoi; Takaki, Tomohiro
2011-03-01
The phase-field simulation for dendritic solidification of a binary alloy has been accelerated by using a graphics processing unit (GPU). To perform the phase-field simulation of the alloy solidification on a GPU, a program code was developed with compute unified device architecture (CUDA). In this paper, the implementation technique of the phase-field model on a GPU is presented. Also, we evaluated the acceleration performance of the three-dimensional solidification simulation by using a single NVIDIA TESLA C1060 GPU and the developed program code. The results showed that the GPU calculation for 576^3 computational grids achieved the performance of 170 GFLOPS by utilizing the shared memory as a software-managed cache. Furthermore, it can be demonstrated that the computation with the GPU is 100 times faster than that with a single CPU core. From the obtained results, we confirmed the feasibility of realizing a real-time full three-dimensional phase-field simulation of microstructure evolution on a personal desktop computer.
CFD Predictions for Transonic Performance of the ERA Hybrid Wing-Body Configuration
NASA Technical Reports Server (NTRS)
Deere, Karen A.; Luckring, James M.; McMillin, S. Naomi; Flamm, Jeffrey D.; Roman, Dino
2016-01-01
A computational study was performed for a Hybrid Wing Body configuration that was focused on transonic cruise performance conditions. In the absence of experimental data, two fully independent computational fluid dynamics analyses were conducted to add confidence to the estimated transonic performance predictions. The primary analysis was performed by Boeing with the structured overset-mesh code OVERFLOW. The secondary analysis was performed by NASA Langley Research Center with the unstructured-mesh code USM3D. Both analyses were performed at full-scale flight conditions and included three configurations customary to drag buildup and interference analysis: a powered complete configuration, the configuration with the nacelle/pylon removed, and the powered nacelle in isolation. The results in this paper are focused primarily on transonic performance up to cruise and through drag rise. Comparisons between the CFD results were very good despite some minor geometric differences in the two analyses.
Computations of the Magnus effect for slender bodies in supersonic flow
NASA Technical Reports Server (NTRS)
Sturek, W. B.; Schiff, L. B.
1980-01-01
A recently reported Parabolized Navier-Stokes code has been employed to compute the supersonic flow field about spinning cone, ogive-cylinder, and boattailed bodies of revolution at moderate incidence. The computations were performed for flow conditions where extensive measurements for wall pressure, boundary layer velocity profiles and Magnus force had been obtained. Comparisons between the computational results and experiment indicate excellent agreement for angles of attack up to six degrees. The comparisons for Magnus effects show that the code accurately predicts the effects of body shape and Mach number for the selected models for Mach numbers in the range of 2-4.
Performance analysis of the word synchronization properties of the outer code in a TDRSS decoder
NASA Technical Reports Server (NTRS)
Costello, D. J., Jr.; Lin, S.
1984-01-01
A self-synchronizing coding scheme for NASA's TDRSS satellite system is a concatenation of a (2,1,7) inner convolutional code with a (255,223) Reed-Solomon outer code. Both symbol and word synchronization are achieved without requiring that any additional symbols be transmitted. An important parameter which determines the performance of the word sync procedure is the ratio of the decoding failure probability to the undetected error probability. Ideally, the former should be as small as possible compared to the latter when the error correcting capability of the code is exceeded. A computer simulation of a (255,223) Reed-Solomon code was carried out. Results for decoding failure probability and for undetected error probability are tabulated and compared.
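The code parameters quoted above imply a few numbers that are easy to check. The sketch below computes them from standard Reed-Solomon properties; it assumes the first two entries of (2,1,7) denote output and input bits per step (rate 1/2) and makes no claim about the third parameter. The function names are hypothetical.

```python
from fractions import Fraction


def rs_error_correction(n, k):
    """Symbol errors an (n, k) Reed-Solomon code can correct: floor((n - k) / 2)."""
    return (n - k) // 2


def rs_min_distance(n, k):
    """Reed-Solomon codes are maximum distance separable: d_min = n - k + 1."""
    return n - k + 1


def concatenated_rate(inner_n, inner_k, outer_n, outer_k):
    """Overall rate of a concatenated scheme: product of the two code rates."""
    return Fraction(inner_k, inner_n) * Fraction(outer_k, outer_n)


t = rs_error_correction(255, 223)             # correctable symbol errors per word
d_min = rs_min_distance(255, 223)             # minimum distance in symbols
overall = concatenated_rate(2, 1, 255, 223)   # information bits per channel bit
```

Received words with more than `t` symbol errors are the ones for which the decoder either declares a failure or silently produces a wrong codeword, which is the regime the abstract's probability ratio describes.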
Current and anticipated uses of the thermal hydraulics codes at the NRC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caruso, R.
1997-07-01
The focus of Thermal-Hydraulic computer code usage in nuclear regulatory organizations has undergone a considerable shift since the codes were originally conceived. Less work is being done in the area of "Design Basis Accidents," and much more emphasis is being placed on analysis of operational events, probabilistic risk/safety assessment, and maintenance practices. All of these areas need support from Thermal-Hydraulic computer codes to model the behavior of plant fluid systems, and they all need the ability to perform large numbers of analyses quickly. It is therefore important for the T/H codes of the future to be able to support these needs, by providing robust, easy-to-use tools that produce easy-to-understand results for a wider community of nuclear professionals. These tools need to take advantage of the great advances that have occurred recently in computer software, by providing users with graphical user interfaces for both input and output. In addition, reduced costs of computer memory and other hardware have removed the need for excessively complex data structures and numerical schemes, which make the codes more difficult and expensive to modify, maintain, and debug, and which increase problem run-times. Future versions of the T/H codes should also be structured in a modular fashion, to allow for the easy incorporation of new correlations, models, or features, and to simplify maintenance and testing. Finally, it is important that future T/H code developers work closely with the code user community, to ensure that the codes meet the needs of those users.
Analytical determination of propeller performance degradation due to ice accretion
NASA Technical Reports Server (NTRS)
Miller, T. L.
1986-01-01
A computer code has been developed which is capable of computing propeller performance for clean, glaze, or rime iced propeller configurations, thereby providing a mechanism for determining the degree of performance degradation which results from a given icing encounter. The inviscid, incompressible flow field at each specified propeller radial location is first computed using the Theodorsen transformation method of conformal mapping. A droplet trajectory computation then calculates droplet impingement points and airfoil collection efficiency for each radial location, at which point several user-selectable empirical correlations are available for determining the aerodynamic penalties which arise due to the ice accretion. Propeller performance is finally computed using strip analysis for either the clean or iced propeller. In the iced mode, the differential thrust and torque coefficient equations are modified by the drag and lift coefficient increments due to ice to obtain the appropriate iced values. Comparison with available experimental propeller icing data shows good agreement in several cases. The code's capability to properly predict iced thrust coefficient, power coefficient, and propeller efficiency is shown to be dependent on the choice of empirical correlation employed as well as proper specification of radial icing extent.
Validation and Performance Comparison of Numerical Codes for Tsunami Inundation
NASA Astrophysics Data System (ADS)
Velioglu, D.; Kian, R.; Yalciner, A. C.; Zaytsev, A.
2015-12-01
In inundation zones, tsunami motion turns from wave motion to flow of water. Modelling of this phenomenon is a complex problem since there are many parameters affecting the tsunami flow. In this respect, the performance of numerical codes that analyze tsunami inundation patterns becomes important. The computation of water surface elevation is not sufficient for proper analysis of tsunami behaviour in shallow water zones and on land and hence for the development of mitigation strategies. Velocity and velocity patterns are also crucial parameters and have to be computed at the highest accuracy. There are numerous numerical codes to be used for simulating tsunami inundation. In this study, FLOW 3D and NAMI DANCE codes are selected for validation and performance comparison. Flow 3D simulates linear and nonlinear propagating surface waves as well as long waves by solving three-dimensional Navier-Stokes (3D-NS) equations. FLOW 3D is used specificaly for flood problems. NAMI DANCE uses finite difference computational method to solve linear and nonlinear forms of shallow water equations (NSWE) in long wave problems, specifically tsunamis. In this study, these codes are validated and their performances are compared using two benchmark problems which are discussed in 2015 National Tsunami Hazard Mitigation Program (NTHMP) Annual meeting in Portland, USA. One of the problems is an experiment of a single long-period wave propagating up a piecewise linear slope and onto a small-scale model of the town of Seaside, Oregon. Other benchmark problem is an experiment of a single solitary wave propagating up a triangular shaped shelf with an island feature located at the offshore point of the shelf. The computed water surface elevation and velocity data are compared with the measured data. The comparisons showed that both codes are in fairly good agreement with each other and benchmark data. All results are presented with discussions and comparisons. 
The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement No 603839 (Project ASTARTE - Assessment, Strategy and Risk Reduction for Tsunamis in Europe)
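As a rough illustration of what "a finite difference method for the linear shallow water equations" means in the tsunami abstract above, the sketch below advances the 1D linear system (d(eta)/dt + h du/dx = 0, du/dt + g d(eta)/dx = 0) on a staggered grid. It is not NAMI DANCE's scheme: the forward-backward time stepping, the closed-wall boundaries, the constant depth, and every name (`step_linear_swe`, `run_demo`) are assumptions made for this example, and nonlinear terms, friction and wetting/drying are left out.

```python
import math

def step_linear_swe(eta, u, depth, dx, dt, g=9.81):
    """Advance the 1D linear shallow water equations by one time step.

    eta holds surface elevation at n cell centers; u holds velocity at the
    n + 1 cell faces. u[0] and u[n] are never updated, so they act as walls.
    """
    n = len(eta)
    # Momentum equation: accelerate interior faces by the surface slope.
    for i in range(1, n):
        u[i] -= g * dt / dx * (eta[i] - eta[i - 1])
    # Continuity equation: update elevation from the divergence of the
    # freshly updated velocities (forward-backward stepping).
    for i in range(n):
        eta[i] -= depth * dt / dx * (u[i + 1] - u[i])
    return eta, u

def run_demo(n=200, dx=100.0, depth=10.0, steps=100):
    """Release a Gaussian hump at rest; return initial mass, final mass, final peak."""
    c = math.sqrt(9.81 * depth)      # long-wave speed sqrt(g * h)
    dt = 0.5 * dx / c                # Courant number 0.5 keeps the scheme stable
    eta = [math.exp(-((i - n / 2) / 10.0) ** 2) for i in range(n)]
    u = [0.0] * (n + 1)
    mass_before = sum(eta)
    for _ in range(steps):
        step_linear_swe(eta, u, depth, dx, dt)
    return mass_before, sum(eta), max(eta)

if __name__ == "__main__":
    print(run_demo())
```

The hump splits into two pulses travelling in opposite directions at roughly sqrt(g h), and the summed elevation stays constant because the wall faces carry no flux. Comparing velocity as well as elevation against measurements, as the study does, would require sampling `u` at gauge locations too.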
Off-Design Performance of Radial-Inflow Turbines
NASA Technical Reports Server (NTRS)
Meitner, P. L.; Glassman, A. J.
1986-01-01
Computer code determines rotor exit flow from hub to tip. RTOD (Radial Turbine Off-Design) computes off-design performance of a radial turbine by modeling flow with stator viscous and trailing-edge losses, with vaneless space loss between stator and rotor, and with rotor incidence, viscous, clearance, trailing-edge, and disk friction losses.
Error suppression via complementary gauge choices in Reed-Muller codes
NASA Astrophysics Data System (ADS)
Chamberland, Christopher; Jochym-O'Connor, Tomas
2017-09-01
Concatenation of two quantum error-correcting codes with complementary sets of transversal gates can provide a means toward universal fault-tolerant quantum computation. We first show that it is generally preferable to choose the inner code with the higher pseudo-threshold to achieve lower logical failure rates. We then explore the threshold properties of a wide range of concatenation schemes. Notably, we demonstrate that the concatenation of complementary sets of Reed-Muller codes can increase the code capacity threshold under depolarizing noise when compared to extensions of previously proposed concatenation models. We also analyze the properties of logical errors under circuit-level noise, showing that smaller codes perform better for all sampled physical error rates. Our work provides new insights into the performance of universal concatenated quantum codes for both code capacity and circuit-level noise.
Evaluation of three coding schemes designed for improved data communication
NASA Technical Reports Server (NTRS)
Snelsire, R. W.
1974-01-01
Three coding schemes designed for improved data communication are evaluated. Four block codes are evaluated relative to a quality function, which is a function of both the amount of data rejected and the error rate. The Viterbi maximum likelihood decoding algorithm as a decoding procedure is reviewed. This evaluation is obtained by simulating the system on a digital computer. Short constraint length rate 1/2 quick-look codes are studied, and their performance is compared to general nonsystematic codes.
Development and application of computational aerothermodynamics flowfield computer codes
NASA Technical Reports Server (NTRS)
Venkatapathy, Ethiraj
1994-01-01
Research was performed in the area of computational modeling and application of hypersonic, high-enthalpy, thermo-chemical nonequilibrium flow (Aerothermodynamics) problems. A number of computational fluid dynamic (CFD) codes were developed and applied to simulate high altitude rocket-plume, the Aeroassist Flight Experiment (AFE), hypersonic base flow for planetary probes, the single expansion ramp model (SERN) connected with the National Aerospace Plane, hypersonic drag devices, hypersonic ramp flows, ballistic range models, shock tunnel facility nozzles, transient and steady flows in the shock tunnel facility, arc-jet flows, thermochemical nonequilibrium flows around simple and complex bodies, axisymmetric ionized flows of interest to re-entry, unsteady shock induced combustion phenomena, high enthalpy pulsed facility simulations, and unsteady shock boundary layer interactions in shock tunnels. Computational modeling involved developing appropriate numerical schemes for the flows of interest and developing, applying, and validating appropriate thermochemical processes. As part of improving the accuracy of the numerical predictions, adaptive grid algorithms were explored, and a user-friendly, self-adaptive code (SAGE) was developed. Aerothermodynamic flows of interest included energy transfer due to strong radiation, and a significant level of effort was spent in developing computational codes for calculating radiation and radiation modeling. In addition, computational tools were developed and applied to predict the radiative heat flux and spectra that reach the model surface.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hornung, Richard D.; Hones, Holger E.
The RAJA Performance Suite is designed to evaluate performance of the RAJA performance portability library on a wide variety of important high performance computing (HPC) algorithmic kernels. These kernels assess compiler optimizations and various parallel programming model backends accessible through RAJA, such as OpenMP, CUDA, etc. The initial version of the suite contains 25 computational kernels, each of which appears in 6 variants: Baseline Sequential, RAJA Sequential, Baseline OpenMP, RAJA OpenMP, Baseline CUDA, RAJA CUDA. All variants of each kernel perform essentially the same mathematical operations and the loop body code for each kernel is identical across all variants. There are a few kernels, such as those that contain reduction operations, that require CUDA-specific coding for their CUDA variants. Actual computer instructions executed and how they run in parallel differ depending on the parallel programming model backend used and which optimizations are performed by the compiler used to build the Performance Suite executable. The Suite will be used primarily by RAJA developers to perform regular assessments of RAJA performance across a range of hardware platforms and compilers as RAJA features are being developed. It will also be used by LLNL hardware and software vendor partners for defining requirements for future computing platform procurements and acceptance testing. In particular, the RAJA Performance Suite will be used for compiler acceptance testing of the upcoming CORAL/Sierra machine (initial LLNL delivery expected in late-2017/early 2018) and the CORAL-2 procurement. The Suite will also be used to generate concise source code reproducers of compiler and runtime issues we uncover so that we may provide them to relevant vendors to be fixed.
Trapani, Stefano; Navaza, Jorge
2006-07-01
The FFT calculation of spherical harmonics, Wigner D matrices and rotation function has been extended to all angular variables in the AMoRe molecular replacement software. The resulting code avoids singularity issues arising from recursive formulas, performs faster and produces results with at least the same accuracy as the original code. The new code aims at permitting accurate and more rapid computations at high angular resolution of the rotation function of large particles. Test calculations on the icosahedral IBDV VP2 subviral particle showed that the new code performs on the average 1.5 times faster than the original code.
NASA Astrophysics Data System (ADS)
Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.
2014-10-01
The Purdue-Lin scheme is a relatively sophisticated microphysics scheme in the Weather Research and Forecasting (WRF) model. The scheme includes six classes of hydrometeors: water vapor, cloud water, rain, cloud ice, snow and graupel. The scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. In this paper, we accelerate the Purdue-Lin scheme using Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi is a high performance coprocessor consisting of up to 61 cores. The Xeon Phi is connected to a CPU via the PCI Express (PCIe) bus. In this paper, we will discuss in detail the code optimization issues encountered while tuning the Purdue-Lin microphysics Fortran code for Xeon Phi. In particular, getting good performance required utilizing multiple cores and the wide vector operations, and making efficient use of memory. The results show that the optimizations improved performance of the original code on Xeon Phi 5110P by a factor of 4.2x. Furthermore, the same optimizations improved performance on Intel Xeon E5-2603 CPU by a factor of 1.2x compared to the original code.
Classification of breast tissue in mammograms using efficient coding.
Costa, Daniel D; Campos, Lúcio F; Barros, Allan K
2011-06-24
Female breast cancer is the major cause of death by cancer in western countries. Efforts in Computer Vision have been made in order to improve the diagnostic accuracy by radiologists. Some methods of lesion diagnosis in mammogram images were developed based on the technique of principal component analysis, which has been used in efficient coding of signals, and on 2D Gabor wavelets, used for computer vision applications and modeling biological vision. In this work, we present a methodology that uses efficient coding along with linear discriminant analysis to distinguish between mass and non-mass from 5090 regions of interest from mammograms. The results show that the best rates of success reached with Gabor wavelets and principal component analysis were 85.28% and 87.28%, respectively. In comparison, the model of efficient coding presented here reached up to 90.07%. Altogether, the results presented demonstrate that independent component analysis successfully performed the efficient coding in order to discriminate mass from non-mass tissues. In addition, we have observed that LDA with ICA bases showed high predictive performance for some datasets and thus provides significant support for a more detailed clinical investigation.
Comparison Between Simulated and Experimentally Measured Performance of a Four Port Wave Rotor
NASA Technical Reports Server (NTRS)
Paxson, Daniel E.; Wilson, Jack; Welch, Gerard E.
2007-01-01
Performance and operability testing has been completed on a laboratory-scale, four-port wave rotor, of the type suitable for use as a topping cycle on a gas turbine engine. Many design aspects and performance estimates for the wave rotor were determined using a time-accurate, one-dimensional, computational fluid dynamics-based simulation code developed specifically for wave rotors. The code follows a single rotor passage as it moves past the various ports, which in this reference frame become boundary conditions. This paper compares wave rotor performance predicted with the code to that measured during laboratory testing. Both on- and off-design operating conditions were examined. Overall, the match between code and rig was found to be quite good. At operating points where there were disparities, the assumption of larger than expected internal leakage rates successfully realigned code predictions and laboratory measurements. Possible mechanisms for such leakage rates are discussed.
Testing of Error-Correcting Sparse Permutation Channel Codes
NASA Technical Reports Server (NTRS)
Shcheglov, Kirill V.; Orlov, Sergei S.
2008-01-01
A computer program performs Monte Carlo direct numerical simulations for testing sparse permutation channel codes, which offer strong error-correction capabilities at high code rates and are considered especially suitable for storage of digital data in holographic and volume memories. A word in a code of this type is characterized by, among other things, a sparseness parameter (M) and a fixed number (K) of 1 or "on" bits in a channel block length of N.
electromagnetics, eddy current, computer codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gartling, David
TORO Version 4 is designed for finite element analysis of steady, transient and time-harmonic, multi-dimensional, quasi-static problems in electromagnetics. The code allows simulation of electrostatic fields, steady current flows, magnetostatics and eddy current problems in plane or axisymmetric, two-dimensional geometries. TORO is easily coupled to heat conduction and solid mechanics codes to allow multi-physics simulations to be performed.
Analysis of PANDA Passive Containment Cooling Steady-State Tests with the Spectra Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stempniewicz, Marek M
2000-07-15
Results of post-test simulation of the PANDA passive containment cooling (PCC) steady-state tests (S-series tests), performed at the PANDA facility at the Paul Scherrer Institute, Switzerland, are presented. The simulation has been performed using the computer code SPECTRA, a thermal-hydraulic code, designed specifically for analyzing containment behavior of nuclear power plants. Results of the present calculations are compared to the measurement data as well as the results obtained earlier with the codes MELCOR, TRAC-BF1, and TRACG. The calculated PCC efficiencies are somewhat lower than the measured values. Similar underestimation of PCC efficiencies had been obtained in the past, with the other computer codes. To explain this difference, it is postulated that condensate coming into the tubes forms a stream of liquid in one or two tubes, leaving most of the tubes unaffected. The condensate entering the water box is assumed to fall down in the form of droplets. With these assumptions, the results calculated with SPECTRA are close to the experimental data. It is concluded that the SPECTRA code is a suitable tool for analyzing containments of advanced reactors, equipped with passive containment cooling systems.
NASA Technical Reports Server (NTRS)
McGuire, Tim
1998-01-01
In this paper, we report the results of our recent research on the application of a multiprocessor Cray T916 supercomputer in modeling super-thermal electron transport in the earth's magnetic field. In general, this mathematical model requires numerical solution of a system of partial differential equations. The code we use for this model is moderately vectorized. By using Amdahl's Law for vector processors, it can be verified that the code is about 60% vectorized on a Cray computer. Speedup factors on the order of 2.5 were obtained compared to the unvectorized code. In the following sections, we discuss the methodology of improving the code. In addition to our goal of optimizing the code for solution on the Cray computer, we had the goal of scalability in mind. Scalability combines the concepts of portability with near-linear speedup. Specifically, a scalable program is one whose performance is portable across many different architectures with differing numbers of processors for many different problem sizes. Though we have access to a Cray at this time, the goal was to also have code which would run well on a variety of architectures.
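The link between the "about 60% vectorized" estimate and the speedup of roughly 2.5 in the abstract above can be checked with Amdahl's Law, S = 1 / ((1 - f) + f / g), where f is the vectorized fraction of the run time and g is the gain of the vector unit over scalar execution. The sketch below is illustrative only: the abstract does not state g, so treating it as very large (the limiting case) is an assumption, and the function names are made up for this example.

```python
def amdahl_speedup(vector_fraction, vector_gain):
    """Overall speedup when vector_fraction of the run time runs vector_gain times faster."""
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / vector_gain)

def vector_fraction_from_speedup(speedup, vector_gain=float("inf")):
    """Invert Amdahl's Law: the vectorized fraction implied by a measured speedup.

    The default vector_gain of infinity is the limiting case and is an
    assumption; a finite gain implies a larger vectorized fraction.
    """
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / vector_gain)

if __name__ == "__main__":
    print(vector_fraction_from_speedup(2.5))         # limiting case, about 0.6
    print(vector_fraction_from_speedup(2.5, 10.0))   # finite gain, about 0.67
    print(amdahl_speedup(0.6, float("inf")))         # best case for f = 0.6, about 2.5
```

In the limiting case a speedup of 2.5 corresponds to f = 0.6, which matches the figures quoted in the abstract; with a finite vector gain the same speedup would imply a somewhat higher vectorized fraction.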
Applications of CFD and visualization techniques
NASA Technical Reports Server (NTRS)
Saunders, James H.; Brown, Susan T.; Crisafulli, Jeffrey J.; Southern, Leslie A.
1992-01-01
In this paper, three applications are presented to illustrate current techniques for flow calculation and visualization. The first two applications use a commercial computational fluid dynamics (CFD) code, FLUENT, performed on a Cray Y-MP. The results are animated with the aid of data visualization software, apE. The third application simulates a particulate deposition pattern using techniques inspired by developments in nonlinear dynamical systems. These computations were performed on personal computers.
NASA Technical Reports Server (NTRS)
Wong, K. W.
1974-01-01
Program THREED was developed for the purpose of a research study on the treatment of control data in lunar phototriangulation. THREED is the code name of a computer program for performing absolute orientation by the method of three-dimensional projective transformation. It has the capability of performing complete error analysis on the computed transformation parameters as well as the transformed coordinates.
Observations Regarding Use of Advanced CFD Analysis, Sensitivity Analysis, and Design Codes in MDO
NASA Technical Reports Server (NTRS)
Newman, Perry A.; Hou, Gene J. W.; Taylor, Arthur C., III
1996-01-01
Observations regarding the use of advanced computational fluid dynamics (CFD) analysis, sensitivity analysis (SA), and design codes in gradient-based multidisciplinary design optimization (MDO) reflect our perception of the interactions required of CFD and our experience in recent aerodynamic design optimization studies using CFD. Sample results from these latter studies are summarized for conventional optimization (analysis - SA codes) and simultaneous analysis and design optimization (design code) using both Euler and Navier-Stokes flow approximations. The amount of computational resources required for aerodynamic design using CFD via analysis - SA codes is greater than that required for design codes. Thus, an MDO formulation that utilizes the more efficient design codes where possible is desired. However, in the aerovehicle MDO problem, the various disciplines that are involved have different design points in the flight envelope; therefore, CFD analysis - SA codes are required at the aerodynamic 'off design' points. The suggested MDO formulation is a hybrid multilevel optimization procedure that consists of both multipoint CFD analysis - SA codes and multipoint CFD design codes that perform suboptimizations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wemhoff, A P; Burnham, A K
2006-04-05
Cross-comparison of the results of two computer codes for the same problem provides a mutual validation of their computational methods. This cross-validation exercise was performed for LLNL's ALE3D code and AKTS's Thermal Safety code, using the thermal ignition of HMX in two standard LLNL cookoff experiments: the One-Dimensional Time to Explosion (ODTX) test and the Scaled Thermal Explosion (STEX) test. The chemical kinetics model used in both codes was the extended Prout-Tompkins model, a relatively new addition to ALE3D. This model was applied using ALE3D's new pseudospecies feature. In addition, an advanced isoconversional kinetic approach was used in the AKTS code. The mathematical constants in the Prout-Tompkins code were calibrated using DSC data from hermetically sealed vessels and the LLNL optimization code Kinetics05. The isoconversional kinetic parameters were optimized using the AKTS Thermokinetics code. We found that the Prout-Tompkins model calculations agree fairly well between the two codes, and the isoconversional kinetic model gives results very similar to those of the Prout-Tompkins model. We also found that an autocatalytic approach in the beta-delta phase transition model does affect the times to explosion for some conditions, especially STEX-like simulations at ramp rates above 100 C/hr, and further exploration of that effect is warranted.
Global MHD simulation of magnetosphere using HPF
NASA Astrophysics Data System (ADS)
Ogino, T.
We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code was fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code could be shown to be almost comparable to that of VPP Fortran. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops with an efficiency of 76.5% using 56 PEs of Fujitsu VPP5000/56 in vector and parallel computation that permitted comparison with catalog values. We have concluded that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chien, T.H.; Domanus, H.M.; Sha, W.T.
1993-02-01
The COMMIX-PPC computer program is an extended and improved version of earlier COMMIX codes and is specifically designed for evaluating the thermal performance of power plant condensers. The COMMIX codes are general-purpose computer programs for the analysis of fluid flow and heat transfer in complex industrial systems. In COMMIX-PPC, two major features have been added to previously published COMMIX codes. One feature is the incorporation of one-dimensional equations of conservation of mass, momentum, and energy on the tube side and the proper accounting for the thermal interaction between shell and tube side through the porous-medium approach. The other added feature is the extension of the three-dimensional conservation equations for shell-side flow to treat the flow of a multicomponent medium. COMMIX-PPC is designed to perform steady-state and transient, three-dimensional analysis of fluid flow with heat transfer in a power plant condenser. However, the code is designed in a generalized fashion so that, with some modification, it can be used to analyze processes in any heat exchanger or other single-phase engineering applications. Volume I (Equations and Numerics) of this report describes in detail the basic equations, formulation, solution procedures, and models for phenomena. Volume II (User's Guide and Manual) contains the input instruction, flow charts, sample problems, and descriptions of available options and boundary conditions.
NASA Technical Reports Server (NTRS)
Gliebe, P.; Mani, R.; Shin, H.; Mitchell, B.; Ashford, G.; Salamah, S.; Connell, S.; Huff, Dennis (Technical Monitor)
2000-01-01
This report describes work performed on Contract NAS3-27720 AoI 13 as part of the NASA Advanced Subsonic Transport (AST) Noise Reduction Technology effort. Computer codes were developed to provide quantitative prediction, design, and analysis capability for several aircraft engine noise sources. The objective was to provide improved, physics-based tools for exploration of noise-reduction concepts and understanding of experimental results. Methods and codes focused on fan broadband and 'buzz saw' noise and on low-emissions combustor noise and complement work done by other contractors under the NASA AST program to develop methods and codes for fan harmonic tone noise and jet noise. The methods and codes developed and reported herein employ a wide range of approaches, from the strictly empirical to the completely computational, with some being semiempirical, analytical, and/or analytical/computational. Emphasis was on capturing the essential physics while still considering method or code utility as a practical design and analysis tool for everyday engineering use. Codes and prediction models were developed for: (1) an improved empirical correlation model for fan rotor exit flow mean and turbulence properties, for use in predicting broadband noise generated by rotor exit flow turbulence interaction with downstream stator vanes; (2) fan broadband noise models for rotor and stator/turbulence interaction sources including 3D effects, noncompact-source effects, directivity modeling, and extensions to the rotor supersonic tip-speed regime; (3) fan multiple-pure-tone in-duct sound pressure prediction methodology based on computational fluid dynamics (CFD) analysis; and (4) low-emissions combustor prediction methodology and computer code based on CFD and actuator disk theory. In addition, the relative importance of dipole and quadrupole source mechanisms was studied using direct CFD source computation for a simple cascade/gust interaction problem, and an empirical combustor-noise correlation model was developed from engine acoustic test results. This work provided several insights on potential approaches to reducing aircraft engine noise. Code development is described in this report, and those insights are discussed.
Benchmarking of neutron production of heavy-ion transport codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Remec, I.; Ronningen, R. M.; Heilbronn, L.
Accurate prediction of radiation fields generated by heavy ion interactions is important in medical applications, space missions, and in design and operation of rare isotope research facilities. In recent years, several well-established computer codes in widespread use for particle and radiation transport calculations have been equipped with the capability to simulate heavy ion transport and interactions. To assess and validate these capabilities, we performed simulations of a series of benchmark-quality heavy ion experiments with the computer codes FLUKA, MARS15, MCNPX, and PHITS. We focus on the comparisons of secondary neutron production. Results are encouraging; however, further improvements in models and codes and additional benchmarking are required. (authors)
Computational Modeling and Validation for Hypersonic Inlets
NASA Technical Reports Server (NTRS)
Povinelli, Louis A.
1996-01-01
Hypersonic inlet research activity at NASA is reviewed. The basis for the paper is the experimental tests performed with three inlets: the NASA Lewis Research Center Mach 5, the McDonnell Douglas Mach 12, and the NASA Langley Mach 18. Both three-dimensional PNS and NS codes have been used to compute the flow within the three inlets. Modeling assumptions in the codes involve the turbulence model, the nature of the boundary layer, shock wave-boundary layer interaction, and the flow spilled to the outside of the inlet. Use of the codes and the experimental data are helping to develop a clearer understanding of the inlet flow physics and to focus on the modeling improvements required in order to arrive at validated codes.
PREMOR: a point reactor exposure model computer code for survey analysis of power plant performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vondy, D.R.
1979-10-01
The PREMOR computer code was written to exploit a simple, two-group point nuclear reactor power plant model for survey analysis. Up to thirteen actinides, fourteen fission products, and one lumped absorber nuclide density are followed over a reactor history. Successive feed batches are accounted for with provision for from one to twenty batches resident. The effect of exposure of each of the batches to the same neutron flux is determined.
Davis, J P; Akella, S; Waddell, P H
2004-01-01
Having greater computational power on the desktop for processing taxa data sets has been a dream of biologists/statisticians involved in phylogenetics data analysis. Many existing algorithms have been highly optimized; one example is Felsenstein's PHYLIP code, written in C, for UPGMA and neighbor joining algorithms. However, the ability to process more than a few tens of taxa in a reasonable amount of time using conventional computers has not yielded a satisfactory speedup in data processing, making it difficult for phylogenetics practitioners to quickly explore data sets, such as might be done from a laptop computer. We discuss the application of custom computing techniques to phylogenetics. In particular, we apply this technology to speed up UPGMA algorithm execution by a factor of a hundred, against that of PHYLIP code running on the same PC. We report on these experiments and discuss how custom computing techniques can be used to not only accelerate phylogenetics algorithm performance on the desktop, but also on larger, high-performance computing engines, thus enabling the high-speed processing of data sets involving thousands of taxa.
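For readers unfamiliar with the algorithm being accelerated in the abstract above, UPGMA repeatedly merges the two closest clusters and replaces their distances to every other cluster with a size-weighted average. The sketch below is a plain, unoptimized software version written for illustration; it is neither PHYLIP's implementation nor the custom computing design the authors describe, and the function name, data layout, and toy distance matrix are assumptions.

```python
def upgma(names, dist):
    """Build a UPGMA tree from a symmetric distance matrix (list of lists).

    Returns (tree, merges): tree is a nested tuple of taxon names, and merges
    lists (left_subtree, right_subtree, height) in the order clusters joined.
    """
    n = len(names)
    clusters = {i: (names[i], 1) for i in range(n)}   # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        (tree_a, size_a) = clusters.pop(a)
        (tree_b, size_b) = clusters.pop(b)
        # Size-weighted average distance from the merged cluster to the rest.
        averaged = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            averaged[c] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involved a or b, then add the new ones.
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        for c, v in averaged.items():
            d[(c, next_id)] = v                        # c is always < next_id
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, d_ab / 2.0))    # ultrametric node height
        next_id += 1
    (tree, _size), = clusters.values()
    return tree, merges

if __name__ == "__main__":
    toy = [[0, 2, 6, 10],
           [2, 0, 6, 10],
           [6, 6, 0, 10],
           [10, 10, 10, 0]]
    print(upgma(["A", "B", "C", "D"], toy))
```

Each pass scans all remaining pairs, so this version does on the order of n^3 work for n taxa. That repeated search-and-average step is the kind of regular, data-parallel kernel that hardware acceleration can target, although the abstract does not say which part the authors accelerated.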
NASA Astrophysics Data System (ADS)
Smith, J. A.; Peter, D. B.; Tromp, J.; Komatitsch, D.; Lefebvre, M. P.
2015-12-01
We present both SPECFEM3D_Cartesian and SPECFEM3D_GLOBE open-source codes, representing high-performance numerical wave solvers simulating seismic wave propagation for local-, regional-, and global-scale application. These codes are suitable for both forward propagation in complex media and tomographic imaging. Both solvers compute highly accurate seismic wave fields using the continuous Galerkin spectral-element method on unstructured meshes. Lateral variations in compressional- and shear-wave speeds, density, as well as 3D attenuation Q models, topography and fluid-solid coupling are all readily included in both codes. For global simulations, effects due to rotation, ellipticity, the oceans, 3D crustal models, and self-gravitation are additionally included. Both packages provide forward and adjoint functionality suitable for adjoint tomography on high-performance computing architectures. We highlight the most recent release of the global version which includes improved performance, simultaneous MPI runs, OpenCL and CUDA support via an automatic source-to-source transformation library (BOAST), parallel I/O readers and writers for databases using ADIOS and seismograms using the recently developed Adaptable Seismic Data Format (ASDF) with built-in provenance. This makes our spectral-element solvers current state-of-the-art, open-source community codes for high-performance seismic wave propagation on arbitrarily complex 3D models. Together with these solvers, we provide full-waveform inversion tools to image the Earth's interior at unprecedented resolution.
A Fast MHD Code for Gravitationally Stratified Media using Graphical Processing Units: SMAUG
NASA Astrophysics Data System (ADS)
Griffiths, M. K.; Fedun, V.; Erdélyi, R.
2015-03-01
Parallelization techniques have been exploited most successfully by the gaming/graphics industry with the adoption of graphical processing units (GPUs), possessing hundreds of processor cores. The opportunity has been recognized by the computational sciences and engineering communities, who have recently harnessed successfully the numerical performance of GPUs. For example, parallel magnetohydrodynamic (MHD) algorithms are important for numerical modelling of highly inhomogeneous solar, astrophysical and geophysical plasmas. Here, we describe the implementation of SMAUG, the Sheffield Magnetohydrodynamics Algorithm Using GPUs. SMAUG is a 1-3D MHD code capable of modelling magnetized and gravitationally stratified plasma. The objective of this paper is to present the numerical methods and techniques used for porting the code to this novel and highly parallel compute architecture. The methods employed are justified by the performance benchmarks and validation results demonstrating that the code successfully simulates the physics for a range of test scenarios including a full 3D realistic model of wave propagation in the solar atmosphere.
HEC Applications on Columbia Project
NASA Technical Reports Server (NTRS)
Taft, Jim
2004-01-01
NASA's Columbia system consists of a cluster of twenty 512-processor SGI Altix systems. Each of these systems is 3 TFLOP/s in peak performance - approximately the same as the entire compute capability at NAS just one year ago. Each 512p system is a single system image machine with one Linux OS, one high performance file system, and one globally shared memory. The NAS Terascale Applications Group (TAG) is chartered to assist in scaling NASA's mission critical codes to at least 512p in order to significantly improve emergency response during flight operations, as well as provide significant improvements in the codes and rate of scientific discovery across the scientific disciplines within NASA's Missions. Recent accomplishments are 4x improvements to codes in the ocean modeling community, 10x performance improvements in a number of computational fluid dynamics codes used in aero-vehicle design, and 5x improvements in a number of space science codes dealing in extreme physics. The TAG group will continue its scaling work to 2048p and beyond (10240 cpus) as the Columbia system becomes fully operational and the upgrades to the SGI NUMAlink memory fabric are in place. The NUMAlink upgrades dramatically improve system scalability for a single application. These upgrades will allow a number of codes to execute faster at higher fidelity than ever before on any other system, thus increasing the rate of scientific discovery even further.
Test results of a 40-kW Stirling engine and comparison with the NASA Lewis computer code predictions
NASA Technical Reports Server (NTRS)
Allen, David J.; Cairelli, James E.
1988-01-01
A Stirling engine was tested without auxiliaries at NASA Lewis. Three different regenerator configurations were tested with hydrogen. The test objectives were: (1) to obtain steady-state and dynamic engine data, including indicated power, for validation of an existing computer model for this engine; and (2) to evaluate structurally the use of silicon carbide regenerators. This paper presents comparisons of the measured brake performance, indicated mean effective pressure, and cyclic pressure variations with those predicted by the code. The silicon carbide foam regenerators appear to be structurally suitable, but the foam matrix showed severely reduced performance.
Visual analysis of inter-process communication for large-scale parallel computing.
Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu
2009-01-01
In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also communication between the different processes which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt chart with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.
NASA Astrophysics Data System (ADS)
Mills, R. T.
2014-12-01
As the high performance computing (HPC) community pushes towards the exascale horizon, the importance and prevalence of fine-grained parallelism in new computer architectures is increasing. This is perhaps most apparent in the proliferation of so-called "accelerators" such as the Intel Xeon Phi or NVIDIA GPGPUs, but the trend also holds for CPUs, where serial performance has grown slowly and effective use of hardware threads and vector units is becoming increasingly important to realizing high performance. This has significant implications for weather, climate, and Earth system modeling codes, many of which display impressive scalability across MPI ranks but take relatively little advantage of threading and vector processing. In addition to increasing parallelism, next generation codes will also need to address increasingly deep hierarchies for data movement: NUMA/cache levels, on node vs. off node, local vs. wide neighborhoods on the interconnect, and even in the I/O system. We will discuss some approaches (grounded in experiences with the Intel Xeon Phi architecture) for restructuring Earth science codes to maximize concurrency across multiple levels (vectors, threads, MPI ranks), and also discuss some novel approaches for minimizing expensive data movement/communication.
Parallel design of JPEG-LS encoder on graphics processing units
NASA Astrophysics Data System (ADS)
Duan, Hao; Fang, Yong; Huang, Bormin
2012-01-01
With recent technical advances in graphic processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on an NVIDIA GPU using the compute unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
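The parallel prefix sum mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' CUDA implementation: it is a plain-Python simulation of the Hillis-Steele scan, and the use case shown (turning per-block compressed sizes into output offsets) is an assumption about where such a scan would be useful in a block-parallel encoder. The function names `inclusive_scan` and `block_offsets` are hypothetical.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum.

    Each pass of the while loop corresponds to one parallel step; on a GPU
    every index i would be handled by its own thread.
    """
    data = list(values)
    offset = 1
    while offset < len(data):
        # Read from a snapshot of the previous step so that updates within
        # one step do not see each other (mimics double buffering).
        prev = data[:]
        for i in range(offset, len(data)):
            data[i] = prev[i] + prev[i - offset]
        offset *= 2
    return data


def block_offsets(block_sizes):
    """Exclusive scan: where each block's output would start (assumed use case)."""
    scanned = inclusive_scan(block_sizes)
    return [0] + scanned[:-1] if scanned else []


if __name__ == "__main__":
    sizes = [3, 1, 4, 1, 5]          # hypothetical compressed sizes per block
    print(inclusive_scan(sizes))
    print(block_offsets(sizes))
```

The scan needs about log2(n) steps instead of n, which is why it suits hardware with many threads, at the cost of more total additions than a serial loop.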
Numerical predictions of EML (electromagnetic launcher) system performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schnurr, N.M.; Kerrisk, J.F.; Davidson, R.F.
1987-01-01
The performance of an electromagnetic launcher (EML) depends on a large number of parameters, including the characteristics of the power supply, rail geometry, rail and insulator material properties, injection velocity, and projectile mass. EML system performance is frequently limited by structural or thermal effects in the launcher (railgun). A series of computer codes has been developed at the Los Alamos National Laboratory to predict EML system performance and to determine the structural and thermal constraints on barrel design. These codes include FLD, a two-dimensional electrostatic code used to calculate the high-frequency inductance gradient and surface current density distribution for the rails; TOPAZRG, a two-dimensional finite-element code that simultaneously analyzes thermal and electromagnetic diffusion in the rails; and LARGE, a code that predicts the performance of the entire EML system. The NIKE2D code, developed at the Lawrence Livermore National Laboratory, is used to perform structural analyses of the rails. These codes have been instrumental in the design of the Lethality Test System (LTS) at Los Alamos, which has an ultimate goal of accelerating a 30-g projectile to a velocity of 15 km/s. The capabilities of the individual codes and the coupling of these codes to perform a comprehensive analysis are discussed in relation to the LTS design. Numerical predictions are compared with experimental data and presented for the LTS prototype tests.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seitz, R.R.; Rittmann, P.D.; Wood, M.I.
The US Department of Energy Headquarters established a performance assessment task team (PATT) to integrate the activities of DOE sites that are preparing performance assessments for the disposal of newly generated low-level waste. The PATT chartered a subteam with the task of comparing computer codes and exposure scenarios used for dose calculations in performance assessments. This report documents the efforts of the subteam. Computer codes considered in the comparison include GENII, PATHRAE-EPA, MICROSHIELD, and ISOSHLD. Calculations were also conducted using spreadsheets to provide a comparison at the most fundamental level. Calculations and modeling approaches are compared for unit radionuclide concentrations in water and soil for the ingestion, inhalation, and external dose pathways. Over 30 tables comparing inputs and results are provided.
Parametric Model of an Aerospike Rocket Engine
NASA Technical Reports Server (NTRS)
Korte, J. J.
2000-01-01
A suite of computer codes was assembled to simulate the performance of an aerospike engine and to generate the engine input for the Program to Optimize Simulated Trajectories. First an engine simulator module was developed that predicts the aerospike engine performance for a given mixture ratio, power level, thrust vectoring level, and altitude. This module was then used to rapidly generate the aerospike engine performance tables for axial thrust, normal thrust, pitching moment, and specific thrust. Parametric engine geometry was defined for use with the engine simulator module. The parametric model was also integrated into the iSIGHT multidisciplinary framework so that alternate designs could be determined. The computer codes were used to support in-house conceptual studies of reusable launch vehicle designs.
NEAMS Update. Quarterly Report for October - December 2011.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bradley, K.
2012-02-16
The Advanced Modeling and Simulation Office within the DOE Office of Nuclear Energy (NE) has been charged with revolutionizing the design tools used to build nuclear power plants during the next 10 years. To accomplish this, the DOE has brought together the national laboratories, U.S. universities, and the nuclear energy industry to establish the Nuclear Energy Advanced Modeling and Simulation (NEAMS) Program. The mission of NEAMS is to modernize computer modeling of nuclear energy systems and improve the fidelity and validity of modeling results using contemporary software environments and high-performance computers. NEAMS will create a set of engineering-level codes aimed at designing and analyzing the performance and safety of nuclear power plants and reactor fuels. The truly predictive nature of these codes will be achieved by modeling the governing phenomena at the spatial and temporal scales that dominate the behavior. These codes will be executed within a simulation environment that orchestrates code integration with respect to spatial meshing, computational resources, and execution to give the user a common 'look and feel' for setting up problems and displaying results. NEAMS is building upon a suite of existing simulation tools, including those developed by the federal Scientific Discovery through Advanced Computing and Advanced Simulation and Computing programs. NEAMS also draws upon existing simulation tools for materials and nuclear systems, although many of these are limited in terms of scale, applicability, and portability (their ability to be integrated into contemporary software and hardware architectures). NEAMS investments have directly and indirectly supported additional NE research and development programs, including those devoted to waste repositories, safeguarded separations systems, and long-term storage of used nuclear fuel. NEAMS is organized into two broad efforts, each comprising four elements.
The quarterly highlights October-December 2011 are: (1) Version 1.0 of AMP, the fuel assembly performance code, was tested on the JAGUAR supercomputer and released on November 1, 2011, a detailed discussion of this new simulation tool is given; (2) A coolant sub-channel model and a preliminary UO2 smeared-cracking model were implemented in BISON, the single-pin fuel code, more information on how these models were developed and benchmarked is given; (3) The Object Kinetic Monte Carlo model was implemented to account for nucleation events in meso-scale simulations and a discussion of the significance of this advance is given; (4) The SHARP neutronics module, PROTEUS, was expanded to be applicable to all types of reactors, and a discussion of the importance of PROTEUS is given; (5) A plan has been finalized for integrating the high-fidelity, three-dimensional reactor code SHARP with both the systems-level code RELAP7 and the fuel assembly code AMP. This is a new initiative; (6) Work began to evaluate the applicability of AMP to the problem of dry storage of used fuel and to define a relevant problem to test the applicability; (7) A code to obtain phonon spectra from the force-constant matrix for a crystalline lattice has been completed. This important bridge between subcontinuum and continuum phenomena is discussed; (8) Benchmarking was begun on the meso-scale, finite-element fuels code MARMOT to validate its new variable splitting algorithm; (9) A very computationally demanding simulation of diffusion-driven nucleation of new microstructural features has been completed.
An explanation of the difficulty of this simulation is given; (10) Experiments were conducted with deformed steel to validate a crystal plasticity finite-element code for body-centered cubic iron; (11) The Capability Transfer Roadmap was completed and published as an internal laboratory technical report; (12) The AMP fuel assembly code input generator was integrated into the NEAMS Integrated Computational Environment (NiCE). More details on the planned NEAMS computing environment are given; and (13) The NEAMS program website (neams.energy.gov) is nearly ready to launch.
Han, Min Cheol; Yeom, Yeon Soo; Lee, Hyun Su; Shin, Bangho; Kim, Chan Hyeong; Furuta, Takuya
2018-05-04
In this study, the multi-threading performance of the Geant4, MCNP6, and PHITS codes was evaluated as a function of the number of threads (N) and the complexity of the tetrahedral-mesh phantom. For this, three tetrahedral-mesh phantoms of varying complexity (simple, moderately complex, and highly complex) were prepared and implemented in the three different Monte Carlo codes, in photon and neutron transport simulations. Subsequently, for each case, the initialization time, calculation time, and memory usage were measured as a function of the number of threads used in the simulation. It was found that for all codes, the initialization time significantly increased with the complexity of the phantom, but not with the number of threads. Geant4 exhibited much longer initialization time than the other codes, especially for the complex phantom (MRCP). The improvement of computation speed due to the use of a multi-threaded code was calculated as the speed-up factor, the ratio of the computation speed on a multi-threaded code to the computation speed on a single-threaded code. Geant4 showed the best multi-threading performance among the codes considered in this study, with the speed-up factor almost linearly increasing with the number of threads, reaching ~30 when N = 40. PHITS and MCNP6 showed a much smaller increase of the speed-up factor with the number of threads. For PHITS, the speed-up factors were low when N = 40. For MCNP6, the increase of the speed-up factors was better, but they were still less than ~10 when N = 40. As for memory usage, Geant4 was found to use more memory than the other codes. In addition, compared to that of the other codes, the memory usage of Geant4 more rapidly increased with the number of threads, reaching as high as ~74 GB when N = 40 for the complex phantom (MRCP). 
It is notable that compared to that of the other codes, the memory usage of PHITS was much lower, regardless of both the complexity of the phantom and the number of threads, hardly increasing with the number of threads for the MRCP.
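The speed-up factor defined in this abstract (computation speed with N threads divided by single-threaded computation speed) reduces to a ratio of wall-clock times, since speed is inversely proportional to time. The sketch below shows that arithmetic together with the derived parallel efficiency. The timings are hypothetical values chosen only to reproduce a speed-up of about 30 at N = 40; they are not measurements from the study, and the function names are illustrative.

```python
def speed_up(t_single, t_multi):
    """Speed-up factor: single-thread time divided by multi-thread time."""
    if t_single <= 0 or t_multi <= 0:
        raise ValueError("timings must be positive")
    return t_single / t_multi


def parallel_efficiency(t_single, t_multi, n_threads):
    """Fraction of ideal (linear) scaling achieved with n_threads."""
    return speed_up(t_single, t_multi) / n_threads


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, not data from the paper.
    t1, t40 = 1200.0, 40.0
    print(speed_up(t1, t40))                 # speed-up factor
    print(parallel_efficiency(t1, t40, 40))  # efficiency relative to 40 threads
```

An efficiency well below 1 at large N usually points to serial sections, lock contention, or memory bandwidth limits rather than to the transport physics itself.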
NASA Technical Reports Server (NTRS)
Chang, C. Y.; Kwok, R.; Curlander, J. C.
1987-01-01
Five coding techniques in the spatial and transform domains have been evaluated for SAR image compression: linear three-point predictor (LTPP), block truncation coding (BTC), microadaptive picture sequencing (MAPS), adaptive discrete cosine transform (ADCT), and adaptive Hadamard transform (AHT). These techniques have been tested with Seasat data. Both LTPP and BTC spatial domain coding techniques provide very good performance at rates of 1-2 bits/pixel. The two transform techniques, ADCT and AHT, demonstrate the capability to compress the SAR imagery to less than 0.5 bits/pixel without visible artifacts. Tradeoffs such as the rate distortion performance, the computational complexity, the algorithm flexibility, and the controllability of compression ratios are also discussed.
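As an illustration of one of the techniques named above, the following is a minimal sketch of textbook block truncation coding (BTC) for a single block of pixels. It is not the implementation evaluated in the study; it shows the standard moment-preserving form, in which each block is reduced to its mean, its standard deviation, and a one-bit-per-pixel map. For a 4×4 block with the two statistics stored in 8 bits each (an assumed quantization), the rate is (16 + 16) / 16 = 2 bits/pixel.

```python
import math


def btc_encode(block):
    """Encode a flat list of pixel values as (mean, std, bitmap)."""
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((p - mean) ** 2 for p in block) / m)
    bitmap = [1 if p > mean else 0 for p in block]
    return mean, std, bitmap


def btc_decode(mean, std, bitmap):
    """Rebuild a two-level block that preserves the mean and variance."""
    m = len(bitmap)
    q = sum(bitmap)                      # number of pixels above the mean
    if q == 0 or q == m:                 # flat block: a single level suffices
        return [mean] * m
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return [high if bit else low for bit in bitmap]


if __name__ == "__main__":
    # Arbitrary 4x4 block in row-major order (illustrative values only).
    block = [2, 9, 12, 15, 2, 11, 12, 15, 2, 3, 12, 15, 3, 3, 4, 14]
    mean, std, bitmap = btc_encode(block)
    print(btc_decode(mean, std, bitmap))
```

The two reconstruction levels are chosen so that the decoded block has the same first and second moments as the original, which is what keeps local contrast intact at a low bit rate.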
NASA Technical Reports Server (NTRS)
Boyd, David D. Jr.
2009-01-01
Preliminary aerodynamic and performance predictions for an active twist rotor for a HART-II type of configuration are performed using a computational fluid dynamics (CFD) code, OVERFLOW2, and a computational structural dynamics (CSD) code, CAMRAD-II. These codes are loosely coupled to compute a consistent set of aerodynamics and elastic blade motions. Resultant aerodynamic and blade motion data are then used in the Ffowcs-Williams Hawkins solver, PSU-WOPWOP, to compute noise on an observer plane under the rotor. Active twist of the rotor blade is achieved in CAMRAD-II by application of a periodic torsional moment couple (of equal and opposite sign) at the blade root and tip at a specified frequency and amplitude. To provide confidence in these particular active twist predictions for which no measured data is available, the rotor system geometry and computational set up examined here are identical to that used in a previous successful Higher Harmonic Control (HHC) computational study. For a single frequency equal to three times the blade passage frequency (3P), active twist is applied across a range of control phase angles at two different amplitudes. Predicted results indicate that there are control phase angles where the maximum mid-frequency noise level and the 4P non-rotating hub vibrations can be reduced, potentially, both at the same time. However, these calculated reductions are predicted to come with a performance penalty in the form of a reduction in rotor lift-to-drag ratio due to an increase in rotor profile power.
The coupling of fluids, dynamics, and controls on advanced architecture computers
NASA Technical Reports Server (NTRS)
Atwood, Christopher
1995-01-01
This grant provided for the demonstration of coupled controls, body dynamics, and fluids computations in a workstation cluster environment; and an investigation of the impact of peer-peer communication on flow solver performance and robustness. The findings of these investigations were documented in the conference articles. The attached publication, 'Towards Distributed Fluids/Controls Simulations', documents the solution and scaling of the coupled Navier-Stokes, Euler rigid-body dynamics, and state feedback control equations for a two-dimensional canard-wing. The poor scaling shown was due to serialized grid connectivity computation and Ethernet bandwidth limits. The scaling of a peer-to-peer communication flow code on an IBM SP-2 was also shown. The scaling of the code on the switched fabric-linked nodes was good, with a 2.4 percent loss due to communication of intergrid boundary point information. The code performance on 30 worker nodes was 1.7 μs/point/iteration, or a factor of three over a Cray C-90 head. The attached paper, 'Nonlinear Fluid Computations in a Distributed Environment', documents the effect of several computational rate enhancing methods on convergence. For the cases shown, the highest throughput was achieved using boundary updates at each step, with the manager process performing communication tasks only. Constrained domain decomposition of the implicit fluid equations did not degrade the convergence rate or final solution. The scaling of a coupled body/fluid dynamics problem on an Ethernet-linked cluster was also shown.
PIC codes for plasma accelerators on emerging computer architectures (GPUS, Multicore/Manycore CPUS)
NASA Astrophysics Data System (ADS)
Vincenti, Henri
2016-03-01
The advent of exascale computers will enable 3D simulations of new laser-plasma interaction regimes that were previously out of reach of current petascale computers. However, the paradigm used to write current PIC codes will have to change in order to fully exploit the potentialities of these new computing architectures. Indeed, achieving exascale computing facilities in the next decade will be a great challenge in terms of energy consumption and will imply hardware developments directly impacting our way of implementing PIC codes. As data movement (from die to network) is by far the most energy-consuming part of an algorithm, future computers will tend to increase memory locality at the hardware level and reduce energy consumption related to data movement by using more and more cores on each compute node ("fat nodes") that will have a reduced clock speed to allow for efficient cooling. To compensate for the frequency decrease, CPU machine vendors are making use of long SIMD instruction registers that are able to process multiple data with one arithmetic operator in one clock cycle. SIMD register length is expected to double every four years. GPUs also have a reduced clock speed per core and can process Multiple Instructions on Multiple Data (MIMD). At the software level, Particle-In-Cell (PIC) codes will thus have to achieve both good memory locality and vectorization (for Multicore/Manycore CPU) to fully take advantage of these upcoming architectures. In this talk, we present the portable solutions we implemented in our high performance skeleton PIC code PICSAR to achieve both good memory locality and cache reuse as well as good vectorization on SIMD architectures. We also present the portable solutions used to parallelize the pseudo-spectral quasi-cylindrical code FBPIC on GPUs using the Numba Python compiler.
Testing and Performance Analysis of the Multichannel Error Correction Code Decoder
NASA Technical Reports Server (NTRS)
Soni, Nitin J.
1996-01-01
This report provides the test results and performance analysis of the multichannel error correction code decoder (MED) system for a regenerative satellite with asynchronous, frequency-division multiple access (FDMA) uplink channels. It discusses the system performance relative to various critical parameters: the coding length, data pattern, unique word value, unique word threshold, and adjacent-channel interference. Testing was performed under laboratory conditions and used a computer control interface with specifically developed control software to vary these parameters. Needed technologies - the high-speed Bose-Chaudhuri-Hocquenghem (BCH) codec from Harris Corporation and the TRW multichannel demultiplexer/demodulator (MCDD) - were fully integrated into the mesh very small aperture terminal (VSAT) onboard processing architecture and were demonstrated.
NASA Technical Reports Server (NTRS)
1991-01-01
Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)
Addressing the challenges of standalone multi-core simulations in molecular dynamics
NASA Astrophysics Data System (ADS)
Ocaya, R. O.; Terblans, J. J.
2017-07-01
Computational modelling in material science involves mathematical abstractions of force fields between particles with the aim to postulate, develop and understand materials by simulation. The aggregated pairwise interactions of the material's particles lead to a deduction of its macroscopic behaviours. For practically meaningful macroscopic scales, a large amount of data is generated, leading to vast execution times. Simulation times of hours, days or weeks for moderately sized problems are not uncommon. The reduction of simulation times, improved result accuracy and the associated software and hardware engineering challenges are the main motivations for much of the ongoing research in the computational sciences. This contribution is concerned mainly with simulations that can be done on a "standalone" computer based on Message Passing Interfaces (MPI), parallel code running on hardware platforms with wide specifications, such as single/multi-processor, multi-core machines with minimal reconfiguration for upward scaling of computational power. The widely available, documented and standardized MPI library provides this functionality through the MPI_Comm_size(), MPI_Comm_rank() and MPI_Reduce() functions. A survey of the literature shows that relatively little is written with respect to the efficient extraction of the inherent computational power in a cluster. In this work, we discuss the main avenues available to tap into this extra power without compromising computational accuracy. We also present methods to overcome the high inertia encountered in single-node-based computational molecular dynamics. We begin by surveying the current state of the art and discuss what it takes to achieve parallelism, efficiency and enhanced computational accuracy through program threads and message passing interfaces. Several code illustrations are given. The pros and cons of writing raw code as opposed to using heuristic, third-party code are also discussed.
The growing trend towards graphical processor units and virtual computing clouds for high-performance computing is also discussed. Finally, we present the comparative results of vacancy formation energy calculations using our own parallelized standalone code called Verlet-Stormer velocity (VSV) operating on 30,000 copper atoms. The code is based on the Sutton-Chen implementation of the Finnis-Sinclair pairwise embedded atom potential. A link to the code is also given.
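The abstract names a velocity-Verlet-type integrator but gives no details, so the following is only a generic sketch of the standard velocity Verlet scheme, not the authors' VSV code. To stay self-contained it uses a one-dimensional harmonic oscillator as a stand-in force law instead of the Sutton-Chen potential; the function name, the force model, and all parameter values are assumptions made for illustration.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Advance position x and velocity v by n_steps of size dt."""
    a = force(x) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity from averaged acceleration
        a = a_new
    return x, v


if __name__ == "__main__":
    # Stand-in force: unit-mass harmonic oscillator with spring constant k = 1.
    k, mass = 1.0, 1.0
    x, v = velocity_verlet(1.0, 0.0, lambda pos: -k * pos, mass, 0.01, 1000)
    energy = 0.5 * mass * v * v + 0.5 * k * x * x
    print(x, math.cos(10.0), energy)
```

In a message-passing version, each rank would typically integrate its own subset of atoms and partial sums such as the total energy would be combined with a reduction; that decomposition is described here only in general terms and is not taken from the paper.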
Adams, Bradley J; Aschheim, Kenneth W
2016-01-01
Comparison of antemortem and postmortem dental records is a leading method of victim identification, especially for incidents involving a large number of decedents. This process may be expedited with computer software that provides a ranked list of best possible matches. This study provides a comparison of the most commonly used conventional coding and sorting algorithms used in the United States (WinID3) with a simplified coding format that utilizes an optimized sorting algorithm. The simplified system consists of seven basic codes and utilizes an optimized algorithm based largely on the percentage of matches. To perform this research, a large reference database of approximately 50,000 antemortem and postmortem records was created. For most disaster scenarios, the proposed simplified codes, paired with the optimized algorithm, performed better than WinID3 which uses more complex codes. The detailed coding system does show better performance with extremely large numbers of records and/or significant body fragmentation. © 2015 American Academy of Forensic Sciences.
Computational techniques in gamma-ray skyshine analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
George, D.L.
1988-12-01
Two computer codes were developed to analyze gamma-ray skyshine, the scattering of gamma photons by air molecules. A review of previous gamma-ray skyshine studies discusses several Monte Carlo codes, programs using a single-scatter model, and the MicroSkyshine program for microcomputers. A benchmark gamma-ray skyshine experiment performed at Kansas State University is also described. A single-scatter numerical model was presented which traces photons from the source to their first scatter, then applies a buildup factor along a direct path from the scattering point to a detector. The FORTRAN code SKY, developed with this model before the present study, was modified to use Gauss quadrature, recent photon attenuation data and a more accurate buildup approximation. The resulting code, SILOGP, computes response from a point photon source on the axis of a silo, with and without concrete shielding over the opening. Another program, WALLGP, was developed using the same model to compute response from a point gamma source behind a perfectly absorbing wall, with and without shielding overhead. 29 refs., 48 figs., 13 tabs.
Optimal periodic binary codes of lengths 28 to 64
NASA Technical Reports Server (NTRS)
Tyler, S.; Keston, R.
1980-01-01
Results from computer searches performed to find repeated binary phase coded waveforms with optimal periodic autocorrelation functions are discussed. The best results for lengths 28 to 64 are given. The code features of major concern are where (1) the peak sidelobe in the autocorrelation function is small and (2) the sum of the squares of the sidelobes in the autocorrelation function is small.
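The two figures of merit named in this abstract can be computed directly from a candidate code. The sketch below evaluates the periodic autocorrelation of a ±1 sequence and returns the peak sidelobe magnitude and the sum of squared sidelobes. The length-7 example is a well-known maximal-length sequence used purely for illustration; it is not one of the length 28 to 64 codes reported in the paper, and the function names are hypothetical.

```python
def periodic_autocorrelation(code):
    """Periodic (cyclic) autocorrelation of a sequence of +1/-1 values."""
    n = len(code)
    return [sum(code[i] * code[(i + shift) % n] for i in range(n))
            for shift in range(n)]


def sidelobe_metrics(code):
    """Return (peak sidelobe magnitude, sum of squared sidelobes)."""
    sidelobes = periodic_autocorrelation(code)[1:]   # skip the zero-shift peak
    peak = max(abs(s) for s in sidelobes)
    energy = sum(s * s for s in sidelobes)
    return peak, energy


if __name__ == "__main__":
    m_sequence = [1, 1, 1, -1, 1, -1, -1]            # illustrative length-7 code
    print(periodic_autocorrelation(m_sequence))
    print(sidelobe_metrics(m_sequence))
```

An exhaustive search of the kind the abstract describes would evaluate these metrics over all candidate codes of a given length, which grows as 2^n and is why the searches were done by computer.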
NASA Technical Reports Server (NTRS)
Wade, Randall S.; Jones, Bailey
2009-01-01
A computer program loads configuration code into a Xilinx field-programmable gate array (FPGA), reads back and verifies that code, reloads the code if an error is detected, and monitors the performance of the FPGA for errors in the presence of radiation. The program consists mainly of a set of VHDL files (wherein "VHDL" signifies "VHSIC Hardware Description Language" and "VHSIC" signifies "very-high-speed integrated circuit").
1975-09-01
This report assumes a familiarity with the GIFT and MAGIC computer codes. The EDIT-COMGEOM code is a FORTRAN computer code. The EDIT-COMGEOM code...converts the target description data which was used in the MAGIC computer code to the target description data which can be used in the GIFT computer code
Performance analysis of parallel gravitational N-body codes on large GPU clusters
NASA Astrophysics Data System (ADS)
Huang, Si-Yi; Spurzem, Rainer; Berczik, Peter
2016-01-01
We compare the performance of two very different parallel gravitational N-body codes for astrophysical simulations on large Graphics Processing Unit (GPU) clusters, both of which are pioneers in their own fields as well as on certain mutual scales - NBODY6++ and Bonsai. We carry out benchmarks of the two codes by analyzing their performance, accuracy and efficiency through the modeling of structure decomposition and timing measurements. We find that both codes are heavily optimized to leverage the computational potential of GPUs as their performance has approached half of the maximum single precision performance of the underlying GPU cards. With such performance we predict that a speed-up of 200 - 300 can be achieved when up to 1k processors and GPUs are employed simultaneously. We discuss the quantitative information about comparisons of the two codes, finding that in the same cases Bonsai adopts larger time steps as well as larger relative energy errors than NBODY6++, typically ranging from 10 - 50 times larger, depending on the chosen parameters of the codes. Although the two codes are built for different astrophysical applications, in specified conditions they may overlap in performance at certain physical scales, thus allowing the user to choose either one by fine-tuning parameters accordingly.
Three-dimensional structural analysis using interactive graphics
NASA Technical Reports Server (NTRS)
Biffle, J.; Sumlin, H. A.
1975-01-01
The application of computer interactive graphics to three-dimensional structural analysis was described, with emphasis on the following aspects: (1) structural analysis, and (2) generation and checking of input data and examination of the large volume of output data (stresses, displacements, velocities, accelerations). Handling of three-dimensional input processing with a special MESH3D computer program was explained. Similarly, a special code PLTZ may be used to perform all the needed tasks for output processing from a finite element code. Examples were illustrated.
General 3D Airborne Antenna Radiation Pattern Code Users Manual.
1983-02-01
AD-A 30 359. General 3D Airborne Antenna Radiation Pattern Code Users Manual (U). Ohio State Univ., Columbus, ElectroScience Lab. H. H. Chung et al. Feb 83. RADC...F30602-79-C-0068. Performing organization: The Ohio State University...Computer Program. Abstract: This report describes a computer program and how it may
Program structure-based blocking
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bertolli, Carlo; Eichenberger, Alexandre E.; O'Brien, John K.
2017-09-26
Embodiments relate to program structure-based blocking. An aspect includes receiving source code corresponding to a computer program by a compiler of a computer system. Another aspect includes determining a prefetching section in the source code by a marking module of the compiler. Yet another aspect includes performing, by a blocking module of the compiler, blocking of instructions located in the prefetching section into instruction blocks, such that the instruction blocks of the prefetching section only contain instructions that are located in the prefetching section.
NASA Technical Reports Server (NTRS)
1992-01-01
The technical effort and computer code developed during the first year are summarized. Several formulations for Probabilistic Finite Element Analysis (PFEA) are described with emphasis on the selected formulation. The strategies being implemented in the first-version computer code to perform linear, elastic PFEA are described. The results of a series of select Space Shuttle Main Engine (SSME) component surveys are presented. These results identify the critical components and provide the information necessary for probabilistic structural analysis.
2015-06-01
events was ad hoc and problematic due to time constraints and changing requirements. Determining errors in context and heuristics required expertise...Data reduction for analysis of Command, Control, Communications, and Computer (C4) network tests
Turbulent Bubbly Flow in a Vertical Pipe Computed By an Eddy-Resolving Reynolds Stress Model
2014-09-19
the numerical code OpenFOAM®. Turbulent bubbly flows are encountered in many industrially relevant applications, such as chemical in...performed using the OpenFOAM-2.2.2 computational code utilizing a cell-center-based finite volume method on an unstructured numerical grid. The...the mean Courant number is always below 0.4. The utilized turbulence models were implemented into the so-called twoPhaseEulerFoam solver in OpenFOAM, to
Charon Message-Passing Toolkit for Scientific Computations
NASA Technical Reports Server (NTRS)
VanderWijngaart, Rob F.; Yan, Jerry (Technical Monitor)
2000-01-01
Charon is a library, callable from C and Fortran, that aids the conversion of structured-grid legacy codes, such as those used in the numerical computation of fluid flows, into parallel, high-performance codes. Key are functions that define distributed arrays, that map between distributed and non-distributed arrays, and that allow easy specification of common communications on structured grids. The library is based on the widely accepted MPI message passing standard. We present an overview of the functionality of Charon, and some representative results.
Viterbi decoding for satellite and space communication.
NASA Technical Reports Server (NTRS)
Heller, J. A.; Jacobs, I. M.
1971-01-01
Convolutional coding and Viterbi decoding, along with binary phase-shift keyed modulation, is presented as an efficient system for reliable communication on power limited satellite and space channels. Performance results, obtained theoretically and through computer simulation, are given for optimum short constraint length codes for a range of code constraint lengths and code rates. System efficiency is compared for hard receiver quantization and 4 and 8 level soft quantization. The effects on performance of varying certain parameters relevant to decoder complexity and cost are examined. Quantitative performance degradation due to imperfect carrier phase coherence is evaluated and compared to that of an uncoded system. As an example of decoder performance versus complexity, a recently implemented 2-Mbit/sec constraint length 7 Viterbi decoder is discussed. Finally, a comparison is made between Viterbi and sequential decoding in terms of suitability to various system requirements.
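As a rough illustration of the convolutional coding and hard-decision Viterbi decoding described in this abstract, the following is a minimal sketch. It assumes a rate-1/2, constraint-length-3 code with the textbook generators 7 and 5 (octal), chosen only for brevity; the abstract itself discusses a constraint length 7 decoder and soft quantization, which are not modeled here. All function names are hypothetical.

```python
# Illustrative parameters (assumptions, not taken from the paper):
# rate 1/2, constraint length K = 3, generators 7 and 5 in octal.
G = (0b111, 0b101)
K = 3
N_STATES = 1 << (K - 1)


def parity(x):
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def encode(bits):
    """Encode a list of bits; K-1 zero tail bits flush the shift register."""
    state = 0
    out = []
    for b in list(bits) + [0] * (K - 1):
        reg = (b << (K - 1)) | state      # newest bit sits in the top position
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1                  # shift the register by one bit
    return out


def viterbi_decode(received):
    """Hard-decision Viterbi decoding using Hamming distance as the metric."""
    n_out = len(G)
    n_steps = len(received) // n_out
    inf = float("inf")
    metric = [0] + [inf] * (N_STATES - 1)  # the encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    for t in range(n_steps):
        r = received[n_out * t: n_out * (t + 1)]
        new_metric = [inf] * N_STATES
        new_paths = [None] * N_STATES
        for s in range(N_STATES):
            if metric[s] == inf:
                continue                   # state not yet reachable
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                expected = [parity(reg & g) for g in G]
                nxt = reg >> 1
                m = metric[s] + sum(e != x for e, x in zip(expected, r))
                if m < new_metric[nxt]:    # keep only the survivor path
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    # The tail bits force the encoder back to state 0; drop them from the output.
    return paths[0][: n_steps - (K - 1)]


message = [1, 0, 1, 1, 0, 0, 1]
coded = encode(message)
corrupted = list(coded)
corrupted[3] ^= 1                          # flip one channel bit
decoded = viterbi_decode(corrupted)
```

In this sketch a single flipped channel bit is corrected, so `decoded` equals `message`. A soft-decision variant would replace the Hamming distance with a metric computed from quantized receiver outputs.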
Numerical Analysis of Dusty-Gas Flows
NASA Astrophysics Data System (ADS)
Saito, T.
2002-02-01
This paper presents the development of a numerical code for simulating unsteady dusty-gas flows including shock and rarefaction waves. The numerical results obtained for a shock tube problem are used for validating the accuracy and performance of the code. The code is then extended for simulating two-dimensional problems. Since the interactions between the gas and particle phases are calculated with the operator splitting technique, we can choose numerical schemes independently for the different phases. A semi-analytical method is developed for the dust phase, while the TVD scheme of Harten and Yee is chosen for the gas phase. Throughout this study, computations are carried out on an SGI Origin2000, a parallel computer with multiple RISC-based processors. The efficient use of the parallel computer system is an important issue and the code implementation on the Origin2000 is also described. Flow profiles of both the gas and solid particles behind the steady shock wave are calculated by integrating the steady conservation equations. The good agreement between the pseudo-stationary solutions and those from the current numerical code validates the numerical approach and the actual coding. The pseudo-stationary shock profiles can also be used as initial conditions of unsteady multidimensional simulations.
Practical somewhat-secure quantum somewhat-homomorphic encryption with coherent states
NASA Astrophysics Data System (ADS)
Tan, Si-Hui; Ouyang, Yingkai; Rohde, Peter P.
2018-04-01
We present a scheme for implementing homomorphic encryption on coherent states encoded using phase-shift keys. The encryption operations require only rotations in phase space, which commute with computations in the code space performed via passive linear optics, and with generalized nonlinear phase operations that are polynomials of the photon-number operator in the code space. This encoding scheme can thus be applied to any computation with coherent-state inputs, and the computation proceeds via a combination of passive linear optics and generalized nonlinear phase operations. An example of such a computation is matrix multiplication, whereby a vector representing coherent-state amplitudes is multiplied by a matrix representing a linear optics network, yielding a new vector of coherent-state amplitudes. By finding an orthogonal partitioning of the support of our encoded states, we quantify the security of our scheme via the indistinguishability of the encrypted code words. While we focus on coherent-state encodings, we expect that this phase-key encoding technique could apply to any continuous-variable computation scheme where the phase-shift operator commutes with the computation.
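The commutation property this abstract relies on can be checked numerically. The sketch below assumes that the same phase key is applied to every mode and that the linear-optics network is a 50:50 beam splitter; the key value, the amplitudes, and all function names are illustrative assumptions rather than details from the paper.

```python
import cmath
import math


def rotate(amplitudes, theta):
    """Apply the same phase-space rotation exp(i*theta) to every mode."""
    phase = cmath.exp(1j * theta)
    return [phase * a for a in amplitudes]


def apply_network(matrix, amplitudes):
    """Multiply the vector of coherent-state amplitudes by the network matrix."""
    return [sum(m * a for m, a in zip(row, amplitudes)) for row in matrix]


# A 50:50 beam splitter stands in for the passive linear-optics network.
s = 1 / math.sqrt(2)
beam_splitter = [[s, s], [s, -s]]

alpha = [1.0 + 0.5j, -0.3 + 0.2j]      # plaintext amplitudes (made-up values)
key = 2 * math.pi * 3 / 8              # hypothetical phase key

encrypted = rotate(alpha, key)                      # encrypt
evaluated = apply_network(beam_splitter, encrypted) # compute on encrypted data
decrypted = rotate(evaluated, -key)                 # decrypt the result

reference = apply_network(beam_splitter, alpha)     # same computation in the clear
max_error = max(abs(d - r) for d, r in zip(decrypted, reference))
```

Because the rotation is a common scalar factor on all modes, it passes through the matrix multiplication unchanged, and `max_error` is at the level of floating-point rounding.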
Summary Report of Working Group 2: Computation
NASA Astrophysics Data System (ADS)
Stoltz, P. H.; Tsung, R. S.
2009-01-01
The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time orders of magnitude, and (ii) new hardware architectures, like graphics processing units and Cell processors that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many order-of-magnitude speedup of, and details of porting the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.
Summary Report of Working Group 2: Computation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoltz, P. H.; Tsung, R. S.
2009-01-22
The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time orders of magnitude, and (ii) new hardware architectures, like graphics processing units and Cell processors that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many order-of-magnitude speedup of, and details of porting the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kistler, B.L.
DELSOL3 is a revised and updated version of the DELSOL2 computer program (SAND81-8237) for calculating collector field performance and layout and optimal system design for solar thermal central receiver plants. The code consists of a detailed model of the optical performance, a simpler model of the non-optical performance, an algorithm for field layout, and a searching algorithm to find the best system design based on energy cost. The latter two features are coupled to a cost model of central receiver components and an economic model for calculating energy costs. The code can handle flat, focused and/or canted heliostats, and external cylindrical, multi-aperture cavity, and flat plate receivers. The program optimizes the tower height, receiver size, field layout, heliostat spacings, and tower position at user specified power levels subject to flux limits on the receiver and land constraints for field layout. DELSOL3 maintains the advantages of speed and accuracy which are characteristics of DELSOL2.
A Biosequence-based Approach to Software Characterization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oehmen, Christopher S.; Peterson, Elena S.; Phillips, Aaron R.
For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries. Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.
Performance of MCNP4A on seven computing platforms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendricks, J.S.; Brockhoff, R.C.
1994-12-31
The performance of seven computer platforms has been evaluated with the MCNP4A Monte Carlo radiation transport code. For the first time we report timing results using MCNP4A and its new test set and libraries. Comparisons are made on platforms not available to us in previous MCNP timing studies. By using MCNP4A and its 25-problem test set, a widely-used and readily-available physics production code is used; the timing comparison is not limited to a single "typical" problem, demonstrating the problem dependence of timing results; the results are reproducible at the more than 100 installations around the world using MCNP; comparison of performance of other computer platforms to the ones tested in this study is possible because we present raw data rather than normalized results; and a measure of the increase in performance of computer hardware and software over the past two years is possible. The computer platforms reported are the Cray-YMP 8/64, IBM RS/6000-560, Sun Sparc10, Sun Sparc2, HP/9000-735, 4 processor 100 MHz Silicon Graphics ONYX, and Gateway 2000 model 4DX2-66V PC. In 1991 a timing study of MCNP4, the predecessor to MCNP4A, was conducted using ENDF/B-V cross-section libraries, which are export protected. The new study is based upon the new MCNP 25-problem test set which utilizes internationally available data. MCNP4A, its test problems and the test data library are available from the Radiation Shielding and Information Center in Oak Ridge, Tennessee, or from the NEA Data Bank in Saclay, France. Anyone with the same workstation and compiler can get the same test problem sets, the same library files, and the same MCNP4A code from RSIC or NEA and replicate our results. And, because we report raw data, comparison of the performance of other computer platforms and compilers can be made.
Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2
NASA Technical Reports Server (NTRS)
Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad
1995-01-01
The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that the PSDNS approach can effectively be parallelized on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs to match the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the computational cost is dominated by the FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 Mflops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32 nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32-node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.
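The slower-than-linear speedup reported in this abstract can be put into numbers with a short calculation. The abstract states that eight SP1 nodes match the Cray Y/MP and that 32 SP1 nodes are 2.9 times faster than the Y/MP, so going from 8 to 32 nodes gives a relative speedup of 2.9 for a fourfold increase in nodes. The sketch below, with a hypothetical helper name, turns that into a relative parallel efficiency. The SP2 figure is not used because the abstract gives no 8-node SP2 baseline.

```python
def relative_efficiency(speedup, nodes_base, nodes_scaled):
    """Speedup achieved divided by the ideal (linear) speedup for the node increase."""
    ideal = nodes_scaled / nodes_base
    return speedup / ideal


# SP1: 8 nodes match the Y/MP; 32 nodes are 2.9 times faster than the Y/MP.
sp1_efficiency = relative_efficiency(2.9, nodes_base=8, nodes_scaled=32)
print(f"SP1 efficiency from 8 to 32 nodes: {sp1_efficiency:.3f}")
```

Under these figures the SP1 retains roughly 72 percent of ideal scaling between 8 and 32 nodes.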
The Influence of Viscous Effects on Ice Accretion Prediction and Airfoil Performance Predictions
NASA Technical Reports Server (NTRS)
Kreeger, Richard E.; Wright, William B.
2005-01-01
A computational study was conducted to evaluate the effectiveness of using a viscous flow solution in an ice accretion code and the resulting accuracy of aerodynamic performance prediction. Ice shapes were obtained for one single-element and one multi-element airfoil using both potential flow and Navier-Stokes flowfields in the LEWICE ice accretion code. Aerodynamics were then calculated using a Navier-Stokes flow solver.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prowell, Stacy J; Symons, Christopher T
2015-01-01
Producing trusted results from high-performance codes is essential for policy and has significant economic impact. We propose combining rigorous analytical methods with machine learning techniques to achieve the goal of repeatable, trustworthy scientific computing.
Spatial transform coding of color images.
NASA Technical Reports Server (NTRS)
Pratt, W. K.
1971-01-01
The application of the transform-coding concept to the coding of color images represented by three primary color planes of data is discussed. The principles of spatial transform coding are reviewed and the merits of various methods of color-image representation are examined. A performance analysis is presented for the color-image transform-coding system. Results of a computer simulation of the coding system are also given. It is shown that, by transform coding, the chrominance content of a color image can be coded with an average of 1.0 bits per element or less without serious degradation. If luminance coding is also employed, the average rate reduces to about 2.0 bits per element or less.
NASA Astrophysics Data System (ADS)
Naumov, D.; Fischer, T.; Böttcher, N.; Watanabe, N.; Walther, M.; Rink, K.; Bilke, L.; Shao, H.; Kolditz, O.
2014-12-01
OpenGeoSys (OGS) is a scientific open source code for numerical simulation of thermo-hydro-mechanical-chemical processes in porous and fractured media. Its basic concept is to provide a flexible numerical framework for solving multi-field problems for applications in geoscience and hydrology, e.g. CO2 storage, geothermal power plant forecast simulation, salt water intrusion, and water resources management. Advances in computational mathematics have revolutionized the variety and nature of the problems that environmental scientists and engineers can address, and intensive code development in recent years now enables the solution of much larger numerical problems and applications. However, solving environmental processes along the water cycle at large scales, such as complete catchments or reservoirs, remains a computationally challenging task. Therefore, we started a new OGS code development with a focus on execution speed and parallelization. In the new version, a local data structure concept improves the instruction and data cache performance by a tight bundling of data with an element-wise numerical integration loop. Dedicated analysis methods enable the investigation of memory-access patterns in the local and global assembler routines, which leads to further data structure optimization for an additional performance gain. The concept is presented together with a technical code analysis of the recent development and a large case study including transient flow simulation in the unsaturated / saturated zone of the Thuringian Syncline, Germany. The analysis is performed on a high-resolution mesh (up to 50M elements) with embedded fault structures.
NASA Astrophysics Data System (ADS)
Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong
2017-07-01
The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the lowdelayP condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BDPSNR is observed for 18 sequences when the target complexity is around 40%.
SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes
NASA Astrophysics Data System (ADS)
Homann, Holger; Laenen, Francois
2018-03-01
The numerical study of physical problems often requires integrating the dynamics of a large number of particles evolving according to a given set of equations. Particles are characterized by the information they are carrying, such as an identity, a position, or other properties. There are, generally speaking, two different possibilities for handling particles in high performance computing (HPC) codes. The concept of an Array of Structures (AoS) is in the spirit of the object-oriented programming (OOP) paradigm in that the particle information is implemented as a structure. Here, an object (realization of the structure) represents one particle and a set of many particles is stored in an array. In contrast, using the concept of a Structure of Arrays (SoA), a single structure holds several arrays, each representing one property (such as the identity) of the whole set of particles. The AoS approach is often implemented in HPC codes due to its handiness and flexibility. For a class of problems, however, it is known that the performance of SoA is much better than that of AoS. We confirm this observation for our particle problem. Using a benchmark we show that on modern Intel Xeon processors the SoA implementation is typically several times faster than the AoS one. On Intel's MIC co-processors the performance gap even attains a factor of ten. The same is true for GPU computing, using both computational and multi-purpose GPUs. Combining performance and handiness, we present the library SoAx that has optimal performance (on CPUs, MICs, and GPUs) while providing the same handiness as AoS. For this, SoAx uses modern C++ design techniques such as template metaprogramming, which allow code to be generated automatically for user-defined heterogeneous data structures.
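The two layouts contrasted in this abstract can be sketched side by side. The example below is written in Python only to show the data organization; it does not reproduce the SoAx library or its C++ interface, and it says nothing about performance. The names `Particle`, `ParticlesSoA`, and the fields `ident`, `x`, `v` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    """Array of Structures: one object holds all properties of one particle."""
    ident: int
    x: float
    v: float


def advance_aos(particles, dt):
    """Move every particle; each step touches one whole particle object."""
    for p in particles:
        p.x += p.v * dt


class ParticlesSoA:
    """Structure of Arrays: one array per property for the whole particle set."""

    def __init__(self, ident, x, v):
        self.ident = list(ident)
        self.x = list(x)
        self.v = list(v)

    def advance(self, dt):
        """Move every particle; only the position and velocity arrays are read."""
        self.x = [xi + vi * dt for xi, vi in zip(self.x, self.v)]


# The same two particles stored in both layouts.
aos = [Particle(0, 0.0, 1.0), Particle(1, 2.0, -1.0)]
soa = ParticlesSoA(ident=[0, 1], x=[0.0, 2.0], v=[1.0, -1.0])

advance_aos(aos, 0.5)
soa.advance(0.5)
```

Both layouts give the same positions after the update. In a compiled language the SoA form keeps each property contiguous in memory, which is the property the abstract credits for the speed difference.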
Performance characteristics of the Cooper PC-9 centrifugal compressor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foster, R.E.; Neely, R.F.
1988-06-30
Mathematical performance modeling of the PC-9 centrifugal compressor has been completed. Performance characteristics curves have never been obtained for them in test loops with the same degree of accuracy as for the uprated axial compressors and, consequently, computer modeling of the top cascade and purge cascades has been very difficult and of limited value. This compressor modeling work has been carried out in an attempt to generate data which would more accurately define the compressor's performance and would permit more accurate cascade modeling. A computer code, COMPAL, was used to mathematically model the PC-9 performance with variations in gas composition, flow ratios, pressure ratios, speed and temperature. The results of this effort, in the form of graphs, with information about the compressor and the code, are the subject of this report. Compressor characteristic curves are featured. 13 figs.
Developing Information Power Grid Based Algorithms and Software
NASA Technical Reports Server (NTRS)
Dongarra, Jack
1998-01-01
This exploratory study initiated our effort to understand performance modeling on parallel systems. The basic goal of performance modeling is to understand and predict the performance of a computer program or set of programs on a computer system. Performance modeling has numerous applications, including evaluation of algorithms, optimization of code implementations, parallel library development, comparison of system architectures, parallel system design, and procurement of new systems. Our work lays the basis for the construction of parallel libraries that allow for the reconstruction of application codes on several distinct architectures so as to assure performance portability. Following our strategy, once the requirements of applications are well understood, one can then construct a library in a layered fashion. The top level of this library will consist of architecture-independent geometric, numerical, and symbolic algorithms that are needed by the sample of applications. These routines should be written in a language that is portable across the targeted architectures.
Users manual for updated computer code for axial-flow compressor conceptual design
NASA Technical Reports Server (NTRS)
Glassman, Arthur J.
1992-01-01
An existing computer code that determines the flow path for an axial-flow compressor either for a given number of stages or for a given overall pressure ratio was modified for use in air-breathing engine conceptual design studies. This code uses a rapid approximate design methodology that is based on isentropic simple radial equilibrium. Calculations are performed at constant-span-fraction locations from tip to hub. Energy addition per stage is controlled by specifying the maximum allowable values for several aerodynamic design parameters. New modeling was introduced to the code to overcome perceived limitations. Specific changes included variable rather than constant tip radius, flow path inclination added to the continuity equation, input of mass flow rate directly rather than indirectly as inlet axial velocity, solution for the exact value of overall pressure ratio rather than for any value that met or exceeded it, and internal computation of efficiency rather than the use of input values. The modified code was shown to be capable of computing efficiencies that are compatible with those of five multistage compressors and one fan that were tested experimentally. This report serves as a users manual for the revised code, Compressor Spanline Analysis (CSPAN). The modeling modifications, including two internal loss correlations, are presented. Program input and output are described. A sample case for a multistage compressor is included.
Modeling high-temperature superconductors and metallic alloys on the Intel IPSC/860
NASA Astrophysics Data System (ADS)
Geist, G. A.; Peyton, B. W.; Shelton, W. A.; Stocks, G. M.
Oak Ridge National Laboratory has embarked on several computational Grand Challenges, which require the close cooperation of physicists, mathematicians, and computer scientists. One of these projects is the determination of the material properties of alloys from first principles and, in particular, the electronic structure of high-temperature superconductors. While the present focus of the project is on superconductivity, the approach is general enough to permit study of other properties of metallic alloys such as strength and magnetic properties. This paper describes the progress to date on this project. We include a description of a self-consistent KKR-CPA method, parallelization of the model, and the incorporation of a dynamic load balancing scheme into the algorithm. We also describe the development and performance of a consolidated KKR-CPA code capable of running on CRAYs, workstations, and several parallel computers without source code modification. Performance of this code on the Intel iPSC/860 is also compared to a CRAY 2, CRAY YMP, and several workstations. Finally, some density of state calculations of two perovskite superconductors are given.
Computation of the tip vortex flowfield for advanced aircraft propellers
NASA Technical Reports Server (NTRS)
Tsai, Tommy M.; Dejong, Frederick J.; Levy, Ralph
1988-01-01
The tip vortex flowfield plays a significant role in the performance of advanced aircraft propellers. The flowfield in the tip region is complex, three-dimensional and viscous with large secondary velocities. An analysis is presented using an approximate set of equations which contains the physics required by the tip vortex flowfield, but which does not require the resources of the full Navier-Stokes equations. A computer code was developed to predict the tip vortex flowfield of advanced aircraft propellers. A grid generation package was developed to allow specification of a variety of advanced aircraft propeller shapes. Calculations of the tip vortex generation on an SR3 type blade at high Reynolds numbers were made using this code and a parametric study was performed to show the effect of tip thickness on tip vortex intensity. In addition, calculations of the tip vortex generation on a NACA 0012 type blade were made, including the flowfield downstream of the blade trailing edge. Comparison of flowfield calculations with experimental data from an F4 blade was made. A user's manual was also prepared for the computer code (NASA CR-182178).
NASA Technical Reports Server (NTRS)
Carpenter, M. H.
1988-01-01
The generalized chemistry version of the computer code SPARK is extended to include two higher-order numerical schemes, yielding fourth-order spatial accuracy for the inviscid terms. The new and old formulations are used to study the influences of finite rate chemical processes on nozzle performance. A determination is made of the computationally optimum reaction scheme for use in high-enthalpy nozzles. Finite rate calculations are compared with the frozen and equilibrium limits to assess the validity of each formulation. In addition, the finite rate SPARK results are compared with the constant ratio of specific heats (gamma) SEAGULL code, to determine its accuracy in variable gamma flow situations. Finally, the higher-order SPARK code is used to calculate nozzle flows having species stratification. Flame quenching occurs at low nozzle pressures, while for high pressures, significant burning continues in the nozzle.
Multidisciplinary Aerospace Systems Optimization: Computational AeroSciences (CAS) Project
NASA Technical Reports Server (NTRS)
Kodiyalam, S.; Sobieski, Jaroslaw S. (Technical Monitor)
2001-01-01
The report describes a method for performing optimization of a system whose analysis is so expensive that it is impractical to let the optimization code invoke it directly because excessive computational cost and elapsed time might result. In such a situation it is imperative to have the user control the number of times the analysis is invoked. The reported method achieves that by two techniques in the Design of Experiment category: a uniform dispersal of the trial design points over an n-dimensional hypersphere and a response surface fitting, and the technique of kriging. Analyses of all the trial designs, whose number may be set by the user, are performed before activation of the optimization code and the results are stored as a data base. That code is then executed and referred to the above data base. Two applications, one of the airborne laser system and one of an aircraft optimization, illustrate the method application.
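The workflow in this abstract (disperse trial designs, run the expensive analysis once per design, store the results for the optimizer to consult) can be sketched as follows. This is only an illustration under stated assumptions: the report's actual dispersal scheme is not given here, so random uniform sampling on the sphere surface is swapped in, and `expensive_analysis` is a hypothetical stand-in for the real analysis code.

```python
import math
import random


def sample_on_hypersphere(n_points, n_dims, radius=1.0, seed=0):
    """Draw points uniformly at random on the surface of an n-dimensional sphere.

    Normalizing a vector of independent Gaussian samples gives a direction
    that is uniformly distributed over the sphere.
    """
    rng = random.Random(seed)
    points = []
    while len(points) < n_points:
        v = [rng.gauss(0.0, 1.0) for _ in range(n_dims)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            continue  # extremely unlikely; redraw rather than divide by zero
        points.append([radius * x / norm for x in v])
    return points


def expensive_analysis(x):
    """Hypothetical placeholder for the costly system analysis."""
    return sum((xi - 0.5) ** 2 for xi in x)


def build_database(n_points, n_dims, seed=0):
    """Run the analysis exactly once per trial design and store the results."""
    designs = sample_on_hypersphere(n_points, n_dims, seed=seed)
    return [(x, expensive_analysis(x)) for x in designs]


# The user fixes the analysis budget (here 20 runs) before any optimization starts.
database = build_database(n_points=20, n_dims=3)
```

A response surface or kriging model would then be fitted to `database`, and the optimizer would query that fitted model instead of the analysis itself.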
NASA Astrophysics Data System (ADS)
Schmieschek, S.; Shamardin, L.; Frijters, S.; Krüger, T.; Schiller, U. D.; Harting, J.; Coveney, P. V.
2017-08-01
We introduce the lattice-Boltzmann code LB3D, version 7.1. Building on a parallel program and supporting tools which have enabled research utilising high performance computing resources for nearly two decades, LB3D version 7 provides a subset of the research code functionality as an open source project. Here, we describe the theoretical basis of the algorithm as well as computational aspects of the implementation. The software package is validated against simulations of meso-phases resulting from self-assembly in ternary fluid mixtures comprising immiscible and amphiphilic components such as water-oil-surfactant systems. The impact of the surfactant species on the dynamics of spinodal decomposition is tested and quantitative measurement of the permeability of a body centred cubic (BCC) model porous medium for a simple binary mixture is described. Single-core performance and scaling behaviour of the code are reported for simulations on current supercomputer architectures.
NASA Technical Reports Server (NTRS)
Kikuchi, Hideaki; Kalia, Rajiv; Nakano, Aiichiro; Vashishta, Priya; Iyetomi, Hiroshi; Ogata, Shuji; Kouno, Takahisa; Shimojo, Fuyuki; Tsuruta, Kanji; Saini, Subhash;
2002-01-01
A multidisciplinary, collaborative simulation has been performed on a Grid of geographically distributed PC clusters. The multiscale simulation approach seamlessly combines i) atomistic simulation based on the molecular dynamics (MD) method and ii) quantum mechanical (QM) calculation based on the density functional theory (DFT), so that accurate but less scalable computations are performed only where they are needed. The multiscale MD/QM simulation code has been Grid-enabled using i) a modular, additive hybridization scheme, ii) multiple QM clustering, and iii) computation/communication overlapping. The Gridified MD/QM simulation code has been used to study environmental effects of water molecules on fracture in silicon. A preliminary run of the code has achieved a parallel efficiency of 94% on 25 PCs distributed over 3 PC clusters in the US and Japan, and a larger test involving 154 processors on 5 distributed PC clusters is in progress.
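To put the quoted efficiency figure in context, the short sketch below uses the usual definition of parallel efficiency (speedup divided by processor count). The abstract does not say how its efficiency was measured, so applying this definition is an assumption.

```python
def parallel_efficiency(speedup, n_procs):
    """Conventional definition: fraction of ideal linear speedup achieved."""
    return speedup / n_procs


def implied_speedup(efficiency, n_procs):
    """Invert the definition to recover the speedup behind a quoted efficiency."""
    return efficiency * n_procs


# 94% efficiency on 25 PCs corresponds to a speedup of about 23.5.
speedup_25 = implied_speedup(0.94, 25)
```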
Load management strategy for Particle-In-Cell simulations in high energy particle acceleration
NASA Astrophysics Data System (ADS)
Beck, A.; Frederiksen, J. T.; Dérouillat, J.
2016-09-01
In the wake of the intense effort made for the experimental CILEX project, numerical simulation campaigns have been carried out in order to finalize the design of the facility and to identify optimal laser and plasma parameters. These simulations bring, of course, important insight into the fundamental physics at play. As a by-product, they also characterize the quality of our theoretical and numerical models. In this paper, we compare the results given by different codes and point out algorithmic limitations both in terms of physical accuracy and computational performances. These limitations are illustrated in the context of electron laser wakefield acceleration (LWFA). The main limitation we identify in state-of-the-art Particle-In-Cell (PIC) codes is computational load imbalance. We propose an innovative algorithm to deal with this specific issue as well as milestones towards a modern, accurate high-performance PIC code for high energy particle acceleration.
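Computational load imbalance, which the abstract names as the main limitation, is often quantified as the ratio of the busiest process's load to the mean load. The sketch below uses particle counts per process as a proxy for load. Both the metric and the proxy are assumptions chosen for illustration, not the algorithm proposed in the paper.

```python
def load_imbalance(particles_per_rank):
    """Ratio of the maximum per-rank load to the mean load.

    A value of 1.0 means perfectly balanced; a value of k means the slowest
    rank carries k times the average work, so the step takes roughly k times
    longer than it would under perfect balance.
    """
    mean = sum(particles_per_rank) / len(particles_per_rank)
    return max(particles_per_rank) / mean


balanced = load_imbalance([100, 100, 100, 100])
# All particles piled onto one rank, e.g. inside a dense accelerated bunch.
concentrated = load_imbalance([400, 0, 0, 0])
```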
Computation of Thermodynamic Equilibria Pertinent to Nuclear Materials in Multi-Physics Codes
NASA Astrophysics Data System (ADS)
Piro, Markus Hans Alexander
Nuclear energy plays a vital role in supporting electrical needs and fulfilling commitments to reduce greenhouse gas emissions. Research is a continuing necessity to improve the predictive capabilities of fuel behaviour in order to reduce costs and to meet increasingly stringent safety requirements by the regulator. Moreover, a renewed interest in nuclear energy has given rise to a "nuclear renaissance" and the necessity to design the next generation of reactors. In support of this goal, significant research efforts have been dedicated to the advancement of numerical modelling and computational tools in simulating various physical and chemical phenomena associated with nuclear fuel behaviour. This undertaking in effect is collecting the experience and observations of a past generation of nuclear engineers and scientists in a meaningful way for future design purposes. There is an increasing desire to integrate thermodynamic computations directly into multi-physics nuclear fuel performance and safety codes. A new equilibrium thermodynamic solver is being developed with this matter as a primary objective. This solver is intended to provide thermodynamic material properties and boundary conditions for continuum transport calculations. There are several concerns with the use of existing commercial thermodynamic codes: computational performance; limited capabilities in handling large multi-component systems of interest to the nuclear industry; convenient incorporation into other codes with quality assurance considerations; and, licensing entanglements associated with code distribution. The development of this software in this research is aimed at addressing all of these concerns. The approach taken in this work exploits fundamental principles of equilibrium thermodynamics to simplify the numerical optimization equations. 
In brief, the chemical potentials of all species and phases in the system are constrained by estimates of the chemical potentials of the system components at each iterative step, and the objective is to minimize the residuals of the mass balance equations. Several numerical advantages are achieved through this simplification. In particular, computational expense is reduced and the rate of convergence is enhanced. Furthermore, the software has demonstrated the ability to solve systems involving as many as 118 component elements. An early version of the code has already been integrated into the Advanced Multi-Physics (AMP) code under development by the Oak Ridge National Laboratory, Los Alamos National Laboratory, Idaho National Laboratory and Argonne National Laboratory. Keywords: Engineering, Nuclear -- 0552, Engineering, Material Science -- 0794, Chemistry, Mathematics -- 0405, Computer Science -- 0984
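The constraint described in this abstract can be written compactly. The notation below is an assumption introduced for illustration (the abstract defines no symbols): $a_{ij}$ is the number of atoms of component $j$ in species $i$, $\Gamma_j$ is the current estimate of the chemical potential of component $j$, $n_i$ is the number of moles of species $i$, and $b_j$ is the total amount of component $j$ in the system.

```latex
% Species chemical potentials are tied to the component (element) potentials:
\mu_i = \sum_{j} a_{ij}\, \Gamma_j
% Mass balance residual for each component j at the current iterate:
r_j = b_j - \sum_{i} a_{ij}\, n_i
% The iteration adjusts n_i and \Gamma_j to drive every residual to zero:
\min \; \lVert r \rVert \quad \text{subject to} \quad n_i \ge 0
```

Under this reading, each iteration updates the estimates $\Gamma_j$ and the mole numbers $n_i$ together, and convergence is declared when the residuals $r_j$ vanish to within tolerance.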
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curry, Matthew L.; Ferreira, Kurt Brian; Pedretti, Kevin Thomas Tauke
2012-03-01
This report documents thirteen of Sandia's contributions to the Computational Systems and Software Environment (CSSE) within the Advanced Simulation and Computing (ASC) program between fiscal years 2009 and 2012. It describes their impact on ASC applications. Most contributions are implemented in lower software levels allowing for application improvement without source code changes. Improvements are identified in such areas as reduced run time, characterizing power usage, and Input/Output (I/O). Other experiments are more forward looking, demonstrating potential bottlenecks using mini-application versions of the legacy codes and simulating their network activity on Exascale-class hardware. The purpose of this report is to prove that the team has completed milestone 4467-Demonstration of a Legacy Application's Path to Exascale. Cielo is expected to be the last capability system on which existing ASC codes can run without significant modifications. This assertion will be tested to determine where the breaking point is for an existing highly scalable application. The goal is to stretch the performance boundaries of the application by applying recent CSSE R&D in areas such as resilience, power, I/O, visualization services, SMARTMAP, lightweight LWKs, virtualization, simulation, and feedback loops. Dedicated system time reservations and/or CCC allocations will be used to quantify the impact of system-level changes to extend the life and performance of the ASC code base. Finally, a simulation of anticipated exascale-class hardware will be performed using SST to supplement the calculations. Determine where the breaking point is for an existing highly scalable application: Chapter 15 presented the CSSE work that sought to identify the breaking point in two ASC legacy applications-Charon and CTH. Their mini-app versions were also employed to complete the task. There is no single breaking point as more than one issue was found with the two codes. 
The results were that applications can expect to encounter performance issues related to the computing environment, system software, and algorithms. Careful profiling of runtime performance will be needed to identify the source of an issue, in strong combination with knowledge of system software and application source code.
A decoding procedure for the Reed-Solomon codes
NASA Technical Reports Server (NTRS)
Lim, R. S.
1978-01-01
A decoding procedure is described for the (n,k) t-error-correcting Reed-Solomon (RS) code, and an implementation of the (31,15) RS code for the I4-TENEX central system. This code can be used for error correction in large archival memory systems. The principal features of the decoder are a Galois field arithmetic unit implemented by microprogramming a microprocessor, and syndrome calculation by using the g(x) encoding shift register. Complete decoding of the (31,15) code is expected to take less than 500 microsecs. The syndrome calculation is performed by hardware using the encoding shift register and a modified Chien search. The error location polynomial is computed by using Lin's table, which is an interpretation of Berlekamp's iterative algorithm. The error location numbers are calculated by using the Chien search. Finally, the error values are computed by using Forney's method.
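The first decoding stage named in the abstract, syndrome calculation, can be illustrated for a (31,15) RS code over GF(2^5). This is a minimal sketch under stated assumptions: the field polynomial x^5 + x^2 + 1 and the generator roots alpha^1 through alpha^16 are assumed (the abstract gives neither), and the syndromes are evaluated directly from their textbook definition rather than with the g(x) shift register the report describes.

```python
PRIM_POLY = 0b100101  # x^5 + x^2 + 1, an assumed primitive polynomial for GF(32)
N, K = 31, 15         # code length and message length, in 5-bit symbols

# Exponential and logarithm tables for GF(32); EXP is doubled so products
# of two logarithms never need a modulo reduction.
EXP = [0] * (2 * N)
LOG = [0] * (N + 1)
x = 1
for i in range(N):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0b100000:
        x ^= PRIM_POLY
for i in range(N, 2 * N):
    EXP[i] = EXP[i - N]


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def poly_eval(poly, point):
    """Evaluate a polynomial (highest-degree coefficient first) by Horner's rule."""
    acc = 0
    for c in poly:
        acc = gf_mul(acc, point) ^ c
    return acc


def generator_poly():
    """g(x) = product of (x + alpha^i) for i = 1 .. N-K (assumed root choice)."""
    g = [1]
    for i in range(1, N - K + 1):
        root = EXP[i]
        nxt = g + [0]  # g(x) * x
        for j, c in enumerate(g):
            nxt[j + 1] ^= gf_mul(c, root)  # add g(x) * alpha^i
        g = nxt
    return g


def encode(msg):
    """Systematic encoding: message symbols followed by N-K parity symbols."""
    g = generator_poly()
    rem = list(msg) + [0] * (N - K)
    for i in range(K):
        coef = rem[i]
        if coef:
            for j in range(1, len(g)):
                rem[i + j] ^= gf_mul(g[j], coef)
    return list(msg) + rem[K:]


def syndromes(received):
    """S_i = r(alpha^i) for i = 1 .. N-K; all zero for a valid codeword."""
    return [poly_eval(received, EXP[i]) for i in range(1, N - K + 1)]


codeword = encode(list(range(1, K + 1)))
corrupted = list(codeword)
corrupted[3] ^= 7  # inject a single symbol error
```

A nonzero syndrome vector for `corrupted` is what the later stages (error location polynomial, Chien search, Forney's method) would take as input.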
Verification of low-Mach number combustion codes using the method of manufactured solutions
NASA Astrophysics Data System (ADS)
Shunn, Lee; Ham, Frank; Knupp, Patrick; Moin, Parviz
2007-11-01
Many computational combustion models rely on tabulated constitutive relations to close the system of equations. As these reactive state-equations are typically multi-dimensional and highly non-linear, their implications on the convergence and accuracy of simulation codes are not well understood. In this presentation, the effects of tabulated state-relationships on the computational performance of low-Mach number combustion codes are explored using the method of manufactured solutions (MMS). Several MMS examples are developed and applied, progressing from simple one-dimensional configurations to problems involving higher dimensionality and solution-complexity. The manufactured solutions are implemented in two multi-physics hydrodynamics codes: CDP developed at Stanford University and FUEGO developed at Sandia National Laboratories. In addition to verifying the order-of-accuracy of the codes, the MMS problems help highlight certain robustness issues in existing variable-density flow-solvers. Strategies to overcome these issues are briefly discussed.
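The order-of-accuracy check at the heart of MMS can be shown on a much simpler problem than a combustion code. The sketch below is an illustrative stand-in, not taken from CDP or FUEGO: it picks the manufactured solution u(x) = sin(pi x) for the 1D problem -u'' = f, derives the matching source term f = pi^2 sin(pi x), solves with second-order central differences, and measures the observed convergence order under grid refinement.

```python
import math


def solve_poisson(n):
    """Solve -u'' = f on (0, 1) with u(0) = u(1) = 0 using n grid intervals.

    The source f is manufactured from the chosen exact solution sin(pi x).
    The tridiagonal system (diagonal 2, off-diagonals -1) is solved with the
    Thomas algorithm.
    """
    h = 1.0 / n
    xs = [i * h for i in range(1, n)]
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * x) for x in xs]
    m = n - 1
    cp = [0.0] * m  # modified super-diagonal
    dp = [0.0] * m  # modified right-hand side
    cp[0] = -1.0 / 2.0
    dp[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + cp[i - 1]
        cp[i] = -1.0 / denom
        dp[i] = (rhs[i] + dp[i - 1]) / denom
    u = [0.0] * m
    u[-1] = dp[-1]
    for i in range(m - 2, -1, -1):
        u[i] = dp[i] - cp[i] * u[i + 1]
    return xs, u


def max_error(n):
    """Maximum pointwise difference between numerical and manufactured solution."""
    xs, u = solve_poisson(n)
    return max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))


def observed_order(n):
    """Convergence order estimated from errors on grids with n and 2n intervals."""
    return math.log(max_error(n) / max_error(2 * n), 2)


order = observed_order(16)  # should be close to 2 for a second-order scheme
```

If the observed order falls short of the scheme's formal order, that points to a coding error or, as the abstract suggests for tabulated state relations, to a component that limits accuracy.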
Hypersonic simulations using open-source CFD and DSMC solvers
NASA Astrophysics Data System (ADS)
Casseau, V.; Scanlon, T. J.; John, B.; Emerson, D. R.; Brown, R. E.
2016-11-01
Hypersonic hybrid hydrodynamic-molecular gas flow solvers are required to satisfy the two essential requirements of any high-speed reacting code, these being physical accuracy and computational efficiency. The James Weir Fluids Laboratory at the University of Strathclyde is currently developing an open-source hybrid code which will eventually reconcile the direct simulation Monte-Carlo method, making use of the OpenFOAM application called dsmcFoam, and the newly coded open-source two-temperature computational fluid dynamics solver named hy2Foam. In conjunction with employing the CVDV chemistry-vibration model in hy2Foam, novel use is made of the QK rates in a CFD solver. In this paper, further testing is performed, in particular with the CFD solver, to ensure its efficacy before considering more advanced test cases. The hy2Foam and dsmcFoam codes have been shown to compare reasonably well, thus providing a useful basis for other codes to compare against.
Development of a Model and Computer Code to Describe Solar Grade Silicon Production Processes
NASA Technical Reports Server (NTRS)
Srivastava, R.; Gould, R. K.
1979-01-01
Mathematical models and computer codes based on these models, which allow prediction of the product distribution in chemical reactors for converting gaseous silicon compounds to condensed-phase silicon, were developed. The following tasks were accomplished: (1) formulation of a model for silicon vapor separation/collection from the developing turbulent flow stream within reactors of the Westinghouse (2) modification of an available general parabolic code to achieve solutions to the governing partial differential equations (boundary layer type) which describe migration of the vapor to the reactor walls, (3) a parametric study using the boundary layer code to optimize the performance characteristics of the Westinghouse reactor, (4) calculations relating to the collection efficiency of the new AeroChem reactor, and (5) final testing of the modified LAPP code for use as a method of predicting Si(l) droplet sizes in these reactors.
A Framework for Debugging Geoscience Projects in a High Performance Computing Environment
NASA Astrophysics Data System (ADS)
Baxter, C.; Matott, L.
2012-12-01
High performance computing (HPC) infrastructure has become ubiquitous in today's world with the emergence of commercial cloud computing and academic supercomputing centers. Teams of geoscientists, hydrologists and engineers can take advantage of this infrastructure to undertake large research projects - for example, linking one or more site-specific environmental models with soft computing algorithms, such as heuristic global search procedures, to perform parameter estimation and predictive uncertainty analysis, and/or design least-cost remediation systems. However, the size, complexity and distributed nature of these projects can make identifying failures in the associated numerical experiments using conventional ad-hoc approaches both time-consuming and ineffective. To address these problems a multi-tiered debugging framework has been developed. The framework allows for quickly isolating and remedying a number of potential experimental failures, including: failures in the HPC scheduler; bugs in the soft computing code; bugs in the modeling code; and permissions and access control errors. The utility of the framework is demonstrated via application to a series of over 200,000 numerical experiments involving a suite of 5 heuristic global search algorithms and 15 mathematical test functions serving as cheap analogues for the simulation-based optimization of pump-and-treat subsurface remediation systems.
Hybrid and concatenated coding applications.
NASA Technical Reports Server (NTRS)
Hofman, L. B.; Odenwalder, J. P.
1972-01-01
Results of a study to evaluate the performance and implementation complexity of a concatenated and a hybrid coding system for moderate-speed deep-space applications. It is shown that with a total complexity of less than three times that of the basic Viterbi decoder, concatenated coding improves a constraint length 8 rate 1/3 Viterbi decoding system by 1.1 and 2.6 dB at bit error probabilities of 0.0001 and one hundred millionth, respectively. With a somewhat greater total complexity, the hybrid coding system is shown to obtain a 0.9-dB computational performance improvement over the basic rate 1/3 sequential decoding system. Although substantial, these complexities are much less than those required to achieve the same performances with more complex Viterbi or sequential decoder systems.
Mir Cooperative Solar Array Flight Performance Data and Computational Analysis
NASA Technical Reports Server (NTRS)
Kerslake, Thomas W.; Hoffman, David J.
1997-01-01
The Mir Cooperative Solar Array (MCSA) was developed jointly by the United States (US) and Russia to provide approximately 6 kW of photovoltaic power to the Russian space station Mir. The MCSA was launched to Mir in November 1995 and installed on the Kvant-1 module in May 1996. Since the MCSA photovoltaic panel modules (PPMs) are nearly identical to those of the International Space Station (ISS) photovoltaic arrays, MCSA operation offered an opportunity to gather multi-year performance data on this technology prior to its implementation on ISS. Two specially designed test sequences were executed in June and December 1996 to measure MCSA performance. Each test period encompassed 3 orbital revolutions whereby the current produced by the MCSA channels was measured. The temperature of MCSA PPMs was also measured. To better interpret the MCSA flight data, a dedicated FORTRAN computer code was developed to predict the detailed thermal-electrical performance of the MCSA. Flight data compared very favorably with computational performance predictions. This indicated that the MCSA electrical performance was fully meeting pre-flight expectations. There were no measurable indications of unexpected or precipitous MCSA performance degradation due to contamination or other causes after 7 months of operation on orbit. Power delivered to the Mir bus was lower than desired as a consequence of the retrofitted power distribution cabling. The strong correlation of experimental and computational results further bolsters the confidence level of performance codes used in critical ISS electric power forecasting. In this paper, MCSA flight performance tests are described as well as the computational modeling behind the performance predictions.
One-Time Password Tokens | High-Performance Computing | NREL
For connecting to NREL's high-performance computing (HPC) systems, learn how to set up a one-time password (OTP) token for remote and privileged ... a one-time pass code from the HPC Operations team. At the sign-in screen, enter your HPC Username in ...
NASA Technical Reports Server (NTRS)
Flemming, Robert J.; Britton, Randall K.; Bond, Thomas H.
1994-01-01
The cost and time to certify or qualify a rotorcraft for flight in forecast icing has been a major impediment to the development of ice protection systems for helicopter rotors. Development and flight test programs for those aircraft that have achieved certification or qualification for flight in icing conditions have taken many years, and the costs have been very high. NASA, Sikorsky, and others have been conducting research into alternative means for providing information for the development of ice protection systems, and subsequent flight testing to substantiate the air-worthiness of a rotor ice protection system. Model rotor icing tests conducted in 1989 and 1993 have provided a data base for correlation of codes, and for the validation of wind tunnel icing test techniques. This paper summarizes this research, showing test and correlation trends as functions of cloud liquid water content, rotor lift, flight speed, and ambient temperature. Molds were made of several of the ice formations on the rotor blades. These molds were used to form simulated ice on the rotor blades, and the blades were then tested in a wind tunnel to determine flight performance characteristics. These simulated-ice rotor performance tests are discussed in the paper. The levels of correlation achieved and the role of these tools (codes and wind tunnel tests) in flight test planning, testing, and extension of flight data to the limits of the icing envelope are discussed. The potential application of simulated ice, the NASA LEWICE computer, the Sikorsky Generalized Rotor Performance aerodynamic computer code, and NASA Icing Research Tunnel rotor tests in a rotorcraft certification or qualification program are also discussed. The correlation of these computer codes with tunnel test data is presented, and a procedure or process to use these methods as part of a certification or qualification program is introduced.
SU (2) lattice gauge theory simulations on Fermi GPUs
NASA Astrophysics Data System (ADS)
Cardoso, Nuno; Bicudo, Pedro
2011-05-01
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200× the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.
Message Passing vs. Shared Address Space on a Cluster of SMPs
NASA Technical Reports Server (NTRS)
Shan, Hongzhang; Singh, Jaswinder Pal; Oliker, Leonid; Biswas, Rupak
2000-01-01
The convergence of scalable computer architectures using clusters of PCs (or PC-SMPs) with commodity networking has become an attractive platform for high end scientific computing. Currently, message-passing and shared address space (SAS) are the two leading programming paradigms for these systems. Message-passing has been standardized with MPI, and is the most common and mature programming approach. However, message-passing code development can be extremely difficult, especially for irregular structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of, and the programming effort required for, six applications under both programming models on a 32 CPU PC-SMP cluster. Our application suite consists of codes that typically do not exhibit high efficiency under shared memory programming due to their high communication to computation ratios and complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications; however, on certain classes of problems SAS performance is competitive with MPI. We also present new algorithms for improving the PC cluster performance of MPI collective operations.
NASA Technical Reports Server (NTRS)
Shankar, V.; Rowell, C.; Hall, W. F.; Mohammadian, A. H.; Schuh, M.; Taylor, K.
1992-01-01
Accurate and rapid evaluation of radar signature for alternative aircraft/store configurations would be of substantial benefit in the evolution of integrated designs that meet radar cross-section (RCS) requirements across the threat spectrum. Finite-volume time domain methods offer the possibility of modeling the whole aircraft, including penetrable regions and stores, at longer wavelengths on today's gigaflop supercomputers and at typical airborne radar wavelengths on the teraflop computers of tomorrow. A structured-grid finite-volume time domain computational fluid dynamics (CFD)-based RCS code has been developed at the Rockwell Science Center, and this code incorporates modeling techniques for general radar absorbing materials and structures. Using this work as a base, the goal of the CFD-based CEM effort is to define, implement and evaluate various code development issues suitable for rapid prototype signature prediction.
Mean Line Pump Flow Model in Rocket Engine System Simulation
NASA Technical Reports Server (NTRS)
Veres, Joseph P.; Lavelle, Thomas M.
2000-01-01
A mean line pump flow modeling method has been developed to provide a fast capability for modeling turbopumps of rocket engines. Based on this method, a mean line pump flow code PUMPA has been written that can predict the performance of pumps at off-design operating conditions, given the loss of the diffusion system at the design point. The pump code can model axial flow inducers, mixed-flow and centrifugal pumps. The code can model multistage pumps in series. The code features rapid input setup and computer run time, and is an effective analysis and conceptual design tool. The map generation capability of the code provides the map information needed for interfacing with a rocket engine system modeling code. The off-design and multistage modeling capabilities of the code permit parametric design space exploration of candidate pump configurations and provide pump performance data for engine system evaluation. The PUMPA code has been integrated with the Numerical Propulsion System Simulation (NPSS) code and an expander rocket engine system has been simulated. The mean line pump flow code runs as an integral part of the NPSS rocket engine system simulation and provides key pump performance information directly to the system model at all operating conditions.
Computer assisted performance tests of the Lyman Alpha Coronagraph
NASA Technical Reports Server (NTRS)
Parkinson, W. H.; Kohl, J. L.
1979-01-01
Preflight calibration and performance tests of the Lyman Alpha Coronagraph rocket instrument in the laboratory, with the experiment in its flight configuration and illumination levels near those expected during flight were successfully carried out using a pulse code modulation telemetry system simulator interfaced in real time to a PDP 11/10 computer system. Post acquisition data reduction programs developed and implemented on the same computer system aided in the interpretation of test and calibration data.
Coupled Aerodynamic and Structural Sensitivity Analysis of a High-Speed Civil Transport
NASA Technical Reports Server (NTRS)
Mason, B. H.; Walsh, J. L.
2001-01-01
An objective of the High Performance Computing and Communication Program at the NASA Langley Research Center is to demonstrate multidisciplinary shape and sizing optimization of a complete aerospace vehicle configuration by using high-fidelity, finite-element structural analysis and computational fluid dynamics aerodynamic analysis. In a previous study, a multi-disciplinary analysis system for a high-speed civil transport was formulated to integrate a set of existing discipline analysis codes, some of them computationally intensive. This paper is an extension of the previous study, in which the sensitivity analysis for the coupled aerodynamic and structural analysis problem is formulated and implemented. Uncoupled stress sensitivities computed with a constant load vector in a commercial finite element analysis code are compared to coupled aeroelastic sensitivities computed by finite differences. The computational expense of these sensitivity calculation methods is discussed.
Factor Structure and Incremental Validity of the Enhanced Computer-Administered Tests
1992-07-01
performance in the mechanical maintenance specialties. 14. SUBJECT TERMS Aptitude tests, ASVAB (Armed services vocational aptitude battery), CAT ...Code 11) Attn: Dir, Personnel Systems (Code 12) Attn: Dir, Testing Systems (Code 13) Attn: CAT /ASVABPMO FJB1 COMNAVCRUITCOM FT1 CNET V8 CG MCRD...test, a computerized adaptive testing version of the ASVAB ( CAT -ASVAB), the psychomotor portion of the General Aptitude Test Battery (GATB), and the
Li, Ying
2016-09-16
Fault-tolerant quantum computing in systems composed of both Majorana fermions and topologically unprotected quantum systems, e.g., superconducting circuits or quantum dots, is studied in this Letter. Errors caused by topologically unprotected quantum systems need to be corrected with error-correction schemes, for instance, the surface code. We find that the error-correction performance of such a hybrid topological quantum computer is not superior to a normal quantum computer unless the topological charge of Majorana fermions is insusceptible to noise. If errors changing the topological charge are rare, the fault-tolerance threshold is much higher than the threshold of a normal quantum computer and a surface-code logical qubit could be encoded in only tens of topological qubits instead of about 1,000 normal qubits.
A look at scalable dense linear algebra libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dongarra, J.J.; Van de Geijn, R.A.; Walker, D.W.
1992-01-01
We discuss the essential design features of a library of scalable software for performing dense linear algebra computations on distributed memory concurrent computers. The square block scattered decomposition is proposed as a flexible and general-purpose way of decomposing most, if not all, dense matrix problems. An object-oriented interface to the library permits more portable applications to be written, and is easy to learn and use, since details of the parallel implementation are hidden from the user. Experiments on the Intel Touchstone Delta system with a prototype code that uses the square block scattered decomposition to perform LU factorization are presented and analyzed. It was found that the code was both scalable and efficient, performing at about 14 GFLOPS (double precision) for the largest problem considered.
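The square block scattered decomposition can be illustrated with a small index-mapping sketch. It assumes the decomposition corresponds to what is now usually called a two-dimensional block-cyclic distribution: the matrix is cut into square blocks, and blocks are dealt out cyclically over a grid of processes. The function name and argument names are hypothetical.

```python
def owner_and_local_index(i, j, block, p_rows, p_cols):
    """Map global matrix entry (i, j) to its owning process and local position.

    block   : side length of the square blocks
    p_rows  : number of process rows in the process grid
    p_cols  : number of process columns in the process grid
    """
    bi, bj = i // block, j // block            # global block coordinates
    proc = (bi % p_rows, bj % p_cols)          # blocks are dealt out cyclically
    local = ((bi // p_rows) * block + i % block,
             (bj // p_cols) * block + j % block)
    return proc, local


def entries_per_process(n, block, p_rows, p_cols):
    """Count how many entries of an n x n matrix land on each process."""
    counts = {}
    for i in range(n):
        for j in range(n):
            proc, _ = owner_and_local_index(i, j, block, p_rows, p_cols)
            counts[proc] = counts.get(proc, 0) + 1
    return counts


# Entry (5, 2) with 2 x 2 blocks on a 2 x 3 process grid.
proc, local = owner_and_local_index(5, 2, block=2, p_rows=2, p_cols=3)
counts = entries_per_process(8, block=2, p_rows=2, p_cols=2)
```

Dealing blocks out cyclically keeps every process busy as LU factorization works through the matrix, which is the load-balance property that makes this layout attractive for dense factorizations.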
A look at scalable dense linear algebra libraries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dongarra, J.J.; Van de Geijn, R.A.; Walker, D.W.
1992-08-01
We discuss the essential design features of a library of scalable software for performing dense linear algebra computations on distributed memory concurrent computers. The square block scattered decomposition is proposed as a flexible and general-purpose way of decomposing most, if not all, dense matrix problems. An object-oriented interface to the library permits more portable applications to be written, and is easy to learn and use, since details of the parallel implementation are hidden from the user. Experiments on the Intel Touchstone Delta system with a prototype code that uses the square block scattered decomposition to perform LU factorization are presented and analyzed. It was found that the code was both scalable and efficient, performing at about 14 GFLOPS (double precision) for the largest problem considered.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chien, T.H.; Domanus, H.M.; Sha, W.T.
1993-02-01
The COMMIX-PPC computer program is an extended and improved version of earlier COMMIX codes and is specifically designed for evaluating the thermal performance of power plant condensers. The COMMIX codes are general-purpose computer programs for the analysis of fluid flow and heat transfer in complex industrial systems. In COMMIX-PPC, two major features have been added to previously published COMMIX codes. One feature is the incorporation of one-dimensional equations of conservation of mass, momentum, and energy on the tube side and the proper accounting for the thermal interaction between shell and tube side through the porous-medium approach. The other added feature is the extension of the three-dimensional conservation equations for shell-side flow to treat the flow of a multicomponent medium. COMMIX-PPC is designed to perform steady-state and transient three-dimensional analysis of fluid flow with heat transfer in a power plant condenser. However, the code is designed in a generalized fashion so that, with some modification, it can be used to analyze processes in any heat exchanger or other single-phase engineering applications. Volume I (Equations and Numerics) of this report describes in detail the basic equations, formulation, solution procedures, and models for phenomena. Volume II (User's Guide and Manual) contains the input instruction, flow charts, sample problems, and descriptions of available options and boundary conditions.
Adaptive Wavelet Coding Applied in a Wireless Control System.
Gama, Felipe O S; Silveira, Luiz F Q; Salazar, Andrés O
2017-12-13
Wireless control systems can sense, control and act on the information exchanged between the wireless sensor nodes in a control loop. However, the exchanged information becomes susceptible to the degenerative effects produced by the multipath propagation. In order to minimize the destructive effects characteristic of wireless channels, several techniques have been investigated recently. Among them, wavelet coding is a good alternative for wireless communications for its robustness to the effects of multipath and its low computational complexity. This work proposes an adaptive wavelet coding whose parameters of code rate and signal constellation can vary according to the fading level and evaluates the use of this transmission system in a control loop implemented by wireless sensor nodes. The performance of the adaptive system was evaluated in terms of bit error rate (BER) versus E_b/N_0 and spectral efficiency, considering a time-varying channel with flat Rayleigh fading, and in terms of processing overhead on a control system with wireless communication. The results obtained through computational simulations and experimental tests show performance gains obtained by insertion of the adaptive wavelet coding in a control loop with nodes interconnected by wireless link. These results enable the use of this technique in a wireless link control loop.
Roads towards fault-tolerant universal quantum computation
NASA Astrophysics Data System (ADS)
Campbell, Earl T.; Terhal, Barbara M.; Vuillot, Christophe
2017-09-01
A practical quantum computer must not merely store information, but also process it. To prevent errors introduced by noise from multiplying and spreading, a fault-tolerant computational architecture is required. Current experiments are taking the first steps toward noise-resilient logical qubits. But to convert these quantum devices from memories to processors, it is necessary to specify how a universal set of gates is performed on them. The leading proposals for doing so, such as magic-state distillation and colour-code techniques, have high resource demands. Alternative schemes, such as those that use high-dimensional quantum codes in a modular architecture, have potential benefits, but need to be explored further.
Roads towards fault-tolerant universal quantum computation.
Campbell, Earl T; Terhal, Barbara M; Vuillot, Christophe
2017-09-13
A practical quantum computer must not merely store information, but also process it. To prevent errors introduced by noise from multiplying and spreading, a fault-tolerant computational architecture is required. Current experiments are taking the first steps toward noise-resilient logical qubits. But to convert these quantum devices from memories to processors, it is necessary to specify how a universal set of gates is performed on them. The leading proposals for doing so, such as magic-state distillation and colour-code techniques, have high resource demands. Alternative schemes, such as those that use high-dimensional quantum codes in a modular architecture, have potential benefits, but need to be explored further.
Determinant Computation on the GPU using the Condensation Method
NASA Astrophysics Data System (ADS)
Anisul Haque, Sardar; Moreno Maza, Marc
2012-02-01
We report on a GPU implementation of the condensation method designed by Abdelmalek Salem and Kouachi Said for computing the determinant of a matrix. We consider two types of coefficients: modular integers and floating point numbers. We evaluate the performance of our code by measuring its effective bandwidth and argue that it is numerically stable in the floating point number case. In addition, we compare our code with serial implementations of determinant computation from well-known mathematical packages. Our results suggest that a GPU implementation of the condensation method has a large potential for improving those packages in terms of running time and numerical stability.
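To make the idea of determinant condensation concrete, here is a minimal serial sketch in Python. It uses Chio's classical pivotal condensation, which is related to but not necessarily identical to the Salem and Said variant named in the abstract, and it is not the GPU code described there. The function name `det_condensation` is an illustrative assumption. Each pass replaces an n x n matrix with an (n-1) x (n-1) matrix of 2 x 2 minors and records the pivot power a11^(n-2) that must be divided out at the end.

```python
from fractions import Fraction


def det_condensation(matrix):
    """Determinant by Chio's pivotal condensation (illustrative sketch).

    Exact rational arithmetic is used so that the result is not affected
    by floating point rounding. This is a serial demonstration only.
    """
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    if n == 0:
        return Fraction(1)
    sign = 1
    scale = Fraction(1)  # accumulated product of pivot ** (n - 2)
    while n > 1:
        # Find a row whose first entry is nonzero to serve as the pivot row.
        pivot_row = next((r for r in range(n) if a[r][0] != 0), None)
        if pivot_row is None:
            return Fraction(0)  # first column is all zeros
        if pivot_row != 0:
            a[0], a[pivot_row] = a[pivot_row], a[0]
            sign = -sign  # a row swap flips the sign of the determinant
        p = a[0][0]
        # Condense: every new entry is a 2 x 2 minor built around the pivot.
        a = [
            [p * a[i][j] - a[i][0] * a[0][j] for j in range(1, n)]
            for i in range(1, n)
        ]
        scale *= p ** (n - 2)
        n -= 1
    return sign * a[0][0] / scale


print(det_condensation([[2, 1, 3], [0, 4, 1], [5, 2, 6]]))
```

Every entry of the condensed matrix depends only on the pivot row, the pivot column, and one other entry, so the entries can be computed independently of one another; this is the property that makes condensation schemes natural candidates for data-parallel hardware.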
Parallel Semi-Implicit Spectral Element Atmospheric Model
NASA Astrophysics Data System (ADS)
Fournier, A.; Thomas, S.; Loft, R.
2001-05-01
The shallow-water equations (SWE) have long been used to test atmospheric-modeling numerical methods. The SWE contain essential wave-propagation and nonlinear effects of more complete models. We present a semi-implicit (SI) improvement of the Spectral Element Atmospheric Model to solve the SWE (SEAM, Taylor et al. 1997, Fournier et al. 2000, Thomas & Loft 2000). SE methods are h-p finite element methods combining the geometric flexibility of size-h finite elements with the accuracy of degree-p spectral methods. Our work suggests that exceptional parallel-computation performance is achievable by a General-Circulation-Model (GCM) dynamical core, even at modest climate-simulation resolutions (>1°). The code derivation involves weak variational formulation of the SWE, Gauss(-Lobatto) quadrature over the collocation points, and Legendre cardinal interpolators. Appropriate weak variation yields a symmetric positive-definite Helmholtz operator. To meet the Ladyzhenskaya-Babuska-Brezzi inf-sup condition and avoid spurious modes, we use a staggered grid. The SI scheme combines leapfrog and Crank-Nicolson schemes for the nonlinear and linear terms respectively. The localization of operations to elements ideally fits the method to cache-based microprocessor computer architectures: derivatives are computed as collections of small (8x8), naturally cache-blocked matrix-vector products. SEAM also has desirable boundary-exchange communication, like finite-difference models. Timings on the IBM SP and Compaq ES40 supercomputers indicate that the SI code (20-min timestep) requires 1/3 the CPU time of the explicit code (2-min timestep) for T42 resolutions. Both codes scale nearly linearly out to 400 processors. We achieved single-processor performance up to 30% of peak for both codes on the 375-MHz IBM Power-3 processors. Fast computation and linear scaling lead to a useful climate-simulation dycore only if enough model time is computed per unit wall-clock time. An efficient SI solver is essential to substantially increase this rate. Parallel preconditioning for an iterative conjugate-gradient elliptic solver is described. We are building a GCM dycore capable of 200 GFLOPS sustained performance on clustered RISC/cache architectures using hybrid MPI/OpenMP programming.
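The semi-implicit idea of treating nonlinear terms with leapfrog and linear terms with Crank-Nicolson can be illustrated on a scalar model problem du/dt = N(u) + L*u. The sketch below is a toy under stated assumptions, not the SEAM code: the function names, the scalar setting, and the first-order startup step are all illustrative choices. The update solves (u_next - u_prev) / (2*dt) = N(u_curr) + L * (u_next + u_prev) / 2 for u_next.

```python
def semi_implicit_step(u_prev, u_curr, dt, linear_coeff, nonlinear):
    """One leapfrog / Crank-Nicolson step for du/dt = N(u) + L*u."""
    rhs = (1 + dt * linear_coeff) * u_prev + 2 * dt * nonlinear(u_curr)
    return rhs / (1 - dt * linear_coeff)


def integrate(u0, dt, steps, linear_coeff, nonlinear):
    """Advance `steps` time steps and return the full trajectory."""
    # Startup (an assumption of this sketch): one first-order step,
    # explicit in the nonlinear term and implicit in the linear term.
    u_prev = u0
    u_curr = (u0 + dt * nonlinear(u0)) / (1 - dt * linear_coeff)
    history = [u_prev, u_curr]
    for _ in range(steps - 1):
        u_prev, u_curr = u_curr, semi_implicit_step(
            u_prev, u_curr, dt, linear_coeff, nonlinear
        )
        history.append(u_curr)
    return history


# Fast linear wave (L = 10i) with a time step ten times larger than the
# explicit leapfrog stability limit dt * |L| <= 1.
trajectory = integrate(1 + 0j, 1.0, 100, 10j, lambda u: 0.0)
print(max(abs(u) for u in trajectory))
```

For a purely oscillatory linear term the amplification factor (1 + dt*L) / (1 - dt*L) has modulus one, so the fast wave stays bounded for any time step. That is the mechanism that allows a semi-implicit code to take a much longer time step than an explicit one, at the price of a linear solve per step (a Helmholtz problem in the abstract's setting).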
NASA Technical Reports Server (NTRS)
Lawrence, Charles; Putt, Charles W.
1997-01-01
The Visual Computing Environment (VCE) is a NASA Lewis Research Center project to develop a framework for intercomponent and multidisciplinary computational simulations. Many current engineering analysis codes simulate various aspects of aircraft engine operation. For example, existing computational fluid dynamics (CFD) codes can model the airflow through individual engine components such as the inlet, compressor, combustor, turbine, or nozzle. Currently, these codes are run in isolation, making intercomponent and complete system simulations very difficult to perform. In addition, management and utilization of these engineering codes for coupled component simulations is a complex, laborious task, requiring substantial experience and effort. To facilitate multicomponent aircraft engine analysis, the CFD Research Corporation (CFDRC) is developing the VCE system. This system, which is part of NASA's Numerical Propulsion Simulation System (NPSS) program, can couple various engineering disciplines, such as CFD, structural analysis, and thermal analysis. The objectives of VCE are to (1) develop a visual computing environment for controlling the execution of individual simulation codes that are running in parallel and are distributed on heterogeneous host machines in a networked environment, (2) develop numerical coupling algorithms for interchanging boundary conditions between codes with arbitrary grid matching and different levels of dimensionality, (3) provide a graphical interface for simulation setup and control, and (4) provide tools for online visualization and plotting. VCE was designed to provide a distributed, object-oriented environment. Mechanisms are provided for creating and manipulating objects, such as grids, boundary conditions, and solution data. This environment includes parallel virtual machine (PVM) for distributed processing. 
Users can interactively select and couple any set of codes that have been modified to run in a parallel distributed fashion on a cluster of heterogeneous workstations. A scripting facility allows users to dictate the sequence of events that make up the particular simulation.
ODECS -- A computer code for the optimal design of S.I. engine control strategies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arsie, I.; Pianese, C.; Rizzo, G.
1996-09-01
The computer code ODECS (Optimal Design of Engine Control Strategies) for the design of Spark Ignition engine control strategies is presented. This code has been developed starting from the authors' activity in this field, drawing on some original contributions about engine stochastic optimization and dynamical models. This code has a modular structure and is composed of a user interface for the definition, the execution and the analysis of different computations performed with 4 independent modules. These modules allow the following calculations: (1) definition of the engine mathematical model from steady-state experimental data; (2) engine cycle test trajectory corresponding to a vehicle transient simulation test such as ECE15 or FTP drive test schedule; (3) evaluation of the optimal engine control maps with a steady-state approach; (4) engine dynamic cycle simulation and optimization of static control maps and/or dynamic compensation strategies, taking into account dynamical effects due to the unsteady fluxes of air and fuel and the influences of combustion chamber wall thermal inertia on fuel consumption and emissions. Moreover, in the last two modules it is possible to account for errors generated by a non-deterministic behavior of sensors and actuators and the related influences on global engine performances, and compute robust strategies, less sensitive to stochastic effects. In the paper the four models are described together with significant results corresponding to the simulation and the calculation of optimal control strategies for dynamic transient tests.
Production Level CFD Code Acceleration for Hybrid Many-Core Architectures
NASA Technical Reports Server (NTRS)
Duffy, Austen C.; Hammond, Dana P.; Nielsen, Eric J.
2012-01-01
In this work, a novel graphics processing unit (GPU) distributed sharing model for hybrid many-core architectures is introduced and employed in the acceleration of a production-level computational fluid dynamics (CFD) code. The latest generation graphics hardware allows multiple processor cores to simultaneously share a single GPU through concurrent kernel execution. This feature has allowed the NASA FUN3D code to be accelerated in parallel with up to four processor cores sharing a single GPU. For codes to scale and fully use resources on these and the next generation machines, codes will need to employ some type of GPU sharing model, as presented in this work. Findings include the effects of GPU sharing on overall performance. A discussion of the inherent challenges that parallel unstructured CFD codes face in accelerator-based computing environments is included, with considerations for future generation architectures. This work was completed by the author in August 2010, and reflects the analysis and results of the time.
Real science at the petascale.
Saksena, Radhika S; Boghosian, Bruce; Fazendeiro, Luis; Kenway, Owain A; Manos, Steven; Mazzeo, Marco D; Sadiq, S Kashif; Suter, James L; Wright, David; Coveney, Peter V
2009-06-28
We describe computational science research that uses petascale resources to achieve scientific results at unprecedented scales and resolution. The applications span a wide range of domains, from investigation of fundamental problems in turbulence through computational materials science research to biomedical applications at the forefront of HIV/AIDS research and cerebrovascular haemodynamics. This work was mainly performed on the US TeraGrid 'petascale' resource, Ranger, at Texas Advanced Computing Center, in the first half of 2008 when it was the largest computing system in the world available for open scientific research. We have sought to use this petascale supercomputer optimally across application domains and scales, exploiting the excellent parallel scaling performance found on up to at least 32 768 cores for certain of our codes in the so-called 'capability computing' category as well as high-throughput intermediate-scale jobs for ensemble simulations in the 32-512 core range. Furthermore, this activity provides evidence that conventional parallel programming with MPI should be successful at the petascale in the short to medium term. We also report on the parallel performance of some of our codes on up to 65 636 cores on the IBM Blue Gene/P system at the Argonne Leadership Computing Facility, which has recently been named the fastest supercomputer in the world for open science.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Medin, Stanislav A.; Basko, Mikhail M.; Orlov, Yurii N.
2012-07-11
Radiation hydrodynamics 1D simulations were performed with two concurrent codes, DEIRA and RAMPHY. The DEIRA code was used for DT capsule implosion and burn, and the RAMPHY code was used for computation of X-ray and fast ions deposition in the first wall liquid film of the reactor chamber. The simulations were run for 740 MJ direct drive DT capsule and Pb thin liquid wall reactor chamber of 10 m diameter. Temporal profiles for DT capsule leaking power of X-rays, neutrons and fast {sup 4}He ions were obtained and spatial profiles of the liquid film flow parameter were computed and analyzed.
Spectral-element Seismic Wave Propagation on CUDA/OpenCL Hardware Accelerators
NASA Astrophysics Data System (ADS)
Peter, D. B.; Videau, B.; Pouget, K.; Komatitsch, D.
2015-12-01
Seismic wave propagation codes are essential tools to investigate a variety of wave phenomena in the Earth. Furthermore, they can now be used for seismic full-waveform inversions in regional- and global-scale adjoint tomography. Although these seismic wave propagation solvers are crucial ingredients to improve the resolution of tomographic images to answer important questions about the nature of Earth's internal processes and subsurface structure, their practical application is often limited due to high computational costs. They thus need high-performance computing (HPC) facilities to improve the current state of knowledge. At present, numerous large HPC systems embed many-core architectures such as graphics processing units (GPUs) to enhance numerical performance. Such hardware accelerators can be programmed using either the CUDA programming environment or the OpenCL language standard. CUDA software development targets NVIDIA graphics cards while OpenCL was adopted by additional hardware accelerators, such as AMD graphics cards, ARM-based processors and Intel Xeon Phi coprocessors. For seismic wave propagation simulations using the open-source spectral-element code package SPECFEM3D_GLOBE, we incorporated an automatic source-to-source code generation tool (BOAST) which allows us to use meta-programming of all computational kernels for forward and adjoint runs. Using our BOAST kernels, we generate optimized source code for both CUDA and OpenCL languages within the source code package. Thus, seismic wave simulations can now fully utilize CUDA and OpenCL hardware accelerators. We show benchmarks of forward seismic wave propagation simulations using SPECFEM3D_GLOBE on CUDA/OpenCL GPUs, validating results and comparing performances for different simulations and hardware usages.
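The source-to-source approach described above, one kernel description emitted as either CUDA or OpenCL source, can be sketched with plain string templates. The example below is a deliberately simplified illustration in Python. It does not use or imitate the BOAST API; the function `generate_kernel` and the `axpy` kernel are hypothetical names chosen for this sketch.

```python
# Shared arithmetic body: identical C syntax in CUDA and OpenCL.
KERNEL_BODY = "    if (i < n) { y[i] = a * x[i] + y[i]; }"


def generate_kernel(target):
    """Return source text of a toy axpy kernel for 'cuda' or 'opencl'."""
    if target == "cuda":
        header = ("__global__ void axpy(const int n, const float a, "
                  "const float* x, float* y) {")
        index = "    int i = blockIdx.x * blockDim.x + threadIdx.x;"
    elif target == "opencl":
        header = ("__kernel void axpy(const int n, const float a, "
                  "__global const float* x, __global float* y) {")
        index = "    int i = get_global_id(0);"
    else:
        raise ValueError(f"unknown target: {target!r}")
    return "\n".join([header, index, KERNEL_BODY, "}"])


for target in ("cuda", "opencl"):
    print(generate_kernel(target))
    print()
```

Only the function qualifiers, the address-space annotations, and the thread-index expression differ between the two outputs. Keeping the numerical body in one place is what lets a generator maintain a single kernel definition for both back ends.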
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers
Wang, Bei; Ethier, Stephane; Tang, William; ...
2017-06-29
The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Bei; Ethier, Stephane; Tang, William
The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
High Performance Object-Oriented Scientific Programming in Fortran 90
NASA Technical Reports Server (NTRS)
Norton, Charles D.; Decyk, Viktor K.; Szymanski, Boleslaw K.
1997-01-01
We illustrate how Fortran 90 supports object-oriented concepts by example of plasma particle computations on the IBM SP. Our experience shows that Fortran 90 and object-oriented methodology give high performance while providing a bridge from Fortran 77 legacy codes to modern programming principles. All of our object-oriented Fortran 90 codes execute more quickly than the equivalent C++ versions, yet the abstraction modelling capabilities used for scientific programming are comparably powerful.
TADSim: Discrete Event-based Performance Prediction for Temperature Accelerated Dynamics
Mniszewski, Susan M.; Junghans, Christoph; Voter, Arthur F.; ...
2015-04-16
Next-generation high-performance computing will require more scalable and flexible performance prediction tools to evaluate software-hardware co-design choices relevant to scientific applications and hardware architectures. Here, we present a new class of tools called application simulators: parameterized fast-running proxies of large-scale scientific applications using parallel discrete event simulation. Parameterized choices for the algorithmic method and hardware options provide a rich space for design exploration and allow us to quickly find well-performing software-hardware combinations. We demonstrate our approach with a TADSim simulator that models the temperature-accelerated dynamics (TAD) method, an algorithmically complex and parameter-rich member of the accelerated molecular dynamics (AMD) family of molecular dynamics methods. The essence of the TAD application is captured without the computational expense and resource usage of the full code. We accomplish this by identifying the time-intensive elements, quantifying algorithm steps in terms of those elements, abstracting them out, and replacing them by the passage of time. We use TADSim to quickly characterize the runtime performance and algorithmic behavior for the otherwise long-running simulation code. We extend TADSim to model algorithm extensions, such as speculative spawning of the compute-bound stages, and predict performance improvements without having to implement such a method. Validation against the actual TAD code shows close agreement for the evolution of an example physical system, a silver surface. Finally, focused parameter scans have allowed us to study algorithm parameter choices over far more scenarios than would be possible with the actual simulation. This has led to interesting performance-related insights and suggested extensions.
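The core trick described above, replacing expensive computation with the passage of simulated time, can be shown with a very small discrete event simulator. The Python sketch below is a serial illustration only and is not TADSim. The `Simulator` class, the stage names, and the stage costs are hypothetical values invented for this example.

```python
import heapq
import itertools


class Simulator:
    """Minimal discrete event engine with a virtual clock."""

    def __init__(self):
        self.now = 0.0
        self._queue = []
        self._counter = itertools.count()  # tie-breaker for equal times

    def schedule(self, delay, handler, *args):
        """Queue `handler(sim, *args)` to fire `delay` time units from now."""
        event = (self.now + delay, next(self._counter), handler, args)
        heapq.heappush(self._queue, event)

    def run(self):
        """Process events in time order; return the final virtual time."""
        while self._queue:
            self.now, _, handler, args = heapq.heappop(self._queue)
            handler(self, *args)
        return self.now


def run_stage(sim, stages, index, log):
    """Start stage `index`; its cost is modeled purely as elapsed time."""
    if index == len(stages):
        return
    name, cost = stages[index]
    log.append((sim.now, name))
    sim.schedule(cost, run_stage, stages, index + 1, log)


def predict_runtime(stages):
    """Return (total virtual time, start log) for a chain of stages."""
    sim = Simulator()
    log = []
    sim.schedule(0.0, run_stage, stages, 0, log)
    return sim.run(), log


# Hypothetical stage costs in arbitrary time units.
total, log = predict_runtime(
    [("md_block", 5.0), ("saddle_search", 2.0), ("bookkeeping", 0.5)]
)
print(total, log)
```

Because no stage does real work, changing a cost or reordering stages and rerunning takes negligible time, which is what makes wide parameter scans affordable in this style of tool.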
NASA Astrophysics Data System (ADS)
Chang, S. L.; Lottes, S. A.; Berry, G. F.
Argonne National Laboratory is investigating the non-reacting jet-gas mixing patterns in a magnetohydrodynamics (MHD) second stage combustor by using a three-dimensional single-phase hydrodynamics computer program. The computer simulation is intended to enhance the understanding of flow and mixing patterns in the combustor, which in turn may improve downstream MHD channel performance. The code is used to examine the three-dimensional effects of the side walls and the distributed jet flows on the non-reacting jet-gas mixing patterns. The code solves the conservation equations of mass, momentum, and energy, and a transport equation of a turbulence parameter and allows permeable surfaces to be specified for any computational cell.
An implementation of a tree code on a SIMD, parallel computer
NASA Technical Reports Server (NTRS)
Olson, Kevin M.; Dorband, John E.
1994-01-01
We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k processor Maspar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally interacting disk galaxies using 65,636 particles. We also simulate the formation of structure in an expanding, model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single instruction, multiple data (SIMD) type computers can be used for these simulations. The cost/performance ratio for SIMD machines like the Maspar MP-1 makes them an extremely attractive alternative to either vector processors or large multiple instruction, multiple data (MIMD) type parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.
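The tree construction described above (sort, then split recursively along x, y and z, with each node summarized as a monopole) can be sketched serially in Python. This is an illustration of the data structure only, not the SIMD implementation from the abstract. The function names and the dictionary layout are assumptions of this sketch, and the input must contain at least one particle with nonzero total mass.

```python
def build_tree(particles, axis=0):
    """Build a balanced binary tree over (mass, (x, y, z)) particles.

    The particle list is sorted along the current axis and cut in half;
    the axis cycles x -> y -> z at each level. Each node stores its
    monopole moment: total mass and centre of mass.
    """
    total_mass = sum(mass for mass, _ in particles)
    centre = tuple(
        sum(mass * pos[k] for mass, pos in particles) / total_mass
        for k in range(3)
    )
    node = {"mass": total_mass, "com": centre, "children": []}
    if len(particles) > 1:
        ordered = sorted(particles, key=lambda mp: mp[1][axis])
        half = len(ordered) // 2
        next_axis = (axis + 1) % 3
        node["children"] = [
            build_tree(ordered[:half], next_axis),
            build_tree(ordered[half:], next_axis),
        ]
    return node


def depth(node):
    """Number of levels in the tree, counting the root as level 1."""
    return 1 + max((depth(child) for child in node["children"]), default=0)


# Eight unit-mass particles on the corners of the unit cube.
corners = [(1.0, (x, y, z)) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
root = build_tree(corners)
print(root["mass"], root["com"], depth(root))
```

With a power-of-two particle count every split is exact, so all leaves sit at the same depth. That regularity is what makes the structure convenient to map onto a machine where all processors execute the same instruction.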
Geant4 Computing Performance Benchmarking and Monitoring
Dotti, Andrea; Elvira, V. Daniel; Folger, Gunter; ...
2015-12-23
Performance evaluation and analysis of large scale computing applications is essential for optimal use of resources. As detector simulation is one of the most compute intensive tasks and Geant4 is the simulation toolkit most widely used in contemporary high energy physics (HEP) experiments, it is important to monitor Geant4 through its development cycle for changes in computing performance and to identify problems and opportunities for code improvements. All Geant4 development and public releases are being profiled with a set of applications that utilize different input event samples, physics parameters, and detector configurations. Results from multiple benchmarking runs are compared to previous public and development reference releases to monitor CPU and memory usage. Observed changes are evaluated and correlated with code modifications. Besides the full summary of call stack and memory footprint, a detailed call graph analysis is available to Geant4 developers for further analysis. The set of software tools used in the performance evaluation procedure, both in sequential and multi-threaded modes, include FAST, IgProf and Open|Speedshop. In conclusion, the scalability of the CPU time and memory performance in multi-threaded application is evaluated by measuring event throughput and memory gain as a function of the number of threads for selected event samples.
Numerical Simulation of a High-Lift Configuration with Embedded Fluidic Actuators
NASA Technical Reports Server (NTRS)
Vatsa, Veer N.; Casalino, Damiano; Lin, John C.; Appelbaum, Jason
2014-01-01
Numerical simulations have been performed for a vertical tail configuration with deflected rudder. The suction surface of the main element of this configuration is embedded with an array of 32 fluidic actuators that produce oscillating sweeping jets. Such oscillating jets have been found to be very effective for flow control applications in the past. In the current paper, a high-fidelity computational fluid dynamics (CFD) code known as the PowerFLOW® code is used to simulate the entire flow field associated with this configuration, including the flow inside the actuators. The computed results for the surface pressure and integrated forces compare favorably with measured data. In addition, numerical solutions predict the correct trends in forces with active flow control compared to the no control case. The effects of varying yaw and rudder deflection angles are also presented. In addition, computations have been performed at a higher Reynolds number to assess the performance of fluidic actuators at flight conditions.
NASA Technical Reports Server (NTRS)
Massey, J. L.
1976-01-01
Virtually all previously-suggested rate 1/2 binary convolutional codes with KE = 24 are compared. Their distance properties are given; and their performance, both in computation and in error probability, with sequential decoding on the deep-space channel is determined by simulation. Recommendations are made both for the choice of a specific KE = 24 code as well as for codes to be included in future coding standards for the deep-space channel. A new result given in this report is a method for determining the statistical significance of error probability data when the error probability is so small that it is not feasible to perform enough decoding simulations to obtain more than a very small number of decoding errors.
Software for Brain Network Simulations: A Comparative Study
Tikidji-Hamburyan, Ruben A.; Narayana, Vikram; Bozkus, Zeki; El-Ghazawi, Tarek A.
2017-01-01
Numerical simulations of brain networks are a critical part of our efforts in understanding brain functions under pathological and normal conditions. For several decades, the community has developed many software packages and simulators to accelerate research in computational neuroscience. In this article, we select the three most popular simulators, as determined by the number of models in the ModelDB database (NEURON, GENESIS, and BRIAN), and perform an independent evaluation of these simulators. In addition, we study NEST, one of the lead simulators of the Human Brain Project. First, we study them based on one of the most important characteristics, the range of supported models. Our investigation reveals that brain network simulators may be biased toward supporting a specific set of models. However, all simulators tend to expand the supported range of models by providing a universal environment for the computational study of individual neurons and brain networks. Next, our investigations on the characteristics of computational architecture and efficiency indicate that all simulators compile the most computationally intensive procedures into binary code, with the aim of maximizing their computational performance. However, not all simulators provide the simplest method for module development and/or guarantee efficient binary code. Third, a study of their amenability for high-performance computing reveals that NEST can almost transparently map an existing model on a cluster or multicore computer, while NEURON requires code modification if the model developed for a single computer has to be mapped on a computational cluster. Interestingly, parallelization is the weakest characteristic of BRIAN, which provides no support for cluster computations and limited support for multicore computers. Fourth, we identify the level of user support and frequency of usage for all simulators. 
Finally, we carry out an evaluation using two case studies: a large network with simplified neural and synaptic models and a small network with detailed models. These two case studies allow us to avoid any bias toward a particular software package. The results indicate that BRIAN provides the most concise language for both cases considered. Furthermore, as expected, NEST mostly favors large network models, while NEURON is better suited for detailed models. Overall, the case studies reinforce our general observation that simulators have a bias in the computational performance toward specific types of the brain network models. PMID:28775687
NASA Technical Reports Server (NTRS)
Lawson, Gary; Sosonkina, Masha; Baurle, Robert; Hammond, Dana
2017-01-01
In many fields, real-world applications for High Performance Computing have already been developed. For these applications to stay up-to-date, new parallel strategies must be explored to yield the best performance; however, restructuring or modifying a real-world application may be daunting depending on the size of the code. In this case, a mini-app may be employed to quickly explore such options without modifying the entire code. In this work, several mini-apps have been created to enhance a real-world application performance, namely the VULCAN code for complex flow analysis developed at the NASA Langley Research Center. These mini-apps explore hybrid parallel programming paradigms with Message Passing Interface (MPI) for distributed memory access and either Shared MPI (SMPI) or OpenMP for shared memory accesses. Performance testing shows that MPI+SMPI yields the best execution performance, while requiring the largest number of code changes. A maximum speedup of 23 was measured for MPI+SMPI, but only 11 was measured for MPI+OpenMP.
NASA Astrophysics Data System (ADS)
Iwasaki, Y.
1997-02-01
The CP-PACS computer with a peak speed of 300 Gflops was completed in March 1996 and has started to operate. We describe the final specification and the hardware implementation of the CP-PACS computer, and its performance for QCD codes. A plan for the upgrade of the computer scheduled for fall of 1996 is also given.
The Design and Evaluation of "CAPTools"--A Computer Aided Parallelization Toolkit
NASA Technical Reports Server (NTRS)
Yan, Jerry; Frumkin, Michael; Hribar, Michelle; Jin, Haoqiang; Waheed, Abdul; Johnson, Steve; Cross, Jark; Evans, Emyr; Ierotheou, Constantinos; Leggett, Pete;
1998-01-01
Writing applications for high performance computers is a challenging task. Although writing code by hand still offers the best performance, it is extremely costly and often not very portable. The Computer Aided Parallelization Tools (CAPTools) are a toolkit designed to help automate the mapping of sequential FORTRAN scientific applications onto multiprocessors. CAPTools consists of the following major components: an inter-procedural dependence analysis module that incorporates user knowledge; a 'self-propagating' data partitioning module driven via user guidance; an execution control mask generation and optimization module for the user to fine tune parallel processing of individual partitions; a program transformation/restructuring facility for source code clean up and optimization; a set of browsers through which the user interacts with CAPTools at each stage of the parallelization process; and a code generator supporting multiple programming paradigms on various multiprocessors. Besides describing the rationale behind the architecture of CAPTools, the parallelization process is illustrated via case studies involving structured and unstructured meshes. The programming process and the performance of the generated parallel programs are compared against other programming alternatives based on the NAS Parallel Benchmarks, ARC3D and other scientific applications. Based on these results, a discussion on the feasibility of constructing architectural independent parallel applications is presented.
A survey of compiler optimization techniques
NASA Technical Reports Server (NTRS)
Schneck, P. B.
1972-01-01
Major optimization techniques of compilers are described and grouped into three categories: machine dependent, architecture dependent, and architecture independent. Machine-dependent optimizations tend to be local and are performed upon short spans of generated code by using particular properties of an instruction set to reduce the time or space required by a program. Architecture-dependent optimizations are global and are performed while generating code. These optimizations consider the structure of a computer, but not its detailed instruction set. Architecture independent optimizations are also global but are based on analysis of the program flow graph and the dependencies among statements of source program. A conceptual review of a universal optimizer that performs architecture-independent optimizations at source-code level is also presented.
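As an illustration of an architecture-independent optimization applied at source-code level, the following is a minimal sketch of constant folding written against Python's standard `ast` module. It is not taken from the survey; the names `ConstantFolder` and `fold_constants` are hypothetical, and only addition, subtraction, and multiplication of numeric literals are folded.

```python
import ast
import operator

# Operators that are safe to evaluate at "compile time" in this sketch.
# Division is deliberately left out so that folding never raises ZeroDivisionError.
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace binary operations on numeric literals with their value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        op = _FOLDABLE.get(type(node.op))
        both_constant = (
            isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        )
        if op is not None and both_constant:
            folded = ast.Constant(value=op(node.left.value, node.right.value), kind=None)
            return ast.copy_location(folded, node)
        return node


def fold_constants(source: str) -> str:
    """Return the source text with constant subexpressions pre-evaluated."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later


print(fold_constants("y = x * (2 + 3 * 4)"))
```

The transformation depends only on the structure of the program, not on any instruction set, which is what places it in the architecture-independent category described above.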
NASA Technical Reports Server (NTRS)
Ingels, F.; Schoggen, W. O.
1981-01-01
The various methods of high bit transition density encoding are presented, and their relative performance is compared with respect to error propagation characteristics, transition properties and system constraints. A computer simulation of the system using the specific PN code recommended is included.
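One common way to raise bit transition density is to XOR the data stream with a pseudo-noise (PN) sequence from a linear feedback shift register. The sketch below illustrates that general idea only; the abstract does not state which PN code was recommended, so the 7-bit register, the feedback taps, and the seed are all assumptions, as are the function names.

```python
def pn_sequence(seed=0b1010101, nbits=7):
    """Yield bits from a Fibonacci LFSR (register size and taps are assumed)."""
    state = seed & ((1 << nbits) - 1)
    if state == 0:
        raise ValueError("seed must be non-zero, or the register stays at zero")
    while True:
        yield state & 1
        # Feedback is the XOR of the two lowest-order bits of the register.
        feedback = (state ^ (state >> 1)) & 1
        state = (state >> 1) | (feedback << (nbits - 1))


def scramble(bits, seed=0b1010101):
    """XOR data with the PN sequence; applying it twice restores the data."""
    return [b ^ p for b, p in zip(bits, pn_sequence(seed))]


def transitions(bits):
    """Count 0->1 and 1->0 changes between neighbouring bits."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


data = [0] * 64                      # worst case: no transitions at all
coded = scramble(data)
print(transitions(data), transitions(coded))
```

Because the same sequence is used on both ends, a single bit error in the channel stays a single bit error after descrambling in this additive scheme.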
Exploring Asynchronous Many-Task Runtime Systems toward Extreme Scales
DOE Office of Scientific and Technical Information (OSTI.GOV)
Knight, Samuel; Baker, Gavin Matthew; Gamell, Marc
2015-10-01
Major exascale computing reports indicate a number of software challenges to meet the dramatic change of system architectures in the near future. While several-orders-of-magnitude increase in parallelism is the most commonly cited of those, hurdles also include performance heterogeneity of compute nodes across the system, increased imbalance between computational capacity and I/O capabilities, frequent system interrupts, and complex hardware architectures. Asynchronous task-parallel programming models show great promise in addressing these issues, but are not yet fully understood nor developed sufficiently for computational science and engineering application codes. We address these knowledge gaps through quantitative and qualitative exploration of leading candidate solutions in the context of engineering applications at Sandia. In this poster, we evaluate MiniAero code ported to three leading candidate programming models (Charm++, Legion and UINTAH) to examine the feasibility of these models that permit insertion of new programming model elements into an existing code base.
Metrics for comparing dynamic earthquake rupture simulations
Barall, Michael; Harris, Ruth A.
2014-01-01
Earthquakes are complex events that involve a myriad of interactions among multiple geologic features and processes. One of the tools that is available to assist with their study is computer simulation, particularly dynamic rupture simulation. A dynamic rupture simulation is a numerical model of the physical processes that occur during an earthquake. Starting with the fault geometry, friction constitutive law, initial stress conditions, and assumptions about the condition and response of the near‐fault rocks, a dynamic earthquake rupture simulation calculates the evolution of fault slip and stress over time as part of the elastodynamic numerical solution (Ⓔ see the simulation description in the electronic supplement to this article). The complexity of the computations in a dynamic rupture simulation makes it challenging to verify that the computer code is operating as intended, because there are no exact analytic solutions against which these codes’ results can be directly compared. One approach for checking if dynamic rupture computer codes are working satisfactorily is to compare each code’s results with the results of other dynamic rupture codes running the same earthquake simulation benchmark. To perform such a comparison consistently, it is necessary to have quantitative metrics. In this paper, we present a new method for quantitatively comparing the results of dynamic earthquake rupture computer simulation codes.
Euler Calculations at Off-Design Conditions for an Inlet of Inward Turning RBCC-SSTO Vehicle
NASA Technical Reports Server (NTRS)
Takashima, N.; Kothari, A. P.
1998-01-01
The inviscid performance of an inward turning inlet design is calculated computationally for the first time. Hypersonic vehicle designs based on the inward turning inlets have been shown analytically to have increased effective specific impulse and lower heat load than comparably designed vehicles with two-dimensional inlets. The inward turning inlets are designed inversely from inviscid stream surfaces of known flow fields. The computational study is performed on a Mach 12 inlet design to validate the performance predicted by the design code (HAVDAC) and calculate its off-design Mach number performance. The three-dimensional Euler equations are solved for Mach 4, 8, and 12 using a software package called SAM, which consists of an unstructured mesh generator (SAMmesh), a three-dimensional unstructured mesh flow solver (SAMcfd), and a CAD-based software (SAMcad). The computed momentum averaged inlet throat pressure is within 6% of the design inlet throat pressure. The mass-flux at the inlet throat is also within 7% of the value predicted by the design code, thereby validating the accuracy of the design code. The off-design Mach number results show that flow spillage is minimal, and the variation in the mass capture ratio with Mach number is comparable to an ideal 2-D inlet. The results from the inviscid flow calculations of a Mach 12 inward turning inlet indicate that the inlet design has very good on- and off-design performance, which makes it a promising design candidate for future air-breathing hypersonic vehicles.
Braiding by Majorana tracking and long-range CNOT gates with color codes
NASA Astrophysics Data System (ADS)
Litinski, Daniel; von Oppen, Felix
2017-11-01
Color-code quantum computation seamlessly combines Majorana-based hardware with topological error correction. Specifically, as Clifford gates are transversal in two-dimensional color codes, they enable the use of the Majoranas' non-Abelian statistics for gate operations at the code level. Here, we discuss the implementation of color codes in arrays of Majorana nanowires that avoid branched networks such as T junctions, thereby simplifying their realization. We show that, in such implementations, non-Abelian statistics can be exploited without ever performing physical braiding operations. Physical braiding operations are replaced by Majorana tracking, an entirely software-based protocol which appropriately updates the Majoranas involved in the color-code stabilizer measurements. This approach minimizes the required hardware operations for single-qubit Clifford gates. For Clifford completeness, we combine color codes with surface codes, and use color-to-surface-code lattice surgery for long-range multitarget CNOT gates which have a time overhead that grows only logarithmically with the physical distance separating control and target qubits. With the addition of magic state distillation, our architecture describes a fault-tolerant universal quantum computer in systems such as networks of tetrons, hexons, or Majorana box qubits, but can also be applied to nontopological qubit platforms.
2013 R&D 100 Award: "Miniapps" Bolster High Performance Computing
Belak, Jim; Richards, David
2018-06-12
Two Livermore computer scientists served on a Sandia National Laboratories-led team that developed Mantevo Suite 1.0, the first integrated suite of small software programs, also called "miniapps," to be made available to the high performance computing (HPC) community. These miniapps facilitate the development of new HPC systems and the applications that run on them. Miniapps (miniature applications) serve as stripped down surrogates for complex, full-scale applications that can require a great deal of time and effort to port to a new HPC system because they often consist of hundreds of thousands of lines of code. The miniapps are a prototype that contains some or all of the essentials of the real application but with many fewer lines of code, making the miniapp more versatile for experimentation. This allows researchers to more rapidly explore options and optimize system design, greatly improving the chances the full-scale application will perform successfully. These miniapps have become essential tools for exploring complex design spaces because they can reliably predict the performance of full applications.
Performance Trend of Different Algorithms for Structural Design Optimization
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.
1996-01-01
Nonlinear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess performance of different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the sequential unconstrained minimizations technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.
Message Passing and Shared Address Space Parallelism on an SMP Cluster
NASA Technical Reports Server (NTRS)
Shan, Hongzhang; Singh, Jaswinder P.; Oliker, Leonid; Biswas, Rupak; Biegel, Bryan (Technical Monitor)
2002-01-01
Currently, message passing (MP) and shared address space (SAS) are the two leading parallel programming paradigms. MP has been standardized with MPI, and is the more common and mature approach; however, code development can be extremely difficult, especially for irregularly structured computations. SAS offers substantial ease of programming, but may suffer from performance limitations due to poor spatial locality and high protocol overhead. In this paper, we compare the performance of and the programming effort required for six applications under both programming models on a 32-processor PC-SMP cluster, a platform that is becoming increasingly attractive for high-end scientific computing. Our application suite consists of codes that typically do not exhibit scalable performance under shared-memory programming due to their high communication-to-computation ratios and/or complex communication patterns. Results indicate that SAS can achieve about half the parallel efficiency of MPI for most of our applications, while being competitive for the others. A hybrid MPI+SAS strategy shows only a small performance advantage over pure MPI in some cases. Finally, improved implementations of two MPI collective operations on PC-SMP clusters are presented.
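The comparison above is stated in terms of parallel efficiency. As a reminder of how that figure is usually derived, here is a minimal sketch using the standard definitions (speedup is serial time over parallel time; efficiency is speedup per processor). The timings in the usage example are hypothetical and are not taken from the paper.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of serial run time to parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial: float, t_parallel: float, processors: int) -> float:
    """Speedup divided by processor count; 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / processors


# Hypothetical timings in seconds on 32 processors (illustrative values only).
t1 = 100.0
mpi_eff = parallel_efficiency(t1, 4.0, 32)
sas_eff = parallel_efficiency(t1, 8.0, 32)
print(f"MPI efficiency {mpi_eff:.2f}, SAS efficiency {sas_eff:.2f}")
```

With these made-up numbers the SAS run takes twice as long as the MPI run, so its efficiency is half the MPI value, which is the kind of relationship the abstract reports for most of the applications.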
Trace Replay and Network Simulation Tool
DOE Office of Scientific and Technical Information (OSTI.GOV)
Acun, Bilge; Jain, Nikhil; Bhatele, Abhinav
2015-03-23
TraceR is a trace replay tool built upon the ROSS-based CODES simulation framework. TraceR can be used for predicting network performance and understanding network behavior by simulating messaging in High Performance Computing applications on interconnection networks.
Trace Replay and Network Simulation Tool
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jain, Nikhil; Bhatele, Abhinav; Acun, Bilge
TraceR is a trace replay tool built upon the ROSS-based CODES simulation framework. TraceR can be used for predicting network performance and understanding network behavior by simulating messaging in High Performance Computing applications on interconnection networks.
NASA Astrophysics Data System (ADS)
Fabien-Ouellet, Gabriel; Gloaguen, Erwan; Giroux, Bernard
2017-03-01
Full Waveform Inversion (FWI) aims at recovering the elastic parameters of the Earth by matching recordings of the ground motion with the direct solution of the wave equation. Modeling the wave propagation for realistic scenarios is computationally intensive, which limits the applicability of FWI. The current hardware evolution brings increasing parallel computing power that can speed up the computations in FWI. However, to take advantage of the diversity of parallel architectures presently available, new programming approaches are required. In this work, we explore the use of OpenCL to develop a portable code that can take advantage of the many parallel processor architectures now available. We present a program called SeisCL for 2D and 3D viscoelastic FWI in the time domain. The code computes the forward and adjoint wavefields using finite-difference and outputs the gradient of the misfit function given by the adjoint state method. To demonstrate the code portability on different architectures, the performance of SeisCL is tested on three different devices: Intel CPUs, NVidia GPUs and Intel Xeon Phi. Results show that the use of GPUs with OpenCL can speed up the computations by nearly two orders of magnitude over a single-threaded application on the CPU. Although OpenCL allows code portability, we show that some device-specific optimization is still required to get the best performance out of a specific architecture. Using OpenCL in conjunction with MPI allows the domain decomposition of large models on several devices located on different nodes of a cluster. For large enough models, the speedup of the domain decomposition varies quasi-linearly with the number of devices. Finally, we investigate two different approaches to compute the gradient by the adjoint state method and show the significant advantages of using OpenCL for FWI.
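To make the forward-modeling and misfit steps concrete, the following is a deliberately simplified sketch: a 1D constant-density acoustic wave equation advanced with second-order finite differences in pure Python, plus a least-squares misfit between two recorded traces. It is not SeisCL code (which is viscoelastic, 2D/3D, and written for OpenCL), it does not compute the adjoint-state gradient, and every name and parameter value is an assumption chosen for illustration.

```python
def propagate_1d(velocity, source, dx, dt, receiver):
    """Leapfrog finite differences for u_tt = c^2 u_xx with fixed (zero) ends.

    velocity: wave speed per grid point; source: source amplitude per time
    step, injected at the middle of the grid; returns the receiver trace.
    """
    n = len(velocity)
    prev, curr = [0.0] * n, [0.0] * n
    src_idx = n // 2
    trace = []
    for amplitude in source:
        nxt = [0.0] * n
        for i in range(1, n - 1):
            laplacian = (curr[i - 1] - 2.0 * curr[i] + curr[i + 1]) / dx ** 2
            nxt[i] = 2.0 * curr[i] - prev[i] + (velocity[i] * dt) ** 2 * laplacian
        nxt[src_idx] += dt ** 2 * amplitude
        prev, curr = curr, nxt
        trace.append(curr[receiver])
    return trace


def l2_misfit(observed, simulated):
    """Half the sum of squared residuals between two traces."""
    return 0.5 * sum((o - s) ** 2 for o, s in zip(observed, simulated))


# Assumed setup: 101 points, 10 m spacing, 2 ms step (Courant number 0.4).
impulse = [1.0] + [0.0] * 199
observed = propagate_1d([2000.0] * 101, impulse, 10.0, 0.002, receiver=60)
trial = propagate_1d([2200.0] * 101, impulse, 10.0, 0.002, receiver=60)
print(l2_misfit(observed, observed), l2_misfit(observed, trial) > 0.0)
```

The inner loop over grid points is the part that maps naturally onto one work-item per cell on a GPU, which is where the reported speedups come from.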
Trellis coding techniques for mobile communications
NASA Technical Reports Server (NTRS)
Divsalar, D.; Simon, M. K.; Jedrey, T.
1988-01-01
A criterion for designing optimum trellis codes to be used over fading channels is given. A technique is shown for reducing certain multiple trellis codes, optimally designed for the fading channel, to conventional (i.e., multiplicity one) trellis codes. The computational cutoff rate R0 is evaluated for MPSK transmitted over fading channels. Examples of trellis codes optimally designed for the Rayleigh fading channel are given and compared with respect to R0. Two types of modulation/demodulation techniques are considered, namely coherent (using pilot tone-aided carrier recovery) and differentially coherent with Doppler frequency correction. Simulation results are given for end-to-end performance of two trellis-coded systems.
Examining the architecture of cellular computing through a comparative study with a computer
Wang, Degeng; Gribskov, Michael
2005-01-01
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software–hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's ‘hardware’ equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the ‘bandwidth’ of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed. PMID:16849179
Examining the architecture of cellular computing through a comparative study with a computer.
Wang, Degeng; Gribskov, Michael
2005-06-22
The computer and the cell both use information embedded in simple coding, the binary software code and the quadruple genomic code, respectively, to support system operations. A comparative examination of their system architecture as well as their information storage and utilization schemes is performed. On top of the code, both systems display a modular, multi-layered architecture, which, in the case of a computer, arises from human engineering efforts through a combination of hardware implementation and software abstraction. Using the computer as a reference system, a simplistic mapping of the architectural components between the two is easily detected. This comparison also reveals that a cell abolishes the software-hardware barrier through genomic encoding for the constituents of the biochemical network, a cell's "hardware" equivalent to the computer central processing unit (CPU). The information loading (gene expression) process acts as a major determinant of the encoded constituent's abundance, which, in turn, often determines the "bandwidth" of a biochemical pathway. Cellular processes are implemented in biochemical pathways in parallel manners. In a computer, on the other hand, the software provides only instructions and data for the CPU. A process represents just sequentially ordered actions by the CPU and only virtual parallelism can be implemented through CPU time-sharing. Whereas process management in a computer may simply mean job scheduling, coordinating pathway bandwidth through the gene expression machinery represents a major process management scheme in a cell. In summary, a cell can be viewed as a super-parallel computer, which computes through controlled hardware composition. While we have, at best, a very fragmented understanding of cellular operation, we have a thorough understanding of the computer throughout the engineering process. The potential utilization of this knowledge to the benefit of systems biology is discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brun, B.
1997-07-01
Computer technology has improved tremendously during the last years with larger media capacity, memory and more computational power. Visual computing with high-performance graphic interface and desktop computational power have changed the way engineers accomplish everyday tasks, development and safety studies analysis. The emergence of parallel computing will permit simulation over a larger domain. In addition, new development methods, languages and tools have appeared in the last several years.
NASA Astrophysics Data System (ADS)
Class, G.; Meyder, R.; Stratmanns, E.
1985-12-01
The large data base for validation and development of computer codes for two-phase flow, generated at the COSIMA facility, is reviewed. The aim of COSIMA is to simulate the hydraulic, thermal, and mechanical conditions in the subchannel and the cladding of fuel rods in pressurized water reactors during the blowout phase of a loss of coolant accident. In terms of fuel rod behavior, it is found that during blowout under realistic conditions only small strains are reached. For cladding rupture extremely high rod internal pressures are necessary. The behavior of fuel rod simulators and the effect of thermocouples attached to the cladding outer surface are clarified. Calculations performed with the codes RELAP and DRUFAN show satisfactory agreement with experiments. This can be improved by updating the phase separation models in the codes.
Performance of the OVERFLOW-MLP and LAURA-MLP CFD Codes on the NASA Ames 512 CPU Origin System
NASA Technical Reports Server (NTRS)
Taft, James R.
2000-01-01
The shared memory Multi-Level Parallelism (MLP) technique, developed last year at NASA Ames has been very successful in dramatically improving the performance of important NASA CFD codes. This new and very simple parallel programming technique was first inserted into the OVERFLOW production CFD code in FY 1998. The OVERFLOW-MLP code's parallel performance scaled linearly to 256 CPUs on the NASA Ames 256 CPU Origin 2000 system (steger). Overall performance exceeded 20.1 GFLOP/s, or about 4.5x the performance of a dedicated 16 CPU C90 system. All of this was achieved without any major modification to the original vector based code. The OVERFLOW-MLP code is now in production on the in-house Origin systems as well as being used offsite at commercial aerospace companies. Partially as a result of this work, NASA Ames has purchased a new 512 CPU Origin 2000 system to further test the limits of parallel performance for NASA codes of interest. This paper presents the performance obtained from the latest optimization efforts on this machine for the LAURA-MLP and OVERFLOW-MLP codes. The Langley Aerothermodynamics Upwind Relaxation Algorithm (LAURA) code is a key simulation tool in the development of the next generation shuttle, interplanetary reentry vehicles, and nearly all "X" plane development. This code sustains about 4-5 GFLOP/s on a dedicated 16 CPU C90. At this rate, expected workloads would require over 100 C90 CPU years of computing over the next few calendar years. It is not feasible to expect that this would be affordable or available to the user community. Dramatic performance gains on cheaper systems are needed. This code is expected to be perhaps the largest consumer of NASA Ames compute cycles per run in the coming year. The OVERFLOW CFD code is extensively used in the government and commercial aerospace communities to evaluate new aircraft designs.
It is one of the largest consumers of NASA supercomputing cycles and large simulations of highly resolved full aircraft are routinely undertaken. Typical large problems might require 100s of Cray C90 CPU hours to complete. The dramatic performance gains with the 256 CPU steger system are exciting. Obtaining results in hours instead of months is revolutionizing the way in which aircraft manufacturers are looking at future aircraft simulation work. Figure 2 below is a current state of the art plot of OVERFLOW-MLP performance on the 512 CPU Lomax system. As can be seen, the chart indicates that OVERFLOW-MLP continues to scale linearly with CPU count up to 512 CPUs on a large 35 million point full aircraft RANS simulation. At this point performance is such that a fully converged simulation of 2500 time steps is completed in less than 2 hours of elapsed time. Further work over the next few weeks will improve the performance of this code even further. The LAURA code has been converted to the MLP format as well. This code is currently being optimized for the 512 CPU system. Performance statistics indicate that the goal of 100 GFLOP/s will be achieved by year's end. This amounts to 20x the 16 CPU C90 result and strongly demonstrates the viability of the new parallel systems rapidly solving very large simulations in a production environment.
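The performance figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable names are illustrative.

```python
# OVERFLOW-MLP on 256 CPUs: 20.1 GFLOP/s, stated as about 4.5x a 16 CPU C90.
overflow_mlp_gflops = 20.1
implied_c90_gflops = overflow_mlp_gflops / 4.5   # roughly 4.5 GFLOP/s

# LAURA goal of 100 GFLOP/s against the stated 4-5 GFLOP/s on a 16 CPU C90.
laura_goal_gflops = 100.0
ratio_low = laura_goal_gflops / 5.0    # 20x, the figure quoted in the abstract
ratio_high = laura_goal_gflops / 4.0   # 25x if the C90 sustains only 4 GFLOP/s

# 2500 time steps in under 2 hours of elapsed time.
max_seconds_per_step = 2 * 3600 / 2500

print(f"implied C90 rate: {implied_c90_gflops:.2f} GFLOP/s")
print(f"LAURA goal vs C90: {ratio_low:.0f}x to {ratio_high:.0f}x")
print(f"time per step: under {max_seconds_per_step:.2f} s")
```

The quoted 20x therefore corresponds to the upper end of the 4-5 GFLOP/s range, and the 2-hour run implies less than about 2.9 seconds per time step on the 35 million point grid.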
DUKSUP: A Computer Program for High Thrust Launch Vehicle Trajectory Design and Optimization
NASA Technical Reports Server (NTRS)
Williams, C. H.; Spurlock, O. F.
2014-01-01
From the late 1960's through 1997, the leadership of NASA's Intermediate and Large class unmanned expendable launch vehicle projects resided at the NASA Lewis (now Glenn) Research Center (LeRC). One of LeRC's primary responsibilities --- trajectory design and performance analysis --- was accomplished by an internally-developed analytic three dimensional computer program called DUKSUP. Because of its Calculus of Variations-based optimization routine, this code was generally more capable of finding optimal solutions than its contemporaries. A derivation of optimal control using the Calculus of Variations is summarized including transversality, intermediate, and final conditions. The two point boundary value problem is explained. A brief summary of the code's operation is provided, including iteration via the Newton-Raphson scheme and integration of variational and motion equations via a 4th order Runge-Kutta scheme. Main subroutines are discussed. The history of the LeRC trajectory design efforts in the early 1960's is explained within the context of supporting the Centaur upper stage program. How the code was constructed based on the operation of the Atlas/Centaur launch vehicle, the limits of the computers of that era, the limits of the computer programming languages, and the missions it supported are discussed. The vehicles DUKSUP supported (Atlas/Centaur, Titan/Centaur, and Shuttle/Centaur) are briefly described. The types of missions, including Earth orbital and interplanetary, are described. The roles of flight constraints and their impact on launch operations are detailed (such as jettisoning hardware on heating, Range Safety, ground station tracking, and elliptical parking orbits). The computer main frames on which the code was hosted are described. The applications of the code are detailed, including independent check of contractor analysis, benchmarking, leading edge analysis, and vehicle performance improvement assessments. 
Several of DUKSUP's many major impacts on launches are discussed including Intelsat, Voyager, Pioneer Venus, HEAO, Galileo, and Cassini.
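The combination of Newton-Raphson iteration and 4th order Runge-Kutta integration mentioned above is the classic shooting approach to a two point boundary value problem. The sketch below shows that general technique on a toy problem (y'' = -y with y(0) = 0 and y(pi/2) = 1); it is not DUKSUP code. In particular, the derivative in the Newton step is approximated here by a finite difference, whereas the abstract says DUKSUP integrates variational equations. All function names are hypothetical.

```python
import math


def rk4_step(f, t, y, h):
    """Advance the state vector y by one classical 4th order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate(f, y0, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with a fixed number of RK4 steps."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


def terminal_miss(initial_slope):
    """Error in the final condition for a guessed initial slope."""
    y_end = integrate(lambda t, y: [y[1], -y[0]], [0.0, initial_slope],
                      0.0, math.pi / 2, 200)
    return y_end[0] - 1.0


def newton_raphson(g, x, tol=1e-10, max_iter=20, eps=1e-6):
    """Drive g(x) to zero; the slope of g is estimated by a forward difference."""
    for _ in range(max_iter):
        gx = g(x)
        if abs(gx) < tol:
            break
        x -= gx * eps / (g(x + eps) - gx)
    return x


slope = newton_raphson(terminal_miss, 0.2)
print(f"initial slope that meets the final condition: {slope:.6f}")
```

The exact answer for this toy problem is an initial slope of 1, since the solution is y = sin(t); in a trajectory code the unknowns would instead be initial values of the costate variables.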
DUKSUP: A Computer Program for High Thrust Launch Vehicle Trajectory Design and Optimization
NASA Technical Reports Server (NTRS)
Spurlock, O. Frank; Williams, Craig H.
2015-01-01
From the late 1960s through 1997, the leadership of NASA's Intermediate and Large class unmanned expendable launch vehicle projects resided at the NASA Lewis (now Glenn) Research Center (LeRC). One of LeRC's primary responsibilities --- trajectory design and performance analysis --- was accomplished by an internally-developed analytic three dimensional computer program called DUKSUP. Because of its Calculus of Variations-based optimization routine, this code was generally more capable of finding optimal solutions than its contemporaries. A derivation of optimal control using the Calculus of Variations is summarized including transversality, intermediate, and final conditions. The two point boundary value problem is explained. A brief summary of the code's operation is provided, including iteration via the Newton-Raphson scheme and integration of variational and motion equations via a 4th order Runge-Kutta scheme. Main subroutines are discussed. The history of the LeRC trajectory design efforts in the early 1960s is explained within the context of supporting the Centaur upper stage program. How the code was constructed based on the operation of the Atlas/Centaur launch vehicle, the limits of the computers of that era, the limits of the computer programming languages, and the missions it supported are discussed. The vehicles DUKSUP supported (Atlas/Centaur, Titan/Centaur, and Shuttle/Centaur) are briefly described. The types of missions, including Earth orbital and interplanetary, are described. The roles of flight constraints and their impact on launch operations are detailed (such as jettisoning hardware on heating, Range Safety, ground station tracking, and elliptical parking orbits). The computer main frames on which the code was hosted are described. The applications of the code are detailed, including independent check of contractor analysis, benchmarking, leading edge analysis, and vehicle performance improvement assessments.
Several of DUKSUPs many major impacts on launches are discussed including Intelsat, Voyager, Pioneer Venus, HEAO, Galileo, and Cassini.
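The combination named in this abstract, Newton-Raphson iteration wrapped around a 4th order Runge-Kutta integrator to solve a two-point boundary value problem, can be illustrated on a toy problem. The sketch below is not DUKSUP: the equation y'' = -y, the boundary values, and every function name are assumptions chosen for illustration, and the Newton derivative is taken by finite differences rather than by integrating variational equations as the abstract describes.

```python
import math

def rk4(f, y0, t0, t1, steps):
    """Integrate y' = f(t, y) from t0 to t1 with classical 4th-order Runge-Kutta."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

def terminal_miss(slope, steps=200):
    """Miss distance at the far boundary for the toy problem
    y'' = -y, y(0) = 0, target y(pi/2) = 1 (hypothetical example)."""
    f = lambda t, y: [y[1], -y[0]]
    y_end = rk4(f, [0.0, slope], 0.0, math.pi / 2, steps)
    return y_end[0] - 1.0

def newton_shooting(guess, tol=1e-10, max_iter=20, eps=1e-6):
    """Adjust the unknown initial slope until the terminal condition is met."""
    s = guess
    for _ in range(max_iter):
        miss = terminal_miss(s)
        if abs(miss) < tol:
            break
        # Finite-difference derivative of the miss with respect to the slope.
        d = (terminal_miss(s + eps) - miss) / eps
        s -= miss / d
    return s

slope = newton_shooting(0.2)
print(slope)
```

The exact solution is y = sin(t), so the iteration should settle on an initial slope close to 1.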
SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX-80
NASA Astrophysics Data System (ADS)
Kamat, Manohar P.; Watson, Brian C.
1992-11-01
The finite element method has proven to be an invaluable tool for analysis and design of complex, high performance systems, such as bladed-disk assemblies in aircraft turbofan engines. However, as the problem size increases, the computation time required by conventional computers can be prohibitively high. Parallel processing computers provide the means to overcome these computation time limits. This report summarizes the results of a research activity aimed at providing a finite element capability for analyzing turbomachinery bladed-disk assemblies in a vector/parallel processing environment. A special purpose code, named with the acronym SAPNEW, has been developed to perform static and eigen analysis of multi-degree-of-freedom blade models built-up from flat thin shell elements. SAPNEW provides a stand alone capability for static and eigen analysis on the Alliant FX/80, a parallel processing computer. A preprocessor, named with the acronym NTOS, has been developed to accept NASTRAN input decks and convert them to the SAPNEW format to make SAPNEW more readily used by researchers at NASA Lewis Research Center.
NASA Astrophysics Data System (ADS)
Mills, R. T.; Rupp, K.; Smith, B. F.; Brown, J.; Knepley, M.; Zhang, H.; Adams, M.; Hammond, G. E.
2017-12-01
As the high-performance computing community pushes towards the exascale horizon, power and heat considerations have driven the increasing importance and prevalence of fine-grained parallelism in new computer architectures. High-performance computing centers have become increasingly reliant on GPGPU accelerators and "manycore" processors such as the Intel Xeon Phi line, and 512-bit SIMD registers have even been introduced in the latest generation of Intel's mainstream Xeon server processors. The high degree of fine-grained parallelism and more complicated memory hierarchy considerations of such "manycore" processors present several challenges to existing scientific software. Here, we consider how the massively parallel, open-source hydrologic flow and reactive transport code PFLOTRAN - and the underlying Portable, Extensible Toolkit for Scientific Computation (PETSc) library on which it is built - can best take advantage of such architectures. We will discuss some key features of these novel architectures and our code optimizations and algorithmic developments targeted at them, and present experiences drawn from working with a wide range of PFLOTRAN benchmark problems on these architectures.
NASA Astrophysics Data System (ADS)
Myre, Joseph M.
Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special-purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allow, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH (a framework for reducing the complexity of programming heterogeneous computer systems), 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
NASA Technical Reports Server (NTRS)
Wood, Jerry R.; Schmidt, James F.; Steinke, Ronald J.; Chima, Rodrick V.; Kunik, William G.
1987-01-01
Increased emphasis on sustained supersonic or hypersonic cruise has revived interest in the supersonic throughflow fan as a possible component in advanced propulsion systems. Use of a fan that can operate with a supersonic inlet axial Mach number is attractive from the standpoint of reducing the inlet losses incurred in diffusing the flow from a supersonic flight Mach number to a subsonic one at the fan face. The design of the experiment using advanced computational codes to calculate the components required is described. The rotor was designed using existing turbomachinery design and analysis codes modified to handle fully supersonic axial flow through the rotor. A two-dimensional axisymmetric throughflow design code plus a blade element code were used to generate fan rotor velocity diagrams and blade shapes. A quasi-three-dimensional, thin shear layer Navier-Stokes code was used to assess the performance of the fan rotor blade shapes. The final design was stacked and checked for three-dimensional effects using a three-dimensional Euler code interactively coupled with a two-dimensional boundary layer code. The nozzle design in the expansion region was analyzed with a three-dimensional parabolized viscous code which corroborated the results from the Euler code. A translating supersonic diffuser was designed using these same codes.
NASA Technical Reports Server (NTRS)
Kowalski, E. J.
1979-01-01
A computerized method which utilizes the engine performance data is described. The method estimates the installed performance of aircraft gas turbine engines. This installation includes: engine weight and dimensions, inlet and nozzle internal performance and drag, inlet and nacelle weight, and nacelle drag.
Code Optimization and Parallelization on the Origins: Looking from Users' Perspective
NASA Technical Reports Server (NTRS)
Chang, Yan-Tyng Sherry; Thigpen, William W. (Technical Monitor)
2002-01-01
Parallel machines are becoming the main compute engines for high performance computing. Despite their increasing popularity, it is still a challenge for most users to learn the basic techniques to optimize/parallelize their codes on such platforms. In this paper, we present some experiences on learning these techniques for the Origin systems at the NASA Advanced Supercomputing Division. Emphasis of this paper will be on a few essential issues (with examples) that general users should master when they work with the Origins as well as other parallel systems.
NASA Technical Reports Server (NTRS)
Goglia, G. L.; Spiegler, E.
1977-01-01
The research activity focused on two main tasks: (1) the further development of the SCRAM program and, in particular, the addition of a procedure for modeling the mechanism of the internal adjustment process of the flow, in response to the imposed thermal load across the combustor and (2) the development of a numerical code for the computation of the variation of concentrations throughout a turbulent field, where finite-rate reactions occur. The code also includes an estimation of the effect of the phenomenon called 'unmixedness'.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chapman, Bryan Scott; MacQuigg, Michael Robert; Wysong, Andrew Russell
In this document, the code MCNP is validated with ENDF/B-VII.1 cross section data under the purview of ANSI/ANS-8.24-2007, for use with uranium systems. MCNP is a computer code based on Monte Carlo transport methods. While MCNP has wide-ranging capability in nuclear transport simulation, this validation is limited to the functionality related to neutron transport and calculation of criticality parameters such as k_eff.
Probabilistic Structural Analysis Theory Development
NASA Technical Reports Server (NTRS)
Burnside, O. H.
1985-01-01
The objective of the Probabilistic Structural Analysis Methods (PSAM) project is to develop analysis techniques and computer programs for predicting the probabilistic response of critical structural components for current and future space propulsion systems. This technology will play a central role in establishing system performance and durability. The first year's technical activity is concentrating on probabilistic finite element formulation strategy and code development. Work is also in progress to survey critical materials and space shuttle main engine components. The probabilistic finite element computer program NESSUS (Numerical Evaluation of Stochastic Structures Under Stress) is being developed. The final probabilistic code will have, in the general case, the capability of performing nonlinear dynamic analysis of stochastic structures. It is the goal of the approximate methods effort to increase problem solving efficiency relative to finite element methods by using energy methods to generate trial solutions which satisfy the structural boundary conditions. These approximate methods will be less computer intensive relative to the finite element approach.
Cultural and Technological Issues and Solutions for Geodynamics Software Citation
NASA Astrophysics Data System (ADS)
Heien, E. M.; Hwang, L.; Fish, A. E.; Smith, M.; Dumit, J.; Kellogg, L. H.
2014-12-01
Computational software and custom-written codes play a key role in scientific research and teaching, providing tools to perform data analysis and forward modeling through numerical computation. However, development of these codes is often hampered by the fact that there is no well-defined way for the authors to receive credit or professional recognition for their work through the standard methods of scientific publication and subsequent citation of the work. This in turn may discourage researchers from publishing their codes or making them easier for other scientists to use. We investigate the issues involved in citing software in a scientific context, and introduce features that should be components of a citation infrastructure, particularly oriented towards the codes and scientific culture in the area of geodynamics research. The codes used in geodynamics are primarily specialized numerical modeling codes for continuum mechanics problems; they may be developed by individual researchers, teams of researchers, geophysicists in collaboration with computational scientists and applied mathematicians, or by coordinated community efforts such as the Computational Infrastructure for Geodynamics. Some but not all geodynamics codes are open-source. These characteristics are common to many areas of geophysical software development and use. We provide background on the problem of software citation and discuss some of the barriers preventing adoption of such citations, including social/cultural barriers, insufficient technological support infrastructure, and an overall lack of agreement about what a software citation should consist of. We suggest solutions in an initial effort to create a system to support citation of software and promotion of scientific software development.
Chaste: An Open Source C++ Library for Computational Physiology and Biology
Mirams, Gary R.; Arthurs, Christopher J.; Bernabeu, Miguel O.; Bordas, Rafel; Cooper, Jonathan; Corrias, Alberto; Davit, Yohan; Dunn, Sara-Jane; Fletcher, Alexander G.; Harvey, Daniel G.; Marsh, Megan E.; Osborne, James M.; Pathmanathan, Pras; Pitt-Francis, Joe; Southern, James; Zemzemi, Nejib; Gavaghan, David J.
2013-01-01
Chaste — Cancer, Heart And Soft Tissue Environment — is an open source C++ library for the computational simulation of mathematical models developed for physiology and biology. Code development has been driven by two initial applications: cardiac electrophysiology and cancer development. A large number of cardiac electrophysiology studies have been enabled and performed, including high-performance computational investigations of defibrillation on realistic human cardiac geometries. New models for the initiation and growth of tumours have been developed. In particular, cell-based simulations have provided novel insight into the role of stem cells in the colorectal crypt. Chaste is constantly evolving and is now being applied to a far wider range of problems. The code provides modules for handling common scientific computing components, such as meshes and solvers for ordinary and partial differential equations (ODEs/PDEs). Re-use of these components avoids the need for researchers to ‘re-invent the wheel’ with each new project, accelerating the rate of progress in new applications. Chaste is developed using industrially-derived techniques, in particular test-driven development, to ensure code quality, re-use and reliability. In this article we provide examples that illustrate the types of problems Chaste can be used to solve, which can be run on a desktop computer. We highlight some scientific studies that have used or are using Chaste, and the insights they have provided. The source code, both for specific releases and the development version, is available to download under an open source Berkeley Software Distribution (BSD) licence at http://www.cs.ox.ac.uk/chaste, together with details of a mailing list and links to documentation and tutorials. PMID:23516352
Multiprocessing on supercomputers for computational aerodynamics
NASA Technical Reports Server (NTRS)
Yarrow, Maurice; Mehta, Unmeel B.
1990-01-01
Very little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPs or more) in computational aerodynamics to significantly improve turnaround time. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, the improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) through multi-tasking is applied via a strategy which requires relatively minor modifications to an existing code for a single processor. Essentially, this approach maps the available memory to multiple processors, exploiting the C-FORTRAN-Unix interface. The existing single processor code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor. As a demonstration of this approach, a Multiple Processor Multiple Grid (MPMG) code is developed. It is capable of using nine processors, and can be easily extended to a larger number of processors. This code solves the three-dimensional, Reynolds averaged, thin-layer and slender-layer Navier-Stokes equations with an implicit, approximately factored and diagonalized method. The solver is applied to a generic oblique-wing aircraft problem on a four processor Cray-2 computer. A tricubic interpolation scheme is developed to increase the accuracy of coupling of overlapped grids. For the oblique-wing aircraft problem, a speedup of two in elapsed (turnaround) time is observed in a saturated time-sharing environment.
Construction and Utilization of a Beowulf Computing Cluster: A User's Perspective
NASA Technical Reports Server (NTRS)
Woods, Judy L.; West, Jeff S.; Sulyma, Peter R.
2000-01-01
Lockheed Martin Space Operations - Stennis Programs (LMSO) at the John C Stennis Space Center (NASA/SSC) has designed and built a Beowulf computer cluster which is owned by NASA/SSC and operated by LMSO. The design and construction of the cluster are detailed in this paper. The cluster is currently used for Computational Fluid Dynamics (CFD) simulations. The CFD codes in use and their applications are discussed. Examples of some of the work are also presented. Performance benchmark studies have been conducted for the CFD codes being run on the cluster. The results of two of the studies are presented and discussed. The cluster is not currently being utilized to its full potential; therefore, plans are underway to add more capabilities. These include the addition of structural, thermal, fluid, and acoustic Finite Element Analysis codes as well as real-time data acquisition and processing during test operations at NASA/SSC. These plans are discussed as well.
LTCP 2D Graphical User Interface. Application Description and User's Guide
NASA Technical Reports Server (NTRS)
Ball, Robert; Navaz, Homayun K.
1996-01-01
A graphical user interface (GUI) written for NASA's LTCP (Liquid Thrust Chamber Performance) two-dimensional computational fluid dynamics code is described. The GUI is written in C++ for a desktop personal computer running under a Microsoft Windows operating environment. Through the use of common and familiar dialog boxes, features, and tools, the user can easily and quickly create and modify input files for the LTCP code. In addition, old input files used with the LTCP code can be opened and modified using the GUI. The program and its capabilities are presented, followed by a detailed description of each menu selection and the method of creating an input file for LTCP. A cross reference is included to help experienced users quickly find the variables which commonly need changes. Finally, the system requirements and installation instructions are provided.
NASA Technical Reports Server (NTRS)
Tweedt, Daniel L.; Chima, Rodrick V.; Turkel, Eli
1997-01-01
A preconditioning scheme has been implemented into a three-dimensional viscous computational fluid dynamics code for turbomachine blade rows. The preconditioning allows the code, originally developed for simulating compressible flow fields, to be applied to nearly-incompressible, low Mach number flows. A brief description is given of the compressible Navier-Stokes equations for a rotating coordinate system, along with the preconditioning method employed. Details about the conservative formulation of artificial dissipation are provided, and different artificial dissipation schemes are discussed and compared. The preconditioned code was applied to a well-documented case involving the NASA large low-speed centrifugal compressor for which detailed experimental data are available for comparison. Performance and flow field data are compared for the near-design operating point of the compressor, with generally good agreement between computation and experiment. Further, significant differences between computational results for the different numerical implementations, revealing different levels of solution accuracy, are discussed.
Error-correction coding for digital communications
NASA Astrophysics Data System (ADS)
Clark, G. C., Jr.; Cain, J. B.
This book is written for the design engineer who must build the coding and decoding equipment and for the communication system engineer who must incorporate this equipment into a system. It is also suitable as a senior-level or first-year graduate text for an introductory one-semester course in coding theory. Fundamental concepts of coding are discussed along with group codes, taking into account basic principles, practical constraints, performance computations, coding bounds, generalized parity check codes, polynomial codes, and important classes of group codes. Other topics explored are related to simple nonalgebraic decoding techniques for group codes, soft decision decoding of block codes, algebraic techniques for multiple error correction, the convolutional code structure and Viterbi decoding, syndrome decoding techniques, and sequential decoding techniques. System applications are also considered, giving attention to concatenated codes, coding for the white Gaussian noise channel, interleaver structures for coded systems, and coding for burst noise channels.
FY17 Status Report on NEAMS Neutronics Activities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, C. H.; Jung, Y. S.; Smith, M. A.
2017-09-30
Under the U.S. DOE NEAMS program, the high-fidelity neutronics code system has been developed to support the multiphysics modeling and simulation capability named SHARP. The neutronics code system includes the high-fidelity neutronics code PROTEUS, the cross section library and preprocessing tools, the multigroup cross section generation code MC2-3, the in-house meshing generation tool, the perturbation and sensitivity analysis code PERSENT, and post-processing tools. The main objectives of the NEAMS neutronics activities in FY17 are to continue development of an advanced nodal solver in PROTEUS for use in nuclear reactor design and analysis projects, implement a simplified sub-channel based thermal-hydraulic (T/H) capability into PROTEUS to efficiently compute the thermal feedback, improve the performance of PROTEUS-MOCEX using numerical acceleration and code optimization, improve the cross section generation tools including MC2-3, and continue to perform verification and validation tests for PROTEUS.
Error control techniques for satellite and space communications
NASA Technical Reports Server (NTRS)
Costello, Daniel J., Jr.
1992-01-01
Work performed during the reporting period is summarized. Construction of robustly good trellis codes for use with sequential decoding was developed. The robustly good trellis codes provide a much better trade off between free distance and distance profile. The unequal error protection capabilities of convolutional codes were studied. The problem of finding good large constraint length, low rate convolutional codes for deep space applications is investigated. A formula for computing the free distance of 1/n convolutional codes was discovered. Double memory (DM) codes, codes with two memory units per unit bit position, were studied; a search for optimal DM codes is being conducted. An algorithm for constructing convolutional codes from a given quasi-cyclic code was developed. Papers based on the above work are included in the appendix.
Control Law Design in a Computational Aeroelasticity Environment
NASA Technical Reports Server (NTRS)
Newsom, Jerry R.; Robertshaw, Harry H.; Kapania, Rakesh K.
2003-01-01
A methodology for designing active control laws in a computational aeroelasticity environment is given. The methodology involves employing a systems identification technique to develop an explicit state-space model for control law design from the output of a computational aeroelasticity code. The particular computational aeroelasticity code employed in this paper solves the transonic small disturbance aerodynamic equation using a time-accurate, finite-difference scheme. Linear structural dynamics equations are integrated simultaneously with the computational fluid dynamics equations to determine the time responses of the structure. These structural responses are employed as the input to a modern systems identification technique that determines the Markov parameters of an "equivalent linear system". The Eigensystem Realization Algorithm is then employed to develop an explicit state-space model of the equivalent linear system. The Linear Quadratic Gaussian control law design technique is employed to design a control law. The computational aeroelasticity code is modified to accept control laws and perform closed-loop simulations. Flutter control of a rectangular wing model is chosen to demonstrate the methodology. Various cases are used to illustrate the usefulness of the methodology as the nonlinearity of the aeroelastic system is increased through increased angle-of-attack changes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Walker, Andrew; Lawrence, Earl
The Response Surface Modeling (RSM) Tool Suite is a collection of three codes used to generate an empirical interpolation function for a collection of drag coefficient calculations computed with Test Particle Monte Carlo (TPMC) simulations. The first code, "Automated RSM", automates the generation of a drag coefficient RSM for a particular object to a single command. "Automated RSM" first creates a Latin Hypercube Sample (LHS) of 1,000 ensemble members to explore the global parameter space. For each ensemble member, a TPMC simulation is performed and the object drag coefficient is computed. In the next step of the "Automated RSM" code, a Gaussian process is used to fit the TPMC simulations. In the final step, Markov Chain Monte Carlo (MCMC) is used to evaluate the non-analytic probability distribution function from the Gaussian process. The second code, "RSM Area", creates a look-up table for the projected area of the object based on input limits on the minimum and maximum allowed pitch and yaw angles and pitch and yaw angle intervals. The projected area from the look-up table is used to compute the ballistic coefficient of the object based on its pitch and yaw angle. An accurate ballistic coefficient is crucial in accurately computing the drag on an object. The third code, "RSM Cd", uses the RSM generated by the "Automated RSM" code and the projected area look-up table generated by the "RSM Area" code to accurately compute the drag coefficient and ballistic coefficient of the object. The user can modify the object velocity, object surface temperature, the translational temperature of the gas, the species concentrations of the gas, and the pitch and yaw angles of the object. Together, these codes allow for the accurate derivation of an object's drag coefficient and ballistic coefficient under any conditions with only knowledge of the object's geometry and mass.
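The first step described above, drawing a Latin Hypercube Sample of 1,000 ensemble members, might look roughly like the following. This is a generic textbook construction, not the tool suite's own code; the function name, the two parameter ranges, and the fixed seed are all assumptions made for illustration.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Return n_samples points; each dimension is cut into n_samples equal
    strata and every stratum is used exactly once per dimension."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([lo + (k + rng.random()) / n_samples * (hi - lo)
                        for k in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

# Hypothetical two-parameter space (the ranges are placeholders).
samples = latin_hypercube(1000, [(0.0, 1.0), (-5.0, 5.0)])
print(len(samples), samples[0])
```

Compared with plain uniform sampling, the stratification guarantees that no region of any single parameter's range is left unsampled, which is the usual motivation for using an LHS to explore a global parameter space with a fixed simulation budget.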
Enhancement/upgrade of Engine Structures Technology Best Estimator (EST/BEST) Software System
NASA Technical Reports Server (NTRS)
Shah, Ashwin
2003-01-01
This report describes the work performed during the contract period and the capabilities included in the EST/BEST software system. The developed EST/BEST software system includes the integrated NESSUS, IPACS, COBSTRAN, and ALCCA computer codes required to perform the engine cycle mission and component structural analysis. Also, the interactive input generator for the NESSUS, IPACS, and COBSTRAN computer codes has been developed and integrated with the EST/BEST software system. The input generator allows the user to create input from scratch as well as edit existing input files interactively. Since it has been integrated with the EST/BEST software system, it enables the user to modify EST/BEST generated files and perform the analysis to evaluate the benefits. Appendix A gives details of how to use the newly added features in the EST/BEST software system.
Benchmarking NNWSI flow and transport codes: COVE 1 results
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hayden, N.K.
1985-06-01
The code verification (COVE) activity of the Nevada Nuclear Waste Storage Investigations (NNWSI) Project is the first step in certification of flow and transport codes used for NNWSI performance assessments of a geologic repository for disposing of high-level radioactive wastes. The goals of the COVE activity are (1) to demonstrate and compare the numerical accuracy and sensitivity of certain codes, (2) to identify and resolve problems in running typical NNWSI performance assessment calculations, and (3) to evaluate computer requirements for running the codes. This report describes the work done for COVE 1, the first step in benchmarking some of the codes. Isothermal calculations for the COVE 1 benchmarking have been completed using the hydrologic flow codes SAGUARO, TRUST, and GWVIP; the radionuclide transport codes FEMTRAN and TRUMP; and the coupled flow and transport code TRACR3D. This report presents the results of three cases of the benchmarking problem solved for COVE 1, a comparison of the results, questions raised regarding sensitivities to modeling techniques, and conclusions drawn regarding the status and numerical sensitivities of the codes. 30 refs.
Posttest calculation of the PBF LOC-11B and LOC-11C experiments using RELAP4/MOD6. [PWR
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendrix, C.E.
Comparisons between RELAP4/MOD6, Update 4 code-calculated and measured experimental data are presented for the PBF LOC-11C and LOC-11B experiments. Independent code verification techniques are now being developed and this study represents a preliminary effort applying structured criteria for developing computer models, selecting code input, and performing base-run analyses. Where deficiencies are indicated in the base-case representation of the experiment, methods of code and criteria improvement are developed and appropriate recommendations are made.
Trinary signed-digit arithmetic using an efficient encoding scheme
NASA Astrophysics Data System (ADS)
Salim, W. Y.; Alam, M. S.; Fyath, R. S.; Ali, S. A.
2000-09-01
The trinary signed-digit (TSD) number system is of interest for ultrafast optoelectronic computing systems since it permits parallel carry-free addition and borrow-free subtraction of two arbitrary length numbers in constant time. In this paper, a simple coding scheme is proposed to encode the decimal number directly into the TSD form. The coding scheme enables one to perform parallel one-step TSD arithmetic operation. The proposed coding scheme uses only a 5-combination coding table instead of the 625-combination table reported recently for recoded TSD arithmetic technique.
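To make the number system concrete: TSD numbers are radix-3 with each digit drawn from {-2, -1, 0, 1, 2}, so a value generally has several representations. The sketch below converts an integer to one valid TSD form and back. It is not the paper's 5-combination coding scheme, which the abstract does not spell out; it uses the balanced-ternary subset (digits -1, 0, 1) purely as an illustrative encoding, and both function names are assumptions.

```python
def to_tsd(n):
    """Encode integer n as radix-3 signed digits, most significant first.
    Uses only digits -1, 0, 1, which is one valid TSD representation."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        r = n % 3          # 0, 1, or 2 (Python's % is non-negative here)
        if r == 2:
            r = -1         # write 2 as 3 - 1 and carry into the next digit
        digits.append(r)
        n = (n - r) // 3
    return digits[::-1]

def from_tsd(digits):
    """Evaluate a TSD digit list (digits in -2..2, most significant first)."""
    value = 0
    for d in digits:
        if not -2 <= d <= 2:
            raise ValueError("TSD digits must lie in -2..2")
        value = value * 3 + d
    return value

print(to_tsd(5))                        # [1, -1, -1], i.e. 9 - 3 - 1
print(from_tsd([2]), from_tsd([1, -1])) # both evaluate to 2
```

The last line shows the redundancy of the representation: the same value has more than one digit string, and it is this freedom that carry-free addition schemes exploit.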
One-step trinary signed-digit arithmetic using an efficient encoding scheme
NASA Astrophysics Data System (ADS)
Salim, W. Y.; Fyath, R. S.; Ali, S. A.; Alam, Mohammad S.
2000-11-01
The trinary signed-digit (TSD) number system is of interest for ultra fast optoelectronic computing systems since it permits parallel carry-free addition and borrow-free subtraction of two arbitrary length numbers in constant time. In this paper, a simple coding scheme is proposed to encode the decimal number directly into the TSD form. The coding scheme enables one to perform parallel one-step TSD arithmetic operation. The proposed coding scheme uses only a 5-combination coding table instead of the 625-combination table reported recently for recoded TSD arithmetic technique.
Performance of concatenated Reed-Solomon/Viterbi channel coding
NASA Technical Reports Server (NTRS)
Divsalar, D.; Yuen, J. H.
1982-01-01
The concatenated Reed-Solomon (RS)/Viterbi coding system is reviewed. The performance of the system is analyzed and results are derived with a new simple approach. A functional model for the input RS symbol error probability is presented. Based on this new functional model, we compute the performance of a concatenated system in terms of RS word error probability, output RS symbol error probability, bit error probability due to decoding failure, and bit error probability due to decoding error. Finally we analyze the effects of the noisy carrier reference and the slow fading on the system performance.
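One of the quantities listed above, the RS word error probability, has a standard closed form once the input symbol error probability is known. The sketch below evaluates that textbook expression, not the functional model introduced in the paper. It assumes symbol errors at the RS decoder input are independent (ideal interleaving) and that the decoder corrects up to t = (n - k) / 2 symbol errors; the (255, 223) code parameters are an assumed example, not taken from the abstract.

```python
from math import comb

def rs_word_error_probability(p_symbol, n=255, k=223):
    """Probability that more than t = (n - k) // 2 of the n symbols in an
    RS codeword are in error, given independent symbol errors."""
    t = (n - k) // 2
    return sum(comb(n, i) * p_symbol**i * (1 - p_symbol)**(n - i)
               for i in range(t + 1, n + 1))

# Word error probability falls steeply as the Viterbi decoder's
# output symbol error probability improves.
for p in (0.05, 0.02, 0.01):
    print(p, rs_word_error_probability(p))
```

In a real concatenated system the Viterbi decoder produces bursty errors, so the independence assumption only holds to the extent that interleaving spreads each burst across different RS codewords.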
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lilienthal, P.
1997-12-01
This paper describes three different computer codes which have been written to model village power applications. The reasons which have driven the development of these codes include: the existence of limited field data; diverse applications can be modeled; models allow cost and performance comparisons; simulations generate insights into cost structures. The models which are discussed are: Hybrid2, a public code which provides detailed engineering simulations to analyze the performance of a particular configuration; HOMER - the hybrid optimization model for electric renewables - which provides economic screening for sensitivity analyses; and VIPOR - the village power model - which is a network optimization model for comparing mini-grids to individual systems. Examples of the output of these codes are presented for specific applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chapman, Bryan Scott; Gough, Sean T.
This report documents a validation of the MCNP6 Version 1.0 computer code on the high performance computing platform Moonlight, for operations at Los Alamos National Laboratory (LANL) that involve plutonium metals, oxides, and solutions. The validation is conducted using the ENDF/B-VII.1 continuous energy group cross section library at room temperature. The results are for use by nuclear criticality safety personnel in performing analysis and evaluation of various facility activities involving plutonium materials.
CoreTSAR: Core Task-Size Adapting Runtime
Scogland, Thomas R. W.; Feng, Wu-chun; Rountree, Barry; ...
2014-10-27
Heterogeneity continues to increase at all levels of computing, with the rise of accelerators such as GPUs, FPGAs, and other co-processors into everything from desktops to supercomputers. As a consequence, efficiently managing such disparate resources has become increasingly complex. CoreTSAR seeks to reduce this complexity by adaptively worksharing parallel-loop regions across compute resources without requiring any transformation of the code within the loop. Lastly, our results show performance improvements of up to three-fold over a current state-of-the-art heterogeneous task scheduler as well as linear performance scaling from a single GPU to four GPUs for many codes. In addition, CoreTSAR demonstrates a robust ability to adapt to both a variety of workloads and underlying system configurations.
NASA Technical Reports Server (NTRS)
Evans, Austin Lewis
1987-01-01
A computer code to model the steady-state performance of a monogroove heat pipe for the NASA Space Station is presented, including the effects on heat pipe performance of a screen in the evaporator section which deals with transient surges in the heat input. Errors in a previous code have been corrected, and the new code adds additional loss terms in order to model several different working fluids. Good agreement with existing performance curves is obtained. From a preliminary evaluation of several of the radiator design parameters it is found that an optimum fin width could be achieved but that structural considerations limit the thickness of the fin to a value above optimum.
Fast methods to numerically integrate the Reynolds equation for gas fluid films
NASA Technical Reports Server (NTRS)
Dimofte, Florin
1992-01-01
The alternating direction implicit (ADI) method is adopted, modified, and applied to the Reynolds equation for thin, gas fluid films. An efficient code is developed to predict both the steady-state and dynamic performance of an aerodynamic journal bearing. An alternative approach is shown for hybrid journal gas bearings by using Liebmann's iterative solution (LIS) for elliptic partial differential equations. The results are compared with known design criteria from experimental data. The developed methods show good accuracy and very short computer running time in comparison with methods based on matrix inversion. The computer codes need a small amount of memory and can be run on either personal computers or mainframe systems.
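To show the kind of iteration Liebmann's method refers to, here is a minimal sketch applied to Laplace's equation on a rectangular grid with fixed boundary values. This is a stand-in model problem, not the Reynolds equation or the code described in the abstract; the function name, tolerance, and sweep limit are assumptions. Each interior point is replaced in place by the average of its four neighbours (a Gauss-Seidel sweep), so no matrix is ever assembled or inverted.

```python
def liebmann_laplace(grid, tol=1e-8, max_sweeps=10000):
    """Relax the interior of `grid` in place until the largest update < tol.

    `grid` is a list of lists of floats; its outer rows and columns hold the
    fixed boundary values. Returns the number of sweeps performed.
    """
    rows, cols = len(grid), len(grid[0])
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                # Five-point stencil: average of the four neighbours,
                # using values already updated earlier in this sweep.
                new = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                              + grid[i][j - 1] + grid[i][j + 1])
                max_change = max(max_change, abs(new - grid[i][j]))
                grid[i][j] = new
        if max_change < tol:
            return sweep
    return max_sweeps
```

Memory use is a single copy of the grid, which is consistent with the abstract's remark that such iterative codes need little memory compared with direct matrix methods.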
Numerical simulation of steady supersonic flow over spinning bodies of revolution
NASA Technical Reports Server (NTRS)
Sturek, W. B.; Schiff, L. B.
1982-01-01
A recently reported parabolized Navier-Stokes code has been employed to compute the supersonic flowfield about a spinning cone and spinning and nonspinning ogive cylinder and boattailed bodies of revolution at moderate incidence. The computations were performed for flow conditions where extensive measurements for wall pressure, boundary-layer velocity profiles, and Magnus force had been obtained. Comparisons between the computational results and experiment indicate excellent agreement for angles of attack up to 6 deg. At angles greater than 6 deg discrepancies are noted which are tentatively attributed to turbulence modeling errors. The comparisons for Magnus effects show that the code accurately predicts the effects of body shape for the selected models.
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer
NASA Technical Reports Server (NTRS)
Hornfeck, William A.
1988-01-01
A considerable volume of large computational computer codes were developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This result is primarily due to architectural differences in the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A software package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
NASA Technical Reports Server (NTRS)
Gardner, Kevin D.; Liu, Jong-Shang; Murthy, Durbha V.; Kruse, Marlin J.; James, Darrell
1999-01-01
AlliedSignal Engines, in cooperation with NASA GRC (National Aeronautics and Space Administration Glenn Research Center), completed an evaluation of recently-developed aeroelastic computer codes using test cases from the AlliedSignal Engines fan blisk and turbine databases. Test data included strain gage, performance, and steady-state pressure information obtained for conditions where synchronous or flutter vibratory conditions were found to occur. Aeroelastic codes evaluated included quasi 3-D UNSFLO (MIT Developed/AE Modified, Quasi 3-D Aeroelastic Computer Code), 2-D FREPS (NASA-Developed Forced Response Prediction System Aeroelastic Computer Code), and 3-D TURBO-AE (NASA/Mississippi State University Developed 3-D Aeroelastic Computer Code). Unsteady pressure predictions for the turbine test case were used to evaluate the forced response prediction capabilities of each of the three aeroelastic codes. Additionally, one of the fan flutter cases was evaluated using TURBO-AE. The UNSFLO and FREPS evaluation predictions showed good agreement with the experimental test data trends, but quantitative improvements are needed. UNSFLO over-predicted turbine blade response reductions, while FREPS under-predicted them. The inviscid TURBO-AE turbine analysis predicted no discernible blade response reduction, indicating the necessity of including viscous effects for this test case. For the TURBO-AE fan blisk test case, significant effort was expended getting the viscous version of the code to give converged steady flow solutions for the transonic flow conditions. Once converged, the steady solutions provided an excellent match with test data and the calibrated DAWES (AlliedSignal 3-D Viscous Steady Flow CFD Solver). However, efforts expended establishing quality steady-state solutions prevented exercising the unsteady portion of the TURBO-AE code during the present program. 
AlliedSignal recommends that unsteady pressure measurement data be obtained for both test cases examined for use in aeroelastic code validation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Massimo, F., E-mail: francesco.massimo@ensta-paristech.fr; Dipartimento SBAI, Università di Roma “La Sapienza“, Via A. Scarpa 14, 00161 Roma; Atzeni, S.
Architect, a time explicit hybrid code designed to perform quick simulations for electron driven plasma wakefield acceleration, is described. In order to obtain beam quality acceptable for applications, control of the beam-plasma-dynamics is necessary. Particle in Cell (PIC) codes represent the state-of-the-art technique to investigate the underlying physics and possible experimental scenarios; however, PIC codes demand heavy computational resources. The Architect code substantially reduces the need for computational resources by using a hybrid approach: relativistic electron bunches are treated kinetically as in a PIC code and the background plasma as a fluid. Cylindrical symmetry is assumed for the solution of the electromagnetic fields and fluid equations. In this paper both the underlying algorithms as well as a comparison with a fully three dimensional particle in cell code are reported. The comparison highlights the good agreement between the two models up to the weakly non-linear regimes. In highly non-linear regimes the two models only disagree in a localized region, where the plasma electrons expelled by the bunch close up at the end of the first plasma oscillation.
Design Considerations of a Virtual Laboratory for Advanced X-ray Sources
NASA Astrophysics Data System (ADS)
Luginsland, J. W.; Frese, M. H.; Frese, S. D.; Watrous, J. J.; Heileman, G. L.
2004-11-01
The field of scientific computation has greatly advanced in the last few years, resulting in the ability to perform complex computer simulations that can predict the performance of real-world experiments in a number of fields of study. Among the forces driving this new computational capability is the advent of parallel algorithms, allowing calculations in three-dimensional space with realistic time scales. Electromagnetic radiation sources driven by high-voltage, high-current electron beams offer an area to further push the state-of-the-art in high fidelity, first-principles simulation tools. The physics of these x-ray sources combine kinetic plasma physics (electron beams) with dense fluid-like plasma physics (anode plasmas) and x-ray generation (bremsstrahlung). There are a number of mature techniques and software packages for dealing with the individual aspects of these sources, such as Particle-In-Cell (PIC), Magneto-Hydrodynamics (MHD), and radiation transport codes. The current effort is focused on developing an object-oriented software environment using the Rational© Unified Process and the Unified Modeling Language (UML) to provide a framework where multiple 3D parallel physics packages, such as a PIC code (ICEPIC), a MHD code (MACH), and a x-ray transport code (ITS) can co-exist in a system-of-systems approach to modeling advanced x-ray sources. Initial software design and assessments of the various physics algorithms' fidelity will be presented.
Peregrine System User Basics | High-Performance Computing | NREL
peregrine.hpc.nrel.gov or to one of the login nodes. Example commands to access Peregrine from a Linux or Mac OS X system Code Example Create a file called hello.F90 containing the following code: program hello write(6 information by enclosing it in brackets < >. For example: $ ssh -Y
Multiplier Architecture for Coding Circuits
NASA Technical Reports Server (NTRS)
Wang, C. C.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.
1986-01-01
Multipliers based on new algorithm for Galois-field (GF) arithmetic regular and expandable. Pipeline structures used for computing both multiplications and inverses. Designs suitable for implementation in very-large-scale integrated (VLSI) circuits. This general type of inverter and multiplier architecture especially useful in performing finite-field arithmetic of Reed-Solomon error-correcting codes and of some cryptographic algorithms.
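For readers unfamiliar with the arithmetic these circuits implement, the following is a software sketch of multiplication and inversion in GF(2^8). It is a plain shift-and-add formulation, not the pipelined architecture described in the brief. The field polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) is an assumption chosen because it is common in Reed-Solomon implementations; the brief does not state which field or polynomial was used.

```python
PRIMITIVE_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed, not from the brief)


def gf_mul(a, b):
    """Multiply two GF(2^8) elements (integers 0..255)."""
    result = 0
    while b:
        if b & 1:
            result ^= a            # addition in GF(2^m) is XOR
        a <<= 1                    # multiply a by x
        if a & 0x100:
            a ^= PRIMITIVE_POLY    # reduce modulo the field polynomial
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse via a^254, since a^255 = 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result, power, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf_mul(result, power)
        power = gf_mul(power, power)   # repeated squaring
        exponent >>= 1
    return result
```

Computing the inverse by repeated multiplication reuses the multiplier for both operations, which is the same economy the brief points to when it says one structure serves multiplications and inverses.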
DOE Office of Scientific and Technical Information (OSTI.GOV)
Saad, Tony; Sutherland, James C.
To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment – an effort that spans an interdisciplinary team of engineers and computer scientists.
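The idea of generating an algorithm at runtime from a DAG can be sketched in a few lines: each quantity declares what it depends on, and a topological sort yields a valid evaluation order. This is only a toy illustration using the Python standard library (`graphlib`, Python 3.9 or later); it is not the authors' DSL or framework, and the function and quantity names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_graph(tasks, dependencies, inputs):
    """Evaluate `tasks` in an order derived from the dependency graph.

    tasks:        name -> function taking the dict of known values
    dependencies: name -> set of names that must be available first
    inputs:       name -> value for quantities supplied by the caller
    Returns (execution order of computed tasks, dict of all values).
    """
    values = dict(inputs)
    # static_order() lists every node after all of its predecessors.
    order = [name
             for name in TopologicalSorter(dependencies).static_order()
             if name in tasks]
    for name in order:
        values[name] = tasks[name](values)
    return order, values


# Hypothetical example: momentum and kinetic energy per unit volume.
tasks = {
    "momentum": lambda v: v["density"] * v["velocity"],
    "kinetic_energy": lambda v: 0.5 * v["momentum"] * v["velocity"],
}
dependencies = {
    "momentum": {"density", "velocity"},
    "kinetic_energy": {"momentum", "velocity"},
}
order, values = run_graph(tasks, dependencies, {"density": 2.0, "velocity": 3.0})
```

Because the order is computed from the declared dependencies rather than written by hand, adding a new quantity only requires registering its function and its inputs.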
NASA Technical Reports Server (NTRS)
Bobbitt, Percy J.
1992-01-01
A discussion is given of the many factors that affect sonic booms with particular emphasis on the application and development of improved computational fluid dynamics (CFD) codes. The benefits that accrue from interference (induced) lift, distributing lift using canard configurations, the use of wings with dihedral or anhedral and hybrid laminar flow control for drag reduction are detailed. The application of the most advanced codes to a wider variety of configurations along with improved ray-tracing codes to arrive at more accurate and, hopefully, lower sonic booms is advocated. Finally, it is speculated that when all of the latest technology is applied to the design of a supersonic transport it will be found environmentally acceptable.
Saad, Tony; Sutherland, James C.
2016-05-04
To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment – an effort that spans an interdisciplinary team of engineers and computer scientists.
Development and application of GASP 2.0
NASA Technical Reports Server (NTRS)
Mcgrory, W. D.; Huebner, L. D.; Slack, D. C.; Walters, R. W.
1992-01-01
GASP 2.0 represents a major new release of the computational fluid dynamics code in wide use by the aerospace community. The authors have spent the last two years analyzing the strengths and weaknesses of the previous version of the finite-rate chemistry, Navier-Stokes solution algorithm. What has resulted is a completely redesigned computer code that offers two to four times the performance of previous versions while requiring as little as one quarter of the memory. In addition to the improvements in efficiency over the original code, Version 2.0 contains many new features. A brief discussion of the improvements made to GASP and an application using GASP 2.0 that demonstrates some of the new features are presented.
Heat pipe design handbook, part 2. [digital computer code specifications
NASA Technical Reports Server (NTRS)
Skrabek, E. A.
1972-01-01
The utilization of a digital computer code for heat pipe analysis and design (HPAD) is described which calculates the steady state hydrodynamic heat transport capability of a heat pipe with a particular wick configuration, the working fluid being a function of wick cross-sectional area. Heat load, orientation, operating temperature, and heat pipe geometry are specified. Both one 'g' and zero 'g' environments are considered, and, at the user's option, the code will also perform a weight analysis and will calculate heat pipe temperature drops. The central porous slab, circumferential porous wick, arterial wick, annular wick, and axial rectangular grooves are the wick configurations which HPAD has the capability of analyzing. For Vol. 1, see N74-22569.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Joon H.; Siegel, Malcolm Dean; Arguello, Jose Guadalupe, Jr.
2011-03-01
This report describes a gap analysis performed in the process of developing the Waste Integrated Performance and Safety Codes (IPSC) in support of the U.S. Department of Energy (DOE) Office of Nuclear Energy Advanced Modeling and Simulation (NEAMS) Campaign. The goal of the Waste IPSC is to develop an integrated suite of computational modeling and simulation capabilities to quantitatively assess the long-term performance of waste forms in the engineered and geologic environments of a radioactive waste storage or disposal system. The Waste IPSC will provide this simulation capability (1) for a range of disposal concepts, waste form types, engineered repository designs, and geologic settings, (2) for a range of time scales and distances, (3) with appropriate consideration of the inherent uncertainties, and (4) in accordance with rigorous verification, validation, and software quality requirements. The gap analyses documented in this report were performed during an initial gap analysis to identify candidate codes and tools to support the development and integration of the Waste IPSC, and during follow-on activities that delved into more detailed assessments of the various codes that were acquired, studied, and tested. The current Waste IPSC strategy is to acquire and integrate the necessary Waste IPSC capabilities wherever feasible, and develop only those capabilities that cannot be acquired or suitably integrated, verified, or validated. The gap analysis indicates that significant capabilities may already exist in the existing THC codes although there is no single code able to fully account for all physical and chemical processes involved in a waste disposal system. Large gaps exist in modeling chemical processes and their couplings with other processes. The coupling of chemical processes with flow transport and mechanical deformation remains challenging.
The data for extreme environments (e.g., for elevated temperature and high ionic strength media) that are needed for repository modeling are severely lacking. In addition, most of existing reactive transport codes were developed for non-radioactive contaminants, and they need to be adapted to account for radionuclide decay and in-growth. The accessibility to the source codes is generally limited. Because the problems of interest for the Waste IPSC are likely to result in relatively large computational models, a compact memory-usage footprint and a fast/robust solution procedure will be needed. A robust massively parallel processing (MPP) capability will also be required to provide reasonable turnaround times on the analyses that will be performed with the code. A performance assessment (PA) calculation for a waste disposal system generally requires a large number (hundreds to thousands) of model simulations to quantify the effect of model parameter uncertainties on the predicted repository performance. A set of codes for a PA calculation must be sufficiently robust and fast in terms of code execution. A PA system as a whole must be able to provide multiple alternative models for a specific set of physical/chemical processes, so that the users can choose various levels of modeling complexity based on their modeling needs. This requires PA codes, preferably, to be highly modularized. Most of the existing codes have difficulties meeting these requirements. Based on the gap analysis results, we have made the following recommendations for the code selection and code development for the NEAMS waste IPSC: (1) build fully coupled high-fidelity THCMBR codes using the existing SIERRA codes (e.g., ARIA and ADAGIO) and platform, (2) use DAKOTA to build an enhanced performance assessment system (EPAS), and build a modular code architecture and key code modules for performance assessments. 
The key chemical calculation modules will be built by expanding the existing CANTERA capabilities as well as by extracting useful components from other existing codes.
Computer Simulation Performed for Columbia Project Cooling System
NASA Technical Reports Server (NTRS)
Ahmad, Jasim
2005-01-01
This demo shows a high-fidelity simulation of the air flow in the main computer room housing the Columbia (10,024 Intel Itanium processors) system. The simulation assessed the performance of the cooling system, identified deficiencies, and recommended modifications to eliminate them. It used two in-house software packages on NAS supercomputers: Chimera Grid Tools to generate a geometric model of the computer room, and the OVERFLOW-2 code for fluid and thermal simulation. This state-of-the-art technology can be easily extended to provide a general capability for air flow analyses on any modern computer room.
Recent Developments in the Application of Biologically Inspired Computation to Chemical Sensing
NASA Astrophysics Data System (ADS)
Marco, S.; Gutierrez-Gálvez, A.
2009-05-01
Biological olfaction outperforms chemical instrumentation in specificity, response time, detection limit, coding capacity, time stability, robustness, size, power consumption, and portability. This biological function provides outstanding performance due, to a large extent, to the unique architecture of the olfactory pathway, which combines a high degree of redundancy, an efficient combinatorial coding along with unmatched chemical information processing mechanisms. The last decade has witnessed important advances in the understanding of the computational primitives underlying the functioning of the olfactory system. In this work, the state of the art concerning biologically inspired computation for chemical sensing will be reviewed. Instead of reviewing the whole body of computational neuroscience of olfaction, we restrict this review to the application of models to the processing of real chemical sensor data.
Steady and Unsteady Nozzle Simulations Using the Conservation Element and Solution Element Method
NASA Technical Reports Server (NTRS)
Friedlander, David Joshua; Wang, Xiao-Yen J.
2014-01-01
This paper presents results from computational fluid dynamic (CFD) simulations of a three-stream plug nozzle. Time-accurate, Euler, quasi-1D and 2D-axisymmetric simulations were performed as part of an effort to provide a CFD-based approach to modeling nozzle dynamics. The CFD code used for the simulations is based on the space-time Conservation Element and Solution Element (CESE) method. Steady-state results were validated using the Wind-US code and a code utilizing the MacCormack method while the unsteady results were partially validated via an aeroacoustic benchmark problem. The CESE steady-state flow field solutions showed excellent agreement with solutions derived from the other methods and codes while preliminary unsteady results for the three-stream plug nozzle are also shown. Additionally, a study was performed to explore the sensitivity of gross thrust computations to the control surface definition. The results showed that most of the sensitivity while computing the gross thrust is attributed to the control surface stencil resolution and choice of stencil end points and not to the control surface definition itself. Finally, comparisons between the quasi-1D and 2D-axisymmetric solutions were performed in order to gain insight on whether a quasi-1D solution can capture the steady and unsteady nozzle phenomena without the cost of a 2D-axisymmetric simulation. Initial results show that while the quasi-1D solutions are similar to the 2D-axisymmetric solutions, the inability of the quasi-1D simulations to predict two-dimensional phenomena limits their accuracy.
Unsteady Flow Interactions Between the LH2 Feed Line and SSME LPFP Inducer
NASA Technical Reports Server (NTRS)
Dorney, Dan; Griffin, Lisa; Marcu, Bogdan; Williams, Morgan
2006-01-01
An extensive computational effort has been performed in order to investigate the nature of unsteady flow in the fuel line supplying the three Space Shuttle Main Engines during flight. Evidence of high cycle fatigue (HCF) in the flow liner one diameter upstream of the Low Pressure Fuel Pump inducer has been observed in several locations. The analysis presented in this report has the objective of determining the driving mechanisms inducing HCF and the associated fluid flow phenomena. The simulations have been performed using two different computational codes, the NASA MSFC PHANTOM code and the Pratt and Whitney Rocketdyne ENIGMA code. The fuel flow through the flow liner and the pump inducer have been modeled in full three-dimensional geometry, and the results of the computations compared with test data taken during hot fire tests at NASA Stennis Space Center, and cold-flow water flow test data obtained at NASA MSFC. The numerical results indicate that unsteady pressure fluctuations at specific frequencies develop in the duct at the flow-liner location. Detailed frequency analysis of the flow disturbances is presented. The unsteadiness is believed to be an important source for fluctuating pressures generating high cycle fatigue.
Tail Biting Trellis Representation of Codes: Decoding and Construction
NASA Technical Reports Server (NTRS)
Shao, Rose Y.; Lin, Shu; Fossorier, Marc
1999-01-01
This paper presents two new iterative algorithms for decoding linear codes based on their tail biting trellises; one is unidirectional and the other is bidirectional. Both algorithms are computationally efficient and achieve virtually optimum error performance with a small number of decoding iterations. They outperform all the previous suboptimal decoding algorithms. The bidirectional algorithm also reduces decoding delay. Also presented in the paper is a method for constructing tail biting trellises for linear block codes.
NASA Astrophysics Data System (ADS)
White, Christopher Joseph
We describe the implementation of sophisticated numerical techniques for general-relativistic magnetohydrodynamics simulations in the Athena++ code framework. Improvements over many existing codes include the use of advanced Riemann solvers and of staggered-mesh constrained transport. Combined with considerations for computational performance and parallel scalability, these allow us to investigate black hole accretion flows with unprecedented accuracy. The capability of the code is demonstrated by exploring magnetically arrested disks.