large computer code: Topics by Science.gov

Sample records for large computer code

PARAVT: Parallel Voronoi tessellation code

NASA Astrophysics Data System (ADS)

González, R. E.

2016-10-01

In this study, we present a new open source code for massive parallel computation of Voronoi tessellations (VT hereafter) in large data sets. The code is focused for astrophysical purposes where VT densities and neighbors are widely used. There are several serial Voronoi tessellation codes, however no open source and parallel implementations are available to handle the large number of particles/galaxies in current N-body simulations and sky surveys. Parallelization is implemented under MPI and VT using Qhull library. Domain decomposition takes into account consistent boundary computation between tasks, and includes periodic conditions. In addition, the code computes neighbors list, Voronoi density, Voronoi cell volume, density gradient for each particle, and densities on a regular grid. Code implementation and user guide are publicly available at https://github.com/regonzar/paravt.
Parallelization of Finite Element Analysis Codes Using Heterogeneous Distributed Computing

NASA Technical Reports Server (NTRS)

Ozguner, Fusun

1996-01-01

Performance gains in computer design are quickly consumed as users seek to analyze larger problems to a higher degree of accuracy. Innovative computational methods, such as parallel and distributed computing, seek to multiply the power of existing hardware technology to satisfy the computational demands of large applications. In the early stages of this project, experiments were performed using two large, coarse-grained applications, CSTEM and METCAN. These applications were parallelized on an Intel iPSC/860 hypercube. It was found that the overall speedup was very low, due to large, inherently sequential code segments present in the applications. The overall execution time T(sub par), of the application is dependent on these sequential segments. If these segments make up a significant fraction of the overall code, the application will have a poor speedup measure.
MIADS2 ... an alphanumeric map information assembly and display system for a large computer

Treesearch

Elliot L. Amidon

1966-01-01

A major improvement and extension of the Map Information Assembly and Display System (MIADS) developed in 1964 is described. Basic principles remain unchanged, but the computer programs have been expanded and rewritten for a large computer, in Fortran IV and MAP languages. The code system is extended from 99 integers to about 2,200 alphanumeric 2-character codes. Hand-...
Efficient preparation of large-block-code ancilla states for fault-tolerant quantum computation

NASA Astrophysics Data System (ADS)

Zheng, Yi-Cong; Lai, Ching-Yi; Brun, Todd A.

2018-03-01

Fault-tolerant quantum computation (FTQC) schemes that use multiqubit large block codes can potentially reduce the resource overhead to a great extent. A major obstacle is the requirement for a large number of clean ancilla states of different types without correlated errors inside each block. These ancilla states are usually logical stabilizer states of the data-code blocks, which are generally difficult to prepare if the code size is large. Previously, we have proposed an ancilla distillation protocol for Calderbank-Shor-Steane (CSS) codes by classical error-correcting codes. It was assumed that the quantum gates in the distillation circuit were perfect; however, in reality, noisy quantum gates may introduce correlated errors that are not treatable by the protocol. In this paper, we show that additional postselection by another classical error-detecting code can be applied to remove almost all correlated errors. Consequently, the revised protocol is fully fault tolerant and capable of preparing a large set of stabilizer states sufficient for FTQC using large block codes. At the same time, the yield rate can be boosted from O (t-2) to O (1 ) in practice for an [[n ,k ,d =2 t +1
Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP

NASA Technical Reports Server (NTRS)

Long, Lyle N.; Brentner, Kenneth S.

2000-01-01

This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly, but often they must be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but due to the number of instances that must be run, they do become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations. This is representative of an aerodynamic panel code where someone would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can be readily used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
The adaption and use of research codes for performance assessment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liebetrau, A.M.

1987-05-01

Models of real-world phenomena are developed for many reasons. The models are usually, if not always, implemented in the form of a computer code. The characteristics of a code are determined largely by its intended use. Realizations or implementations of detailed mathematical models of complex physical and/or chemical processes are often referred to as research or scientific (RS) codes. Research codes typically require large amounts of computing time. One example of an RS code is a finite-element code for solving complex systems of differential equations that describe mass transfer through some geologic medium. Considerable computing time is required because computationsmore » are done at many points in time and/or space. Codes used to evaluate the overall performance of real-world physical systems are called performance assessment (PA) codes. Performance assessment codes are used to conduct simulated experiments involving systems that cannot be directly observed. Thus, PA codes usually involve repeated simulations of system performance in situations that preclude the use of conventional experimental and statistical methods. 3 figs.« less
Cloud Computing for Complex Performance Codes.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin

This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate complex codes, on several differently configured servers, could run and compute trivial small scale problems in a commercial cloud infrastructure. Phase 2 focused on proving non-trivial large scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
A computer-aided design system geared toward conceptual design in a research environment. [for hypersonic vehicles

NASA Technical Reports Server (NTRS)

STACK S. H.

1981-01-01

A computer-aided design system has recently been developed specifically for the small research group environment. The system is implemented on a Prime 400 minicomputer linked with a CDC 6600 computer. The goal was to assign the minicomputer specific tasks, such as data input and graphics, thereby reserving the large mainframe computer for time-consuming analysis codes. The basic structure of the design system consists of GEMPAK, a computer code that generates detailed configuration geometry from a minimum of input; interface programs that reformat GEMPAK geometry for input to the analysis codes; and utility programs that simplify computer access and data interpretation. The working system has had a large positive impact on the quantity and quality of research performed by the originating group. This paper describes the system, the major factors that contributed to its particular form, and presents examples of its application.
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer

NASA Technical Reports Server (NTRS)

Hornfeck, William A.

1988-01-01

A considerable volume of large computational computer codes were developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This result is primarily due to architectural differences in the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A sofware package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
Cross-indexing of binary SIFT codes for large-scale image search.

PubMed

Liu, Zhen; Li, Houqiang; Zhang, Liyan; Zhou, Wengang; Tian, Qi

2014-05-01

In recent years, there has been growing interest in mapping visual features into compact binary codes for applications on large-scale image collections. Encoding high-dimensional data as compact binary codes reduces the memory cost for storage. Besides, it benefits the computational efficiency since the computation of similarity can be efficiently measured by Hamming distance. In this paper, we propose a novel flexible scale invariant feature transform (SIFT) binarization (FSB) algorithm for large-scale image search. The FSB algorithm explores the magnitude patterns of SIFT descriptor. It is unsupervised and the generated binary codes are demonstrated to be dispreserving. Besides, we propose a new searching strategy to find target features based on the cross-indexing in the binary SIFT space and original SIFT space. We evaluate our approach on two publicly released data sets. The experiments on large-scale partial duplicate image retrieval system demonstrate the effectiveness and efficiency of the proposed algorithm.
Modeling Improvements and Users Manual for Axial-flow Turbine Off-design Computer Code AXOD

NASA Technical Reports Server (NTRS)

Glassman, Arthur J.

1994-01-01

An axial-flow turbine off-design performance computer code used for preliminary studies of gas turbine systems was modified and calibrated based on the experimental performance of large aircraft-type turbines. The flow- and loss-model modifications and calibrations are presented in this report. Comparisons are made between computed performances and experimental data for seven turbines over wide ranges of speed and pressure ratio. This report also serves as the users manual for the revised code, which is named AXOD.
Design Aspects of the Rayleigh Convection Code

NASA Astrophysics Data System (ADS)

Featherstone, N. A.

2017-12-01

Understanding the long-term generation of planetary or stellar magnetic field requires complementary knowledge of the large-scale fluid dynamics pervading large fractions of the object's interior. Such large-scale motions are sensitive to the system's geometry which, in planets and stars, is spherical to a good approximation. As a result, computational models designed to study such systems often solve the MHD equations in spherical geometry, frequently employing a spectral approach involving spherical harmonics. We present computational and user-interface design aspects of one such modeling tool, the Rayleigh convection code, which is suitable for deployment on desktop and petascale-hpc architectures alike. In this poster, we will present an overview of this code's parallel design and its built-in diagnostics-output package. Rayleigh has been developed with NSF support through the Computational Infrastructure for Geodynamics and is expected to be released as open-source software in winter 2017/2018.
Fast Computation of the Two-Point Correlation Function in the Age of Big Data

NASA Astrophysics Data System (ADS)

Pellegrino, Andrew; Timlin, John

2018-01-01

We present a new code which quickly computes the two-point correlation function for large sets of astronomical data. This code combines the ease of use of Python with the speed of parallel shared libraries written in C. We include the capability to compute the auto- and cross-correlation statistics, and allow the user to calculate the three-dimensional and angular correlation functions. Additionally, the code automatically divides the user-provided sky masks into contiguous subsamples of similar size, using the HEALPix pixelization scheme, for the purpose of resampling. Errors are computed using jackknife and bootstrap resampling in a way that adds negligible extra runtime, even with many subsamples. We demonstrate comparable speed with other clustering codes, and code accuracy compared to known and analytic results.
Binary weight distributions of some Reed-Solomon codes

NASA Technical Reports Server (NTRS)

Pollara, F.; Arnold, S.

1992-01-01

The binary weight distributions of the (7,5) and (15,9) Reed-Solomon (RS) codes and their duals are computed using the MacWilliams identities. Several mappings of symbols to bits are considered and those offering the largest binary minimum distance are found. These results are then used to compute bounds on the soft-decoding performance of these codes in the presence of additive Gaussian noise. These bounds are useful for finding large binary block codes with good performance and for verifying the performance obtained by specific soft-coding algorithms presently under development.
Fault tolerant computing: A preamble for assuring viability of large computer systems

NASA Technical Reports Server (NTRS)

Lim, R. S.

1977-01-01

The need for fault-tolerant computing is addressed from the viewpoints of (1) why it is needed, (2) how to apply it in the current state of technology, and (3) what it means in the context of the Phoenix computer system and other related systems. To this end, the value of concurrent error detection and correction is described. User protection, program retry, and repair are among the factors considered. The technology of algebraic codes to protect memory systems and arithmetic codes to protect memory systems and arithmetic codes to protect arithmetic operations is discussed.
The TeraShake Computational Platform for Large-Scale Earthquake Simulations

NASA Astrophysics Data System (ADS)

Cui, Yifeng; Olsen, Kim; Chourasia, Amit; Moore, Reagan; Maechling, Philip; Jordan, Thomas

Geoscientific and computer science researchers with the Southern California Earthquake Center (SCEC) are conducting a large-scale, physics-based, computationally demanding earthquake system science research program with the goal of developing predictive models of earthquake processes. The computational demands of this program continue to increase rapidly as these researchers seek to perform physics-based numerical simulations of earthquake processes for larger meet the needs of this research program, a multiple-institution team coordinated by SCEC has integrated several scientific codes into a numerical modeling-based research tool we call the TeraShake computational platform (TSCP). A central component in the TSCP is a highly scalable earthquake wave propagation simulation program called the TeraShake anelastic wave propagation (TS-AWP) code. In this chapter, we describe how we extended an existing, stand-alone, wellvalidated, finite-difference, anelastic wave propagation modeling code into the highly scalable and widely used TS-AWP and then integrated this code into the TeraShake computational platform that provides end-to-end (initialization to analysis) research capabilities. We also describe the techniques used to enhance the TS-AWP parallel performance on TeraGrid supercomputers, as well as the TeraShake simulations phases including input preparation, run time, data archive management, and visualization. As a result of our efforts to improve its parallel efficiency, the TS-AWP has now shown highly efficient strong scaling on over 40K processors on IBM’s BlueGene/L Watson computer. In addition, the TSCP has developed into a computational system that is useful to many members of the SCEC community for performing large-scale earthquake simulations.
Hypercube matrix computation task

NASA Technical Reports Server (NTRS)

Calalo, Ruel H.; Imbriale, William A.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Lyons, James R.; Manshadi, Farzin; Patterson, Jean E.

1988-01-01

A major objective of the Hypercube Matrix Computation effort at the Jet Propulsion Laboratory (JPL) is to investigate the applicability of a parallel computing architecture to the solution of large-scale electromagnetic scattering problems. Three scattering analysis codes are being implemented and assessed on a JPL/California Institute of Technology (Caltech) Mark 3 Hypercube. The codes, which utilize different underlying algorithms, give a means of evaluating the general applicability of this parallel architecture. The three analysis codes being implemented are a frequency domain method of moments code, a time domain finite difference code, and a frequency domain finite elements code. These analysis capabilities are being integrated into an electromagnetics interactive analysis workstation which can serve as a design tool for the construction of antennas and other radiating or scattering structures. The first two years of work on the Hypercube Matrix Computation effort is summarized. It includes both new developments and results as well as work previously reported in the Hypercube Matrix Computation Task: Final Report for 1986 to 1987 (JPL Publication 87-18).
Sub-Selective Quantization for Learning Binary Codes in Large-Scale Image Search.

PubMed

Li, Yeqing; Liu, Wei; Huang, Junzhou

2018-06-01

Recently with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and performing similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm, which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to several popular quantization techniques including cases using linear and nonlinear mappings. Crucially, we can justify the resulting sub-selective quantization by proving its theoretic properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
Multitasking the code ARC3D. [for computational fluid dynamics

NASA Technical Reports Server (NTRS)

Barton, John T.; Hsiung, Christopher C.

1986-01-01

The CRAY multitasking system was developed in order to utilize all four processors and sharply reduce the wall clock run time. This paper describes the techniques used to modify the computational fluid dynamics code ARC3D for this run and analyzes the achieved speedup. The ARC3D code solves either the Euler or thin-layer N-S equations using an implicit approximate factorization scheme. Results indicate that multitask processing can be used to achieve wall clock speedup factors of over three times, depending on the nature of the program code being used. Multitasking appears to be particularly advantageous for large-memory problems running on multiple CPU computers.
STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens.

PubMed

Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D; Volz, Kerstin

2017-06-01

We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space. Copyright © 2017 Elsevier B.V. All rights reserved.

Geometrical-optics code for computing the optical properties of large dielectric spheres.

PubMed

Zhou, Xiaobing; Li, Shusun; Stamnes, Knut

2003-07-20

Absorption of electromagnetic radiation by absorptive dielectric spheres such as snow grains in the near-infrared part of the solar spectrum cannot be neglected when radiative properties of snow are computed. Thus a new, to our knowledge, geometrical-optics code is developed to compute scattering and absorption cross sections of large dielectric particles of arbitrary complex refractive index. The number of internal reflections and transmissions are truncated on the basis of the ratio of the irradiance incident at the nth interface to the irradiance incident at the first interface for a specific optical ray. Thus the truncation number is a function of the angle of incidence. Phase functions for both near- and far-field absorption and scattering of electromagnetic radiation are calculated directly at any desired scattering angle by using a hybrid algorithm based on the bisection and Newton-Raphson methods. With these methods a large sphere's absorption and scattering properties of light can be calculated for any wavelength from the ultraviolet to the microwave regions. Assuming that large snow meltclusters (1-cm order), observed ubiquitously in the snow cover during summer, can be characterized as spheres, one may compute absorption and scattering efficiencies and the scattering phase function on the basis of this geometrical-optics method. A geometrical-optics method for sphere (GOMsphere) code is developed and tested against Wiscombe's Mie scattering code (MIE0) and a Monte Carlo code for a range of size parameters. GOMsphere can be combined with MIE0 to calculate the single-scattering properties of dielectric spheres of any size.
Analysis of a Distributed Pulse Power System Using a Circuit Analysis Code

DTIC Science & Technology

1979-06-01

dose rate was then integrated to give a number that could be compared with measure- ments made using thermal luminescent dosimeters ( TLD ’ s). Since...NM 8 7117 AND THE BDM CORPORATION, ALBUQUERQUE, NM 87106 Abstract A sophisticated computer code (SCEPTRE), used to analyze electronic circuits...computer code (SCEPTRE), used to analyze electronic circuits, was used to evaluate the performance of a large flash X-ray machine. This device was
Calculation of Water Drop Trajectories to and About Arbitrary Three-Dimensional Bodies in Potential Airflow

NASA Technical Reports Server (NTRS)

Norment, H. G.

1980-01-01

Calculations can be performed for any atmospheric conditions and for all water drop sizes, from the smallest cloud droplet to large raindrops. Any subsonic, external, non-lifting flow can be accommodated; flow into, but not through, inlets also can be simulated. Experimental water drop drag relations are used in the water drop equations of motion and effects of gravity settling are included. Seven codes are described: (1) a code used to debug and plot body surface description data; (2) a code that processes the body surface data to yield the potential flow field; (3) a code that computes flow velocities at arrays of points in space; (4) a code that computes water drop trajectories from an array of points in space; (5) a code that computes water drop trajectories and fluxes to arbitrary target points; (6) a code that computes water drop trajectories tangent to the body; and (7) a code that produces stereo pair plots which include both the body and trajectories. Code descriptions include operating instructions, card inputs and printouts for example problems, and listing of the FORTRAN codes. Accuracy of the calculations is discussed, and trajectory calculation results are compared with prior calculations and with experimental data.
Utilizing GPUs to Accelerate Turbomachinery CFD Codes

NASA Technical Reports Server (NTRS)

MacCalla, Weylin; Kulkarni, Sameer

2016-01-01

GPU computing has established itself as a way to accelerate parallel codes in the high performance computing world. This work focuses on speeding up APNASA, a legacy CFD code used at NASA Glenn Research Center, while also drawing conclusions about the nature of GPU computing and the requirements to make GPGPU worthwhile on legacy codes. Rewriting and restructuring of the source code was avoided to limit the introduction of new bugs. The code was profiled and investigated for parallelization potential, then OpenACC directives were used to indicate parallel parts of the code. The use of OpenACC directives was not able to reduce the runtime of APNASA on either the NVIDIA Tesla discrete graphics card, or the AMD accelerated processing unit. Additionally, it was found that in order to justify the use of GPGPU, the amount of parallel work being done within a kernel would have to greatly exceed the work being done by any one portion of the APNASA code. It was determined that in order for an application like APNASA to be accelerated on the GPU, it should not be modular in nature, and the parallel portions of the code must contain a large portion of the code's computation time.
Error Suppression for Hamiltonian-Based Quantum Computation Using Subsystem Codes

NASA Astrophysics Data System (ADS)

Marvian, Milad; Lidar, Daniel A.

2017-01-01

We present general conditions for quantum error suppression for Hamiltonian-based quantum computation using subsystem codes. This involves encoding the Hamiltonian performing the computation using an error detecting subsystem code and the addition of a penalty term that commutes with the encoded Hamiltonian. The scheme is general and includes the stabilizer formalism of both subspace and subsystem codes as special cases. We derive performance bounds and show that complete error suppression results in the large penalty limit. To illustrate the power of subsystem-based error suppression, we introduce fully two-local constructions for protection against local errors of the swap gate of adiabatic gate teleportation and the Ising chain in a transverse field.
Error Suppression for Hamiltonian-Based Quantum Computation Using Subsystem Codes.

PubMed

Marvian, Milad; Lidar, Daniel A

2017-01-20

We present general conditions for quantum error suppression for Hamiltonian-based quantum computation using subsystem codes. This involves encoding the Hamiltonian performing the computation using an error detecting subsystem code and the addition of a penalty term that commutes with the encoded Hamiltonian. The scheme is general and includes the stabilizer formalism of both subspace and subsystem codes as special cases. We derive performance bounds and show that complete error suppression results in the large penalty limit. To illustrate the power of subsystem-based error suppression, we introduce fully two-local constructions for protection against local errors of the swap gate of adiabatic gate teleportation and the Ising chain in a transverse field.
Application of computer generated color graphic techniques to the processing and display of three dimensional fluid dynamic data

NASA Technical Reports Server (NTRS)

Anderson, B. H.; Putt, C. W.; Giamati, C. C.

1981-01-01

Color coding techniques used in the processing of remote sensing imagery were adapted and applied to the fluid dynamics problems associated with turbofan mixer nozzles. The computer generated color graphics were found to be useful in reconstructing the measured flow field from low resolution experimental data to give more physical meaning to this information and in scanning and interpreting the large volume of computer generated data from the three dimensional viscous computer code used in the analysis.
SciDAC-3: Searching for Physics Beyond the Standard Model, University of Arizona component, Year 2 progress report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Toussaint, Doug

2014-03-21

The Arizona component of the SciDAC-3 Lattice Gauge Theory program consisted of partial support for a postdoctoral position. In the original budget this covered three fourths of a postdoc, but the University of Arizona changed its ERE rate for postdoctoral positions from 4.3% to 21%, so the support level was closer to two-thirds of a postdoc. The grant covered the work of postdoc Thomas Primer. Dr. Primer's first task was an urgent one, although it was not forseen in our proposed work. It turned out that on the large lattices used in some of our current computations the gauge fixingmore » code was not working as expected, and this revealed itself in inconsistent results in the correlators needed to compute the semileptonic form factors for K and D decays. Dr. Primer participated in the effort to understand this problem and to modify our codes to deal with the large lattices we are now generating (as large as 144 3 x 288). Corrected code was incorporated in our standard codes, and workarounds that allow us to use the correlators already computed with the unexpected gauge fixing were been implemented.« less
Bar-Code System for a Microbiological Laboratory

NASA Technical Reports Server (NTRS)

Law, Jennifer; Kirschner, Larry

2007-01-01

A bar-code system has been assembled for a microbiological laboratory that must examine a large number of samples. The system includes a commercial bar-code reader, computer hardware and software components, plus custom-designed database software. The software generates a user-friendly, menu-driven interface.
Transport and equilibrium in field-reversed mirrors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boyd, J.K.

Two plasma models relevant to compact torus research have been developed to study transport and equilibrium in field reversed mirrors. In the first model for small Larmor radius and large collision frequency, the plasma is described as an adiabatic hydromagnetic fluid. In the second model for large Larmor radius and small collision frequency, a kinetic theory description has been developed. Various aspects of the two models have been studied in five computer codes ADB, AV, NEO, OHK, RES. The ADB code computes two dimensional equilibrium and one dimensional transport in a flux coordinate. The AV code calculates orbit average integralsmore » in a harmonic oscillator potential. The NEO code follows particle trajectories in a Hill's vortex magnetic field to study stochasticity, invariants of the motion, and orbit average formulas. The OHK code displays analytic psi(r), B/sub Z/(r), phi(r), E/sub r/(r) formulas developed for the kinetic theory description. The RES code calculates resonance curves to consider overlap regions relevant to stochastic orbit behavior.« less
Efficient computation of kinship and identity coefficients on large pedigrees.

PubMed

Cheng, En; Elliott, Brendan; Ozsoyoglu, Z Meral

2009-06-01

With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computing a set of generalized kinship coefficients. In this paper, we first propose a novel Path-Counting Formula for calculating generalized kinship coefficients, which is motivated by Wright's path-counting method for computing inbreeding coefficient. We then present an efficient and scalable scheme for calculating generalized kinship coefficients on large pedigrees using NodeCodes, a special encoding scheme for expediting the evaluation of queries on pedigree graph structures. Furthermore, we propose an improved scheme using Family NodeCodes for the computation of generalized kinship coefficients, which is motivated by the significant improvement of using Family NodeCodes for inbreeding coefficient over the use of NodeCodes. We also perform experiments for evaluating the efficiency of our method, and compare it with the performance of the traditional recursive algorithm for three individuals. Experimental results demonstrate that the resulting scheme is more scalable and efficient than the traditional recursive methods for computing generalized kinship coefficients.
Application of augmented-Lagrangian methods in meteorology: Comparison of different conjugate-gradient codes for large-scale minimization

NASA Technical Reports Server (NTRS)

Navon, I. M.

1984-01-01

A Lagrange multiplier method using techniques developed by Bertsekas (1982) was applied to solving the problem of enforcing simultaneous conservation of the nonlinear integral invariants of the shallow water equations on a limited area domain. This application of nonlinear constrained optimization is of the large dimensional type and the conjugate gradient method was found to be the only computationally viable method for the unconstrained minimization. Several conjugate-gradient codes were tested and compared for increasing accuracy requirements. Robustness and computational efficiency were the principal criteria.
Validation of hydrogen gas stratification and mixing models

DOE PAGES

Wu, Hsingtzu; Zhao, Haihua

2015-05-26

Two validation benchmarks confirm that the BMIX++ code is capable of simulating unintended hydrogen release scenarios efficiently. The BMIX++ (UC Berkeley mechanistic MIXing code in C++) code has been developed to accurately and efficiently predict the fluid mixture distribution and heat transfer in large stratified enclosures for accident analyses and design optimizations. The BMIX++ code uses a scaling based one-dimensional method to achieve large reduction in computational effort compared to a 3-D computational fluid dynamics (CFD) simulation. Two BMIX++ benchmark models have been developed. One is for a single buoyant jet in an open space and another is for amore » large sealed enclosure with both a jet source and a vent near the floor. Both of them have been validated by comparisons with experimental data. Excellent agreements are observed. The entrainment coefficients of 0.09 and 0.08 are found to fit the experimental data for hydrogen leaks with the Froude number of 99 and 268 best, respectively. In addition, the BIX++ simulation results of the average helium concentration for an enclosure with a vent and a single jet agree with the experimental data within a margin of about 10% for jet flow rates ranging from 1.21 × 10⁻⁴ to 3.29 × 10⁻⁴ m³/s. In conclusion, computing time for each BMIX++ model with a normal desktop computer is less than 5 min.« less
The development of an intelligent interface to a computational fluid dynamics flow-solver code

NASA Technical Reports Server (NTRS)

Williams, Anthony D.

1988-01-01

Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, 3-D, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.
The development of an intelligent interface to a computational fluid dynamics flow-solver code

NASA Technical Reports Server (NTRS)

Williams, Anthony D.

1988-01-01

Researchers at NASA Lewis are currently developing an 'intelligent' interface to aid in the development and use of large, computational fluid dynamics flow-solver codes for studying the internal fluid behavior of aerospace propulsion systems. This paper discusses the requirements, design, and implementation of an intelligent interface to Proteus, a general purpose, three-dimensional, Navier-Stokes flow solver. The interface is called PROTAIS to denote its introduction of artificial intelligence (AI) concepts to the Proteus code.
Comparison of computer codes for calculating dynamic loads in wind turbines

NASA Technical Reports Server (NTRS)

Spera, D. A.

1977-01-01

Seven computer codes for analyzing performance and loads in large, horizontal axis wind turbines were used to calculate blade bending moment loads for two operational conditions of the 100 kW Mod-0 wind turbine. Results were compared with test data on the basis of cyclic loads, peak loads, and harmonic contents. Four of the seven codes include rotor-tower interaction and three were limited to rotor analysis. With a few exceptions, all calculated loads were within 25 percent of nominal test data.
pycola: N-body COLA method code

NASA Astrophysics Data System (ADS)

Tassev, Svetlin; Eisenstein, Daniel J.; Wandelt, Benjamin D.; Zaldarriagag, Matias

2015-09-01

pycola is a multithreaded Python/Cython N-body code, implementing the Comoving Lagrangian Acceleration (COLA) method in the temporal and spatial domains, which trades accuracy at small-scales to gain computational speed without sacrificing accuracy at large scales. This is especially useful for cheaply generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing. The COLA method achieves its speed by calculating the large-scale dynamics exactly using LPT while letting the N-body code solve for the small scales, without requiring it to capture exactly the internal dynamics of halos.
Quantum error correction in crossbar architectures

NASA Astrophysics Data System (ADS)

Helsen, Jonas; Steudtner, Mark; Veldhorst, Menno; Wehner, Stephanie

2018-07-01

A central challenge for the scaling of quantum computing systems is the need to control all qubits in the system without a large overhead. A solution for this problem in classical computing comes in the form of so-called crossbar architectures. Recently we made a proposal for a large-scale quantum processor (Li et al arXiv:1711.03807 (2017)) to be implemented in silicon quantum dots. This system features a crossbar control architecture which limits parallel single-qubit control, but allows the scheme to overcome control scaling issues that form a major hurdle to large-scale quantum computing systems. In this work, we develop a language that makes it possible to easily map quantum circuits to crossbar systems, taking into account their architecture and control limitations. Using this language we show how to map well known quantum error correction codes such as the planar surface and color codes in this limited control setting with only a small overhead in time. We analyze the logical error behavior of this surface code mapping for estimated experimental parameters of the crossbar system and conclude that logical error suppression to a level useful for real quantum computation is feasible.
Raptor: An Enterprise Knowledge Discovery Engine Version 2.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

2011-08-31

The Raptor Version 2.0 computer code uses a set of documents as seed documents to recommend documents of interest from a large, target set of documents. The computer code provides results that show the recommended documents with the highest similarity to the seed documents. Version 2.0 was specifically developed to work with SharePoint 2007 and MS SQL server.
BRYNTRN: A baryon transport model

NASA Technical Reports Server (NTRS)

Wilson, John W.; Townsend, Lawrence W.; Nealy, John E.; Chun, Sang Y.; Hong, B. S.; Buck, Warren W.; Lamkin, S. L.; Ganapol, Barry D.; Khan, Ferdous; Cucinotta, Francis A.

1989-01-01

The development of an interaction data base and a numerical solution to the transport of baryons through an arbitrary shield material based on a straight ahead approximation of the Boltzmann equation are described. The code is most accurate for continuous energy boundary values, but gives reasonable results for discrete spectra at the boundary using even a relatively coarse energy grid (30 points) and large spatial increments (1 cm in H2O). The resulting computer code is self-contained, efficient and ready to use. The code requires only a very small fraction of the computer resources required for Monte Carlo codes.

Linear chirp phase perturbing approach for finding binary phased codes

NASA Astrophysics Data System (ADS)

Li, Bing C.

2017-05-01

Binary phased codes have many applications in communication and radar systems. These applications require binary phased codes to have low sidelobes in order to reduce interferences and false detection. Barker codes are the ones that satisfy these requirements and they have lowest maximum sidelobes. However, Barker codes have very limited code lengths (equal or less than 13) while many applications including low probability of intercept radar, and spread spectrum communication, require much higher code lengths. The conventional techniques of finding binary phased codes in literatures include exhaust search, neural network, and evolutionary methods, and they all require very expensive computation for large code lengths. Therefore these techniques are limited to find binary phased codes with small code lengths (less than 100). In this paper, by analyzing Barker code, linear chirp, and P3 phases, we propose a new approach to find binary codes. Experiments show that the proposed method is able to find long low sidelobe binary phased codes (code length >500) with reasonable computational cost.
Computerized Dental Comparison: A Critical Review of Dental Coding and Ranking Algorithms Used in Victim Identification.

PubMed

Adams, Bradley J; Aschheim, Kenneth W

2016-01-01

Comparison of antemortem and postmortem dental records is a leading method of victim identification, especially for incidents involving a large number of decedents. This process may be expedited with computer software that provides a ranked list of best possible matches. This study provides a comparison of the most commonly used conventional coding and sorting algorithms used in the United States (WinID3) with a simplified coding format that utilizes an optimized sorting algorithm. The simplified system consists of seven basic codes and utilizes an optimized algorithm based largely on the percentage of matches. To perform this research, a large reference database of approximately 50,000 antemortem and postmortem records was created. For most disaster scenarios, the proposed simplified codes, paired with the optimized algorithm, performed better than WinID3 which uses more complex codes. The detailed coding system does show better performance with extremely large numbers of records and/or significant body fragmentation. © 2015 American Academy of Forensic Sciences.
Hypercube matrix computation task

NASA Technical Reports Server (NTRS)

Calalo, R.; Imbriale, W.; Liewer, P.; Lyons, J.; Manshadi, F.; Patterson, J.

1987-01-01

The Hypercube Matrix Computation (Year 1986-1987) task investigated the applicability of a parallel computing architecture to the solution of large scale electromagnetic scattering problems. Two existing electromagnetic scattering codes were selected for conversion to the Mark III Hypercube concurrent computing environment. They were selected so that the underlying numerical algorithms utilized would be different thereby providing a more thorough evaluation of the appropriateness of the parallel environment for these types of problems. The first code was a frequency domain method of moments solution, NEC-2, developed at Lawrence Livermore National Laboratory. The second code was a time domain finite difference solution of Maxwell's equations to solve for the scattered fields. Once the codes were implemented on the hypercube and verified to obtain correct solutions by comparing the results with those from sequential runs, several measures were used to evaluate the performance of the two codes. First, a comparison was provided of the problem size possible on the hypercube with 128 megabytes of memory for a 32-node configuration with that available in a typical sequential user environment of 4 to 8 megabytes. Then, the performance of the codes was anlyzed for the computational speedup attained by the parallel architecture.
Calculation of water drop trajectories to and about arbitrary three-dimensional lifting and nonlifting bodies in potential airflow

NASA Technical Reports Server (NTRS)

Norment, H. G.

1985-01-01

Subsonic, external flow about nonlifting bodies, lifting bodies or combinations of lifting and nonlifting bodies is calculated by a modified version of the Hess lifting code. Trajectory calculations can be performed for any atmospheric conditions and for all water drop sizes, from the smallest cloud droplet to large raindrops. Experimental water drop drag relations are used in the water drop equations of motion and effects of gravity settling are included. Inlet flow can be accommodated, and high Mach number compressibility effects are corrected for approximately. Seven codes are described: (1) a code used to debug and plot body surface description data; (2) a code that processes the body surface data to yield the potential flow field; (3) a code that computes flow velocities at arrays of points in space; (4) a code that computes water drop trajectories from an array of points in space; (5) a code that computes water drop trajectories and fluxes to arbitrary target points; (6) a code that computes water drop trajectories tangent to the body; and (7) a code that produces stereo pair plots which include both the body and trajectories. Accuracy of the calculations is discussed, and trajectory calculation results are compared with prior calculations and with experimental data.
COLAcode: COmoving Lagrangian Acceleration code

NASA Astrophysics Data System (ADS)

Tassev, Svetlin V.

2016-02-01

COLAcode is a serial particle mesh-based N-body code illustrating the COLA (COmoving Lagrangian Acceleration) method; it solves for Large Scale Structure (LSS) in a frame that is comoving with observers following trajectories calculated in Lagrangian Perturbation Theory (LPT). It differs from standard N-body code by trading accuracy at small-scales to gain computational speed without sacrificing accuracy at large scales. This is useful for generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing; such catalogs are needed to perform detailed error analysis for ongoing and future surveys of LSS.
Evaluation of the efficiency and fault density of software generated by code generators

NASA Technical Reports Server (NTRS)

Schreur, Barbara

1993-01-01

Flight computers and flight software are used for GN&C (guidance, navigation, and control), engine controllers, and avionics during missions. The software development requires the generation of a considerable amount of code. The engineers who generate the code make mistakes and the generation of a large body of code with high reliability requires considerable time. Computer-aided software engineering (CASE) tools are available which generates code automatically with inputs through graphical interfaces. These tools are referred to as code generators. In theory, code generators could write highly reliable code quickly and inexpensively. The various code generators offer different levels of reliability checking. Some check only the finished product while some allow checking of individual modules and combined sets of modules as well. Considering NASA's requirement for reliability, an in house manually generated code is needed. Furthermore, automatically generated code is reputed to be as efficient as the best manually generated code when executed. In house verification is warranted.
Final report for the Tera Computer TTI CRADA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Davidson, G.S.; Pavlakos, C.; Silva, C.

1997-01-01

Tera Computer and Sandia National Laboratories have completed a CRADA, which examined the Tera Multi-Threaded Architecture (MTA) for use with large codes of importance to industry and DOE. The MTA is an innovative architecture that uses parallelism to mask latency between memories and processors. The physical implementation is a parallel computer with high cross-section bandwidth and GaAs processors designed by Tera, which support many small computation threads and fast, lightweight context switches between them. When any thread blocks while waiting for memory accesses to complete, another thread immediately begins execution so that high CPU utilization is maintained. The Tera MTAmore » parallel computer has a single, global address space, which is appealing when porting existing applications to a parallel computer. This ease of porting is further enabled by compiler technology that helps break computations into parallel threads. DOE and Sandia National Laboratories were interested in working with Tera to further develop this computing concept. While Tera Computer would continue the hardware development and compiler research, Sandia National Laboratories would work with Tera to ensure that their compilers worked well with important Sandia codes, most particularly CTH, a shock physics code used for weapon safety computations. In addition to that important code, Sandia National Laboratories would complete research on a robotic path planning code, SANDROS, which is important in manufacturing applications, and would evaluate the MTA performance on this code. Finally, Sandia would work directly with Tera to develop 3D visualization codes, which would be appropriate for use with the MTA. Each of these tasks has been completed to the extent possible, given that Tera has just completed the MTA hardware. All of the CRADA work had to be done on simulators.« less
Solutions of the Taylor-Green Vortex Problem Using High-Resolution Explicit Finite Difference Methods

NASA Technical Reports Server (NTRS)

DeBonis, James R.

2013-01-01

A computational fluid dynamics code that solves the compressible Navier-Stokes equations was applied to the Taylor-Green vortex problem to examine the code s ability to accurately simulate the vortex decay and subsequent turbulence. The code, WRLES (Wave Resolving Large-Eddy Simulation), uses explicit central-differencing to compute the spatial derivatives and explicit Low Dispersion Runge-Kutta methods for the temporal discretization. The flow was first studied and characterized using Bogey & Bailley s 13-point dispersion relation preserving (DRP) scheme. The kinetic energy dissipation rate, computed both directly and from the enstrophy field, vorticity contours, and the energy spectra are examined. Results are in excellent agreement with a reference solution obtained using a spectral method and provide insight into computations of turbulent flows. In addition the following studies were performed: a comparison of 4th-, 8th-, 12th- and DRP spatial differencing schemes, the effect of the solution filtering on the results, the effect of large-eddy simulation sub-grid scale models, and the effect of high-order discretization of the viscous terms.
ACON: a multipurpose production controller for plasma physics codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Snell, C.

1983-01-01

ACON is a BCON controller designed to run large production codes on the CTSS Cray-1 or the LTSS 7600 computers. ACON can also be operated interactively, with input from the user's terminal. The controller can run one code or a sequence of up to ten codes during the same job. Options are available to get and save Mass storage files, to perform Historian file updating operations, to compile and load source files, and to send out print and film files. Special features include ability to retry after Mass failures, backup options for saving files, startup messages for the various codes,more » and ability to reserve specified amounts of computer time after successive code runs. ACON's flexibility and power make it useful for running a number of different production codes.« less
Los Alamos radiation transport code system on desktop computing platforms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Briesmeister, J.F.; Brinkley, F.W.; Clark, B.A.

The Los Alamos Radiation Transport Code System (LARTCS) consists of state-of-the-art Monte Carlo and discrete ordinates transport codes and data libraries. These codes were originally developed many years ago and have undergone continual improvement. With a large initial effort and continued vigilance, the codes are easily portable from one type of hardware to another. The performance of scientific work-stations (SWS) has evolved to the point that such platforms can be used routinely to perform sophisticated radiation transport calculations. As the personal computer (PC) performance approaches that of the SWS, the hardware options for desk-top radiation transport calculations expands considerably. Themore » current status of the radiation transport codes within the LARTCS is described: MCNP, SABRINA, LAHET, ONEDANT, TWODANT, TWOHEX, and ONELD. Specifically, the authors discuss hardware systems on which the codes run and present code performance comparisons for various machines.« less
Technology Infusion of CodeSonar into the Space Network Ground Segment

NASA Technical Reports Server (NTRS)

Benson, Markland J.

2009-01-01

This slide presentation reviews the applicability of CodeSonar to the Space Network software. CodeSonar is a commercial off the shelf system that analyzes programs written in C, C++ or Ada for defects in the code. Software engineers use CodeSonar results as an input to the existing source code inspection process. The study is focused on large scale software developed using formal processes. The systems studied are mission critical in nature but some use commodity computer systems.
CFD Based Computations of Flexible Helicopter Blades for Stability Analysis

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.

2011-01-01

As a collaborative effort among government aerospace research laboratories an advanced version of a widely used computational fluid dynamics code, OVERFLOW, was recently released. This latest version includes additions to model flexible rotating multiple blades. In this paper, the OVERFLOW code is applied to improve the accuracy of airload computations from the linear lifting line theory that uses displacements from beam model. Data transfers required at every revolution are managed through a Unix based script that runs jobs on large super-cluster computers. Results are demonstrated for the 4-bladed UH-60A helicopter. Deviations of computed data from flight data are evaluated. Fourier analysis post-processing that is suitable for aeroelastic stability computations are performed.
Inclusion of pressure and flow in a new 3D MHD equilibrium code

NASA Astrophysics Data System (ADS)

Raburn, Daniel; Fukuyama, Atsushi

2012-10-01

Flow and nonsymmetric effects can play a large role in plasma equilibria and energy confinement. A concept for such a 3D equilibrium code was developed and presented in 2011. The code is called the Kyoto ITerative Equilibrium Solver (KITES) [1], and the concept is based largely on the PIES code [2]. More recently, the work-in-progress KITES code was used to calculate force-free equilibria. Here, progress and results on the inclusion of pressure and flow in the code are presented. [4pt] [1] Daniel Raburn and Atsushi Fukuyama, Plasma and Fusion Research: Regular Articles, 7:240381 (2012).[0pt] [2] H. S. Greenside, A. H. Reiman, and A. Salas, J. Comput. Phys, 81(1):102-136 (1989).
Overview of the NASA Glenn Flux Reconstruction Based High-Order Unstructured Grid Code

NASA Technical Reports Server (NTRS)

Spiegel, Seth C.; DeBonis, James R.; Huynh, H. T.

2016-01-01

A computational fluid dynamics code based on the flux reconstruction (FR) method is currently being developed at NASA Glenn Research Center to ultimately provide a large- eddy simulation capability that is both accurate and efficient for complex aeropropulsion flows. The FR approach offers a simple and efficient method that is easy to implement and accurate to an arbitrary order on common grid cell geometries. The governing compressible Navier-Stokes equations are discretized in time using various explicit Runge-Kutta schemes, with the default being the 3-stage/3rd-order strong stability preserving scheme. The code is written in modern Fortran (i.e., Fortran 2008) and parallelization is attained through MPI for execution on distributed-memory high-performance computing systems. An h- refinement study of the isentropic Euler vortex problem is able to empirically demonstrate the capability of the FR method to achieve super-accuracy for inviscid flows. Additionally, the code is applied to the Taylor-Green vortex problem, performing numerous implicit large-eddy simulations across a range of grid resolutions and solution orders. The solution found by a pseudo-spectral code is commonly used as a reference solution to this problem, and the FR code is able to reproduce this solution using approximately the same grid resolution. Finally, an examination of the code's performance demonstrates good parallel scaling, as well as an implementation of the FR method with a computational cost/degree- of-freedom/time-step that is essentially independent of the solution order of accuracy for structured geometries.
Fast computation of quadrupole and hexadecapole approximations in microlensing with a single point-source evaluation

NASA Astrophysics Data System (ADS)

Cassan, Arnaud

2017-07-01

The exoplanet detection rate from gravitational microlensing has grown significantly in recent years thanks to a great enhancement of resources and improved observational strategy. Current observatories include ground-based wide-field and/or robotic world-wide networks of telescopes, as well as space-based observatories such as satellites Spitzer or Kepler/K2. This results in a large quantity of data to be processed and analysed, which is a challenge for modelling codes because of the complexity of the parameter space to be explored and the intensive computations required to evaluate the models. In this work, I present a method that allows to compute the quadrupole and hexadecapole approximations of the finite-source magnification with more efficiency than previously available codes, with routines about six times and four times faster, respectively. The quadrupole takes just about twice the time of a point-source evaluation, which advocates for generalizing its use to large portions of the light curves. The corresponding routines are available as open-source python codes.
Efficient full wave code for the coupling of large multirow multijunction LH grills

NASA Astrophysics Data System (ADS)

Preinhaelter, Josef; Hillairet, Julien; Milanesio, Daniele; Maggiora, Riccardo; Urban, Jakub; Vahala, Linda; Vahala, George

2017-11-01

The full wave code OLGA, for determining the coupling of a single row lower hybrid launcher (waveguide grills) to the plasma, is extended to handle multirow multijunction active passive structures (like the C3 and C4 launchers on TORE SUPRA) by implementing the scattering matrix formalism. The extended code is still computationally fast because of the use of (i) 2D splines of the plasma surface admittance in the accessibility region of the k-space, (ii) high order Gaussian quadrature rules for the integration of the coupling elements and (iii) utilizing the symmetries of the coupling elements in the multiperiodic structures. The extended OLGA code is benchmarked against the ALOHA-1D, ALOHA-2D and TOPLHA codes for the coupling of the C3 and C4 TORE SUPRA launchers for several plasma configurations derived from reflectometry and interferometery. Unlike nearly all codes (except the ALOHA-1D code), OLGA does not require large computational resources and can be used for everyday usage in planning experimental runs. In particular, it is shown that the OLGA code correctly handles the coupling of the C3 and C4 launchers over a very wide range of plasma densities in front of the grill.
A Computational Study of an Oscillating VR-12 Airfoil with a Gurney Flap

NASA Technical Reports Server (NTRS)

Rhee, Myung

2004-01-01

Computations of the flow over an oscillating airfoil with a Gurney-flap are performed using a Reynolds Averaged Navier-Stokes code and compared with recent experimental data. The experimental results have been generated for different sizes of the Gurney flaps. The computations are focused mainly on a configuration. The baseline airfoil without a Gurney flap is computed and compared with the experiments in both steady and unsteady cases for the purpose of initial testing of the code performance. The are carried out with different turbulence models. Effects of the grid refinement are also examined and unsteady cases, in addition to the assessment of solver effects. The results of the comparisons of steady lift and drag computations indicate that the code is reasonably accurate in the attached flow of the steady condition but largely overpredicts the lift and underpredicts the drag in the higher angle steady flow.
Large-Scale Computation of Nuclear Magnetic Resonance Shifts for Paramagnetic Solids Using CP2K.

PubMed

Mondal, Arobendo; Gaultois, Michael W; Pell, Andrew J; Iannuzzi, Marcella; Grey, Clare P; Hutter, Jürg; Kaupp, Martin

2018-01-09

Large-scale computations of nuclear magnetic resonance (NMR) shifts for extended paramagnetic solids (pNMR) are reported using the highly efficient Gaussian-augmented plane-wave implementation of the CP2K code. Combining hyperfine couplings obtained with hybrid functionals with g-tensors and orbital shieldings computed using gradient-corrected functionals, contact, pseudocontact, and orbital-shift contributions to pNMR shifts are accessible. Due to the efficient and highly parallel performance of CP2K, a wide variety of materials with large unit cells can be studied with extended Gaussian basis sets. Validation of various approaches for the different contributions to pNMR shifts is done first for molecules in a large supercell in comparison with typical quantum-chemical codes. This is then extended to a detailed study of g-tensors for extended solid transition-metal fluorides and for a series of complex lithium vanadium phosphates. Finally, lithium pNMR shifts are computed for Li 3 V 2 (PO 4 ) 3 , for which detailed experimental data are available. This has allowed an in-depth study of different approaches (e.g., full periodic versus incremental cluster computations of g-tensors and different functionals and basis sets for hyperfine computations) as well as a thorough analysis of the different contributions to the pNMR shifts. This study paves the way for a more-widespread computational treatment of NMR shifts for paramagnetic materials.
Transferring ecosystem simulation codes to supercomputers

NASA Technical Reports Server (NTRS)

Skiles, J. W.; Schulbach, C. H.

1995-01-01

Many ecosystem simulation computer codes have been developed in the last twenty-five years. This development took place initially on main-frame computers, then mini-computers, and more recently, on micro-computers and workstations. Supercomputing platforms (both parallel and distributed systems) have been largely unused, however, because of the perceived difficulty in accessing and using the machines. Also, significant differences in the system architectures of sequential, scalar computers and parallel and/or vector supercomputers must be considered. We have transferred a grassland simulation model (developed on a VAX) to a Cray Y-MP/C90. We describe porting the model to the Cray and the changes we made to exploit the parallelism in the application and improve code execution. The Cray executed the model 30 times faster than the VAX and 10 times faster than a Unix workstation. We achieved an additional speedup of 30 percent by using the compiler's vectoring and 'in-line' capabilities. The code runs at only about 5 percent of the Cray's peak speed because it ineffectively uses the vector and parallel processing capabilities of the Cray. We expect that by restructuring the code, it could execute an additional six to ten times faster.
Concurrent electromagnetic scattering analysis

NASA Technical Reports Server (NTRS)

Patterson, Jean E.; Cwik, Tom; Ferraro, Robert D.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Parker, Jay

1989-01-01

The computational power of the hypercube parallel computing architecture is applied to the solution of large-scale electromagnetic scattering and radiation problems. Three analysis codes have been implemented. A Hypercube Electromagnetic Interactive Analysis Workstation was developed to aid in the design and analysis of metallic structures such as antennas and to facilitate the use of these analysis codes. The workstation provides a general user environment for specification of the structure to be analyzed and graphical representations of the results.

BRYNTRN: A baryon transport computer code, computation procedures and data base

NASA Technical Reports Server (NTRS)

Wilson, John W.; Townsend, Lawrence W.; Chun, Sang Y.; Buck, Warren W.; Khan, Ferdous; Cucinotta, Frank

1988-01-01

The development is described of an interaction data base and a numerical solution to the transport of baryons through the arbitrary shield material based on a straight ahead approximation of the Boltzmann equation. The code is most accurate for continuous energy boundary values but gives reasonable results for discrete spectra at the boundary with even a relatively coarse energy grid (30 points) and large spatial increments (1 cm in H2O).
Visual analysis of inter-process communication for large-scale parallel computing.

PubMed

Muelder, Chris; Gygi, Francois; Ma, Kwan-Liu

2009-01-01

In serial computation, program profiling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also communication between the different processes which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more efficient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt char t with a row for each process. As the number of processes increases, these charts can become difficult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.
CAVE: A computer code for two-dimensional transient heating analysis of conceptual thermal protection systems for hypersonic vehicles

NASA Technical Reports Server (NTRS)

Rathjen, K. A.

1977-01-01

A digital computer code CAVE (Conduction Analysis Via Eigenvalues), which finds application in the analysis of two dimensional transient heating of hypersonic vehicles is described. The CAVE is written in FORTRAN 4 and is operational on both IBM 360-67 and CDC 6600 computers. The method of solution is a hybrid analytical numerical technique that is inherently stable permitting large time steps even with the best of conductors having the finest of mesh size. The aerodynamic heating boundary conditions are calculated by the code based on the input flight trajectory or can optionally be calculated external to the code and then entered as input data. The code computes the network conduction and convection links, as well as capacitance values, given basic geometrical and mesh sizes, for four generations (leading edges, cooled panels, X-24C structure and slabs). Input and output formats are presented and explained. Sample problems are included. A brief summary of the hybrid analytical-numerical technique, which utilizes eigenvalues (thermal frequencies) and eigenvectors (thermal mode vectors) is given along with aerodynamic heating equations that have been incorporated in the code and flow charts.
A new Fortran 90 program to compute regular and irregular associated Legendre functions (new version announcement)

NASA Astrophysics Data System (ADS)

Schneider, Barry I.; Segura, Javier; Gil, Amparo; Guan, Xiaoxu; Bartschat, Klaus

2018-04-01

This is a revised and updated version of a modern Fortran 90 code to compute the regular Plm (x) and irregular Qlm (x) associated Legendre functions for all x ∈(- 1 , + 1) (on the cut) and | x | > 1 and integer degree (l) and order (m). The necessity to revise the code comes as a consequence of some comments of Prof. James Bremer of the UC//Davis Mathematics Department, who discovered that there were errors in the code for large integer degree and order for the normalized regular Legendre functions on the cut.
Large eddy simulation of fine water sprays: comparative analysis of two models and computer codes

NASA Astrophysics Data System (ADS)

Tsoy, A. S.; Snegirev, A. Yu.

2015-09-01

The model and the computer code FDS, albeit widely used in engineering practice to predict fire development, is not sufficiently validated for fire suppression by fine water sprays. In this work, the effect of numerical resolution of the large scale turbulent pulsations on the accuracy of predicted time-averaged spray parameters is evaluated. Comparison of the simulation results obtained with the two versions of the model and code, as well as that of the predicted and measured radial distributions of the liquid flow rate revealed the need to apply monotonic and yet sufficiently accurate discrete approximations of the convective terms. Failure to do so delays jet break-up, otherwise induced by large turbulent eddies, thereby excessively focuses the predicted flow around its axis. The effect of the pressure drop in the spray nozzle is also examined, and its increase has shown to cause only weak increase of the evaporated fraction and vapor concentration despite the significant increase of flow velocity.
Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

NASA Astrophysics Data System (ADS)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited by the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared with the performance using benchmark software and the metric was FLoting-point Operations Per Seconds (FLOPS) which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore system? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPs and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
RAY-RAMSES: a code for ray tracing on the fly in N-body simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barreira, Alexandre; Llinares, Claudio; Bose, Sownak

2016-05-01

We present a ray tracing code to compute integrated cosmological observables on the fly in AMR N-body simulations. Unlike conventional ray tracing techniques, our code takes full advantage of the time and spatial resolution attained by the N-body simulation by computing the integrals along the line of sight on a cell-by-cell basis through the AMR simulation grid. Moroever, since it runs on the fly in the N-body run, our code can produce maps of the desired observables without storing large (or any) amounts of data for post-processing. We implemented our routines in the RAMSES N-body code and tested the implementationmore » using an example of weak lensing simulation. We analyse basic statistics of lensing convergence maps and find good agreement with semi-analytical methods. The ray tracing methodology presented here can be used in several cosmological analysis such as Sunyaev-Zel'dovich and integrated Sachs-Wolfe effect studies as well as modified gravity. Our code can also be used in cross-checks of the more conventional methods, which can be important in tests of theory systematics in preparation for upcoming large scale structure surveys.« less
A proposed study of multiple scattering through clouds up to 1 THz

NASA Technical Reports Server (NTRS)

Gerace, G. C.; Smith, E. K.

1992-01-01

A rigorous computation of the electromagnetic field scattered from an atmospheric liquid water cloud is proposed. The recent development of a fast recursive algorithm (Chew algorithm) for computing the fields scattered from numerous scatterers now makes a rigorous computation feasible. A method is presented for adapting this algorithm to a general case where there are an extremely large number of scatterers. It is also proposed to extend a new binary PAM channel coding technique (El-Khamy coding) to multiple levels with non-square pulse shapes. The Chew algorithm can be used to compute the transfer function of a cloud channel. Then the transfer function can be used to design an optimum El-Khamy code. In principle, these concepts can be applied directly to the realistic case of a time-varying cloud (adaptive channel coding and adaptive equalization). A brief review is included of some preliminary work on cloud dispersive effects on digital communication signals and on cloud liquid water spectra and correlations.
A performance comparison of the Cray-2 and the Cray X-MP

NASA Technical Reports Server (NTRS)

Schmickley, Ronald; Bailey, David H.

1986-01-01

A suite of thirteen large Fortran benchmark codes were run on Cray-2 and Cray X-MP supercomputers. These codes were a mix of compute-intensive scientific application programs (mostly Computational Fluid Dynamics) and some special vectorized computation exercise programs. For the general class of programs tested on the Cray-2, most of which were not specially tuned for speed, the floating point operation rates varied under a variety of system load configurations from 40 percent up to 125 percent of X-MP performance rates. It is concluded that the Cray-2, in the original system configuration studied (without memory pseudo-banking) will run untuned Fortran code, on average, about 70 percent of X-MP speeds.
Observations on computational methodologies for use in large-scale, gradient-based, multidisciplinary design incorporating advanced CFD codes

NASA Technical Reports Server (NTRS)

Newman, P. A.; Hou, G. J.-W.; Jones, H. E.; Taylor, A. C., III; Korivi, V. M.

1992-01-01

How a combination of various computational methodologies could reduce the enormous computational costs envisioned in using advanced CFD codes in gradient based optimized multidisciplinary design (MdD) procedures is briefly outlined. Implications of these MdD requirements upon advanced CFD codes are somewhat different than those imposed by a single discipline design. A means for satisfying these MdD requirements for gradient information is presented which appear to permit: (1) some leeway in the CFD solution algorithms which can be used; (2) an extension to 3-D problems; and (3) straightforward use of other computational methodologies. Many of these observations have previously been discussed as possibilities for doing parts of the problem more efficiently; the contribution here is observing how they fit together in a mutually beneficial way.
Pretest aerosol code comparisons for LWR aerosol containment tests LA1 and LA2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wright, A.L.; Wilson, J.H.; Arwood, P.C.

The Light-Water-Reactor (LWR) Aerosol Containment Experiments (LACE) are being performed in Richland, Washington, at the Hanford Engineering Development Laboratory (HEDL) under the leadership of an international project board and the Electric Power Research Institute. These tests have two objectives: (1) to investigate, at large scale, the inherent aerosol retention behavior in LWR containments under simulated severe accident conditions, and (2) to provide an experimental data base for validating aerosol behavior and thermal-hydraulic computer codes. Aerosol computer-code comparison activities are being coordinated at the Oak Ridge National Laboratory. For each of the six LACE tests, ''pretest'' calculations (for code-to-code comparisons) andmore » ''posttest'' calculations (for code-to-test data comparisons) are being performed. The overall goals of the comparison effort are (1) to provide code users with experience in applying their codes to LWR accident-sequence conditions and (2) to evaluate and improve the code models.« less
High-Threshold Fault-Tolerant Quantum Computation with Analog Quantum Error Correction

NASA Astrophysics Data System (ADS)

Fukui, Kosuke; Tomita, Akihisa; Okamoto, Atsushi; Fujii, Keisuke

2018-04-01

To implement fault-tolerant quantum computation with continuous variables, the Gottesman-Kitaev-Preskill (GKP) qubit has been recognized as an important technological element. However, it is still challenging to experimentally generate the GKP qubit with the required squeezing level, 14.8 dB, of the existing fault-tolerant quantum computation. To reduce this requirement, we propose a high-threshold fault-tolerant quantum computation with GKP qubits using topologically protected measurement-based quantum computation with the surface code. By harnessing analog information contained in the GKP qubits, we apply analog quantum error correction to the surface code. Furthermore, we develop a method to prevent the squeezing level from decreasing during the construction of the large-scale cluster states for the topologically protected, measurement-based, quantum computation. We numerically show that the required squeezing level can be relaxed to less than 10 dB, which is within the reach of the current experimental technology. Hence, this work can considerably alleviate this experimental requirement and take a step closer to the realization of large-scale quantum computation.
Development of the US3D Code for Advanced Compressible and Reacting Flow Simulations

NASA Technical Reports Server (NTRS)

Candler, Graham V.; Johnson, Heath B.; Nompelis, Ioannis; Subbareddy, Pramod K.; Drayna, Travis W.; Gidzak, Vladimyr; Barnhardt, Michael D.

2015-01-01

Aerothermodynamics and hypersonic flows involve complex multi-disciplinary physics, including finite-rate gas-phase kinetics, finite-rate internal energy relaxation, gas-surface interactions with finite-rate oxidation and sublimation, transition to turbulence, large-scale unsteadiness, shock-boundary layer interactions, fluid-structure interactions, and thermal protection system ablation and thermal response. Many of the flows have a large range of length and time scales, requiring large computational grids, implicit time integration, and large solution run times. The University of Minnesota NASA US3D code was designed for the simulation of these complex, highly-coupled flows. It has many of the features of the well-established DPLR code, but uses unstructured grids and has many advanced numerical capabilities and physical models for multi-physics problems. The main capabilities of the code are described, the physical modeling approaches are discussed, the different types of numerical flux functions and time integration approaches are outlined, and the parallelization strategy is overviewed. Comparisons between US3D and the NASA DPLR code are presented, and several advanced simulations are presented to illustrate some of novel features of the code.
Computation of Sensitivity Derivatives of Navier-Stokes Equations using Complex Variables

NASA Technical Reports Server (NTRS)

Vatsa, Veer N.

2004-01-01

Accurate computation of sensitivity derivatives is becoming an important item in Computational Fluid Dynamics (CFD) because of recent emphasis on using nonlinear CFD methods in aerodynamic design, optimization, stability and control related problems. Several techniques are available to compute gradients or sensitivity derivatives of desired flow quantities or cost functions with respect to selected independent (design) variables. Perhaps the most common and oldest method is to use straightforward finite-differences for the evaluation of sensitivity derivatives. Although very simple, this method is prone to errors associated with choice of step sizes and can be cumbersome for geometric variables. The cost per design variable for computing sensitivity derivatives with central differencing is at least equal to the cost of three full analyses, but is usually much larger in practice due to difficulty in choosing step sizes. Another approach gaining popularity is the use of Automatic Differentiation software (such as ADIFOR) to process the source code, which in turn can be used to evaluate the sensitivity derivatives of preselected functions with respect to chosen design variables. In principle, this approach is also very straightforward and quite promising. The main drawback is the large memory requirement because memory use increases linearly with the number of design variables. ADIFOR software can also be cumber-some for large CFD codes and has not yet reached a full maturity level for production codes, especially in parallel computing environments.
Learning Short Binary Codes for Large-scale Image Retrieval.

PubMed

Liu, Li; Yu, Mengyang; Shao, Ling

2017-03-01

Large-scale visual information retrieval has become an active research area in this big data era. Recently, hashing/binary coding algorithms prove to be effective for scalable retrieval applications. Most existing hashing methods require relatively long binary codes (i.e., over hundreds of bits, sometimes even thousands of bits) to achieve reasonable retrieval accuracies. However, for some realistic and unique applications, such as on wearable or mobile devices, only short binary codes can be used for efficient image retrieval due to the limitation of computational resources or bandwidth on these devices. In this paper, we propose a novel unsupervised hashing approach called min-cost ranking (MCR) specifically for learning powerful short binary codes (i.e., usually the code length shorter than 100 b) for scalable image retrieval tasks. By exploring the discriminative ability of each dimension of data, MCR can generate one bit binary code for each dimension and simultaneously rank the discriminative separability of each bit according to the proposed cost function. Only top-ranked bits with minimum cost-values are then selected and grouped together to compose the final salient binary codes. Extensive experimental results on large-scale retrieval demonstrate that MCR can achieve comparative performance as the state-of-the-art hashing algorithms but with significantly shorter codes, leading to much faster large-scale retrieval.
Scalability improvements to NRLMOL for DFT calculations of large molecules

NASA Astrophysics Data System (ADS)

Diaz, Carlos Manuel

Advances in high performance computing (HPC) have provided a way to treat large, computationally demanding tasks using thousands of processors. With the development of more powerful HPC architectures, the need to create efficient and scalable code has grown more important. Electronic structure calculations are valuable in understanding experimental observations and are routinely used for new materials predictions. For the electronic structure calculations, the memory and computation time are proportional to the number of atoms. Memory requirements for these calculations scale as N2, where N is the number of atoms. While the recent advances in HPC offer platforms with large numbers of cores, the limited amount of memory available on a given node and poor scalability of the electronic structure code hinder their efficient usage of these platforms. This thesis will present some developments to overcome these bottlenecks in order to study large systems. These developments, which are implemented in the NRLMOL electronic structure code, involve the use of sparse matrix storage formats and the use of linear algebra using sparse and distributed matrices. These developments along with other related development now allow ground state density functional calculations using up to 25,000 basis functions and the excited state calculations using up to 17,000 basis functions while utilizing all cores on a node. An example on a light-harvesting triad molecule is described. Finally, future plans to further improve the scalability will be presented.
Computer code for the prediction of nozzle admittance

NASA Technical Reports Server (NTRS)

Nguyen, Thong V.

1988-01-01

A procedure which can accurately characterize injector designs for large thrust (0.5 to 1.5 million pounds), high pressure (500 to 3000 psia) LOX/hydrocarbon engines is currently under development. In this procedure, a rectangular cross-sectional combustion chamber is to be used to simulate the lower traverse frequency modes of the large scale chamber. The chamber will be sized so that the first width mode of the rectangular chamber corresponds to the first tangential mode of the full-scale chamber. Test data to be obtained from the rectangular chamber will be used to assess the full scale engine stability. This requires the development of combustion stability models for rectangular chambers. As part of the combustion stability model development, a computer code, NOAD based on existing theory was developed to calculate the nozzle admittances for both rectangular and axisymmetric nozzles. This code is detailed.
Combining Thermal And Structural Analyses

NASA Technical Reports Server (NTRS)

Winegar, Steven R.

1990-01-01

Computer code makes programs compatible so stresses and deformations calculated. Paper describes computer code combining thermal analysis with structural analysis. Called SNIP (for SINDA-NASTRAN Interfacing Program), code provides interface between finite-difference thermal model of system and finite-element structural model when no node-to-element correlation between models. Eliminates much manual work in converting temperature results of SINDA (Systems Improved Numerical Differencing Analyzer) program into thermal loads for NASTRAN (NASA Structural Analysis) program. Used to analyze concentrating reflectors for solar generation of electric power. Large thermal and structural models needed to predict distortion of surface shapes, and SNIP saves considerable time and effort in combining models.
Requirements for migration of NSSD code systems from LTSS to NLTSS

NASA Technical Reports Server (NTRS)

Pratt, M.

1984-01-01

The purpose of this document is to address the requirements necessary for a successful conversion of the Nuclear Design (ND) application code systems to the NLTSS environment. The ND application code system community can be characterized as large-scale scientific computation carried out on supercomputers. NLTSS is a distributed operating system being developed at LLNL to replace the LTSS system currently in use. The implications of change are examined including a description of the computational environment and users in ND. The discussion then turns to requirements, first in a general way, followed by specific requirements, including a proposal for managing the transition.
Supersonic propulsion simulation by incorporating component models in the large perturbation inlet (LAPIN) computer code

NASA Technical Reports Server (NTRS)

Cole, Gary L.; Richard, Jacques C.

1991-01-01

An approach to simulating the internal flows of supersonic propulsion systems is presented. The approach is based on a fairly simple modification of the Large Perturbation Inlet (LAPIN) computer code. LAPIN uses a quasi-one dimensional, inviscid, unsteady formulation of the continuity, momentum, and energy equations. The equations are solved using a shock capturing, finite difference algorithm. The original code, developed for simulating supersonic inlets, includes engineering models of unstart/restart, bleed, bypass, and variable duct geometry, by means of source terms in the equations. The source terms also provide a mechanism for incorporating, with the inlet, propulsion system components such as compressor stages, combustors, and turbine stages. This requires each component to be distributed axially over a number of grid points. Because of the distributed nature of such components, this representation should be more accurate than a lumped parameter model. Components can be modeled by performance map(s), which in turn are used to compute the source terms. The general approach is described. Then, simulation of a compressor/fan stage is discussed to show the approach in detail.

Three-dimensional structural analysis using interactive graphics

NASA Technical Reports Server (NTRS)

Biffle, J.; Sumlin, H. A.

1975-01-01

The application of computer interactive graphics to three-dimensional structural analysis was described, with emphasis on the following aspects: (1) structural analysis, and (2) generation and checking of input data and examination of the large volume of output data (stresses, displacements, velocities, accelerations). Handling of three-dimensional input processing with a special MESH3D computer program was explained. Similarly, a special code PLTZ may be used to perform all the needed tasks for output processing from a finite element code. Examples were illustrated.
General Monte Carlo reliability simulation code including common mode failures and HARP fault/error-handling

NASA Technical Reports Server (NTRS)

Platt, M. E.; Lewis, E. E.; Boehm, F.

1991-01-01

A Monte Carlo Fortran computer program was developed that uses two variance reduction techniques for computing system reliability applicable to solving very large highly reliable fault-tolerant systems. The program is consistent with the hybrid automated reliability predictor (HARP) code which employs behavioral decomposition and complex fault-error handling models. This new capability is called MC-HARP which efficiently solves reliability models with non-constant failures rates (Weibull). Common mode failure modeling is also a specialty.
Subscale Fast Cookoff Testing and Modeling for the Hazard Assessment of Large Rocket Motors

DTIC Science & Technology

2001-03-01

41 LIST OF TABLES Table 1 Heats of Vaporization Parameter for Two-liner Phase Transformation - Complete Liner Sublimation and/or Combined Liner...One-dimensional 2-D Two-dimensional ALE3D Arbitrary-Lagrange-Eulerian (3-D) Computer Code ALEGRA 3-D Arbitrary-Lagrange-Eulerian Computer Code for...case-liner bond areas and in the grain inner bore to explore the pre-ignition and ignition phases , as well as burning evolution in rocket motor fast
ELEFANT: a user-friendly multipurpose geodynamics code

NASA Astrophysics Data System (ADS)

Thieulot, C.

2014-07-01

A new finite element code for the solution of the Stokes and heat transport equations is presented. It has purposely been designed to address geological flow problems in two and three dimensions at crustal and lithospheric scales. The code relies on the Marker-in-Cell technique and Lagrangian markers are used to track materials in the simulation domain which allows recording of the integrated history of deformation; their (number) density is variable and dynamically adapted. A variety of rheologies has been implemented including nonlinear thermally activated dislocation and diffusion creep and brittle (or plastic) frictional models. The code is built on the Arbitrary Lagrangian Eulerian kinematic description: the computational grid deforms vertically and allows for a true free surface while the computational domain remains of constant width in the horizontal direction. The solution to the large system of algebraic equations resulting from the finite element discretisation and linearisation of the set of coupled partial differential equations to be solved is obtained by means of the efficient parallel direct solver MUMPS whose performance is thoroughly tested, or by means of the WISMP and AGMG iterative solvers. The code accuracy is assessed by means of many geodynamically relevant benchmark experiments which highlight specific features or algorithms, e.g., the implementation of the free surface stabilisation algorithm, the (visco-)plastic rheology implementation, the temperature advection, the capacity of the code to handle large viscosity contrasts. A two-dimensional application to salt tectonics presented as case study illustrates the potential of the code to model large scale high resolution thermo-mechanically coupled free surface flows.
Current and anticipated uses of the thermal hydraulics codes at the NRC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Caruso, R.

1997-07-01

The focus of Thermal-Hydraulic computer code usage in nuclear regulatory organizations has undergone a considerable shift since the codes were originally conceived. Less work is being done in the area of {open_quotes}Design Basis Accidents,{close_quotes}, and much more emphasis is being placed on analysis of operational events, probabalistic risk/safety assessment, and maintenance practices. All of these areas need support from Thermal-Hydraulic computer codes to model the behavior of plant fluid systems, and they all need the ability to perform large numbers of analyses quickly. It is therefore important for the T/H codes of the future to be able to support thesemore » needs, by providing robust, easy-to-use, tools that produce easy-to understand results for a wider community of nuclear professionals. These tools need to take advantage of the great advances that have occurred recently in computer software, by providing users with graphical user interfaces for both input and output. In addition, reduced costs of computer memory and other hardware have removed the need for excessively complex data structures and numerical schemes, which make the codes more difficult and expensive to modify, maintain, and debug, and which increase problem run-times. Future versions of the T/H codes should also be structured in a modular fashion, to allow for the easy incorporation of new correlations, models, or features, and to simplify maintenance and testing. Finally, it is important that future T/H code developers work closely with the code user community, to ensure that the code meet the needs of those users.« less
Computational Performance of Intel MIC, Sandy Bridge, and GPU Architectures: Implementation of a 1D c++/OpenMP Electrostatic Particle-In-Cell Code

DTIC Science & Technology

2014-05-01

fusion, space and astrophysical plasmas, but still the general picture can be presented quite well with the fluid approach [6, 7]. The microscopic...purpose computing CPU for algorithms where processing of large blocks of data is done in parallel. The reason for that is the GPU’s highly effective...parallel structure. Most of the image and video processing computations involve heavy matrix and vector op- erations over large amounts of data and
Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochay (Technical Monitor)

2000-01-01

HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).
hybrid\\scriptsize{{MANTIS}}: a CPU-GPU Monte Carlo method for modeling indirect x-ray detectors with columnar scintillators

NASA Astrophysics Data System (ADS)

Sharma, Diksha; Badal, Andreu; Badano, Aldo

2012-04-01

The computational modeling of medical imaging systems often requires obtaining a large number of simulated images with low statistical uncertainty which translates into prohibitive computing times. We describe a novel hybrid approach for Monte Carlo simulations that maximizes utilization of CPUs and GPUs in modern workstations. We apply the method to the modeling of indirect x-ray detectors using a new and improved version of the code \\scriptsize{{MANTIS}}, an open source software tool used for the Monte Carlo simulations of indirect x-ray imagers. We first describe a GPU implementation of the physics and geometry models in fast\\scriptsize{{DETECT}}2 (the optical transport model) and a serial CPU version of the same code. We discuss its new features like on-the-fly column geometry and columnar crosstalk in relation to the \\scriptsize{{MANTIS}} code, and point out areas where our model provides more flexibility for the modeling of realistic columnar structures in large area detectors. Second, we modify \\scriptsize{{PENELOPE}} (the open source software package that handles the x-ray and electron transport in \\scriptsize{{MANTIS}}) to allow direct output of location and energy deposited during x-ray and electron interactions occurring within the scintillator. This information is then handled by optical transport routines in fast\\scriptsize{{DETECT}}2. A load balancer dynamically allocates optical transport showers to the GPU and CPU computing cores. Our hybrid\\scriptsize{{MANTIS}} approach achieves a significant speed-up factor of 627 when compared to \\scriptsize{{MANTIS}} and of 35 when compared to the same code running only in a CPU instead of a GPU. Using hybrid\\scriptsize{{MANTIS}}, we successfully hide hours of optical transport time by running it in parallel with the x-ray and electron transport, thus shifting the computational bottleneck from optical to x-ray transport. The new code requires much less memory than \\scriptsize{{MANTIS}} and, as a result, allows us to efficiently simulate large area detectors.
National Combustion Code Validated Against Lean Direct Injection Flow Field Data

NASA Technical Reports Server (NTRS)

Iannetti, Anthony C.

2003-01-01

Most combustion processes have, in some way or another, a recirculating flow field. This recirculation stabilizes the reaction zone, or flame, but an unnecessarily large recirculation zone can result in high nitrogen oxide (NOx) values for combustion systems. The size of this recirculation zone is crucial to the performance of state-of-the-art, low-emissions hardware. If this is a large-scale combustion process, the flow field will probably be turbulent and, therefore, three-dimensional. This research dealt primarily with flow fields resulting from lean direct injection (LDI) concepts, as described in Research & Technology 2001. LDI is a concept that depends heavily on the design of the swirler. The LDI concept has the potential to reduce NOx values from 50 to 70 percent of current values, with good flame stability characteristics. It is cost effective and (hopefully) beneficial to do most of the design work for an LDI swirler using computer-aided design (CAD) and computer-aided engineering (CAE) tools. Computational fluid dynamics (CFD) codes are CAE tools that can calculate three-dimensional flows in complex geometries. However, CFD codes are only beginning to correctly calculate the flow fields for complex devices, and the related combustion models usually remove a large portion of the flow physics.
Multi-phase SPH modelling of violent hydrodynamics on GPUs

NASA Astrophysics Data System (ADS)

Mokos, Athanasios; Rogers, Benedict D.; Stansby, Peter K.; Domínguez, José M.

2015-11-01

This paper presents the acceleration of multi-phase smoothed particle hydrodynamics (SPH) using a graphics processing unit (GPU) enabling large numbers of particles (10-20 million) to be simulated on just a single GPU card. With novel hardware architectures such as a GPU, the optimum approach to implement a multi-phase scheme presents some new challenges. Many more particles must be included in the calculation and there are very different speeds of sound in each phase with the largest speed of sound determining the time step. This requires efficient computation. To take full advantage of the hardware acceleration provided by a single GPU for a multi-phase simulation, four different algorithms are investigated: conditional statements, binary operators, separate particle lists and an intermediate global function. Runtime results show that the optimum approach needs to employ separate cell and neighbour lists for each phase. The profiler shows that this approach leads to a reduction in both memory transactions and arithmetic operations giving significant runtime gains. The four different algorithms are compared to the efficiency of the optimised single-phase GPU code, DualSPHysics, for 2-D and 3-D simulations which indicate that the multi-phase functionality has a significant computational overhead. A comparison with an optimised CPU code shows a speed up of an order of magnitude over an OpenMP simulation with 8 threads and two orders of magnitude over a single thread simulation. A demonstration of the multi-phase SPH GPU code is provided by a 3-D dam break case impacting an obstacle. This shows better agreement with experimental results than an equivalent single-phase code. The multi-phase GPU code enables a convergence study to be undertaken on a single GPU with a large number of particles that otherwise would have required large high performance computing resources.
Design applications for supercomputers

NASA Technical Reports Server (NTRS)

Studerus, C. J.

1987-01-01

The complexity of codes for solutions of real aerodynamic problems has progressed from simple two-dimensional models to three-dimensional inviscid and viscous models. As the algorithms used in the codes increased in accuracy, speed and robustness, the codes were steadily incorporated into standard design processes. The highly sophisticated codes, which provide solutions to the truly complex flows, require computers with large memory and high computational speed. The advent of high-speed supercomputers, such that the solutions of these complex flows become more practical, permits the introduction of the codes into the design system at an earlier stage. The results of several codes which either were already introduced into the design process or are rapidly in the process of becoming so, are presented. The codes fall into the area of turbomachinery aerodynamics and hypersonic propulsion. In the former category, results are presented for three-dimensional inviscid and viscous flows through nozzle and unducted fan bladerows. In the latter category, results are presented for two-dimensional inviscid and viscous flows for hypersonic vehicle forebodies and engine inlets.
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.

PubMed

Meng, Bowen; Pratx, Guillem; Xing, Lei

2011-12-01

Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT∕CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldcamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scans were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and Reduce function to aggregate those partial backprojection into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT∕CT reconstruction algorithm. Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10(-7). Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT∕CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment

PubMed Central

Meng, Bowen; Pratx, Guillem; Xing, Lei

2011-01-01

Purpose: Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. Methods: In this work, we accelerated the Feldcamp–Davis–Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scans were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and Reduce function to aggregate those partial backprojection into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Results: Speedup of reconstruction time is found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10−7. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. Conclusions: An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment. PMID:22149842
Verification of the FBR fuel bundle-duct interaction analysis code BAMBOO by the out-of-pile bundle compression test with large diameter pins

NASA Astrophysics Data System (ADS)

Uwaba, Tomoyuki; Ito, Masahiro; Nemoto, Junichi; Ichikawa, Shoichi; Katsuyama, Kozo

2014-09-01

The BAMBOO computer code was verified by results for the out-of-pile bundle compression test with large diameter pin bundle deformation under the bundle-duct interaction (BDI) condition. The pin diameters of the examined test bundles were 8.5 mm and 10.4 mm, which are targeted as preliminary fuel pin diameters for the upgraded core of the prototype fast breeder reactor (FBR) and for demonstration and commercial FBRs studied in the FaCT project. In the bundle compression test, bundle cross-sectional views were obtained from X-ray computer tomography (CT) images and local parameters of bundle deformation such as pin-to-duct and pin-to-pin clearances were measured by CT image analyses. In the verification, calculation results of bundle deformation obtained by the BAMBOO code analyses were compared with the experimental results from the CT image analyses. The comparison showed that the BAMBOO code reasonably predicts deformation of large diameter pin bundles under the BDI condition by assuming that pin bowing and cladding oval distortion are the major deformation mechanisms, the same as in the case of small diameter pin bundles. In addition, the BAMBOO analysis results confirmed that cladding oval distortion effectively suppresses BDI in large diameter pin bundles as well as in small diameter pin bundles.
High-performance computational fluid dynamics: a custom-code approach

NASA Astrophysics Data System (ADS)

Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

2016-07-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing.
Modernization of the graphics post-processors of the Hamburg German Climate Computer Center Carbon Cycle Codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stevens, E.J.; McNeilly, G.S.

The existing National Center for Atmospheric Research (NCAR) code in the Hamburg Oceanic Carbon Cycle Circulation Model and the Hamburg Large-Scale Geostrophic Ocean General Circulation Model was modernized and reduced in size while still producing an equivalent end result. A reduction in the size of the existing code from more than 50,000 lines to approximately 7,500 lines in the new code has made the new code much easier to maintain. The existing code in Hamburg model uses legacy NCAR (including even emulated CALCOMP subrountines) graphics to display graphical output. The new code uses only current (version 3.1) NCAR subrountines.
Superconducting quantum circuits at the surface code threshold for fault tolerance.

PubMed

Barends, R; Kelly, J; Megrant, A; Veitia, A; Sank, D; Jeffrey, E; White, T C; Mutus, J; Fowler, A G; Campbell, B; Chen, Y; Chen, Z; Chiaro, B; Dunsworth, A; Neill, C; O'Malley, P; Roushan, P; Vainsencher, A; Wenner, J; Korotkov, A N; Cleland, A N; Martinis, John M

2014-04-24

A quantum computer can solve hard problems, such as prime factoring, database searching and quantum simulation, at the cost of needing to protect fragile quantum states from error. Quantum error correction provides this protection by distributing a logical state among many physical quantum bits (qubits) by means of quantum entanglement. Superconductivity is a useful phenomenon in this regard, because it allows the construction of large quantum circuits and is compatible with microfabrication. For superconducting qubits, the surface code approach to quantum computing is a natural choice for error correction, because it uses only nearest-neighbour coupling and rapidly cycled entangling gates. The gate fidelity requirements are modest: the per-step fidelity threshold is only about 99 per cent. Here we demonstrate a universal set of logic gates in a superconducting multi-qubit processor, achieving an average single-qubit gate fidelity of 99.92 per cent and a two-qubit gate fidelity of up to 99.4 per cent. This places Josephson quantum computing at the fault-tolerance threshold for surface code error correction. Our quantum processor is a first step towards the surface code, using five qubits arranged in a linear array with nearest-neighbour coupling. As a further demonstration, we construct a five-qubit Greenberger-Horne-Zeilinger state using the complete circuit and full set of gates. The results demonstrate that Josephson quantum computing is a high-fidelity technology, with a clear path to scaling up to large-scale, fault-tolerant quantum circuits.
Parallelized direct execution simulation of message-passing parallel programs

NASA Technical Reports Server (NTRS)

Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

1994-01-01

As massively parallel computers proliferate, there is growing interest in findings ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing computers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, Large Application Parallel Simulation Environment (LAPSE), we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
Determinant Computation on the GPU using the Condensation Method

NASA Astrophysics Data System (ADS)

Anisul Haque, Sardar; Moreno Maza, Marc

2012-02-01

We report on a GPU implementation of the condensation method designed by Abdelmalek Salem and Kouachi Said for computing the determinant of a matrix. We consider two types of coefficients: modular integers and floating point numbers. We evaluate the performance of our code by measuring its effective bandwidth and argue that it is numerical stable in the floating point number case. In addition, we compare our code with serial implementation of determinant computation from well-known mathematical packages. Our results suggest that a GPU implementation of the condensation method has a large potential for improving those packages in terms of running time and numerical stability.
Large-scale transmission-type multifunctional anisotropic coding metasurfaces in millimeter-wave frequencies

NASA Astrophysics Data System (ADS)

Cui, Tie Jun; Wu, Rui Yuan; Wu, Wei; Shi, Chuan Bo; Li, Yun Bo

2017-10-01

We propose fast and accurate designs to large-scale and low-profile transmission-type anisotropic coding metasurfaces with multiple functions in the millimeter-wave frequencies based on the antenna-array method. The numerical simulation of an anisotropic coding metasurface with the size of 30λ × 30λ by the proposed method takes only 20 min, which however cannot be realized by commercial software due to huge memory usage in personal computers. To inspect the performance of coding metasurfaces in the millimeter-wave band, the working frequency is chosen as 60 GHz. Based on the convolution operations and holographic theory, the proposed multifunctional anisotropic coding metasurface exhibits different effects excited by y-polarized and x-polarized incidences. This study extends the frequency range of coding metasurfaces, filling the gap between microwave and terahertz bands, and implying promising applications in millimeter-wave communication and imaging.

Application of a Two-dimensional Unsteady Viscous Analysis Code to a Supersonic Throughflow Fan Stage

NASA Technical Reports Server (NTRS)

Steinke, Ronald J.

1989-01-01

The Rai ROTOR1 code for two-dimensional, unsteady viscous flow analysis was applied to a supersonic throughflow fan stage design. The axial Mach number for this fan design increases from 2.0 at the inlet to 2.9 at the outlet. The Rai code uses overlapped O- and H-grids that are appropriately packed. The Rai code was run on a Cray XMP computer; then data postprocessing and graphics were performed to obtain detailed insight into the stage flow. The large rotor wakes uniformly traversed the rotor-stator interface and dispersed as they passed through the stator passage. Only weak blade shock losses were computerd, which supports the design goals. High viscous effects caused large blade wakes and a low fan efficiency. Rai code flow predictions were essentially steady for the rotor, and they compared well with Chima rotor viscous code predictions based on a C-grid of similar density.
A decoding procedure for the Reed-Solomon codes

NASA Technical Reports Server (NTRS)

Lim, R. S.

1978-01-01

A decoding procedure is described for the (n,k) t-error-correcting Reed-Solomon (RS) code, and an implementation of the (31,15) RS code for the I4-TENEX central system. This code can be used for error correction in large archival memory systems. The principal features of the decoder are a Galois field arithmetic unit implemented by microprogramming a microprocessor, and syndrome calculation by using the g(x) encoding shift register. Complete decoding of the (31,15) code is expected to take less than 500 microsecs. The syndrome calculation is performed by hardware using the encoding shift register and a modified Chien search. The error location polynomial is computed by using Lin's table, which is an interpretation of Berlekamp's iterative algorithm. The error location numbers are calculated by using the Chien search. Finally, the error values are computed by using Forney's method.
Simulation of 2D Kinetic Effects in Plasmas using the Grid Based Continuum Code LOKI

NASA Astrophysics Data System (ADS)

Banks, Jeffrey; Berger, Richard; Chapman, Tom; Brunner, Stephan

2016-10-01

Kinetic simulation of multi-dimensional plasma waves through direct discretization of the Vlasov equation is a useful tool to study many physical interactions and is particularly attractive for situations where minimal fluctuation levels are desired, for instance, when measuring growth rates of plasma wave instabilities. However, direct discretization of phase space can be computationally expensive, and as a result there are few examples of published results using Vlasov codes in more than a single configuration space dimension. In an effort to fill this gap we have developed the Eulerian-based kinetic code LOKI that evolves the Vlasov-Poisson system in 2+2-dimensional phase space. The code is designed to reduce the cost of phase-space computation by using fully 4th order accurate conservative finite differencing, while retaining excellent parallel scalability that efficiently uses large scale computing resources. In this poster I will discuss the algorithms used in the code as well as some aspects of their parallel implementation using MPI. I will also overview simulation results of basic plasma wave instabilities relevant to laser plasma interaction, which have been obtained using the code.
HIFiRE-1 Turbulent Shock Boundary Layer Interaction - Flight Data and Computations

NASA Technical Reports Server (NTRS)

Kimmel, Roger L.; Prabhu, Dinesh

2015-01-01

The Hypersonic International Flight Research Experimentation (HIFiRE) program is a hypersonic flight test program executed by the Air Force Research Laboratory (AFRL) and Australian Defence Science and Technology Organisation (DSTO). This flight contained a cylinder-flare induced shock boundary layer interaction (SBLI). Computations of the interaction were conducted for a number of times during the ascent. The DPLR code used for predictions was calibrated against ground test data prior to exercising the code at flight conditions. Generally, the computations predicted the upstream influence and interaction pressures very well. Plateau pressures on the cylinder were predicted well at all conditions. Although the experimental heat transfer showed a large amount of scatter, especially at low heating levels, the measured heat transfer agreed well with computations. The primary discrepancy between the experiment and computation occurred in the pressures measured on the flare during second stage burn. Measured pressures exhibited large overshoots late in the second stage burn, the mechanism of which is unknown. The good agreement between flight measurements and CFD helps validate the philosophy of calibrating CFD against ground test, prior to exercising it at flight conditions.
Rapid solution of large-scale systems of equations

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.

1994-01-01

The analysis and design of complex aerospace structures requires the rapid solution of large systems of linear and nonlinear equations, eigenvalue extraction for buckling, vibration and flutter modes, structural optimization and design sensitivity calculation. Computers with multiple processors and vector capabilities can offer substantial computational advantages over traditional scalar computer for these analyses. These computers fall into two categories: shared memory computers and distributed memory computers. This presentation covers general-purpose, highly efficient algorithms for generation/assembly or element matrices, solution of systems of linear and nonlinear equations, eigenvalue and design sensitivity analysis and optimization. All algorithms are coded in FORTRAN for shared memory computers and many are adapted to distributed memory computers. The capability and numerical performance of these algorithms will be addressed.
Computational strategy for the solution of large strain nonlinear problems using the Wilkins explicit finite-difference approach

NASA Technical Reports Server (NTRS)

Hofmann, R.

1980-01-01

The STEALTH code system, which solves large strain, nonlinear continuum mechanics problems, was rigorously structured in both overall design and programming standards. The design is based on the theoretical elements of analysis while the programming standards attempt to establish a parallelism between physical theory, programming structure, and documentation. These features have made it easy to maintain, modify, and transport the codes. It has also guaranteed users a high level of quality control and quality assurance.
Consequence modeling using the fire dynamics simulator.

PubMed

Ryder, Noah L; Sutula, Jason A; Schemel, Christopher F; Hamer, Andrew J; Van Brunt, Vincent

2004-11-11

The use of Computational Fluid Dynamics (CFD) and in particular Large Eddy Simulation (LES) codes to model fires provides an efficient tool for the prediction of large-scale effects that include plume characteristics, combustion product dispersion, and heat effects to adjacent objects. This paper illustrates the strengths of the Fire Dynamics Simulator (FDS), an LES code developed by the National Institute of Standards and Technology (NIST), through several small and large-scale validation runs and process safety applications. The paper presents two fire experiments--a small room fire and a large (15 m diameter) pool fire. The model results are compared to experimental data and demonstrate good agreement between the models and data. The validation work is then extended to demonstrate applicability to process safety concerns by detailing a model of a tank farm fire and a model of the ignition of a gaseous fuel in a confined space. In this simulation, a room was filled with propane, given time to disperse, and was then ignited. The model yields accurate results of the dispersion of the gas throughout the space. This information can be used to determine flammability and explosive limits in a space and can be used in subsequent models to determine the pressure and temperature waves that would result from an explosion. The model dispersion results were compared to an experiment performed by Factory Mutual. Using the above examples, this paper will demonstrate that FDS is ideally suited to build realistic models of process geometries in which large scale explosion and fire failure risks can be evaluated with several distinct advantages over more traditional CFD codes. Namely transient solutions to fire and explosion growth can be produced with less sophisticated hardware (lower cost) than needed for traditional CFD codes (PC type computer verses UNIX workstation) and can be solved for longer time histories (on the order of hundreds of seconds of computed time) with minimal computer resources and length of model run. Additionally results that are produced can be analyzed, viewed, and tabulated during and following a model run within a PC environment. There are some tradeoffs, however, as rapid computations in PC's may require a sacrifice in the grid resolution or in the sub-grid modeling, depending on the size of the geometry modeled.
Complexity control algorithm based on adaptive mode selection for interframe coding in high efficiency video coding

NASA Astrophysics Data System (ADS)

Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong

2017-07-01

The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the lowdelayP condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BDPSNR is observed for 18 sequences when the target complexity is around 40%.
Aerodynamic-structural model of offwind yacht sails

NASA Astrophysics Data System (ADS)

Mairs, Christopher M.

An aerodynamic-structural model of offwind yacht sails was created that is useful in predicting sail forces. Two sails were examined experimentally and computationally at several wind angles to explore a variety of flow regimes. The accuracy of the numerical solutions was measured by comparing to experimental results. The two sails examined were a Code 0 and a reaching asymmetric spinnaker. During experiment, balance, wake, and sail shape data were recorded for both sails in various configurations. Two computational steps were used to evaluate the computational model. First, an aerodynamic flow model that includes viscosity effects was used to examine the experimental flying shapes that were recorded. Second, the aerodynamic model was combined with a nonlinear, structural, finite element analysis (FEA) model. The aerodynamic and structural models were used iteratively to predict final flying shapes of offwind sails, starting with the design shapes. The Code 0 has relatively low camber and is used at small angles of attack. It was examined experimentally and computationally at a single angle of attack in two trim configurations, a baseline and overtrimmed setting. Experimentally, the Code 0 was stable and maintained large flow attachment regions. The digitized flying shapes from experiment were examined in the aerodynamic model. Force area predictions matched experimental results well. When the aerodynamic-structural tool was employed, the predictive capability was slightly worse. The reaching asymmetric spinnaker has higher camber and operates at higher angles of attack than the Code 0. Experimentally and computationally, it was examined at two angles of attack. Like the Code 0, at each wind angle, baseline and overtrimmed settings were examined. Experimentally, sail oscillations and large flow detachment regions were encountered. The computational analysis began by examining the experimental flying shapes in the aerodynamic model. In the baseline setting, the computational force predictions were fair at both wind angles examined. Force predictions were much improved in the overtrimmed setting when the sail was highly stalled and more stable. The same trends in force prediction were seen when employing the aerodynamic-structural model. Predictions were good to fair in the baseline setting but improved in the overtrimmed configuration.
Addressing the challenges of standalone multi-core simulations in molecular dynamics

NASA Astrophysics Data System (ADS)

Ocaya, R. O.; Terblans, J. J.

2017-07-01

Computational modelling in material science involves mathematical abstractions of force fields between particles with the aim to postulate, develop and understand materials by simulation. The aggregated pairwise interactions of the material's particles lead to a deduction of its macroscopic behaviours. For practically meaningful macroscopic scales, a large amount of data are generated, leading to vast execution times. Simulation times of hours, days or weeks for moderately sized problems are not uncommon. The reduction of simulation times, improved result accuracy and the associated software and hardware engineering challenges are the main motivations for many of the ongoing researches in the computational sciences. This contribution is concerned mainly with simulations that can be done on a "standalone" computer based on Message Passing Interfaces (MPI), parallel code running on hardware platforms with wide specifications, such as single/multi- processor, multi-core machines with minimal reconfiguration for upward scaling of computational power. The widely available, documented and standardized MPI library provides this functionality through the MPI_Comm_size (), MPI_Comm_rank () and MPI_Reduce () functions. A survey of the literature shows that relatively little is written with respect to the efficient extraction of the inherent computational power in a cluster. In this work, we discuss the main avenues available to tap into this extra power without compromising computational accuracy. We also present methods to overcome the high inertia encountered in single-node-based computational molecular dynamics. We begin by surveying the current state of the art and discuss what it takes to achieve parallelism, efficiency and enhanced computational accuracy through program threads and message passing interfaces. Several code illustrations are given. The pros and cons of writing raw code as opposed to using heuristic, third-party code are also discussed. The growing trend towards graphical processor units and virtual computing clouds for high-performance computing is also discussed. Finally, we present the comparative results of vacancy formation energy calculations using our own parallelized standalone code called Verlet-Stormer velocity (VSV) operating on 30,000 copper atoms. The code is based on the Sutton-Chen implementation of the Finnis-Sinclair pairwise embedded atom potential. A link to the code is also given.
OpenGeoSys: Performance-Oriented Computational Methods for Numerical Modeling of Flow in Large Hydrogeological Systems

NASA Astrophysics Data System (ADS)

Naumov, D.; Fischer, T.; Böttcher, N.; Watanabe, N.; Walther, M.; Rink, K.; Bilke, L.; Shao, H.; Kolditz, O.

2014-12-01

OpenGeoSys (OGS) is a scientific open source code for numerical simulation of thermo-hydro-mechanical-chemical processes in porous and fractured media. Its basic concept is to provide a flexible numerical framework for solving multi-field problems for applications in geoscience and hydrology as e.g. for CO2 storage applications, geothermal power plant forecast simulation, salt water intrusion, water resources management, etc. Advances in computational mathematics have revolutionized the variety and nature of the problems that can be addressed by environmental scientists and engineers nowadays and an intensive code development in the last years enables in the meantime the solutions of much larger numerical problems and applications. However, solving environmental processes along the water cycle at large scales, like for complete catchment or reservoirs, stays computationally still a challenging task. Therefore, we started a new OGS code development with focus on execution speed and parallelization. In the new version, a local data structure concept improves the instruction and data cache performance by a tight bundling of data with an element-wise numerical integration loop. Dedicated analysis methods enable the investigation of memory-access patterns in the local and global assembler routines, which leads to further data structure optimization for an additional performance gain. The concept is presented together with a technical code analysis of the recent development and a large case study including transient flow simulation in the unsaturated / saturated zone of the Thuringian Syncline, Germany. The analysis is performed on a high-resolution mesh (up to 50M elements) with embedded fault structures.
Comparisons for ESTA-Task3: ASTEC, CESAM and CLÉS

NASA Astrophysics Data System (ADS)

Christensen-Dalsgaard, J.

The ESTA activity under the CoRoT project aims at testing the tools for computing stellar models and oscillation frequencies that will be used in the analysis of asteroseismic data from CoRoT and other large-scale upcoming asteroseismic projects. Here I report results of comparisons between calculations using the Aarhus code (ASTEC) and two other codes, for models that include diffusion and settling. It is found that there are likely deficiencies, requiring further study, in the ASTEC computation of models including convective cores.
Deployment of the OSIRIS EM-PIC code on the Intel Knights Landing architecture

NASA Astrophysics Data System (ADS)

Fonseca, Ricardo

2017-10-01

Electromagnetic particle-in-cell (EM-PIC) codes such as OSIRIS have found widespread use in modelling the highly nonlinear and kinetic processes that occur in several relevant plasma physics scenarios, ranging from astrophysical settings to high-intensity laser plasma interaction. Being computationally intensive, these codes require large scale HPC systems, and a continuous effort in adapting the algorithm to new hardware and computing paradigms. In this work, we report on our efforts on deploying the OSIRIS code on the new Intel Knights Landing (KNL) architecture. Unlike the previous generation (Knights Corner), these boards are standalone systems, and introduce several new features, include the new AVX-512 instructions and on-package MCDRAM. We will focus on the parallelization and vectorization strategies followed, as well as memory management, and present a detailed performance evaluation of code performance in comparison with the CPU code. This work was partially supported by Fundaçã para a Ciência e Tecnologia (FCT), Portugal, through Grant No. PTDC/FIS-PLA/2940/2014.
Comparison of COBRA III-C and SABRE-1 (wire-wrap version) computational results with steady-state data from a 19-pin internally guard heated sodium-cooled bundle with a six-channel central blockage (THORS Bundle 3C). [LMFBR

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dearing, J F; Nelson, W R; Rose, S D

Computational thermal-hydraulic models of a 19-pin, electrically heated, wire-wrap liquid-metal fast breeder reactor test bundle were developed using two well-known subchannel analysis codes, COBRA III-C and SABRE-1 (wire-wrap version). These two codes use similar subchannel control volumes for the finite difference conservation equations but vary markedly in solution strategy and modeling capability. In particular, the empirical wire-wrap-forced diversion crossflow models are different. Surprisingly, however, crossflow velocity predictions of the two codes are very similar. Both codes show generally good agreement with experimental temperature data from a test in which a large radial temperature gradient was imposed. Differences between data andmore » code results are probably caused by experimental pin bowing, which is presently the limiting factor in validating coded empirical models.« less
Effectiveness of Various Computer-Based Instructional Strategies in Language Teaching. Final Report, November 1, 1969-August 31, 1970.

ERIC Educational Resources Information Center

Van Campen, Joseph A.

Computer software for programed language instruction, developed in the second quarter of 1970 at Stanford's Institute for Mathematical Studies in the Social Sciences is described in this report. The software includes: (1) a PDP-10 computer assembly language for generating drill sentences; (2) a coding system allowing a large number of sentences to…
Multiplier Architecture for Coding Circuits

NASA Technical Reports Server (NTRS)

Wang, C. C.; Truong, T. K.; Shao, H. M.; Deutsch, L. J.

1986-01-01

Multipliers based on new algorithm for Galois-field (GF) arithmetic regular and expandable. Pipeline structures used for computing both multiplications and inverses. Designs suitable for implementation in very-large-scale integrated (VLSI) circuits. This general type of inverter and multiplier architecture especially useful in performing finite-field arithmetic of Reed-Solomon error-correcting codes and of some cryptographic algorithms.
Time-domain seismic modeling in viscoelastic media for full waveform inversion on heterogeneous computing platforms with OpenCL

NASA Astrophysics Data System (ADS)

Fabien-Ouellet, Gabriel; Gloaguen, Erwan; Giroux, Bernard

2017-03-01

Full Waveform Inversion (FWI) aims at recovering the elastic parameters of the Earth by matching recordings of the ground motion with the direct solution of the wave equation. Modeling the wave propagation for realistic scenarios is computationally intensive, which limits the applicability of FWI. The current hardware evolution brings increasing parallel computing power that can speed up the computations in FWI. However, to take advantage of the diversity of parallel architectures presently available, new programming approaches are required. In this work, we explore the use of OpenCL to develop a portable code that can take advantage of the many parallel processor architectures now available. We present a program called SeisCL for 2D and 3D viscoelastic FWI in the time domain. The code computes the forward and adjoint wavefields using finite-difference and outputs the gradient of the misfit function given by the adjoint state method. To demonstrate the code portability on different architectures, the performance of SeisCL is tested on three different devices: Intel CPUs, NVidia GPUs and Intel Xeon PHI. Results show that the use of GPUs with OpenCL can speed up the computations by nearly two orders of magnitudes over a single threaded application on the CPU. Although OpenCL allows code portability, we show that some device-specific optimization is still required to get the best performance out of a specific architecture. Using OpenCL in conjunction with MPI allows the domain decomposition of large models on several devices located on different nodes of a cluster. For large enough models, the speedup of the domain decomposition varies quasi-linearly with the number of devices. Finally, we investigate two different approaches to compute the gradient by the adjoint state method and show the significant advantages of using OpenCL for FWI.
Pseudo-point transport technique: a new method for solving the Boltzmann transport equation in media with highly fluctuating cross sections

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nakhai, B.

A new method for solving radiation transport problems is presented. The heart of the technique is a new cross section processing procedure for the calculation of group-to-point and point-to-group cross sections sets. The method is ideally suited for problems which involve media with highly fluctuating cross sections, where the results of the traditional multigroup calculations are beclouded by the group averaging procedures employed. Extensive computational efforts, which would be required to evaluate double integrals in the multigroup treatment numerically, prohibit iteration to optimize the energy boundaries. On the other hand, use of point-to-point techniques (as in the stochastic technique) ismore » often prohibitively expensive due to the large computer storage requirement. The pseudo-point code is a hybrid of the two aforementioned methods (group-to-group and point-to-point) - hence the name pseudo-point - that reduces the computational efforts of the former and the large core requirements of the latter. The pseudo-point code generates the group-to-point or the point-to-group transfer matrices, and can be coupled with the existing transport codes to calculate pointwise energy-dependent fluxes. This approach yields much more detail than is available from the conventional energy-group treatments. Due to the speed of this code, several iterations could be performed (in affordable computing efforts) to optimize the energy boundaries and the weighting functions. The pseudo-point technique is demonstrated by solving six problems, each depicting a certain aspect of the technique. The results are presented as flux vs energy at various spatial intervals. The sensitivity of the technique to the energy grid and the savings in computational effort are clearly demonstrated.« less
Simulating the interaction of the heliosphere with the local interstellar medium: MHD results from a finite volume approach, first bidimensional results

NASA Technical Reports Server (NTRS)

Chanteur, G.; Khanfir, R.

1995-01-01

We have designed a full compressible MHD code working on unstructured meshes in order to be able to compute accurately sharp structures embedded in large scale simulations. The code is based on a finite volume method making use of a kinetic flux splitting. A bidimensional version of the code has been used to simulate the interaction of a moving interstellar medium, magnetized or unmagnetized with a rotating and magnetized heliopspheric plasma source. Being aware that these computations are not realistic due to the restriction to two dimensions, we present it to demonstrate the ability of this new code to handle this problem. An axisymetric version, now under development, will be operational in a few months. Ultimately we plan to run a full 3d version.
The COSIMA experiments and their verification, a data base for the validation of two phase flow computer codes

NASA Astrophysics Data System (ADS)

Class, G.; Meyder, R.; Stratmanns, E.

1985-12-01

The large data base for validation and development of computer codes for two-phase flow, generated at the COSIMA facility, is reviewed. The aim of COSIMA is to simulate the hydraulic, thermal, and mechanical conditions in the subchannel and the cladding of fuel rods in pressurized water reactors during the blowout phase of a loss of coolant accident. In terms of fuel rod behavior, it is found that during blowout under realistic conditions only small strains are reached. For cladding rupture extremely high rod internal pressures are necessary. The behavior of fuel rod simulators and the effect of thermocouples attached to the cladding outer surface are clarified. Calculations performed with the codes RELAP and DRUFAN show satisfactory agreement with experiments. This can be improved by updating the phase separation models in the codes.

Dakota Uncertainty Quantification Methods Applied to the CFD code Nek5000

DOE Office of Scientific and Technical Information (OSTI.GOV)

Delchini, Marc-Olivier; Popov, Emilian L.; Pointer, William David

This report presents the state of advancement of a Nuclear Energy Advanced Modeling and Simulation (NEAMS) project to characterize the uncertainty of the computational fluid dynamics (CFD) code Nek5000 using the Dakota package for flows encountered in the nuclear engineering industry. Nek5000 is a high-order spectral element CFD code developed at Argonne National Laboratory for high-resolution spectral-filtered large eddy simulations (LESs) and unsteady Reynolds-averaged Navier-Stokes (URANS) simulations.
Methods and feasibility of collecting occupational data for a large population-based cohort study in the United States: the reasons for geographic and racial differences in stroke study

PubMed Central

2014-01-01

Background Coronary heart disease and stroke are major contributors to preventable mortality. Evidence links work conditions to these diseases; however, occupational data are perceived to be difficult to collect for large population-based cohorts. We report methodological details and the feasibility of conducting an occupational ancillary study for a large U.S. prospective cohort being followed longitudinally for cardiovascular disease and stroke. Methods Current and historical occupational information were collected from active participants of the REasons for Geographic And Racial Differences in Stroke (REGARDS) Study. A survey was designed to gather quality occupational data among this national cohort of black and white men and women aged 45 years and older (enrolled 2003–2007). Trained staff conducted Computer-Assisted Telephone Interviews (CATI). After a brief pilot period, interviewers received additional training in the collection of narrative industry and occupation data before administering the survey to remaining cohort members. Trained coders used a computer-assisted coding system to assign U.S. Census codes for industry and occupation. All data were double coded; discrepant codes were independently resolved. Results Over a 2-year period, 17,648 participants provided consent and completed the occupational survey (87% response rate). A total of 20,427 jobs were assigned Census codes. Inter-rater reliability was 80% for industry and 74% for occupation. Less than 0.5% of the industry and occupation data were uncodable, compared with 12% during the pilot period. Concordance between the current and longest-held jobs was moderately high. The median time to collect employment status plus narrative and descriptive job information by CATI was 1.6 to 2.3 minutes per job. Median time to assign Census codes was 1.3 minutes per rater. Conclusions The feasibility of conducting high-quality occupational data collection and coding for a large heterogeneous population-based sample was demonstrated. We found that training for interview staff was important in ensuring that narrative responses for industry and occupation were adequately specified for coding. Estimates of survey administration time and coding from digital records provide an objective basis for planning future studies. The social and environmental conditions of work are important understudied risk factors that can be feasibly integrated into large population-based health studies. PMID:24512119
Methods and feasibility of collecting occupational data for a large population-based cohort study in the United States: the reasons for geographic and racial differences in stroke study.

PubMed

MacDonald, Leslie A; Pulley, LeaVonne; Hein, Misty J; Howard, Virginia J

2014-02-10

Coronary heart disease and stroke are major contributors to preventable mortality. Evidence links work conditions to these diseases; however, occupational data are perceived to be difficult to collect for large population-based cohorts. We report methodological details and the feasibility of conducting an occupational ancillary study for a large U.S. prospective cohort being followed longitudinally for cardiovascular disease and stroke. Current and historical occupational information were collected from active participants of the REasons for Geographic And Racial Differences in Stroke (REGARDS) Study. A survey was designed to gather quality occupational data among this national cohort of black and white men and women aged 45 years and older (enrolled 2003-2007). Trained staff conducted Computer-Assisted Telephone Interviews (CATI). After a brief pilot period, interviewers received additional training in the collection of narrative industry and occupation data before administering the survey to remaining cohort members. Trained coders used a computer-assisted coding system to assign U.S. Census codes for industry and occupation. All data were double coded; discrepant codes were independently resolved. Over a 2-year period, 17,648 participants provided consent and completed the occupational survey (87% response rate). A total of 20,427 jobs were assigned Census codes. Inter-rater reliability was 80% for industry and 74% for occupation. Less than 0.5% of the industry and occupation data were uncodable, compared with 12% during the pilot period. Concordance between the current and longest-held jobs was moderately high. The median time to collect employment status plus narrative and descriptive job information by CATI was 1.6 to 2.3 minutes per job. Median time to assign Census codes was 1.3 minutes per rater. The feasibility of conducting high-quality occupational data collection and coding for a large heterogeneous population-based sample was demonstrated. We found that training for interview staff was important in ensuring that narrative responses for industry and occupation were adequately specified for coding. Estimates of survey administration time and coding from digital records provide an objective basis for planning future studies. The social and environmental conditions of work are important understudied risk factors that can be feasibly integrated into large population-based health studies.
Error control techniques for satellite and space communications

NASA Technical Reports Server (NTRS)

Costello, Daniel J., Jr.

1992-01-01

Worked performed during the reporting period is summarized. Construction of robustly good trellis codes for use with sequential decoding was developed. The robustly good trellis codes provide a much better trade off between free distance and distance profile. The unequal error protection capabilities of convolutional codes was studied. The problem of finding good large constraint length, low rate convolutional codes for deep space applications is investigated. A formula for computing the free distance of 1/n convolutional codes was discovered. Double memory (DM) codes, codes with two memory units per unit bit position, were studied; a search for optimal DM codes is being conducted. An algorithm for constructing convolutional codes from a given quasi-cyclic code was developed. Papers based on the above work are included in the appendix.
GPU implementation of the linear scaling three dimensional fragment method for large scale electronic structure calculations

NASA Astrophysics Data System (ADS)

Jia, Weile; Wang, Jue; Chi, Xuebin; Wang, Lin-Wang

2017-02-01

LS3DF, namely linear scaling three-dimensional fragment method, is an efficient linear scaling ab initio total energy electronic structure calculation code based on a divide-and-conquer strategy. In this paper, we present our GPU implementation of the LS3DF code. Our test results show that the GPU code can calculate systems with about ten thousand atoms fully self-consistently in the order of 10 min using thousands of computing nodes. This makes the electronic structure calculations of 10,000-atom nanosystems routine work. This speed is 4.5-6 times faster than the CPU calculations using the same number of nodes on the Titan machine in the Oak Ridge leadership computing facility (OLCF). Such speedup is achieved by (a) carefully re-designing of the computationally heavy kernels; (b) redesign of the communication pattern for heterogeneous supercomputers.
Two-dimensional finite-element analyses of simulated rotor-fragment impacts against rings and beams compared with experiments

NASA Technical Reports Server (NTRS)

Stagliano, T. R.; Witmer, E. A.; Rodal, J. J. A.

1979-01-01

Finite element modeling alternatives as well as the utility and limitations of the two dimensional structural response computer code CIVM-JET 4B for predicting the transient, large deflection, elastic plastic, structural responses of two dimensional beam and/or ring structures which are subjected to rigid fragment impact were investigated. The applicability of the CIVM-JET 4B analysis and code for the prediction of steel containment ring response to impact by complex deformable fragments from a trihub burst of a T58 turbine rotor was studied. Dimensional analysis considerations were used in a parametric examination of data from engine rotor burst containment experiments and data from sphere beam impact experiments. The use of the CIVM-JET 4B computer code for making parametric structural response studies on both fragment-containment structure and fragment-deflector structure was illustrated. Modifications to the analysis/computation procedure were developed to alleviate restrictions.
[The computer assisted pacemaker clinic at the regional hospital of Udine (author's transl)].

PubMed

Feruglio, G A; Lestuzzi, L; Carminati, D

1978-01-01

For a close follow-up of large groups of pacemaker patients and for evaluation of long term pacing on a reliable statistical basis, many pacemaker centers in the world are now using computer systems. A patient data system with structured display records, designed to give complete, comprehensive and surveyable information and which are immediately retrievable 24 hours a day, on display or printed sets, seems to offer an ideal solution. The pacemaker clinic at the Regional Hospital of Udine has adopted this type of system. The clinic in linked to a live, on-line patient data system (G/3, Informatica Friuli-Venezia Giulia). The input and retrieval of information are made through a conventional keyboard. The input formats have fixed headings with coded alternatives and a limited space for comments in free text. The computer edits the coded information to surveyable reviews. Searches can be made on coded information and data of interest.
Preconditioning for Numerical Simulation of Low Mach Number Three-Dimensional Viscous Turbomachinery Flows

NASA Technical Reports Server (NTRS)

Tweedt, Daniel L.; Chima, Rodrick V.; Turkel, Eli

1997-01-01

A preconditioning scheme has been implemented into a three-dimensional viscous computational fluid dynamics code for turbomachine blade rows. The preconditioning allows the code, originally developed for simulating compressible flow fields, to be applied to nearly-incompressible, low Mach number flows. A brief description is given of the compressible Navier-Stokes equations for a rotating coordinate system, along with the preconditioning method employed. Details about the conservative formulation of artificial dissipation are provided, and different artificial dissipation schemes are discussed and compared. The preconditioned code was applied to a well-documented case involving the NASA large low-speed centrifugal compressor for which detailed experimental data are available for comparison. Performance and flow field data are compared for the near-design operating point of the compressor, with generally good agreement between computation and experiment. Further, significant differences between computational results for the different numerical implementations, revealing different levels of solution accuracy, are discussed.
Microsoft C#.NET program and electromagnetic depth sounding for large loop source

NASA Astrophysics Data System (ADS)

Prabhakar Rao, K.; Ashok Babu, G.

2009-07-01

A program, in the C# (C Sharp) language with Microsoft.NET Framework, is developed to compute the normalized vertical magnetic field of a horizontal rectangular loop source placed on the surface of an n-layered earth. The field can be calculated either inside or outside the loop. Five C# classes with member functions in each class are, designed to compute the kernel, Hankel transform integral, coefficients for cubic spline interpolation between computed values and the normalized vertical magnetic field. The program computes the vertical magnetic field in the frequency domain using the integral expressions evaluated by a combination of straightforward numerical integration and the digital filter technique. The code utilizes different object-oriented programming (OOP) features. It finally computes the amplitude and phase of the normalized vertical magnetic field. The computed results are presented for geometric and parametric soundings. The code is developed in Microsoft.NET visual studio 2003 and uses various system class libraries.
Development of the 3DHZETRN code for space radiation protection

NASA Astrophysics Data System (ADS)

Wilson, John; Badavi, Francis; Slaba, Tony; Reddell, Brandon; Bahadori, Amir; Singleterry, Robert

Space radiation protection requires computationally efficient shield assessment methods that have been verified and validated. The HZETRN code is the engineering design code used for low Earth orbit dosimetric analysis and astronaut record keeping with end-to-end validation to twenty percent in Space Shuttle and International Space Station operations. HZETRN treated diffusive leakage only at the distal surface limiting its application to systems with a large radius of curvature. A revision of HZETRN that included forward and backward diffusion allowed neutron leakage to be evaluated at both the near and distal surfaces. That revision provided a deterministic code of high computational efficiency that was in substantial agreement with Monte Carlo (MC) codes in flat plates (at least to the degree that MC codes agree among themselves). In the present paper, the 3DHZETRN formalism capable of evaluation in general geometry is described. Benchmarking will help quantify uncertainty with MC codes (Geant4, FLUKA, MCNP6, and PHITS) in simple shapes such as spheres within spherical shells and boxes. Connection of the 3DHZETRN to general geometry will be discussed.
Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

NASA Technical Reports Server (NTRS)

Morgan, Philip E.

2004-01-01

This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Lark-Eddy Turbulent FLow Simulations Using Massively Parallel Computers" and "Devleop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications. The discussion of Scalable High Performance Computing reports on three objectives: validate, access scalability, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulations (DNS) and Large Eddy Simulation (LES) problems; and Investigate and develop a high-order Reynolds averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to be able to effectively model antenna problems; utilize lessons learned in high-order/spectral solution of swirling 3D jets to apply to solving electromagnetics project; transition a high-order fluids code, FDL3DI, to be able to solve Maxwell's Equations using compact-differencing; develop and demonstrate improved radiation absorbing boundary conditions for high-order CEM; and extend high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Sierra/Solid Mechanics 4.48 User's Guide.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Merewether, Mark Thomas; Crane, Nathan K; de Frias, Gabriel Jose

Sierra/SolidMechanics (Sierra/SM) is a Lagrangian, three-dimensional code for finite element analysis of solids and structures. It provides capabilities for explicit dynamic, implicit quasistatic and dynamic analyses. The explicit dynamics capabilities allow for the efficient and robust solution of models with extensive contact subjected to large, suddenly applied loads. For implicit problems, Sierra/SM uses a multi-level iterative solver, which enables it to effectively solve problems with large deformations, nonlinear material behavior, and contact. Sierra/SM has a versatile library of continuum and structural elements, and a large library of material models. The code is written for parallel computing environments enabling scalable solutionsmore » of extremely large problems for both implicit and explicit analyses. It is built on the SIERRA Framework, which facilitates coupling with other SIERRA mechanics codes. This document describes the functionality and input syntax for Sierra/SM.« less
Gigaflop performance on a CRAY-2: Multitasking a computational fluid dynamics application

NASA Technical Reports Server (NTRS)

Tennille, Geoffrey M.; Overman, Andrea L.; Lambiotte, Jules J.; Streett, Craig L.

1991-01-01

The methodology is described for converting a large, long-running applications code that executed on a single processor of a CRAY-2 supercomputer to a version that executed efficiently on multiple processors. Although the conversion of every application is different, a discussion of the types of modification used to achieve gigaflop performance is included to assist others in the parallelization of applications for CRAY computers, especially those that were developed for other computers. An existing application, from the discipline of computational fluid dynamics, that had utilized over 2000 hrs of CPU time on CRAY-2 during the previous year was chosen as a test case to study the effectiveness of multitasking on a CRAY-2. The nature of dominant calculations within the application indicated that a sustained computational rate of 1 billion floating-point operations per second, or 1 gigaflop, might be achieved. The code was first analyzed and modified for optimal performance on a single processor in a batch environment. After optimal performance on a single CPU was achieved, the code was modified to use multiple processors in a dedicated environment. The results of these two efforts were merged into a single code that had a sustained computational rate of over 1 gigaflop on a CRAY-2. Timings and analysis of performance are given for both single- and multiple-processor runs.
Wasatch: An architecture-proof multiphysics development environment using a Domain Specific Language and graph theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Saad, Tony; Sutherland, James C.

To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment – an effort that spans an interdisciplinarymore » team of engineers and computer scientists.« less
Wasatch: An architecture-proof multiphysics development environment using a Domain Specific Language and graph theory

DOE PAGES

Saad, Tony; Sutherland, James C.

2016-05-04

To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment – an effort that spans an interdisciplinarymore » team of engineers and computer scientists.« less
PETSc Users Manual Revision 3.7

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, Satish; Abhyankar, S.; Adams, M.

This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.
PETSc Users Manual Revision 3.8

DOE Office of Scientific and Technical Information (OSTI.GOV)

Balay, S.; Abhyankar, S.; Adams, M.

This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.
Recent Developments in the Application of Biologically Inspired Computation to Chemical Sensing

NASA Astrophysics Data System (ADS)

Marco, S.; Gutierrez-Gálvez, A.

2009-05-01

Biological olfaction outperforms chemical instrumentation in specificity, response time, detection limit, coding capacity, time stability, robustness, size, power consumption, and portability. This biological function provides outstanding performance due, to a large extent, to the unique architecture of the olfactory pathway, which combines a high degree of redundancy, an efficient combinatorial coding along with unmatched chemical information processing mechanisms. The last decade has witnessed important advances in the understanding of the computational primitives underlying the functioning of the olfactory system. In this work, the state of the art concerning biologically inspired computation for chemical sensing will be reviewed. Instead of reviewing the whole body of computational neuroscience of olfaction, we restrict this review to the application of models to the processing of real chemical sensor data.
Modeling Spectra of Icy Satellites and Cometary Icy Particles Using Multi-Sphere T-Matrix Code

NASA Astrophysics Data System (ADS)

Kolokolova, Ludmilla; Mackowski, Daniel; Pitman, Karly M.; Joseph, Emily C. S.; Buratti, Bonnie J.; Protopapa, Silvia; Kelley, Michael S.

2016-10-01

The Multi-Sphere T-matrix code (MSTM) allows rigorous computations of characteristics of the light scattered by a cluster of spherical particles. It was introduced to the scientific community in 1996 (Mackowski & Mishchenko, 1996, JOSA A, 13, 2266). Later it was put online and became one of the most popular codes to study photopolarimetric properties of aggregated particles. Later versions of this code, especially its parallelized version MSTM3 (Mackowski & Mishchenko, 2011, JQSRT, 112, 2182), were used to compute angular and wavelength dependence of the intensity and polarization of light scattered by aggregates of up to 4000 constituent particles (Kolokolova & Mackowski, 2012, JQSRT, 113, 2567). The version MSTM4 considers large thick slabs of spheres (Mackowski, 2014, Proc. of the Workshop ``Scattering by aggregates``, Bremen, Germany, March 2014, Th. Wriedt & Yu. Eremin, Eds., 6) and is significantly different from the earlier versions. It adopts a Discrete Fourier Convolution, implemented using a Fast Fourier Transform, for evaluation of the exciting field. MSTM4 is able to treat dozens of thousands of spheres and is about 100 times faster than the MSTM3 code. This allows us not only to compute the light scattering properties of a large number of electromagnetically interacting constituent particles, but also to perform multi-wavelength and multi-angular computations using computer resources with rather reasonable CPU and computer memory. We used MSTM4 to model near-infrared spectra of icy satellites of Saturn (Rhea, Dione, and Tethys data from Cassini VIMS), and of icy particles observed in the coma of comet 103P/Hartley 2 (data from EPOXI/DI HRII). Results of our modeling show that in the case of icy satellites the best fit to the observed spectra is provided by regolith made of spheres of radius ~1 micron with a porosity in the range 85% - 95%, which slightly varies for the different satellites. Fitting the spectra of the cometary icy particles requires icy aggregates of size larger than 40 micron with constituent spheres in the micron size range.
Single stock dynamics on high-frequency data: from a compressed coding perspective.

PubMed

Fushing, Hsieh; Chen, Shu-Chun; Hwang, Chii-Ruey

2014-01-01

High-frequency return, trading volume and transaction number are digitally coded via a nonparametric computing algorithm, called hierarchical factor segmentation (HFS), and then are coupled together to reveal a single stock dynamics without global state-space structural assumptions. The base-8 digital coding sequence, which is capable of revealing contrasting aggregation against sparsity of extreme events, is further compressed into a shortened sequence of state transitions. This compressed digital code sequence vividly demonstrates that the aggregation of large absolute returns is the primary driving force for stimulating both the aggregations of large trading volumes and transaction numbers. The state of system-wise synchrony is manifested with very frequent recurrence in the stock dynamics. And this data-driven dynamic mechanism is seen to correspondingly vary as the global market transiting in and out of contraction-expansion cycles. These results not only elaborate the stock dynamics of interest to a fuller extent, but also contradict some classical theories in finance. Overall this version of stock dynamics is potentially more coherent and realistic, especially when the current financial market is increasingly powered by high-frequency trading via computer algorithms, rather than by individual investors.

Single Stock Dynamics on High-Frequency Data: From a Compressed Coding Perspective

PubMed Central

Fushing, Hsieh; Chen, Shu-Chun; Hwang, Chii-Ruey

2014-01-01

High-frequency return, trading volume and transaction number are digitally coded via a nonparametric computing algorithm, called hierarchical factor segmentation (HFS), and then are coupled together to reveal a single stock dynamics without global state-space structural assumptions. The base-8 digital coding sequence, which is capable of revealing contrasting aggregation against sparsity of extreme events, is further compressed into a shortened sequence of state transitions. This compressed digital code sequence vividly demonstrates that the aggregation of large absolute returns is the primary driving force for stimulating both the aggregations of large trading volumes and transaction numbers. The state of system-wise synchrony is manifested with very frequent recurrence in the stock dynamics. And this data-driven dynamic mechanism is seen to correspondingly vary as the global market transiting in and out of contraction-expansion cycles. These results not only elaborate the stock dynamics of interest to a fuller extent, but also contradict some classical theories in finance. Overall this version of stock dynamics is potentially more coherent and realistic, especially when the current financial market is increasingly powered by high-frequency trading via computer algorithms, rather than by individual investors. PMID:24586235
MicroHH 1.0: a computational fluid dynamics code for direct numerical simulation and large-eddy simulation of atmospheric boundary layer flows

NASA Astrophysics Data System (ADS)

van Heerwaarden, Chiel C.; van Stratum, Bart J. H.; Heus, Thijs; Gibbs, Jeremy A.; Fedorovich, Evgeni; Mellado, Juan Pedro

2017-08-01

This paper describes MicroHH 1.0, a new and open-source (www.microhh.org) computational fluid dynamics code for the simulation of turbulent flows in the atmosphere. It is primarily made for direct numerical simulation but also supports large-eddy simulation (LES). The paper covers the description of the governing equations, their numerical implementation, and the parameterizations included in the code. Furthermore, the paper presents the validation of the dynamical core in the form of convergence and conservation tests, and comparison of simulations of channel flows and slope flows against well-established test cases. The full numerical model, including the associated parameterizations for LES, has been tested for a set of cases under stable and unstable conditions, under the Boussinesq and anelastic approximations, and with dry and moist convection under stationary and time-varying boundary conditions. The paper presents performance tests showing good scaling from 256 to 32 768 processes. The graphical processing unit (GPU)-enabled version of the code can reach a speedup of more than an order of magnitude for simulations that fit in the memory of a single GPU.
Transonic Navier-Stokes wing solutions using a zonal approach. Part 2: High angle-of-attack simulation

NASA Technical Reports Server (NTRS)

Chaderjian, N. M.

1986-01-01

A computer code is under development whereby the thin-layer Reynolds-averaged Navier-Stokes equations are to be applied to realistic fighter-aircraft configurations. This transonic Navier-Stokes code (TNS) utilizes a zonal approach in order to treat complex geometries and satisfy in-core computer memory constraints. The zonal approach has been applied to isolated wing geometries in order to facilitate code development. Part 1 of this paper addresses the TNS finite-difference algorithm, zonal methodology, and code validation with experimental data. Part 2 of this paper addresses some numerical issues such as code robustness, efficiency, and accuracy at high angles of attack. Special free-stream-preserving metrics proved an effective way to treat H-mesh singularities over a large range of severe flow conditions, including strong leading-edge flow gradients, massive shock-induced separation, and stall. Furthermore, lift and drag coefficients have been computed for a wing up through CLmax. Numerical oil flow patterns and particle trajectories are presented both for subcritical and transonic flow. These flow simulations are rich with complex separated flow physics and demonstrate the efficiency and robustness of the zonal approach.
Time-Shifted Boundary Conditions Used for Navier-Stokes Aeroelastic Solver

NASA Technical Reports Server (NTRS)

Srivastava, Rakesh

1999-01-01

Under the Advanced Subsonic Technology (AST) Program, an aeroelastic analysis code (TURBO-AE) based on Navier-Stokes equations is currently under development at NASA Lewis Research Center s Machine Dynamics Branch. For a blade row, aeroelastic instability can occur in any of the possible interblade phase angles (IBPA s). Analyzing small IBPA s is very computationally expensive because a large number of blade passages must be simulated. To reduce the computational cost of these analyses, we used time shifted, or phase-lagged, boundary conditions in the TURBO-AE code. These conditions can be used to reduce the computational domain to a single blade passage by requiring the boundary conditions across the passage to be lagged depending on the IBPA being analyzed. The time-shifted boundary conditions currently implemented are based on the direct-store method. This method requires large amounts of data to be stored over a period of the oscillation cycle. On CRAY computers this is not a major problem because solid-state devices can be used for fast input and output to read and write the data onto a disk instead of storing it in core memory.
Survey of computer programs for heat transfer analysis

NASA Astrophysics Data System (ADS)

Noor, A. K.

An overview is presented of the current capabilities of thirty-eight computer programs that can be used for solution of heat transfer problems. These programs range from the large, general-purpose codes with a broad spectrum of capabilities, large user community and comprehensive user support (e.g., ANSYS, MARC, MITAS 2 MSC/NASTRAN, SESAM-69/NV-615) to the small, special purpose codes with limited user community such as ANDES, NNTB, SAHARA, SSPTA, TACO, TEPSA AND TRUMP. The capabilities of the programs surveyed are listed in tabular form followed by a summary of the major features of each program. As with any survey of computer programs, the present one has the following limitations: (1) It is useful only in the initial selection of the programs which are most suitable for a particular application. The final selection of the program to be used should, however, be based on a detailed examination of the documentation and the literature about the program; (2) Since computer software continually changes, often at a rapid rate, some means must be found for updating this survey and maintaining some degree of currency.
Survey of computer programs for heat transfer analysis

NASA Technical Reports Server (NTRS)

Noor, A. K.

1982-01-01

An overview is presented of the current capabilities of thirty-eight computer programs that can be used for solution of heat transfer problems. These programs range from the large, general-purpose codes with a broad spectrum of capabilities, large user community and comprehensive user support (e.g., ANSYS, MARC, MITAS 2 MSC/NASTRAN, SESAM-69/NV-615) to the small, special purpose codes with limited user community such as ANDES, NNTB, SAHARA, SSPTA, TACO, TEPSA AND TRUMP. The capabilities of the programs surveyed are listed in tabular form followed by a summary of the major features of each program. As with any survey of computer programs, the present one has the following limitations: (1) It is useful only in the initial selection of the programs which are most suitable for a particular application. The final selection of the program to be used should, however, be based on a detailed examination of the documentation and the literature about the program; (2) Since computer software continually changes, often at a rapid rate, some means must be found for updating this survey and maintaining some degree of currency.
Dynamic Elasto-Plastic Response of Shells in an Acoustic Medium - Theoretical Development for the EPSA Code

DTIC Science & Technology

1978-07-01

TECHNOLOGY OFFICE OF NAVAL RESEARCH ARLINGTON* VA 22217 ATTN CODE 200 NAVAL. UNDERWATER SYSTEMS COMMAND NEWPORT. RI 02840 ATTN DRo AZRIEL HARARI/ 3 .b 311...ANAOST.FIT THEORETICAL DEVELOPMENT FOR THE EPSA CODE ~/ R/Atkatsh, M.P./Bieniek. -AM M.L.,/aron OFF NAVAL RESEARCH CONTRACT N/ 3 14-72-C-19~. TRACT 7_...the report, both procedures result In a marked increase in computational efficiency, parti- cularly for cases in which large systems are to be
Experimental program for real gas flow code validation at NASA Ames Research Center

NASA Technical Reports Server (NTRS)

Deiwert, George S.; Strawa, Anthony W.; Sharma, Surendra P.; Park, Chul

1989-01-01

The experimental program for validating real gas hypersonic flow codes at NASA Ames Rsearch Center is described. Ground-based test facilities used include ballistic ranges, shock tubes and shock tunnels, arc jet facilities and heated-air hypersonic wind tunnels. Also included are large-scale computer systems for kinetic theory simulations and benchmark code solutions. Flight tests consist of the Aeroassist Flight Experiment, the Space Shuttle, Project Fire 2, and planetary probes such as Galileo, Pioneer Venus, and PAET.
Adagio 4.20 User’s Guide

DOE Office of Scientific and Technical Information (OSTI.GOV)

Spencer, Benjamin Whiting; Crane, Nathan K.; Heinstein, Martin W.

2011-03-01

Adagio is a Lagrangian, three-dimensional, implicit code for the analysis of solids and structures. It uses a multi-level iterative solver, which enables it to solve problems with large deformations, nonlinear material behavior, and contact. It also has a versatile library of continuum and structural elements, and an extensive library of material models. Adagio is written for parallel computing environments, and its solvers allow for scalable solutions of very large problems. Adagio uses the SIERRA Framework, which allows for coupling with other SIERRA mechanics codes. This document describes the functionality and input structure for Adagio.
Initial Computations of Vertical Displacement Events with NIMROD

NASA Astrophysics Data System (ADS)

Bunkers, Kyle; Sovinec, C. R.

2014-10-01

Disruptions associated with vertical displacement events (VDEs) have potential for causing considerable physical damage to ITER and other tokamak experiments. We report on initial computations of generic axisymmetric VDEs using the NIMROD code [Sovinec et al., JCP 195, 355 (2004)]. An implicit thin-wall computation has been implemented to couple separate internal and external regions without numerical stability limitations. A simple rectangular cross-section domain generated with the NIMEQ code [Howell and Sovinec, CPC (2014)] modified to use a symmetry condition at the midplane is used to test linear and nonlinear axisymmetric VDE computation. As current in simulated external coils for large- R / a cases is varied, there is a clear n = 0 stability threshold which lies below the decay-index criterion for the current-loop model of a tokamak to model VDEs [Mukhovatov and Shafranov, Nucl. Fusion 11, 605 (1971)]; a scan of wall distance indicates the offset is due to the influence of the conducting wall. Results with a vacuum region surrounding a resistive wall will also be presented. Initial nonlinear computations show large vertical displacement of an intact simulated tokamak. This effort is supported by U.S. Department of Energy Grant DE-FG02-06ER54850.
Evolution of plastic anisotropy for high-strain-rate computations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schiferl, S.K.; Maudlin, P.J.

1994-12-01

A model for anisotropic material strength, and for changes in the anisotropy due to plastic strain, is described. This model has been developed for use in high-rate, explicit, Lagrangian multidimensional continuum-mechanics codes. The model handles anisotropies in single-phase materials, in particular the anisotropies due to crystallographic texture--preferred orientations of the single-crystal grains. Textural anisotropies, and the changes in these anisotropies, depend overwhelmingly no the crystal structure of the material and on the deformation history. The changes, particularly for a complex deformations, are not amenable to simple analytical forms. To handle this problem, the material model described here includes a texturemore » code, or micromechanical calculation, coupled to a continuum code. The texture code updates grain orientations as a function of tensor plastic strain, and calculates the yield strength in different directions. A yield function is fitted to these yield points. For each computational cell in the continuum simulation, the texture code tracks a particular set of grain orientations. The orientations will change due to the tensor strain history, and the yield function will change accordingly. Hence, the continuum code supplies a tensor strain to the texture code, and the texture code supplies an updated yield function to the continuum code. Since significant texture changes require relatively large strains--typically, a few percent or more--the texture code is not called very often, and the increase in computer time is not excessive. The model was implemented, using a finite-element continuum code and a texture code specialized for hexagonal-close-packed crystal structures. The results for several uniaxial stress problems and an explosive-forming problem are shown.« less
Parallel CARLOS-3D code development

DOE Office of Scientific and Technical Information (OSTI.GOV)

Putnam, J.M.; Kotulski, J.D.

1996-02-01

CARLOS-3D is a three-dimensional scattering code which was developed under the sponsorship of the Electromagnetic Code Consortium, and is currently used by over 80 aerospace companies and government agencies. The code has been extensively validated and runs on both serial workstations and parallel super computers such as the Intel Paragon. CARLOS-3D is a three-dimensional surface integral equation scattering code based on a Galerkin method of moments formulation employing Rao- Wilton-Glisson roof-top basis for triangular faceted surfaces. Fully arbitrary 3D geometries composed of multiple conducting and homogeneous bulk dielectric materials can be modeled. This presentation describes some of the extensions tomore » the CARLOS-3D code, and how the operator structure of the code facilitated these improvements. Body of revolution (BOR) and two-dimensional geometries were incorporated by simply including new input routines, and the appropriate Galerkin matrix operator routines. Some additional modifications were required in the combined field integral equation matrix generation routine due to the symmetric nature of the BOR and 2D operators. Quadrilateral patched surfaces with linear roof-top basis functions were also implemented in the same manner. Quadrilateral facets and triangular facets can be used in combination to more efficiently model geometries with both large smooth surfaces and surfaces with fine detail such as gaps and cracks. Since the parallel implementation in CARLOS-3D is at high level, these changes were independent of the computer platform being used. This approach minimizes code maintenance, while providing capabilities with little additional effort. Results are presented showing the performance and accuracy of the code for some large scattering problems. Comparisons between triangular faceted and quadrilateral faceted geometry representations will be shown for some complex scatterers.« less
Toward a first-principles integrated simulation of tokamak edge plasmas

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chang, C S; Klasky, Scott A; Cummings, Julian

2008-01-01

Performance of the ITER is anticipated to be highly sensitive to the edge plasma condition. The edge pedestal in ITER needs to be predicted from an integrated simulation of the necessary firstprinciples, multi-scale physics codes. The mission of the SciDAC Fusion Simulation Project (FSP) Prototype Center for Plasma Edge Simulation (CPES) is to deliver such a code integration framework by (1) building new kinetic codes XGC0 and XGC1, which can simulate the edge pedestal buildup; (2) using and improving the existing MHD codes ELITE, M3D-OMP, M3D-MPP and NIMROD, for study of large-scale edge instabilities called Edge Localized Modes (ELMs); andmore » (3) integrating the codes into a framework using cutting-edge computer science technology. Collaborative effort among physics, computer science, and applied mathematics within CPES has created the first working version of the End-to-end Framework for Fusion Integrated Simulation (EFFIS), which can be used to study the pedestal-ELM cycles.« less
Open-Source Python Tools for Deploying Interactive GIS Dashboards for a Billion Datapoints on a Laptop

NASA Astrophysics Data System (ADS)

Steinberg, P. D.; Bednar, J. A.; Rudiger, P.; Stevens, J. L. R.; Ball, C. E.; Christensen, S. D.; Pothina, D.

2017-12-01

The rich variety of software libraries available in the Python scientific ecosystem provides a flexible and powerful alternative to traditional integrated GIS (geographic information system) programs. Each such library focuses on doing a certain set of general-purpose tasks well, and Python makes it relatively simple to glue the libraries together to solve a wide range of complex, open-ended problems in Earth science. However, choosing an appropriate set of libraries can be challenging, and it is difficult to predict how much "glue code" will be needed for any particular combination of libraries and tasks. Here we present a set of libraries that have been designed to work well together to build interactive analyses and visualizations of large geographic datasets, in standard web browsers. The resulting workflows run on ordinary laptops even for billions of data points, and easily scale up to larger compute clusters when available. The declarative top-level interface used in these libraries means that even complex, fully interactive applications can be built and deployed as web services using only a few dozen lines of code, making it simple to create and share custom interactive applications even for datasets too large for most traditional GIS systems. The libraries we will cover include GeoViews (HoloViews extended for geographic applications) for declaring visualizable/plottable objects, Bokeh for building visual web applications from GeoViews objects, Datashader for rendering arbitrarily large datasets faithfully as fixed-size images, Param for specifying user-modifiable parameters that model your domain, Xarray for computing with n-dimensional array data, Dask for flexibly dispatching computational tasks across processors, and Numba for compiling array-based Python code down to fast machine code. We will show how to use the resulting workflow with static datasets and with simulators such as GSSHA or AdH, allowing you to deploy flexible, high-performance web-based dashboards for your GIS data or simulations without needing major investments in code development or maintenance.
Constructions for finite-state codes

NASA Technical Reports Server (NTRS)

Pollara, F.; Mceliece, R. J.; Abdel-Ghaffar, K.

1987-01-01

A class of codes called finite-state (FS) codes is defined and investigated. These codes, which generalize both block and convolutional codes, are defined by their encoders, which are finite-state machines with parallel inputs and outputs. A family of upper bounds on the free distance of a given FS code is derived from known upper bounds on the minimum distance of block codes. A general construction for FS codes is then given, based on the idea of partitioning a given linear block into cosets of one of its subcodes, and it is shown that in many cases the FS codes constructed in this way have a d sub free which is as large as possible. These codes are found without the need for lengthy computer searches, and have potential applications for future deep-space coding systems. The issue of catastropic error propagation (CEP) for FS codes is also investigated.
HEPLIB `91: International users meeting on the support and environments of high energy physics computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnstad, H.

The purpose of this meeting is to discuss the current and future HEP computing support and environments from the perspective of new horizons in accelerator, physics, and computing technologies. Topics of interest to the Meeting include (but are limited to): the forming of the HEPLIB world user group for High Energy Physic computing; mandate, desirables, coordination, organization, funding; user experience, international collaboration; the roles of national labs, universities, and industry; range of software, Monte Carlo, mathematics, physics, interactive analysis, text processors, editors, graphics, data base systems, code management tools; program libraries, frequency of updates, distribution; distributed and interactive computing, datamore » base systems, user interface, UNIX operating systems, networking, compilers, Xlib, X-Graphics; documentation, updates, availability, distribution; code management in large collaborations, keeping track of program versions; and quality assurance, testing, conventions, standards.« less
HEPLIB 91: International users meeting on the support and environments of high energy physics computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnstad, H.

The purpose of this meeting is to discuss the current and future HEP computing support and environments from the perspective of new horizons in accelerator, physics, and computing technologies. Topics of interest to the Meeting include (but are limited to): the forming of the HEPLIB world user group for High Energy Physic computing; mandate, desirables, coordination, organization, funding; user experience, international collaboration; the roles of national labs, universities, and industry; range of software, Monte Carlo, mathematics, physics, interactive analysis, text processors, editors, graphics, data base systems, code management tools; program libraries, frequency of updates, distribution; distributed and interactive computing, datamore » base systems, user interface, UNIX operating systems, networking, compilers, Xlib, X-Graphics; documentation, updates, availability, distribution; code management in large collaborations, keeping track of program versions; and quality assurance, testing, conventions, standards.« less
TABULATED EQUIVALENT SDR FLAMELET (TESF) MODEFL

DOE Office of Scientific and Technical Information (OSTI.GOV)

KUNDU, PRITHWISH; AMEEN, mUHSIN MOHAMMED; UNNIKRISHNAN, UMESH

The code consists of an implementation of a novel tabulated combustion model for non-premixed flames in CFD solvers. This novel technique/model is used to implement an unsteady flamelet tabulation without using progress variables for non-premixed flames. It also has the capability to include history effects which is unique within tabulated flamelet models. The flamelet table generation code can be run in parallel to generate tables with large chemistry mechanisms in relatively short wall clock times. The combustion model/code reads these tables. This framework can be coupled with any CFD solver with RANS as well as LES turbulence models. This frameworkmore » enables CFD solvers to run large chemistry mechanisms with large number of grids at relatively lower computational costs. Currently it has been coupled with the Converge CFD code and validated against available experimental data. This model can be used to simulate non-premixed combustion in a variety of applications like reciprocating engines, gas turbines and industrial burners operating over a wide range of fuels.« less
The best bits in an iris code.

PubMed

Hollingsworth, Karen P; Bowyer, Kevin W; Flynn, Patrick J

2009-06-01

Iris biometric systems apply filters to iris images to extract information about iris texture. Daugman's approach maps the filter output to a binary iris code. The fractional Hamming distance between two iris codes is computed and decisions about the identity of a person are based on the computed distance. The fractional Hamming distance weights all bits in an iris code equally. However, not all the bits in an iris code are equally useful. Our research is the first to present experiments documenting that some bits are more consistent than others. Different regions of the iris are compared to evaluate their relative consistency, and contrary to some previous research, we find that the middle bands of the iris are more consistent than the inner bands. The inconsistent-bit phenomenon is evident across genders and different filter types. Possible causes of inconsistencies, such as segmentation, alignment issues, and different filters are investigated. The inconsistencies are largely due to the coarse quantization of the phase response. Masking iris code bits corresponding to complex filter responses near the axes of the complex plane improves the separation between the match and nonmatch Hamming distance distributions.
Off-design computer code for calculating the aerodynamic performance of axial-flow fans and compressors

NASA Technical Reports Server (NTRS)

Schmidt, James F.

1995-01-01

An off-design axial-flow compressor code is presented and is available from COSMIC for predicting the aerodynamic performance maps of fans and compressors. Steady axisymmetric flow is assumed and the aerodynamic solution reduces to solving the two-dimensional flow field in the meridional plane. A streamline curvature method is used for calculating this flow-field outside the blade rows. This code allows for bleed flows and the first five stators can be reset for each rotational speed, capabilities which are necessary for large multistage compressors. The accuracy of the off-design performance predictions depend upon the validity of the flow loss and deviation correlation models. These empirical correlations for the flow loss and deviation are used to model the real flow effects and the off-design code will compute through small reverse flow regions. The input to this off-design code is fully described and a user's example case for a two-stage fan is included with complete input and output data sets. Also, a comparison of the off-design code predictions with experimental data is included which generally shows good agreement.

The development of an explicit thermochemical nonequilibrium algorithm and its application to compute three dimensional AFE flowfields

NASA Technical Reports Server (NTRS)

Palmer, Grant

1989-01-01

This study presents a three-dimensional explicit, finite-difference, shock-capturing numerical algorithm applied to viscous hypersonic flows in thermochemical nonequilibrium. The algorithm employs a two-temperature physical model. Equations governing the finite-rate chemical reactions are fully-coupled to the gas dynamic equations using a novel coupling technique. The new coupling method maintains stability in the explicit, finite-rate formulation while allowing relatively large global time steps. The code uses flux-vector accuracy. Comparisons with experimental data and other numerical computations verify the accuracy of the present method. The code is used to compute the three-dimensional flowfield over the Aeroassist Flight Experiment (AFE) vehicle at one of its trajectory points.
Software Engineering for Scientific Computer Simulations

NASA Astrophysics Data System (ADS)

Post, Douglass E.; Henderson, Dale B.; Kendall, Richard P.; Whitney, Earl M.

2004-11-01

Computer simulation is becoming a very powerful tool for analyzing and predicting the performance of fusion experiments. Simulation efforts are evolving from including only a few effects to many effects, from small teams with a few people to large teams, and from workstations and small processor count parallel computers to massively parallel platforms. Successfully making this transition requires attention to software engineering issues. We report on the conclusions drawn from a number of case studies of large scale scientific computing projects within DOE, academia and the DoD. The major lessons learned include attention to sound project management including setting reasonable and achievable requirements, building a good code team, enforcing customer focus, carrying out verification and validation and selecting the optimum computational mathematics approaches.
Lightweight computational steering of very large scale molecular dynamics simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beazley, D.M.; Lomdahl, P.S.

1996-09-01

We present a computational steering approach for controlling, analyzing, and visualizing very large scale molecular dynamics simulations involving tens to hundreds of millions of atoms. Our approach relies on extensible scripting languages and an easy to use tool for building extensions and modules. The system is extremely easy to modify, works with existing C code, is memory efficient, and can be used from inexpensive workstations and networks. We demonstrate how we have used this system to manipulate data from production MD simulations involving as many as 104 million atoms running on the CM-5 and Cray T3D. We also show howmore » this approach can be used to build systems that integrate common scripting languages (including Tcl/Tk, Perl, and Python), simulation code, user extensions, and commercial data analysis packages.« less
Enhancing Scalability and Efficiency of the TOUGH2_MP for LinuxClusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Keni; Wu, Yu-Shu

2006-04-17

TOUGH2{_}MP, the parallel version TOUGH2 code, has been enhanced by implementing more efficient communication schemes. This enhancement is achieved through reducing the amount of small-size messages and the volume of large messages. The message exchange speed is further improved by using non-blocking communications for both linear and nonlinear iterations. In addition, we have modified the AZTEC parallel linear-equation solver to nonblocking communication. Through the improvement of code structuring and bug fixing, the new version code is now more stable, while demonstrating similar or even better nonlinear iteration converging speed than the original TOUGH2 code. As a result, the new versionmore » of TOUGH2{_}MP is improved significantly in its efficiency. In this paper, the scalability and efficiency of the parallel code are demonstrated by solving two large-scale problems. The testing results indicate that speedup of the code may depend on both problem size and complexity. In general, the code has excellent scalability in memory requirement as well as computing time.« less
Reactor Dosimetry Applications Using RAPTOR-M3G:. a New Parallel 3-D Radiation Transport Code

NASA Astrophysics Data System (ADS)

Longoni, Gianluca; Anderson, Stanwood L.

2009-08-01

The numerical solution of the Linearized Boltzmann Equation (LBE) via the Discrete Ordinates method (SN) requires extensive computational resources for large 3-D neutron and gamma transport applications due to the concurrent discretization of the angular, spatial, and energy domains. This paper will discuss the development RAPTOR-M3G (RApid Parallel Transport Of Radiation - Multiple 3D Geometries), a new 3-D parallel radiation transport code, and its application to the calculation of ex-vessel neutron dosimetry responses in the cavity of a commercial 2-loop Pressurized Water Reactor (PWR). RAPTOR-M3G is based domain decomposition algorithms, where the spatial and angular domains are allocated and processed on multi-processor computer architectures. As compared to traditional single-processor applications, this approach reduces the computational load as well as the memory requirement per processor, yielding an efficient solution methodology for large 3-D problems. Measured neutron dosimetry responses in the reactor cavity air gap will be compared to the RAPTOR-M3G predictions. This paper is organized as follows: Section 1 discusses the RAPTOR-M3G methodology; Section 2 describes the 2-loop PWR model and the numerical results obtained. Section 3 addresses the parallel performance of the code, and Section 4 concludes this paper with final remarks and future work.
Xyce parallel electronic simulator users guide, version 6.1

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas; Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers; A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models; Device models that are specifically tailored to meet Sandia's needs, including some radiationaware devices (for Sandia users only); and Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase-a message passing parallel implementation-which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less
Xyce parallel electronic simulator users' guide, Version 6.0.1.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandias needs, including some radiationaware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase a message passing parallel implementation which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less
Xyce parallel electronic simulator users guide, version 6.0.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R; Mei, Ting; Russo, Thomas V.

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandias needs, including some radiationaware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase a message passing parallel implementation which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less
RIO: a new computational framework for accurate initial data of binary black holes

NASA Astrophysics Data System (ADS)

Barreto, W.; Clemente, P. C. M.; de Oliveira, H. P.; Rodriguez-Mueller, B.

2018-06-01

We present a computational framework ( Rio) in the ADM 3+1 approach for numerical relativity. This work enables us to carry out high resolution calculations for initial data of two arbitrary black holes. We use the transverse conformal treatment, the Bowen-York and the puncture methods. For the numerical solution of the Hamiltonian constraint we use the domain decomposition and the spectral decomposition of Galerkin-Collocation. The nonlinear numerical code solves the set of equations for the spectral modes using the standard Newton-Raphson method, LU decomposition and Gaussian quadratures. We show the convergence of the Rio code. This code allows for easy deployment of large calculations. We show how the spin of one of the black holes is manifest in the conformal factor.
Simulation of Hypervelocity Impact on Aluminum-Nextel-Kevlar Orbital Debris Shields

NASA Technical Reports Server (NTRS)

Fahrenthold, Eric P.

2000-01-01

An improved hybrid particle-finite element method has been developed for hypervelocity impact simulation. The method combines the general contact-impact capabilities of particle codes with the true Lagrangian kinematics of large strain finite element formulations. Unlike some alternative schemes which couple Lagrangian finite element models with smooth particle hydrodynamics, the present formulation makes no use of slidelines or penalty forces. The method has been implemented in a parallel, three dimensional computer code. Simulations of three dimensional orbital debris impact problems using this parallel hybrid particle-finite element code, show good agreement with experiment and good speedup in parallel computation. The simulations included single and multi-plate shields as well as aluminum and composite shielding materials. at an impact velocity of eleven kilometers per second.
A parallel-vector algorithm for rapid structural analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1990-01-01

A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the 'loop unrolling' technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large-scale structural analyses performed on supercomputers, demonstrate the accuracy and speed of the method.
A parallel-vector algorithm for rapid structural analysis on high-performance computers

NASA Technical Reports Server (NTRS)

Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.

1990-01-01

A fast, accurate Choleski method for the solution of symmetric systems of linear equations is presented. This direct method is based on a variable-band storage scheme and takes advantage of column heights to reduce the number of operations in the Choleski factorization. The method employs parallel computation in the outermost DO-loop and vector computation via the loop unrolling technique in the innermost DO-loop. The method avoids computations with zeros outside the column heights, and as an option, zeros inside the band. The close relationship between Choleski and Gauss elimination methods is examined. The minor changes required to convert the Choleski code to a Gauss code to solve non-positive-definite symmetric systems of equations are identified. The results for two large scale structural analyses performed on supercomputers, demonstrate the accuracy and speed of the method.
Automatic Generation of OpenMP Directives and Its Application to Computational Fluid Dynamics Codes

NASA Technical Reports Server (NTRS)

Yan, Jerry; Jin, Haoqiang; Frumkin, Michael; Yan, Jerry (Technical Monitor)

2000-01-01

The shared-memory programming model is a very effective way to achieve parallelism on shared memory parallel computers. As great progress was made in hardware and software technologies, performance of parallel programs with compiler directives has demonstrated large improvement. The introduction of OpenMP directives, the industrial standard for shared-memory programming, has minimized the issue of portability. In this study, we have extended CAPTools, a computer-aided parallelization toolkit, to automatically generate OpenMP-based parallel programs with nominal user assistance. We outline techniques used in the implementation of the tool and discuss the application of this tool on the NAS Parallel Benchmarks and several computational fluid dynamics codes. This work demonstrates the great potential of using the tool to quickly port parallel programs and also achieve good performance that exceeds some of the commercial tools.
National Fusion Collaboratory: Grid Computing for Simulations and Experiments

NASA Astrophysics Data System (ADS)

Greenwald, Martin

2004-05-01

The National Fusion Collaboratory Project is creating a computational grid designed to advance scientific understanding and innovation in magnetic fusion research by facilitating collaborations, enabling more effective integration of experiments, theory and modeling and allowing more efficient use of experimental facilities. The philosophy of FusionGrid is that data, codes, analysis routines, visualization tools, and communication tools should be thought of as network available services, easily used by the fusion scientist. In such an environment, access to services is stressed rather than portability. By building on a foundation of established computer science toolkits, deployment time can be minimized. These services all share the same basic infrastructure that allows for secure authentication and resource authorization which allows stakeholders to control their own resources such as computers, data and experiments. Code developers can control intellectual property, and fair use of shared resources can be demonstrated and controlled. A key goal is to shield scientific users from the implementation details such that transparency and ease-of-use are maximized. The first FusionGrid service deployed was the TRANSP code, a widely used tool for transport analysis. Tools for run preparation, submission, monitoring and management have been developed and shared among a wide user base. This approach saves user sites from the laborious effort of maintaining such a large and complex code while at the same time reducing the burden on the development team by avoiding the need to support a large number of heterogeneous installations. Shared visualization and A/V tools are being developed and deployed to enhance long-distance collaborations. These include desktop versions of the Access Grid, a highly capable multi-point remote conferencing tool and capabilities for sharing displays and analysis tools over local and wide-area networks.
MHD code using multi graphical processing units: SMAUG+

NASA Astrophysics Data System (ADS)

Gyenge, N.; Griffiths, M. K.; Erdélyi, R.

2018-01-01

This paper introduces the Sheffield Magnetohydrodynamics Algorithm Using GPUs (SMAUG+), an advanced numerical code for solving magnetohydrodynamic (MHD) problems, using multi-GPU systems. Multi-GPU systems facilitate the development of accelerated codes and enable us to investigate larger model sizes and/or more detailed computational domain resolutions. This is a significant advancement over the parent single-GPU MHD code, SMAUG (Griffiths et al., 2015). Here, we demonstrate the validity of the SMAUG + code, describe the parallelisation techniques and investigate performance benchmarks. The initial configuration of the Orszag-Tang vortex simulations are distributed among 4, 16, 64 and 100 GPUs. Furthermore, different simulation box resolutions are applied: 1000 × 1000, 2044 × 2044, 4000 × 4000 and 8000 × 8000 . We also tested the code with the Brio-Wu shock tube simulations with model size of 800 employing up to 10 GPUs. Based on the test results, we observed speed ups and slow downs, depending on the granularity and the communication overhead of certain parallel tasks. The main aim of the code development is to provide massively parallel code without the memory limitation of a single GPU. By using our code, the applied model size could be significantly increased. We demonstrate that we are able to successfully compute numerically valid and large 2D MHD problems.
Convergence acceleration of the Proteus computer code with multigrid methods

NASA Technical Reports Server (NTRS)

Demuren, A. O.; Ibraheem, S. O.

1995-01-01

This report presents the results of a study to implement convergence acceleration techniques based on the multigrid concept in the two-dimensional and three-dimensional versions of the Proteus computer code. The first section presents a review of the relevant literature on the implementation of the multigrid methods in computer codes for compressible flow analysis. The next two sections present detailed stability analysis of numerical schemes for solving the Euler and Navier-Stokes equations, based on conventional von Neumann analysis and the bi-grid analysis, respectively. The next section presents details of the computational method used in the Proteus computer code. Finally, the multigrid implementation and applications to several two-dimensional and three-dimensional test problems are presented. The results of the present study show that the multigrid method always leads to a reduction in the number of iterations (or time steps) required for convergence. However, there is an overhead associated with the use of multigrid acceleration. The overhead is higher in 2-D problems than in 3-D problems, thus overall multigrid savings in CPU time are in general better in the latter. Savings of about 40-50 percent are typical in 3-D problems, but they are about 20-30 percent in large 2-D problems. The present multigrid method is applicable to steady-state problems and is therefore ineffective in problems with inherently unstable solutions.
Performance optimization of Qbox and WEST on Intel Knights Landing

NASA Astrophysics Data System (ADS)

Zheng, Huihuo; Knight, Christopher; Galli, Giulia; Govoni, Marco; Gygi, Francois

We present the optimization of electronic structure codes Qbox and WEST targeting the Intel®Xeon Phi™processor, codenamed Knights Landing (KNL). Qbox is an ab-initio molecular dynamics code based on plane wave density functional theory (DFT) and WEST is a post-DFT code for excited state calculations within many-body perturbation theory. Both Qbox and WEST employ highly scalable algorithms which enable accurate large-scale electronic structure calculations on leadership class supercomputer platforms beyond 100,000 cores, such as Mira and Theta at the Argonne Leadership Computing Facility. In this work, features of the KNL architecture (e.g. hierarchical memory) are explored to achieve higher performance in key algorithms of the Qbox and WEST codes and to develop a road-map for further development targeting next-generation computing architectures. In particular, the optimizations of the Qbox and WEST codes on the KNL platform will target efficient large-scale electronic structure calculations of nanostructured materials exhibiting complex structures and prediction of their electronic and thermal properties for use in solar and thermal energy conversion device. This work was supported by MICCoM, as part of Comp. Mats. Sci. Program funded by the U.S. DOE, Office of Sci., BES, MSE Division. This research used resources of the ALCF, which is a DOE Office of Sci. User Facility under Contract DE-AC02-06CH11357.
What makes computational open source software libraries successful?

NASA Astrophysics Data System (ADS)

Bangerth, Wolfgang; Heister, Timo

2013-01-01

Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects.
Three-dimensional inviscid analysis of radial-turbine flow and a limited comparison with experimental data

NASA Technical Reports Server (NTRS)

Choo, Y. K.; Civinskas, K. C.

1985-01-01

The three-dimensional inviscid DENTON code is used to analyze flow through a radial-inflow turbine rotor. Experimental data from the rotor are compared with analytical results obtained by using the code. The experimental data available for comparison are the radial distributions of circumferentially averaged values of absolute flow angle and total pressure downstream of the rotor exit. The computed rotor-exit flow angles are generally underturned relative to the experimental values, which reflect the boundary-layer separation at the trailing edge and the development of wakes downstream of the rotor. The experimental rotor is designed for a higher-than-optimum work factor of 1.126 resulting in a nonoptimum positive incidence and causing a region of rapid flow adjustment and large velocity gradients. For this experimental rotor, the computed radial distribution of rotor-exit to turbine-inlet total pressure ratios are underpredicted due to the errors in the finite-difference approximations in the regions of rapid flow adjustment, and due to using the relatively coarser grids in the middle of the blade region where the flow passage is highly three-dimensional. Additional results obtained from the three-dimensional inviscid computation are also presented, but without comparison due to the lack of experimental data. These include quasi-secondary velocity vectors on cross-channel surfaces, velocity components on the meridional and blade-to-blade surfaces, and blade surface loading diagrams. Computed results show the evolution of a passage vortex and large streamline deviations from the computational streamwise grid lines. Experience gained from applying the code to a radial turbine geometry is also discussed.
Constructing Neuronal Network Models in Massively Parallel Environments.

PubMed

Ippen, Tammo; Eppler, Jochen M; Plesser, Hans E; Diesmann, Markus

2017-01-01

Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers.

Constructing Neuronal Network Models in Massively Parallel Environments

PubMed Central

Ippen, Tammo; Eppler, Jochen M.; Plesser, Hans E.; Diesmann, Markus

2017-01-01

Recent advances in the development of data structures to represent spiking neuron network models enable us to exploit the complete memory of petascale computers for a single brain-scale network simulation. In this work, we investigate how well we can exploit the computing power of such supercomputers for the creation of neuronal networks. Using an established benchmark, we divide the runtime of simulation code into the phase of network construction and the phase during which the dynamical state is advanced in time. We find that on multi-core compute nodes network creation scales well with process-parallel code but exhibits a prohibitively large memory consumption. Thread-parallel network creation, in contrast, exhibits speedup only up to a small number of threads but has little overhead in terms of memory. We further observe that the algorithms creating instances of model neurons and their connections scale well for networks of ten thousand neurons, but do not show the same speedup for networks of millions of neurons. Our work uncovers that the lack of scaling of thread-parallel network creation is due to inadequate memory allocation strategies and demonstrates that thread-optimized memory allocators recover excellent scaling. An analysis of the loop order used for network construction reveals that more complex tests on the locality of operations significantly improve scaling and reduce runtime by allowing construction algorithms to step through large networks more efficiently than in existing code. The combination of these techniques increases performance by an order of magnitude and harnesses the increasingly parallel compute power of the compute nodes in high-performance clusters and supercomputers. PMID:28559808
Three-dimensional inviscid analysis of radial turbine flow and a limited comparison with experimental data

NASA Technical Reports Server (NTRS)

Choo, Y. K.; Civinskas, K. C.

1985-01-01

The three-dimensional inviscid DENTON code is used to analyze flow through a radial-inflow turbine rotor. Experimental data from the rotor are compared with analytical results obtained by using the code. The experimental data available for comparison are the radial distributions of circumferentially averaged values of absolute flow angle and total pressure downstream of the rotor exit. The computed rotor-exit flow angles are generally underturned relative to the experimental values, which reflect the boundary-layer separation at the trailing edge and the development of wakes downstream of the rotor. The experimental rotor is designed for a higher-than-optimum work factor of 1.126 resulting in a nonoptimum positive incidence and causing a region of rapid flow adjustment and large velocity gradients. For this experimental rotor, the computed radial distribution of rotor-exit to turbine-inlet total pressure ratios are underpredicted due to the errors in the finite-difference approximations in the regions of rapid flow adjustment, and due to using the relatively coarser grids in the middle of the blade region where the flow passage is highly three-dimensional. Additional results obtained from the three-dimensional inviscid computation are also presented, but without comparison due to the lack of experimental data. These include quasi-secondary velocity vectors on cross-channel surfaces, velocity components on the meridional and blade-to-blade surfaces, and blade surface loading diagrams. Computed results show the evolution of a passage vortex and large streamline deviations from the computational streamwise grid lines. Experience gained from applying the code to a radial turbine geometry is also discussed.
Speeding Up Ecological and Evolutionary Computations in R; Essentials of High Performance Computing for Biologists

PubMed Central

Visser, Marco D.; McMahon, Sean M.; Merow, Cory; Dixon, Philip M.; Record, Sydne; Jongejans, Eelke

2015-01-01

Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof) that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1–S3 Texts) that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster). By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research. PMID:25811842
GW/Bethe-Salpeter calculations for charged and model systems from real-space DFT

NASA Astrophysics Data System (ADS)

Strubbe, David A.

GW and Bethe-Salpeter (GW/BSE) calculations use mean-field input from density-functional theory (DFT) calculations to compute excited states of a condensed-matter system. Many parts of a GW/BSE calculation are efficiently performed in a plane-wave basis, and extensive effort has gone into optimizing and parallelizing plane-wave GW/BSE codes for large-scale computations. Most straightforwardly, plane-wave DFT can be used as a starting point, but real-space DFT is also an attractive starting point: it is systematically convergeable like plane waves, can take advantage of efficient domain parallelization for large systems, and is well suited physically for finite and especially charged systems. The flexibility of a real-space grid also allows convenient calculations on non-atomic model systems. I will discuss the interfacing of a real-space (TD)DFT code (Octopus, www.tddft.org/programs/octopus) with a plane-wave GW/BSE code (BerkeleyGW, www.berkeleygw.org), consider performance issues and accuracy, and present some applications to simple and paradigmatic systems that illuminate fundamental properties of these approximations in many-body perturbation theory.
GeNN: a code generation framework for accelerated brain simulations

NASA Astrophysics Data System (ADS)

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-01

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/.
GeNN: a code generation framework for accelerated brain simulations.

PubMed

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-07

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/.
GeNN: a code generation framework for accelerated brain simulations

PubMed Central

Yavuz, Esin; Turner, James; Nowotny, Thomas

2016-01-01

Large-scale numerical simulations of detailed brain circuit models are important for identifying hypotheses on brain functions and testing their consistency and plausibility. An ongoing challenge for simulating realistic models is, however, computational speed. In this paper, we present the GeNN (GPU-enhanced Neuronal Networks) framework, which aims to facilitate the use of graphics accelerators for computational models of large-scale neuronal networks to address this challenge. GeNN is an open source library that generates code to accelerate the execution of network simulations on NVIDIA GPUs, through a flexible and extensible interface, which does not require in-depth technical knowledge from the users. We present performance benchmarks showing that 200-fold speedup compared to a single core of a CPU can be achieved for a network of one million conductance based Hodgkin-Huxley neurons but that for other models the speedup can differ. GeNN is available for Linux, Mac OS X and Windows platforms. The source code, user manual, tutorials, Wiki, in-depth example projects and all other related information can be found on the project website http://genn-team.github.io/genn/. PMID:26740369
Planet formation: is it good or bad to have a stellar companion?

NASA Astrophysics Data System (ADS)

Marzari, F.; Thebault, P.; Scholl, H.

2010-04-01

Planet formation in binary star systems is a complex issue due to the gravitational perturbations of the companion star. One of the crucial steps of the core-accretion model is planetesimal accretion into large protoplanets which finally coalesce into planets. In a planetesimal swarm surrounding the primary star, the average mutual impact velocity determines if larger bodies form or if the population is grinded down to dust, halting the planet formation process. This velocity is strongly influenced by the companion gravitational pull and by gas drag. The combined effect of these two forces may act in favour of or against planet formation, setting a lower or equal probability of the existence of extrasolar planets around single or binary stars. Planetesimal accretion in binaries has been studied so far with two different approaches. N-body codes based on the assumption that the disk is axisymmetric are very cost-effective since they allow the study of the mutual relative velocity with limited CPU usage. A large amount of planetesimal trajectories can be computed making it possible to outline the regions around the star where planet formation is possible. The main limitation of the N-body codes is the axisymmetric assumption. The companion perturbations affect not only the planetesimal orbits, but also the gaseous disk, by forcing spiral density waves. In addition, the overall shape of the disk changes from circular to elliptic. Hybrid codes have been recently developed which solve the equations for the disk with a hydrodynamical grid code and use the computed gas density and velocity vector to calculate an accurate value of the gas drag force on the planetesimals. These codes are more complex and may compute the trajectories of only a limited number of planetesimals.
Aerodynamic Interference Due to MSL Reaction Control System

NASA Technical Reports Server (NTRS)

Dyakonov, Artem A.; Schoenenberger, Mark; Scallion, William I.; VanNorman, John W.; Novak, Luke A.; Tang, Chun Y.

2009-01-01

An investigation of effectiveness of the reaction control system (RCS) of Mars Science Laboratory (MSL) entry capsule during atmospheric flight has been conducted. The reason for the investigation is that MSL is designed to fly a lifting actively guided entry with hypersonic bank maneuvers, therefore an understanding of RCS effectiveness is required. In the course of the study several jet configurations were evaluated using Langley Aerothermal Upwind Relaxation Algorithm (LAURA) code, Data Parallel Line Relaxation (DPLR) code, Fully Unstructured 3D (FUN3D) code and an Overset Grid Flowsolver (OVERFLOW) code. Computations indicated that some of the proposed configurations might induce aero-RCS interactions, sufficient to impede and even overwhelm the intended control torques. It was found that the maximum potential for aero-RCS interference exists around peak dynamic pressure along the trajectory. Present analysis largely relies on computational methods. Ground testing, flight data and computational analyses are required to fully understand the problem. At the time of this writing some experimental work spanning range of Mach number 2.5 through 4.5 has been completed and used to establish preliminary levels of confidence for computations. As a result of the present work a final RCS configuration has been designed such as to minimize aero-interference effects and it is a design baseline for MSL entry capsule.
Calculation of spherical harmonics and Wigner d functions by FFT. Applications to fast rotational matching in molecular replacement and implementation into AMoRe.

PubMed

Trapani, Stefano; Navaza, Jorge

2006-07-01

The FFT calculation of spherical harmonics, Wigner D matrices and rotation function has been extended to all angular variables in the AMoRe molecular replacement software. The resulting code avoids singularity issues arising from recursive formulas, performs faster and produces results with at least the same accuracy as the original code. The new code aims at permitting accurate and more rapid computations at high angular resolution of the rotation function of large particles. Test calculations on the icosahedral IBDV VP2 subviral particle showed that the new code performs on the average 1.5 times faster than the original code.
Data Parallel Line Relaxation (DPLR) Code User Manual: Acadia - Version 4.01.1

NASA Technical Reports Server (NTRS)

Wright, Michael J.; White, Todd; Mangini, Nancy

2009-01-01

Data-Parallel Line Relaxation (DPLR) code is a computational fluid dynamic (CFD) solver that was developed at NASA Ames Research Center to help mission support teams generate high-value predictive solutions for hypersonic flow field problems. The DPLR Code Package is an MPI-based, parallel, full three-dimensional Navier-Stokes CFD solver with generalized models for finite-rate reaction kinetics, thermal and chemical non-equilibrium, accurate high-temperature transport coefficients, and ionized flow physics incorporated into the code. DPLR also includes a large selection of generalized realistic surface boundary conditions and links to enable loose coupling with external thermal protection system (TPS) material response and shock layer radiation codes.
Low Density Parity Check Codes Based on Finite Geometries: A Rediscovery and More

NASA Technical Reports Server (NTRS)

Kou, Yu; Lin, Shu; Fossorier, Marc

1999-01-01

Low density parity check (LDPC) codes with iterative decoding based on belief propagation achieve astonishing error performance close to Shannon limit. No algebraic or geometric method for constructing these codes has been reported and they are largely generated by computer search. As a result, encoding of long LDPC codes is in general very complex. This paper presents two classes of high rate LDPC codes whose constructions are based on finite Euclidean and projective geometries, respectively. These classes of codes a.re cyclic and have good constraint parameters and minimum distances. Cyclic structure adows the use of linear feedback shift registers for encoding. These finite geometry LDPC codes achieve very good error performance with either soft-decision iterative decoding based on belief propagation or Gallager's hard-decision bit flipping algorithm. These codes can be punctured or extended to obtain other good LDPC codes. A generalization of these codes is also presented.
The EDIT-COMGEOM Code

DTIC Science & Technology

1975-09-01

This report assumes a familiarity with the GIFT and MAGIC computer codes. The EDIT-COMGEOM code is a FORTRAN computer code. The EDIT-COMGEOM code...converts the target description data which was used in the MAGIC computer code to the target description data which can be used in the GIFT computer code
Silicon CMOS architecture for a spin-based quantum computer.

PubMed

Veldhorst, M; Eenink, H G J; Yang, C H; Dzurak, A S

2017-12-15

Recent advances in quantum error correction codes for fault-tolerant quantum computing and physical realizations of high-fidelity qubits in multiple platforms give promise for the construction of a quantum computer based on millions of interacting qubits. However, the classical-quantum interface remains a nascent field of exploration. Here, we propose an architecture for a silicon-based quantum computer processor based on complementary metal-oxide-semiconductor (CMOS) technology. We show how a transistor-based control circuit together with charge-storage electrodes can be used to operate a dense and scalable two-dimensional qubit system. The qubits are defined by the spin state of a single electron confined in quantum dots, coupled via exchange interactions, controlled using a microwave cavity, and measured via gate-based dispersive readout. We implement a spin qubit surface code, showing the prospects for universal quantum computation. We discuss the challenges and focus areas that need to be addressed, providing a path for large-scale quantum computing.
Hybrid reduced order modeling for assembly calculations

DOE PAGES

Bang, Youngsuk; Abdel-Khalik, Hany S.; Jessee, Matthew A.; ...

2015-08-14

While the accuracy of assembly calculations has greatly improved due to the increase in computer power enabling more refined description of the phase space and use of more sophisticated numerical algorithms, the computational cost continues to increase which limits the full utilization of their effectiveness for routine engineering analysis. Reduced order modeling is a mathematical vehicle that scales down the dimensionality of large-scale numerical problems to enable their repeated executions on small computing environment, often available to end users. This is done by capturing the most dominant underlying relationships between the model's inputs and outputs. Previous works demonstrated the usemore » of the reduced order modeling for a single physics code, such as a radiation transport calculation. This paper extends those works to coupled code systems as currently employed in assembly calculations. Finally, numerical tests are conducted using realistic SCALE assembly models with resonance self-shielding, neutron transport, and nuclides transmutation/depletion models representing the components of the coupled code system.« less
Shell stability analysis in a computer aided engineering (CAE) environment

NASA Technical Reports Server (NTRS)

Arbocz, J.; Hol, J. M. A. M.

1993-01-01

The development of 'DISDECO', the Delft Interactive Shell DEsign COde is described. The purpose of this project is to make the accumulated theoretical, numerical and practical knowledge of the last 25 years or so readily accessible to users interested in the analysis of buckling sensitive structures. With this open ended, hierarchical, interactive computer code the user can access from his workstation successively programs of increasing complexity. The computational modules currently operational in DISDECO provide the prospective user with facilities to calculate the critical buckling loads of stiffened anisotropic shells under combined loading, to investigate the effects the various types of boundary conditions will have on the critical load, and to get a complete picture of the degrading effects the different shapes of possible initial imperfections might cause, all in one interactive session. Once a design is finalized, its collapse load can be verified by running a large refined model remotely from behind the workstation with one of the current generation 2-dimensional codes, with advanced capabilities to handle both geometric and material nonlinearities.
Inference in the brain: Statistics flowing in redundant population codes

PubMed Central

Pitkow, Xaq; Angelaki, Dora E

2017-01-01

It is widely believed that the brain performs approximate probabilistic inference to estimate causal variables in the world from ambiguous sensory data. To understand these computations, we need to analyze how information is represented and transformed by the actions of nonlinear recurrent neural networks. We propose that these probabilistic computations function by a message-passing algorithm operating at the level of redundant neural populations. To explain this framework, we review its underlying concepts, including graphical models, sufficient statistics, and message-passing, and then describe how these concepts could be implemented by recurrently connected probabilistic population codes. The relevant information flow in these networks will be most interpretable at the population level, particularly for redundant neural codes. We therefore outline a general approach to identify the essential features of a neural message-passing algorithm. Finally, we argue that to reveal the most important aspects of these neural computations, we must study large-scale activity patterns during moderately complex, naturalistic behaviors. PMID:28595050
Improvements to Busquet's Non LTE algorithm in NRL's Hydro code

NASA Astrophysics Data System (ADS)

Klapisch, M.; Colombant, D.

1996-11-01

Implementation of the Non LTE model RADIOM (M. Busquet, Phys. Fluids B, 5, 4191 (1993)) in NRL's RAD2D Hydro code in conservative form was reported previously(M. Klapisch et al., Bull. Am. Phys. Soc., 40, 1806 (1995)).While the results were satisfactory, the algorithm was slow and not always converging. We describe here modifications that address the latter two shortcomings. This method is quicker and more stable than the original. It also gives information about the validity of the fitting. It turns out that the number and distribution of groups in the multigroup diffusion opacity tables - a basis for the computation of radiation effects in the ionization balance in RADIOM- has a large influence on the robustness of the algorithm. These modifications give insight about the algorithm, and allow to check that the obtained average charge state is the true average. In addition, code optimization resulted in greatly reduced computing time: The ratio of Non LTE to LTE computing times being now between 1.5 and 2.
User's and test case manual for FEMATS

NASA Technical Reports Server (NTRS)

Chatterjee, Arindam; Volakis, John; Nurnberger, Mike; Natzke, John

1995-01-01

The FEMATS program incorporates first-order edge-based finite elements and vector absorbing boundary conditions into the scattered field formulation for computation of the scattering from three-dimensional geometries. The code has been validated extensively for a large class of geometries containing inhomogeneities and satisfying transition conditions. For geometries that are too large for the workstation environment, the FEMATS code has been optimized to run on various supercomputers. Currently, FEMATS has been configured to run on the HP 9000 workstation, vectorized for the Cray Y-MP, and parallelized to run on the Kendall Square Research (KSR) architecture and the Intel Paragon.
Building A Community Focused Data and Modeling Collaborative platform with Hardware Virtualization Technology

NASA Astrophysics Data System (ADS)

Michaelis, A.; Wang, W.; Melton, F. S.; Votava, P.; Milesi, C.; Hashimoto, H.; Nemani, R. R.; Hiatt, S. H.

2009-12-01

As the length and diversity of the global earth observation data records grow, modeling and analyses of biospheric conditions increasingly requires multiple terabytes of data from a diversity of models and sensors. With network bandwidth beginning to flatten, transmission of these data from centralized data archives presents an increasing challenge, and costs associated with local storage and management of data and compute resources are often significant for individual research and application development efforts. Sharing community valued intermediary data sets, results and codes from individual efforts with others that are not in direct funded collaboration can also be a challenge with respect to time, cost and expertise. We purpose a modeling, data and knowledge center that houses NASA satellite data, climate data and ancillary data where a focused community may come together to share modeling and analysis codes, scientific results, knowledge and expertise on a centralized platform, named Ecosystem Modeling Center (EMC). With the recent development of new technologies for secure hardware virtualization, an opportunity exists to create specific modeling, analysis and compute environments that are customizable, “archiveable” and transferable. Allowing users to instantiate such environments on large compute infrastructures that are directly connected to large data archives may significantly reduce costs and time associated with scientific efforts by alleviating users from redundantly retrieving and integrating data sets and building modeling analysis codes. The EMC platform also provides the possibility for users receiving indirect assistance from expertise through prefabricated compute environments, potentially reducing study “ramp up” times.

GALARIO: a GPU accelerated library for analysing radio interferometer observations

NASA Astrophysics Data System (ADS)

Tazzari, Marco; Beaujean, Frederik; Testi, Leonardo

2018-06-01

We present GALARIO, a computational library that exploits the power of modern graphical processing units (GPUs) to accelerate the analysis of observations from radio interferometers like Atacama Large Millimeter and sub-millimeter Array or the Karl G. Jansky Very Large Array. GALARIO speeds up the computation of synthetic visibilities from a generic 2D model image or a radial brightness profile (for axisymmetric sources). On a GPU, GALARIO is 150 faster than standard PYTHON and 10 times faster than serial C++ code on a CPU. Highly modular, easy to use, and to adopt in existing code, GALARIO comes as two compiled libraries, one for Nvidia GPUs and one for multicore CPUs, where both have the same functions with identical interfaces. GALARIO comes with PYTHON bindings but can also be directly used in C or C++. The versatility and the speed of GALARIO open new analysis pathways that otherwise would be prohibitively time consuming, e.g. fitting high-resolution observations of large number of objects, or entire spectral cubes of molecular gas emission. It is a general tool that can be applied to any field that uses radio interferometer observations. The source code is available online at http://github.com/mtazzari/galario under the open source GNU Lesser General Public License v3.
Accurate Cold-Test Model of Helical TWT Slow-Wave Circuits

NASA Technical Reports Server (NTRS)

Kory, Carol L.; Dayton, James A., Jr.

1997-01-01

Recently, a method has been established to accurately calculate cold-test data for helical slow-wave structures using the three-dimensional electromagnetic computer code, MAFIA. Cold-test parameters have been calculated for several helical traveling-wave tube (TWT) slow-wave circuits possessing various support rod configurations, and results are presented here showing excellent agreement with experiment. The helical models include tape thickness, dielectric support shapes and material properties consistent with the actual circuits. The cold-test data from this helical model can be used as input into large-signal helical TWT interaction codes making it possible, for the first time, to design a complete TWT via computer simulation.
Wakefield Computations for the CLIC PETS using the Parallel Finite Element Time-Domain Code T3P

DOE Office of Scientific and Technical Information (OSTI.GOV)

Candel, A; Kabel, A.; Lee, L.

In recent years, SLAC's Advanced Computations Department (ACD) has developed the high-performance parallel 3D electromagnetic time-domain code, T3P, for simulations of wakefields and transients in complex accelerator structures. T3P is based on advanced higher-order Finite Element methods on unstructured grids with quadratic surface approximation. Optimized for large-scale parallel processing on leadership supercomputing facilities, T3P allows simulations of realistic 3D structures with unprecedented accuracy, aiding the design of the next generation of accelerator facilities. Applications to the Compact Linear Collider (CLIC) Power Extraction and Transfer Structure (PETS) are presented.
Scaling up Planetary Dynamo Modeling to Massively Parallel Computing Systems: The Rayleigh Code at ALCF

NASA Astrophysics Data System (ADS)

Featherstone, N. A.; Aurnou, J. M.; Yadav, R. K.; Heimpel, M. H.; Soderlund, K. M.; Matsui, H.; Stanley, S.; Brown, B. P.; Glatzmaier, G.; Olson, P.; Buffett, B. A.; Hwang, L.; Kellogg, L. H.

2017-12-01

In the past three years, CIG's Dynamo Working Group has successfully ported the Rayleigh Code to the Argonne Leadership Computer Facility's Mira BG/Q device. In this poster, we present some our first results, showing simulations of 1) convection in the solar convection zone; 2) dynamo action in Earth's core and 3) convection in the jovian deep atmosphere. These simulations have made efficient use of 131 thousand cores, 131 thousand cores and 232 thousand cores, respectively, on Mira. In addition to our novel results, the joys and logistical challenges of carrying out such large runs will also be discussed.
Simulation of Combustion Systems with Realistic g-jitter

NASA Technical Reports Server (NTRS)

Mell, William E.; McGrattan, Kevin B.; Baum, Howard R.

2003-01-01

In this project a transient, fully three-dimensional computer simulation code was developed to simulate the effects of realistic g-jitter on a number of combustion systems. The simulation code is capable of simulating flame spread on a solid and nonpremixed or premixed gaseous combustion in nonturbulent flow with simple combustion models. Simple combustion models were used to preserve computational efficiency since this is meant to be an engineering code. Also, the use of sophisticated turbulence models was not pursued (a simple Smagorinsky type model can be implemented if deemed appropriate) because if flow velocities are large enough for turbulence to develop in a reduced gravity combustion scenario it is unlikely that g-jitter disturbances (in NASA's reduced gravity facilities) will play an important role in the flame dynamics. Acceleration disturbances of realistic orientation, magnitude, and time dependence can be easily included in the simulation. The simulation algorithm was based on techniques used in an existing large eddy simulation code which has successfully simulated fire dynamics in complex domains. A series of simulations with measured and predicted acceleration disturbances on the International Space Station (ISS) are presented. The results of this series of simulations suggested a passive isolation system and appropriate scheduling of crew activity would provide a sufficiently "quiet" acceleration environment for spherical diffusion flames.
Finite-difference computations of rotor loads

NASA Technical Reports Server (NTRS)

Caradonna, F. X.; Tung, C.

1985-01-01

This paper demonstrates the current and future potential of finite-difference methods for solving real rotor problems which now rely largely on empiricism. The demonstration consists of a simple means of combining existing finite-difference, integral, and comprehensive loads codes to predict real transonic rotor flows. These computations are performed for hover and high-advance-ratio flight. Comparisons are made with experimental pressure data.
Finite-difference computations of rotor loads

NASA Technical Reports Server (NTRS)

Caradonna, F. X.; Tung, C.

1985-01-01

The current and future potential of finite difference methods for solving real rotor problems which now rely largely on empiricism are demonstrated. The demonstration consists of a simple means of combining existing finite-difference, integral, and comprehensive loads codes to predict real transonic rotor flows. These computations are performed for hover and high-advanced-ratio flight. Comparisons are made with experimental pressure data.
An efficient, explicit finite-rate algorithm to compute flows in chemical nonequilibrium

NASA Technical Reports Server (NTRS)

Palmer, Grant

1989-01-01

An explicit finite-rate code was developed to compute hypersonic viscous chemically reacting flows about three-dimensional bodies. Equations describing the finite-rate chemical reactions were fully coupled to the gas dynamic equations using a new coupling technique. The new technique maintains stability in the explicit finite-rate formulation while permitting relatively large global time steps.
Blueprint for a microwave trapped ion quantum computer.

PubMed

Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G; Mølmer, Klaus; Devitt, Simon J; Wunderlich, Christof; Hensinger, Winfried K

2017-02-01

The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion-based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation-based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error-threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects.
A complementary graphical method for reducing and analyzing large data sets. Case studies demonstrating thresholds setting and selection.

PubMed

Jing, X; Cimino, J J

2014-01-01

Graphical displays can make data more understandable; however, large graphs can challenge human comprehension. We have previously described a filtering method to provide high-level summary views of large data sets. In this paper we demonstrate our method for setting and selecting thresholds to limit graph size while retaining important information by applying it to large single and paired data sets, taken from patient and bibliographic databases. Four case studies are used to illustrate our method. The data are either patient discharge diagnoses (coded using the International Classification of Diseases, Clinical Modifications [ICD9-CM]) or Medline citations (coded using the Medical Subject Headings [MeSH]). We use combinations of different thresholds to obtain filtered graphs for detailed analysis. The thresholds setting and selection, such as thresholds for node counts, class counts, ratio values, p values (for diff data sets), and percentiles of selected class count thresholds, are demonstrated with details in case studies. The main steps include: data preparation, data manipulation, computation, and threshold selection and visualization. We also describe the data models for different types of thresholds and the considerations for thresholds selection. The filtered graphs are 1%-3% of the size of the original graphs. For our case studies, the graphs provide 1) the most heavily used ICD9-CM codes, 2) the codes with most patients in a research hospital in 2011, 3) a profile of publications on "heavily represented topics" in MEDLINE in 2011, and 4) validated knowledge about adverse effects of the medication of rosiglitazone and new interesting areas in the ICD9-CM hierarchy associated with patients taking the medication of pioglitazone. Our filtering method reduces large graphs to a manageable size by removing relatively unimportant nodes. The graphical method provides summary views based on computation of usage frequency and semantic context of hierarchical terminology. The method is applicable to large data sets (such as a hundred thousand records or more) and can be used to generate new hypotheses from data sets coded with hierarchical terminologies.
Scalable hybrid computation with spikes.

PubMed

Sarpeshkar, Rahul; O'Halloran, Micah

2002-09-01

We outline a hybrid analog-digital scheme for computing with three important features that enable it to scale to systems of large complexity: First, like digital computation, which uses several one-bit precise logical units to collectively compute a precise answer to a computation, the hybrid scheme uses several moderate-precision analog units to collectively compute a precise answer to a computation. Second, frequent discrete signal restoration of the analog information prevents analog noise and offset from degrading the computation. And, third, a state machine enables complex computations to be created using a sequence of elementary computations. A natural choice for implementing this hybrid scheme is one based on spikes because spike-count codes are digital, while spike-time codes are analog. We illustrate how spikes afford easy ways to implement all three components of scalable hybrid computation. First, as an important example of distributed analog computation, we show how spikes can create a distributed modular representation of an analog number by implementing digital carry interactions between spiking analog neurons. Second, we show how signal restoration may be performed by recursive spike-count quantization of spike-time codes. And, third, we use spikes from an analog dynamical system to trigger state transitions in a digital dynamical system, which reconfigures the analog dynamical system using a binary control vector; such feedback interactions between analog and digital dynamical systems create a hybrid state machine (HSM). The HSM extends and expands the concept of a digital finite-state-machine to the hybrid domain. We present experimental data from a two-neuron HSM on a chip that implements error-correcting analog-to-digital conversion with the concurrent use of spike-time and spike-count codes. We also present experimental data from silicon circuits that implement HSM-based pattern recognition using spike-time synchrony. We outline how HSMs may be used to perform learning, vector quantization, spike pattern recognition and generation, and how they may be reconfigured.
The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

NASA Technical Reports Server (NTRS)

Woo, Alex C.; Hill, Kueichien C.

1996-01-01

The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Program Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large scale radar signature predictions. The EMCC/ARPA project consisted of three parts.
Multi-Zone Liquid Thrust Chamber Performance Code with Domain Decomposition for Parallel Processing

NASA Technical Reports Server (NTRS)

Navaz, Homayun K.

2002-01-01

Computational Fluid Dynamics (CFD) has considerably evolved in the last decade. There are many computer programs that can perform computations on viscous internal or external flows with chemical reactions. CFD has become a commonly used tool in the design and analysis of gas turbines, ramjet combustors, turbo-machinery, inlet ducts, rocket engines, jet interaction, missile, and ramjet nozzles. One of the problems of interest to NASA has always been the performance prediction for rocket and air-breathing engines. Due to the complexity of flow in these engines it is necessary to resolve the flowfield into a fine mesh to capture quantities like turbulence and heat transfer. However, calculation on a high-resolution grid is associated with a prohibitively increasing computational time that can downgrade the value of the CFD for practical engineering calculations. The Liquid Thrust Chamber Performance (LTCP) code was developed for NASA/MSFC (Marshall Space Flight Center) to perform liquid rocket engine performance calculations. This code is a 2D/axisymmetric full Navier-Stokes (NS) solver with fully coupled finite rate chemistry and Eulerian treatment of liquid fuel and/or oxidizer droplets. One of the advantages of this code has been the resemblance of its input file to the JANNAF (Joint Army Navy NASA Air Force Interagency Propulsion Committee) standard TDK code, and its automatic grid generation for JANNAF defined combustion chamber wall geometry. These options minimize the learning effort for TDK users, and make the code a good candidate for performing engineering calculations. Although the LTCP code was developed for liquid rocket engines, it is a general-purpose code and has been used for solving many engineering problems. However, the single zone formulation of the LTCP has limited the code to be applicable to problems with complex geometry. Furthermore, the computational time becomes prohibitively large for high-resolution problems with chemistry, two-equation turbulence model, and two-phase flow. To overcome these limitations, the LTCP code is rewritten to include the multi-zone capability with domain decomposition that makes it suitable for parallel processing, i.e., enabling the code to run every zone or sub-domain on a separate processor. This can reduce the run time by a factor of 6 to 8, depending on the problem.
Mendel-GPU: haplotyping and genotype imputation on graphics processing units

PubMed Central

Chen, Gary K.; Wang, Kai; Stram, Alex H.; Sobel, Eric M.; Lange, Kenneth

2012-01-01

Motivation: In modern sequencing studies, one can improve the confidence of genotype calls by phasing haplotypes using information from an external reference panel of fully typed unrelated individuals. However, the computational demands are so high that they prohibit researchers with limited computational resources from haplotyping large-scale sequence data. Results: Our graphics processing unit based software delivers haplotyping and imputation accuracies comparable to competing programs at a fraction of the computational cost and peak memory demand. Availability: Mendel-GPU, our OpenCL software, runs on Linux platforms and is portable across AMD and nVidia GPUs. Users can download both code and documentation at http://code.google.com/p/mendel-gpu/. Contact: gary.k.chen@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22954633
Comparison of computational results of the SABRE LMFBR pin bundle blockage code with data from well-instrumented out-of-pile test bundles (THORS bundles 3A and 5A)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dearing, J.F.

The Subchannel Analysis of Blockages in Reactor Elements (SABRE) computer code, developed by the United Kingdom Atomic Energy Authority, is currently the only practical tool available for performing detailed analyses of velocity and temperature fields in the recirculating flow regions downstream of blockages in liquid-metal fast breeder reactor (LMFBR) pin bundles. SABRE is a subchannel analysis code; that is, it accurately represents the complex geometry of nuclear fuel pins arranged on a triangular lattice. The results of SABRE computational models are compared here with temperature data from two out-of-pile 19-pin test bundles from the Thermal-Hydraulic Out-of-Reactor Safety (THORS) Facility atmore » Oak Ridge National Laboratory. One of these bundles has a small central flow blockage (bundle 3A), while the other has a large edge blockage (bundle 5A). Values that give best agreement with experiment for the empirical thermal mixing correlation factor, FMIX, in SABRE are suggested. These values of FMIX are Reynolds-number dependent, however, indicating that the coded turbulent mixing correlation is not appropriate for wire-wrap pin bundles.« less
Parallel Higher-order Finite Element Method for Accurate Field Computations in Wakefield and PIC Simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Candel, A.; Kabel, A.; Lee, L.

Over the past years, SLAC's Advanced Computations Department (ACD), under SciDAC sponsorship, has developed a suite of 3D (2D) parallel higher-order finite element (FE) codes, T3P (T2P) and Pic3P (Pic2P), aimed at accurate, large-scale simulation of wakefields and particle-field interactions in radio-frequency (RF) cavities of complex shape. The codes are built on the FE infrastructure that supports SLAC's frequency domain codes, Omega3P and S3P, to utilize conformal tetrahedral (triangular)meshes, higher-order basis functions and quadratic geometry approximation. For time integration, they adopt an unconditionally stable implicit scheme. Pic3P (Pic2P) extends T3P (T2P) to treat charged-particle dynamics self-consistently using the PIC (particle-in-cell)more » approach, the first such implementation on a conformal, unstructured grid using Whitney basis functions. Examples from applications to the International Linear Collider (ILC), Positron Electron Project-II (PEP-II), Linac Coherent Light Source (LCLS) and other accelerators will be presented to compare the accuracy and computational efficiency of these codes versus their counterparts using structured grids.« less
Potential Flow Theory and Operation Guide for the Panel Code PMARC. Version 14

NASA Technical Reports Server (NTRS)

Ashby, Dale L.

1999-01-01

The theoretical basis for PMARC, a low-order panel code for modeling complex three-dimensional bodies, in potential flow, is outlined. PMARC can be run on a wide variety of computer platforms, including desktop machines, workstations, and supercomputers. Execution times for PMARC vary tremendously depending on the computer resources used, but typically range from several minutes for simple or moderately complex cases to several hours for very large complex cases. Several of the advanced features currently included in the code, such as internal flow modeling, boundary layer analysis, and time-dependent flow analysis, including problems involving relative motion, are discussed in some detail. The code is written in Fortran77, using adjustable-size arrays so that it can be easily redimensioned to match problem requirements and computer hardware constraints. An overview of the program input is presented. A detailed description of the input parameters is provided in the appendices. PMARC results for several test cases are presented along with analytic or experimental data, where available. The input files for these test cases are given in the appendices. PMARC currently supports plotfile output formats for several commercially available graphics packages. The supported graphics packages are Plot3D, Tecplot, and PmarcViewer.
Rapid Geometry Creation for Computer-Aided Engineering Parametric Analyses: A Case Study Using ComGeom2 for Launch Abort System Design

NASA Technical Reports Server (NTRS)

Hawke, Veronica; Gage, Peter; Manning, Ted

2007-01-01

ComGeom2, a tool developed to generate Common Geometry representation for multidisciplinary analysis, has been used to create a large set of geometries for use in a design study requiring analysis by two computational codes. This paper describes the process used to generate the large number of configurations and suggests ways to further automate the process and make it more efficient for future studies. The design geometry for this study is the launch abort system of the NASA Crew Launch Vehicle.
User's guide for a large signal computer model of the helical traveling wave tube

NASA Technical Reports Server (NTRS)

Palmer, Raymond W.

1992-01-01

The use is described of a successful large-signal, two-dimensional (axisymmetric), deformable disk computer model of the helical traveling wave tube amplifier, an extensively revised and operationally simplified version. We also discuss program input and output and the auxiliary files necessary for operation. Included is a sample problem and its input data and output results. Interested parties may now obtain from the author the FORTRAN source code, auxiliary files, and sample input data on a standard floppy diskette, the contents of which are described herein.
Electronic Structure of Transition Metal Clusters, Actinide Complexes and Their Reactivities

DOE Office of Scientific and Technical Information (OSTI.GOV)

Krishnan Balasubramanian

2009-07-18

This is a continuing DOE-BES funded project on transition metal and actinide containing species, aimed at the electronic structure and spectroscopy of transition metal and actinide containing species. While a long term connection of these species is to catalysis and environmental management of high-level nuclear wastes, the immediate relevance is directly to other DOE-BES funded experimental projects at DOE-National labs and universities. There are a number of ongoing gas-phase spectroscopic studies of these species at various places, and our computational work has been inspired by these experimental studies and we have also inspired other experimental and theoretical studies. Thus ourmore » studies have varied from spectroscopy of diatomic transition metal carbides to large complexes containing transition metals, and actinide complexes that are critical to the environment. In addition, we are continuing to make code enhancements and modernization of ALCHEMY II set of codes and its interface with relativistic configuration interaction (RCI). At present these codes can carry out multi-reference computations that included up to 60 million configurations and multiple states from each such CI expansion. ALCHEMY II codes have been modernized and converted to a variety of platforms such as Windows XP, and Linux. We have revamped the symbolic CI code to automate the MRSDCI technique so that the references are automatically chosen with a given cutoff from the CASSCF and thus we are doing accurate MRSDCI computations with 10,000 or larger reference space of configurations. The RCI code can also handle a large number of reference configurations, which include up to 10,000 reference configurations. Another major progress is in routinely including larger basis sets up to 5g functions in thee computations. Of course higher angular momenta functions can also be handled using Gaussian and other codes with other methods such as DFT, MP2, CCSD(T), etc. We have also calibrated our RECP methods with all-electron Douglas-Kroll relativistic methods. We have the capabilities for computing full CI extrapolations including spin-orbit effects and several one-electron properties and electron density maps including spin-orbit effects. We are continuously collaborating with several experimental groups around the country and at National Labs to carry out computational studies on the DOE-BES funded projects. The past work in the last 3 years was primarily motivated and driven by the concurrent or recent experimental studies on these systems. We were thus significantly benefited by coordinating our computational efforts with experimental studies. The interaction between theory and experiment has resulted in some unique and exciting opportunities. For example, for the very first time ever, the upper spin-orbit component of a heavy trimer such as Au{sub 3} was experimentally observed as a result of our accurate computational study on the upper electronic states of gold trimer. Likewise for the first time AuH{sub 2} could be observed and interpreted clearly due to our computed potential energy surfaces that revealed the existence of a large barrier to convert the isolated AuH{sub 2} back to Au and H{sub 2}. We have also worked on yet to be observed systems and have made predictions for future experiments. We have computed the spectroscopic and thermodynamic properties of transition metal carbides transition metal clusters and compared our electronic states to the anion photodetachment spectra of Lai Sheng Wang. Prof Mike Morse and coworkers(funded also by DOE-BES) and Prof Stimle and coworkers(also funded by DOE-BES) are working on the spectroscopic properties of transition metal carbides and nitrides. Our predictions on the excited states of transition metal clusters such as Hf{sub 3}, Nb{sub 2}{sup +} etc., have been confirmed experimentally by Prof. Lombardi and coworkers using resonance Raman spectroscopy. We have also been studying larger complexes critical to the environmental management of high-level nuclear wastes. In collaboration with experimental colleague Prof Hieno Nitsche (Berkeley) and Dr. Pat Allen (Livermore, EXAFS) we have studied the uranyl complexes with silicates and carbonates. It should be stressed that although our computed ionization potential of uranium oxide was in conflict with the existing experimental data at the time, a subsequent gas-phase experimental work by Prof Mike Haven and coworkers published as communication in JACS confirmed our computed result to within 0.1 eV. This provides considerable confidence that the computed results in large basis sets with highly-correlated wave functions have excellent accuracies and they have the capabilities to predict the excited states also with great accuracy. Computations of actinide complexes (Uranyl and plutonyl complexes) are critical to management of high-level nuclear wastes.« less

Electromagnetic plasma simulation in realistic geometries

NASA Astrophysics Data System (ADS)

Brandon, S.; Ambrosiano, J. J.; Nielsen, D.

1991-08-01

Particle-in-Cell (PIC) calculations have become an indispensable tool to model the nonlinear collective behavior of charged particle species in electromagnetic fields. Traditional finite difference codes, such as CONDOR (2-D) and ARGUS (3-D), are used extensively to design experiments and develop new concepts. A wide variety of physical processes can be modeled simply and efficiently by these codes. However, experiments have become more complex. Geometrical shapes and length scales are becoming increasingly more difficult to model. Spatial resolution requirements for the electromagnetic calculation force large grids and small time steps. Many hours of CRAY YMP time may be required to complete 2-D calculation -- many more for 3-D calculations. In principle, the number of mesh points and particles need only to be increased until all relevant physical processes are resolved. In practice, the size of a calculation is limited by the computer budget. As a result, experimental design is being limited by the ability to calculate, not by the experimenters ingenuity or understanding of the physical processes involved. Several approaches to meet these computational demands are being pursued. Traditional PIC codes continue to be the major design tools. These codes are being actively maintained, optimized, and extended to handle large and more complex problems. Two new formulations are being explored to relax the geometrical constraints of the finite difference codes. A modified finite volume test code, TALUS, uses a data structure compatible with that of standard finite difference meshes. This allows a basic conformal boundary/variable grid capability to be retrofitted to CONDOR. We are also pursuing an unstructured grid finite element code, MadMax. The unstructured mesh approach provides maximum flexibility in the geometrical model while also allowing local mesh refinement.
On the linear programming bound for linear Lee codes.

PubMed

Astola, Helena; Tabus, Ioan

2016-01-01

Based on an invariance-type property of the Lee-compositions of a linear Lee code, additional equality constraints can be introduced to the linear programming problem of linear Lee codes. In this paper, we formulate this property in terms of an action of the multiplicative group of the field [Formula: see text] on the set of Lee-compositions. We show some useful properties of certain sums of Lee-numbers, which are the eigenvalues of the Lee association scheme, appearing in the linear programming problem of linear Lee codes. Using the additional equality constraints, we formulate the linear programming problem of linear Lee codes in a very compact form, leading to a fast execution, which allows to efficiently compute the bounds for large parameter values of the linear codes.
HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

PubMed

Wan, Shixiang; Zou, Quan

2017-01-01

Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. Extreme increase in next-generation sequencing results in shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. The experiments in the DNA and protein large scale data sets, which are more than 1GB files, showed that HAlign II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resource. THAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Earth Sciences Division; Zhang, Keni; Zhang, Keni

TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator ismore » to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code, The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used. To familiarize users with the parallel code, illustrative sample problems are presented.« less
Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications

NASA Technical Reports Server (NTRS)

Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.

1996-01-01

Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Centre, a project was initiated to assess the performance of eight different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with Sequential Unconstrained Minimizations Technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.
Unsteady Aero Computation of a 1 1/2 Stage Large Scale Rotating Turbine

NASA Technical Reports Server (NTRS)

To, Wai-Ming

2012-01-01

This report is the documentation of the work performed for the Subsonic Rotary Wing Project under the NASA s Fundamental Aeronautics Program. It was funded through Task Number NNC10E420T under GESS-2 Contract NNC06BA07B in the period of 10/1/2010 to 8/31/2011. The objective of the task is to provide support for the development of variable speed power turbine technology through application of computational fluid dynamics analyses. This includes work elements in mesh generation, multistage URANS simulations, and post-processing of the simulation results for comparison with the experimental data. The unsteady CFD calculations were performed with the TURBO code running in multistage single passage (phase lag) mode. Meshes for the blade rows were generated with the NASA developed TCGRID code. The CFD performance is assessed and improvements are recommended for future research in this area. For that, the United Technologies Research Center's 1 1/2 stage Large Scale Rotating Turbine was selected to be the candidate engine configuration for this computational effort because of the completeness and availability of the data.
Performance Trend of Different Algorithms for Structural Design Optimization

NASA Technical Reports Server (NTRS)

Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.

1996-01-01

Nonlinear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess performance of different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the sequential unconstrained minimizations technique SUMT) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization and the alleviation of this discrepancy can improve the efficiency of optimizers.
Legacy Code Modernization

NASA Technical Reports Server (NTRS)

Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

1998-01-01

Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization,3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction and 6) machine specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1. Parallelizing tools and compiler evaluation. 2. Code cleanup and serial optimization using automated scripts 3. Development of a code generator for performance prediction 4. Automated partitioning 5. Automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.
Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

PubMed

Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

2016-04-01

Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes due to systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station, however, due to multiple invocations for a large number of subtasks the full task requires a significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods as well as a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new computer software mpiWrapper has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
Parallel algorithm for multiscale atomistic/continuum simulations using LAMMPS

NASA Astrophysics Data System (ADS)

Pavia, F.; Curtin, W. A.

2015-07-01

Deformation and fracture processes in engineering materials often require simultaneous descriptions over a range of length and time scales, with each scale using a different computational technique. Here we present a high-performance parallel 3D computing framework for executing large multiscale studies that couple an atomic domain, modeled using molecular dynamics and a continuum domain, modeled using explicit finite elements. We use the robust Coupled Atomistic/Discrete-Dislocation (CADD) displacement-coupling method, but without the transfer of dislocations between atoms and continuum. The main purpose of the work is to provide a multiscale implementation within an existing large-scale parallel molecular dynamics code (LAMMPS) that enables use of all the tools associated with this popular open-source code, while extending CADD-type coupling to 3D. Validation of the implementation includes the demonstration of (i) stability in finite-temperature dynamics using Langevin dynamics, (ii) elimination of wave reflections due to large dynamic events occurring in the MD region and (iii) the absence of spurious forces acting on dislocations due to the MD/FE coupling, for dislocations further than 10 Å from the coupling boundary. A first non-trivial example application of dislocation glide and bowing around obstacles is shown, for dislocation lengths of ∼50 nm using fewer than 1 000 000 atoms but reproducing results of extremely large atomistic simulations at much lower computational cost.
Step-by-step magic state encoding for efficient fault-tolerant quantum computation

PubMed Central

Goto, Hayato

2014-01-01

Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation. PMID:25511387
Step-by-step magic state encoding for efficient fault-tolerant quantum computation.

PubMed

Goto, Hayato

2014-12-16

Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation.
Application of CHAD hydrodynamics to shock-wave problems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Trease, H.E.; O`Rourke, P.J.; Sahota, M.S.

1997-12-31

CHAD is the latest in a sequence of continually evolving computer codes written to effectively utilize massively parallel computer architectures and the latest grid generators for unstructured meshes. Its applications range from automotive design issues such as in-cylinder and manifold flows of internal combustion engines, vehicle aerodynamics, underhood cooling and passenger compartment heating, ventilation, and air conditioning to shock hydrodynamics and materials modeling. CHAD solves the full unsteady Navier-Stoke equations with the k-epsilon turbulence model in three space dimensions. The code has four major features that distinguish it from the earlier KIVA code, also developed at Los Alamos. First, itmore » is based on a node-centered, finite-volume method in which, like finite element methods, all fluid variables are located at computational nodes. The computational mesh efficiently and accurately handles all element shapes ranging from tetrahedra to hexahedra. Second, it is written in standard Fortran 90 and relies on automatic domain decomposition and a universal communication library written in standard C and MPI for unstructured grids to effectively exploit distributed-memory parallel architectures. Thus the code is fully portable to a variety of computing platforms such as uniprocessor workstations, symmetric multiprocessors, clusters of workstations, and massively parallel platforms. Third, CHAD utilizes a variable explicit/implicit upwind method for convection that improves computational efficiency in flows that have large velocity Courant number variations due to velocity of mesh size variations. Fourth, CHAD is designed to also simulate shock hydrodynamics involving multimaterial anisotropic behavior under high shear. The authors will discuss CHAD capabilities and show several sample calculations showing the strengths and weaknesses of CHAD.« less
78 FR 29063 - Survey of Urban Rates for Fixed Voice and Fixed Broadband Residential Services

Federal Register 2010, 2011, 2012, 2013, 2014

2013-05-17

... in alternative formats (computer diskette, large print, audio record, and Braille). Persons with... Company Name: Provider FRN (used on MONTH DAY, YEAR Form 477): Provider Study Area Code (if current USF...
Modeling Close-In Airblast from ANFO Cylindrical and Box-Shaped Charges

DTIC Science & Technology

2010-10-01

Eulerian hydrodynamics code [1]. The Jones-Wilkins-Lee (JWL) equation of the state (EOS) [2] of the reacted ANFO was computed using the Cheetah ...thermodynamics code [3]. Cheetah first calculates the detonation state from Chapman-Jouget (C-J) theory and then models the adiabatic expansion from...success modeling a large range of ANFO charge sizes using the Cheetah -generated EOS along with the Ignition and Growth (IG) reactive flow model [6
Rapid Calculation of Spacecraft Trajectories Using Efficient Taylor Series Integration

NASA Technical Reports Server (NTRS)

Scott, James R.; Martini, Michael C.

2011-01-01

A variable-order, variable-step Taylor series integration algorithm was implemented in NASA Glenn's SNAP (Spacecraft N-body Analysis Program) code. SNAP is a high-fidelity trajectory propagation program that can propagate the trajectory of a spacecraft about virtually any body in the solar system. The Taylor series algorithm's very high order accuracy and excellent stability properties lead to large reductions in computer time relative to the code's existing 8th order Runge-Kutta scheme. Head-to-head comparison on near-Earth, lunar, Mars, and Europa missions showed that Taylor series integration is 15.8 times faster than Runge- Kutta on average, and is more accurate. These speedups were obtained for calculations involving central body, other body, thrust, and drag forces. Similar speedups have been obtained for calculations that include J2 spherical harmonic for central body gravitation. The algorithm includes a step size selection method that directly calculates the step size and never requires a repeat step. High-order Taylor series integration algorithms have been shown to provide major reductions in computer time over conventional integration methods in numerous scientific applications. The objective here was to directly implement Taylor series integration in an existing trajectory analysis code and demonstrate that large reductions in computer time (order of magnitude) could be achieved while simultaneously maintaining high accuracy. This software greatly accelerates the calculation of spacecraft trajectories. At each time level, the spacecraft position, velocity, and mass are expanded in a high-order Taylor series whose coefficients are obtained through efficient differentiation arithmetic. This makes it possible to take very large time steps at minimal cost, resulting in large savings in computer time. The Taylor series algorithm is implemented primarily through three subroutines: (1) a driver routine that automatically introduces auxiliary variables and sets up initial conditions and integrates; (2) a routine that calculates system reduced derivatives using recurrence relations for quotients and products; and (3) a routine that determines the step size and sums the series. The order of accuracy used in a trajectory calculation is arbitrary and can be set by the user. The algorithm directly calculates the motion of other planetary bodies and does not require ephemeris files (except to start the calculation). The code also runs with Taylor series and Runge-Kutta used interchangeably for different phases of a mission.
Exploring a QoS Driven Scheduling Approach for Peer-to-Peer Live Streaming Systems with Network Coding

PubMed Central

Cui, Laizhong; Lu, Nan; Chen, Fu

2014-01-01

Most large-scale peer-to-peer (P2P) live streaming systems use mesh to organize peers and leverage pull scheduling to transmit packets for providing robustness in dynamic environment. The pull scheduling brings large packet delay. Network coding makes the push scheduling feasible in mesh P2P live streaming and improves the efficiency. However, it may also introduce some extra delays and coding computational overhead. To improve the packet delay, streaming quality, and coding overhead, in this paper are as follows. we propose a QoS driven push scheduling approach. The main contributions of this paper are: (i) We introduce a new network coding method to increase the content diversity and reduce the complexity of scheduling; (ii) we formulate the push scheduling as an optimization problem and transform it to a min-cost flow problem for solving it in polynomial time; (iii) we propose a push scheduling algorithm to reduce the coding overhead and do extensive experiments to validate the effectiveness of our approach. Compared with previous approaches, the simulation results demonstrate that packet delay, continuity index, and coding ratio of our system can be significantly improved, especially in dynamic environments. PMID:25114968
NASA's Information Power Grid: Large Scale Distributed Computing and Data Management

NASA Technical Reports Server (NTRS)

Johnston, William E.; Vaziri, Arsi; Hinke, Tom; Tanner, Leigh Ann; Feiereisen, William J.; Thigpen, William; Tang, Harry (Technical Monitor)

2001-01-01

Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed. The overall motivation for Grids is to facilitate the routine interactions of these resources in order to support large-scale science and engineering. Multi-disciplinary simulations provide a good example of a class of applications that are very likely to require aggregation of widely distributed computing, data, and intellectual resources. Such simulations - e.g. whole system aircraft simulation and whole system living cell simulation - require integrating applications and data that are developed by different teams of researchers frequently in different locations. The research team's are the only ones that have the expertise to maintain and improve the simulation code and/or the body of experimental data that drives the simulations. This results in an inherently distributed computing and data management environment.
Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers.

PubMed

Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne

2018-01-01

State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems.
Computational simulation of matrix micro-slip bands in SiC/Ti-15 composite

NASA Technical Reports Server (NTRS)

Mital, S. K.; Lee, H.-J.; Murthy, P. L. N.; Chamis, C. C.

1992-01-01

Computational simulation procedures are used to identify the key deformation mechanisms for (0)(sub 8) and (90)(sub 8) SiC/Ti-15 metal matrix composites. The computational simulation procedures employed consist of a three-dimensional finite-element analysis and a micromechanics based computer code METCAN. The interphase properties used in the analysis have been calibrated using the METCAN computer code with the (90)(sub 8) experimental stress-strain curve. Results of simulation show that although shear stresses are sufficiently high to cause the formation of some slip bands in the matrix concentrated mostly near the fibers, the nonlinearity in the composite stress-strain curve in the case of (90)(sub 8) composite is dominated by interfacial damage, such as microcracks and debonding rather than microplasticity. The stress-strain curve for (0)(sub 8) composite is largely controlled by the fibers and shows only slight nonlinearity at higher strain levels that could be the result of matrix microplasticity.

Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers

PubMed Central

Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne

2018-01-01

State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems. PMID:29503613
Development of a GPU Compatible Version of the Fast Radiation Code RRTMG

NASA Astrophysics Data System (ADS)

Iacono, M. J.; Mlawer, E. J.; Berthiaume, D.; Cady-Pereira, K. E.; Suarez, M.; Oreopoulos, L.; Lee, D.

2012-12-01

The absorption of solar radiation and emission/absorption of thermal radiation are crucial components of the physics that drive Earth's climate and weather. Therefore, accurate radiative transfer calculations are necessary for realistic climate and weather simulations. Efficient radiation codes have been developed for this purpose, but their accuracy requirements still necessitate that as much as 30% of the computational time of a GCM is spent computing radiative fluxes and heating rates. The overall computational expense constitutes a limitation on a GCM's predictive ability if it becomes an impediment to adding new physics to or increasing the spatial and/or vertical resolution of the model. The emergence of Graphics Processing Unit (GPU) technology, which will allow the parallel computation of multiple independent radiative calculations in a GCM, will lead to a fundamental change in the competition between accuracy and speed. Processing time previously consumed by radiative transfer will now be available for the modeling of other processes, such as physics parameterizations, without any sacrifice in the accuracy of the radiative transfer. Furthermore, fast radiation calculations can be performed much more frequently and will allow the modeling of radiative effects of rapid changes in the atmosphere. The fast radiation code RRTMG, developed at Atmospheric and Environmental Research (AER), is utilized operationally in many dynamical models throughout the world. We will present the results from the first stage of an effort to create a version of the RRTMG radiation code designed to run efficiently in a GPU environment. This effort will focus on the RRTMG implementation in GEOS-5. RRTMG has an internal pseudo-spectral vector of length of order 100 that, when combined with the much greater length of the global horizontal grid vector from which the radiation code is called in GEOS-5, makes RRTMG/GEOS-5 particularly suited to achieving a significant speed improvement through GPU technology. This large number of independent cases will allow us to take full advantage of the computational power of the latest GPUs, ensuring that all thread cores in the GPU remain active, a key criterion for obtaining significant speedup. The CUDA (Compute Unified Device Architecture) Fortran compiler developed by PGI and Nvidia will allow us to construct this parallel implementation on the GPU while remaining in the Fortran language. This implementation will scale very well across various CUDA-supported GPUs such as the recently released Fermi Nvidia cards. We will present the computational speed improvements of the GPU-compatible code relative to the standard CPU-based RRTMG with respect to a very large and diverse suite of atmospheric profiles. This suite will also be utilized to demonstrate the minimal impact of the code restructuring on the accuracy of radiation calculations. The GPU-compatible version of RRTMG will be directly applicable to future versions of GEOS-5, but it is also likely to provide significant associated benefits for other GCMs that employ RRTMG.
Parallel-vector computation for structural analysis and nonlinear unconstrained optimization problems

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.

1990-01-01

Practical engineering application can often be formulated in the form of a constrained optimization problem. There are several solution algorithms for solving a constrained optimization problem. One approach is to convert a constrained problem into a series of unconstrained problems. Furthermore, unconstrained solution algorithms can be used as part of the constrained solution algorithms. Structural optimization is an iterative process where one starts with an initial design, a finite element structure analysis is then performed to calculate the response of the system (such as displacements, stresses, eigenvalues, etc.). Based upon the sensitivity information on the objective and constraint functions, an optimizer such as ADS or IDESIGN, can be used to find the new, improved design. For the structural analysis phase, the equation solver for the system of simultaneous, linear equations plays a key role since it is needed for either static, or eigenvalue, or dynamic analysis. For practical, large-scale structural analysis-synthesis applications, computational time can be excessively large. Thus, it is necessary to have a new structural analysis-synthesis code which employs new solution algorithms to exploit both parallel and vector capabilities offered by modern, high performance computers such as the Convex, Cray-2 and Cray-YMP computers. The objective of this research project is, therefore, to incorporate the latest development in the parallel-vector equation solver, PVSOLVE into the widely popular finite-element production code, such as the SAP-4. Furthermore, several nonlinear unconstrained optimization subroutines have also been developed and tested under a parallel computer environment. The unconstrained optimization subroutines are not only useful in their own right, but they can also be incorporated into a more popular constrained optimization code, such as ADS.
A transient FETI methodology for large-scale parallel implicit computations in structural mechanics

NASA Technical Reports Server (NTRS)

Farhat, Charbel; Crivelli, Luis; Roux, Francois-Xavier

1992-01-01

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because explicit schemes are also easier to parallelize than implicit ones. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet -- and perhaps will never -- be offset by the speed of parallel hardware. Therefore, it is essential to develop efficient and robust alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating low-frequency dynamics. Here we present a domain decomposition method for implicit schemes that requires significantly less storage than factorization algorithms, that is several times faster than other popular direct and iterative methods, that can be easily implemented on both shared and local memory parallel processors, and that is both computationally and communication-wise efficient. The proposed transient domain decomposition method is an extension of the method of Finite Element Tearing and Interconnecting (FETI) developed by Farhat and Roux for the solution of static problems. Serial and parallel performance results on the CRAY Y-MP/8 and the iPSC-860/128 systems are reported and analyzed for realistic structural dynamics problems. These results establish the superiority of the FETI method over both the serial/parallel conjugate gradient algorithm with diagonal scaling and the serial/parallel direct method, and contrast the computational power of the iPSC-860/128 parallel processor with that of the CRAY Y-MP/8 system.
Multimodal Discriminative Binary Embedding for Large-Scale Cross-Modal Retrieval.

PubMed

Wang, Di; Gao, Xinbo; Wang, Xiumei; He, Lihuo; Yuan, Bo

2016-10-01

Multimodal hashing, which conducts effective and efficient nearest neighbor search across heterogeneous data on large-scale multimedia databases, has been attracting increasing interest, given the explosive growth of multimedia content on the Internet. Recent multimodal hashing research mainly aims at learning the compact binary codes to preserve semantic information given by labels. The overwhelming majority of these methods are similarity preserving approaches which approximate pairwise similarity matrix with Hamming distances between the to-be-learnt binary hash codes. However, these methods ignore the discriminative property in hash learning process, which results in hash codes from different classes undistinguished, and therefore reduces the accuracy and robustness for the nearest neighbor search. To this end, we present a novel multimodal hashing method, named multimodal discriminative binary embedding (MDBE), which focuses on learning discriminative hash codes. First, the proposed method formulates the hash function learning in terms of classification, where the binary codes generated by the learned hash functions are expected to be discriminative. And then, it exploits the label information to discover the shared structures inside heterogeneous data. Finally, the learned structures are preserved for hash codes to produce similar binary codes in the same class. Hence, the proposed MDBE can preserve both discriminability and similarity for hash codes, and will enhance retrieval accuracy. Thorough experiments on benchmark data sets demonstrate that the proposed method achieves excellent accuracy and competitive computational efficiency compared with the state-of-the-art methods for large-scale cross-modal retrieval task.
Modeling coupled Thermo-Hydro-Mechanical processes including plastic deformation in geological porous media

NASA Astrophysics Data System (ADS)

Kelkar, S.; Karra, S.; Pawar, R. J.; Zyvoloski, G.

2012-12-01

There has been an increasing interest in the recent years in developing computational tools for analyzing coupled thermal, hydrological and mechanical (THM) processes that occur in geological porous media. This is mainly due to their importance in applications including carbon sequestration, enhanced geothermal systems, oil and gas production from unconventional sources, degradation of Arctic permafrost, and nuclear waste isolation. Large changes in pressures, temperatures and saturation can result due to injection/withdrawal of fluids or emplaced heat sources. These can potentially lead to large changes in the fluid flow and mechanical behavior of the formation, including shear and tensile failure on pre-existing or induced fractures and the associated permeability changes. Due to this, plastic deformation and large changes in material properties such as permeability and porosity can be expected to play an important role in these processes. We describe a general purpose computational code FEHM that has been developed for the purpose of modeling coupled THM processes during multi-phase fluid flow and transport in fractured porous media. The code uses a continuum mechanics approach, based on control volume - finite element method. It is designed to address spatial scales on the order of tens of centimeters to tens of kilometers. While large deformations are important in many situations, we have adapted the small strain formulation as useful insight can be obtained in many problems of practical interest with this approach while remaining computationally manageable. Nonlinearities in the equations and the material properties are handled using a full Jacobian Newton-Raphson technique. Stress-strain relationships are assumed to follow linear elastic/plastic behavior. The code incorporates several plasticity models such as von Mises, Drucker-Prager, and also a large suite of models for coupling flow and mechanical deformation via permeability and stresses/deformations. In this work we present several example applications of such models.
Development of Parallel Code for the Alaska Tsunami Forecast Model

NASA Astrophysics Data System (ADS)

Bahng, B.; Knight, W. R.; Whitmore, P.

2014-12-01

The Alaska Tsunami Forecast Model (ATFM) is a numerical model used to forecast propagation and inundation of tsunamis generated by earthquakes and other means in both the Pacific and Atlantic Oceans. At the U.S. National Tsunami Warning Center (NTWC), the model is mainly used in a pre-computed fashion. That is, results for hundreds of hypothetical events are computed before alerts, and are accessed and calibrated with observations during tsunamis to immediately produce forecasts. ATFM uses the non-linear, depth-averaged, shallow-water equations of motion with multiply nested grids in two-way communications between domains of each parent-child pair as waves get closer to coastal waters. Even with the pre-computation the task becomes non-trivial as sub-grid resolution gets finer. Currently, the finest resolution Digital Elevation Models (DEM) used by ATFM are 1/3 arc-seconds. With a serial code, large or multiple areas of very high resolution can produce run-times that are unrealistic even in a pre-computed approach. One way to increase the model performance is code parallelization used in conjunction with a multi-processor computing environment. NTWC developers have undertaken an ATFM code-parallelization effort to streamline the creation of the pre-computed database of results with the long term aim of tsunami forecasts from source to high resolution shoreline grids in real time. Parallelization will also permit timely regeneration of the forecast model database with new DEMs; and, will make possible future inclusion of new physics such as the non-hydrostatic treatment of tsunami propagation. The purpose of our presentation is to elaborate on the parallelization approach and to show the compute speed increase on various multi-processor systems.
Final Report for "Implimentation and Evaluation of Multigrid Linear Solvers into Extended Magnetohydrodynamic Codes for Petascale Computing"

DOE Office of Scientific and Technical Information (OSTI.GOV)

Srinath Vadlamani; Scott Kruger; Travis Austin

Extended magnetohydrodynamic (MHD) codes are used to model the large, slow-growing instabilities that are projected to limit the performance of International Thermonuclear Experimental Reactor (ITER). The multiscale nature of the extended MHD equations requires an implicit approach. The current linear solvers needed for the implicit algorithm scale poorly because the resultant matrices are so ill-conditioned. A new solver is needed, especially one that scales to the petascale. The most successful scalable parallel processor solvers to date are multigrid solvers. Applying multigrid techniques to a set of equations whose fundamental modes are dispersive waves is a promising solution to CEMM problems.more » For the Phase 1, we implemented multigrid preconditioners from the HYPRE project of the Center for Applied Scientific Computing at LLNL via PETSc of the DOE SciDAC TOPS for the real matrix systems of the extended MHD code NIMROD which is a one of the primary modeling codes of the OFES-funded Center for Extended Magnetohydrodynamic Modeling (CEMM) SciDAC. We implemented the multigrid solvers on the fusion test problem that allows for real matrix systems with success, and in the process learned about the details of NIMROD data structures and the difficulties of inverting NIMROD operators. The further success of this project will allow for efficient usage of future petascale computers at the National Leadership Facilities: Oak Ridge National Laboratory, Argonne National Laboratory, and National Energy Research Scientific Computing Center. The project will be a collaborative effort between computational plasma physicists and applied mathematicians at Tech-X Corporation, applied mathematicians Front Range Scientific Computations, Inc. (who are collaborators on the HYPRE project), and other computational plasma physicists involved with the CEMM project.« less
GSAC - Generic Seismic Application Computing

NASA Astrophysics Data System (ADS)

Herrmann, R. B.; Ammon, C. J.; Koper, K. D.

2004-12-01

With the success of the IRIS data management center, the use of large data sets in seismological research has become common. Such data sets, and especially the significantly larger data sets expected from EarthScope, present challenges for analysis with existing tools developed over the last 30 years. For much of the community, the primary format for data analysis is the Seismic Analysis Code (SAC) format developed by Lawrence Livermore National Laboratory. Although somewhat restrictive in meta-data storage, the simplicity and stability of the format has established it as an important component of seismological research. Tools for working with SAC files fall into two categories - custom research quality processing codes and shared display - processing tools such as SAC2000, MatSeis,etc., which were developed primarily for the needs of individual seismic research groups. While the current graphics display and platform dependence of SAC2000 may be resolved if the source code is released, the code complexity and the lack of large-data set analysis or even introductory tutorials could preclude code improvements and development of expertise in its use. We believe that there is a place for new, especially open source, tools. The GSAC effort is an approach that focuses on ease of use, computational speed, transportability, rapid addition of new features and openness so that new and advanced students, researchers and instructors can quickly browse and process large data sets. We highlight several approaches toward data processing under this model. gsac - part of the Computer Programs in Seismology 3.30 distribution has much of the functionality of SAC2000 and works on UNIX/LINUX/MacOS-X/Windows (CYGWIN). This is completely programmed in C from scratch, is small, fast, and easy to maintain and extend. It is command line based and is easily included within shell processing scripts. PySAC is a set of Python functions that allow easy access to SAC files and enable efficient manipulation of SAC files under a variety of operating systems. PySAC has proven to be valuable in organizing large data sets. An array processing package includes standard beamforming algorithms and a search based method for inference of slowness vectors. The search results can be visualized using GMT scripts output by the C programs, and the resulting snapshots can be combined into an animation of the time evolution of the 2D slowness field.
Log-normal spray drop distribution...analyzed by two new computer programs

Treesearch

Gerald S. Walton

1968-01-01

Results of U.S. Forest Service research on chemical insecticides suggest that large drops are not as effective as small drops in carrying insecticides to target insects. Two new computer programs have been written to analyze size distribution properties of drops from spray nozzles. Coded in Fortran IV, the programs have been tested on both the CDC 6400 and the IBM 7094...
Computational strategies for three-dimensional flow simulations on distributed computer systems

NASA Technical Reports Server (NTRS)

Sankar, Lakshmi N.; Weed, Richard A.

1995-01-01

This research effort is directed towards an examination of issues involved in porting large computational fluid dynamics codes in use within the industry to a distributed computing environment. This effort addresses strategies for implementing the distributed computing in a device independent fashion and load balancing. A flow solver called TEAM presently in use at Lockheed Aeronautical Systems Company was acquired to start this effort. The following tasks were completed: (1) The TEAM code was ported to a number of distributed computing platforms including a cluster of HP workstations located in the School of Aerospace Engineering at Georgia Tech; a cluster of DEC Alpha Workstations in the Graphics visualization lab located at Georgia Tech; a cluster of SGI workstations located at NASA Ames Research Center; and an IBM SP-2 system located at NASA ARC. (2) A number of communication strategies were implemented. Specifically, the manager-worker strategy and the worker-worker strategy were tested. (3) A variety of load balancing strategies were investigated. Specifically, the static load balancing, task queue balancing and the Crutchfield algorithm were coded and evaluated. (4) The classical explicit Runge-Kutta scheme in the TEAM solver was replaced with an LU implicit scheme. And (5) the implicit TEAM-PVM solver was extensively validated through studies of unsteady transonic flow over an F-5 wing, undergoing combined bending and torsional motion. These investigations are documented in extensive detail in the dissertation, 'Computational Strategies for Three-Dimensional Flow Simulations on Distributed Computing Systems', enclosed as an appendix.
Computational strategies for three-dimensional flow simulations on distributed computer systems

NASA Astrophysics Data System (ADS)

Sankar, Lakshmi N.; Weed, Richard A.

1995-08-01

This research effort is directed towards an examination of issues involved in porting large computational fluid dynamics codes in use within the industry to a distributed computing environment. This effort addresses strategies for implementing the distributed computing in a device independent fashion and load balancing. A flow solver called TEAM presently in use at Lockheed Aeronautical Systems Company was acquired to start this effort. The following tasks were completed: (1) The TEAM code was ported to a number of distributed computing platforms including a cluster of HP workstations located in the School of Aerospace Engineering at Georgia Tech; a cluster of DEC Alpha Workstations in the Graphics visualization lab located at Georgia Tech; a cluster of SGI workstations located at NASA Ames Research Center; and an IBM SP-2 system located at NASA ARC. (2) A number of communication strategies were implemented. Specifically, the manager-worker strategy and the worker-worker strategy were tested. (3) A variety of load balancing strategies were investigated. Specifically, the static load balancing, task queue balancing and the Crutchfield algorithm were coded and evaluated. (4) The classical explicit Runge-Kutta scheme in the TEAM solver was replaced with an LU implicit scheme. And (5) the implicit TEAM-PVM solver was extensively validated through studies of unsteady transonic flow over an F-5 wing, undergoing combined bending and torsional motion. These investigations are documented in extensive detail in the dissertation, 'Computational Strategies for Three-Dimensional Flow Simulations on Distributed Computing Systems', enclosed as an appendix.
WARP3D-Release 10.8: Dynamic Nonlinear Analysis of Solids using a Preconditioned Conjugate Gradient Software Architecture

NASA Technical Reports Server (NTRS)

Koppenhoefer, Kyle C.; Gullerud, Arne S.; Ruggieri, Claudio; Dodds, Robert H., Jr.; Healy, Brian E.

1998-01-01

This report describes theoretical background material and commands necessary to use the WARP3D finite element code. WARP3D is under continuing development as a research code for the solution of very large-scale, 3-D solid models subjected to static and dynamic loads. Specific features in the code oriented toward the investigation of ductile fracture in metals include a robust finite strain formulation, a general J-integral computation facility (with inertia, face loading), an element extinction facility to model crack growth, nonlinear material models including viscoplastic effects, and the Gurson-Tver-gaard dilatant plasticity model for void growth. The nonlinear, dynamic equilibrium equations are solved using an incremental-iterative, implicit formulation with full Newton iterations to eliminate residual nodal forces. The history integration of the nonlinear equations of motion is accomplished with Newmarks Beta method. A central feature of WARP3D involves the use of a linear-preconditioned conjugate gradient (LPCG) solver implemented in an element-by-element format to replace a conventional direct linear equation solver. This software architecture dramatically reduces both the memory requirements and CPU time for very large, nonlinear solid models since formation of the assembled (dynamic) stiffness matrix is avoided. Analyses thus exhibit the numerical stability for large time (load) steps provided by the implicit formulation coupled with the low memory requirements characteristic of an explicit code. In addition to the much lower memory requirements of the LPCG solver, the CPU time required for solution of the linear equations during each Newton iteration is generally one-half or less of the CPU time required for a traditional direct solver. All other computational aspects of the code (element stiffnesses, element strains, stress updating, element internal forces) are implemented in the element-by- element, blocked architecture. This greatly improves vectorization of the code on uni-processor hardware and enables straightforward parallel-vector processing of element blocks on multi-processor hardware.
Implementing Scientific Simulation Codes Highly Tailored for Vector Architectures Using Custom Configurable Computing Machines

NASA Technical Reports Server (NTRS)

Rutishauser, David

2006-01-01

The motivation for this work comes from an observation that amidst the push for Massively Parallel (MP) solutions to high-end computing problems such as numerical physical simulations, large amounts of legacy code exist that are highly optimized for vector supercomputers. Because re-hosting legacy code often requires a complete re-write of the original code, which can be a very long and expensive effort, this work examines the potential to exploit reconfigurable computing machines in place of a vector supercomputer to implement an essentially unmodified legacy source code. Custom and reconfigurable computing resources could be used to emulate an original application's target platform to the extent required to achieve high performance. To arrive at an architecture that delivers the desired performance subject to limited resources involves solving a multi-variable optimization problem with constraints. Prior research in the area of reconfigurable computing has demonstrated that designing an optimum hardware implementation of a given application under hardware resource constraints is an NP-complete problem. The premise of the approach is that the general issue of applying reconfigurable computing resources to the implementation of an application, maximizing the performance of the computation subject to physical resource constraints, can be made a tractable problem by assuming a computational paradigm, such as vector processing. This research contributes a formulation of the problem and a methodology to design a reconfigurable vector processing implementation of a given application that satisfies a performance metric. A generic, parametric, architectural framework for vector processing implemented in reconfigurable logic is developed as a target for a scheduling/mapping algorithm that maps an input computation to a given instance of the architecture. This algorithm is integrated with an optimization framework to arrive at a specification of the architecture parameters that attempts to minimize execution time, while staying within resource constraints. The flexibility of using a custom reconfigurable implementation is exploited in a unique manner to leverage the lessons learned in vector supercomputer development. The vector processing framework is tailored to the application, with variable parameters that are fixed in traditional vector processing. Benchmark data that demonstrates the functionality and utility of the approach is presented. The benchmark data includes an identified bottleneck in a real case study example vector code, the NASA Langley Terminal Area Simulation System (TASS) application.
A Parallel Sliding Region Algorithm to Make Agent-Based Modeling Possible for a Large-Scale Simulation: Modeling Hepatitis C Epidemics in Canada.

PubMed

Wong, William W L; Feng, Zeny Z; Thein, Hla-Hla

2016-11-01

Agent-based models (ABMs) are computer simulation models that define interactions among agents and simulate emergent behaviors that arise from the ensemble of local decisions. ABMs have been increasingly used to examine trends in infectious disease epidemiology. However, the main limitation of ABMs is the high computational cost for a large-scale simulation. To improve the computational efficiency for large-scale ABM simulations, we built a parallelizable sliding region algorithm (SRA) for ABM and compared it to a nonparallelizable ABM. We developed a complex agent network and performed two simulations to model hepatitis C epidemics based on the real demographic data from Saskatchewan, Canada. The first simulation used the SRA that processed on each postal code subregion subsequently. The second simulation processed the entire population simultaneously. It was concluded that the parallelizable SRA showed computational time saving with comparable results in a province-wide simulation. Using the same method, SRA can be generalized for performing a country-wide simulation. Thus, this parallel algorithm enables the possibility of using ABM for large-scale simulation with limited computational resources.
Flowgen: Flowchart-based documentation for C + + codes

NASA Astrophysics Data System (ADS)

Kosower, David A.; Lopez-Villarejo, J. J.

2015-11-01

We present the Flowgen tool, which generates flowcharts from annotated C + + source code. The tool generates a set of interconnected high-level UML activity diagrams, one for each function or method in the C + + sources. It provides a simple and visual overview of complex implementations of numerical algorithms. Flowgen is complementary to the widely-used Doxygen documentation tool. The ultimate aim is to render complex C + + computer codes accessible, and to enhance collaboration between programmers and algorithm or science specialists. We describe the tool and a proof-of-concept application to the VINCIA plug-in for simulating collisions at CERN's Large Hadron Collider.
GAPD: a GPU-accelerated atom-based polychromatic diffraction simulation code.

PubMed

E, J C; Wang, L; Chen, S; Zhang, Y Y; Luo, S N

2018-03-01

GAPD, a graphics-processing-unit (GPU)-accelerated atom-based polychromatic diffraction simulation code for direct, kinematics-based, simulations of X-ray/electron diffraction of large-scale atomic systems with mono-/polychromatic beams and arbitrary plane detector geometries, is presented. This code implements GPU parallel computation via both real- and reciprocal-space decompositions. With GAPD, direct simulations are performed of the reciprocal lattice node of ultralarge systems (∼5 billion atoms) and diffraction patterns of single-crystal and polycrystalline configurations with mono- and polychromatic X-ray beams (including synchrotron undulator sources), and validation, benchmark and application cases are presented.
Design evolution of large wind turbine generators

NASA Technical Reports Server (NTRS)

Spera, D. A.

1979-01-01

During the past five years, the goals of economy and reliability have led to a significant evolution in the basic design--both external and internal--of large wind turbine systems. To show the scope and nature of recent changes in wind turbine designs, development of three types are described: (1) system configuration developments; (2) computer code developments; and (3) blade technology developments.
Simulations of Laboratory Astrophysics Experiments using the CRASH code

NASA Astrophysics Data System (ADS)

Trantham, Matthew; Kuranz, Carolyn; Fein, Jeff; Wan, Willow; Young, Rachel; Keiter, Paul; Drake, R. Paul

2015-11-01

Computer simulations can assist in the design and analysis of laboratory astrophysics experiments. The Center for Radiative Shock Hydrodynamics (CRASH) at the University of Michigan developed a code that has been used to design and analyze high-energy-density experiments on OMEGA, NIF, and other large laser facilities. This Eulerian code uses block-adaptive mesh refinement (AMR) with implicit multigroup radiation transport, electron heat conduction and laser ray tracing. This poster will demonstrate some of the experiments the CRASH code has helped design or analyze including: Kelvin-Helmholtz, Rayleigh-Taylor, magnetized flows, jets, and laser-produced plasmas. This work is funded by the following grants: DEFC52-08NA28616, DE-NA0001840, and DE-NA0002032.
Statistical Surrogate Modeling of Atmospheric Dispersion Events Using Bayesian Adaptive Splines

NASA Astrophysics Data System (ADS)

Francom, D.; Sansó, B.; Bulaevskaya, V.; Lucas, D. D.

2016-12-01

Uncertainty in the inputs of complex computer models, including atmospheric dispersion and transport codes, is often assessed via statistical surrogate models. Surrogate models are computationally efficient statistical approximations of expensive computer models that enable uncertainty analysis. We introduce Bayesian adaptive spline methods for producing surrogate models that capture the major spatiotemporal patterns of the parent model, while satisfying all the necessities of flexibility, accuracy and computational feasibility. We present novel methodological and computational approaches motivated by a controlled atmospheric tracer release experiment conducted at the Diablo Canyon nuclear power plant in California. Traditional methods for building statistical surrogate models often do not scale well to experiments with large amounts of data. Our approach is well suited to experiments involving large numbers of model inputs, large numbers of simulations, and functional output for each simulation. Our approach allows us to perform global sensitivity analysis with ease. We also present an approach to calibration of simulators using field data.

From 2D to 3D modelling in long term tectonics: Modelling challenges and HPC solutions (Invited)

NASA Astrophysics Data System (ADS)

Le Pourhiet, L.; May, D.

2013-12-01

Over the last decades, 3D thermo-mechanical codes have been made available to the long term tectonics community either as open source (Underworld, Gale) or more limited access (Fantom, Elvis3D, Douar, LaMem etc ...). However, to date, few published results using these methods have included the coupling between crustal and lithospheric dynamics at large strain. The fact that these computations are computational expensive is not the primary reason for the relatively slow development of 3D modeling in the long term tectonics community, as compare to the rapid development observed within the mantle dynamic community, or in the short-term tectonics field. Long term tectonics problems have specific issues not found in either of these two field, including; large strain (not an issue for short-term), the inclusion of free surface and the occurence of large viscosity contrasts. The first issue is typically eliminated using a combined marker-ALE method instead of fully lagrangian method, however, the marker-ALE approach can pose some algorithmic challenges in a massively parallel environment. The two last issues are more problematic because they affect the convergence of the linear/non-linear solver and the memory cost. Two options have been tested so far, using low order element and solving with a sparse direct solver, or using higher order stable elements together with a multi-grid solver. The first options, is simpler to code and to use but reaches its limit at around 80^3 low order elements. The second option requires more operations but allows using iterative solver on extremely large computers. In this presentation, I will describe the design philosophy and highlight results obtained using a code from the second-class method. The presentation will be oriented from an end-user point of view, using an application from 3D continental break up to illustrate key concepts. The description will proceed point by point from implementing physics into the code, to dealing with specific issues related to solving the discrete system of non linear equations.
Development and evaluation of a Naïve Bayesian model for coding causation of workers' compensation claims.

PubMed

Bertke, S J; Meyers, A R; Wurzelbacher, S J; Bell, J; Lampl, M L; Robins, D

2012-12-01

Tracking and trending rates of injuries and illnesses classified as musculoskeletal disorders caused by ergonomic risk factors such as overexertion and repetitive motion (MSDs) and slips, trips, or falls (STFs) in different industry sectors is of high interest to many researchers. Unfortunately, identifying the cause of injuries and illnesses in large datasets such as workers' compensation systems often requires reading and coding the free form accident text narrative for potentially millions of records. To alleviate the need for manual coding, this paper describes and evaluates a computer auto-coding algorithm that demonstrated the ability to code millions of claims quickly and accurately by learning from a set of previously manually coded claims. The auto-coding program was able to code claims as a musculoskeletal disorders, STF or other with approximately 90% accuracy. The program developed and discussed in this paper provides an accurate and efficient method for identifying the causation of workers' compensation claims as a STF or MSD in a large database based on the unstructured text narrative and resulting injury diagnoses. The program coded thousands of claims in minutes. The method described in this paper can be used by researchers and practitioners to relieve the manual burden of reading and identifying the causation of claims as a STF or MSD. Furthermore, the method can be easily generalized to code/classify other unstructured text narratives. Published by Elsevier Ltd.
Improvement of Mishchenko's T-matrix code for absorbing particles.

PubMed

Moroz, Alexander

2005-06-10

The use of Gaussian elimination with backsubstitution for matrix inversion in scattering theories is discussed. Within the framework of the T-matrix method (the state-of-the-art code by Mishchenko is freely available at http://www.giss.nasa.gov/-crmim), it is shown that the domain of applicability of Mishchenko's FORTRAN 77 (F77) code can be substantially expanded in the direction of strongly absorbing particles where the current code fails to converge. Such an extension is especially important if the code is to be used in nanoplasmonic or nanophotonic applications involving metallic particles. At the same time, convergence can also be achieved for large nonabsorbing particles, in which case the non-Numerical Algorithms Group option of Mishchenko's code diverges. Computer F77 implementation of Mishchenko's code supplemented with Gaussian elimination with backsubstitution is freely available at http://www.wave-scattering.com.
Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations

NASA Technical Reports Server (NTRS)

Oliker, Leonid; Carter, Jonathan; Shalf, John; Skinner, David; Ethier, Stephane; Biswas, Rupak; Djomehri, Jahed; VanderWijngaart, Rob

2003-01-01

The growing gap between sustained and peak performance for scientific applications has become a well-known problem in high performance computing. The recent development of parallel vector systems offers the potential to bridge this gap for a significant number of computational science codes and deliver a substantial increase in computing capabilities. This paper examines the intranode performance of the NEC SX6 vector processor and the cache-based IBM Power3/4 superscalar architectures across a number of key scientific computing areas. First, we present the performance of a microbenchmark suite that examines a full spectrum of low-level machine characteristics. Next, we study the behavior of the NAS Parallel Benchmarks using some simple optimizations. Finally, we evaluate the perfor- mance of several numerical codes from key scientific computing domains. Overall results demonstrate that the SX6 achieves high performance on a large fraction of our application suite and in many cases significantly outperforms the RISC-based architectures. However, certain classes of applications are not easily amenable to vectorization and would likely require extensive reengineering of both algorithm and implementation to utilize the SX6 effectively.
Million-body star cluster simulations: comparisons between Monte Carlo and direct N-body

NASA Astrophysics Data System (ADS)

Rodriguez, Carl L.; Morscher, Meagan; Wang, Long; Chatterjee, Sourav; Rasio, Frederic A.; Spurzem, Rainer

2016-12-01

We present the first detailed comparison between million-body globular cluster simulations computed with a Hénon-type Monte Carlo code, CMC, and a direct N-body code, NBODY6++GPU. Both simulations start from an identical cluster model with 106 particles, and include all of the relevant physics needed to treat the system in a highly realistic way. With the two codes `frozen' (no fine-tuning of any free parameters or internal algorithms of the codes) we find good agreement in the overall evolution of the two models. Furthermore, we find that in both models, large numbers of stellar-mass black holes (>1000) are retained for 12 Gyr. Thus, the very accurate direct N-body approach confirms recent predictions that black holes can be retained in present-day, old globular clusters. We find only minor disagreements between the two models and attribute these to the small-N dynamics driving the evolution of the cluster core for which the Monte Carlo assumptions are less ideal. Based on the overwhelming general agreement between the two models computed using these vastly different techniques, we conclude that our Monte Carlo approach, which is more approximate, but dramatically faster compared to the direct N-body, is capable of producing an accurate description of the long-term evolution of massive globular clusters even when the clusters contain large populations of stellar-mass black holes.
Blueprint for a microwave trapped ion quantum computer

PubMed Central

Lekitsch, Bjoern; Weidt, Sebastian; Fowler, Austin G.; Mølmer, Klaus; Devitt, Simon J.; Wunderlich, Christof; Hensinger, Winfried K.

2017-01-01

The availability of a universal quantum computer may have a fundamental impact on a vast number of research fields and on society as a whole. An increasingly large scientific and industrial community is working toward the realization of such a device. An arbitrarily large quantum computer may best be constructed using a modular approach. We present a blueprint for a trapped ion–based scalable quantum computer module, making it possible to create a scalable quantum computer architecture based on long-wavelength radiation quantum gates. The modules control all operations as stand-alone units, are constructed using silicon microfabrication techniques, and are within reach of current technology. To perform the required quantum computations, the modules make use of long-wavelength radiation–based quantum gate technology. To scale this microwave quantum computer architecture to a large size, we present a fully scalable design that makes use of ion transport between different modules, thereby allowing arbitrarily many modules to be connected to construct a large-scale device. A high error–threshold surface error correction code can be implemented in the proposed architecture to execute fault-tolerant operations. With appropriate adjustments, the proposed modules are also suitable for alternative trapped ion quantum computer architectures, such as schemes using photonic interconnects. PMID:28164154
Layer-oriented simulation tool.

PubMed

Arcidiacono, Carmelo; Diolaiti, Emiliano; Tordi, Massimiliano; Ragazzoni, Roberto; Farinato, Jacopo; Vernet, Elise; Marchetti, Enrico

2004-08-01

The Layer-Oriented Simulation Tool (LOST) is a numerical simulation code developed for analysis of the performance of multiconjugate adaptive optics modules following a layer-oriented approach. The LOST code computes the atmospheric layers in terms of phase screens and then propagates the phase delays introduced in the natural guide stars' wave fronts by using geometrical optics approximations. These wave fronts are combined in an optical or numerical way, including the effects of wave-front sensors on measurements in terms of phase noise. The LOST code is described, and two applications to layer-oriented modules are briefly presented. We have focus on the Multiconjugate adaptive optics demonstrator to be mounted upon the Very Large Telescope and on the Near-IR-Visible Adaptive Interferometer for Astronomy (NIRVANA) interferometric system to be installed on the combined focus of the Large Binocular Telescope.
Equilibrium Reconstruction on the Large Helical Device

DOE Office of Scientific and Technical Information (OSTI.GOV)

Samuel A. Lazerson, D. Gates, D. Monticello, H. Neilson, N. Pomphrey, A. Reiman S. Sakakibara, and Y. Suzuki

Equilibrium reconstruction is commonly applied to axisymmetric toroidal devices. Recent advances in computational power and equilibrium codes have allowed for reconstructions of three-dimensional fields in stellarators and heliotrons. We present the first reconstructions of finite beta discharges in the Large Helical Device (LHD). The plasma boundary and magnetic axis are constrained by the pressure profile from Thomson scattering. This results in a calculation of plasma beta without a-priori assumptions of the equipartition of energy between species. Saddle loop arrays place additional constraints on the equilibrium. These reconstruction utilize STELLOPT, which calls VMEC. The VMEC equilibrium code assumes good nested fluxmore » surfaces. Reconstructed magnetic fields are fed into the PIES code which relaxes this constraint allowing for the examination of the effect of islands and stochastic regions on the magnetic measurements.« less
Multi-level discriminative dictionary learning with application to large scale image classification.

PubMed

Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua

2015-10-01

The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.
Application of Fast Multipole Methods to the NASA Fast Scattering Code

NASA Technical Reports Server (NTRS)

Dunn, Mark H.; Tinetti, Ana F.

2008-01-01

The NASA Fast Scattering Code (FSC) is a versatile noise prediction program designed to conduct aeroacoustic noise reduction studies. The equivalent source method is used to solve an exterior Helmholtz boundary value problem with an impedance type boundary condition. The solution process in FSC v2.0 requires direct manipulation of a large, dense system of linear equations, limiting the applicability of the code to small scales and/or moderate excitation frequencies. Recent advances in the use of Fast Multipole Methods (FMM) for solving scattering problems, coupled with sparse linear algebra techniques, suggest that a substantial reduction in computer resource utilization over conventional solution approaches can be obtained. Implementation of the single level FMM (SLFMM) and a variant of the Conjugate Gradient Method (CGM) into the FSC is discussed in this paper. The culmination of this effort, FSC v3.0, was used to generate solutions for three configurations of interest. Benchmarking against previously obtained simulations indicate that a twenty-fold reduction in computational memory and up to a four-fold reduction in computer time have been achieved on a single processor.
Computation of the tip vortex flowfield for advanced aircraft propellers

NASA Technical Reports Server (NTRS)

Tsai, Tommy M.; Dejong, Frederick J.; Levy, Ralph

1988-01-01

The tip vortex flowfield plays a significant role in the performance of advanced aircraft propellers. The flowfield in the tip region is complex, three-dimensional and viscous with large secondary velocities. An analysis is presented using an approximate set of equations which contains the physics required by the tip vortex flowfield, but which does not require the resources of the full Navier-Stokes equations. A computer code was developed to predict the tip vortex flowfield of advanced aircraft propellers. A grid generation package was developed to allow specification of a variety of advanced aircraft propeller shapes. Calculations of the tip vortex generation on an SR3 type blade at high Reynolds numbers were made using this code and a parametric study was performed to show the effect of tip thickness on tip vortex intensity. In addition, calculations of the tip vortex generation on a NACA 0012 type blade were made, including the flowfield downstream of the blade trailing edge. Comparison of flowfield calculations with experimental data from an F4 blade was made. A user's manual was also prepared for the computer code (NASA CR-182178).
Extraordinarily Adaptive Properties of the Genetically Encoded Amino Acids

PubMed Central

Ilardo, Melissa; Meringer, Markus; Freeland, Stephen; Rasulev, Bakhtiyor; Cleaves II, H. James

2015-01-01

Using novel advances in computational chemistry, we demonstrate that the set of 20 genetically encoded amino acids, used nearly universally to construct all coded terrestrial proteins, has been highly influenced by natural selection. We defined an adaptive set of amino acids as one whose members thoroughly cover relevant physico-chemical properties, or “chemistry space.” Using this metric, we compared the encoded amino acid alphabet to random sets of amino acids. These random sets were drawn from a computationally generated compound library containing 1913 alternative amino acids that lie within the molecular weight range of the encoded amino acids. Sets that cover chemistry space better than the genetically encoded alphabet are extremely rare and energetically costly. Further analysis of more adaptive sets reveals common features and anomalies, and we explore their implications for synthetic biology. We present these computations as evidence that the set of 20 amino acids found within the standard genetic code is the result of considerable natural selection. The amino acids used for constructing coded proteins may represent a largely global optimum, such that any aqueous biochemistry would use a very similar set. PMID:25802223
Computer-based coding of free-text job descriptions to efficiently identify occupations in epidemiological studies

PubMed Central

Russ, Daniel E.; Ho, Kwan-Yuet; Colt, Joanne S.; Armenti, Karla R.; Baris, Dalsu; Chow, Wong-Ho; Davis, Faith; Johnson, Alison; Purdue, Mark P.; Karagas, Margaret R.; Schwartz, Kendra; Schwenn, Molly; Silverman, Debra T.; Johnson, Calvin A.; Friesen, Melissa C.

2016-01-01

Background Mapping job titles to standardized occupation classification (SOC) codes is an important step in identifying occupational risk factors in epidemiologic studies. Because manual coding is time-consuming and has moderate reliability, we developed an algorithm called SOCcer (Standardized Occupation Coding for Computer-assisted Epidemiologic Research) to assign SOC-2010 codes based on free-text job description components. Methods Job title and task-based classifiers were developed by comparing job descriptions to multiple sources linking job and task descriptions to SOC codes. An industry-based classifier was developed based on the SOC prevalence within an industry. These classifiers were used in a logistic model trained using 14,983 jobs with expert-assigned SOC codes to obtain empirical weights for an algorithm that scored each SOC/job description. We assigned the highest scoring SOC code to each job. SOCcer was validated in two occupational data sources by comparing SOC codes obtained from SOCcer to expert assigned SOC codes and lead exposure estimates obtained by linking SOC codes to a job-exposure matrix. Results For 11,991 case-control study jobs, SOCcer-assigned codes agreed with 44.5% and 76.3% of manually assigned codes at the 6- and 2-digit level, respectively. Agreement increased with the score, providing a mechanism to identify assignments needing review. Good agreement was observed between lead estimates based on SOCcer and manual SOC assignments (kappa: 0.6–0.8). Poorer performance was observed for inspection job descriptions, which included abbreviations and worksite-specific terminology. Conclusions Although some manual coding will remain necessary, using SOCcer may improve the efficiency of incorporating occupation into large-scale epidemiologic studies. PMID:27102331
GPU-Accelerated Large-Scale Electronic Structure Theory on Titan with a First-Principles All-Electron Code

NASA Astrophysics Data System (ADS)

Huhn, William Paul; Lange, Björn; Yu, Victor; Blum, Volker; Lee, Seyong; Yoon, Mina

Density-functional theory has been well established as the dominant quantum-mechanical computational method in the materials community. Large accurate simulations become very challenging on small to mid-scale computers and require high-performance compute platforms to succeed. GPU acceleration is one promising approach. In this talk, we present a first implementation of all-electron density-functional theory in the FHI-aims code for massively parallel GPU-based platforms. Special attention is paid to the update of the density and to the integration of the Hamiltonian and overlap matrices, realized in a domain decomposition scheme on non-uniform grids. The initial implementation scales well across nodes on ORNL's Titan Cray XK7 supercomputer (8 to 64 nodes, 16 MPI ranks/node) and shows an overall speed up in runtime due to utilization of the K20X Tesla GPUs on each Titan node of 1.4x, with the charge density update showing a speed up of 2x. Further acceleration opportunities will be discussed. Work supported by the LDRD Program of ORNL managed by UT-Battle, LLC, for the U.S. DOE and by the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.
Results of comparative RBMK neutron computation using VNIIEF codes (cell computation, 3D statics, 3D kinetics). Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grebennikov, A.N.; Zhitnik, A.K.; Zvenigorodskaya, O.A.

1995-12-31

In conformity with the protocol of the Workshop under Contract {open_quotes}Assessment of RBMK reactor safety using modern Western Codes{close_quotes} VNIIEF performed a neutronics computation series to compare western and VNIIEF codes and assess whether VNIIEF codes are suitable for RBMK type reactor safety assessment computation. The work was carried out in close collaboration with M.I. Rozhdestvensky and L.M. Podlazov, NIKIET employees. The effort involved: (1) cell computations with the WIMS, EKRAN codes (improved modification of the LOMA code) and the S-90 code (VNIIEF Monte Carlo). Cell, polycell, burnup computation; (2) 3D computation of static states with the KORAT-3D and NEUmore » codes and comparison with results of computation with the NESTLE code (USA). The computations were performed in the geometry and using the neutron constants presented by the American party; (3) 3D computation of neutron kinetics with the KORAT-3D and NEU codes. These computations were performed in two formulations, both being developed in collaboration with NIKIET. Formulation of the first problem maximally possibly agrees with one of NESTLE problems and imitates gas bubble travel through a core. The second problem is a model of the RBMK as a whole with imitation of control and protection system controls (CPS) movement in a core.« less
Users manual for program NYQUIST: Liquid rocket nyquist plots developed for use on a PC computer

NASA Astrophysics Data System (ADS)

Armstrong, Wilbur C.

1992-06-01

The piping in a liquid rocket can assume complex configurations due to multiple tanks, multiple engines, and structures that must be piped around. The capability to handle some of these complex configurations have been incorporated into the NYQUIST code. The capability to modify the input on line has been implemented. The configurations allowed include multiple tanks, multiple engines, and the splitting of a pipe into unequal segments going to different (or the same) engines. This program will handle the following type elements: straight pipes, bends, inline accumulators, tuned stub accumulators, Helmholtz resonators, parallel resonators, pumps, split pipes, multiple tanks, and multiple engines. The code is too large to compile as one program using Microsoft FORTRAN 5; therefore, the code was broken into two segments: NYQUIST1.FOR and NYQUIST2.FOR. These are compiled separately and then linked together. The final run code is not too large (approximately equals 344,000 bytes).
Quantitative Profiling of Peptides from RNAs classified as non-coding

PubMed Central

Prabakaran, Sudhakaran; Hemberg, Martin; Chauhan, Ruchi; Winter, Dominic; Tweedie-Cullen, Ry Y.; Dittrich, Christian; Hong, Elizabeth; Gunawardena, Jeremy; Steen, Hanno; Kreiman, Gabriel; Steen, Judith A.

2014-01-01

Only a small fraction of the mammalian genome codes for messenger RNAs destined to be translated into proteins, and it is generally assumed that a large portion of transcribed sequences - including introns and several classes of non-coding RNAs (ncRNAs) do not give rise to peptide products. A systematic examination of translation and physiological regulation of ncRNAs has not been conducted. Here, we use computational methods to identify the products of non-canonical translation in mouse neurons by analyzing unannotated transcripts in combination with proteomic data. This study supports the existence of non-canonical translation products from both intragenic and extragenic genomic regions, including peptides derived from anti-sense transcripts and introns. Moreover, the studied novel translation products exhibit temporal regulation similar to that of proteins known to be involved in neuronal activity processes. These observations highlight a potentially large and complex set of biologically regulated translational events from transcripts formerly thought to lack coding potential. PMID:25403355
Exshall: A Turkel-Zwas explicit large time-step FORTRAN program for solving the shallow-water equations in spherical coordinates

NASA Astrophysics Data System (ADS)

Navon, I. M.; Yu, Jian

A FORTRAN computer program is presented and documented applying the Turkel-Zwas explicit large time-step scheme to a hemispheric barotropic model with constraint restoration of integral invariants of the shallow-water equations. We then proceed to detail the algorithms embodied in the code EXSHALL in this paper, particularly algorithms related to the efficiency and stability of T-Z scheme and the quadratic constraint restoration method which is based on a variational approach. In particular we provide details about the high-latitude filtering, Shapiro filtering, and Robert filtering algorithms used in the code. We explain in detail the various subroutines in the EXSHALL code with emphasis on algorithms implemented in the code and present the flowcharts of some major subroutines. Finally, we provide a visual example illustrating a 4-day run using real initial data, along with a sample printout and graphic isoline contours of the height field and velocity fields.
Users manual for program NYQUIST: Liquid rocket nyquist plots developed for use on a PC computer

NASA Technical Reports Server (NTRS)

Armstrong, Wilbur C.

1992-01-01

The piping in a liquid rocket can assume complex configurations due to multiple tanks, multiple engines, and structures that must be piped around. The capability to handle some of these complex configurations have been incorporated into the NYQUIST code. The capability to modify the input on line has been implemented. The configurations allowed include multiple tanks, multiple engines, and the splitting of a pipe into unequal segments going to different (or the same) engines. This program will handle the following type elements: straight pipes, bends, inline accumulators, tuned stub accumulators, Helmholtz resonators, parallel resonators, pumps, split pipes, multiple tanks, and multiple engines. The code is too large to compile as one program using Microsoft FORTRAN 5; therefore, the code was broken into two segments: NYQUIST1.FOR and NYQUIST2.FOR. These are compiled separately and then linked together. The final run code is not too large (approximately equals 344,000 bytes).
mGrid: A load-balanced distributed computing environment for the remote execution of the user-defined Matlab code

PubMed Central

Karpievitch, Yuliya V; Almeida, Jonas S

2006-01-01

Background Matlab, a powerful and productive language that allows for rapid prototyping, modeling and simulation, is widely used in computational biology. Modeling and simulation of large biological systems often require more computational resources then are available on a single computer. Existing distributed computing environments like the Distributed Computing Toolbox, MatlabMPI, Matlab*G and others allow for the remote (and possibly parallel) execution of Matlab commands with varying support for features like an easy-to-use application programming interface, load-balanced utilization of resources, extensibility over the wide area network, and minimal system administration skill requirements. However, all of these environments require some level of access to participating machines to manually distribute the user-defined libraries that the remote call may invoke. Results mGrid augments the usual process distribution seen in other similar distributed systems by adding facilities for user code distribution. mGrid's client-side interface is an easy-to-use native Matlab toolbox that transparently executes user-defined code on remote machines (i.e. the user is unaware that the code is executing somewhere else). Run-time variables are automatically packed and distributed with the user-defined code and automated load-balancing of remote resources enables smooth concurrent execution. mGrid is an open source environment. Apart from the programming language itself, all other components are also open source, freely available tools: light-weight PHP scripts and the Apache web server. Conclusion Transparent, load-balanced distribution of user-defined Matlab toolboxes and rapid prototyping of many simple parallel applications can now be done with a single easy-to-use Matlab command. Because mGrid utilizes only Matlab, light-weight PHP scripts and the Apache web server, installation and configuration are very simple. Moreover, the web-based infrastructure of mGrid allows for it to be easily extensible over the Internet. PMID:16539707

mGrid: a load-balanced distributed computing environment for the remote execution of the user-defined Matlab code.

PubMed

Karpievitch, Yuliya V; Almeida, Jonas S

2006-03-15

Matlab, a powerful and productive language that allows for rapid prototyping, modeling and simulation, is widely used in computational biology. Modeling and simulation of large biological systems often require more computational resources then are available on a single computer. Existing distributed computing environments like the Distributed Computing Toolbox, MatlabMPI, Matlab*G and others allow for the remote (and possibly parallel) execution of Matlab commands with varying support for features like an easy-to-use application programming interface, load-balanced utilization of resources, extensibility over the wide area network, and minimal system administration skill requirements. However, all of these environments require some level of access to participating machines to manually distribute the user-defined libraries that the remote call may invoke. mGrid augments the usual process distribution seen in other similar distributed systems by adding facilities for user code distribution. mGrid's client-side interface is an easy-to-use native Matlab toolbox that transparently executes user-defined code on remote machines (i.e. the user is unaware that the code is executing somewhere else). Run-time variables are automatically packed and distributed with the user-defined code and automated load-balancing of remote resources enables smooth concurrent execution. mGrid is an open source environment. Apart from the programming language itself, all other components are also open source, freely available tools: light-weight PHP scripts and the Apache web server. Transparent, load-balanced distribution of user-defined Matlab toolboxes and rapid prototyping of many simple parallel applications can now be done with a single easy-to-use Matlab command. Because mGrid utilizes only Matlab, light-weight PHP scripts and the Apache web server, installation and configuration are very simple. Moreover, the web-based infrastructure of mGrid allows for it to be easily extensible over the Internet.
Code Modernization of VPIC

NASA Astrophysics Data System (ADS)

Bird, Robert; Nystrom, David; Albright, Brian

2017-10-01

The ability of scientific simulations to effectively deliver performant computation is increasingly being challenged by successive generations of high-performance computing architectures. Code development to support efficient computation on these modern architectures is both expensive, and highly complex; if it is approached without due care, it may also not be directly transferable between subsequent hardware generations. Previous works have discussed techniques to support the process of adapting a legacy code for modern hardware generations, but despite the breakthroughs in the areas of mini-app development, portable-performance, and cache oblivious algorithms the problem still remains largely unsolved. In this work we demonstrate how a focus on platform agnostic modern code-development can be applied to Particle-in-Cell (PIC) simulations to facilitate effective scientific delivery. This work builds directly on our previous work optimizing VPIC, in which we replaced intrinsic based vectorisation with compile generated auto-vectorization to improve the performance and portability of VPIC. In this work we present the use of a specialized SIMD queue for processing some particle operations, and also preview a GPU capable OpenMP variant of VPIC. Finally we include a lessons learnt. Work performed under the auspices of the U.S. Dept. of Energy by the Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by the LANL LDRD program.
Summary Report of Working Group 2: Computation

NASA Astrophysics Data System (ADS)

Stoltz, P. H.; Tsung, R. S.

2009-01-01

The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time orders of magnitude, and (ii) new hardware architectures, like graphics processing units and Cell processors that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many order-of-magnitude speedup of, and details of porting the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.
Summary Report of Working Group 2: Computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stoltz, P. H.; Tsung, R. S.

2009-01-22

The working group on computation addressed three physics areas: (i) plasma-based accelerators (laser-driven and beam-driven), (ii) high gradient structure-based accelerators, and (iii) electron beam sources and transport [1]. Highlights of the talks in these areas included new models of breakdown on the microscopic scale, new three-dimensional multipacting calculations with both finite difference and finite element codes, and detailed comparisons of new electron gun models with standard models such as PARMELA. The group also addressed two areas of advances in computation: (i) new algorithms, including simulation in a Lorentz-boosted frame that can reduce computation time orders of magnitude, and (ii) newmore » hardware architectures, like graphics processing units and Cell processors that promise dramatic increases in computing power. Highlights of the talks in these areas included results from the first large-scale parallel finite element particle-in-cell code (PIC), many order-of-magnitude speedup of, and details of porting the VPIC code to the Roadrunner supercomputer. The working group featured two plenary talks, one by Brian Albright of Los Alamos National Laboratory on the performance of the VPIC code on the Roadrunner supercomputer, and one by David Bruhwiler of Tech-X Corporation on recent advances in computation for advanced accelerators. Highlights of the talk by Albright included the first one trillion particle simulations, a sustained performance of 0.3 petaflops, and an eight times speedup of science calculations, including back-scatter in laser-plasma interaction. Highlights of the talk by Bruhwiler included simulations of 10 GeV accelerator laser wakefield stages including external injection, new developments in electromagnetic simulations of electron guns using finite difference and finite element approaches.« less
Computationally efficient methods for modelling laser wakefield acceleration in the blowout regime

NASA Astrophysics Data System (ADS)

Cowan, B. M.; Kalmykov, S. Y.; Beck, A.; Davoine, X.; Bunkers, K.; Lifschitz, A. F.; Lefebvre, E.; Bruhwiler, D. L.; Shadwick, B. A.; Umstadter, D. P.; Umstadter

2012-08-01

Electron self-injection and acceleration until dephasing in the blowout regime is studied for a set of initial conditions typical of recent experiments with 100-terawatt-class lasers. Two different approaches to computationally efficient, fully explicit, 3D particle-in-cell modelling are examined. First, the Cartesian code vorpal (Nieter, C. and Cary, J. R. 2004 VORPAL: a versatile plasma simulation code. J. Comput. Phys. 196, 538) using a perfect-dispersion electromagnetic solver precisely describes the laser pulse and bubble dynamics, taking advantage of coarser resolution in the propagation direction, with a proportionally larger time step. Using third-order splines for macroparticles helps suppress the sampling noise while keeping the usage of computational resources modest. The second way to reduce the simulation load is using reduced-geometry codes. In our case, the quasi-cylindrical code calder-circ (Lifschitz, A. F. et al. 2009 Particle-in-cell modelling of laser-plasma interaction using Fourier decomposition. J. Comput. Phys. 228(5), 1803-1814) uses decomposition of fields and currents into a set of poloidal modes, while the macroparticles move in the Cartesian 3D space. Cylindrical symmetry of the interaction allows using just two modes, reducing the computational load to roughly that of a planar Cartesian simulation while preserving the 3D nature of the interaction. This significant economy of resources allows using fine resolution in the direction of propagation and a small time step, making numerical dispersion vanishingly small, together with a large number of particles per cell, enabling good particle statistics. Quantitative agreement of two simulations indicates that these are free of numerical artefacts. Both approaches thus retrieve the physically correct evolution of the plasma bubble, recovering the intrinsic connection of electron self-injection to the nonlinear optical evolution of the driver.
Numerical predictions of EML (electromagnetic launcher) system performance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schnurr, N.M.; Kerrisk, J.F.; Davidson, R.F.

1987-01-01

The performance of an electromagnetic launcher (EML) depends on a large number of parameters, including the characteristics of the power supply, rail geometry, rail and insulator material properties, injection velocity, and projectile mass. EML system performance is frequently limited by structural or thermal effects in the launcher (railgun). A series of computer codes has been developed at the Los Alamos National Laboratory to predict EML system performance and to determine the structural and thermal constraints on barrel design. These codes include FLD, a two-dimensional electrostatic code used to calculate the high-frequency inductance gradient and surface current density distribution for themore » rails; TOPAZRG, a two-dimensional finite-element code that simultaneously analyzes thermal and electromagnetic diffusion in the rails; and LARGE, a code that predicts the performance of the entire EML system. Trhe NIKE2D code, developed at the Lawrence Livermore National Laboratory, is used to perform structural analyses of the rails. These codes have been instrumental in the design of the Lethality Test System (LTS) at Los Alamos, which has an ultimate goal of accelerating a 30-g projectile to a velocity of 15 km/s. The capabilities of the individual codes and the coupling of these codes to perform a comprehensive analysis is discussed in relation to the LTS design. Numerical predictions are compared with experimental data and presented for the LTS prototype tests.« less
Software for Brain Network Simulations: A Comparative Study

PubMed Central

Tikidji-Hamburyan, Ruben A.; Narayana, Vikram; Bozkus, Zeki; El-Ghazawi, Tarek A.

2017-01-01

Numerical simulations of brain networks are a critical part of our efforts in understanding brain functions under pathological and normal conditions. For several decades, the community has developed many software packages and simulators to accelerate research in computational neuroscience. In this article, we select the three most popular simulators, as determined by the number of models in the ModelDB database, such as NEURON, GENESIS, and BRIAN, and perform an independent evaluation of these simulators. In addition, we study NEST, one of the lead simulators of the Human Brain Project. First, we study them based on one of the most important characteristics, the range of supported models. Our investigation reveals that brain network simulators may be biased toward supporting a specific set of models. However, all simulators tend to expand the supported range of models by providing a universal environment for the computational study of individual neurons and brain networks. Next, our investigations on the characteristics of computational architecture and efficiency indicate that all simulators compile the most computationally intensive procedures into binary code, with the aim of maximizing their computational performance. However, not all simulators provide the simplest method for module development and/or guarantee efficient binary code. Third, a study of their amenability for high-performance computing reveals that NEST can almost transparently map an existing model on a cluster or multicore computer, while NEURON requires code modification if the model developed for a single computer has to be mapped on a computational cluster. Interestingly, parallelization is the weakest characteristic of BRIAN, which provides no support for cluster computations and limited support for multicore computers. Fourth, we identify the level of user support and frequency of usage for all simulators. Finally, we carry out an evaluation using two case studies: a large network with simplified neural and synaptic models and a small network with detailed models. These two case studies allow us to avoid any bias toward a particular software package. The results indicate that BRIAN provides the most concise language for both cases considered. Furthermore, as expected, NEST mostly favors large network models, while NEURON is better suited for detailed models. Overall, the case studies reinforce our general observation that simulators have a bias in the computational performance toward specific types of the brain network models. PMID:28775687
Noniterative MAP reconstruction using sparse matrix representations.

PubMed

Cao, Guangzhi; Bouman, Charles A; Webb, Kevin J

2009-09-01

We present a method for noniterative maximum a posteriori (MAP) tomographic reconstruction which is based on the use of sparse matrix representations. Our approach is to precompute and store the inverse matrix required for MAP reconstruction. This approach has generally not been used in the past because the inverse matrix is typically large and fully populated (i.e., not sparse). In order to overcome this problem, we introduce two new ideas. The first idea is a novel theory for the lossy source coding of matrix transformations which we refer to as matrix source coding. This theory is based on a distortion metric that reflects the distortions produced in the final matrix-vector product, rather than the distortions in the coded matrix itself. The resulting algorithms are shown to require orthonormal transformations of both the measurement data and the matrix rows and columns before quantization and coding. The second idea is a method for efficiently storing and computing the required orthonormal transformations, which we call a sparse-matrix transform (SMT). The SMT is a generalization of the classical FFT in that it uses butterflies to compute an orthonormal transform; but unlike an FFT, the SMT uses the butterflies in an irregular pattern, and is numerically designed to best approximate the desired transforms. We demonstrate the potential of the noniterative MAP reconstruction with examples from optical tomography. The method requires offline computation to encode the inverse transform. However, once these offline computations are completed, the noniterative MAP algorithm is shown to reduce both storage and computation by well over two orders of magnitude, as compared to a linear iterative reconstruction methods.
MCNP and GADRAS Comparisons

DOE Office of Scientific and Technical Information (OSTI.GOV)

Klasky, Marc Louis; Myers, Steven Charles; James, Michael R.

To facilitate the timely execution of System Threat Reviews (STRs) for DNDO, and also to develop a methodology for performing STRs, LANL performed comparisons of several radiation transport codes (MCNP, GADRAS, and Gamma-Designer) that have been previously utilized to compute radiation signatures. While each of these codes has strengths, it is of paramount interest to determine the limitations of each of the respective codes and also to identify the most time efficient means by which to produce computational results, given the large number of parametric cases that are anticipated in performing STR's. These comparisons serve to identify regions of applicabilitymore » for each code and provide estimates of uncertainty that may be anticipated. Furthermore, while performing these comparisons, examination of the sensitivity of the results to modeling assumptions was also examined. These investigations serve to enable the creation of the LANL methodology for performing STRs. Given the wide variety of radiation test sources, scenarios, and detectors, LANL calculated comparisons of the following parameters: decay data, multiplicity, device (n,γ) leakages, and radiation transport through representative scenes and shielding. This investigation was performed to understand potential limitations utilizing specific codes for different aspects of the STR challenges.« less
A project based on multi-configuration Dirac-Fock calculations for plasma spectroscopy

NASA Astrophysics Data System (ADS)

Comet, M.; Pain, J.-C.; Gilleron, F.; Piron, R.

2017-09-01

We present a project dedicated to hot plasma spectroscopy based on a Multi-Configuration Dirac-Fock (MCDF) code, initially developed by J. Bruneau. The code is briefly described and the use of the transition state method for plasma spectroscopy is detailed. Then an opacity code for local-thermodynamic-equilibrium plasmas using MCDF data, named OPAMCDF, is presented. Transition arrays for which the number of lines is too large to be handled in a Detailed Line Accounting (DLA) calculation can be modeled within the Partially Resolved Transition Array method or using the Unresolved Transition Arrays formalism in jj-coupling. An improvement of the original Partially Resolved Transition Array method is presented which gives a better agreement with DLA computations. Comparisons with some absorption and emission experimental spectra are shown. Finally, the capability of the MCDF code to compute atomic data required for collisional-radiative modeling of plasma at non local thermodynamic equilibrium is illustrated. In addition to photoexcitation, this code can be used to calculate photoionization, electron impact excitation and ionization cross-sections as well as autoionization rates in the Distorted-Wave or Close Coupling approximations. Comparisons with cross-sections and rates available in the literature are discussed.
Model verification of large structural systems

NASA Technical Reports Server (NTRS)

Lee, L. T.; Hasselman, T. K.

1977-01-01

A methodology was formulated, and a general computer code implemented for processing sinusoidal vibration test data to simultaneously make adjustments to a prior mathematical model of a large structural system, and resolve measured response data to obtain a set of orthogonal modes representative of the test model. The derivation of estimator equations is shown along with example problems. A method for improving the prior analytic model is included.
The Initial Atmospheric Transport (IAT) Code: Description and Validation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Morrow, Charles W.; Bartel, Timothy James

The Initial Atmospheric Transport (IAT) computer code was developed at Sandia National Laboratories as part of their nuclear launch accident consequences analysis suite of computer codes. The purpose of IAT is to predict the initial puff/plume rise resulting from either a solid rocket propellant or liquid rocket fuel fire. The code generates initial conditions for subsequent atmospheric transport calculations. The Initial Atmospheric Transfer (IAT) code has been compared to two data sets which are appropriate to the design space of space launch accident analyses. The primary model uncertainties are the entrainment coefficients for the extended Taylor model. The Titan 34Dmore » accident (1986) was used to calibrate these entrainment settings for a prototypic liquid propellant accident while the recent Johns Hopkins University Applied Physics Laboratory (JHU/APL, or simply APL) large propellant block tests (2012) were used to calibrate the entrainment settings for prototypic solid propellant accidents. North American Meteorology (NAM )formatted weather data profiles are used by IAT to determine the local buoyancy force balance. The IAT comparisons for the APL solid propellant tests illustrate the sensitivity of the plume elevation to the weather profiles; that is, the weather profile is a dominant factor in determining the plume elevation. The IAT code performed remarkably well and is considered validated for neutral weather conditions.« less
Development and application of the GIM code for the Cyber 203 computer

NASA Technical Reports Server (NTRS)

Stainaker, J. F.; Robinson, M. A.; Rawlinson, E. G.; Anderson, P. G.; Mayne, A. W.; Spradley, L. W.

1982-01-01

The GIM computer code for fluid dynamics research was developed. Enhancement of the computer code, implicit algorithm development, turbulence model implementation, chemistry model development, interactive input module coding and wing/body flowfield computation are described. The GIM quasi-parabolic code development was completed, and the code used to compute a number of example cases. Turbulence models, algebraic and differential equations, were added to the basic viscous code. An equilibrium reacting chemistry model and implicit finite difference scheme were also added. Development was completed on the interactive module for generating the input data for GIM. Solutions for inviscid hypersonic flow over a wing/body configuration are also presented.
Data engineering systems: Computerized modeling and data bank capabilities for engineering analysis

NASA Technical Reports Server (NTRS)

Kopp, H.; Trettau, R.; Zolotar, B.

1984-01-01

The Data Engineering System (DES) is a computer-based system that organizes technical data and provides automated mechanisms for storage, retrieval, and engineering analysis. The DES combines the benefits of a structured data base system with automated links to large-scale analysis codes. While the DES provides the user with many of the capabilities of a computer-aided design (CAD) system, the systems are actually quite different in several respects. A typical CAD system emphasizes interactive graphics capabilities and organizes data in a manner that optimizes these graphics. On the other hand, the DES is a computer-aided engineering system intended for the engineer who must operationally understand an existing or planned design or who desires to carry out additional technical analysis based on a particular design. The DES emphasizes data retrieval in a form that not only provides the engineer access to search and display the data but also links the data automatically with the computer analysis codes.
Parallel Computation of the Regional Ocean Modeling System (ROMS)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, P; Song, Y T; Chao, Y

2005-04-05

The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds ofmore » processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.« less
Well-characterized sequence features of eukaryote genomes and implications for ab initio gene prediction.

PubMed

Huang, Ying; Chen, Shi-Yi; Deng, Feilong

2016-01-01

In silico analysis of DNA sequences is an important area of computational biology in the post-genomic era. Over the past two decades, computational approaches for ab initio prediction of gene structure from genome sequence alone have largely facilitated our understanding on a variety of biological questions. Although the computational prediction of protein-coding genes has already been well-established, we are also facing challenges to robustly find the non-coding RNA genes, such as miRNA and lncRNA. Two main aspects of ab initio gene prediction include the computed values for describing sequence features and used algorithm for training the discriminant function, and by which different combinations are employed into various bioinformatic tools. Herein, we briefly review these well-characterized sequence features in eukaryote genomes and applications to ab initio gene prediction. The main purpose of this article is to provide an overview to beginners who aim to develop the related bioinformatic tools.
Performance analysis of parallel gravitational N-body codes on large GPU clusters

NASA Astrophysics Data System (ADS)

Huang, Si-Yi; Spurzem, Rainer; Berczik, Peter

2016-01-01

We compare the performance of two very different parallel gravitational N-body codes for astrophysical simulations on large Graphics Processing Unit (GPU) clusters, both of which are pioneers in their own fields as well as on certain mutual scales - NBODY6++ and Bonsai. We carry out benchmarks of the two codes by analyzing their performance, accuracy and efficiency through the modeling of structure decomposition and timing measurements. We find that both codes are heavily optimized to leverage the computational potential of GPUs as their performance has approached half of the maximum single precision performance of the underlying GPU cards. With such performance we predict that a speed-up of 200 - 300 can be achieved when up to 1k processors and GPUs are employed simultaneously. We discuss the quantitative information about comparisons of the two codes, finding that in the same cases Bonsai adopts larger time steps as well as larger relative energy errors than NBODY6++, typically ranging from 10 - 50 times larger, depending on the chosen parameters of the codes. Although the two codes are built for different astrophysical applications, in specified conditions they may overlap in performance at certain physical scales, thus allowing the user to choose either one by fine-tuning parameters accordingly.
High-speed inlet research program and supporting analysis

NASA Technical Reports Server (NTRS)

Coltrin, Robert E.

1990-01-01

The technology challenges faced by the high speed inlet designer are discussed by describing the considerations that went into the design of the Mach 5 research inlet. It is shown that the emerging three dimensional viscous computational fluid dynamics (CFD) flow codes, together with small scale experiments, can be used to guide larger scale full inlet systems research. Then, in turn, the results of the large scale research, if properly instrumented, can be used to validate or at least to calibrate the CFD codes.
Additional application of the NASCAP code. Volume 1: NASCAP extension

NASA Technical Reports Server (NTRS)

Katz, I.; Cassidy, J. J.; Mandell, M. J.; Parks, D. E.; Schnuelle, G. W.; Stannard, P. R.; Steen, P. G.

1981-01-01

The NASCAP computer program comprehensively analyzes problems of spacecraft charging. Using a fully three dimensional approach, it can accurately predict spacecraft potentials under a variety of conditions. Several changes were made to NASCAP, and a new code, NASCAP/LEO, was developed. In addition, detailed studies of several spacecraft-environmental interactions and of the SCATHA spacecraft were performed. The NASCAP/LEO program handles situations of relatively short Debye length encountered by large space structures or by any satellite in low earth orbit (LEO).
Metadata Management on the SCEC PetaSHA Project: Helping Users Describe, Discover, Understand, and Use Simulation Data in a Large-Scale Scientific Collaboration

NASA Astrophysics Data System (ADS)

Okaya, D.; Deelman, E.; Maechling, P.; Wong-Barnum, M.; Jordan, T. H.; Meyers, D.

2007-12-01

Large scientific collaborations, such as the SCEC Petascale Cyberfacility for Physics-based Seismic Hazard Analysis (PetaSHA) Project, involve interactions between many scientists who exchange ideas and research results. These groups must organize, manage, and make accessible their community materials of observational data, derivative (research) results, computational products, and community software. The integration of scientific workflows as a paradigm to solve complex computations provides advantages of efficiency, reliability, repeatability, choices, and ease of use. The underlying resource needed for a scientific workflow to function and create discoverable and exchangeable products is the construction, tracking, and preservation of metadata. In the scientific workflow environment there is a two-tier structure of metadata. Workflow-level metadata and provenance describe operational steps, identity of resources, execution status, and product locations and names. Domain-level metadata essentially define the scientific meaning of data, codes and products. To a large degree the metadata at these two levels are separate. However, between these two levels is a subset of metadata produced at one level but is needed by the other. This crossover metadata suggests that some commonality in metadata handling is needed. SCEC researchers are collaborating with computer scientists at SDSC, the USC Information Sciences Institute, and Carnegie Mellon Univ. in order to perform earthquake science using high-performance computational resources. A primary objective of the "PetaSHA" collaboration is to perform physics-based estimations of strong ground motion associated with real and hypothetical earthquakes located within Southern California. Construction of 3D earth models, earthquake representations, and numerical simulation of seismic waves are key components of these estimations. Scientific workflows are used to orchestrate the sequences of scientific tasks and to access distributed computational facilities such as the NSF TeraGrid. Different types of metadata are produced and captured within the scientific workflows. One workflow within PetaSHA ("Earthworks") performs a linear sequence of tasks with workflow and seismological metadata preserved. Downstream scientific codes ingest these metadata produced by upstream codes. The seismological metadata uses attribute-value pairing in plain text; an identified need is to use more advanced handling methods. Another workflow system within PetaSHA ("Cybershake") involves several complex workflows in order to perform statistical analysis of ground shaking due to thousands of hypothetical but plausible earthquakes. Metadata management has been challenging due to its construction around a number of legacy scientific codes. We describe difficulties arising in the scientific workflow due to the lack of this metadata and suggest corrective steps, which in some cases include the cultural shift of domain science programmers coding for metadata.

Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies

PubMed Central

Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang

2008-01-01

Background Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large scale GWAS. Results The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large scale GWAS and achieved excellent scalability for large scale analysis and portability for various parallel computing platforms. EPISNP is the serial computing program based on the EPISNPmpi code for epistasis testing in small scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large scale GWAS, and the epiSNP serial computing programs are convenient tools for epistasis analysis in small scale GWAS using commonly available computer hardware. PMID:18644146
Accelerating cardiac bidomain simulations using graphics processing units.

PubMed

Neic, A; Liebmann, M; Hoetzl, E; Mitchell, L; Vigmond, E J; Haase, G; Plank, G

2012-08-01

Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6-20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility.
Accelerating Cardiac Bidomain Simulations Using Graphics Processing Units

PubMed Central

Neic, Aurel; Liebmann, Manfred; Hoetzl, Elena; Mitchell, Lawrence; Vigmond, Edward J.; Haase, Gundolf

2013-01-01

Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6–20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20GPUs, 476 CPU cores were required on a national supercomputing facility. PMID:22692867
Photo-z-SQL: Integrated, flexible photometric redshift computation in a database

NASA Astrophysics Data System (ADS)

Beck, R.; Dobos, L.; Budavári, T.; Szalay, A. S.; Csabai, I.

2017-04-01

We present a flexible template-based photometric redshift estimation framework, implemented in C#, that can be seamlessly integrated into a SQL database (or DB) server and executed on-demand in SQL. The DB integration eliminates the need to move large photometric datasets outside a database for redshift estimation, and utilizes the computational capabilities of DB hardware. The code is able to perform both maximum likelihood and Bayesian estimation, and can handle inputs of variable photometric filter sets and corresponding broad-band magnitudes. It is possible to take into account the full covariance matrix between filters, and filter zero points can be empirically calibrated using measurements with given redshifts. The list of spectral templates and the prior can be specified flexibly, and the expensive synthetic magnitude computations are done via lazy evaluation, coupled with a caching of results. Parallel execution is fully supported. For large upcoming photometric surveys such as the LSST, the ability to perform in-place photo-z calculation would be a significant advantage. Also, the efficient handling of variable filter sets is a necessity for heterogeneous databases, for example the Hubble Source Catalog, and for cross-match services such as SkyQuery. We illustrate the performance of our code on two reference photo-z estimation testing datasets, and provide an analysis of execution time and scalability with respect to different configurations. The code is available for download at https://github.com/beckrob/Photo-z-SQL.
Xyce™ Parallel Electronic Simulator Users' Guide, Version 6.5.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Aadithya, Karthik V.; Mei, Ting

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows one to developmore » new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation- aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase -- a message passing parallel implementation -- which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The information herein is subject to change without notice. Copyright © 2002-2016 Sandia Corporation. All rights reserved.« less
Extreme-Scale De Novo Genome Assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMER, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and themore » large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.« less
Translating an AI application from Lisp to Ada: A case study

NASA Technical Reports Server (NTRS)

Davis, Gloria J.

1991-01-01

A set of benchmarks was developed to test the performance of a newly designed computer executing both Lisp and Ada. Among these was AutoClassII -- a large Artificial Intelligence (AI) application written in Common Lisp. The extraction of a representative subset of this complex application was aided by a Lisp Code Analyzer (LCA). The LCA enabled rapid analysis of the code, putting it in a concise and functionally readable form. An equivalent benchmark was created in Ada through manual translation of the Lisp version. A comparison of the execution results of both programs across a variety of compiler-machine combinations indicate that line-by-line translation coupled with analysis of the initial code can produce relatively efficient and reusable target code.
Validation of CFD Codes for Parawing Geometries in Subsonic to Supersonic Flows

NASA Technical Reports Server (NTRS)

Cruz-Ayoroa, Juan G.; Garcia, Joseph A.; Melton, John E.

2014-01-01

Computational Fluid Dynamic studies of a rigid parawing at Mach numbers from 0.8 to 4.65 were carried out using three established inviscid, viscous and independent panel method codes. Pressure distributions along four chordwise sections of the wing were compared to experimental wind tunnel data gathered from NASA technical reports. Results show good prediction of the overall trends and magnitudes of the pressure distributions for the inviscid and viscous solvers. Pressure results for the panel method code diverge from test data at large angles of attack due to shock interaction phenomena. Trends in the flow behavior and their effect on the integrated force and moments on this type of wing are examined in detail using the inviscid CFD code results.
Radar transponder antenna pattern analysis for the space shuttle

NASA Technical Reports Server (NTRS)

Radcliff, Roger

1989-01-01

In order to improve tracking capability, radar transponder antennas will soon be mounted on the Shuttle solid rocket boosters (SRB). These four antennas, each being identical cavity-backed helices operating at 5.765 GHz, will be mounted near the top of the SRB's, adjacent to the intertank portion of the external tank. The purpose is to calculate the roll-plane pattern (the plane perpendicular to the SRB axes and containing the antennas) in the presence of this complex electromagnetic environment. The large electrical size of this problem mandates an optical (asymptotic) approach. Development of a specific code for this application is beyond the scope of a summer fellowship; thus a general purpose code, the Numerical Electromagnetics Code - Basic Scattering Code, was chosen as the computational tool. This code is based on the modern Geometrical Theory of Diffraction, and allows computation of scattering of bodies composed of canonical problems such as plates and elliptic cylinders. Apertures mounted on a curved surface (the SRB) cannot be accomplished by the code, so an antenna model consisting of wires excited by a method of moments current input was devised that approximated the actual performance of the antennas. The improvised antenna model matched well with measurements taken at the MSFC range. The SRB's, the external tank, and the shuttle nose were modeled as circular cylinders, and the code was able to produce what is thought to be a reasonable roll-plane pattern.
Breaking the Supermassive Black Hole Speed Limit

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smidt, Joseph

A new computer simulation helps explain the existence of puzzling supermassive black holes observed in the early universe. The simulation is based on a computer code used to understand the coupling of radiation and certain materials. “Supermassive black holes have a speed limit that governs how fast and how large they can grow,” said Joseph Smidt of the Theoretical Design Division at Los Alamos National Laboratory. “The relatively recent discovery of supermassive black holes in the early development of the universe raised a fundamental question, how did they get so big so fast?” Using computer codes developed at Los Alamosmore » for modeling the interaction of matter and radiation related to the Lab’s stockpile stewardship mission, Smidt and colleagues created a simulation of collapsing stars that resulted in supermassive black holes forming in less time than expected, cosmologically speaking, in the first billion years of the universe.« less
JACOB: an enterprise framework for computational chemistry.

PubMed

Waller, Mark P; Dresselhaus, Thomas; Yang, Jack

2013-06-15

Here, we present just a collection of beans (JACOB): an integrated batch-based framework designed for the rapid development of computational chemistry applications. The framework expedites developer productivity by handling the generic infrastructure tier, and can be easily extended by user-specific scientific code. Paradigms from enterprise software engineering were rigorously applied to create a scalable, testable, secure, and robust framework. A centralized web application is used to configure and control the operation of the framework. The application-programming interface provides a set of generic tools for processing large-scale noninteractive jobs (e.g., systematic studies), or for coordinating systems integration (e.g., complex workflows). The code for the JACOB framework is open sourced and is available at: www.wallerlab.org/jacob. Copyright © 2013 Wiley Periodicals, Inc.
Remote control missile model test

NASA Technical Reports Server (NTRS)

Allen, Jerry M.; Shaw, David S.; Sawyer, Wallace C.

1989-01-01

An extremely large, systematic, axisymmetric body/tail fin data base was gathered through tests of an innovative missile model design which is described herein. These data were originally obtained for incorporation into a missile aerodynamics code based on engineering methods (Program MISSILE3), but can also be used as diagnostic test cases for developing computational methods because of the individual-fin data included in the data base. Detailed analysis of four sample cases from these data are presented to illustrate interesting individual-fin force and moment trends. These samples quantitatively show how bow shock, fin orientation, fin deflection, and body vortices can produce strong, unusual, and computationally challenging effects on individual fin loads. Comparisons between these data and calculations from the SWINT Euler code are also presented.
Survey of computer programs for heat transfer analysis

NASA Technical Reports Server (NTRS)

Noor, Ahmed K.

1986-01-01

An overview is given of the current capabilities of thirty-three computer programs that are used to solve heat transfer problems. The programs considered range from large general-purpose codes with broad spectrum of capabilities, large user community, and comprehensive user support (e.g., ABAQUS, ANSYS, EAL, MARC, MITAS II, MSC/NASTRAN, and SAMCEF) to the small, special-purpose codes with limited user community such as ANDES, NTEMP, TAC2D, TAC3D, TEPSA and TRUMP. The majority of the programs use either finite elements or finite differences for the spatial discretization. The capabilities of the programs are listed in tabular form followed by a summary of the major features of each program. The information presented herein is based on a questionnaire sent to the developers of each program. This information is preceded by a brief background material needed for effective evaluation and use of computer programs for heat transfer analysis. The present survey is useful in the initial selection of the programs which are most suitable for a particular application. The final selection of the program to be used should, however, be based on a detailed examination of the documentation and the literature about the program.
BlazeDEM3D-GPU A Large Scale DEM simulation code for GPUs

NASA Astrophysics Data System (ADS)

Govender, Nicolin; Wilke, Daniel; Pizette, Patrick; Khinast, Johannes

2017-06-01

Accurately predicting the dynamics of particulate materials is of importance to numerous scientific and industrial areas with applications ranging across particle scales from powder flow to ore crushing. Computational discrete element simulations is a viable option to aid in the understanding of particulate dynamics and design of devices such as mixers, silos and ball mills, as laboratory scale tests comes at a significant cost. However, the computational time required to simulate an industrial scale simulation which consists of tens of millions of particles can take months to complete on large CPU clusters, making the Discrete Element Method (DEM) unfeasible for industrial applications. Simulations are therefore typically restricted to tens of thousands of particles with highly detailed particle shapes or a few million of particles with often oversimplified particle shapes. However, a number of applications require accurate representation of the particle shape to capture the macroscopic behaviour of the particulate system. In this paper we give an overview of the recent extensions to the open source GPU based DEM code, BlazeDEM3D-GPU, that can simulate millions of polyhedra and tens of millions of spheres on a desktop computer with a single or multiple GPUs.
Capacity of a direct detection optical communication channel

NASA Technical Reports Server (NTRS)

Tan, H. H.

1980-01-01

The capacity of a free space optical channel using a direct detection receiver is derived under both peak and average signal power constraints and without a signal bandwidth constraint. The addition of instantaneous noiseless feedback from the receiver to the transmitter does not increase the channel capacity. In the absence of received background noise, an optimally coded PPM system is shown to achieve capacity in the limit as signal bandwidth approaches infinity. In the case of large peak to average signal power ratios, an interleaved coding scheme with PPM modulation is shown to have a computational cutoff rate far greater than ordinary coding schemes.
BALANCING THE LOAD: A VORONOI BASED SCHEME FOR PARALLEL COMPUTATIONS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Steinberg, Elad; Yalinewich, Almog; Sari, Re'em

2015-01-01

One of the key issues when running a simulation on multiple CPUs is maintaining a proper load balance throughout the run and minimizing communications between CPUs. We propose a novel method of utilizing a Voronoi diagram to achieve a nearly perfect load balance without the need of any global redistributions of data. As a show case, we implement our method in RICH, a two-dimensional moving mesh hydrodynamical code, but it can be extended trivially to other codes in two or three dimensions. Our tests show that this method is indeed efficient and can be used in a large variety ofmore » existing hydrodynamical codes.« less
Computation of Thermodynamic Equilibria Pertinent to Nuclear Materials in Multi-Physics Codes

NASA Astrophysics Data System (ADS)

Piro, Markus Hans Alexander

Nuclear energy plays a vital role in supporting electrical needs and fulfilling commitments to reduce greenhouse gas emissions. Research is a continuing necessity to improve the predictive capabilities of fuel behaviour in order to reduce costs and to meet increasingly stringent safety requirements by the regulator. Moreover, a renewed interest in nuclear energy has given rise to a "nuclear renaissance" and the necessity to design the next generation of reactors. In support of this goal, significant research efforts have been dedicated to the advancement of numerical modelling and computational tools in simulating various physical and chemical phenomena associated with nuclear fuel behaviour. This undertaking in effect is collecting the experience and observations of a past generation of nuclear engineers and scientists in a meaningful way for future design purposes. There is an increasing desire to integrate thermodynamic computations directly into multi-physics nuclear fuel performance and safety codes. A new equilibrium thermodynamic solver is being developed with this matter as a primary objective. This solver is intended to provide thermodynamic material properties and boundary conditions for continuum transport calculations. There are several concerns with the use of existing commercial thermodynamic codes: computational performance; limited capabilities in handling large multi-component systems of interest to the nuclear industry; convenient incorporation into other codes with quality assurance considerations; and, licensing entanglements associated with code distribution. The development of this software in this research is aimed at addressing all of these concerns. The approach taken in this work exploits fundamental principles of equilibrium thermodynamics to simplify the numerical optimization equations. In brief, the chemical potentials of all species and phases in the system are constrained by estimates of the chemical potentials of the system components at each iterative step, and the objective is to minimize the residuals of the mass balance equations. Several numerical advantages are achieved through this simplification. In particular, computational expense is reduced and the rate of convergence is enhanced. Furthermore, the software has demonstrated the ability to solve systems involving as many as 118 component elements. An early version of the code has already been integrated into the Advanced Multi-Physics (AMP) code under development by the Oak Ridge National Laboratory, Los Alamos National Laboratory, Idaho National Laboratory and Argonne National Laboratory. Keywords: Engineering, Nuclear -- 0552, Engineering, Material Science -- 0794, Chemistry, Mathematics -- 0405, Computer Science -- 0984
Exclusively visual analysis of classroom group interactions

NASA Astrophysics Data System (ADS)

Tucker, Laura; Scherr, Rachel E.; Zickler, Todd; Mazur, Eric

2016-12-01

Large-scale audiovisual data that measure group learning are time consuming to collect and analyze. As an initial step towards scaling qualitative classroom observation, we qualitatively coded classroom video using an established coding scheme with and without its audio cues. We find that interrater reliability is as high when using visual data only—without audio—as when using both visual and audio data to code. Also, interrater reliability is high when comparing use of visual and audio data to visual-only data. We see a small bias to code interactions as group discussion when visual and audio data are used compared with video-only data. This work establishes that meaningful educational observation can be made through visual information alone. Further, it suggests that after initial work to create a coding scheme and validate it in each environment, computer-automated visual coding could drastically increase the breadth of qualitative studies and allow for meaningful educational analysis on a far greater scale.
Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations

NASA Astrophysics Data System (ADS)

Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.

2016-07-01

Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.
GPU-computing in econophysics and statistical physics

NASA Astrophysics Data System (ADS)

Preis, T.

2011-03-01

A recent trend in computer science and related fields is general purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today's GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction into the field of GPU computing and includes examples. In particular computationally expensive analyses employed in financial market context are coded on a graphics card architecture which leads to a significant reduction of computing time. In order to demonstrate the wide range of possible applications, a standard model in statistical physics - the Ising model - is ported to a graphics card architecture as well, resulting in large speedup values.

Parallel Computation of Unsteady Flows on a Network of Workstations

NASA Technical Reports Server (NTRS)

1997-01-01

Parallel computation of unsteady flows requires significant computational resources. The utilization of a network of workstations seems an efficient solution to the problem where large problems can be treated at a reasonable cost. This approach requires the solution of several problems: 1) the partitioning and distribution of the problem over a network of workstation, 2) efficient communication tools, 3) managing the system efficiently for a given problem. Of course, there is the question of the efficiency of any given numerical algorithm to such a computing system. NPARC code was chosen as a sample for the application. For the explicit version of the NPARC code both two- and three-dimensional problems were studied. Again both steady and unsteady problems were investigated. The issues studied as a part of the research program were: 1) how to distribute the data between the workstations, 2) how to compute and how to communicate at each node efficiently, 3) how to balance the load distribution. In the following, a summary of these activities is presented. Details of the work have been presented and published as referenced.
Development and evaluation of a Naïve Bayesian model for coding causation of workers’ compensation claims☆

PubMed Central

Bertke, S. J.; Meyers, A. R.; Wurzelbacher, S. J.; Bell, J.; Lampl, M. L.; Robins, D.

2015-01-01

Introduction Tracking and trending rates of injuries and illnesses classified as musculoskeletal disorders caused by ergonomic risk factors such as overexertion and repetitive motion (MSDs) and slips, trips, or falls (STFs) in different industry sectors is of high interest to many researchers. Unfortunately, identifying the cause of injuries and illnesses in large datasets such as workers’ compensation systems often requires reading and coding the free form accident text narrative for potentially millions of records. Method To alleviate the need for manual coding, this paper describes and evaluates a computer auto-coding algorithm that demonstrated the ability to code millions of claims quickly and accurately by learning from a set of previously manually coded claims. Conclusions The auto-coding program was able to code claims as a musculoskeletal disorders, STF or other with approximately 90% accuracy. Impact on industry The program developed and discussed in this paper provides an accurate and efficient method for identifying the causation of workers’ compensation claims as a STF or MSD in a large database based on the unstructured text narrative and resulting injury diagnoses. The program coded thousands of claims in minutes. The method described in this paper can be used by researchers and practitioners to relieve the manual burden of reading and identifying the causation of claims as a STF or MSD. Furthermore, the method can be easily generalized to code/classify other unstructured text narratives. PMID:23206504
PARAMESH: A Parallel Adaptive Mesh Refinement Community Toolkit

NASA Technical Reports Server (NTRS)

MacNeice, Peter; Olson, Kevin M.; Mobarry, Clark; deFainchtein, Rosalinda; Packer, Charles

1999-01-01

In this paper, we describe a community toolkit which is designed to provide parallel support with adaptive mesh capability for a large and important class of computational models, those using structured, logically cartesian meshes. The package of Fortran 90 subroutines, called PARAMESH, is designed to provide an application developer with an easy route to extend an existing serial code which uses a logically cartesian structured mesh into a parallel code with adaptive mesh refinement. Alternatively, in its simplest use, and with minimal effort, it can operate as a domain decomposition tool for users who want to parallelize their serial codes, but who do not wish to use adaptivity. The package can provide them with an incremental evolutionary path for their code, converting it first to uniformly refined parallel code, and then later if they so desire, adding adaptivity.
Enabling large-scale viscoelastic calculations via neural network acceleration

NASA Astrophysics Data System (ADS)

Robinson DeVries, P.; Thompson, T. B.; Meade, B. J.

2017-12-01

One of the most significant challenges involved in efforts to understand the effects of repeated earthquake cycle activity are the computational costs of large-scale viscoelastic earthquake cycle models. Deep artificial neural networks (ANNs) can be used to discover new, compact, and accurate computational representations of viscoelastic physics. Once found, these efficient ANN representations may replace computationally intensive viscoelastic codes and accelerate large-scale viscoelastic calculations by more than 50,000%. This magnitude of acceleration enables the modeling of geometrically complex faults over thousands of earthquake cycles across wider ranges of model parameters and at larger spatial and temporal scales than have been previously possible. Perhaps most interestingly from a scientific perspective, ANN representations of viscoelastic physics may lead to basic advances in the understanding of the underlying model phenomenology. We demonstrate the potential of artificial neural networks to illuminate fundamental physical insights with specific examples.
An Optimization Code for Nonlinear Transient Problems of a Large Scale Multidisciplinary Mathematical Model

NASA Astrophysics Data System (ADS)

Takasaki, Koichi

This paper presents a program for the multidisciplinary optimization and identification problem of the nonlinear model of large aerospace vehicle structures. The program constructs the global matrix of the dynamic system in the time direction by the p-version finite element method (pFEM), and the basic matrix for each pFEM node in the time direction is described by a sparse matrix similarly to the static finite element problem. The algorithm used by the program does not require the Hessian matrix of the objective function and so has low memory requirements. It also has a relatively low computational cost, and is suited to parallel computation. The program was integrated as a solver module of the multidisciplinary analysis system CUMuLOUS (Computational Utility for Multidisciplinary Large scale Optimization of Undense System) which is under development by the Aerospace Research and Development Directorate (ARD) of the Japan Aerospace Exploration Agency (JAXA).
GASPRNG: GPU accelerated scalable parallel random number generator library

NASA Astrophysics Data System (ADS)

Gao, Shuang; Peterson, Gregory D.

2013-04-01

Graphics processors represent a promising technology for accelerating computational science applications. Many computational science applications require fast and scalable random number generation with good statistical properties, so they use the Scalable Parallel Random Number Generators library (SPRNG). We present the GPU Accelerated SPRNG library (GASPRNG) to accelerate SPRNG in GPU-based high performance computing systems. GASPRNG includes code for a host CPU and CUDA code for execution on NVIDIA graphics processing units (GPUs) along with a programming interface to support various usage models for pseudorandom numbers and computational science applications executing on the CPU, GPU, or both. This paper describes the implementation approach used to produce high performance and also describes how to use the programming interface. The programming interface allows a user to be able to use GASPRNG the same way as SPRNG on traditional serial or parallel computers as well as to develop tightly coupled programs executing primarily on the GPU. We also describe how to install GASPRNG and use it. To help illustrate linking with GASPRNG, various demonstration codes are included for the different usage models. GASPRNG on a single GPU shows up to 280x speedup over SPRNG on a single CPU core and is able to scale for larger systems in the same manner as SPRNG. Because GASPRNG generates identical streams of pseudorandom numbers as SPRNG, users can be confident about the quality of GASPRNG for scalable computational science applications. Catalogue identifier: AEOI_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOI_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: UTK license. No. of lines in distributed program, including test data, etc.: 167900 No. of bytes in distributed program, including test data, etc.: 1422058 Distribution format: tar.gz Programming language: C and CUDA. Computer: Any PC or workstation with NVIDIA GPU (Tested on Fermi GTX480, Tesla C1060, Tesla M2070). Operating system: Linux with CUDA version 4.0 or later. Should also run on MacOS, Windows, or UNIX. Has the code been vectorized or parallelized?: Yes. Parallelized using MPI directives. RAM: 512 MB˜ 732 MB (main memory on host CPU, depending on the data type of random numbers.) / 512 MB (GPU global memory) Classification: 4.13, 6.5. Nature of problem: Many computational science applications are able to consume large numbers of random numbers. For example, Monte Carlo simulations are able to consume limitless random numbers for the computation as long as resources for the computing are supported. Moreover, parallel computational science applications require independent streams of random numbers to attain statistically significant results. The SPRNG library provides this capability, but at a significant computational cost. The GASPRNG library presented here accelerates the generators of independent streams of random numbers using graphical processing units (GPUs). Solution method: Multiple copies of random number generators in GPUs allow a computational science application to consume large numbers of random numbers from independent, parallel streams. GASPRNG is a random number generators library to allow a computational science application to employ multiple copies of random number generators to boost performance. Users can interface GASPRNG with software code executing on microprocessors and/or GPUs. Running time: The tests provided take a few minutes to run.
Satellite Image Mosaic Engine

NASA Technical Reports Server (NTRS)

Plesea, Lucian

2006-01-01

A computer program automatically builds large, full-resolution mosaics of multispectral images of Earth landmasses from images acquired by Landsat 7, complete with matching of colors and blending between adjacent scenes. While the code has been used extensively for Landsat, it could also be used for other data sources. A single mosaic of as many as 8,000 scenes, represented by more than 5 terabytes of data and the largest set produced in this work, demonstrated what the code could do to provide global coverage. The program first statistically analyzes input images to determine areas of coverage and data-value distributions. It then transforms the input images from their original universal transverse Mercator coordinates to other geographical coordinates, with scaling. It applies a first-order polynomial brightness correction to each band in each scene. It uses a data-mask image for selecting data and blending of input scenes. Under control by a user, the program can be made to operate on small parts of the output image space, with check-point and restart capabilities. The program runs on SGI IRIX computers. It is capable of parallel processing using shared-memory code, large memories, and tens of central processing units. It can retrieve input data and store output data at locations remote from the processors on which it is executed.
Effects of target fragmentation on evaluation of LET spectra from space radiations: implications for space radiation protection studies

NASA Technical Reports Server (NTRS)

Cucinotta, F. A.; Wilson, J. W.; Shinn, J. L.; Badavi, F. F.; Badhwar, G. D.

1996-01-01

We present calculations of linear energy transfer (LET) spectra in low earth orbit from galactic cosmic rays and trapped protons using the HZETRN/BRYNTRN computer code. The emphasis of our calculations is on the analysis of the effects of secondary nuclei produced through target fragmentation in the spacecraft shield or detectors. Recent improvements in the HZETRN/BRYNTRN radiation transport computer code are described. Calculations show that at large values of LET (> 100 keV/micrometer) the LET spectra seen in free space and low earth orbit (LEO) are dominated by target fragments and not the primary nuclei. Although the evaluation of microdosimetric spectra is not considered here, calculations of LET spectra support that the large lineal energy (y) events are dominated by the target fragments. Finally, we discuss the situation for interplanetary exposures to galactic cosmic rays and show that current radiation transport codes predict that in the region of high LET values the LET spectra at significant shield depths (> 10 g/cm2 of Al) is greatly modified by target fragments. These results suggest that studies of track structure and biological response of space radiation should place emphasis on short tracks of medium charge fragments produced in the human body by high energy protons and neutrons.
Applications of Derandomization Theory in Coding

NASA Astrophysics Data System (ADS)

Cheraghchi, Mahdi

2011-07-01

Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions. In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory. First, we consider the wiretap channel problem which involves a communication system in which an intruder can eavesdrop a limited portion of the transmissions, and construct efficient and information-theoretically optimal communication protocols for this model. Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives pass a certain threshold. Finally, we design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for construction of explicit capacity achieving codes. [This is a shortened version of the actual abstract in the thesis.
Implementation of a 3D mixing layer code on parallel computers

NASA Technical Reports Server (NTRS)

Roe, K.; Thakur, R.; Dang, T.; Bogucz, E.

1995-01-01

This paper summarizes our progress and experience in the development of a Computational-Fluid-Dynamics code on parallel computers to simulate three-dimensional spatially-developing mixing layers. In this initial study, the three-dimensional time-dependent Euler equations are solved using a finite-volume explicit time-marching algorithm. The code was first programmed in Fortran 77 for sequential computers. The code was then converted for use on parallel computers using the conventional message-passing technique, while we have not been able to compile the code with the present version of HPF compilers.
Efficiently modeling neural networks on massively parallel computers

NASA Technical Reports Server (NTRS)

Farber, Robert M.

1993-01-01

Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors)). We can efficiently model very large neural networks which have many neurons and interconnects and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent processor interprocessor communications. This paper will consider the simulation of only feed forward neural network although this method is extendable to recurrent networks.
Interface COMSOL-PHREEQC (iCP), an efficient numerical framework for the solution of coupled multiphysics and geochemistry

NASA Astrophysics Data System (ADS)

Nardi, Albert; Idiart, Andrés; Trinchero, Paolo; de Vries, Luis Manuel; Molinero, Jorge

2014-08-01

This paper presents the development, verification and application of an efficient interface, denoted as iCP, which couples two standalone simulation programs: the general purpose Finite Element framework COMSOL Multiphysics® and the geochemical simulator PHREEQC. The main goal of the interface is to maximize the synergies between the aforementioned codes, providing a numerical platform that can efficiently simulate a wide number of multiphysics problems coupled with geochemistry. iCP is written in Java and uses the IPhreeqc C++ dynamic library and the COMSOL Java-API. Given the large computational requirements of the aforementioned coupled models, special emphasis has been placed on numerical robustness and efficiency. To this end, the geochemical reactions are solved in parallel by balancing the computational load over multiple threads. First, a benchmark exercise is used to test the reliability of iCP regarding flow and reactive transport. Then, a large scale thermo-hydro-chemical (THC) problem is solved to show the code capabilities. The results of the verification exercise are successfully compared with those obtained using PHREEQC and the application case demonstrates the scalability of a large scale model, at least up to 32 threads.
Numerically modelling the large scale coronal magnetic field

NASA Astrophysics Data System (ADS)

Panja, Mayukh; Nandi, Dibyendu

2016-07-01

The solar corona spews out vast amounts of magnetized plasma into the heliosphere which has a direct impact on the Earth's magnetosphere. Thus it is important that we develop an understanding of the dynamics of the solar corona. With our present technology it has not been possible to generate 3D magnetic maps of the solar corona; this warrants the use of numerical simulations to study the coronal magnetic field. A very popular method of doing this, is to extrapolate the photospheric magnetic field using NLFF or PFSS codes. However the extrapolations at different time intervals are completely independent of each other and do not capture the temporal evolution of magnetic fields. On the other hand full MHD simulations of the global coronal field, apart from being computationally very expensive would be physically less transparent, owing to the large number of free parameters that are typically used in such codes. This brings us to the Magneto-frictional model which is relatively simpler and computationally more economic. We have developed a Magnetofrictional Model, in 3D spherical polar co-ordinates to study the large scale global coronal field. Here we present studies of changing connectivities between active regions, in response to photospheric motions.
Integration of a neuroimaging processing pipeline into a pan-canadian computing grid

NASA Astrophysics Data System (ADS)

Lavoie-Courchesne, S.; Rioux, P.; Chouinard-Decorte, F.; Sherif, T.; Rousseau, M.-E.; Das, S.; Adalat, R.; Doyon, J.; Craddock, C.; Margulies, D.; Chu, C.; Lyttelton, O.; Evans, A. C.; Bellec, P.

2012-02-01

The ethos of the neuroimaging field is quickly moving towards the open sharing of resources, including both imaging databases and processing tools. As a neuroimaging database represents a large volume of datasets and as neuroimaging processing pipelines are composed of heterogeneous, computationally intensive tools, such open sharing raises specific computational challenges. This motivates the design of novel dedicated computing infrastructures. This paper describes an interface between PSOM, a code-oriented pipeline development framework, and CBRAIN, a web-oriented platform for grid computing. This interface was used to integrate a PSOM-compliant pipeline for preprocessing of structural and functional magnetic resonance imaging into CBRAIN. We further tested the capacity of our infrastructure to handle a real large-scale project. A neuroimaging database including close to 1000 subjects was preprocessed using our interface and publicly released to help the participants of the ADHD-200 international competition. This successful experiment demonstrated that our integrated grid-computing platform is a powerful solution for high-throughput pipeline analysis in the field of neuroimaging.
Algorithm for solving the linear Cauchy problem for large systems of ordinary differential equations with the use of parallel computations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moryakov, A. V., E-mail: sailor@orc.ru

2016-12-15

An algorithm for solving the linear Cauchy problem for large systems of ordinary differential equations is presented. The algorithm for systems of first-order differential equations is implemented in the EDELWEISS code with the possibility of parallel computations on supercomputers employing the MPI (Message Passing Interface) standard for the data exchange between parallel processes. The solution is represented by a series of orthogonal polynomials on the interval [0, 1]. The algorithm is characterized by simplicity and the possibility to solve nonlinear problems with a correction of the operator in accordance with the solution obtained in the previous iterative process.
CELES: CUDA-accelerated simulation of electromagnetic scattering by large ensembles of spheres

NASA Astrophysics Data System (ADS)

Egel, Amos; Pattelli, Lorenzo; Mazzamuto, Giacomo; Wiersma, Diederik S.; Lemmer, Uli

2017-09-01

CELES is a freely available MATLAB toolbox to simulate light scattering by many spherical particles. Aiming at high computational performance, CELES leverages block-diagonal preconditioning, a lookup-table approach to evaluate costly functions and massively parallel execution on NVIDIA graphics processing units using the CUDA computing platform. The combination of these techniques allows to efficiently address large electrodynamic problems (>104 scatterers) on inexpensive consumer hardware. In this paper, we validate near- and far-field distributions against the well-established multi-sphere T-matrix (MSTM) code and discuss the convergence behavior for ensembles of different sizes, including an exemplary system comprising 105 particles.
Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bethel, Wes

2016-07-24

The primary challenge motivating this team’s work is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who are able to perform analysis only on a small fraction of the data they compute, resulting in the very real likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, an approach that is known as in situ processing. The idea in situ processing wasmore » not new at the time of the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by DOE science projects. In large, our objective was produce and enable use of production-quality in situ methods and infrastructure, at scale, on DOE HPC facilities, though we expected to have impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve that objective, we assembled a unique team of researchers consisting of representatives from DOE national laboratories, academia, and industry, and engaged in software technology R&D, as well as engaged in close partnerships with DOE science code teams, to produce software technologies that were shown to run effectively at scale on DOE HPC platforms.« less
Gibbs sampling on large lattice with GMRF

NASA Astrophysics Data System (ADS)

Marcotte, Denis; Allard, Denis

2018-02-01

Gibbs sampling is routinely used to sample truncated Gaussian distributions. These distributions naturally occur when associating latent Gaussian fields to category fields obtained by discrete simulation methods like multipoint, sequential indicator simulation and object-based simulation. The latent Gaussians are often used in data assimilation and history matching algorithms. When the Gibbs sampling is applied on a large lattice, the computing cost can become prohibitive. The usual practice of using local neighborhoods is unsatisfying as it can diverge and it does not reproduce exactly the desired covariance. A better approach is to use Gaussian Markov Random Fields (GMRF) which enables to compute the conditional distributions at any point without having to compute and invert the full covariance matrix. As the GMRF is locally defined, it allows simultaneous updating of all points that do not share neighbors (coding sets). We propose a new simultaneous Gibbs updating strategy on coding sets that can be efficiently computed by convolution and applied with an acceptance/rejection method in the truncated case. We study empirically the speed of convergence, the effect of choice of boundary conditions, of the correlation range and of GMRF smoothness. We show that the convergence is slower in the Gaussian case on the torus than for the finite case studied in the literature. However, in the truncated Gaussian case, we show that short scale correlation is quickly restored and the conditioning categories at each lattice point imprint the long scale correlation. Hence our approach enables to realistically apply Gibbs sampling on large 2D or 3D lattice with the desired GMRF covariance.
GenomicTools: a computational platform for developing high-throughput analytics in genomics.

PubMed

Tsirigos, Aristotelis; Haiminen, Niina; Bilal, Erhan; Utro, Filippo

2012-01-15

Recent advances in sequencing technology have resulted in the dramatic increase of sequencing data, which, in turn, requires efficient management of computational resources, such as computing time, memory requirements as well as prototyping of computational pipelines. We present GenomicTools, a flexible computational platform, comprising both a command-line set of tools and a C++ API, for the analysis and manipulation of high-throughput sequencing data such as DNA-seq, RNA-seq, ChIP-seq and MethylC-seq. GenomicTools implements a variety of mathematical operations between sets of genomic regions thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks ranging from pre-processing and quality control to meta-analyses. Additionally, the GenomicTools platform is designed to analyze large datasets of any size by minimizing memory requirements. In practical applications, where comparable, GenomicTools outperforms existing tools in terms of both time and memory usage. The GenomicTools platform (version 2.0.0) was implemented in C++. The source code, documentation, user manual, example datasets and scripts are available online at http://code.google.com/p/ibm-cbc-genomic-tools.
On the Large-Scaling Issues of Cloud-based Applications for Earth Science Dat

NASA Astrophysics Data System (ADS)

Hua, H.

2016-12-01

Next generation science data systems are needed to address the incoming flood of data from new missions such as NASA's SWOT and NISAR where its SAR data volumes and data throughput rates are order of magnitude larger than present day missions. Existing missions, such as OCO-2, may also require high turn-around time for processing different science scenarios where on-premise and even traditional HPC computing environments may not meet the high processing needs. Additionally, traditional means of procuring hardware on-premise are already limited due to facilities capacity constraints for these new missions. Experiences have shown that to embrace efficient cloud computing approaches for large-scale science data systems requires more than just moving existing code to cloud environments. At large cloud scales, we need to deal with scaling and cost issues. We present our experiences on deploying multiple instances of our hybrid-cloud computing science data system (HySDS) to support large-scale processing of Earth Science data products. We will explore optimization approaches to getting best performance out of hybrid-cloud computing as well as common issues that will arise when dealing with large-scale computing. Novel approaches were utilized to do processing on Amazon's spot market, which can potentially offer 75%-90% costs savings but with an unpredictable computing environment based on market forces.

A new look at the simultaneous analysis and design of structures

NASA Technical Reports Server (NTRS)

Striz, Alfred G.

1994-01-01

The minimum weight optimization of structural systems, subject to strength and displacement constraints as well as size side constraints, was investigated by the Simultaneous ANalysis and Design (SAND) approach. As an optimizer, the code NPSOL was used which is based on a sequential quadratic programming (SQP) algorithm. The structures were modeled by the finite element method. The finite element related input to NPSOL was automatically generated from the input decks of such standard FEM/optimization codes as NASTRAN or ASTROS, with the stiffness matrices, at present, extracted from the FEM code ANALYZE. In order to avoid ill-conditioned matrices that can be encountered when the global stiffness equations are used as additional nonlinear equality constraints in the SAND approach (with the displacements as additional variables), the matrix displacement method was applied. In this approach, the element stiffness equations are used as constraints instead of the global stiffness equations, in conjunction with the nodal force equilibrium equations. This approach adds the element forces as variables to the system. Since, for complex structures and the associated large and very sparce matrices, the execution times of the optimization code became excessive due to the large number of required constraint gradient evaluations, the Kreisselmeier-Steinhauser function approach was used to decrease the computational effort by reducing the nonlinear equality constraint system to essentially a single combined constraint equation. As the linear equality and inequality constraints require much less computational effort to evaluate, they were kept in their previous form to limit the complexity of the KS function evaluation. To date, the standard three-bar, ten-bar, and 72-bar trusses have been tested. For the standard SAND approach, correct results were obtained for all three trusses although convergence became slower for the 72-bar truss. When the matrix displacement method was used, correct results were still obtained, but the execution times became excessive due to the large number of constraint gradient evaluations required. Using the KS function, the computational effort dropped, but the optimization seemed to become less robust. The investigation of this phenomenon is continuing. As an alternate approach, the code MINOS for the optimization of sparse matrices can be applied to the problem in lieu of the Kreisselmeier-Steinhauser function. This investigation is underway.
DUKSUP: A Computer Program for High Thrust Launch Vehicle Trajectory Design and Optimization

NASA Technical Reports Server (NTRS)

Williams, C. H.; Spurlock, O. F.

2014-01-01

From the late 1960's through 1997, the leadership of NASA's Intermediate and Large class unmanned expendable launch vehicle projects resided at the NASA Lewis (now Glenn) Research Center (LeRC). One of LeRC's primary responsibilities --- trajectory design and performance analysis --- was accomplished by an internally-developed analytic three dimensional computer program called DUKSUP. Because of its Calculus of Variations-based optimization routine, this code was generally more capable of finding optimal solutions than its contemporaries. A derivation of optimal control using the Calculus of Variations is summarized including transversality, intermediate, and final conditions. The two point boundary value problem is explained. A brief summary of the code's operation is provided, including iteration via the Newton-Raphson scheme and integration of variational and motion equations via a 4th order Runge-Kutta scheme. Main subroutines are discussed. The history of the LeRC trajectory design efforts in the early 1960's is explained within the context of supporting the Centaur upper stage program. How the code was constructed based on the operation of the Atlas/Centaur launch vehicle, the limits of the computers of that era, the limits of the computer programming languages, and the missions it supported are discussed. The vehicles DUKSUP supported (Atlas/Centaur, Titan/Centaur, and Shuttle/Centaur) are briefly described. The types of missions, including Earth orbital and interplanetary, are described. The roles of flight constraints and their impact on launch operations are detailed (such as jettisoning hardware on heating, Range Safety, ground station tracking, and elliptical parking orbits). The computer main frames on which the code was hosted are described. The applications of the code are detailed, including independent check of contractor analysis, benchmarking, leading edge analysis, and vehicle performance improvement assessments. Several of DUKSUP's many major impacts on launches are discussed including Intelsat, Voyager, Pioneer Venus, HEAO, Galileo, and Cassini.
DUKSUP: A Computer Program for High Thrust Launch Vehicle Trajectory Design and Optimization

NASA Technical Reports Server (NTRS)

Spurlock, O. Frank; Williams, Craig H.

2015-01-01

From the late 1960s through 1997, the leadership of NASAs Intermediate and Large class unmanned expendable launch vehicle projects resided at the NASA Lewis (now Glenn) Research Center (LeRC). One of LeRCs primary responsibilities --- trajectory design and performance analysis --- was accomplished by an internally-developed analytic three dimensional computer program called DUKSUP. Because of its Calculus of Variations-based optimization routine, this code was generally more capable of finding optimal solutions than its contemporaries. A derivation of optimal control using the Calculus of Variations is summarized including transversality, intermediate, and final conditions. The two point boundary value problem is explained. A brief summary of the codes operation is provided, including iteration via the Newton-Raphson scheme and integration of variational and motion equations via a 4th order Runge-Kutta scheme. Main subroutines are discussed. The history of the LeRC trajectory design efforts in the early 1960s is explained within the context of supporting the Centaur upper stage program. How the code was constructed based on the operation of the AtlasCentaur launch vehicle, the limits of the computers of that era, the limits of the computer programming languages, and the missions it supported are discussed. The vehicles DUKSUP supported (AtlasCentaur, TitanCentaur, and ShuttleCentaur) are briefly described. The types of missions, including Earth orbital and interplanetary, are described. The roles of flight constraints and their impact on launch operations are detailed (such as jettisoning hardware on heating, Range Safety, ground station tracking, and elliptical parking orbits). The computer main frames on which the code was hosted are described. The applications of the code are detailed, including independent check of contractor analysis, benchmarking, leading edge analysis, and vehicle performance improvement assessments. Several of DUKSUPs many major impacts on launches are discussed including Intelsat, Voyager, Pioneer Venus, HEAO, Galileo, and Cassini.
Chaste: An Open Source C++ Library for Computational Physiology and Biology

PubMed Central

Mirams, Gary R.; Arthurs, Christopher J.; Bernabeu, Miguel O.; Bordas, Rafel; Cooper, Jonathan; Corrias, Alberto; Davit, Yohan; Dunn, Sara-Jane; Fletcher, Alexander G.; Harvey, Daniel G.; Marsh, Megan E.; Osborne, James M.; Pathmanathan, Pras; Pitt-Francis, Joe; Southern, James; Zemzemi, Nejib; Gavaghan, David J.

2013-01-01

Chaste — Cancer, Heart And Soft Tissue Environment — is an open source C++ library for the computational simulation of mathematical models developed for physiology and biology. Code development has been driven by two initial applications: cardiac electrophysiology and cancer development. A large number of cardiac electrophysiology studies have been enabled and performed, including high-performance computational investigations of defibrillation on realistic human cardiac geometries. New models for the initiation and growth of tumours have been developed. In particular, cell-based simulations have provided novel insight into the role of stem cells in the colorectal crypt. Chaste is constantly evolving and is now being applied to a far wider range of problems. The code provides modules for handling common scientific computing components, such as meshes and solvers for ordinary and partial differential equations (ODEs/PDEs). Re-use of these components avoids the need for researchers to ‘re-invent the wheel’ with each new project, accelerating the rate of progress in new applications. Chaste is developed using industrially-derived techniques, in particular test-driven development, to ensure code quality, re-use and reliability. In this article we provide examples that illustrate the types of problems Chaste can be used to solve, which can be run on a desktop computer. We highlight some scientific studies that have used or are using Chaste, and the insights they have provided. The source code, both for specific releases and the development version, is available to download under an open source Berkeley Software Distribution (BSD) licence at http://www.cs.ox.ac.uk/chaste, together with details of a mailing list and links to documentation and tutorials. PMID:23516352
Combining Language Corpora with Experimental and Computational Approaches for Language Acquisition Research

ERIC Educational Resources Information Center

Monaghan, Padraic; Rowland, Caroline F.

2017-01-01

Historically, first language acquisition research was a painstaking process of observation, requiring the laborious hand coding of children's linguistic productions, followed by the generation of abstract theoretical proposals for how the developmental process unfolds. Recently, the ability to collect large-scale corpora of children's language…
Twelfth NASTRAN (R) Users' Colloquium

NASA Technical Reports Server (NTRS)

1984-01-01

NASTRAN is a large, comprehensive, nonproprietary, general purpose finite element computer code for structural analysis. The Twelfth Users' Colloquim provides some comprehensive papers on the application of finite element methods in engineering, comparisons with other approaches, unique applications, pre and post processing or auxiliary programs, and new methods of analysis with NASTRAN.
A comparison of skyshine computational methods.

PubMed

Hertel, Nolan E; Sweezy, Jeremy E; Shultis, J Kenneth; Warkentin, J Karl; Rose, Zachary J

2005-01-01

A variety of methods employing radiation transport and point-kernel codes have been used to model two skyshine problems. The first problem is a 1 MeV point source of photons on the surface of the earth inside a 2 m tall and 1 m radius silo having black walls. The skyshine radiation downfield from the point source was estimated with and without a 30-cm-thick concrete lid on the silo. The second benchmark problem is to estimate the skyshine radiation downfield from 12 cylindrical canisters emplaced in a low-level radioactive waste trench. The canisters are filled with ion-exchange resin with a representative radionuclide loading, largely 60Co, 134Cs and 137Cs. The solution methods include use of the MCNP code to solve the problem by directly employing variance reduction techniques, the single-scatter point kernel code GGG-GP, the QADMOD-GP point kernel code, the COHORT Monte Carlo code, the NAC International version of the SKYSHINE-III code, the KSU hybrid method and the associated KSU skyshine codes.
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2013 CFR

2013-07-01

... 40 Protection of Environment 26 2013-07-01 2013-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2012 CFR

2012-07-01

... 40 Protection of Environment 26 2012-07-01 2011-07-01 true Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2014 CFR

2014-07-01

... 40 Protection of Environment 25 2014-07-01 2014-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 40 Protection of Environment 24 2010-07-01 2010-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
40 CFR 194.23 - Models and computer codes.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 40 Protection of Environment 25 2011-07-01 2011-07-01 false Models and computer codes. 194.23... General Requirements § 194.23 Models and computer codes. (a) Any compliance application shall include: (1... obtain stable solutions; (iv) Computer models accurately implement the numerical models; i.e., computer...
Electrode channel selection based on backtracking search optimization in motor imagery brain-computer interfaces.

PubMed

Dai, Shengfa; Wei, Qingguo

2017-01-01

Common spatial pattern algorithm is widely used to estimate spatial filters in motor imagery based brain-computer interfaces. However, use of a large number of channels will make common spatial pattern tend to over-fitting and the classification of electroencephalographic signals time-consuming. To overcome these problems, it is necessary to choose an optimal subset of the whole channels to save computational time and improve the classification accuracy. In this paper, a novel method named backtracking search optimization algorithm is proposed to automatically select the optimal channel set for common spatial pattern. Each individual in the population is a N-dimensional vector, with each component representing one channel. A population of binary codes generate randomly in the beginning, and then channels are selected according to the evolution of these codes. The number and positions of 1's in the code denote the number and positions of chosen channels. The objective function of backtracking search optimization algorithm is defined as the combination of classification error rate and relative number of channels. Experimental results suggest that higher classification accuracy can be achieved with much fewer channels compared to standard common spatial pattern with whole channels.
Parallel design of JPEG-LS encoder on graphics processing units

NASA Astrophysics Data System (ADS)

Duan, Hao; Fang, Yong; Huang, Bormin

2012-01-01

With recent technical advances in graphic processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on a NVIDIA GPU using the computer unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
Chemical application of diffusion quantum Monte Carlo

NASA Technical Reports Server (NTRS)

Reynolds, P. J.; Lester, W. A., Jr.

1984-01-01

The diffusion quantum Monte Carlo (QMC) method gives a stochastic solution to the Schroedinger equation. This approach is receiving increasing attention in chemical applications as a result of its high accuracy. However, reducing statistical uncertainty remains a priority because chemical effects are often obtained as small differences of large numbers. As an example, the single-triplet splitting of the energy of the methylene molecule CH sub 2 is given. The QMC algorithm was implemented on the CYBER 205, first as a direct transcription of the algorithm running on the VAX 11/780, and second by explicitly writing vector code for all loops longer than a crossover length C. The speed of the codes relative to one another as a function of C, and relative to the VAX, are discussed. The computational time dependence obtained versus the number of basis functions is discussed and this is compared with that obtained from traditional quantum chemistry codes and that obtained from traditional computer architectures.
Design oriented structural analysis

NASA Technical Reports Server (NTRS)

Giles, Gary L.

1994-01-01

Desirable characteristics and benefits of design oriented analysis methods are described and illustrated by presenting a synoptic description of the development and uses of the Equivalent Laminated Plate Solution (ELAPS) computer code. ELAPS is a design oriented structural analysis method which is intended for use in the early design of aircraft wing structures. Model preparation is minimized by using a few large plate segments to model the wing box structure. Computational efficiency is achieved by using a limited number of global displacement functions that encompass all segments over the wing planform. Coupling with other codes is facilitated since the output quantities such as deflections and stresses are calculated as continuous functions over the plate segments. Various aspects of the ELAPS development are discussed including the analytical formulation, verification of results by comparison with finite element analysis results, coupling with other codes, and calculation of sensitivity derivatives. The effectiveness of ELAPS for multidisciplinary design application is illustrated by describing its use in design studies of high speed civil transport wing structures.
3DHZETRN: Inhomogeneous Geometry Issues

NASA Technical Reports Server (NTRS)

Wilson, John W.; Slaba, Tony C.; Badavi, Francis F.

2017-01-01

Historical methods for assessing radiation exposure inside complicated geometries for space applications were limited by computational constraints and lack of knowledge associated with nuclear processes occurring over a broad range of particles and energies. Various methods were developed and utilized to simplify geometric representations and enable coupling with simplified but efficient particle transport codes. Recent transport code development efforts, leading to 3DHZETRN, now enable such approximate methods to be carefully assessed to determine if past exposure analyses and validation efforts based on those approximate methods need to be revisited. In this work, historical methods of representing inhomogeneous spacecraft geometry for radiation protection analysis are first reviewed. Two inhomogeneous geometry cases, previously studied with 3DHZETRN and Monte Carlo codes, are considered with various levels of geometric approximation. Fluence, dose, and dose equivalent values are computed in all cases and compared. It is found that although these historical geometry approximations can induce large errors in neutron fluences up to 100 MeV, errors on dose and dose equivalent are modest (<10%) for the cases studied here.
Status and future of the 3D MAFIA group of codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ebeling, F.; Klatt, R.; Krawzcyk, F.

1988-12-01

The group of fully three dimensional computer codes for solving Maxwell's equations for a wide range of applications, MAFIA, is already well established. Extensive comparisons with measurements have demonstrated the accuracy of the computations. A large numer of components have been designed for accelerators, such as kicker magnets, non cyclindrical cavities, ferrite loaded cavities, vacuum chambers with slots and transitions, etc. The latest additions to the system include a new static solver that can calculate 3D magneto- and electrostatic fields, and a self consistent version of the 2D-BCI that solves the field equations and the equations of motion in parallel.more » Work on new eddy current modules has started, which will allow treatment of laminated and/or solid iron cores excited by low frequency currents. Based on our experience with the present releases 1 and 2, we have started a complete revision of the whole user interface and data structure, which will make the codes even more user-friendly and flexible.« less
An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

NASA Astrophysics Data System (ADS)

Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

2018-02-01

De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significant reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
Development of an Object-Oriented Turbomachinery Analysis Code within the NPSS Framework

NASA Technical Reports Server (NTRS)

Jones, Scott M.

2014-01-01

During the preliminary or conceptual design phase of an aircraft engine, the turbomachinery designer has a need to estimate the effects of a large number of design parameters such as flow size, stage count, blade count, radial position, etc. on the weight and efficiency of a turbomachine. Computer codes are invariably used to perform this task however, such codes are often very old, written in outdated languages with arcane input files, and rarely adaptable to new architectures or unconventional layouts. Given the need to perform these kinds of preliminary design trades, a modern 2-D turbomachinery design and analysis code has been written using the Numerical Propulsion System Simulation (NPSS) framework. This paper discusses the development of the governing equations and the structure of the primary objects used in OTAC.

Development of a dynamic coupled hydro-geomechanical code and its application to induced seismicity

NASA Astrophysics Data System (ADS)

Miah, Md Mamun

This research describes the importance of a hydro-geomechanical coupling in the geologic sub-surface environment from fluid injection at geothermal plants, large-scale geological CO2 sequestration for climate mitigation, enhanced oil recovery, and hydraulic fracturing during wells construction in the oil and gas industries. A sequential computational code is developed to capture the multiphysics interaction behavior by linking a flow simulation code TOUGH2 and a geomechanics modeling code PyLith. Numerical formulation of each code is discussed to demonstrate their modeling capabilities. The computational framework involves sequential coupling, and solution of two sub-problems- fluid flow through fractured and porous media and reservoir geomechanics. For each time step of flow calculation, pressure field is passed to the geomechanics code to compute effective stress field and fault slips. A simplified permeability model is implemented in the code that accounts for the permeability of porous and saturated rocks subject to confining stresses. The accuracy of the TOUGH-PyLith coupled simulator is tested by simulating Terzaghi's 1D consolidation problem. The modeling capability of coupled poroelasticity is validated by benchmarking it against Mandel's problem. The code is used to simulate both quasi-static and dynamic earthquake nucleation and slip distribution on a fault from the combined effect of far field tectonic loading and fluid injection by using an appropriate fault constitutive friction model. Results from the quasi-static induced earthquake simulations show a delayed response in earthquake nucleation. This is attributed to the increased total stress in the domain and not accounting for pressure on the fault. However, this issue is resolved in the final chapter in simulating a single event earthquake dynamic rupture. Simulation results show that fluid pressure has a positive effect on slip nucleation and subsequent crack propagation. This is confirmed by running a sensitivity analysis that shows an increase in injection well distance results in delayed slip nucleation and rupture propagation on the fault.
Perspectives on the Future of CFD

NASA Technical Reports Server (NTRS)

Kwak, Dochan

2000-01-01

This viewgraph presentation gives an overview of the future of computational fluid dynamics (CFD), which in the past has pioneered the field of flow simulation. Over time CFD has progressed as computing power. Numerical methods have been advanced as CPU and memory capacity increases. Complex configurations are routinely computed now and direct numerical simulations (DNS) and large eddy simulations (LES) are used to study turbulence. As the computing resources changed to parallel and distributed platforms, computer science aspects such as scalability (algorithmic and implementation) and portability and transparent codings have advanced. Examples of potential future (or current) challenges include risk assessment, limitations of the heuristic model, and the development of CFD and information technology (IT) tools.
Performance of the OVERFLOW-MLP and LAURA-MLP CFD Codes on the NASA Ames 512 CPU Origin System

NASA Technical Reports Server (NTRS)

Taft, James R.

2000-01-01

The shared memory Multi-Level Parallelism (MLP) technique, developed last year at NASA Ames has been very successful in dramatically improving the performance of important NASA CFD codes. This new and very simple parallel programming technique was first inserted into the OVERFLOW production CFD code in FY 1998. The OVERFLOW-MLP code's parallel performance scaled linearly to 256 CPUs on the NASA Ames 256 CPU Origin 2000 system (steger). Overall performance exceeded 20.1 GFLOP/s, or about 4.5x the performance of a dedicated 16 CPU C90 system. All of this was achieved without any major modification to the original vector based code. The OVERFLOW-MLP code is now in production on the inhouse Origin systems as well as being used offsite at commercial aerospace companies. Partially as a result of this work, NASA Ames has purchased a new 512 CPU Origin 2000 system to further test the limits of parallel performance for NASA codes of interest. This paper presents the performance obtained from the latest optimization efforts on this machine for the LAURA-MLP and OVERFLOW-MLP codes. The Langley Aerothermodynamics Upwind Relaxation Algorithm (LAURA) code is a key simulation tool in the development of the next generation shuttle, interplanetary reentry vehicles, and nearly all "X" plane development. This code sustains about 4-5 GFLOP/s on a dedicated 16 CPU C90. At this rate, expected workloads would require over 100 C90 CPU years of computing over the next few calendar years. It is not feasible to expect that this would be affordable or available to the user community. Dramatic performance gains on cheaper systems are needed. This code is expected to be perhaps the largest consumer of NASA Ames compute cycles per run in the coming year.The OVERFLOW CFD code is extensively used in the government and commercial aerospace communities to evaluate new aircraft designs. It is one of the largest consumers of NASA supercomputing cycles and large simulations of highly resolved full aircraft are routinely undertaken. Typical large problems might require 100s of Cray C90 CPU hours to complete. The dramatic performance gains with the 256 CPU steger system are exciting. Obtaining results in hours instead of months is revolutionizing the way in which aircraft manufacturers are looking at future aircraft simulation work. Figure 2 below is a current state of the art plot of OVERFLOW-MLP performance on the 512 CPU Lomax system. As can be seen, the chart indicates that OVERFLOW-MLP continues to scale linearly with CPU count up to 512 CPUs on a large 35 million point full aircraft RANS simulation. At this point performance is such that a fully converged simulation of 2500 time steps is completed in less than 2 hours of elapsed time. Further work over the next few weeks will improve the performance of this code even further.The LAURA code has been converted to the MLP format as well. This code is currently being optimized for the 512 CPU system. Performance statistics indicate that the goal of 100 GFLOP/s will be achieved by year's end. This amounts to 20x the 16 CPU C90 result and strongly demonstrates the viability of the new parallel systems rapidly solving very large simulations in a production environment.
Large-scale structural optimization

NASA Technical Reports Server (NTRS)

Sobieszczanski-Sobieski, J.

1983-01-01

Problems encountered by aerospace designers in attempting to optimize whole aircraft are discussed, along with possible solutions. Large scale optimization, as opposed to component-by-component optimization, is hindered by computational costs, software inflexibility, concentration on a single, rather than trade-off, design methodology and the incompatibility of large-scale optimization with single program, single computer methods. The software problem can be approached by placing the full analysis outside of the optimization loop. Full analysis is then performed only periodically. Problem-dependent software can be removed from the generic code using a systems programming technique, and then embody the definitions of design variables, objective function and design constraints. Trade-off algorithms can be used at the design points to obtain quantitative answers. Finally, decomposing the large-scale problem into independent subproblems allows systematic optimization of the problems by an organization of people and machines.
Pressure investigation of NASA leading edge vortex flaps on a 60 deg Delta wing

NASA Technical Reports Server (NTRS)

Marchman, J. F., III; Donatelli, D. A.; Terry, J. E.

1983-01-01

Pressure distributions on a 60 deg Delta Wing with NASA designed leading edge vortex flaps (LEVF) were found in order to provide more pressure data for LEVF and to help verify NASA computer codes used in designing these flaps. These flaps were intended to be optimized designs based on these computer codes. However, the pressure distributions show that the flaps wre not optimum for the size and deflection specified. A second drag-producing vortex forming over the wing indicated that the flap was too large for the specified deflection. Also, it became apparent that flap thickness has a possible effect on the reattachment location of the vortex. Research is continuing to determine proper flap size and deflection relationships that provide well-behaved flowfields and acceptable hinge-moment characteristics.
High-frequency CAD-based scattering model: SERMAT

NASA Astrophysics Data System (ADS)

Goupil, D.; Boutillier, M.

1991-09-01

Specifications for an industrial radar cross section (RCS) calculation code are given: it must be able to exchange data with many computer aided design (CAD) systems, it must be fast, and it must have powerful graphic tools. Classical physical optics (PO) and equivalent currents (EC) techniques have proven their efficiency on simple objects for a long time. Difficult geometric problems occur when objects with very complex shapes have to be computed. Only a specific geometric code can solve these problems. We have established that, once these problems have been solved: (1) PO and EC give good results on complex objects of large size compared to wavelength; and (2) the implementation of these objects in a software package (SERMAT) allows fast and sufficiently precise domain RCS calculations to meet industry requirements in the domain of stealth.
A combined Fuzzy and Naive Bayesian strategy can be used to assign event codes to injury narratives.

PubMed

Marucci-Wellman, H; Lehto, M; Corns, H

2011-12-01

Bayesian methods show promise for classifying injury narratives from large administrative datasets into cause groups. This study examined a combined approach where two Bayesian models (Fuzzy and Naïve) were used to either classify a narrative or select it for manual review. Injury narratives were extracted from claims filed with a worker's compensation insurance provider between January 2002 and December 2004. Narratives were separated into a training set (n=11,000) and prediction set (n=3,000). Expert coders assigned two-digit Bureau of Labor Statistics Occupational Injury and Illness Classification event codes to each narrative. Fuzzy and Naïve Bayesian models were developed using manually classified cases in the training set. Two semi-automatic machine coding strategies were evaluated. The first strategy assigned cases for manual review if the Fuzzy and Naïve models disagreed on the classification. The second strategy selected additional cases for manual review from the Agree dataset using prediction strength to reach a level of 50% computer coding and 50% manual coding. When agreement alone was used as the filtering strategy, the majority were coded by the computer (n=1,928, 64%) leaving 36% for manual review. The overall combined (human plus computer) sensitivity was 0.90 and positive predictive value (PPV) was >0.90 for 11 of 18 2-digit event categories. Implementing the 2nd strategy improved results with an overall sensitivity of 0.95 and PPV >0.90 for 17 of 18 categories. A combined Naïve-Fuzzy Bayesian approach can classify some narratives with high accuracy and identify others most beneficial for manual review, reducing the burden on human coders.
A class of hybrid finite element methods for electromagnetics: A review

NASA Technical Reports Server (NTRS)

Volakis, J. L.; Chatterjee, A.; Gong, J.

1993-01-01

Integral equation methods have generally been the workhorse for antenna and scattering computations. In the case of antennas, they continue to be the prominent computational approach, but for scattering applications the requirement for large-scale computations has turned researchers' attention to near neighbor methods such as the finite element method, which has low O(N) storage requirements and is readily adaptable in modeling complex geometrical features and material inhomogeneities. In this paper, we review three hybrid finite element methods for simulating composite scatterers, conformal microstrip antennas, and finite periodic arrays. Specifically, we discuss the finite element method and its application to electromagnetic problems when combined with the boundary integral, absorbing boundary conditions, and artificial absorbers for terminating the mesh. Particular attention is given to large-scale simulations, methods, and solvers for achieving low memory requirements and code performance on parallel computing architectures.
Three-Dimensional Terahertz Coded-Aperture Imaging Based on Single Input Multiple Output Technology.

PubMed

Chen, Shuo; Luo, Chenggao; Deng, Bin; Wang, Hongqiang; Cheng, Yongqiang; Zhuang, Zhaowen

2018-01-19

As a promising radar imaging technique, terahertz coded-aperture imaging (TCAI) can achieve high-resolution, forward-looking, and staring imaging by producing spatiotemporal independent signals with coded apertures. In this paper, we propose a three-dimensional (3D) TCAI architecture based on single input multiple output (SIMO) technology, which can reduce the coding and sampling times sharply. The coded aperture applied in the proposed TCAI architecture loads either purposive or random phase modulation factor. In the transmitting process, the purposive phase modulation factor drives the terahertz beam to scan the divided 3D imaging cells. In the receiving process, the random phase modulation factor is adopted to modulate the terahertz wave to be spatiotemporally independent for high resolution. Considering human-scale targets, images of each 3D imaging cell are reconstructed one by one to decompose the global computational complexity, and then are synthesized together to obtain the complete high-resolution image. As for each imaging cell, the multi-resolution imaging method helps to reduce the computational burden on a large-scale reference-signal matrix. The experimental results demonstrate that the proposed architecture can achieve high-resolution imaging with much less time for 3D targets and has great potential in applications such as security screening, nondestructive detection, medical diagnosis, etc.
4P: fast computing of population genetics statistics from large DNA polymorphism panels

PubMed Central

Benazzo, Andrea; Panziera, Alex; Bertorelle, Giorgio

2015-01-01

Massive DNA sequencing has significantly increased the amount of data available for population genetics and molecular ecology studies. However, the parallel computation of simple statistics within and between populations from large panels of polymorphic sites is not yet available, making the exploratory analyses of a set or subset of data a very laborious task. Here, we present 4P (parallel processing of polymorphism panels), a stand-alone software program for the rapid computation of genetic variation statistics (including the joint frequency spectrum) from millions of DNA variants in multiple individuals and multiple populations. It handles a standard input file format commonly used to store DNA variation from empirical or simulation experiments. The computational performance of 4P was evaluated using large SNP (single nucleotide polymorphism) datasets from human genomes or obtained by simulations. 4P was faster or much faster than other comparable programs, and the impact of parallel computing using multicore computers or servers was evident. 4P is a useful tool for biologists who need a simple and rapid computer program to run exploratory population genetics analyses in large panels of genomic data. It is also particularly suitable to analyze multiple data sets produced in simulation studies. Unix, Windows, and MacOs versions are provided, as well as the source code for easier pipeline implementations. PMID:25628874
Large scale in vivo recordings to study neuronal biophysics.

PubMed

Giocomo, Lisa M

2015-06-01

Over the last several years, technological advances have enabled researchers to more readily observe single-cell membrane biophysics in awake, behaving animals. Studies utilizing these technologies have provided important insights into the mechanisms generating functional neural codes in both sensory and non-sensory cortical circuits. Crucial for a deeper understanding of how membrane biophysics control circuit dynamics however, is a continued effort to move toward large scale studies of membrane biophysics, in terms of the numbers of neurons and ion channels examined. Future work faces a number of theoretical and technical challenges on this front but recent technological developments hold great promise for a larger scale understanding of how membrane biophysics contribute to circuit coding and computation. Copyright © 2014 Elsevier Ltd. All rights reserved.
Computational Control of Flexible Aerospace Systems

NASA Technical Reports Server (NTRS)

Sharpe, Lonnie, Jr.; Shen, Ji Yao

1994-01-01

The main objective of this project is to establish a distributed parameter modeling technique for structural analysis, parameter estimation, vibration suppression and control synthesis of large flexible aerospace structures. This report concentrates on the research outputs produced in the last two years of the project. The main accomplishments can be summarized as follows. A new version of the PDEMOD Code had been completed. A theoretical investigation of the NASA MSFC two-dimensional ground-based manipulator facility by using distributed parameter modelling technique has been conducted. A new mathematical treatment for dynamic analysis and control of large flexible manipulator systems has been conceived, which may provide a embryonic form of a more sophisticated mathematical model for future modified versions of the PDEMOD Codes.
Methods for evaluating the predictive accuracy of structural dynamic models

NASA Technical Reports Server (NTRS)

Hasselman, T. K.; Chrostowski, Jon D.

1990-01-01

Uncertainty of frequency response using the fuzzy set method and on-orbit response prediction using laboratory test data to refine an analytical model are emphasized with respect to large space structures. Two aspects of the fuzzy set approach were investigated relative to its application to large structural dynamics problems: (1) minimizing the number of parameters involved in computing possible intervals; and (2) the treatment of extrema which may occur in the parameter space enclosed by all possible combinations of the important parameters of the model. Extensive printer graphics were added to the SSID code to help facilitate model verification, and an application of this code to the LaRC Ten Bay Truss is included in the appendix to illustrate this graphics capability.
Schnek: A C++ library for the development of parallel simulation codes on regular grids

NASA Astrophysics Data System (ADS)

Schmitz, Holger

2018-05-01

A large number of algorithms across the field of computational physics are formulated on grids with a regular topology. We present Schnek, a library that enables fast development of parallel simulations on regular grids. Schnek contains a number of easy-to-use modules that greatly reduce the amount of administrative code for large-scale simulation codes. The library provides an interface for reading simulation setup files with a hierarchical structure. The structure of the setup file is translated into a hierarchy of simulation modules that the developer can specify. The reader parses and evaluates mathematical expressions and initialises variables or grid data. This enables developers to write modular and flexible simulation codes with minimal effort. Regular grids of arbitrary dimension are defined as well as mechanisms for defining physical domain sizes, grid staggering, and ghost cells on these grids. Ghost cells can be exchanged between neighbouring processes using MPI with a simple interface. The grid data can easily be written into HDF5 files using serial or parallel I/O.
Dust Dynamics in Protoplanetary Disks: Parallel Computing with PVM

NASA Astrophysics Data System (ADS)

de La Fuente Marcos, Carlos; Barge, Pierre; de La Fuente Marcos, Raúl

2002-03-01

We describe a parallel version of our high-order-accuracy particle-mesh code for the simulation of collisionless protoplanetary disks. We use this code to carry out a massively parallel, two-dimensional, time-dependent, numerical simulation, which includes dust particles, to study the potential role of large-scale, gaseous vortices in protoplanetary disks. This noncollisional problem is easy to parallelize on message-passing multicomputer architectures. We performed the simulations on a cache-coherent nonuniform memory access Origin 2000 machine, using both the parallel virtual machine (PVM) and message-passing interface (MPI) message-passing libraries. Our performance analysis suggests that, for our problem, PVM is about 25% faster than MPI. Using PVM and MPI made it possible to reduce CPU time and increase code performance. This allows for simulations with a large number of particles (N ~ 105-106) in reasonable CPU times. The performances of our implementation of the pa! rallel code on an Origin 2000 supercomputer are presented and discussed. They exhibit very good speedup behavior and low load unbalancing. Our results confirm that giant gaseous vortices can play a dominant role in giant planet formation.
Review of particle-in-cell modeling for the extraction region of large negative hydrogen ion sources for fusion

NASA Astrophysics Data System (ADS)

Wünderlich, D.; Mochalskyy, S.; Montellano, I. M.; Revel, A.

2018-05-01

Particle-in-cell (PIC) codes are used since the early 1960s for calculating self-consistently the motion of charged particles in plasmas, taking into account external electric and magnetic fields as well as the fields created by the particles itself. Due to the used very small time steps (in the order of the inverse plasma frequency) and mesh size, the computational requirements can be very high and they drastically increase with increasing plasma density and size of the calculation domain. Thus, usually small computational domains and/or reduced dimensionality are used. In the last years, the available central processing unit (CPU) power strongly increased. Together with a massive parallelization of the codes, it is now possible to describe in 3D the extraction of charged particles from a plasma, using calculation domains with an edge length of several centimeters, consisting of one extraction aperture, the plasma in direct vicinity of the aperture, and a part of the extraction system. Large negative hydrogen or deuterium ion sources are essential parts of the neutral beam injection (NBI) system in future fusion devices like the international fusion experiment ITER and the demonstration reactor (DEMO). For ITER NBI RF driven sources with a source area of 0.9 × 1.9 m2 and 1280 extraction apertures will be used. The extraction of negative ions is accompanied by the co-extraction of electrons which are deflected onto an electron dump. Typically, the maximum negative extracted ion current is limited by the amount and the temporal instability of the co-extracted electrons, especially for operation in deuterium. Different PIC codes are available for the extraction region of large driven negative ion sources for fusion. Additionally, some effort is ongoing in developing codes that describe in a simplified manner (coarser mesh or reduced dimensionality) the plasma of the whole ion source. The presentation first gives a brief overview of the current status of the ion source development for ITER NBI and of the PIC method. Different PIC codes for the extraction region are introduced as well as the coupling to codes describing the whole source (PIC codes or fluid codes). Presented and discussed are different physical and numerical aspects of applying PIC codes to negative hydrogen ion sources for fusion as well as selected code results. The main focus of future calculations will be the meniscus formation and identifying measures for reducing the co-extracted electrons, in particular for deuterium operation. The recent results of the 3D PIC code ONIX (calculation domain: one extraction aperture and its vicinity) for the ITER prototype source (1/8 size of the ITER NBI source) are presented.
Computer Description of Black Hawk Helicopter

DTIC Science & Technology

1979-06-01

Model Combinatorial Geometry Models Black Hawk Helicopter Helicopter GIFT Computer Code Geometric Description of Targets 20. ABSTRACT...description was made using the technique of combinatorial geometry (COM-GEOM) and will be used as input to the GIFT computer code which generates Tliic...rnHp The data used bv the COVART comtmter code was eenerated bv the Geometric Information for Targets ( GIFT )Z computer code. This report documents
Multiphysics Code Demonstrated for Propulsion Applications

NASA Technical Reports Server (NTRS)

Lawrence, Charles; Melis, Matthew E.

1998-01-01

The utility of multidisciplinary analysis tools for aeropropulsion applications is being investigated at the NASA Lewis Research Center. The goal of this project is to apply Spectrum, a multiphysics code developed by Centric Engineering Systems, Inc., to simulate multidisciplinary effects in turbomachinery components. Many engineering problems today involve detailed computer analyses to predict the thermal, aerodynamic, and structural response of a mechanical system as it undergoes service loading. Analysis of aerospace structures generally requires attention in all three disciplinary areas to adequately predict component service behavior, and in many cases, the results from one discipline substantially affect the outcome of the other two. There are numerous computer codes currently available in the engineering community to perform such analyses in each of these disciplines. Many of these codes are developed and used in-house by a given organization, and many are commercially available. However, few, if any, of these codes are designed specifically for multidisciplinary analyses. The Spectrum code has been developed for performing fully coupled fluid, thermal, and structural analyses on a mechanical system with a single simulation that accounts for all simultaneous interactions, thus eliminating the requirement for running a large number of sequential, separate, disciplinary analyses. The Spectrum code has a true multiphysics analysis capability, which improves analysis efficiency as well as accuracy. Centric Engineering, Inc., working with a team of Lewis and AlliedSignal Engines engineers, has been evaluating Spectrum for a variety of propulsion applications including disk quenching, drum cavity flow, aeromechanical simulations, and a centrifugal compressor flow simulation.
Embed dynamic content in your poster.

PubMed

Hutchins, B Ian

2013-01-29

A new technology has emerged that will facilitate the presentation of dynamic or otherwise inaccessible data on posters at scientific meetings. Video, audio, or other digital files hosted on mobile-friendly sites can be linked to through a quick response (QR) code, a two-dimensional barcode that can be scanned by smartphones, which then display the content. This approach is more affordable than acquiring tablet computers for playing dynamic content and can reach many users at large conferences. This resource details how to host videos, generate QR codes, and view the associated files on mobile devices.
Large-area sheet task advanced dendritic web growth development

NASA Technical Reports Server (NTRS)

Duncan, C. S.; Seidensticker, R. G.; Mchugh, J. P.; Hopkins, R. H.; Meier, D.; Schruben, J.

1982-01-01

The computer code for calculating web temperature distribution was expanded to provide a graphics output in addition to numerical and punch card output. The new code was used to examine various modifications of the J419 configuration and, on the basis of the results, a new growth geometry was designed. Additionally, several mathematically defined temperature profiles were evaluated for the effects of the free boundary (growth front) on the thermal stress generation. Experimental growth runs were made with modified J419 configurations to complement the modeling work. A modified J435 configuration was evaluated.

From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

DOE PAGES

Blazewicz, Marek; Hinder, Ian; Koppelman, David M.; ...

2013-01-01

Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization ismore » based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.« less
Coupling of PIES 3-D Equilibrium Code and NIFS Bootstrap Code with Applications to the Computation of Stellarator Equilibria

NASA Astrophysics Data System (ADS)

Monticello, D. A.; Reiman, A. H.; Watanabe, K. Y.; Nakajima, N.; Okamoto, M.

1997-11-01

The existence of bootstrap currents in both tokamaks and stellarators was confirmed, experimentally, more than ten years ago. Such currents can have significant effects on the equilibrium and stability of these MHD devices. In addition, stellarators, with the notable exception of W7-X, are predicted to have such large bootstrap currents that reliable equilibrium calculations require the self-consistent evaluation of bootstrap currents. Modeling of discharges which contain islands requires an algorithm that does not assume good surfaces. Only one of the two 3-D equilibrium codes that exist, PIES( Reiman, A. H., Greenside, H. S., Compt. Phys. Commun. 43), (1986)., can easily be modified to handle bootstrap current. Here we report on the coupling of the PIES 3-D equilibrium code and NIFS bootstrap code(Watanabe, K., et al., Nuclear Fusion 35) (1995), 335.
FORCE2: A state-of-the-art two-phase code for hydrodynamic calculations

NASA Astrophysics Data System (ADS)

Ding, Jianmin; Lyczkowski, R. W.; Burge, S. W.

1993-02-01

A three-dimensional computer code for two-phase flow named FORCE2 has been developed by Babcock and Wilcox (B & W) in close collaboration with Argonne National Laboratory (ANL). FORCE2 is capable of both transient as well as steady-state simulations. This Cartesian coordinates computer program is a finite control volume, industrial grade and quality embodiment of the pilot-scale FLUFIX/MOD2 code and contains features such as three-dimensional blockages, volume and surface porosities to account for various obstructions in the flow field, and distributed resistance modeling to account for pressure drops caused by baffles, distributor plates and large tube banks. Recently computed results demonstrated the significance of and necessity for three-dimensional models of hydrodynamics and erosion. This paper describes the process whereby ANL's pilot-scale FLUFIX/MOD2 models and numerics were implemented into FORCE2. A description of the quality control to assess the accuracy of the new code and the validation using some of the measured data from Illinois Institute of Technology (UT) and the University of Illinois at Urbana-Champaign (UIUC) are given. It is envisioned that one day, FORCE2 with additional modules such as radiation heat transfer, combustion kinetics and multi-solids together with user-friendly pre- and post-processor software and tailored for massively parallel multiprocessor shared memory computational platforms will be used by industry and researchers to assist in reducing and/or eliminating the environmental and economic barriers which limit full consideration of coal, shale and biomass as energy sources, to retain energy security, and to remediate waste and ecological problems.
JADAMILU: a software code for computing selected eigenvalues of large sparse symmetric matrices

NASA Astrophysics Data System (ADS)

Bollhöfer, Matthias; Notay, Yvan

2007-12-01

A new software code for computing selected eigenvalues and associated eigenvectors of a real symmetric matrix is described. The eigenvalues are either the smallest or those closest to some specified target, which may be in the interior of the spectrum. The underlying algorithm combines the Jacobi-Davidson method with efficient multilevel incomplete LU (ILU) preconditioning. Key features are modest memory requirements and robust convergence to accurate solutions. Parameters needed for incomplete LU preconditioning are automatically computed and may be updated at run time depending on the convergence pattern. The software is easy to use by non-experts and its top level routines are written in FORTRAN 77. Its potentialities are demonstrated on a few applications taken from computational physics. Program summaryProgram title: JADAMILU Catalogue identifier: ADZT_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADZT_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 101 359 No. of bytes in distributed program, including test data, etc.: 7 493 144 Distribution format: tar.gz Programming language: Fortran 77 Computer: Intel or AMD with g77 and pgf; Intel EM64T or Itanium with ifort; AMD Opteron with g77, pgf and ifort; Power (IBM) with xlf90. Operating system: Linux, AIX RAM: problem dependent Word size: real:8; integer: 4 or 8, according to user's choice Classification: 4.8 Nature of problem: Any physical problem requiring the computation of a few eigenvalues of a symmetric matrix. Solution method: Jacobi-Davidson combined with multilevel ILU preconditioning. Additional comments: We supply binaries rather than source code because JADAMILU uses the following external packages: MC64. This software is copyrighted software and not freely available. COPYRIGHT (c) 1999 Council for the Central Laboratory of the Research Councils. AMD. Copyright (c) 2004-2006 by Timothy A. Davis, Patrick R. Amestoy, and Iain S. Duff. Source code is distributed by the authors under the GNU LGPL licence. BLAS. The reference BLAS is a freely-available software package. It is available from netlib via anonymous ftp and the World Wide Web. LAPACK. The complete LAPACK package or individual routines from LAPACK are freely available on netlib and can be obtained via the World Wide Web or anonymous ftp. For maximal benefit to the community, we added the sources we are proprietary of to the tar.gz file submitted for inclusion in the CPC library. However, as explained in the README file, users willing to compile the code instead of using binaries should first obtain the sources for the external packages mentioned above (email and/or web addresses are provided). Running time: Problem dependent; the test examples provided with the code only take a few seconds to run; timing results for large scale problems are given in Section 5.
Computer-based coding of free-text job descriptions to efficiently identify occupations in epidemiological studies.

PubMed

Russ, Daniel E; Ho, Kwan-Yuet; Colt, Joanne S; Armenti, Karla R; Baris, Dalsu; Chow, Wong-Ho; Davis, Faith; Johnson, Alison; Purdue, Mark P; Karagas, Margaret R; Schwartz, Kendra; Schwenn, Molly; Silverman, Debra T; Johnson, Calvin A; Friesen, Melissa C

2016-06-01

Mapping job titles to standardised occupation classification (SOC) codes is an important step in identifying occupational risk factors in epidemiological studies. Because manual coding is time-consuming and has moderate reliability, we developed an algorithm called SOCcer (Standardized Occupation Coding for Computer-assisted Epidemiologic Research) to assign SOC-2010 codes based on free-text job description components. Job title and task-based classifiers were developed by comparing job descriptions to multiple sources linking job and task descriptions to SOC codes. An industry-based classifier was developed based on the SOC prevalence within an industry. These classifiers were used in a logistic model trained using 14 983 jobs with expert-assigned SOC codes to obtain empirical weights for an algorithm that scored each SOC/job description. We assigned the highest scoring SOC code to each job. SOCcer was validated in 2 occupational data sources by comparing SOC codes obtained from SOCcer to expert assigned SOC codes and lead exposure estimates obtained by linking SOC codes to a job-exposure matrix. For 11 991 case-control study jobs, SOCcer-assigned codes agreed with 44.5% and 76.3% of manually assigned codes at the 6-digit and 2-digit level, respectively. Agreement increased with the score, providing a mechanism to identify assignments needing review. Good agreement was observed between lead estimates based on SOCcer and manual SOC assignments (κ 0.6-0.8). Poorer performance was observed for inspection job descriptions, which included abbreviations and worksite-specific terminology. Although some manual coding will remain necessary, using SOCcer may improve the efficiency of incorporating occupation into large-scale epidemiological studies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Scientific Discovery through Advanced Computing in Plasma Science

NASA Astrophysics Data System (ADS)

Tang, William

2005-03-01

Advanced computing is generally recognized to be an increasingly vital tool for accelerating progress in scientific research during the 21st Century. For example, the Department of Energy's ``Scientific Discovery through Advanced Computing'' (SciDAC) Program was motivated in large measure by the fact that formidable scientific challenges in its research portfolio could best be addressed by utilizing the combination of the rapid advances in super-computing technology together with the emergence of effective new algorithms and computational methodologies. The imperative is to translate such progress into corresponding increases in the performance of the scientific codes used to model complex physical systems such as those encountered in high temperature plasma research. If properly validated against experimental measurements and analytic benchmarks, these codes can provide reliable predictive capability for the behavior of a broad range of complex natural and engineered systems. This talk reviews recent progress and future directions for advanced simulations with some illustrative examples taken from the plasma science applications area. Significant recent progress has been made in both particle and fluid simulations of fine-scale turbulence and large-scale dynamics, giving increasingly good agreement between experimental observations and computational modeling. This was made possible by the combination of access to powerful new computational resources together with innovative advances in analytic and computational methods for developing reduced descriptions of physics phenomena spanning a huge range in time and space scales. In particular, the plasma science community has made excellent progress in developing advanced codes for which computer run-time and problem size scale well with the number of processors on massively parallel machines (MPP's). A good example is the effective usage of the full power of multi-teraflop (multi-trillion floating point computations per second) MPP's to produce three-dimensional, general geometry, nonlinear particle simulations which have accelerated progress in understanding the nature of plasma turbulence in magnetically-confined high temperature plasmas. These calculations, which typically utilized billions of particles for thousands of time-steps, would not have been possible without access to powerful present generation MPP computers and the associated diagnostic and visualization capabilities. In general, results from advanced simulations provide great encouragement for being able to include increasingly realistic dynamics to enable deeper physics insights into plasmas in both natural and laboratory environments. The associated scientific excitement should serve to stimulate improved cross-cutting collaborations with other fields and also to help attract bright young talent to the computational science area.
Multi-stage decoding for multi-level block modulation codes

NASA Technical Reports Server (NTRS)

Lin, Shu

1991-01-01

In this paper, we investigate various types of multi-stage decoding for multi-level block modulation codes, in which the decoding of a component code at each stage can be either soft-decision or hard-decision, maximum likelihood or bounded-distance. Error performance of codes is analyzed for a memoryless additive channel based on various types of multi-stage decoding, and upper bounds on the probability of an incorrect decoding are derived. Based on our study and computation results, we find that, if component codes of a multi-level modulation code and types of decoding at various stages are chosen properly, high spectral efficiency and large coding gain can be achieved with reduced decoding complexity. In particular, we find that the difference in performance between the suboptimum multi-stage soft-decision maximum likelihood decoding of a modulation code and the single-stage optimum decoding of the overall code is very small: only a fraction of dB loss in SNR at the probability of an incorrect decoding for a block of 10(exp -6). Multi-stage decoding of multi-level modulation codes really offers a way to achieve the best of three worlds, bandwidth efficiency, coding gain, and decoding complexity.
Development of Web Interfaces for Analysis Codes

NASA Astrophysics Data System (ADS)

Emoto, M.; Watanabe, T.; Funaba, H.; Murakami, S.; Nagayama, Y.; Kawahata, K.

Several codes have been developed to analyze plasma physics. However, most of them are developed to run on supercomputers. Therefore, users who typically use personal computers (PCs) find it difficult to use these codes. In order to facilitate the widespread use of these codes, a user-friendly interface is required. The authors propose Web interfaces for these codes. To demonstrate the usefulness of this approach, the authors developed Web interfaces for two analysis codes. One of them is for FIT developed by Murakami. This code is used to analyze the NBI heat deposition, etc. Because it requires electron density profiles, electron temperatures, and ion temperatures as polynomial expressions, those unfamiliar with the experiments find it difficult to use this code, especially visitors from other institutes. The second one is for visualizing the lines of force in the LHD (large helical device) developed by Watanabe. This code is used to analyze the interference caused by the lines of force resulting from the various structures installed in the vacuum vessel of the LHD. This code runs on PCs; however, it requires that the necessary parameters be edited manually. Using these Web interfaces, users can execute these codes interactively.
Theoretical studies of Resonance Enhanced Stimulated Raman Scattering (RESRS) of frequency doubled Alexandrite laser wavelength in cesium vapor

NASA Technical Reports Server (NTRS)

Lawandy, Nabil M.

1987-01-01

The third phase of research will focus on the propagation and energy extraction of the pump and SERS beams in a variety of configurations including oscillator structures. In order to address these questions a numerical code capable of allowing for saturation and full transverse beam evolution is required. The method proposed is based on a discretized propagation energy extraction model which uses a Kirchoff integral propagator coupled to the three level Raman model already developed. The model will have the resolution required by diffraction limits and will use the previous density matrix results in the adiabatic following limit. Owing to its large computational requirements, such a code must be implemented on a vector array processor. One code on the Cyber is being tested by using previously understood two-level laser models as guidelines for interpreting the results. Two tests were implemented: the evolution of modes in a passive resonator and the evolution of a stable state of the adiabatically eliminated laser equations. These results show mode shapes and diffraction losses for the first case and relaxation oscillations for the second one. Finally, in order to clarify the computing methodology used to exploit the speed of the Cyber's computational speed, the time it takes to perform both of the computations previously mentioned to run on the Cyber and VAX 730 must be measured. Also included is a short description of the current laser model (CAVITY.FOR) and a flow chart of the test computations.
Climate Modeling with a Million CPUs

NASA Astrophysics Data System (ADS)

Tobis, M.; Jackson, C. S.

2010-12-01

Michael Tobis, Ph.D. Research Scientist Associate University of Texas Institute for Geophysics Charles S. Jackson Research Scientist University of Texas Institute for Geophysics Meteorological, oceanographic, and climatological applications have been at the forefront of scientific computing since its inception. The trend toward ever larger and more capable computing installations is unabated. However, much of the increase in capacity is accompanied by an increase in parallelism and a concomitant increase in complexity. An increase of at least four additional orders of magnitude in the computational power of scientific platforms is anticipated. It is unclear how individual climate simulations can continue to make effective use of the largest platforms. Conversion of existing community codes to higher resolution, or to more complex phenomenology, or both, presents daunting design and validation challenges. Our alternative approach is to use the expected resources to run very large ensembles of simulations of modest size, rather than to await the emergence of very large simulations. We are already doing this in exploring the parameter space of existing models using the Multiple Very Fast Simulated Annealing algorithm, which was developed for seismic imaging. Our experiments have the dual intentions of tuning the model and identifying ranges of parameter uncertainty. Our approach is less strongly constrained by the dimensionality of the parameter space than are competing methods. Nevertheless, scaling up remains costly. Much could be achieved by increasing the dimensionality of the search and adding complexity to the search algorithms. Such ensemble approaches scale naturally to very large platforms. Extensions of the approach are anticipated. For example, structurally different models can be tuned to comparable effectiveness. This can provide an objective test for which there is no realistic precedent with smaller computations. We find ourselves inventing new code to manage our ensembles. Component computations involve tens to hundreds of CPUs and tens to hundreds of hours. The results of these moderately large parallel jobs influence the scheduling of subsequent jobs, and complex algorithms may be easily contemplated for this. The operating system concept of a "thread" re-emerges at a very coarse level, where each thread manages atomic computations of thousands of CPU-hours. That is, rather than multiple threads operating on a processor, at this level, multiple processors operate within a single thread. In collaboration with the Texas Advanced Computing Center, we are developing a software library at the system level, which should facilitate the development of computations involving complex strategies which invoke large numbers of moderately large multi-processor jobs. While this may have applications in other sciences, our key intent is to better characterize the coupled behavior of a very large set of climate model configurations.
Computational Infrastructure for Geodynamics (CIG)

NASA Astrophysics Data System (ADS)

Gurnis, M.; Kellogg, L. H.; Bloxham, J.; Hager, B. H.; Spiegelman, M.; Willett, S.; Wysession, M. E.; Aivazis, M.

2004-12-01

Solid earth geophysicists have a long tradition of writing scientific software to address a wide range of problems. In particular, computer simulations came into wide use in geophysics during the decade after the plate tectonic revolution. Solution schemes and numerical algorithms that developed in other areas of science, most notably engineering, fluid mechanics, and physics, were adapted with considerable success to geophysics. This software has largely been the product of individual efforts and although this approach has proven successful, its strength for solving problems of interest is now starting to show its limitations as we try to share codes and algorithms or when we want to recombine codes in novel ways to produce new science. With funding from the NSF, the US community has embarked on a Computational Infrastructure for Geodynamics (CIG) that will develop, support, and disseminate community-accessible software for the greater geodynamics community from model developers to end-users. The software is being developed for problems involving mantle and core dynamics, crustal and earthquake dynamics, magma migration, seismology, and other related topics. With a high level of community participation, CIG is leveraging state-of-the-art scientific computing into a suite of open-source tools and codes. The infrastructure that we are now starting to develop will consist of: (a) a coordinated effort to develop reusable, well-documented and open-source geodynamics software; (b) the basic building blocks - an infrastructure layer - of software by which state-of-the-art modeling codes can be quickly assembled; (c) extension of existing software frameworks to interlink multiple codes and data through a superstructure layer; (d) strategic partnerships with the larger world of computational science and geoinformatics; and (e) specialized training and workshops for both the geodynamics and broader Earth science communities. The CIG initiative has already started to leverage and develop long-term strategic partnerships with open source development efforts within the larger thrusts of scientific computing and geoinformatics. These strategic partnerships are essential as the frontier has moved into multi-scale and multi-physics problems in which many investigators now want to use simulation software for data interpretation, data assimilation, and hypothesis testing.
Final Report for ALCC Allocation: Predictive Simulation of Complex Flow in Wind Farms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barone, Matthew F.; Ananthan, Shreyas; Churchfield, Matt

This report documents work performed using ALCC computing resources granted under a proposal submitted in February 2016, with the resource allocation period spanning the period July 2016 through June 2017. The award allocation was 10.7 million processor-hours at the National Energy Research Scientific Computing Center. The simulations performed were in support of two projects: the Atmosphere to Electrons (A2e) project, supported by the DOE EERE office; and the Exascale Computing Project (ECP), supported by the DOE Office of Science. The project team for both efforts consists of staff scientists and postdocs from Sandia National Laboratories and the National Renewable Energymore » Laboratory. At the heart of these projects is the open-source computational-fluid-dynamics (CFD) code, Nalu. Nalu solves the low-Mach-number Navier-Stokes equations using an unstructured- grid discretization. Nalu leverages the open-source Trilinos solver library and the Sierra Toolkit (STK) for parallelization and I/O. This report documents baseline computational performance of the Nalu code on problems of direct relevance to the wind plant physics application - namely, Large Eddy Simulation (LES) of an atmospheric boundary layer (ABL) flow and wall-modeled LES of a flow past a static wind turbine rotor blade. Parallel performance of Nalu and its constituent solver routines residing in the Trilinos library has been assessed previously under various campaigns. However, both Nalu and Trilinos have been, and remain, in active development and resources have not been available previously to rigorously track code performance over time. With the initiation of the ECP, it is important to establish and document baseline code performance on the problems of interest. This will allow the project team to identify and target any deficiencies in performance, as well as highlight any performance bottlenecks as we exercise the code on a greater variety of platforms and at larger scales. The current study is rather modest in scale, examining performance on problem sizes of O(100 million) elements and core counts up to 8k cores. This will be expanded as more computational resources become available to the projects.« less
User manual for semi-circular compact range reflector code: Version 2

NASA Technical Reports Server (NTRS)

Gupta, Inder J.; Burnside, Walter D.

1987-01-01

A computer code has been developed at the Ohio State University ElectroScience Laboratory to analyze a semi-circular paraboloidal reflector with or without a rolled edge at the top and a skirt at the bottom. The code can be used to compute the total near field of the reflector or its individual components at a given distance from the center of the paraboloid. The code computes the fields along a radial, horizontal, vertical or axial cut at that distance. Thus, it is very effective in computing the size of the sweet spot for a semi-circular compact range reflector. This report describes the operation of the code. Various input and output statements are explained. Some results obtained using the computer code are presented to illustrate the code's capability as well as being samples of input/output sets.
Program Processes Thermocouple Readings

NASA Technical Reports Server (NTRS)

Quave, Christine A.; Nail, William, III

1995-01-01

Digital Signal Processor for Thermocouples (DART) computer program implements precise and fast method of converting voltage to temperature for large-temperature-range thermocouple applications. Written using LabVIEW software. DART available only as object code for use on Macintosh II FX or higher-series computers running System 7.0 or later and IBM PC-series and compatible computers running Microsoft Windows 3.1. Macintosh version of DART (SSC-00032) requires LabVIEW 2.2.1 or 3.0 for execution. IBM PC version (SSC-00031) requires LabVIEW 3.0 for Windows 3.1. LabVIEW software product of National Instruments and not included with program.
A visual programming environment for the Navier-Stokes computer

NASA Technical Reports Server (NTRS)

Tomboulian, Sherryl; Crockett, Thomas W.; Middleton, David

1988-01-01

The Navier-Stokes computer is a high-performance, reconfigurable, pipelined machine designed to solve large computational fluid dynamics problems. Due to the complexity of the architecture, development of effective, high-level language compilers for the system appears to be a very difficult task. Consequently, a visual programming methodology has been developed which allows users to program the system at an architectural level by constructing diagrams of the pipeline configuration. These schematic program representations can then be checked for validity and automatically translated into machine code. The visual environment is illustrated by using a prototype graphical editor to program an example problem.
Hanford meteorological station computer codes: Volume 9, The quality assurance computer codes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burk, K.W.; Andrews, G.L.

1989-02-01

The Hanford Meteorological Station (HMS) was established in 1944 on the Hanford Site to collect and archive meteorological data and provide weather forecasts and related services for Hanford Site approximately 1/2 mile east of the 200 West Area and is operated by PNL for the US Department of Energy. Meteorological data are collected from various sensors and equipment located on and off the Hanford Site. These data are stored in data bases on the Digital Equipment Corporation (DEC) VAX 11/750 at the HMS (hereafter referred to as the HMS computer). Files from those data bases are routinely transferred to themore » Emergency Management System (EMS) computer at the Unified Dose Assessment Center (UDAC). To ensure the quality and integrity of the HMS data, a set of Quality Assurance (QA) computer codes has been written. The codes will be routinely used by the HMS system manager or the data base custodian. The QA codes provide detailed output files that will be used in correcting erroneous data. The following sections in this volume describe the implementation and operation of QA computer codes. The appendices contain detailed descriptions, flow charts, and source code listings of each computer code. 2 refs.« less
AREVA Developments for an Efficient and Reliable use of Monte Carlo codes for Radiation Transport Applications

NASA Astrophysics Data System (ADS)

Chapoutier, Nicolas; Mollier, François; Nolin, Guillaume; Culioli, Matthieu; Mace, Jean-Reynald

2017-09-01

In the context of the rising of Monte Carlo transport calculations for any kind of application, AREVA recently improved its suite of engineering tools in order to produce efficient Monte Carlo workflow. Monte Carlo codes, such as MCNP or TRIPOLI, are recognized as reference codes to deal with a large range of radiation transport problems. However the inherent drawbacks of theses codes - laboring input file creation and long computation time - contrast with the maturity of the treatment of the physical phenomena. The goals of the recent AREVA developments were to reach similar efficiency as other mature engineering sciences such as finite elements analyses (e.g. structural or fluid dynamics). Among the main objectives, the creation of a graphical user interface offering CAD tools for geometry creation and other graphical features dedicated to the radiation field (source definition, tally definition) has been reached. The computations times are drastically reduced compared to few years ago thanks to the use of massive parallel runs, and above all, the implementation of hybrid variance reduction technics. From now engineering teams are capable to deliver much more prompt support to any nuclear projects dealing with reactors or fuel cycle facilities from conceptual phase to decommissioning.
Fluid Flow Investigations within a 37 Element CANDU Fuel Bundle Supported by Magnetic Resonance Velocimetry and Computational Fluid Dynamics

DOE PAGES

Piro, M.H.A; Wassermann, F.; Grundmann, S.; ...

2017-05-23

The current work presents experimental and computational investigations of fluid flow through a 37 element CANDU nuclear fuel bundle. Experiments based on Magnetic Resonance Velocimetry (MRV) permit three-dimensional, three-component fluid velocity measurements to be made within the bundle with sub-millimeter resolution that are non-intrusive, do not require tracer particles or optical access of the flow field. Computational fluid dynamic (CFD) simulations of the foregoing experiments were performed with the hydra-th code using implicit large eddy simulation, which were in good agreement with experimental measurements of the fluid velocity. Greater understanding has been gained in the evolution of geometry-induced inter-subchannel mixing,more » the local effects of obstructed debris on the local flow field, and various turbulent effects, such as recirculation, swirl and separation. These capabilities are not available with conventional experimental techniques or thermal-hydraulic codes. Finally, the overall goal of this work is to continue developing experimental and computational capabilities for further investigations that reliably support nuclear reactor performance and safety.« less
Performance of a Block Structured, Hierarchical Adaptive MeshRefinement Code on the 64k Node IBM BlueGene/L Computer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Greenough, Jeffrey A.; de Supinski, Bronis R.; Yates, Robert K.

2005-04-25

We describe the performance of the block-structured Adaptive Mesh Refinement (AMR) code Raptor on the 32k node IBM BlueGene/L computer. This machine represents a significant step forward towards petascale computing. As such, it presents Raptor with many challenges for utilizing the hardware efficiently. In terms of performance, Raptor shows excellent weak and strong scaling when running in single level mode (no adaptivity). Hardware performance monitors show Raptor achieves an aggregate performance of 3:0 Tflops in the main integration kernel on the 32k system. Results from preliminary AMR runs on a prototype astrophysical problem demonstrate the efficiency of the current softwaremore » when running at large scale. The BG/L system is enabling a physics problem to be considered that represents a factor of 64 increase in overall size compared to the largest ones of this type computed to date. Finally, we provide a description of the development work currently underway to address our inefficiencies.« less
Fluid Flow Investigations within a 37 Element CANDU Fuel Bundle Supported by Magnetic Resonance Velocimetry and Computational Fluid Dynamics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Piro, M.H.A; Wassermann, F.; Grundmann, S.

The current work presents experimental and computational investigations of fluid flow through a 37 element CANDU nuclear fuel bundle. Experiments based on Magnetic Resonance Velocimetry (MRV) permit three-dimensional, three-component fluid velocity measurements to be made within the bundle with sub-millimeter resolution that are non-intrusive, do not require tracer particles or optical access of the flow field. Computational fluid dynamic (CFD) simulations of the foregoing experiments were performed with the hydra-th code using implicit large eddy simulation, which were in good agreement with experimental measurements of the fluid velocity. Greater understanding has been gained in the evolution of geometry-induced inter-subchannel mixing,more » the local effects of obstructed debris on the local flow field, and various turbulent effects, such as recirculation, swirl and separation. These capabilities are not available with conventional experimental techniques or thermal-hydraulic codes. Finally, the overall goal of this work is to continue developing experimental and computational capabilities for further investigations that reliably support nuclear reactor performance and safety.« less

Energy and technology review: Engineering modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cabayan, H.S.; Goudreau, G.L.; Ziolkowski, R.W.

1986-10-01

This report presents information concerning: Modeling Canonical Problems in Electromagnetic Coupling Through Apertures; Finite-Element Codes for Computing Electrostatic Fields; Finite-Element Modeling of Electromagnetic Phenomena; Modeling Microwave-Pulse Compression in a Resonant Cavity; Lagrangian Finite-Element Analysis of Penetration Mechanics; Crashworthiness Engineering; Computer Modeling of Metal-Forming Processes; Thermal-Mechanical Modeling of Tungsten Arc Welding; Modeling Air Breakdown Induced by Electromagnetic Fields; Iterative Techniques for Solving Boltzmann's Equations for p-Type Semiconductors; Semiconductor Modeling; and Improved Numerical-Solution Techniques in Large-Scale Stress Analysis.
Rotordynamics on the PC: Transient Analysis With ARDS

NASA Technical Reports Server (NTRS)

Fleming, David P.

1997-01-01

Personal computers can now do many jobs that formerly required a large mainframe computer. An example is NASA Lewis Research Center's program Analysis of RotorDynamic Systems (ARDS), which uses the component mode synthesis method to analyze the dynamic motion of up to five rotating shafts. As originally written in the early 1980's, this program was considered large for the mainframe computers of the time. ARDS, which was written in Fortran 77, has been successfully ported to a 486 personal computer. Plots appear on the computer monitor via calls programmed for the original CALCOMP plotter; plots can also be output on a standard laser printer. The executable code, which uses the full array sizes of the mainframe version, easily fits on a high-density floppy disk. The program runs under DOS with an extended memory manager. In addition to transient analysis of blade loss, step turns, and base acceleration, with simulation of squeeze-film dampers and rubs, ARDS calculates natural frequencies and unbalance response.
Optimizing Performance of Combustion Chemistry Solvers on Intel's Many Integrated Core (MIC) Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sitaraman, Hariswaran; Grout, Ray W

This work investigates novel algorithm designs and optimization techniques for restructuring chemistry integrators in zero and multidimensional combustion solvers, which can then be effectively used on the emerging generation of Intel's Many Integrated Core/Xeon Phi processors. These processors offer increased computing performance via large number of lightweight cores at relatively lower clock speeds compared to traditional processors (e.g. Intel Sandybridge/Ivybridge) used in current supercomputers. This style of processor can be productively used for chemistry integrators that form a costly part of computational combustion codes, in spite of their relatively lower clock speeds. Performance commensurate with traditional processors is achieved heremore » through the combination of careful memory layout, exposing multiple levels of fine grain parallelism and through extensive use of vendor supported libraries (Cilk Plus and Math Kernel Libraries). Important optimization techniques for efficient memory usage and vectorization have been identified and quantified. These optimizations resulted in a factor of ~ 3 speed-up using Intel 2013 compiler and ~ 1.5 using Intel 2017 compiler for large chemical mechanisms compared to the unoptimized version on the Intel Xeon Phi. The strategies, especially with respect to memory usage and vectorization, should also be beneficial for general purpose computational fluid dynamics codes.« less
Development of the Tensoral Computer Language

NASA Technical Reports Server (NTRS)

Ferziger, Joel; Dresselhaus, Eliot

1996-01-01

The research scientist or engineer wishing to perform large scale simulations or to extract useful information from existing databases is required to have expertise in the details of the particular database, the numerical methods and the computer architecture to be used. This poses a significant practical barrier to the use of simulation data. The goal of this research was to develop a high-level computer language called Tensoral, designed to remove this barrier. The Tensoral language provides a framework in which efficient generic data manipulations can be easily coded and implemented. First of all, Tensoral is general. The fundamental objects in Tensoral represent tensor fields and the operators that act on them. The numerical implementation of these tensors and operators is completely and flexibly programmable. New mathematical constructs and operators can be easily added to the Tensoral system. Tensoral is compatible with existing languages. Tensoral tensor operations co-exist in a natural way with a host language, which may be any sufficiently powerful computer language such as Fortran, C, or Vectoral. Tensoral is very-high-level. Tensor operations in Tensoral typically act on entire databases (i.e., arrays) at one time and may, therefore, correspond to many lines of code in a conventional language. Tensoral is efficient. Tensoral is a compiled language. Database manipulations are simplified optimized and scheduled by the compiler eventually resulting in efficient machine code to implement them.
NCI HPC Scaling and Optimisation in Climate, Weather, Earth system science and the Geosciences

NASA Astrophysics Data System (ADS)

Evans, B. J. K.; Bermous, I.; Freeman, J.; Roberts, D. S.; Ward, M. L.; Yang, R.

2016-12-01

The Australian National Computational Infrastructure (NCI) has a national focus in the Earth system sciences including climate, weather, ocean, water management, environment and geophysics. NCI leads a Program across its partners from the Australian science agencies and research communities to identify priority computational models to scale-up. Typically, these cases place a large overall demand on the available computer time, need to scale to higher resolutions, use excessive scarce resources such as large memory or bandwidth that limits, or in some cases, need to meet requirements for transition to a separate operational forecasting system, with set time-windows. The model codes include the UK Met Office Unified Model atmospheric model (UM), GFDL's Modular Ocean Model (MOM), both the UK Met Office's GC3 and Australian ACCESS coupled-climate systems (including sea ice), 4D-Var data assimilation and satellite processing, the Regional Ocean Model (ROMS), and WaveWatch3 as well as geophysics codes including hazards, magentuellerics, seismic inversions, and geodesy. Many of these codes use significant compute resources both for research applications as well as within the operational systems. Some of these models are particularly complex, and their behaviour had not been critically analysed for effective use of the NCI supercomputer or how they could be improved. As part of the Program, we have established a common profiling methodology that uses a suite of open source tools for performing scaling analyses. The most challenging cases are profiling multi-model coupled systems where the component models have their own complex algorithms and performance issues. We have also found issues within the current suite of profiling tools, and no single tool fully exposes the nature of the code performance. As a result of this work, international collaborations are now in place to ensure that improvements are incorporated within the community models, and our effort can be targeted in a coordinated way. The coordinations have involved user stakeholders, the model developer community, and dependent software libraries. For example, we have spent significant time characterising I/O scalability, and improving the use of libraries such as NetCDF and HDF5.
Unsteady flow simulations around complex geometries using stationary or rotating unstructured grids

NASA Astrophysics Data System (ADS)

Sezer-Uzol, Nilay

In this research, the computational analysis of three-dimensional, unsteady, separated, vortical flows around complex geometries is studied by using stationary or moving unstructured grids. Two main engineering problems are investigated. The first problem is the unsteady simulation of a ship airwake, where helicopter operations become even more challenging, by using stationary unstructured grids. The second problem is the unsteady simulation of wind turbine rotor flow fields by using moving unstructured grids which are rotating with the whole three-dimensional rigid rotor geometry. The three dimensional, unsteady, parallel, unstructured, finite volume flow solver, PUMA2, is used for the computational fluid dynamics (CFD) simulations considered in this research. The code is modified to have a moving grid capability to perform three-dimensional, time-dependent rotor simulations. An instantaneous log-law wall model for Large Eddy Simulations is also implemented in PUMA2 to investigate the very large Reynolds number flow fields of rotating blades. To verify the code modifications, several sample test cases are also considered. In addition, interdisciplinary studies, which are aiming to provide new tools and insights to the aerospace and wind energy scientific communities, are done during this research by focusing on the coupling of ship airwake CFD simulations with the helicopter flight dynamics and control analysis, the coupling of wind turbine rotor CFD simulations with the aeroacoustic analysis, and the analysis of these time-dependent and large-scale CFD simulations with the help of a computational monitoring, steering and visualization tool, POSSE.
User's manual for semi-circular compact range reflector code

NASA Technical Reports Server (NTRS)

Gupta, Inder J.; Burnside, Walter D.

1986-01-01

A computer code was developed to analyze a semi-circular paraboloidal reflector antenna with a rolled edge at the top and a skirt at the bottom. The code can be used to compute the total near field of the antenna or its individual components at a given distance from the center of the paraboloid. Thus, it is very effective in computing the size of the sweet spot for RCS or antenna measurement. The operation of the code is described. Various input and output statements are explained. Some results obtained using the computer code are presented to illustrate the code's capability as well as being samples of input/output sets.
Highly fault-tolerant parallel computation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Spielman, D.A.

We re-introduce the coded model of fault-tolerant computation in which the input and output of a computational device are treated as words in an error-correcting code. A computational device correctly computes a function in the coded model if its input and output, once decoded, are a valid input and output of the function. In the coded model, it is reasonable to hope to simulate all computational devices by devices whose size is greater by a constant factor but which are exponentially reliable even if each of their components can fail with some constant probability. We consider fine-grained parallel computations inmore » which each processor has a constant probability of producing the wrong output at each time step. We show that any parallel computation that runs for time t on w processors can be performed reliably on a faulty machine in the coded model using w log{sup O(l)} w processors and time t log{sup O(l)} w. The failure probability of the computation will be at most t {center_dot} exp(-w{sup 1/4}). The codes used to communicate with our fault-tolerant machines are generalized Reed-Solomon codes and can thus be encoded and decoded in O(n log{sup O(1)} n) sequential time and are independent of the machine they are used to communicate with. We also show how coded computation can be used to self-correct many linear functions in parallel with arbitrarily small overhead.« less
An emulator for minimizing computer resources for finite element analysis

NASA Technical Reports Server (NTRS)

Melosh, R.; Utku, S.; Islam, M.; Salama, M.

1984-01-01

A computer code, SCOPE, has been developed for predicting the computer resources required for a given analysis code, computer hardware, and structural problem. The cost of running the code is a small fraction (about 3 percent) of the cost of performing the actual analysis. However, its accuracy in predicting the CPU and I/O resources depends intrinsically on the accuracy of calibration data that must be developed once for the computer hardware and the finite element analysis code of interest. Testing of the SCOPE code on the AMDAHL 470 V/8 computer and the ELAS finite element analysis program indicated small I/O errors (3.2 percent), larger CPU errors (17.8 percent), and negligible total errors (1.5 percent).
Numerical modeling of separated flows at moderate Reynolds numbers appropriate for turbine blades and unmanned aero vehicles

NASA Astrophysics Data System (ADS)

Castiglioni, Giacomo

Flows over airfoils and blades in rotating machinery, for unmanned and micro-aerial vehicles, wind turbines, and propellers consist of a laminar boundary layer near the leading edge that is often followed by a laminar separation bubble and transition to turbulence further downstream. Typical Reynolds averaged Navier-Stokes turbulence models are inadequate for such flows. Direct numerical simulation is the most reliable, but is also the most computationally expensive alternative. This work assesses the capability of immersed boundary methods and large eddy simulations to reduce the computational requirements for such flows and still provide high quality results. Two-dimensional and three-dimensional simulations of a laminar separation bubble on a NACA-0012 airfoil at Rec = 5x104 and at 5° of incidence have been performed with an immersed boundary code and a commercial code using body fitted grids. Several sub-grid scale models have been implemented in both codes and their performance evaluated. For the two-dimensional simulations with the immersed boundary method the results show good agreement with the direct numerical simulation benchmark data for the pressure coefficient Cp and the friction coefficient Cf, but only when using dissipative numerical schemes. There is evidence that this behavior can be attributed to the ability of dissipative schemes to damp numerical noise coming from the immersed boundary. For the three-dimensional simulations the results show a good prediction of the separation point, but an inaccurate prediction of the reattachment point unless full direct numerical simulation resolution is used. The commercial code shows good agreement with the direct numerical simulation benchmark data in both two and three-dimensional simulations, but the presence of significant, unquantified numerical dissipation prevents a conclusive assessment of the actual prediction capabilities of very coarse large eddy simulations with low order schemes in general cases. Additionally, a two-dimensional sweep of angles of attack from 0° to 5° is performed showing a qualitative prediction of the jump in lift and drag coefficients due to the appearance of the laminar separation bubble. The numerical dissipation inhibits the predictive capabilities of large eddy simulations whenever it is of the same order of magnitude or larger than the sub-grid scale dissipation. The need to estimate the numerical dissipation is most pressing for low-order methods employed by commercial computational fluid dynamics codes. Following the recent work of Schranner et al., the equations and procedure for estimating the numerical dissipation rate and the numerical viscosity in a commercial code are presented. The method allows for the computation of the numerical dissipation rate and numerical viscosity in the physical space for arbitrary sub-domains in a self-consistent way, using only information provided by the code in question. The method is first tested for a three-dimensional Taylor-Green vortex flow in a simple cubic domain and compared with benchmark results obtained using an accurate, incompressible spectral solver. Afterwards the same procedure is applied for the first time to a realistic flow configuration, specifically to the above discussed laminar separation bubble flow over a NACA 0012 airfoil. The method appears to be quite robust and its application reveals that for the code and the flow in question the numerical dissipation can be significantly larger than the viscous dissipation or the dissipation of the classical Smagorinsky sub-grid scale model, confirming the previously qualitative finding.
Verification of Software: The Textbook and Real Problems

NASA Technical Reports Server (NTRS)

Carlson, Jan-Renee

2006-01-01

The process of verification, or determining the order of accuracy of computational codes, can be problematic when working with large, legacy computational methods that have been used extensively in industry or government. Verification does not ensure that the computer program is producing a physically correct solution, it ensures merely that the observed order of accuracy of solutions are the same as the theoretical order of accuracy. The Method of Manufactured Solutions (MMS) is one of several ways for determining the order of accuracy. MMS is used to verify a series of computer codes progressing in sophistication from "textbook" to "real life" applications. The degree of numerical precision in the computations considerably influenced the range of mesh density to achieve the theoretical order of accuracy even for 1-D problems. The choice of manufactured solutions and mesh form shifted the observed order in specific areas but not in general. Solution residual (iterative) convergence was not always achieved for 2-D Euler manufactured solutions. L(sub 2,norm) convergence differed variable to variable therefore an observed order of accuracy could not be determined conclusively in all cases, the cause of which is currently under investigation.
A Framework for Debugging Geoscience Projects in a High Performance Computing Environment

NASA Astrophysics Data System (ADS)

Baxter, C.; Matott, L.

2012-12-01

High performance computing (HPC) infrastructure has become ubiquitous in today's world with the emergence of commercial cloud computing and academic supercomputing centers. Teams of geoscientists, hydrologists and engineers can take advantage of this infrastructure to undertake large research projects - for example, linking one or more site-specific environmental models with soft computing algorithms, such as heuristic global search procedures, to perform parameter estimation and predictive uncertainty analysis, and/or design least-cost remediation systems. However, the size, complexity and distributed nature of these projects can make identifying failures in the associated numerical experiments using conventional ad-hoc approaches both time- consuming and ineffective. To address these problems a multi-tiered debugging framework has been developed. The framework allows for quickly isolating and remedying a number of potential experimental failures, including: failures in the HPC scheduler; bugs in the soft computing code; bugs in the modeling code; and permissions and access control errors. The utility of the framework is demonstrated via application to a series of over 200,000 numerical experiments involving a suite of 5 heuristic global search algorithms and 15 mathematical test functions serving as cheap analogues for the simulation-based optimization of pump-and-treat subsurface remediation systems.
Numerical Simulation of Transit-Time Ultrasonic Flowmeters by a Direct Approach.

PubMed

Luca, Adrian; Marchiano, Regis; Chassaing, Jean-Camille

2016-06-01

This paper deals with the development of a computational code for the numerical simulation of wave propagation through domains with a complex geometry consisting in both solids and moving fluids. The emphasis is on the numerical simulation of ultrasonic flowmeters (UFMs) by modeling the wave propagation in solids with the equations of linear elasticity (ELE) and in fluids with the linearized Euler equations (LEEs). This approach requires high performance computing because of the high number of degrees of freedom and the long propagation distances. Therefore, the numerical method should be chosen with care. In order to minimize the numerical dissipation which may occur in this kind of configuration, the numerical method employed here is the nodal discontinuous Galerkin (DG) method. Also, this method is well suited for parallel computing. To speed up the code, almost all the computational stages have been implemented to run on graphical processing unit (GPU) by using the compute unified device architecture (CUDA) programming model from NVIDIA. This approach has been validated and then used for the two-dimensional simulation of gas UFMs. The large contrast of acoustic impedance characteristic to gas UFMs makes their simulation a real challenge.
A generalized one-dimensional computer code for turbomachinery cooling passage flow calculations

NASA Technical Reports Server (NTRS)

Kumar, Ganesh N.; Roelke, Richard J.; Meitner, Peter L.

1989-01-01

A generalized one-dimensional computer code for analyzing the flow and heat transfer in the turbomachinery cooling passages was developed. This code is capable of handling rotating cooling passages with turbulators, 180 degree turns, pin fins, finned passages, by-pass flows, tip cap impingement flows, and flow branching. The code is an extension of a one-dimensional code developed by P. Meitner. In the subject code, correlations for both heat transfer coefficient and pressure loss computations were developed to model each of the above mentioned type of coolant passages. The code has the capability of independently computing the friction factor and heat transfer coefficient on each side of a rectangular passage. Either the mass flow at the inlet to the channel or the exit plane pressure can be specified. For a specified inlet total temperature, inlet total pressure, and exit static pressure, the code computers the flow rates through the main branch and the subbranches, flow through tip cap for impingement cooling, in addition to computing the coolant pressure, temperature, and heat transfer coefficient distribution in each coolant flow branch. Predictions from the subject code for both nonrotating and rotating passages agree well with experimental data. The code was used to analyze the cooling passage of a research cooled radial rotor.
VisIVO: A Library and Integrated Tools for Large Astrophysical Dataset Exploration

NASA Astrophysics Data System (ADS)

Becciani, U.; Costa, A.; Ersotelos, N.; Krokos, M.; Massimino, P.; Petta, C.; Vitello, F.

2012-09-01

VisIVO provides an integrated suite of tools and services that can be used in many scientific fields. VisIVO development starts in the Virtual Observatory framework. VisIVO allows users to visualize meaningfully highly-complex, large-scale datasets and create movies of these visualizations based on distributed infrastructures. VisIVO supports high-performance, multi-dimensional visualization of large-scale astrophysical datasets. Users can rapidly obtain meaningful visualizations while preserving full and intuitive control of the relevant parameters. VisIVO consists of VisIVO Desktop - a stand-alone application for interactive visualization on standard PCs, VisIVO Server - a platform for high performance visualization, VisIVO Web - a custom designed web portal, VisIVOSmartphone - an application to exploit the VisIVO Server functionality and the latest VisIVO features: VisIVO Library allows a job running on a computational system (grid, HPC, etc.) to produce movies directly with the code internal data arrays without the need to produce intermediate files. This is particularly important when running on large computational facilities, where the user wants to have a look at the results during the data production phase. For example, in grid computing facilities, images can be produced directly in the grid catalogue while the user code is running in a system that cannot be directly accessed by the user (a worker node). The deployment of VisIVO on the DG and gLite is carried out with the support of EDGI and EGI-Inspire projects. Depending on the structure and size of datasets under consideration, the data exploration process could take several hours of CPU for creating customized views and the production of movies could potentially last several days. For this reason an MPI parallel version of VisIVO could play a fundamental role in increasing performance, e.g. it could be automatically deployed on nodes that are MPI aware. A central concept in our development is thus to produce unified code that can run either on serial nodes or in parallel by using HPC oriented grid nodes. Another important aspect, to obtain as high performance as possible, is the integration of VisIVO processes with grid nodes where GPUs are available. We have selected CUDA for implementing a range of computationally heavy modules. VisIVO is supported by EGI-Inspire, EDGI and SCI-BUS projects.
Efficient Proximity Computation Techniques Using ZIP Code Data for Smart Cities †

PubMed Central

Murdani, Muhammad Harist; Hong, Bonghee

2018-01-01

In this paper, we are interested in computing ZIP code proximity from two perspectives, proximity between two ZIP codes (Ad-Hoc) and neighborhood proximity (Top-K). Such a computation can be used for ZIP code-based target marketing as one of the smart city applications. A naïve approach to this computation is the usage of the distance between ZIP codes. We redefine a distance metric combining the centroid distance with the intersecting road network between ZIP codes by using a weighted sum method. Furthermore, we prove that the results of our combined approach conform to the characteristics of distance measurement. We have proposed a general and heuristic approach for computing Ad-Hoc proximity, while for computing Top-K proximity, we have proposed a general approach only. Our experimental results indicate that our approaches are verifiable and effective in reducing the execution time and search space. PMID:29587366
Efficient Proximity Computation Techniques Using ZIP Code Data for Smart Cities †.

PubMed

Murdani, Muhammad Harist; Kwon, Joonho; Choi, Yoon-Ho; Hong, Bonghee

2018-03-24

In this paper, we are interested in computing ZIP code proximity from two perspectives, proximity between two ZIP codes ( Ad-Hoc ) and neighborhood proximity ( Top-K ). Such a computation can be used for ZIP code-based target marketing as one of the smart city applications. A naïve approach to this computation is the usage of the distance between ZIP codes. We redefine a distance metric combining the centroid distance with the intersecting road network between ZIP codes by using a weighted sum method. Furthermore, we prove that the results of our combined approach conform to the characteristics of distance measurement. We have proposed a general and heuristic approach for computing Ad-Hoc proximity, while for computing Top-K proximity, we have proposed a general approach only. Our experimental results indicate that our approaches are verifiable and effective in reducing the execution time and search space.
Computational study of 3-D hot-spot initiation in shocked insensitive high-explosive

NASA Astrophysics Data System (ADS)

Najjar, F. M.; Howard, W. M.; Fried, L. E.; Manaa, M. R.; Nichols, A., III; Levesque, G.

2012-03-01

High-explosive (HE) material consists of large-sized grains with micron-sized embedded impurities and pores. Under various mechanical/thermal insults, these pores collapse generating hightemperature regions leading to ignition. A hydrodynamic study has been performed to investigate the mechanisms of pore collapse and hot spot initiation in TATB crystals, employing a multiphysics code, ALE3D, coupled to the chemistry module, Cheetah. This computational study includes reactive dynamics. Two-dimensional high-resolution large-scale meso-scale simulations have been performed. The parameter space is systematically studied by considering various shock strengths, pore diameters and multiple pore configurations. Preliminary 3-D simulations are undertaken to quantify the 3-D dynamics.
Three-dimensional computer model for the atmospheric general circulation experiment

NASA Technical Reports Server (NTRS)

Roberts, G. O.

1984-01-01

An efficient, flexible, three-dimensional, hydrodynamic, computer code has been developed for a spherical cap geometry. The code will be used to simulate NASA's Atmospheric General Circulation Experiment (AGCE). The AGCE is a spherical, baroclinic experiment which will model the large-scale dynamics of our atmosphere; it has been proposed to NASA for future Spacelab flights. In the AGCE a radial dielectric body force will simulate gravity, with hot fluid tending to move outwards. In order that this force be dominant, the AGCE must be operated in a low gravity environment such as Spacelab. The full potential of the AGCE will only be realized by working in conjunction with an accurate computer model. Proposed experimental parameter settings will be checked first using model runs. Then actual experimental results will be compared with the model predictions. This interaction between experiment and theory will be very valuable in determining the nature of the AGCE flows and hence their relationship to analytical theories and actual atmospheric dynamics.
Aeroacoustic Simulation of a Nose Landing Gear in an Open Jet Facility Using FUN3D

NASA Technical Reports Server (NTRS)

Vatsa, Veer N.; Lockard, David P.; Khorrami, Mehdi R.; Carlson, Jan-Renee

2012-01-01

Numerical simulations have been performed for a partially-dressed, cavity-closed nose landing gear configuration that was tested in NASA Langley s closed-wall Basic Aerodynamic Research Tunnel (BART) and in the University of Florida s open-jet acoustic facility known as UFAFF. The unstructured-grid flow solver, FUN3D, developed at NASA Langley Research center is used to compute the unsteady flow field for this configuration. A hybrid Reynolds-averaged Navier-Stokes/large eddy simulation (RANS/LES) turbulence model is used for these computations. Time-averaged and instantaneous solutions compare favorably with the measured data. Unsteady flowfield data obtained from the FUN3D code are used as input to a Ffowcs Williams-Hawkings noise propagation code to compute the sound pressure levels at microphones placed in the farfield. Significant improvement in predicted noise levels is obtained when the flowfield data from the open jet UFAFF simulations is used as compared to the case using flowfield data from the closed-wall BART configuration.

Seals Research at Texas A/M University

NASA Technical Reports Server (NTRS)

Morrison, Gerald L.

1991-01-01

The Turbomachinery Laboratory at Texas A&M has been providing experimental data and computational codes for the design seals for many years. The program began with the development of a Halon based seal test rig. This facility provided information about the effective stiffness and damping in whirling seals. The Halon effectively simulated cryogenic fluids. Another test facility was developed (using air as the working fluid) where the stiffness and damping matrices can be determined. This data was used to develop bulk flow models of the seal's effect upon rotating machinery; in conjunction with this research, a bulk flow model for calculation of performance and rotordynamic coefficients of annular pressure seals of arbitrary non-uniform clearance for barotropic fluids such as LH2, LOX, LN2, and CH4 was developed. This program is very efficient (fast) and converges for very large eccentricities. Currently, work is being performed on a bulk flow analysis of the effects of the impeller-shroud interaction upon the stability of pumps. The data was used along with data from other researchers to develop an empirical leakage prediction code for MSFC. Presently, the flow field inside labyrinth and annular seals are being studied in detail. An advanced 3-D Doppler anemometer system is being used to measure the mean velocity and entire Reynolds stress tensor distribution throughout the seals. Concentric and statically eccentric seals were studied; presently, whirling seals are being studied. The data obtained are providing valuable information about the flow phenomena occurring inside the seals, as well as a data base for comparison with numerical predictions and for turbulence model development. A finite difference computer code was developed for solving the Reynolds averaged Navier Stokes equation inside labyrinth seals. A multi-scale k-epsilon turbulence model is currently being evaluated. A new seal geometry was designed and patented using a computer code. A large scale, 2-D seal flow visualization facility is also being developed.
High performance computing in biology: multimillion atom simulations of nanoscale systems

PubMed Central

Sanbonmatsu, K. Y.; Tung, C.-S.

2007-01-01

Computational methods have been used in biology for sequence analysis (bioinformatics), all-atom simulation (molecular dynamics and quantum calculations), and more recently for modeling biological networks (systems biology). Of these three techniques, all-atom simulation is currently the most computationally demanding, in terms of compute load, communication speed, and memory load. Breakthroughs in electrostatic force calculation and dynamic load balancing have enabled molecular dynamics simulations of large biomolecular complexes. Here, we report simulation results for the ribosome, using approximately 2.64 million atoms, the largest all-atom biomolecular simulation published to date. Several other nanoscale systems with different numbers of atoms were studied to measure the performance of the NAMD molecular dynamics simulation program on the Los Alamos National Laboratory Q Machine. We demonstrate that multimillion atom systems represent a 'sweet spot' for the NAMD code on large supercomputers. NAMD displays an unprecedented 85% parallel scaling efficiency for the ribosome system on 1024 CPUs. We also review recent targeted molecular dynamics simulations of the ribosome that prove useful for studying conformational changes of this large biomolecular complex in atomic detail. PMID:17187988
Towards a supported common NEAMS software stack

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cormac Garvey

2012-04-01

The NEAMS IPSC's are developing multidimensional, multiphysics, multiscale simulation codes based on first principles that will be capable of predicting all aspects of current and future nuclear reactor systems. These new breeds of simulation codes will include rigorous verification, validation and uncertainty quantification checks to quantify the accuracy and quality of the simulation results. The resulting NEAMS IPSC simulation codes will be an invaluable tool in designing the next generation of Nuclear Reactors and also contribute to a more speedy process in the acquisition of licenses from the NRC for new Reactor designs. Due to the high resolution of themore » models, the complexity of the physics and the added computational resources to quantify the accuracy/quality of the results, the NEAMS IPSC codes will require large HPC resources to carry out the production simulation runs.« less
Multidimensional Multiphysics Simulation of TRISO Particle Fuel

DOE Office of Scientific and Technical Information (OSTI.GOV)

J. D. Hales; R. L. Williamson; S. R. Novascone

2013-11-01

Multidimensional multiphysics analysis of TRISO-coated particle fuel using the BISON finite-element based nuclear fuels code is described. The governing equations and material models applicable to particle fuel and implemented in BISON are outlined. Code verification based on a recent IAEA benchmarking exercise is described, and excellant comparisons are reported. Multiple TRISO-coated particles of increasing geometric complexity are considered. It is shown that the code's ability to perform large-scale parallel computations permits application to complex 3D phenomena while very efficient solutions for either 1D spherically symmetric or 2D axisymmetric geometries are straightforward. Additionally, the flexibility to easily include new physical andmore » material models and uncomplicated ability to couple to lower length scale simulations makes BISON a powerful tool for simulation of coated-particle fuel. Future code development activities and potential applications are identified.« less
Volume accumulator design analysis computer codes

NASA Technical Reports Server (NTRS)

Whitaker, W. D.; Shimazaki, T. T.

1973-01-01

The computer codes, VANEP and VANES, were written and used to aid in the design and performance calculation of the volume accumulator units (VAU) for the 5-kwe reactor thermoelectric system. VANEP computes the VAU design which meets the primary coolant loop VAU volume and pressure performance requirements. VANES computes the performance of the VAU design, determined from the VANEP code, at the conditions of the secondary coolant loop. The codes can also compute the performance characteristics of the VAU's under conditions of possible modes of failure which still permit continued system operation.
A Computationally Efficient Parallel Levenberg-Marquardt Algorithm for Large-Scale Big-Data Inversion

NASA Astrophysics Data System (ADS)

Lin, Y.; O'Malley, D.; Vesselinov, V. V.

2015-12-01

Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems due to the facts that the observed data sets are often large and model parameters are often numerous, conventional methods for solving the inverse modeling can be computationally expensive. We have developed a new, computationally-efficient Levenberg-Marquardt method for solving large-scale inverse modeling. Levenberg-Marquardt methods require the solution of a dense linear system of equations which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by using these computational techniques. We apply this new inverse modeling method to invert for a random transitivity field. Our algorithm is fast enough to solve for the distributed model parameters (transitivity) at each computational node in the model domain. The inversion is also aided by the use regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programing language that allows for efficient memory management and utilization of high-performance computational resources. By comparing with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a powerful tool for large-scale applications.
"Hour of Code": Can It Change Students' Attitudes toward Programming?

ERIC Educational Resources Information Center

Du, Jie; Wimmer, Hayden; Rada, Roy

2016-01-01

The Hour of Code is a one-hour introduction to computer science organized by Code.org, a non-profit dedicated to expanding participation in computer science. This study investigated the impact of the Hour of Code on students' attitudes towards computer programming and their knowledge of programming. A sample of undergraduate students from two…
Large-scale three-dimensional phase-field simulations for phase coarsening at ultrahigh volume fraction on high-performance architectures

NASA Astrophysics Data System (ADS)

Yan, Hui; Wang, K. G.; Jones, Jim E.

2016-06-01

A parallel algorithm for large-scale three-dimensional phase-field simulations of phase coarsening is developed and implemented on high-performance architectures. From the large-scale simulations, a new kinetics in phase coarsening in the region of ultrahigh volume fraction is found. The parallel implementation is capable of harnessing the greater computer power available from high-performance architectures. The parallelized code enables increase in three-dimensional simulation system size up to a 5123 grid cube. Through the parallelized code, practical runtime can be achieved for three-dimensional large-scale simulations, and the statistical significance of the results from these high resolution parallel simulations are greatly improved over those obtainable from serial simulations. A detailed performance analysis on speed-up and scalability is presented, showing good scalability which improves with increasing problem size. In addition, a model for prediction of runtime is developed, which shows a good agreement with actual run time from numerical tests.
Talking about Code: Integrating Pedagogical Code Reviews into Early Computing Courses

ERIC Educational Resources Information Center

Hundhausen, Christopher D.; Agrawal, Anukrati; Agarwal, Pawan

2013-01-01

Given the increasing importance of soft skills in the computing profession, there is good reason to provide students withmore opportunities to learn and practice those skills in undergraduate computing courses. Toward that end, we have developed an active learning approach for computing education called the "Pedagogical Code Review"…
Staggered-grid finite-difference acoustic modeling with the Time-Domain Atmospheric Acoustic Propagation Suite (TDAAPS).

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aldridge, David Franklin; Collier, Sandra L.; Marlin, David H.

2005-05-01

This document is intended to serve as a users guide for the time-domain atmospheric acoustic propagation suite (TDAAPS) program developed as part of the Department of Defense High-Performance Modernization Office (HPCMP) Common High-Performance Computing Scalable Software Initiative (CHSSI). TDAAPS performs staggered-grid finite-difference modeling of the acoustic velocity-pressure system with the incorporation of spatially inhomogeneous winds. Wherever practical the control structure of the codes are written in C++ using an object oriented design. Sections of code where a large number of calculations are required are written in C or F77 in order to enable better compiler optimization of these sections. Themore » TDAAPS program conforms to a UNIX style calling interface. Most of the actions of the codes are controlled by adding flags to the invoking command line. This document presents a large number of examples and provides new users with the necessary background to perform acoustic modeling with TDAAPS.« less
Validation of radiative transfer computation with Monte Carlo method for ultra-relativistic background flow

NASA Astrophysics Data System (ADS)

Ishii, Ayako; Ohnishi, Naofumi; Nagakura, Hiroki; Ito, Hirotaka; Yamada, Shoichi

2017-11-01

We developed a three-dimensional radiative transfer code for an ultra-relativistic background flow-field by using the Monte Carlo (MC) method in the context of gamma-ray burst (GRB) emission. For obtaining reliable simulation results in the coupled computation of MC radiation transport with relativistic hydrodynamics which can reproduce GRB emission, we validated radiative transfer computation in the ultra-relativistic regime and assessed the appropriate simulation conditions. The radiative transfer code was validated through two test calculations: (1) computing in different inertial frames and (2) computing in flow-fields with discontinuous and smeared shock fronts. The simulation results of the angular distribution and spectrum were compared among three different inertial frames and in good agreement with each other. If the time duration for updating the flow-field was sufficiently small to resolve a mean free path of a photon into ten steps, the results were thoroughly converged. The spectrum computed in the flow-field with a discontinuous shock front obeyed a power-law in frequency whose index was positive in the range from 1 to 10 MeV. The number of photons in the high-energy side decreased with the smeared shock front because the photons were less scattered immediately behind the shock wave due to the small electron number density. The large optical depth near the shock front was needed for obtaining high-energy photons through bulk Compton scattering. Even one-dimensional structure of the shock wave could affect the results of radiation transport computation. Although we examined the effect of the shock structure on the emitted spectrum with a large number of cells, it is hard to employ so many computational cells per dimension in multi-dimensional simulations. Therefore, a further investigation with a smaller number of cells is required for obtaining realistic high-energy photons with multi-dimensional computations.
beachmat: A Bioconductor C++ API for accessing high-throughput biological data from a variety of R matrix types

PubMed Central

Pagès, Hervé

2018-01-01

Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set. PMID:29723188
A surface code quantum computer in silicon

PubMed Central

Hill, Charles D.; Peretz, Eldad; Hile, Samuel J.; House, Matthew G.; Fuechsle, Martin; Rogge, Sven; Simmons, Michelle Y.; Hollenberg, Lloyd C. L.

2015-01-01

The exceptionally long quantum coherence times of phosphorus donor nuclear spin qubits in silicon, coupled with the proven scalability of silicon-based nano-electronics, make them attractive candidates for large-scale quantum computing. However, the high threshold of topological quantum error correction can only be captured in a two-dimensional array of qubits operating synchronously and in parallel—posing formidable fabrication and control challenges. We present an architecture that addresses these problems through a novel shared-control paradigm that is particularly suited to the natural uniformity of the phosphorus donor nuclear spin qubit states and electronic confinement. The architecture comprises a two-dimensional lattice of donor qubits sandwiched between two vertically separated control layers forming a mutually perpendicular crisscross gate array. Shared-control lines facilitate loading/unloading of single electrons to specific donors, thereby activating multiple qubits in parallel across the array on which the required operations for surface code quantum error correction are carried out by global spin control. The complexities of independent qubit control, wave function engineering, and ad hoc quantum interconnects are explicitly avoided. With many of the basic elements of fabrication and control based on demonstrated techniques and with simulated quantum operation below the surface code error threshold, the architecture represents a new pathway for large-scale quantum information processing in silicon and potentially in other qubit systems where uniformity can be exploited. PMID:26601310
A surface code quantum computer in silicon.

PubMed

Hill, Charles D; Peretz, Eldad; Hile, Samuel J; House, Matthew G; Fuechsle, Martin; Rogge, Sven; Simmons, Michelle Y; Hollenberg, Lloyd C L

2015-10-01

The exceptionally long quantum coherence times of phosphorus donor nuclear spin qubits in silicon, coupled with the proven scalability of silicon-based nano-electronics, make them attractive candidates for large-scale quantum computing. However, the high threshold of topological quantum error correction can only be captured in a two-dimensional array of qubits operating synchronously and in parallel-posing formidable fabrication and control challenges. We present an architecture that addresses these problems through a novel shared-control paradigm that is particularly suited to the natural uniformity of the phosphorus donor nuclear spin qubit states and electronic confinement. The architecture comprises a two-dimensional lattice of donor qubits sandwiched between two vertically separated control layers forming a mutually perpendicular crisscross gate array. Shared-control lines facilitate loading/unloading of single electrons to specific donors, thereby activating multiple qubits in parallel across the array on which the required operations for surface code quantum error correction are carried out by global spin control. The complexities of independent qubit control, wave function engineering, and ad hoc quantum interconnects are explicitly avoided. With many of the basic elements of fabrication and control based on demonstrated techniques and with simulated quantum operation below the surface code error threshold, the architecture represents a new pathway for large-scale quantum information processing in silicon and potentially in other qubit systems where uniformity can be exploited.
beachmat: A Bioconductor C++ API for accessing high-throughput biological data from a variety of R matrix types.

PubMed

Lun, Aaron T L; Pagès, Hervé; Smith, Mike L

2018-05-01

Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.
Xyce Parallel Electronic Simulator Users' Guide Version 6.8

DOE Office of Scientific and Technical Information (OSTI.GOV)

Keiter, Eric R.; Aadithya, Karthik Venkatraman; Mei, Ting

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been de- signed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: Capability to solve extremely large circuit problems by supporting large-scale parallel com- puting platforms (up to thousands of processors). This includes support for most popular parallel and serial computers. A differential-algebraic-equation (DAE) formulation, which better isolates the device model package from solver algorithms. This allows onemore » to develop new types of analysis without requiring the implementation of analysis-specific device models. Device models that are specifically tailored to meet Sandia's needs, including some radiation- aware devices (for Sandia users only). Object-oriented code design and implementation using modern coding practices. Xyce is a parallel code in the most general sense of the phrase$-$ a message passing parallel implementation $-$ which allows it to run efficiently a wide range of computing platforms. These include serial, shared-memory and distributed-memory parallel platforms. Attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows.« less
Guidelines for developing vectorizable computer programs

NASA Technical Reports Server (NTRS)

Miner, E. W.

1982-01-01

Some fundamental principles for developing computer programs which are compatible with array-oriented computers are presented. The emphasis is on basic techniques for structuring computer codes which are applicable in FORTRAN and do not require a special programming language or exact a significant penalty on a scalar computer. Researchers who are using numerical techniques to solve problems in engineering can apply these basic principles and thus develop transportable computer programs (in FORTRAN) which contain much vectorizable code. The vector architecture of the ASC is discussed so that the requirements of array processing can be better appreciated. The "vectorization" of a finite-difference viscous shock-layer code is used as an example to illustrate the benefits and some of the difficulties involved. Increases in computing speed with vectorization are illustrated with results from the viscous shock-layer code and from a finite-element shock tube code. The applicability of these principles was substantiated through running programs on other computers with array-associated computing characteristics, such as the Hewlett-Packard (H-P) 1000-F.
The Helicopter Antenna Radiation Prediction Code (HARP)

NASA Technical Reports Server (NTRS)

Klevenow, F. T.; Lynch, B. G.; Newman, E. H.; Rojas, R. G.; Scheick, J. T.; Shamansky, H. T.; Sze, K. Y.

1990-01-01

The first nine months effort in the development of a user oriented computer code, referred to as the HARP code, for analyzing the radiation from helicopter antennas is described. The HARP code uses modern computer graphics to aid in the description and display of the helicopter geometry. At low frequencies the helicopter is modeled by polygonal plates, and the method of moments is used to compute the desired patterns. At high frequencies the helicopter is modeled by a composite ellipsoid and flat plates, and computations are made using the geometrical theory of diffraction. The HARP code will provide a user friendly interface, employing modern computer graphics, to aid the user to describe the helicopter geometry, select the method of computation, construct the desired high or low frequency model, and display the results.
Parallel computation of transverse wakes in linear colliders

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhan, Xiaowei; Ko, Kwok

1996-11-01

SLAC has proposed the detuned structure (DS) as one possible design to control the emittance growth of long bunch trains due to transverse wakefields in the Next Linear Collider (NLC). The DS consists of 206 cells with tapering from cell to cell of the order of few microns to provide Gaussian detuning of the dipole modes. The decoherence of these modes leads to two orders of magnitude reduction in wakefield experienced by the trailing bunch. To model such a large heterogeneous structure realistically is impractical with finite-difference codes using structured grids. The authors have calculated the wakefield in the DSmore » on a parallel computer with a finite-element code using an unstructured grid. The parallel implementation issues are presented along with simulation results that include contributions from higher dipole bands and wall dissipation.« less
Investigations of flowfields found in typical combustor geometries

NASA Technical Reports Server (NTRS)

Lilley, D. G.

1982-01-01

Measurements and computations are being applied to an axisymmetric swirling flow, emerging from swirl vanes at angle phi, entering a large chamber test section via a sudden expansion of various side-wall angles alpha. New features are: the turbulence measurements are being performed on swirling as well as nonswirling flow; and all measurements and computations are also being performed on a confined jet flowfield with realistic downstream blockage. Recent activity falls into three categories: (1) Time-mean flowfield characterization by five-hole pitot probe measurements and by flow visualization; (2) Turbulence measurements by a variety of single- and multi-wire hot-wire probe techniques; and (3) Flowfield computations using the computer code developed during the previous year's research program.

Enhanced fault-tolerant quantum computing in d-level systems.

PubMed

Campbell, Earl T

2014-12-05

Error-correcting codes protect quantum information and form the basis of fault-tolerant quantum computing. Leading proposals for fault-tolerant quantum computation require codes with an exceedingly rare property, a transversal non-Clifford gate. Codes with the desired property are presented for d-level qudit systems with prime d. The codes use n=d-1 qudits and can detect up to ∼d/3 errors. We quantify the performance of these codes for one approach to quantum computation known as magic-state distillation. Unlike prior work, we find performance is always enhanced by increasing d.
Convergence acceleration of the Proteus computer code with multigrid methods

NASA Technical Reports Server (NTRS)

Demuren, A. O.; Ibraheem, S. O.

1992-01-01

Presented here is the first part of a study to implement convergence acceleration techniques based on the multigrid concept in the Proteus computer code. A review is given of previous studies on the implementation of multigrid methods in computer codes for compressible flow analysis. Also presented is a detailed stability analysis of upwind and central-difference based numerical schemes for solving the Euler and Navier-Stokes equations. Results are given of a convergence study of the Proteus code on computational grids of different sizes. The results presented here form the foundation for the implementation of multigrid methods in the Proteus code.
Implementation of radiation shielding calculation methods. Volume 1: Synopsis of methods and summary of results

NASA Technical Reports Server (NTRS)

Capo, M. A.; Disney, R. K.

1971-01-01

The work performed in the following areas is summarized: (1) Analysis of Realistic nuclear-propelled vehicle was analyzed using the Marshall Space Flight Center computer code package. This code package includes one and two dimensional discrete ordinate transport, point kernel, and single scatter techniques, as well as cross section preparation and data processing codes, (2) Techniques were developed to improve the automated data transfer in the coupled computation method of the computer code package and improve the utilization of this code package on the Univac-1108 computer system. (3) The MSFC master data libraries were updated.
Nonuniform code concatenation for universal fault-tolerant quantum computing

NASA Astrophysics Data System (ADS)

Nikahd, Eesa; Sedighi, Mehdi; Saheb Zamani, Morteza

2017-09-01

Using transversal gates is a straightforward and efficient technique for fault-tolerant quantum computing. Since transversal gates alone cannot be computationally universal, they must be combined with other approaches such as magic state distillation, code switching, or code concatenation to achieve universality. In this paper we propose an alternative approach for universal fault-tolerant quantum computing, mainly based on the code concatenation approach proposed in [T. Jochym-O'Connor and R. Laflamme, Phys. Rev. Lett. 112, 010505 (2014), 10.1103/PhysRevLett.112.010505], but in a nonuniform fashion. The proposed approach is described based on nonuniform concatenation of the 7-qubit Steane code with the 15-qubit Reed-Muller code, as well as the 5-qubit code with the 15-qubit Reed-Muller code, which lead to two 49-qubit and 47-qubit codes, respectively. These codes can correct any arbitrary single physical error with the ability to perform a universal set of fault-tolerant gates, without using magic state distillation.
Rosen's (M,R) system in Unified Modelling Language.

PubMed

Zhang, Ling; Williams, Richard A; Gatherer, Derek

2016-01-01

Robert Rosen's (M,R) system is an abstract biological network architecture that is allegedly non-computable on a Turing machine. If (M,R) is truly non-computable, there are serious implications for the modelling of large biological networks in computer software. A body of work has now accumulated addressing Rosen's claim concerning (M,R) by attempting to instantiate it in various software systems. However, a conclusive refutation has remained elusive, principally since none of the attempts to date have unambiguously avoided the critique that they have altered the properties of (M,R) in the coding process, producing merely approximate simulations of (M,R) rather than true computational models. In this paper, we use the Unified Modelling Language (UML), a diagrammatic notation standard, to express (M,R) as a system of objects having attributes, functions and relations. We believe that this instantiates (M,R) in such a way than none of the original properties of the system are corrupted in the process. Crucially, we demonstrate that (M,R) as classically represented in the relational biology literature is implicitly a UML communication diagram. Furthermore, since UML is formally compatible with object-oriented computing languages, instantiation of (M,R) in UML strongly implies its computability in object-oriented coding languages. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Solving large scale structure in ten easy steps with COLA

NASA Astrophysics Data System (ADS)

Tassev, Svetlin; Zaldarriaga, Matias; Eisenstein, Daniel J.

2013-06-01

We present the COmoving Lagrangian Acceleration (COLA) method: an N-body method for solving for Large Scale Structure (LSS) in a frame that is comoving with observers following trajectories calculated in Lagrangian Perturbation Theory (LPT). Unlike standard N-body methods, the COLA method can straightforwardly trade accuracy at small-scales in order to gain computational speed without sacrificing accuracy at large scales. This is especially useful for cheaply generating large ensembles of accurate mock halo catalogs required to study galaxy clustering and weak lensing, as those catalogs are essential for performing detailed error analysis for ongoing and future surveys of LSS. As an illustration, we ran a COLA-based N-body code on a box of size 100 Mpc/h with particles of mass ≈ 5 × 109Msolar/h. Running the code with only 10 timesteps was sufficient to obtain an accurate description of halo statistics down to halo masses of at least 1011Msolar/h. This is only at a modest speed penalty when compared to mocks obtained with LPT. A standard detailed N-body run is orders of magnitude slower than our COLA-based code. The speed-up we obtain with COLA is due to the fact that we calculate the large-scale dynamics exactly using LPT, while letting the N-body code solve for the small scales, without requiring it to capture exactly the internal dynamics of halos. Achieving a similar level of accuracy in halo statistics without the COLA method requires at least 3 times more timesteps than when COLA is employed.
Solution of large nonlinear quasistatic structural mechanics problems on distributed-memory multiprocessor computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blanford, M.

1997-12-31

Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach becomes prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today`s multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemblemore » a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.« less
An algorithm for continuum modeling of rocks with multiple embedded nonlinearly-compliant joints [Continuum modeling of elasto-plastic media with multiple embedded nonlinearly-compliant joints

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hurley, R. C.; Vorobiev, O. Y.; Ezzedine, S. M.

Here, we present a numerical method for modeling the mechanical effects of nonlinearly-compliant joints in elasto-plastic media. The method uses a series of strain-rate and stress update algorithms to determine joint closure, slip, and solid stress within computational cells containing multiple “embedded” joints. This work facilitates efficient modeling of nonlinear wave propagation in large spatial domains containing a large number of joints that affect bulk mechanical properties. We implement the method within the massively parallel Lagrangian code GEODYN-L and provide verification and examples. We highlight the ability of our algorithms to capture joint interactions and multiple weakness planes within individualmore » computational cells, as well as its computational efficiency. We also discuss the motivation for developing the proposed technique: to simulate large-scale wave propagation during the Source Physics Experiments (SPE), a series of underground explosions conducted at the Nevada National Security Site (NNSS).« less
An algorithm for continuum modeling of rocks with multiple embedded nonlinearly-compliant joints [Continuum modeling of elasto-plastic media with multiple embedded nonlinearly-compliant joints

DOE PAGES

Hurley, R. C.; Vorobiev, O. Y.; Ezzedine, S. M.

2017-04-06

Here, we present a numerical method for modeling the mechanical effects of nonlinearly-compliant joints in elasto-plastic media. The method uses a series of strain-rate and stress update algorithms to determine joint closure, slip, and solid stress within computational cells containing multiple “embedded” joints. This work facilitates efficient modeling of nonlinear wave propagation in large spatial domains containing a large number of joints that affect bulk mechanical properties. We implement the method within the massively parallel Lagrangian code GEODYN-L and provide verification and examples. We highlight the ability of our algorithms to capture joint interactions and multiple weakness planes within individualmore » computational cells, as well as its computational efficiency. We also discuss the motivation for developing the proposed technique: to simulate large-scale wave propagation during the Source Physics Experiments (SPE), a series of underground explosions conducted at the Nevada National Security Site (NNSS).« less
Evaluation of a Multicore-Optimized Implementation for Tomographic Reconstruction

PubMed Central

Agulleiro, Jose-Ignacio; Fernández, José Jesús

2012-01-01

Tomography allows elucidation of the three-dimensional structure of an object from a set of projection images. In life sciences, electron microscope tomography is providing invaluable information about the cell structure at a resolution of a few nanometres. Here, large images are required to combine wide fields of view with high resolution requirements. The computational complexity of the algorithms along with the large image size then turns tomographic reconstruction into a computationally demanding problem. Traditionally, high-performance computing techniques have been applied to cope with such demands on supercomputers, distributed systems and computer clusters. In the last few years, the trend has turned towards graphics processing units (GPUs). Here we present a detailed description and a thorough evaluation of an alternative approach that relies on exploitation of the power available in modern multicore computers. The combination of single-core code optimization, vector processing, multithreading and efficient disk I/O operations succeeds in providing fast tomographic reconstructions on standard computers. The approach turns out to be competitive with the fastest GPU-based solutions thus far. PMID:23139768
Characteristics code for shock initiation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Partom, Y.

1986-10-01

We developed SHIN, a characteristics code for shock initiation studies. We describe in detail the equations of state, reaction model, rate equations, and numerical difference equations that SHIN incorporates. SHIN uses the previously developed surface burning reaction model which better represents the shock initiation process in TATB, than do bulk reaction models. A large number of computed simulations prove the code is a reliable and efficient tool for shock initiation studies. A parametric study shows the effect on build-up and run distance to detonation of (1) type of boundary condtion, (2) burning velocity curve, (3) shock duration, (4) rise timemore » in ramp loading, (5) initial density (or porosity) of the explosive, (6) initial temperature, and (7) grain size. 29 refs., 65 figs.« less
Green's function methods in heavy ion shielding

NASA Technical Reports Server (NTRS)

Wilson, John W.; Costen, Robert C.; Shinn, Judy L.; Badavi, Francis F.

1993-01-01

An analytic solution to the heavy ion transport in terms of Green's function is used to generate a highly efficient computer code for space applications. The efficiency of the computer code is accomplished by a nonperturbative technique extending Green's function over the solution domain. The computer code can also be applied to accelerator boundary conditions to allow code validation in laboratory experiments.
Final Report Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery

DOE Office of Scientific and Technical Information (OSTI.GOV)

O'Leary, Patrick

The primary challenge motivating this project is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who can perform analysis only on a small fraction of the data they calculate, resulting in the substantial likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, which is known as in situ processing. The idea in situ processing was not new at the time ofmore » the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by Department of Energy (DOE) science projects. Our objective was to produce and enable the use of production-quality in situ methods and infrastructure, at scale, on DOE high-performance computing (HPC) facilities, though we expected to have an impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve this objective, we engaged in software technology research and development (R&D), in close partnerships with DOE science code teams, to produce software technologies that were shown to run efficiently at scale on DOE HPC platforms.« less
A study of transonic aerodynamic analysis methods for use with a hypersonic aircraft synthesis code

NASA Technical Reports Server (NTRS)

Sandlin, Doral R.; Davis, Paul Christopher

1992-01-01

A means of performing routine transonic lift, drag, and moment analyses on hypersonic all-body and wing-body configurations were studied. The analysis method is to be used in conjunction with the Hypersonic Vehicle Optimization Code (HAVOC). A review of existing techniques is presented, after which three methods, chosen to represent a spectrum of capabilities, are tested and the results are compared with experimental data. The three methods consist of a wave drag code, a full potential code, and a Navier-Stokes code. The wave drag code, representing the empirical approach, has very fast CPU times, but very limited and sporadic results. The full potential code provides results which compare favorably to the wind tunnel data, but with a dramatic increase in computational time. Even more extreme is the Navier-Stokes code, which provides the most favorable and complete results, but with a very large turnaround time. The full potential code, TRANAIR, is used for additional analyses, because of the superior results it can provide over empirical and semi-empirical methods, and because of its automated grid generation. TRANAIR analyses include an all body hypersonic cruise configuration and an oblique flying wing supersonic transport.
Analytical modeling of operating characteristics of premixing-prevaporizing fuel-air mixing passages. Volume 2: User's manual

NASA Technical Reports Server (NTRS)

Anderson, O. L.; Chiappetta, L. M.; Edwards, D. E.; Mcvey, J. B.

1982-01-01

A user's manual describing the operation of three computer codes (ADD code, PTRAK code, and VAPDIF code) is presented. The general features of the computer codes, the input/output formats, run streams, and sample input cases are described.
Attacks on quantum key distribution protocols that employ non-ITS authentication

NASA Astrophysics Data System (ADS)

Pacher, C.; Abidin, A.; Lorünser, T.; Peev, M.; Ursin, R.; Zeilinger, A.; Larsson, J.-Å.

2016-01-01

We demonstrate how adversaries with large computing resources can break quantum key distribution (QKD) protocols which employ a particular message authentication code suggested previously. This authentication code, featuring low key consumption, is not information-theoretically secure (ITS) since for each message the eavesdropper has intercepted she is able to send a different message from a set of messages that she can calculate by finding collisions of a cryptographic hash function. However, when this authentication code was introduced, it was shown to prevent straightforward man-in-the-middle (MITM) attacks against QKD protocols. In this paper, we prove that the set of messages that collide with any given message under this authentication code contains with high probability a message that has small Hamming distance to any other given message. Based on this fact, we present extended MITM attacks against different versions of BB84 QKD protocols using the addressed authentication code; for three protocols, we describe every single action taken by the adversary. For all protocols, the adversary can obtain complete knowledge of the key, and for most protocols her success probability in doing so approaches unity. Since the attacks work against all authentication methods which allow to calculate colliding messages, the underlying building blocks of the presented attacks expose the potential pitfalls arising as a consequence of non-ITS authentication in QKD post-processing. We propose countermeasures, increasing the eavesdroppers demand for computational power, and also prove necessary and sufficient conditions for upgrading the discussed authentication code to the ITS level.
Star and Planet Formation through Cosmic Time

NASA Astrophysics Data System (ADS)

Lee, Aaron Thomas

The computational advances of the past several decades have allowed theoretical astrophysics to proceed at a dramatic pace. Numerical simulations can now simulate the formation of individual molecules all the way up to the evolution of the entire universe. Observational astrophysics is producing data at a prodigious rate, and sophisticated analysis techniques of large data sets continue to be developed. It is now possible for terabytes of data to be effectively turned into stunning astrophysical results. This is especially true for the field of star and planet formation. Theorists are now simulating the formation of individual planets and stars, and observing facilities are finally capturing snapshots of these processes within the Milky Way galaxy and other galaxies. While a coherent theory remains incomplete, great strides have been made toward this goal. This dissertation discusses several projects that develop models of star and planet forma- tion. This work spans large spatial and temporal scales: from the AU-scale of protoplanetary disks all the way up to the parsec-scale of star-forming clouds, and taking place in both contemporary environments like the Milky Way galaxy and primordial environments at redshifts of z 20. Particularly, I show that planet formation need not proceed in incremental stages, where planets grow from millimeter-sized dust grains all the way up to planets, but instead can proceed directly from small dust grains to large kilometer-sized boulders. The requirements for this model to operate effectively are supported by observations. Additionally, I draw suspicion toward one model for how you form high mass stars (stars with masses exceeding 8 Msun), which postulates that high-mass stars are built up from the gradual accretion of mass from the cloud onto low-mass stars. I show that magnetic fields in star forming clouds thwart this transfer of mass, and instead it is likely that high mass stars are created from the gravitational collapse of large clouds. This work also provides a sub-grid model for computational codes that employ sink particles accreting from magnetized gas. Finally, I analyze the role that radiation plays in determining the final masses of the first stars to ever form in the universe. These stars formed in starkly different environments than stars form in today, and the role of the direct radiation from these stars turns out to be a crucial component of primordial star formation theory. These projects use a variety of computational tools, including the use of spectral hydrodynamics codes, magneto-hydrodynamics grid codes that employ adaptive mesh refinement techniques, and long characteristic ray tracing methods. I develop and describe a long characteristic ray tracing method for modeling hydrogen-ionizing radiation from stars. Additionally, I have developed Monte Carlo routines that convert hydrodynamic data used in smoothed particle hydrodynamics codes for use in grid-based codes. Both of these advances will find use beyond simulations of star and planet formation and benefit the astronomical community at large.
Automated apparatus and method of generating native code for a stitching machine

NASA Technical Reports Server (NTRS)

Miller, Jeffrey L. (Inventor)

2000-01-01

A computer system automatically generates CNC code for a stitching machine. The computer determines the locations of a present stitching point and a next stitching point. If a constraint is not found between the present stitching point and the next stitching point, the computer generates code for making a stitch at the next stitching point. If a constraint is found, the computer generates code for changing a condition (e.g., direction) of the stitching machine's stitching head.
A Fast Optimization Method for General Binary Code Learning.

PubMed

Shen, Fumin; Zhou, Xiang; Yang, Yang; Song, Jingkuan; Shen, Heng; Tao, Dacheng

2016-09-22

Hashing or binary code learning has been recognized to accomplish efficient near neighbor search, and has thus attracted broad interests in recent retrieval, vision and learning studies. One main challenge of learning to hash arises from the involvement of discrete variables in binary code optimization. While the widely-used continuous relaxation may achieve high learning efficiency, the pursued codes are typically less effective due to accumulated quantization error. In this work, we propose a novel binary code optimization method, dubbed Discrete Proximal Linearized Minimization (DPLM), which directly handles the discrete constraints during the learning process. Specifically, the discrete (thus nonsmooth nonconvex) problem is reformulated as minimizing the sum of a smooth loss term with a nonsmooth indicator function. The obtained problem is then efficiently solved by an iterative procedure with each iteration admitting an analytical discrete solution, which is thus shown to converge very fast. In addition, the proposed method supports a large family of empirical loss functions, which is particularly instantiated in this work by both a supervised and an unsupervised hashing losses, together with the bits uncorrelation and balance constraints. In particular, the proposed DPLM with a supervised `2 loss encodes the whole NUS-WIDE database into 64-bit binary codes within 10 seconds on a standard desktop computer. The proposed approach is extensively evaluated on several large-scale datasets and the generated binary codes are shown to achieve very promising results on both retrieval and classification tasks.
Tinker-HP: a massively parallel molecular dynamics package for multiscale simulations of large complex systems with advanced point dipole polarizable force fields.

PubMed

Lagardère, Louis; Jolly, Luc-Henri; Lipparini, Filippo; Aviat, Félix; Stamm, Benjamin; Jing, Zhifeng F; Harger, Matthew; Torabifard, Hedieh; Cisneros, G Andrés; Schnieders, Michael J; Gresh, Nohad; Maday, Yvon; Ren, Pengyu Y; Ponder, Jay W; Piquemal, Jean-Philip

2018-01-28

We present Tinker-HP, a massively MPI parallel package dedicated to classical molecular dynamics (MD) and to multiscale simulations, using advanced polarizable force fields (PFF) encompassing distributed multipoles electrostatics. Tinker-HP is an evolution of the popular Tinker package code that conserves its simplicity of use and its reference double precision implementation for CPUs. Grounded on interdisciplinary efforts with applied mathematics, Tinker-HP allows for long polarizable MD simulations on large systems up to millions of atoms. We detail in the paper the newly developed extension of massively parallel 3D spatial decomposition to point dipole polarizable models as well as their coupling to efficient Krylov iterative and non-iterative polarization solvers. The design of the code allows the use of various computer systems ranging from laboratory workstations to modern petascale supercomputers with thousands of cores. Tinker-HP proposes therefore the first high-performance scalable CPU computing environment for the development of next generation point dipole PFFs and for production simulations. Strategies linking Tinker-HP to Quantum Mechanics (QM) in the framework of multiscale polarizable self-consistent QM/MD simulations are also provided. The possibilities, performances and scalability of the software are demonstrated via benchmarks calculations using the polarizable AMOEBA force field on systems ranging from large water boxes of increasing size and ionic liquids to (very) large biosystems encompassing several proteins as well as the complete satellite tobacco mosaic virus and ribosome structures. For small systems, Tinker-HP appears to be competitive with the Tinker-OpenMM GPU implementation of Tinker. As the system size grows, Tinker-HP remains operational thanks to its access to distributed memory and takes advantage of its new algorithmic enabling for stable long timescale polarizable simulations. Overall, a several thousand-fold acceleration over a single-core computation is observed for the largest systems. The extension of the present CPU implementation of Tinker-HP to other computational platforms is discussed.

Computer codes developed and under development at Lewis

NASA Technical Reports Server (NTRS)

Chamis, Christos C.

1992-01-01

The objective of this summary is to provide a brief description of: (1) codes developed or under development at LeRC; and (2) the development status of IPACS with some typical early results. The computer codes that have been developed and/or are under development at LeRC are listed in the accompanying charts. This list includes: (1) the code acronym; (2) select physics descriptors; (3) current enhancements; and (4) present (9/91) code status with respect to its availability and documentation. The computer codes list is grouped by related functions such as: (1) composite mechanics; (2) composite structures; (3) integrated and 3-D analysis; (4) structural tailoring; and (5) probabilistic structural analysis. These codes provide a broad computational simulation infrastructure (technology base-readiness) for assessing the structural integrity/durability/reliability of propulsion systems. These codes serve two other very important functions: they provide an effective means of technology transfer; and they constitute a depository of corporate memory.
Implementation and Characterization of Three-Dimensional Particle-in-Cell Codes on Multiple-Instruction-Multiple-Data Massively Parallel Supercomputers

NASA Technical Reports Server (NTRS)

Lyster, P. M.; Liewer, P. C.; Decyk, V. K.; Ferraro, R. D.

1995-01-01

A three-dimensional electrostatic particle-in-cell (PIC) plasma simulation code has been developed on coarse-grain distributed-memory massively parallel computers with message passing communications. Our implementation is the generalization to three-dimensions of the general concurrent particle-in-cell (GCPIC) algorithm. In the GCPIC algorithm, the particle computation is divided among the processors using a domain decomposition of the simulation domain. In a three-dimensional simulation, the domain can be partitioned into one-, two-, or three-dimensional subdomains ("slabs," "rods," or "cubes") and we investigate the efficiency of the parallel implementation of the push for all three choices. The present implementation runs on the Intel Touchstone Delta machine at Caltech; a multiple-instruction-multiple-data (MIMD) parallel computer with 512 nodes. We find that the parallel efficiency of the push is very high, with the ratio of communication to computation time in the range 0.3%-10.0%. The highest efficiency (> 99%) occurs for a large, scaled problem with 64(sup 3) particles per processing node (approximately 134 million particles of 512 nodes) which has a push time of about 250 ns per particle per time step. We have also developed expressions for the timing of the code which are a function of both code parameters (number of grid points, particles, etc.) and machine-dependent parameters (effective FLOP rate, and the effective interprocessor bandwidths for the communication of particles and grid points). These expressions can be used to estimate the performance of scaled problems--including those with inhomogeneous plasmas--to other parallel machines once the machine-dependent parameters are known.
Bioinformatics of prokaryotic RNAs

PubMed Central

Backofen, Rolf; Amman, Fabian; Costa, Fabrizio; Findeiß, Sven; Richter, Andreas S; Stadler, Peter F

2014-01-01

The genome of most prokaryotes gives rise to surprisingly complex transcriptomes, comprising not only protein-coding mRNAs, often organized as operons, but also harbors dozens or even hundreds of highly structured small regulatory RNAs and unexpectedly large levels of anti-sense transcripts. Comprehensive surveys of prokaryotic transcriptomes and the need to characterize also their non-coding components is heavily dependent on computational methods and workflows, many of which have been developed or at least adapted specifically for the use with bacterial and archaeal data. This review provides an overview on the state-of-the-art of RNA bioinformatics focusing on applications to prokaryotes. PMID:24755880
Development of a CRAY 1 version of the SINDA program. [thermo-structural analyzer program

NASA Technical Reports Server (NTRS)

Juba, S. M.; Fogerson, P. E.

1982-01-01

The SINDA thermal analyzer program was transferred from the UNIVAC 1110 computer to a CYBER And then to a CRAY 1. Significant changes to the code of the program were required in order to execute efficiently on the CYBER and CRAY. The program was tested on the CRAY using a thermal math model of the shuttle which was too large to run on either the UNIVAC or CYBER. An effort was then begun to further modify the code of SINDA in order to make effective use of the vector capabilities of the CRAY.
Extending the imaging volume for biometric iris recognition.

PubMed

Narayanswamy, Ramkumar; Johnson, Gregory E; Silveira, Paulo E X; Wach, Hans B

2005-02-10

The use of the human iris as a biometric has recently attracted significant interest in the area of security applications. The need to capture an iris without active user cooperation places demands on the optical system. Unlike a traditional optical design, in which a large imaging volume is traded off for diminished imaging resolution and capacity for collecting light, Wavefront Coded imaging is a computational imaging technology capable of expanding the imaging volume while maintaining an accurate and robust iris identification capability. We apply Wavefront Coded imaging to extend the imaging volume of the iris recognition application.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Mohr, C.L.; Rausch, W.N.; Hesson, G.M.

The LOCA Simulation Program in the NRU reactor is the first set of experiments to provide data on the behavior of full-length, nuclear-heated PWR fuel bundles during the heatup, reflood, and quench phases of a loss-of-coolant accident (LOCA). This paper compares the temperature time histories of 4 experimental test cases with 4 computer codes: CE-THERM, FRAP-T5, GT3-FLECHT, and TRUMP-FLECHT. The preliminary comparisons between prediction and experiment show that the state-of-the art fuel codes have large uncertainties and are not necessarily conservative in predicting peak temperatures, turn around times, and bundle quench times.
Eleventh NASTRAN User's Colloquium

NASA Technical Reports Server (NTRS)

1983-01-01

NASTRAN (NASA STRUCTURAL ANALYSIS) is a large, comprehensive, nonproprietary, general purpose finite element computer code for structural analysis which was developed under NASA sponsorship. The Eleventh Colloquium provides some comprehensive general papers on the application of finite element methods in engineering, comparisons with other approaches, unique applications, pre- and post-processing or auxiliary programs, and new methods of analysis with NASTRAN.
Recommendations for open data science.

PubMed

Gymrek, Melissa; Farjoun, Yossi

2016-01-01

Life science research increasingly relies on large-scale computational analyses. However, the code and data used for these analyses are often lacking in publications. To maximize scientific impact, reproducibility, and reuse, it is crucial that these resources are made publicly available and are fully transparent. We provide recommendations for improving the openness of data-driven studies in life sciences.
A novel potential/viscous flow coupling technique for computing helicopter flow fields

NASA Technical Reports Server (NTRS)

Summa, J. Michael; Strash, Daniel J.; Yoo, Sungyul

1993-01-01

The primary objective of this work was to demonstrate the feasibility of a new potential/viscous flow coupling procedure for reducing computational effort while maintaining solution accuracy. This closed-loop, overlapped velocity-coupling concept has been developed in a new two-dimensional code, ZAP2D (Zonal Aerodynamics Program - 2D), a three-dimensional code for wing analysis, ZAP3D (Zonal Aerodynamics Program - 3D), and a three-dimensional code for isolated helicopter rotors in hover, ZAPR3D (Zonal Aerodynamics Program for Rotors - 3D). Comparisons with large domain ARC3D solutions and with experimental data for a NACA 0012 airfoil have shown that the required domain size can be reduced to a few tenths of a percent chord for the low Mach and low angle of attack cases and to less than 2-5 chords for the high Mach and high angle of attack cases while maintaining solution accuracies to within a few percent. This represents CPU time reductions by a factor of 2-4 compared with ARC2D. The current ZAP3D calculation for a rectangular plan-form wing of aspect ratio 5 with an outer domain radius of about 1.2 chords represents a speed-up in CPU time over the ARC3D large domain calculation by about a factor of 2.5 while maintaining solution accuracies to within a few percent. A ZAPR3D simulation for a two-bladed rotor in hover with a reduced grid domain of about two chord lengths was able to capture the wake effects and compared accurately with the experimental pressure data. Further development is required in order to substantiate the promise of computational improvements due to the ZAPR3D coupling concept.
Analysis of quantum error-correcting codes: Symplectic lattice codes and toric codes

NASA Astrophysics Data System (ADS)

Harrington, James William

Quantum information theory is concerned with identifying how quantum mechanical resources (such as entangled quantum states) can be utilized for a number of information processing tasks, including data storage, computation, communication, and cryptography. Efficient quantum algorithms and protocols have been developed for performing some tasks (e.g. , factoring large numbers, securely communicating over a public channel, and simulating quantum mechanical systems) that appear to be very difficult with just classical resources. In addition to identifying the separation between classical and quantum computational power, much of the theoretical focus in this field over the last decade has been concerned with finding novel ways of encoding quantum information that are robust against errors, which is an important step toward building practical quantum information processing devices. In this thesis I present some results on the quantum error-correcting properties of oscillator codes (also described as symplectic lattice codes) and toric codes. Any harmonic oscillator system (such as a mode of light) can be encoded with quantum information via symplectic lattice codes that are robust against shifts in the system's continuous quantum variables. I show the existence of lattice codes whose achievable rates match the one-shot coherent information over the Gaussian quantum channel. Also, I construct a family of symplectic self-dual lattices and search for optimal encodings of quantum information distributed between several oscillators. Toric codes provide encodings of quantum information into two-dimensional spin lattices that are robust against local clusters of errors and which require only local quantum operations for error correction. Numerical simulations of this system under various error models provide a calculation of the accuracy threshold for quantum memory using toric codes, which can be related to phase transitions in certain condensed matter models. I also present a local classical processing scheme for correcting errors on toric codes, which demonstrates that quantum information can be maintained in two dimensions by purely local (quantum and classical) resources.
Three-Dimensional Wiring for Extensible Quantum Computing: The Quantum Socket

NASA Astrophysics Data System (ADS)

Béjanin, J. H.; McConkey, T. G.; Rinehart, J. R.; Earnest, C. T.; McRae, C. R. H.; Shiri, D.; Bateman, J. D.; Rohanizadegan, Y.; Penava, B.; Breul, P.; Royak, S.; Zapatka, M.; Fowler, A. G.; Mariantoni, M.

2016-10-01

Quantum computing architectures are on the verge of scalability, a key requirement for the implementation of a universal quantum computer. The next stage in this quest is the realization of quantum error-correction codes, which will mitigate the impact of faulty quantum information on a quantum computer. Architectures with ten or more quantum bits (qubits) have been realized using trapped ions and superconducting circuits. While these implementations are potentially scalable, true scalability will require systems engineering to combine quantum and classical hardware. One technology demanding imminent efforts is the realization of a suitable wiring method for the control and the measurement of a large number of qubits. In this work, we introduce an interconnect solution for solid-state qubits: the quantum socket. The quantum socket fully exploits the third dimension to connect classical electronics to qubits with higher density and better performance than two-dimensional methods based on wire bonding. The quantum socket is based on spring-mounted microwires—the three-dimensional wires—that push directly on a microfabricated chip, making electrical contact. A small wire cross section (approximately 1 mm), nearly nonmagnetic components, and functionality at low temperatures make the quantum socket ideal for operating solid-state qubits. The wires have a coaxial geometry and operate over a frequency range from dc to 8 GHz, with a contact resistance of approximately 150 m Ω , an impedance mismatch of approximately 10 Ω , and minimal cross talk. As a proof of principle, we fabricate and use a quantum socket to measure high-quality superconducting resonators at a temperature of approximately 10 mK. Quantum error-correction codes such as the surface code will largely benefit from the quantum socket, which will make it possible to address qubits located on a two-dimensional lattice. The present implementation of the socket could be readily extended to accommodate a quantum processor with a (10 ×10 )-qubit lattice, which would allow for the realization of a simple quantum memory.
Xyce parallel electronic simulator : users' guide.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mei, Ting; Rankin, Eric Lamont; Thornquist, Heidi K.

2011-05-01

This manual describes the use of the Xyce Parallel Electronic Simulator. Xyce has been designed as a SPICE-compatible, high-performance analog circuit simulator, and has been written to support the simulation needs of the Sandia National Laboratories electrical designers. This development has focused on improving capability over the current state-of-the-art in the following areas: (1) Capability to solve extremely large circuit problems by supporting large-scale parallel computing platforms (up to thousands of processors). Note that this includes support for most popular parallel and serial computers; (2) Improved performance for all numerical kernels (e.g., time integrator, nonlinear and linear solvers) through state-of-the-artmore » algorithms and novel techniques. (3) Device models which are specifically tailored to meet Sandia's needs, including some radiation-aware devices (for Sandia users only); and (4) Object-oriented code design and implementation using modern coding practices that ensure that the Xyce Parallel Electronic Simulator will be maintainable and extensible far into the future. Xyce is a parallel code in the most general sense of the phrase - a message passing parallel implementation - which allows it to run efficiently on the widest possible number of computing platforms. These include serial, shared-memory and distributed-memory parallel as well as heterogeneous platforms. Careful attention has been paid to the specific nature of circuit-simulation problems to ensure that optimal parallel efficiency is achieved as the number of processors grows. The development of Xyce provides a platform for computational research and development aimed specifically at the needs of the Laboratory. With Xyce, Sandia has an 'in-house' capability with which both new electrical (e.g., device model development) and algorithmic (e.g., faster time-integration methods, parallel solver algorithms) research and development can be performed. As a result, Xyce is a unique electrical simulation capability, designed to meet the unique needs of the laboratory.« less
Reactivity effects in VVER-1000 of the third unit of the kalinin nuclear power plant at physical start-up. Computations in ShIPR intellectual code system with library of two-group cross sections generated by UNK code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zizin, M. N.; Zimin, V. G.; Zizina, S. N., E-mail: zizin@adis.vver.kiae.ru

2010-12-15

The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit ofmore » the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.« less
Reactivity effects in VVER-1000 of the third unit of the kalinin nuclear power plant at physical start-up. Computations in ShIPR intellectual code system with library of two-group cross sections generated by UNK code

NASA Astrophysics Data System (ADS)

Zizin, M. N.; Zimin, V. G.; Zizina, S. N.; Kryakvin, L. V.; Pitilimov, V. A.; Tereshonok, V. A.

2010-12-01

The ShIPR intellectual code system for mathematical simulation of nuclear reactors includes a set of computing modules implementing the preparation of macro cross sections on the basis of the two-group library of neutron-physics cross sections obtained for the SKETCH-N nodal code. This library is created by using the UNK code for 3D diffusion computation of first VVER-1000 fuel loadings. Computation of neutron fields in the ShIPR system is performed using the DP3 code in the two-group diffusion approximation in 3D triangular geometry. The efficiency of all groups of control rods for the first fuel loading of the third unit of the Kalinin Nuclear Power Plant is computed. The temperature, barometric, and density effects of reactivity as well as the reactivity coefficient due to the concentration of boric acid in the reactor were computed additionally. Results of computations are compared with the experiment.
An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

DOE PAGES

Mironov, Vladimir; Moskovsky, Alexander; D’Mello, Michael; ...

2017-10-04

The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overallmore » memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel R Xeon PhiTM supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.« less
Plasma separation process. Betacell (BCELL) code, user's manual

NASA Astrophysics Data System (ADS)

Taherzadeh, M.

1987-11-01

The emergence of clearly defined applications for (small or large) amounts of long-life and reliable power sources has given the design and production of betavoltaic systems a new life. Moreover, because of the availability of the Plasma Separation Program, (PSP) at TRW, it is now possible to separate the most desirable radioisotopes for betacell power generating devices. A computer code, named BCELL, has been developed to model the betavoltaic concept by utilizing the available up-to-date source/cell parameters. In this program, attempts have been made to determine the betacell energy device maximum efficiency, degradation due to the emitting source radiation and source/cell lifetime power reduction processes. Additionally, comparison is made between the Schottky and PN junction devices for betacell battery design purposes. Certain computer code runs have been made to determine the JV distribution function and the upper limit of the betacell generated power for specified energy sources. A Ni beta emitting radioisotope was used for the energy source and certain semiconductors were used for the converter subsystem of the betacell system. Some results for a Promethium source are also given here for comparison.
The bioelectric code: An ancient computational medium for dynamic control of growth and form.

PubMed

Levin, Michael; Martyniuk, Christopher J

2018-02-01

What determines large-scale anatomy? DNA does not directly specify geometrical arrangements of tissues and organs, and a process of encoding and decoding for morphogenesis is required. Moreover, many species can regenerate and remodel their structure despite drastic injury. The ability to obtain the correct target morphology from a diversity of initial conditions reveals that the morphogenetic code implements a rich system of pattern-homeostatic processes. Here, we describe an important mechanism by which cellular networks implement pattern regulation and plasticity: bioelectricity. All cells, not only nerves and muscles, produce and sense electrical signals; in vivo, these processes form bioelectric circuits that harness individual cell behaviors toward specific anatomical endpoints. We review emerging progress in reading and re-writing anatomical information encoded in bioelectrical states, and discuss the approaches to this problem from the perspectives of information theory, dynamical systems, and computational neuroscience. Cracking the bioelectric code will enable much-improved control over biological patterning, advancing basic evolutionary developmental biology as well as enabling numerous applications in regenerative medicine and synthetic bioengineering. Copyright © 2017 Elsevier B.V. All rights reserved.
A Biosequence-based Approach to Software Characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oehmen, Christopher S.; Peterson, Elena S.; Phillips, Aaron R.

For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries.more » Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.« less
Investigation of thermal protection systems effects on viscid and inviscid flow fields for manned entry systems

NASA Technical Reports Server (NTRS)

Bartlett, E. P.; Morse, H. L.; Tong, H.

1971-01-01

Procedures and methods for predicting aerothermodynamic heating to delta orbiter shuttle vehicles were reviewed. A number of approximate methods were found to be adequate for large scale parameter studies, but are considered inadequate for final design calculations. It is recommended that final design calculations be based on a computer code which accounts for nonequilibrium chemistry, streamline spreading, entropy swallowing, and turbulence. It is further recommended that this code be developed with the intent that it can be directly coupled with an exact inviscid flow field calculation when the latter becomes available. A nonsimilar, equilibrium chemistry computer code (BLIMP) was used to evaluate the effects of entropy swallowing, turbulence, and various three dimensional approximations. These solutions were compared with available wind tunnel data. It was found study that, for wind tunnel conditions, the effect of entropy swallowing and three dimensionality are small for laminar boundary layers but entropy swallowing causes a significant increase in turbulent heat transfer. However, it is noted that even small effects (say, 10-20%) may be important for the shuttle reusability concept.
Current and planned numerical development for improving computing performance for long duration and/or low pressure transients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faydide, B.

1997-07-01

This paper presents the current and planned numerical development for improving computing performance in case of Cathare applications needing real time, like simulator applications. Cathare is a thermalhydraulic code developed by CEA (DRN), IPSN, EDF and FRAMATOME for PWR safety analysis. First, the general characteristics of the code are presented, dealing with physical models, numerical topics, and validation strategy. Then, the current and planned applications of Cathare in the field of simulators are discussed. Some of these applications were made in the past, using a simplified and fast-running version of Cathare (Cathare-Simu); the status of the numerical improvements obtained withmore » Cathare-Simu is presented. The planned developments concern mainly the Simulator Cathare Release (SCAR) project which deals with the use of the most recent version of Cathare inside simulators. In this frame, the numerical developments are related with the speed up of the calculation process, using parallel processing and improvement of code reliability on a large set of NPP transients.« less

Using the computer in the clinical consultation; setting the stage, reviewing, recording, and taking actions: multi-channel video study.

PubMed

Kumarapeli, Pushpa; de Lusignan, Simon

2013-06-01

Electronic patient record (EPR) systems are widely used. This study explores the context and use of systems to provide insights into improving their use in clinical practice. We used video to observe 163 consultations by 16 clinicians using four EPR brands. We made a visual study of the consultation room and coded interactions between clinician, patient, and computer. Few patients (6.9%, n=12) declined to participate. Patients looked at the computer twice as much (47.6 s vs 20.6 s, p<0.001) when it was within their gaze. A quarter of consultations were interrupted (27.6%, n=45); and in half the clinician left the room (12.3%, n=20). The core consultation takes about 87% of the total session time; 5% of time is spent pre-consultation, reading the record and calling the patient in; and 8% of time is spent post-consultation, largely entering notes. Consultations with more than one person and where prescribing took place were longer (R(2) adj=22.5%, p<0.001). The core consultation can be divided into 61% of direct clinician-patient interaction, of which 15% is examination, 25% computer use with no patient involvement, and 14% simultaneous clinician-computer-patient interplay. The proportions of computer use are similar between consultations (mean=40.6%, SD=13.7%). There was more data coding in problem-orientated EPR systems, though clinicians often used vague codes. The EPR system is used for a consistent proportion of the consultation and should be designed to facilitate multi-tasking. Clinicians who want to promote screen sharing should change their consulting room layout.
Users manual and modeling improvements for axial turbine design and performance computer code TD2-2

NASA Technical Reports Server (NTRS)

Glassman, Arthur J.

1992-01-01

Computer code TD2 computes design point velocity diagrams and performance for multistage, multishaft, cooled or uncooled, axial flow turbines. This streamline analysis code was recently modified to upgrade modeling related to turbine cooling and to the internal loss correlation. These modifications are presented in this report along with descriptions of the code's expanded input and output. This report serves as the users manual for the upgraded code, which is named TD2-2.
An Object-Oriented Approach to Writing Computational Electromagnetics Codes

NASA Technical Reports Server (NTRS)

Zimmerman, Martin; Mallasch, Paul G.

1996-01-01

Presently, most computer software development in the Computational Electromagnetics (CEM) community employs the structured programming paradigm, particularly using the Fortran language. Other segments of the software community began switching to an Object-Oriented Programming (OOP) paradigm in recent years to help ease design and development of highly complex codes. This paper examines design of a time-domain numerical analysis CEM code using the OOP paradigm, comparing OOP code and structured programming code in terms of software maintenance, portability, flexibility, and speed.
Language-Agnostic Reproducible Data Analysis Using Literate Programming.

PubMed

Vassilev, Boris; Louhimo, Riku; Ikonen, Elina; Hautaniemi, Sampsa

2016-01-01

A modern biomedical research project can easily contain hundreds of analysis steps and lack of reproducibility of the analyses has been recognized as a severe issue. While thorough documentation enables reproducibility, the number of analysis programs used can be so large that in reality reproducibility cannot be easily achieved. Literate programming is an approach to present computer programs to human readers. The code is rearranged to follow the logic of the program, and to explain that logic in a natural language. The code executed by the computer is extracted from the literate source code. As such, literate programming is an ideal formalism for systematizing analysis steps in biomedical research. We have developed the reproducible computing tool Lir (literate, reproducible computing) that allows a tool-agnostic approach to biomedical data analysis. We demonstrate the utility of Lir by applying it to a case study. Our aim was to investigate the role of endosomal trafficking regulators to the progression of breast cancer. In this analysis, a variety of tools were combined to interpret the available data: a relational database, standard command-line tools, and a statistical computing environment. The analysis revealed that the lipid transport related genes LAPTM4B and NDRG1 are coamplified in breast cancer patients, and identified genes potentially cooperating with LAPTM4B in breast cancer progression. Our case study demonstrates that with Lir, an array of tools can be combined in the same data analysis to improve efficiency, reproducibility, and ease of understanding. Lir is an open-source software available at github.com/borisvassilev/lir.
Language-Agnostic Reproducible Data Analysis Using Literate Programming

PubMed Central

Vassilev, Boris; Louhimo, Riku; Ikonen, Elina; Hautaniemi, Sampsa

2016-01-01

A modern biomedical research project can easily contain hundreds of analysis steps and lack of reproducibility of the analyses has been recognized as a severe issue. While thorough documentation enables reproducibility, the number of analysis programs used can be so large that in reality reproducibility cannot be easily achieved. Literate programming is an approach to present computer programs to human readers. The code is rearranged to follow the logic of the program, and to explain that logic in a natural language. The code executed by the computer is extracted from the literate source code. As such, literate programming is an ideal formalism for systematizing analysis steps in biomedical research. We have developed the reproducible computing tool Lir (literate, reproducible computing) that allows a tool-agnostic approach to biomedical data analysis. We demonstrate the utility of Lir by applying it to a case study. Our aim was to investigate the role of endosomal trafficking regulators to the progression of breast cancer. In this analysis, a variety of tools were combined to interpret the available data: a relational database, standard command-line tools, and a statistical computing environment. The analysis revealed that the lipid transport related genes LAPTM4B and NDRG1 are coamplified in breast cancer patients, and identified genes potentially cooperating with LAPTM4B in breast cancer progression. Our case study demonstrates that with Lir, an array of tools can be combined in the same data analysis to improve efficiency, reproducibility, and ease of understanding. Lir is an open-source software available at github.com/borisvassilev/lir. PMID:27711123
Enabling a Scientific Cloud Marketplace: VGL (Invited)

NASA Astrophysics Data System (ADS)

Fraser, R.; Woodcock, R.; Wyborn, L. A.; Vote, J.; Rankine, T.; Cox, S. J.

2013-12-01

The Virtual Geophysics Laboratory (VGL) provides a flexible, web based environment where researchers can browse data and use a variety of scientific software packaged into tool kits that run in the Cloud. Both data and tool kits are published by multiple researchers and registered with the VGL infrastructure forming a data and application marketplace. The VGL provides the basic work flow of Discovery and Access to the disparate data sources and a Library for tool kits and scripting to drive the scientific codes. Computation is then performed on the Research or Commercial Clouds. Provenance information is collected throughout the work flow and can be published alongside the results allowing for experiment comparison and sharing with other researchers. VGL's "mix and match" approach to data, computational resources and scientific codes, enables a dynamic approach to scientific collaboration. VGL allows scientists to publish their specific contribution, be it data, code, compute or work flow, knowing the VGL framework will provide other components needed for a complete application. Other scientists can choose the pieces that suit them best to assemble an experiment. The coarse grain workflow of the VGL framework combined with the flexibility of the scripting library and computational toolkits allows for significant customisation and sharing amongst the community. The VGL utilises the cloud computational and storage resources from the Australian academic research cloud provided by the NeCTAR initiative and a large variety of data accessible from national and state agencies via the Spatial Information Services Stack (SISS - http://siss.auscope.org). VGL v1.2 screenshot - http://vgl.auscope.org
Grid Computing and Collaboration Technology in Support of Fusion Energy Sciences

NASA Astrophysics Data System (ADS)

Schissel, D. P.

2004-11-01

The SciDAC Initiative is creating a computational grid designed to advance scientific understanding in fusion research by facilitating collaborations, enabling more effective integration of experiments, theory and modeling, and allowing more efficient use of experimental facilities. The philosophy is that data, codes, analysis routines, visualization tools, and communication tools should be thought of as easy to use network available services. Access to services is stressed rather than portability. Services share the same basic security infrastructure so that stakeholders can control their own resources and helps ensure fair use of resources. The collaborative control room is being developed using the open-source Access Grid software that enables secure group-to-group collaboration with capabilities beyond teleconferencing including application sharing and control. The ability to effectively integrate off-site scientists into a dynamic control room will be critical to the success of future international projects like ITER. Grid computing, the secure integration of computer systems over high-speed networks to provide on-demand access to data analysis capabilities and related functions, is being deployed as an alternative to traditional resource sharing among institutions. The first grid computational service deployed was the transport code TRANSP and included tools for run preparation, submission, monitoring and management. This approach saves user sites from the laborious effort of maintaining a complex code while at the same time reducing the burden on developers by avoiding the support of a large number of heterogeneous installations. This tutorial will present the philosophy behind an advanced collaborative environment, give specific examples, and discuss its usage beyond FES.
An implementation of a tree code on a SIMD, parallel computer

NASA Technical Reports Server (NTRS)

Olson, Kevin M.; Dorband, John E.

1994-01-01

We describe a fast tree algorithm for gravitational N-body simulation on SIMD parallel computers. The tree construction uses fast, parallel sorts. The sorted lists are recursively divided along their x, y and z coordinates. This data structure is a completely balanced tree (i.e., each particle is paired with exactly one other particle) and maintains good spatial locality. An implementation of this tree-building algorithm on a 16k processor Maspar MP-1 performs well and constitutes only a small fraction (approximately 15%) of the entire cycle of finding the accelerations. Each node in the tree is treated as a monopole. The tree search and the summation of accelerations also perform well. During the tree search, node data that is needed from another processor is simply fetched. Roughly 55% of the tree search time is spent in communications between processors. We apply the code to two problems of astrophysical interest. The first is a simulation of the close passage of two gravitationally, interacting, disk galaxies using 65,636 particles. We also simulate the formation of structure in an expanding, model universe using 1,048,576 particles. Our code attains speeds comparable to one head of a Cray Y-MP, so single instruction, multiple data (SIMD) type computers can be used for these simulations. The cost/performance ratio for SIMD machines like the Maspar MP-1 make them an extremely attractive alternative to either vector processors or large multiple instruction, multiple data (MIMD) type parallel computers. With further optimizations (e.g., more careful load balancing), speeds in excess of today's vector processing computers should be possible.
Comparison of two LES codes for wind turbine wake studies

NASA Astrophysics Data System (ADS)

Sarlak, H.; Pierella, F.; Mikkelsen, R.; Sørensen, J. N.

2014-06-01

For the third time a blind test comparison in Norway 2013, was conducted comparing numerical simulations for the rotor Cp and Ct and wake profiles with the experimental results. As the only large eddy simulation study among participants, results of the Technical University of Denmark (DTU) using their in-house CFD solver, EllipSys3D, proved to be more reliable among the other models for capturing the wake profiles and the turbulence intensities downstream the turbine. It was therefore remarked in the workshop to investigate other LES codes to compare their performance with EllipSys3D. The aim of this paper is to investigate on two CFD solvers, the DTU's in-house code, EllipSys3D and the open-sourse toolbox, OpenFoam, for a set of actuator line based LES computations. Two types of simulations are performed: the wake behind a signle rotor and the wake behind a cluster of three inline rotors. Results are compared in terms of velocity deficit, turbulence kinetic energy and eddy viscosity. It is seen that both codes predict similar near-wake flow structures with the exception of OpenFoam's simulations without the subgrid-scale model. The differences begin to increase with increasing the distance from the upstream rotor. From the single rotor simulations, EllipSys3D is found to predict a slower wake recovery in the case of uniform laminar flow. From the 3-rotor computations, it is seen that the difference between the codes is smaller as the disturbance created by the downstream rotors causes break down of the wake structures and more homogenuous flow structures. It is finally observed that OpenFoam computations are more sensitive to the SGS models.
Spectral-element Seismic Wave Propagation on CUDA/OpenCL Hardware Accelerators

NASA Astrophysics Data System (ADS)

Peter, D. B.; Videau, B.; Pouget, K.; Komatitsch, D.

2015-12-01

Seismic wave propagation codes are essential tools to investigate a variety of wave phenomena in the Earth. Furthermore, they can now be used for seismic full-waveform inversions in regional- and global-scale adjoint tomography. Although these seismic wave propagation solvers are crucial ingredients to improve the resolution of tomographic images to answer important questions about the nature of Earth's internal processes and subsurface structure, their practical application is often limited due to high computational costs. They thus need high-performance computing (HPC) facilities to improving the current state of knowledge. At present, numerous large HPC systems embed many-core architectures such as graphics processing units (GPUs) to enhance numerical performance. Such hardware accelerators can be programmed using either the CUDA programming environment or the OpenCL language standard. CUDA software development targets NVIDIA graphic cards while OpenCL was adopted by additional hardware accelerators, like e.g. AMD graphic cards, ARM-based processors as well as Intel Xeon Phi coprocessors. For seismic wave propagation simulations using the open-source spectral-element code package SPECFEM3D_GLOBE, we incorporated an automatic source-to-source code generation tool (BOAST) which allows us to use meta-programming of all computational kernels for forward and adjoint runs. Using our BOAST kernels, we generate optimized source code for both CUDA and OpenCL languages within the source code package. Thus, seismic wave simulations are able now to fully utilize CUDA and OpenCL hardware accelerators. We show benchmarks of forward seismic wave propagation simulations using SPECFEM3D_GLOBE on CUDA/OpenCL GPUs, validating results and comparing performances for different simulations and hardware usages.
Quinoa - Adaptive Computational Fluid Dynamics, 0.2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bakosi, Jozsef; Gonzalez, Francisco; Rogers, Brandon

Quinoa is a set of computational tools that enables research and numerical analysis in fluid dynamics. At this time it remains a test-bed to experiment with various algorithms using fully asynchronous runtime systems. Currently, Quinoa consists of the following tools: (1) Walker, a numerical integrator for systems of stochastic differential equations in time. It is a mathematical tool to analyze and design the behavior of stochastic differential equations. It allows the estimation of arbitrary coupled statistics and probability density functions and is currently used for the design of statistical moment approximations for multiple mixing materials in variable-density turbulence. (2) Inciter,more » an overdecomposition-aware finite element field solver for partial differential equations using 3D unstructured grids. Inciter is used to research asynchronous mesh-based algorithms and to experiment with coupling asynchronous to bulk-synchronous parallel code. Two planned new features of Inciter, compared to the previous release (LA-CC-16-015), to be implemented in 2017, are (a) a simple Navier-Stokes solver for ideal single-material compressible gases, and (b) solution-adaptive mesh refinement (AMR), which enables dynamically concentrating compute resources to regions with interesting physics. Using the NS-AMR problem we plan to explore how to scale such high-load-imbalance simulations, representative of large production multiphysics codes, to very large problems on very large computers using an asynchronous runtime system. (3) RNGTest, a test harness to subject random number generators to stringent statistical tests enabling quantitative ranking with respect to their quality and computational cost. (4) UnitTest, a unit test harness, running hundreds of tests per second, capable of testing serial, synchronous, and asynchronous functions. (5) MeshConv, a mesh file converter that can be used to convert 3D tetrahedron meshes from and to either of the following formats: Gmsh, (http://www.geuz.org/gmsh), Netgen, (http://sourceforge.net/apps/mediawiki/netgen-mesher), ExodusII, (http://sourceforge.net/projects/exodusii), HyperMesh, (http://www.altairhyperworks.com/product/HyperMesh).« less
Computation of Engine Noise Propagation and Scattering Off an Aircraft

NASA Technical Reports Server (NTRS)

Xu, J.; Stanescu, D.; Hussaini, M. Y.; Farassat, F.

2003-01-01

The paper presents a comparison of experimental noise data measured in flight on a two-engine business jet aircraft with Kulite microphones placed on the suction surface of the wing with computational results. Both a time-domain discontinuous Galerkin spectral method and a frequency-domain spectral element method are used to simulate the radiation of the dominant spinning mode from the engine and its reflection and scattering by the fuselage and the wing. Both methods are implemented in computer codes that use the distributed memory model to make use of large parallel architectures. The results show that trends of the noise field are well predicted by both methods.
Experimental and computational results from a large low-speed centrifugal impeller

NASA Technical Reports Server (NTRS)

Hathaway, M. D.; Chriss, R. M.; Wood, J. R.; Strazisar, A. J.

1993-01-01

An experimental and computational investigation of the NASA Low-Speed Centrifugal Compressor (LSCC) flow field has been conducted using laser anemometry and Dawes' 3D viscous code. The experimental configuration consists of a backswept impeller followed by a vaneless diffuser. Measurements of the three-dimensional velocity field were acquired at several measurement planes through the compressor. The measurements describe both the throughflow and secondary velocity field along each measurement plane and in several cases provide details of the flow within the blade boundary layers. The experimental and computational results provide a clear understanding of the development of the throughflow momentum wake which is characteristic of centrifugal compressors.
CICART Center For Integrated Computation And Analysis Of Reconnection And Turbulence

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bhattacharjee, Amitava

CICART is a partnership between the University of New Hampshire (UNH) and Dartmouth College. CICART addresses two important science needs of the DoE: the basic understanding of magnetic reconnection and turbulence that strongly impacts the performance of fusion plasmas, and the development of new mathematical and computational tools that enable the modeling and control of these phenomena. The principal participants of CICART constitute an interdisciplinary group, drawn from the communities of applied mathematics, astrophysics, computational physics, fluid dynamics, and fusion physics. It is a main premise of CICART that fundamental aspects of magnetic reconnection and turbulence in fusion devices, smaller-scalemore » laboratory experiments, and space and astrophysical plasmas can be viewed from a common perspective, and that progress in understanding in any of these interconnected fields is likely to lead to progress in others. The establishment of CICART has strongly impacted the education and research mission of a new Program in Integrated Applied Mathematics in the College of Engineering and Applied Sciences at UNH by enabling the recruitment of a tenure-track faculty member, supported equally by UNH and CICART, and the establishment of an IBM-UNH Computing Alliance. The proposed areas of research in magnetic reconnection and turbulence in astrophysical, space, and laboratory plasmas include the following topics: (A) Reconnection and secondary instabilities in large high-Lundquist-number plasmas, (B) Particle acceleration in the presence of multiple magnetic islands, (C) Gyrokinetic reconnection: comparison with fluid and particle-in-cell models, (D) Imbalanced turbulence, (E) Ion heating, and (F) Turbulence in laboratory (including fusion-relevant) experiments. These theoretical studies make active use of three high-performance computer simulation codes: (1) The Magnetic Reconnection Code, based on extended two-fluid (or Hall MHD) equations, in an Adaptive Mesh Refinement (AMR) framework, (2) the Particle Simulation Code, a fully electromagnetic 3D Particle-In-Cell (PIC) code that includes a collision operator, and (3) GS2, an Eulerian, electromagnetic, kinetic code that is widely used in the fusion program, and simulates the nonlinear gyrokinetic equations, together with a self-consistent set of Maxwell’s equations.« less
Computer Description of the Field Artillery Ammunition Supply Vehicle

DTIC Science & Technology

1983-04-01

Combinatorial Geometry (COM-GEOM) GIFT Computer Code Computer Target Description 2& AfTNACT (Cmne M feerve shb N ,neemssalyan ify by block number) A...input to the GIFT computer code to generate target vulnerability data. F.a- 4 ono OF I NOV 5S OLETE UNCLASSIFIED SECUOITY CLASSIFICATION OF THIS PAGE...Combinatorial Geometry (COM-GEOM) desrription. The "Geometric Information for Tarqets" ( GIFT ) computer code accepts the CO!-GEOM description and
Computer aided design of extrusion forming tools for complex geometry profiles

NASA Astrophysics Data System (ADS)

Goncalves, Nelson Daniel Ferreira

In the profile extrusion, the experience of the die designer is crucial for obtaining good results. In industry, it is quite usual the need of several experimental trials for a specific extrusion die before a balanced flow distribution is obtained. This experimental based trial-and-error procedure is time and money consuming, but, it works, and most of the profile extrusion companies rely on such method. However, the competition is forcing the industry to look for more effective procedures and the design of profile extrusion dies is not an exception. For this purpose, computer aided design seems to be a good route. Nowadays, the available computational rheology numerical codes allow the simulation of complex fluid flows. This permits the die designer to evaluate and to optimize the flow channel, without the need to have a physical die and to perform real extrusion trials. In this work, a finite volume based numerical code was developed, for the simulation of non-Newtonian (inelastic) fluid and non-isothermal flows using unstructured meshes. The developed code is able to model the forming and cooling stages of profile extrusion, and can be used to aid the design of forming tools used in the production of complex profiles. For the code verification three benchmark problems were tested: flow between parallel plates, flow around a cylinder, and the lid driven cavity flow. The code was employed to design two extrusion dies to produce complex cross section profiles: a medical catheter die and a wood plastic composite profile for decking applications. The last was experimentally validated. Simple extrusion dies used to produced L and T shaped profiles were studied in detail, allowing a better understanding of the effect of the main geometry parameters on the flow distribution. To model the cooling stage a new implicit formulation was devised, which allowed the achievement of better convergence rates and thus the reduction of the computation times. Having in mind the solution of large dimension problems, the code was parallelized using graphics processing units (GPUs). Speedups of ten times could be obtained, drastically decreasing the time required to obtain results.
A method for the modelling of porous and solid wind tunnel walls in computational fluid dynamics codes

NASA Technical Reports Server (NTRS)

Beutner, Thomas John

1993-01-01

Porous wall wind tunnels have been used for several decades and have proven effective in reducing wall interference effects in both low speed and transonic testing. They allow for testing through Mach 1, reduce blockage effects and reduce shock wave reflections in the test section. Their usefulness in developing computational fluid dynamics (CFD) codes has been limited, however, by the difficulties associated with modelling the effect of a porous wall in CFD codes. Previous approaches to modelling porous wall effects have depended either upon a simplified linear boundary condition, which has proven inadequate, or upon detailed measurements of the normal velocity near the wall, which require extensive wind tunnel time. The current work was initiated in an effort to find a simple, accurate method of modelling a porous wall boundary condition in CFD codes. The development of such a method would allow data from porous wall wind tunnels to be used more readily in validating CFD codes. This would be beneficial when transonic validations are desired, or when large models are used to achieve high Reynolds numbers in testing. A computational and experimental study was undertaken to investigate a new method of modelling solid and porous wall boundary conditions in CFD codes. The method utilized experimental measurements at the walls to develop a flow field solution based on the method of singularities. This flow field solution was then imposed as a pressure boundary condition in a CFD simulation of the internal flow field. The effectiveness of this method in describing the effect of porosity changes on the wall was investigated. Also, the effectiveness of this method when only sparse experimental measurements were available has been investigated. The current work demonstrated this approach for low speed flows and compared the results with experimental data obtained from a heavily instrumented variable porosity test section. The approach developed was simple, computationally inexpensive, and did not require extensive or intrusive measurements of the boundary conditions during the wind tunnel test. It may be applied to both solid and porous wall wind tunnel tests.
A Proposal of Monitoring and Forecasting Method for Crustal Activity in and around Japan with 3-dimensional Heterogeneous Medium Using a Large-scale High-fidelity Finite Element Simulation

NASA Astrophysics Data System (ADS)

Hori, T.; Agata, R.; Ichimura, T.; Fujita, K.; Yamaguchi, T.; Takahashi, N.

2017-12-01

Recently, we can obtain continuous dense surface deformation data on land and partly on the sea floor, the obtained data are not fully utilized for monitoring and forecasting of crustal activity, such as spatio-temporal variation in slip velocity on the plate interface including earthquakes, seismic wave propagation, and crustal deformation. For construct a system for monitoring and forecasting, it is necessary to develop a physics-based data analysis system including (1) a structural model with the 3D geometry of the plate inter-face and the material property such as elasticity and viscosity, (2) calculation code for crustal deformation and seismic wave propagation using (1), (3) inverse analysis or data assimilation code both for structure and fault slip using (1) & (2). To accomplish this, it is at least necessary to develop highly reliable large-scale simulation code to calculate crustal deformation and seismic wave propagation for 3D heterogeneous structure. Unstructured FE non-linear seismic wave simulation code has been developed. This achieved physics-based urban earthquake simulation enhanced by 1.08 T DOF x 6.6 K time-step. A high fidelity FEM simulation code with mesh generator has also been developed to calculate crustal deformation in and around Japan with complicated surface topography and subducting plate geometry for 1km mesh. This code has been improved the code for crustal deformation and achieved 2.05 T-DOF with 45m resolution on the plate interface. This high-resolution analysis enables computation of change of stress acting on the plate interface. Further, for inverse analyses, waveform inversion code for modeling 3D crustal structure has been developed, and the high-fidelity FEM code has been improved to apply an adjoint method for estimating fault slip and asthenosphere viscosity. Hence, we have large-scale simulation and analysis tools for monitoring. We are developing the methods for forecasting the slip velocity variation on the plate interface. Although the prototype is for elastic half space model, we are applying it for 3D heterogeneous structure with the high-fidelity FE model. Furthermore, large-scale simulation codes for monitoring are being implemented on the GPU clusters and analysis tools are developing to include other functions such as examination in model errors.
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2011 CFR

2011-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.

Code of Federal Regulations, 2012 CFR

2012-10-01

... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...

Some links on this page may take you to non-federal websites. Their policies may differ from this site.